Google releases AI for Dolphin Talk that runs on Pixel devices

Image created by Decrypt using AI

It’s easier to talk to animals than to try and understand Gen Z.

Today, Google unveiled DolphinGemma, an open-source AI model designed to decode dolphin communication by analyzing their clicks, whistles, and burst pulses. The announcement coincided with National Dolphin Day.

The model was developed in collaboration with Georgia Tech and the Wild Dolphin Project (WDP), and it can generate sound sequences that resemble natural dolphin vocalizations.

The goal is to help determine whether dolphin communication rises to the level of a true language.

Trained on data from the world's longest-running underwater dolphin research program, DolphinGemma leverages decades of meticulously labeled audio and video collected by WDP since 1985.

The project has studied Atlantic Spotted Dolphins in the Bahamas across generations using a non-invasive approach they call "In Their World, on Their Terms."

"By identifying recurring sound patterns, clusters and reliable sequences, the model can help researchers uncover hidden structures and potential meanings within the dolphins' natural communication—a task previously requiring immense human effort," Google said in its announcement.

The AI, which has roughly 400 million parameters, is small enough for researchers to run on a Pixel phone. It processes dolphin sounds using Google's SoundStream tokenizer and predicts subsequent sounds in a sequence, much like human language models predict the next word in a sentence.
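The next-token idea can be sketched in miniature. The snippet below is a toy illustration, not DolphinGemma's actual architecture: it stands in for SoundStream with made-up integer token IDs and fits a simple bigram counter that predicts the next acoustic token from the previous one.

```python
from collections import Counter, defaultdict

def fit_bigrams(tokens):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, token):
    """Return the most likely next token after `token`, or None if unseen."""
    if token not in counts:
        return None
    return counts[token].most_common(1)[0][0]

# Toy "tokenized whistle" sequence (made-up token IDs, not real data).
sequence = [3, 7, 7, 2, 3, 7, 2, 3, 7, 7, 2]
model = fit_bigrams(sequence)
print(predict_next(model, 3))  # token 7 most often follows token 3
```

A real model replaces the bigram counts with a transformer conditioned on a much longer context, but the prediction task is the same.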

DolphinGemma doesn't operate in isolation. It works in conjunction with CHAT, the Cetacean Hearing Augmentation Telemetry system, which associates specific synthetic whistles with objects the dolphins enjoy, such as sargassum, seagrass, or scarves.

"Eventually, these patterns, augmented with synthetic sounds created by the researchers to refer to objects with which the dolphins like to play, may establish a shared vocabulary with the dolphins for interactive communication," according to Google.

Field researchers currently use the Pixel 6 to analyze dolphin sounds in real time.

In summer 2025, the team will upgrade to the Pixel 9, which integrates speaker and microphone functions and can run deep-learning models and template-matching algorithms simultaneously.
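Template matching is the simpler of the two techniques, and a minimal version is easy to sketch. The code below is a hypothetical illustration, not the CHAT system's implementation: it slides a reference "whistle" template along an incoming signal and flags the offset with the smallest mean squared difference as a candidate match.

```python
def match_template(signal, template):
    """Slide `template` over `signal`; return (best_offset, best_score),
    where score is the mean squared error at that offset."""
    best_offset, best_score = None, float("inf")
    n = len(template)
    for offset in range(len(signal) - n + 1):
        window = signal[offset:offset + n]
        score = sum((w - t) ** 2 for w, t in zip(window, template)) / n
        if score < best_score:
            best_offset, best_score = offset, score
    return best_offset, best_score

# Toy data: the template appears exactly at offset 4 in the signal.
template = [0.0, 1.0, 0.5]
signal = [0.2, 0.1, 0.3, 0.0, 0.0, 1.0, 0.5, 0.1]
print(match_template(signal, template))  # → (4, 0.0)
```

In practice the matching runs over spectrogram features rather than raw samples, but the sliding-window comparison is the core idea.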

A major benefit of the shift from custom hardware to smartphone technology is that it reduces costs for fieldwork in marine environments. DolphinGemma's predictive capabilities can help researchers anticipate and identify potential mimics earlier in vocalization sequences, making interactions more fluid.

Understand what you cannot understand

DolphinGemma has joined several AI initiatives that aim to crack the code of animal communications.

The Earth Species Project (ESP), a nonprofit organization, recently developed NatureLM, an audio language model capable of identifying animal species, approximate age, and whether sounds indicate distress or play. That isn't language decoding, but it offers a way to establish some primitive communication.

The model, trained on a mix of human language, environmental sounds, and animal vocalizations, has shown promising results even with species it hasn't encountered before.

Project CETI is a significant new initiative in this area.

The project, led by Imperial College London’s Michael Bronstein, analyzes the complex patterns that sperm whales use to communicate over distances.

The team has identified 143 click combinations that might form a kind of phonetic alphabet, which they're now studying using deep neural networks and natural language processing techniques.

Researchers at New York University, meanwhile, have drawn inspiration from infant development to build their AI.

Their Child's View for Contrastive Learning model (CVCL) learned language by viewing the world through a baby's perspective, using footage from a head-mounted camera worn by an infant from 6 months to 2 years old.

NYU's team discovered that its AI can learn from naturalistic data similar to what infants experience. This contrasts sharply with traditional AI models, which require training on trillions of words.

Google intends to release an updated version of DolphinGemma this summer. It could be applied beyond Atlantic spotted dolphins, though the model may require fine-tuning for other species' vocalizations.

WDP has focused extensively on correlating dolphin sounds with specific behaviors, including signature whistles used by mothers and calves to reunite, burst-pulse "squawks" during conflicts, and click "buzzes" used during courtship or when chasing sharks.

"We're not just listening anymore," Google noted. "We're beginning to understand the patterns within the sounds, paving the way for a future where the gap between human and dolphin communication might just get a little smaller."

Edited by Sebastian Sinclair and Josh Quittner
