Startup breaks through “accent barrier” with real-time translator

The accent translator can integrate into Zoom, WhatsApp, or phone calls.

After struggling to understand each other’s accents, three Stanford students — from China, Russia, and Venezuela — developed a technology that can listen to English spoken with one accent and replay it with another.

They’ve now formed a startup, Sanas, to release the tech, which they say is the world’s first real-time speech accent translator.

The challenge: Of the 1.5 billion people who know English, more than 1 billion speak it as a second language. Those who speak it as a first language hail from the U.S., the U.K., Ireland, Australia, and other regions with their own unique pronunciations of English words.

Given all of that, it’s easy to see how two people can both be speaking English and still have difficulty communicating due to accents, from either their home region or their home language.

Speech therapy can help non-native speakers lose their accents, but it takes a long time and doesn’t work for everyone, and some people would rather not “fake” a local accent.

“[W]e knew from our own experience that forcing a different accent on yourself is uncomfortable,” Sanas CFO Andres Perez Soderi told IEEE Spectrum. “I went to a British high school and tried to force a British accent; it was an experience that was hard to digest.”

How it works: Rather than trying to change how people speak, the students decided to train an accent translator algorithm. First, they had to feed it a lot of recordings of the exact same phrases spoken with different accents.

“You aren’t just doing audio signal processing, changing the pitch and tone — you have to change the phonetics,” Sanas CTO Shawn Zhang explained.

“So we really needed parallel data sets, created by readers using the same source material, so the neural network could learn to map from one to the other, examining both to learn how to transform the pronunciation,” he continued.
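To make the idea of parallel data concrete, here is a minimal sketch of how recordings of the same phrase in two accents might be paired before training. It is not Sanas's pipeline: the `Recording` type, the `build_parallel_pairs` function, the accent labels, and the file paths are all hypothetical, chosen only to illustrate the pairing step Zhang describes.

```python
from collections import defaultdict
from typing import NamedTuple


class Recording(NamedTuple):
    phrase: str   # the shared source text that was read aloud
    accent: str   # hypothetical accent label
    path: str     # hypothetical path to the audio file


def build_parallel_pairs(recordings, source_accent, target_accent):
    """Pair recordings of the same phrase across two accents."""
    by_phrase = defaultdict(dict)
    for rec in recordings:
        # Keeps one recording per (phrase, accent); later ones overwrite earlier ones.
        by_phrase[rec.phrase][rec.accent] = rec

    pairs = []
    for by_accent in by_phrase.values():
        # Only phrases recorded in both accents can serve as training pairs.
        if source_accent in by_accent and target_accent in by_accent:
            pairs.append((by_accent[source_accent], by_accent[target_accent]))
    return pairs


# Illustrative data only; these files do not exist.
recordings = [
    Recording("good morning", "spanish", "es/good_morning.wav"),
    Recording("good morning", "british", "gb/good_morning.wav"),
    Recording("thank you", "spanish", "es/thank_you.wav"),
]

pairs = build_parallel_pairs(recordings, "spanish", "british")
for source, target in pairs:
    print(source.path, "->", target.path)
```

Only "good morning" yields a pair here, because "thank you" has no British recording. That is the practical reason the team needed readers working from the same source material.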

A preview of what users see when using the accent translator. Credit: Sanas

Their finished accent translator works for five accents: American, British, Australian, Filipino, and Spanish — you could say something in Spanish-accented English, for example, and have it translated into a British accent.

It has a 150-millisecond delay (about one-sixth of a second), runs directly on a person’s computer (not in the cloud), and can integrate into apps such as Zoom and WhatsApp.

The total delay depends on the app you’re communicating through. Zoom, for example, averages a 50-millisecond delay itself, so someone using the accent translator with that service would experience a total delay of 200 milliseconds.

However, Soderi told IEEE Spectrum that anything below 300 milliseconds is generally imperceptible.
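The arithmetic behind that claim is simple enough to check directly. The sketch below uses only the figures quoted in this article (150 ms for the translator, 50 ms for Zoom, 300 ms as the threshold). The constant and function names are illustrative and do not come from Sanas.

```python
# Figures quoted in the article; names are illustrative only.
TRANSLATOR_DELAY_MS = 150       # accent translator's own delay
ZOOM_DELAY_MS = 50              # Zoom's average delay
PERCEPTION_THRESHOLD_MS = 300   # below this, delay is "generally imperceptible"


def total_delay_ms(translator_ms, app_ms):
    """Total delay is the translator's delay plus the app's own delay."""
    return translator_ms + app_ms


def is_imperceptible(total_ms, threshold_ms=PERCEPTION_THRESHOLD_MS):
    """True if the total delay falls below the quoted threshold."""
    return total_ms < threshold_ms


total = total_delay_ms(TRANSLATOR_DELAY_MS, ZOOM_DELAY_MS)
headroom = PERCEPTION_THRESHOLD_MS - total

print(f"Total delay: {total} ms")
print(f"Imperceptible: {is_imperceptible(total)}")
print(f"Headroom before the threshold: {headroom} ms")
```

By these numbers, an app could add up to another 100 milliseconds on top of Zoom's average before the combined delay reached the 300-millisecond mark.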

The next steps: Sanas has secured $5.5 million in funding, which the students will use to expand their team and further develop the tech. In addition to adding more English accents, they plan to extend accent translation to other languages (Spanish spoken in various accents, for example).

While the students’ personal lives may have inspired them to develop the accent translator, they think it could be a boon to many businesses, particularly those that provide customer service and technical support over the phone — they already have seven such companies piloting the tech.

“There are also creative use cases such as those in entertainment and media where producers can make their films and programs understandable in different parts of the world by matching accents to localities,” Sanas CEO Maxim Serebryakov said.

“We are also exploring how machines can better interpret what people are saying,” he continued. “We’ve only begun to explore the possibilities.”
