Meta has unveiled a first-of-its-kind AI that can translate up to 100 languages, putting the company a step closer to its goal of creating a universal translator.
The language barrier: An estimated 7,000 languages are currently spoken across the globe, but 87% of people know just one or two of them.
Even if you know the most popular language — English — it’s only spoken by 18% of the world, meaning there is still a language barrier between you and the vast majority of people.
This barrier can do more than just make it hard for you to chat with others or order food on vacation. If you don’t speak the local language, you can have trouble getting around, finding and keeping a job, or accessing quality healthcare and education.
A universal translator: Science fiction has already come up with a solution to the language barrier problem. Dubbed a “universal translator,” it’s a device capable of instantly translating anything written or spoken in one language into text or speech in any other.
Meta is on a mission to make such a device for real, with the help of AI.
“The goal here is instantaneous speech-to-speech translation across all languages, even those that are mostly spoken; the ability to communicate with anyone in any language,” said CEO Mark Zuckerberg in 2022. “That’s a superpower that people dreamed of forever and AI is going to deliver that within our lifetimes.”
How it works: Translation apps already exist, but unlike sci-fi’s universal translator, they can’t handle every language, and the translations they produce often contain major errors.
According to Meta, this inaccuracy is partly due to the “cascading” approach of current translation tech, which chains together multiple subsystems.
To translate spoken Spanish into spoken French, for example, the tech might use a speech recognition system to transcribe the Spanish audio into text, a text-to-text translation system to translate that into French, and then a text-to-speech system to turn that into French audio.
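The cascade described above can be sketched in a few lines of Python. This is only an illustration, not Meta’s code or any real translation system: the three stage functions are hypothetical toy stand-ins, and the 95% per-stage accuracy is an assumed number chosen to show how errors can compound when each stage depends on the one before it.

```python
from math import prod
from typing import Callable, Sequence

# A stage takes the previous stage's output and returns its own.
Stage = Callable[[str], str]


def run_cascade(stages: Sequence[Stage], source: str) -> str:
    """Feed the input through each stage in order."""
    result = source
    for stage in stages:
        result = stage(result)
    return result


def cascade_accuracy(stage_accuracies: Sequence[float]) -> float:
    """Chance that every stage is right, ASSUMING errors are independent."""
    return prod(stage_accuracies)


# Hypothetical toy stages; real systems are large models, not lookups.
def speech_to_text_es(audio: str) -> str:
    # Pretend transcription: drop the "es-audio:" label.
    return audio.split(":", 1)[1]


def translate_es_to_fr(text: str) -> str:
    # Pretend translation: a one-word dictionary.
    return {"hola": "bonjour"}.get(text, text)


def text_to_speech_fr(text: str) -> str:
    # Pretend speech synthesis: add a "fr-audio:" label.
    return "fr-audio:" + text


stages = [speech_to_text_es, translate_es_to_fr, text_to_speech_fr]
output = run_cascade(stages, "es-audio:hola")

# Assumed, illustrative accuracies; not measurements of any real system.
overall = cascade_accuracy([0.95, 0.95, 0.95])
```

Under those assumed numbers, three stages that are each right 95% of the time would all be right only about 86% of the time, and a mistake made in the first stage is passed along to every stage after it.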
Meta’s newly announced translation tool, SeamlessM4T, can recognize audio or text in nearly 100 languages and translate it into text in any of those languages or speech in 36 languages, including English. And unlike other translation tools, it relies on just one AI model to complete all of these tasks.
According to Meta, this makes it “the first all-in-one multilingual multimodal AI translation and transcription model,” and the single-model approach yields more accurate and faster translation.
“Compared to approaches using separate models, SeamlessM4T’s single system approach reduces errors and delays, increasing the efficiency and quality of the translation process,” Meta researchers wrote in a white paper.
The cold water: SeamlessM4T may be an improvement on existing systems, but it’s still not 100% accurate and is limited to just a fraction of the many languages spoken worldwide. Meta may be closer to achieving the dream of a universal translator, but it isn’t there yet.
The white paper also notes that SeamlessM4T exhibited some signs of gender bias in its translations. In languages with grammatical gender, it might offer the male form of the word “doctor,” for example, even when the context of the sentence makes it clear that the doctor is a woman. This is because the data used to train the AI wasn’t free of gender bias; better training data and filters could help alleviate the issue.
Additionally, mistakes made by SeamlessM4T introduced “toxic elements” into about 0.16% of translations. Though Meta says that rate is lower than that of existing translation tech, it’s still an area that could be improved.
Looking ahead: Meta hasn’t announced any plans to turn SeamlessM4T into a commercial product, but it has released a demo tool that allows users to say something in one language and have it translated into speech and text in up to three others.
It has also made the AI model and its code available online so that other researchers can build on the tech.
“This is only the latest step in our ongoing effort to build AI-powered technology that helps connect people across languages,” wrote Meta in a blog post. “In the future, we want to explore how this foundational model can enable new communication capabilities — ultimately bringing us closer to a world where everyone can be understood.”