Arbeit 4.0 und KI: die Zukunft ist jetzt!

Meta’s new AI model can translate speech from more than 100 languages

by Scott J Mulligan
January 15, 2025

Meta has released a new AI model that can translate speech from 101 different languages. It represents a step toward real-time, simultaneous interpretation, where words are translated as soon as they come out of someone’s mouth. 

Typically, translation models for speech use a multistep approach. First they transcribe speech into text. Then they translate that text into text in another language. Finally, that translated text is turned into speech in the new language. This method can be inefficient, and at each step, errors and mistranslations can creep in. But Meta’s new model, called SeamlessM4T, enables more direct translation from speech in one language to speech in another. The model is described in a paper published today in Nature.
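To see why errors creeping in at each step matter, here is a rough back-of-the-envelope sketch in Python. The per-stage accuracy figures are hypothetical, chosen only for illustration, and the calculation assumes that errors at each stage are independent, which real systems do not guarantee.

```python
from math import prod

def cascade_accuracy(stage_accuracies):
    """Chance that every stage is correct, assuming independent errors."""
    return prod(stage_accuracies)

# Hypothetical accuracies (not from the paper) for the three stages:
# speech-to-text, text-to-text translation, text-to-speech.
stages = [0.95, 0.90, 0.98]
end_to_end = cascade_accuracy(stages)
print(f"end-to-end accuracy: {end_to_end:.3f}")
```

Under these made-up numbers, the chain as a whole is less reliable than its weakest stage, which is the intuition behind translating speech more directly.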

Seamless can translate text with 23% more accuracy than the top existing models. And although another model, Google’s AudioPaLM, can technically translate more languages—113 of them, versus 101 for Seamless—it can translate them only into English. SeamlessM4T can translate into 36 other languages.

The key is a process called parallel data mining, which finds instances in crawled web data where the sound in a video or audio clip matches a subtitle in another language. The model learned to associate those sounds in one language with the matching pieces of text in another. This opened up a whole new trove of translation examples for the model.
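The article does not spell out how the matching works. One common way to frame this kind of mining is as a similarity search: represent each audio clip and each subtitle as a vector, then pair the ones that sit close together. The toy sketch below assumes that framing. The vectors, the names, and the 0.8 threshold are all invented for illustration and are not Meta's actual procedure.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def mine_pairs(audio_vecs, text_vecs, threshold=0.8):
    """Pair each audio clip with its most similar subtitle, if similar enough."""
    pairs = []
    for audio_id, a in audio_vecs.items():
        best_id, best_score = None, threshold
        for text_id, t in text_vecs.items():
            score = cosine(a, t)
            if score >= best_score:
                best_id, best_score = text_id, score
        if best_id is not None:
            pairs.append((audio_id, best_id))
    return pairs

# Made-up 3-dimensional vectors; a real system would get these from trained encoders.
audio_vecs = {
    "clip_1": [0.9, 0.1, 0.0],
    "clip_2": [0.0, 0.2, 0.9],
    "clip_3": [0.5, 0.5, 0.5],  # resembles neither subtitle closely
}
text_vecs = {
    "subtitle_a": [1.0, 0.0, 0.0],
    "subtitle_b": [0.0, 0.0, 1.0],
}
mined = mine_pairs(audio_vecs, text_vecs)
print(mined)
```

Clips that do not clear the threshold are simply dropped, which trades some coverage for cleaner training pairs.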

“Meta has done a great job having a breadth of different things they support, like text-to-speech, speech-to-text, even automatic speech recognition,” says Chetan Jaiswal, a professor of computer science at Quinnipiac University, who was not involved in the research. “The mere number of languages they are supporting is a tremendous achievement.”

Human translators are still a vital part of the translation process, the researchers say in the paper, because they can grapple with diverse cultural contexts and make sure the same meaning is conveyed from one language into another. This step is important, says Lynne Bowker of the University of Ottawa’s School of Translation & Interpretation, who didn’t work on Seamless. “Languages are a reflection of cultures, and cultures have their own ways of knowing things,” she says. 

When it comes to applications like medicine or law, machine translations need to be thoroughly checked by a human, she says. If not, misunderstandings can result. For example, when Google Translate was used to translate public health information about the covid-19 vaccine from the Virginia Department of Health in January 2021, it translated “not mandatory” in English into “not necessary” in Spanish, changing the whole meaning of the message.

AI models have many more examples to train on in some languages than in others. This means current speech-to-speech models may be able to translate a language like Greek into English, where there may be many examples, but cannot translate from Swahili to Greek. The team behind Seamless aimed to solve this problem by pre-training the model on millions of hours of spoken audio in different languages. This pre-training allowed it to recognize general patterns in language, making it easier to process less widely spoken languages because it already had some baseline for what spoken language is supposed to sound like.

The system is open-source, which the researchers hope will encourage others to build upon its current capabilities. But some are skeptical of how useful it may be compared with available alternatives. “Google’s translation model is not as open-source as Seamless, but it’s way more responsive and fast, and it doesn’t cost anything as an academic,” says Jaiswal.

The most exciting thing about Meta’s system is that it points to the possibility of instant interpretation across languages in the not-too-distant future—like the Babel fish in Douglas Adams’ cult novel The Hitchhiker’s Guide to the Galaxy. SeamlessM4T is faster than existing models but still not instant. That said, Meta claims to have a newer version of Seamless that’s as fast as human interpreters. 

“While having this kind of delayed translation is okay and useful, I think simultaneous translation will be even more useful,” says Kenny Zhu, director of the Arlington Computational Linguistics Lab at the University of Texas at Arlington, who is not affiliated with the new research.
