In context: Machine translation technology has come a long way since its inception. Whereas the likes of Google Translate were once rough, unreliable, and only useful for the most basic translations, nowadays, they can be frighteningly accurate, thanks to the power of AI. However, some archaic translation methods still persist.
For example, over on Facebook, sentences are first translated from a base language to English and then from English to a target language. There are several reasons for this, one of which is the lack of useful AI training data for non-English language-to-language translations.
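To make the two-step routing concrete, here is a minimal Python sketch of an English-pivot translation next to a direct one. The tiny word lists are hypothetical, hand-written entries for illustration only, not anything taken from Facebook's systems. They show one way meaning can get lost: French distinguishes informal "tu" from formal "vous," but both collapse into English "you" before reaching Spanish.

```python
# Toy lexicons: hypothetical, hand-written entries for illustration only.
FR_TO_EN = {"tu": "you", "vous": "you"}
EN_TO_ES = {"you": "tú"}
FR_TO_ES = {"tu": "tú", "vous": "usted"}


def translate(word, lexicon):
    """Look up a single word; return it unchanged if the lexicon lacks it."""
    return lexicon.get(word, word)


def translate_via_english(word):
    """English-centric route: French -> English -> Spanish."""
    return translate(translate(word, FR_TO_EN), EN_TO_ES)


def translate_direct(word):
    """Direct route: French -> Spanish, with no English step."""
    return translate(word, FR_TO_ES)


if __name__ == "__main__":
    for word in ("tu", "vous"):
        print(word, "| pivot:", translate_via_english(word),
              "| direct:", translate_direct(word))
```

In this toy setup, the pivot route maps both French words to the same Spanish word, while the direct route keeps the formal and informal forms apart. Real translation models work on whole sentences with learned parameters rather than lookup tables, so treat this purely as an illustration of the routing.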
Plenty of people translate words and phrases from English to French or French to English (thus creating plenty of usable data), but far fewer translate content from, say, French to Spanish or Spanish to German.
This makes training an AI to understand the intricacies of these language-to-language translations quite a difficult process. However, according to a newly released Facebook blog post, the social media giant has finally tackled this problem and come up with a solution.
That solution comes in the form of "M2M-100," which Facebook describes as the first "multilingual machine translation model" that can translate between "any pair" of 100 languages without relying on English data. If you doubt its effectiveness, the model is entirely open source, so you can inspect it yourself.
Facebook says its multilingual translation model "better [preserves] meaning" compared to what it calls "English-centric" translation systems. The company claims that M2M-100 outperforms such systems by "10 points" on BLEU, a widely used metric for evaluating machine translation.
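For readers unfamiliar with BLEU, the following is a simplified, single-sentence sketch of how the score is commonly computed: clipped n-gram precision for n-gram lengths 1 through 4, combined by geometric mean and scaled by a brevity penalty. It assumes whitespace tokenization, a single reference, and no smoothing, and the function names are our own. Published results are normally computed over a whole test corpus with standardized tooling, so this is not the exact procedure behind Facebook's figure.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU on a 0-100 scale (no smoothing)."""
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0

    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count at its count in the reference.
        clipped = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # without smoothing, one zero precision zeroes the score
        log_precision_sum += math.log(clipped / total)

    # Penalize candidates that are shorter than the reference.
    if len(cand) > len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))

    return 100 * brevity_penalty * math.exp(log_precision_sum / max_n)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    print(bleu("the cat sat on the mat", reference))
    print(bleu("the cat sat on the mat today", reference))
```

A candidate identical to the reference scores 100 and one sharing no words with it scores 0, which gives a rough sense of what a claimed 10-point gap means on that scale.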
This project has been years in the making, according to Facebook, and though there's still plenty of room for improvement, the company is satisfied with the progress it has achieved thus far.
It's unclear when or if M2M-100 will roll out directly to Facebook, or whether it already has. We'll be reaching out for clarification and will update this article when we find out.
Masthead credit: Chinnapong