Google announced that its Translate service is adding 110 new languages, on top of the 133 it already supported.
This expansion is a big step forward, bringing the total number of languages the service supports to 243. Google's PaLM 2 AI model helps the translation service learn these new languages.
PaLM 2 is particularly effective at learning languages that are closely related to one another, such as languages close to Hindi, like Awadhi and Marwadi, and French creoles, like Seychellois Creole and Mauritian Creole, said Isaac Caswell, a software engineer at Google.
Caswell said the list of new languages supported by Translate includes Cantonese, which has long been one of the most requested languages for Google Translate.
“Because Cantonese often overlaps with Mandarin Chinese in writing, it is difficult to find data and train models,” Caswell said, adding that about a quarter of the new languages come from Africa.
The new languages include Afar, Cantonese, Manx, NKo, Punjabi (Shahmukhi), Tamazight (Amazigh), and Tok Pisin.
The added languages represent more than 614 million native speakers, or approximately 8% of the global population, the company said.
Google notes that these languages are in different stages of use. Some have around 100 million speakers, while others have no active speakers at all and are the subject of preservation efforts.
The new languages include those spoken by small indigenous communities as well as widely spoken languages. Google is trying to help preserve some languages, such as Manx, which almost disappeared after the death of its last native speaker in 1974.
Google said it took into account factors such as diversity, regional dialects and different spelling standards when adding language support.
“Our approach is to prioritize the most frequently used variants in each language,” Caswell said. “For example, Romani is a language that has multiple dialects across Europe.”
The addition of 110 languages to Google Translate is part of the company's plan to support 1,000 languages using artificial intelligence, which it announced in 2022.
In the same year, the company added support for 24 languages spoken by more than 300 million people, using a technique called “zero-shot machine translation.”