Using a Simple Google Translator on a New Language
Introduction to the Competition
The goal of this challenge is to build a machine translation model that translates sentences from Yorùbá to English in several domains, such as news articles, daily conversations, spoken dialog transcripts and books. Your solution will be judged by how semantically similar your predicted translation is to the reference translation.
The translation models developed will assist human translators in their jobs, help English speakers to have better communication with native speakers of Yorùbá, and improve the automatic translation of Yorùbá web pages to English language.
This competition is one of five NLP challenges we will be hosting on Zindi as part of AI4D’s ongoing African language NLP project, and is a continuation of the African language dataset challenges we hosted earlier this year. You can read more about the work here.
Installing the deep_translator library
You can find more information on deep_translator here.
- This library is a free and unlimited tool for translating between different languages in a simple way using multiple translators.
- Free software: MIT license
When you should use it
- If you want to translate text using Python
- If you want to translate from a file
- If you want to get translations from many sources and not only one
- If you want to automate translations
- If you want to compare different translations
- If you want to detect language automatically
For Detecting Language
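The original code cell for this step is not shown, so here is a minimal sketch of how detection could look. It assumes the package was installed with `pip install deep-translator` and uses its `single_detection` helper, which, as far as I know, calls an external detection service and needs an API key. The `detect_language` wrapper and its `detector` parameter are hypothetical names added here so the function can be tried without network access.

```python
def detect_language(text, api_key, detector=None):
    """Return the language code detected for `text`.

    `detector` is a hypothetical hook (not part of deep_translator) that lets
    you pass in a stand-in function; by default the library helper is used.
    """
    if detector is None:
        # Imported lazily so this module loads even without deep_translator.
        from deep_translator import single_detection
        detector = single_detection
    return detector(text, api_key=api_key)
```

A call would look like `detect_language("Báwo ni?", api_key="YOUR_API_KEY")`, where the key is a placeholder you need to replace with your own.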
Applying the function to Test.csv
Using apply method for translation
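The code for these two steps is missing from the text, so the following is only a sketch of how they might fit together. It assumes that `Test.csv` has a text column named `Yoruba`, that the output column should be called `English`, and that the result is written to `submission.csv`. All three names are assumptions, as are the helper names `translate_sentence`, `translate_column` and `main`.

```python
def translate_sentence(text, translator):
    """Translate one sentence; return "" for missing or blank values."""
    # NaN cells from pandas are floats, so they are caught by the type check.
    if not isinstance(text, str) or not text.strip():
        return ""
    return translator.translate(text)


def translate_column(df, translator, source_col="Yoruba", target_col="English"):
    """Return a copy of `df` with a translated column added via `apply`."""
    out = df.copy()
    out[target_col] = out[source_col].apply(
        lambda text: translate_sentence(text, translator)
    )
    return out


def main():
    # Third-party imports are kept here so the helpers above stay importable.
    import pandas as pd
    from deep_translator import GoogleTranslator

    translator = GoogleTranslator(source="yo", target="en")
    test = pd.read_csv("Test.csv")          # assumed file location
    translated = translate_column(test, translator)
    translated.to_csv("submission.csv", index=False)  # assumed output name
```

Calling `main()` sends one request per row, so it can be slow on a large file. The guard for blank values is there because the translator rejects empty input.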
Using batch function
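The batch cell is also missing, so this is a hedged sketch rather than the original code. It relies on the `translate_batch` method that deep_translator translators expose, and splits the input into smaller chunks so a failure does not lose the whole run. The helper names `chunked` and `translate_in_batches` and the default chunk size of 50 are my own assumptions.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def translate_in_batches(texts, translator, batch_size=50):
    """Translate a list of strings chunk by chunk using `translate_batch`."""
    results = []
    for chunk in chunked(texts, batch_size):
        # Each call returns the translations in the same order as the chunk.
        results.extend(translator.translate_batch(chunk))
    return results
```

With a real translator this could be used as `translate_in_batches(list_of_sentences, GoogleTranslator(source="yo", target="en"))`, where `list_of_sentences` is a plain Python list of non-empty strings.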
Other translators are available, but Google is by far the best. For experimental purposes we can also use another translator, MyMemory Translator. The remaining options are:
- Microsoft translator
- Pons translator
- Linguee translator
- Yandex translator
- QCRI translator
- DeepL translator
As you can see, some of them have limitations and some require a private API key, which can be quite tedious when you want a quick fix, so I suggest using Google as your go-to tool for translation.
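To compare Google with MyMemory on the same sentence, something like the sketch below could be used. The helper names `compare_translators` and `build_translators` are hypothetical, and the language arguments are assumptions: the codes and names accepted by `MyMemoryTranslator` have changed between deep_translator versions, so check what your installed version supports before relying on them.

```python
def compare_translators(text, translators):
    """Translate `text` with every translator in the `translators` dict.

    `translators` maps a label to any object with a `translate(text)` method.
    Failures are recorded instead of raised, so one broken service does not
    hide the results of the others.
    """
    results = {}
    for name, translator in translators.items():
        try:
            results[name] = translator.translate(text)
        except Exception as exc:
            results[name] = f"<failed: {exc}>"
    return results


def build_translators(google_langs=("yo", "en"),
                      mymemory_langs=("yoruba", "english")):
    """Create the two translators; language arguments are assumptions."""
    from deep_translator import GoogleTranslator, MyMemoryTranslator

    return {
        "google": GoogleTranslator(source=google_langs[0], target=google_langs[1]),
        "mymemory": MyMemoryTranslator(source=mymemory_langs[0],
                                       target=mymemory_langs[1]),
    }
```

A typical call would be `compare_translators(sentence, build_translators())`, which returns a dictionary with one entry per translator.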
- This was my first try at dealing with a rare language. I used the Google tool for translation, which is not allowed in the competition but is useful for other projects and experiments.
- I will be uploading all my solutions that use a pre-trained model from Hugging Face, and showing how to use other tools to get better results.
- Right now I am in 4th position with BLEU: 0.458
- This notebook gave me a score of 0.38 which was not bad as I haven't fine-tuned it.
- If you want to improve the score, you can use a pre-trained model such as mBART by Facebook or a Helsinki-NLP model.
- mBART gave me a score of 0.450 and Helsinki-NLP gave me a score of 0.456.
To access the data, you can join the competition here.
All rights are reserved by Zindi and AI4D; contact them for more information on licensing.