Translation

Translation is the task of converting text from one language to another.

Input

My name is Omar and I live in Zürich.

Translation Model
Output

Mein Name ist Omar und ich wohne in Zürich.

About Translation

Use Cases

You can find over a thousand Translation models on the Hub, but sometimes you might not find a model for the language pair you are interested in. When this happens, you can take a pretrained multilingual Translation model such as mBART and train it further on your own data, a process called fine-tuning.
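As a rough sketch of the first step of fine-tuning, the snippet below prepares parallel sentences for a multilingual checkpoint. Several things here are assumptions rather than anything prescribed above: the record layout ({"translation": {"en": ..., "de": ...}}), the helper names split_pairs and tokenize_pairs, the checkpoint facebook/mbart-large-50 and the English-German pair. Training itself is left out.

```python
def split_pairs(examples, source_lang="en", target_lang="de"):
    """Split records like {"translation": {"en": ..., "de": ...}} into two aligned lists."""
    sources = [ex["translation"][source_lang] for ex in examples]
    targets = [ex["translation"][target_lang] for ex in examples]
    return sources, targets


def tokenize_pairs(sources, targets, checkpoint="facebook/mbart-large-50", max_length=128):
    """Tokenize aligned sentence pairs (assumed checkpoint; downloads the tokenizer on first use)."""
    from transformers import AutoTokenizer  # imported lazily; requires `transformers`

    # mBART-50 identifies languages with codes such as "en_XX" and "de_DE".
    tokenizer = AutoTokenizer.from_pretrained(checkpoint, src_lang="en_XX", tgt_lang="de_DE")
    # `text_target` tokenizes the translations and stores them under "labels".
    return tokenizer(sources, text_target=targets, max_length=max_length, truncation=True)


# Hypothetical one-record dataset, reusing the example sentence from this page.
examples = [{"translation": {"en": "My name is Omar.", "de": "Mein Name ist Omar."}}]
sources, targets = split_pairs(examples)
print(sources, targets)
```

The tokenized output of tokenize_pairs could then be fed to a sequence-to-sequence training loop. The two helpers are kept separate so the data handling can be checked without downloading anything.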

Multilingual conversational agents

Translation models can be used to build conversational agents across different languages. This can be done in two ways.

  • Translate the dataset to a new language. You can translate a dataset of intents (inputs) and responses to the target language. You can then train a new intent classification model with this new dataset. This allows you to proofread responses in the target language and have better control of the chatbot's outputs.
  • Translate the input and output of the agent. You can apply a Translation model to user inputs so that the chatbot can process them, and then translate the chatbot's output back into the user's language. This approach might be less reliable, because the chatbot can generate responses that were not defined in advance and so cannot be proofread.
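The second approach can be sketched as a thin wrapper around an existing chatbot. This is a minimal illustration, not a recommended design: make_multilingual_agent is a hypothetical helper, and the translators and chatbot are toy stand-ins (lookup tables and a fixed reply) so that the example runs without any model.

```python
from typing import Callable

# A translator or chatbot here is any function that maps text to text.
TextFn = Callable[[str], str]


def make_multilingual_agent(to_agent_lang: TextFn, chatbot: TextFn, to_user_lang: TextFn) -> TextFn:
    """Wrap a monolingual chatbot so it can answer in the user's language."""

    def reply(user_message: str) -> str:
        agent_input = to_agent_lang(user_message)  # user language -> chatbot language
        agent_output = chatbot(agent_input)        # chatbot runs in its own language
        return to_user_lang(agent_output)          # chatbot language -> user language

    return reply


# Toy stand-ins (hypothetical): lookup tables instead of real Translation models.
DE_TO_EN = {"Wie alt bist du?": "How old are you?"}
EN_TO_DE = {"I am a chatbot.": "Ich bin ein Chatbot."}

agent = make_multilingual_agent(
    to_agent_lang=lambda text: DE_TO_EN.get(text, text),
    chatbot=lambda text: "I am a chatbot.",
    to_user_lang=lambda text: EN_TO_DE.get(text, text),
)
print(agent("Wie alt bist du?"))  # Ich bin ein Chatbot.
```

With real models, each translator could be a function that calls a translation pipeline and returns the 'translation_text' field of its first result.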

Inference

You can use the 🤗 Transformers library with the translation_xx_to_yy pipeline task, where xx is the source language code and yy is the target language code. The default model for this pipeline is t5-base, which under the hood prepends a prefix naming the task to the input, e.g. “translate English to French: ”.

from transformers import pipeline

# With no model argument, the pipeline falls back to its default checkpoint.
en_fr_translator = pipeline("translation_en_to_fr")
en_fr_translator("How old are you?")
# [{'translation_text': ' quel âge êtes-vous?'}]
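To make the task prefix concrete, the sketch below builds the kind of string a T5 model receives for translation. The helper add_task_prefix and the LANGUAGE_NAMES mapping are hypothetical and exist only for illustration. The pipeline adds the prefix for you, so you would not normally write this yourself.

```python
# Language names for a few codes; this mapping is illustrative, not exhaustive.
LANGUAGE_NAMES = {"en": "English", "fr": "French", "de": "German", "ro": "Romanian"}


def add_task_prefix(text: str, source: str, target: str) -> str:
    """Build a T5-style translation input such as 'translate English to French: ...'."""
    return f"translate {LANGUAGE_NAMES[source]} to {LANGUAGE_NAMES[target]}: {text}"


print(add_task_prefix("How old are you?", "en", "fr"))
# translate English to French: How old are you?
```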

If you’d like to use a specific model checkpoint trained for one particular language pair, you can pass it directly to the translation pipeline.

from transformers import pipeline

model_checkpoint = "Helsinki-NLP/opus-mt-en-fr"
translator = pipeline("translation", model=model_checkpoint)
translator("How are you?")
# [{'translation_text': 'Comment allez-vous ?'}]

Useful Resources

Would you like to learn more about Translation? Great! Here you can find some curated resources that you may find helpful!

Notebooks

Scripts for training

Documentation

Compatible libraries

Metrics for Translation
bleu
BLEU is calculated by counting the single tokens and sequences of consecutive tokens shared between the generated sequence and the reference. A sequence of n consecutive tokens is called an “n-gram”: a unigram is a single token, a bigram is a pair of tokens, and so on. The score ranges from 0 to 1, where 1 means the translation matches the reference perfectly and 0 means it does not match at all.
sacrebleu
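The n-gram counting behind BLEU can be illustrated with a small sketch. This is not a full BLEU implementation: it computes only the clipped n-gram precision for a single n, and it assumes whitespace tokenization. Full BLEU combines the precisions for several values of n with a geometric mean and applies a brevity penalty. The function names are chosen for this example.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_precision(candidate, reference, n):
    """Clipped n-gram precision: share of candidate n-grams also found in the reference."""
    cand_counts = Counter(ngrams(candidate.split(), n))
    ref_counts = Counter(ngrams(reference.split(), n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    # Each n-gram is credited at most as many times as it occurs in the reference.
    matched = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return matched / total


reference = "the cat is on the mat"
candidate = "the cat sat on the mat"
print(ngram_precision(candidate, reference, 1))  # 5 of 6 unigrams match
print(ngram_precision(candidate, reference, 2))  # 3 of 5 bigrams match
```

For reported scores, use an established implementation such as sacrebleu rather than a hand-written one, since tokenization choices change the result.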