Text Generation
Generating text is the task of producing new text. These models can, for example, fill in incomplete text or paraphrase.
Input
Once upon a time,
Output
Once upon a time, we knew that our ancestors were on the verge of extinction. The great explorers and poets of the Old World, from Alexander the Great to Chaucer, are dead and gone. A good many of our ancient explorers and poets have
About Text Generation
This task covers guides on both text-generation and text-to-text generation models. Popular large language models that are used for chats or following instructions are also covered in this task. You can find the list of selected open-source large language models here, ranked by their performance scores.
Use Cases
Instruction Models
A model trained for text generation can later be adapted to follow instructions. One of the most widely used open-source models for instruction following is OpenAssistant, which you can try at Hugging Chat.
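Below is a minimal sketch of prompting an instruction-tuned model through the standard text-generation pipeline. The checkpoint name and the prompt template are illustrative assumptions; check the model card of whichever instruction model you pick for its expected prompt format.
from transformers import pipeline

# Checkpoint name is an example; any instruction-tuned causal LM works the same way
generator = pipeline("text-generation", model="OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5")
# OpenAssistant-style prompt template (assumed; see the model card)
prompt = "<|prompter|>Explain what a causal language model is in one sentence.<|endoftext|><|assistant|>"
outputs = generator(prompt, max_new_tokens=60, do_sample=True)
print(outputs[0]["generated_text"])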
Code Generation
A Text Generation model, also known as a causal language model, can be trained on code from scratch to help programmers with their repetitive coding tasks. One of the most popular open-source models for code generation is StarCoder, which can generate code in 80+ languages. You can try it here.
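As a rough sketch, a code model is used through the same text-generation pipeline as any other causal language model. The StarCoder checkpoint below is gated, so access to it is assumed; any other code-generation checkpoint you can load works the same way.
from transformers import pipeline

# Any causal LM trained on code can be substituted here
code_generator = pipeline("text-generation", model="bigcode/starcoder")
prompt = 'def fibonacci(n):\n    """Return the n-th Fibonacci number."""\n'
completion = code_generator(prompt, max_new_tokens=64)
print(completion[0]["generated_text"])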
Story Generation
A story generation model can receive an input like "Once upon a time" and proceed to create a story-like text based on those first words. You can try this application, which contains a model trained for story generation by MosaicML.
If the data your generative model was trained on is different from the data in your use case, you can train a causal language model from scratch. Learn how to do it in the free transformers course!
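The sketch below outlines the idea under simple assumptions: a small, freshly initialized GPT-2-style configuration and the wikitext dataset as stand-in training data. Swap in your own corpus and tune the configuration and hyperparameters for real use; the course covers this in depth.
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    GPT2Config,
    GPT2LMHeadModel,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token

# A fresh (untrained) model with a deliberately small configuration
config = GPT2Config(vocab_size=tokenizer.vocab_size, n_layer=4, n_head=4, n_embd=256)
model = GPT2LMHeadModel(config)

# Stand-in corpus; replace with data that matches your use case
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train[:1%]")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)
tokenized = tokenized.filter(lambda example: len(example["input_ids"]) > 0)

# mlm=False gives the causal (next-token prediction) objective
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="clm-from-scratch", num_train_epochs=1),
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()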
Task Variants
Completion Generation Models
A popular variant of Text Generation models predicts the next word given a sequence of words, building up a longer text word by word. These models can, for example:
- Given an incomplete sentence, complete it.
- Continue a story given the first sentences.
- Provided a code description, generate the code.
The most popular models for this task are GPT-based models (such as GPT-2). These models are trained on data that has no labels, so you just need plain text to train your own model. You can train GPT models to generate a wide variety of documents, from code to stories.
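To make the word-by-word idea concrete, here is a minimal sketch of greedy next-token decoding with GPT-2: at each step the most likely next token is appended to the prompt. In practice, pipeline() or model.generate() handle this (and smarter decoding strategies) for you.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

input_ids = tokenizer("Once upon a time,", return_tensors="pt").input_ids
for _ in range(20):  # generate 20 more tokens, one at a time
    logits = model(input_ids).logits
    next_token = logits[:, -1, :].argmax(dim=-1, keepdim=True)  # greedy choice
    input_ids = torch.cat([input_ids, next_token], dim=-1)

print(tokenizer.decode(input_ids[0]))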
Text-to-Text Generation Models
These models are trained to learn the mapping between a pair of texts (e.g. translation from one language to another). The most popular variants of these models are T5, T0 and BART. Because Text-to-Text models are trained with multi-tasking capabilities, they can accomplish a wide range of tasks, including summarization, translation, and text classification.
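As a small illustration of that multi-tasking behavior, the sketch below uses the t5-small checkpoint, which selects the task from a prefix in the input text. The exact prefixes a model understands depend on how it was trained, so check the model card.
from transformers import pipeline

text2text = pipeline("text2text-generation", model="t5-small")
# One checkpoint, two tasks, selected by the prompt prefix
text2text("summarize: The tower is 324 metres tall, about the same height as an 81-storey building, and the tallest structure in Paris.")
text2text("translate English to German: The house is wonderful.")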
Inference
You can use the 🤗 Transformers library's text-generation pipeline to do inference with Text Generation models. It takes an incomplete text and returns multiple outputs with which the text can be completed.
from transformers import pipeline

generator = pipeline('text-generation', model='gpt2')
# Ask for three alternative continuations, each up to 30 tokens long
generator("Hello, I'm a language model", max_length=30, num_return_sequences=3)
## [{'generated_text': "Hello, I'm a language modeler. So while writing this, when I went out to meet my wife or come home she told me that my"},
## {'generated_text': "Hello, I'm a language modeler. I write and maintain software in Python. I love to code, and that includes coding things that require writing"}, ...
Text-to-Text generation models have a separate pipeline called text2text-generation. This pipeline takes an input that contains both the sentence and the task to perform, and returns the output of the accomplished task.
from transformers import pipeline

text2text_generator = pipeline("text2text-generation")
# Question answering phrased as a text-to-text task
text2text_generator("question: What is 42 ? context: 42 is the answer to life, the universe and everything")
## [{'generated_text': 'the answer to life, the universe and everything'}]
# Translation, with the task described inside the prompt
text2text_generator("translate from English to French: I'm very happy")
## [{'generated_text': 'Je suis très heureux'}]
The T0 model is even more robust and flexible on task prompts.
text2text_generator = pipeline("text2text-generation", model = "bigscience/T0")
text2text_generator("Is the word 'table' used in the same meaning in the two previous sentences? Sentence A: you can leave the books on the table over there. Sentence B: the tables in this book are very hard to read." )
## [{"generated_text": "No"}]
text2text_generator("A is the son's of B's brother. What is the family relationship between A and B?")
## [{"generated_text": "brother"}]
text2text_generator("Is this review positive or negative? Review: this is the best cast iron skillet you will ever buy")
## [{"generated_text": "positive"}]
text2text_generator("Reorder the words in this sentence: justin and name bieber years is my am I 27 old.")
## [{"generated_text": "Justin Bieber is my name and I am 27 years old"}]
Useful Resources
Would you like to learn more about the topic? Awesome! Here you can find some curated resources that you may find helpful!
- Course Chapter on Training a causal language model from scratch
- T0 Discussion with Victor Sanh
- Hugging Face Course Workshops: Pretraining Language Models & CodeParrot
- Training CodeParrot 🦜 from Scratch
- How to generate text: using different decoding methods for language generation with Transformers
- Guiding Text Generation with Constrained Beam Search in 🤗 Transformers
- Code generation with Hugging Face
- 🌸 Introducing The World's Largest Open Multilingual Language Model: BLOOM 🌸
- The Technology Behind BLOOM Training
- Faster Text Generation with TensorFlow and XLA
- Assisted Generation: a new direction toward low-latency text generation
- Introducing RWKV - An RNN with the advantages of a transformer
- Creating a Coding Assistant with StarCoder
- StarCoder: A State-of-the-Art LLM for Code
Metrics
- Cross Entropy: a metric that calculates the difference between two probability distributions. Each probability distribution is the distribution of predicted words.
- Perplexity: the exponential of the cross-entropy loss. It evaluates the probabilities assigned to the next word by the model. Lower perplexity indicates better performance.
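The two metrics are directly related: perplexity is just the exponential of the cross-entropy loss. A minimal sketch of computing both, using GPT-2 only as a convenient example model:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Once upon a time, there was a brave knight.", return_tensors="pt")
with torch.no_grad():
    # Passing labels makes the model return its mean cross-entropy loss
    outputs = model(**inputs, labels=inputs["input_ids"])

cross_entropy = outputs.loss
perplexity = torch.exp(cross_entropy)  # lower is better
print(f"cross-entropy: {cross_entropy.item():.3f}, perplexity: {perplexity.item():.3f}")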