# T5
[T5](https://huggingface.co/papers/1910.10683) is a encoder-decoder transformer available in a range of sizes from 60M to 11B parameters. It is designed to handle a wide range of NLP tasks by treating them all as text-to-text problems. This eliminates the need for task-specific architectures because T5 converts every NLP task into a text generation task.
To formulate every task as text generation, each task is prepended with a task-specific prefix (e.g., translate English to German: ..., summarize: ...). This enables T5 to handle tasks like translation, summarization, question answering, and more.
You can find all official T5 checkpoints under the [T5](https://huggingface.co/collections/google/t5-release-65005e7c520f8d7b4d037918) collection.
> [!TIP]
> Click on the T5 models in the right sidebar for more examples of how to apply T5 to different language tasks.
The example below demonstrates how to generate text with [`Pipeline`], [`AutoModel`], and how to translate with T5 from the command line.
```py
import torch
from transformers import pipeline
pipeline = pipeline(
task="text2text-generation",
model="google-t5/t5-base",
torch_dtype=torch.float16,
device=0
)
pipeline("translate English to French: The weather is nice today.")
```
```py
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained(
"google-t5/t5-base"
)
model = AutoModelForSeq2SeqLM.from_pretrained(
"google-t5/t5-base",
torch_dtype=torch.float16,
device_map="auto"
)
input_ids = tokenizer("translate English to French: The weather is nice today.", return_tensors="pt").to("cuda")
output = model.generate(**input_ids, cache_implementation="static")
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
```bash
echo -e "translate English to French: The weather is nice today." | transformers run --task text2text-generation --model google-t5/t5-base --device 0
```
Quantization reduces the memory burden of large models by representing the weights in a lower precision. Refer to the [Quantization](../quantization/overview) overview for more available quantization backends.
The example below uses [torchao](../quantization/torchao) to only quantize the weights to int4.
```py
# pip install torchao
import torch
from transformers import TorchAoConfig, AutoModelForSeq2SeqLM, AutoTokenizer
quantization_config = TorchAoConfig("int4_weight_only", group_size=128)
model = AutoModelForSeq2SeqLM.from_pretrained(
"google/t5-v1_1-xl",
torch_dtype=torch.bfloat16,
device_map="auto",
quantization_config=quantization_config
)
tokenizer = AutoTokenizer.from_pretrained("google/t5-v1_1-xl")
input_ids = tokenizer("translate English to French: The weather is nice today.", return_tensors="pt").to("cuda")
output = model.generate(**input_ids, cache_implementation="static")
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
## Notes
- You can pad the encoder inputs on the left or right because T5 uses relative scalar embeddings.
- T5 models need a slightly higher learning rate than the default used in [`Trainer`]. Typically, values of `1e-4` and `3e-4` work well for most tasks.
## T5Config
[[autodoc]] T5Config
## T5Tokenizer
[[autodoc]] T5Tokenizer
- build_inputs_with_special_tokens
- get_special_tokens_mask
- create_token_type_ids_from_sequences
- save_vocabulary
## T5TokenizerFast
[[autodoc]] T5TokenizerFast
## T5Model
[[autodoc]] T5Model
- forward
## T5ForConditionalGeneration
[[autodoc]] T5ForConditionalGeneration
- forward
## T5EncoderModel
[[autodoc]] T5EncoderModel
- forward
## T5ForSequenceClassification
[[autodoc]] T5ForSequenceClassification
- forward
## T5ForTokenClassification
[[autodoc]] T5ForTokenClassification
- forward
## T5ForQuestionAnswering
[[autodoc]] T5ForQuestionAnswering
- forward
## TFT5Model
[[autodoc]] TFT5Model
- call
## TFT5ForConditionalGeneration
[[autodoc]] TFT5ForConditionalGeneration
- call
## TFT5EncoderModel
[[autodoc]] TFT5EncoderModel
- call
## FlaxT5Model
[[autodoc]] FlaxT5Model
- __call__
- encode
- decode
## FlaxT5ForConditionalGeneration
[[autodoc]] FlaxT5ForConditionalGeneration
- __call__
- encode
- decode
## FlaxT5EncoderModel
[[autodoc]] FlaxT5EncoderModel
- __call__