# T5

**Supported frameworks:** PyTorch, TensorFlow, Flax
[T5](https://huggingface.co/papers/1910.10683) is an encoder-decoder transformer available in a range of sizes from 60M to 11B parameters. It is designed to handle a wide range of NLP tasks by treating them all as text-to-text problems, which eliminates the need for task-specific architectures because T5 converts every NLP task into a text generation task. To formulate every task as text generation, each input is prepended with a task-specific prefix (e.g., `translate English to German: ...`, `summarize: ...`). This enables T5 to handle tasks like translation, summarization, question answering, and more.

You can find all official T5 checkpoints under the [T5](https://huggingface.co/collections/google/t5-release-65005e7c520f8d7b4d037918) collection.

> [!TIP]
> Click on the T5 models in the right sidebar for more examples of how to apply T5 to different language tasks.

The examples below demonstrate how to translate text with [`Pipeline`], with [`AutoModel`], and from the command line.

```py
import torch
from transformers import pipeline

pipeline = pipeline(
    task="text2text-generation",
    model="google-t5/t5-base",
    torch_dtype=torch.float16,
    device=0
)
pipeline("translate English to French: The weather is nice today.")
```

```py
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained(
    "google-t5/t5-base",
    torch_dtype=torch.float16,
    device_map="auto"
)

input_ids = tokenizer("translate English to French: The weather is nice today.", return_tensors="pt").to("cuda")
output = model.generate(**input_ids, cache_implementation="static")
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

```bash
echo -e "translate English to French: The weather is nice today." | transformers run --task text2text-generation --model google-t5/t5-base --device 0
```

Quantization reduces the memory burden of large models by representing the weights in a lower precision. Refer to the [Quantization](../quantization/overview) overview for more available quantization backends.

The example below uses [torchao](../quantization/torchao) to quantize only the weights to int4.

```py
# pip install torchao
import torch
from transformers import TorchAoConfig, AutoModelForSeq2SeqLM, AutoTokenizer

quantization_config = TorchAoConfig("int4_weight_only", group_size=128)
model = AutoModelForSeq2SeqLM.from_pretrained(
    "google/t5-v1_1-xl",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    quantization_config=quantization_config
)

tokenizer = AutoTokenizer.from_pretrained("google/t5-v1_1-xl")
input_ids = tokenizer("translate English to French: The weather is nice today.", return_tensors="pt").to("cuda")
output = model.generate(**input_ids, cache_implementation="static")
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

## Notes

- You can pad the encoder inputs on the left or right because T5 uses relative scalar embeddings (see the padding sketch after this list).
- T5 models need a slightly higher learning rate than the default used in [`Trainer`]. Typically, values of `1e-4` and `3e-4` work well for most tasks (see the fine-tuning sketch after this list).
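Because T5 uses relative scalar embeddings, either padding side works for the encoder. A minimal sketch of left-padding a batch, using the same checkpoint as the examples above; the prompts are illustrative:

```py
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-base")
tokenizer.padding_side = "left"  # "right" (the default) works equally well for T5

# Illustrative batch: the shorter prompt is padded to the length of the longer one
batch = tokenizer(
    [
        "summarize: The quick brown fox jumped over the lazy dog.",
        "translate English to German: How are you?",
    ],
    padding=True,
    return_tensors="pt",
)
print(batch["input_ids"].shape)
print(batch["attention_mask"])
```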
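A sketch of raising the learning rate for fine-tuning, assuming [`Seq2SeqTrainer`] (a common choice for T5) and placeholder arguments; the [`Trainer`] default being overridden is `5e-5`:

```py
from transformers import Seq2SeqTrainingArguments

# T5 typically fine-tunes best above the Trainer default of 5e-5;
# 1e-4 to 3e-4 is a good starting range for most tasks.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-finetuned",  # placeholder output directory
    learning_rate=3e-4,
    per_device_train_batch_size=8,
    num_train_epochs=3,
)

# Pass the arguments to Seq2SeqTrainer along with your model and dataset:
# trainer = Seq2SeqTrainer(model=model, args=training_args, train_dataset=train_dataset)
# trainer.train()
```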
## T5Config

[[autodoc]] T5Config

## T5Tokenizer

[[autodoc]] T5Tokenizer
    - build_inputs_with_special_tokens
    - get_special_tokens_mask
    - create_token_type_ids_from_sequences
    - save_vocabulary

## T5TokenizerFast

[[autodoc]] T5TokenizerFast

## T5Model

[[autodoc]] T5Model
    - forward

## T5ForConditionalGeneration

[[autodoc]] T5ForConditionalGeneration
    - forward

## T5EncoderModel

[[autodoc]] T5EncoderModel
    - forward

## T5ForSequenceClassification

[[autodoc]] T5ForSequenceClassification
    - forward

## T5ForTokenClassification

[[autodoc]] T5ForTokenClassification
    - forward

## T5ForQuestionAnswering

[[autodoc]] T5ForQuestionAnswering
    - forward

## TFT5Model

[[autodoc]] TFT5Model
    - call

## TFT5ForConditionalGeneration

[[autodoc]] TFT5ForConditionalGeneration
    - call

## TFT5EncoderModel

[[autodoc]] TFT5EncoderModel
    - call

## FlaxT5Model

[[autodoc]] FlaxT5Model
    - __call__
    - encode
    - decode

## FlaxT5ForConditionalGeneration

[[autodoc]] FlaxT5ForConditionalGeneration
    - __call__
    - encode
    - decode

## FlaxT5EncoderModel

[[autodoc]] FlaxT5EncoderModel
    - __call__