mirror of https://github.com/huggingface/transformers.git synced 2025-07-04 13:20:12 +06:00

Maria Khalusova 5964f820db

[Docs] Model_doc structure/clarity improvements (#26876)

* first batch of structure improvements for model_docs

* second batch of structure improvements for model_docs

* more structure improvements for model_docs

* more structure improvements for model_docs

* structure improvements for cv model_docs

* more structural refactoring

* addressed feedback about image processors

2023-11-03 10:57:03 -04:00

2.4 KiB


FLAN-T5

Overview

FLAN-T5 was released in the paper Scaling Instruction-Finetuned Language Models. It is an enhanced version of T5 that has been finetuned on a mixture of tasks.

One can directly use FLAN-T5 weights without finetuning the model:

>>> from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

>>> model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")
>>> tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-small")

>>> inputs = tokenizer("A step by step recipe to make bolognese pasta:", return_tensors="pt")
>>> outputs = model.generate(**inputs)
>>> print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
['Pour a cup of bolognese into a large bowl and add the pasta']
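The call above uses the default generation settings, which may truncate longer answers. The following is a minimal sketch, not part of the original page, that wraps the same steps in a function and passes `max_new_tokens` to `generate`. The helper names `build_prompt` and `run_flan_t5`, the "instruction followed by input text" prompt layout, and the default of 64 new tokens are illustrative assumptions.

```python
def build_prompt(instruction: str, text: str = "") -> str:
    """Join an instruction and an optional input text into one prompt string.

    Hypothetical helper: the layout is an assumption, not a required format.
    """
    instruction = instruction.strip()
    text = text.strip()
    return f"{instruction} {text}" if text else instruction


def run_flan_t5(
    prompt: str,
    checkpoint: str = "google/flan-t5-small",
    max_new_tokens: int = 64,  # assumed default, tune per task
) -> str:
    """Generate a single answer for `prompt` with a FLAN-T5 checkpoint."""
    # Imported lazily so build_prompt can be used without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

    inputs = tokenizer(prompt, return_tensors="pt")
    # max_new_tokens bounds the number of generated tokens.
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
```

Calling `run_flan_t5(build_prompt("Translate English to German:", "How old are you?"))` downloads the checkpoint on first use. The generated text depends on the checkpoint and library version.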

FLAN-T5 includes the same improvements as T5 version 1.1 (see here for the full details of the model's improvements).

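Google has released the following variants: google/flan-t5-small, google/flan-t5-base, google/flan-t5-large, google/flan-t5-xl, and google/flan-t5-xxl.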
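Assuming the variants follow the `google/flan-t5-<size>` naming seen in the example above, a checkpoint can be selected by size. This is only a sketch: `FLAN_T5_SIZES`, `checkpoint_name`, and `load_flan_t5` are hypothetical names, and the size list should be checked against the Hub before relying on it.

```python
# Assumed size suffixes; verify against the checkpoints published on the Hub.
FLAN_T5_SIZES = ("small", "base", "large", "xl", "xxl")


def checkpoint_name(size: str) -> str:
    """Map a size label such as "base" to a Hub identifier."""
    size = size.strip().lower()
    if size not in FLAN_T5_SIZES:
        raise ValueError(f"unknown FLAN-T5 size {size!r}; expected one of {FLAN_T5_SIZES}")
    return f"google/flan-t5-{size}"


def load_flan_t5(size: str = "small"):
    """Return (tokenizer, model) for the requested size."""
    # Imported lazily so checkpoint_name works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    name = checkpoint_name(size)
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSeq2SeqLM.from_pretrained(name)
    return tokenizer, model
```

Validating the label before calling `from_pretrained` surfaces a typo as a `ValueError` instead of a failed download. Larger variants need correspondingly more memory.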

The original checkpoints can be found here.

Refer to T5's documentation page for the full API reference, code examples, and notebooks. For more details on the training and evaluation of FLAN-T5, refer to the model card.
