diff --git a/docs/source/model_doc/t5.rst b/docs/source/model_doc/t5.rst
index 2e7bd285f07..f7451300c86 100644
--- a/docs/source/model_doc/t5.rst
+++ b/docs/source/model_doc/t5.rst
@@ -38,13 +38,13 @@ T5 can be trained / fine-tuned both in a supervised and unsupervised fashion. In
 In this setup spans of the input sequence are masked by so-called sentinel tokens (*a.k.a* unique
 mask tokens) and the output sequence is formed as a concatenation of the same sentinel tokens and
 the *real* masked tokens.
-Each sentinel token represents a unique mask token for this sentence and should start with ``<extra_id_1>``, ``<extra_id_2>``, ... up to ``<extra_id_100>``. As a default 100 sentinel tokens are available in ``T5Tokenizer``.
+Each sentinel token represents a unique mask token for this sentence and should start with ``<extra_id_0>``, ``<extra_id_1>``, ... up to ``<extra_id_99>``. As a default 100 sentinel tokens are available in ``T5Tokenizer``.
 
 *E.g.* the sentence "The cute dog walks in the park" with the masks put on "cute dog" and "the"
 should be processed as follows:
 
 ::
 
-  input_ids = tokenizer.encode('The <extra_id_1> walks in <extra_id_2> park', return_tensors='pt')
-  labels = tokenizer.encode('<extra_id_1> cute dog <extra_id_2> the <extra_id_3>', return_tensors='pt')
+  input_ids = tokenizer.encode('The <extra_id_0> walks in <extra_id_1> park', return_tensors='pt')
+  labels = tokenizer.encode('<extra_id_0> cute dog <extra_id_1> the <extra_id_2>', return_tensors='pt')
   # the forward function automatically creates the correct decoder_input_ids
   model(input_ids=input_ids, labels=labels)
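
The span-corruption scheme the patched docs describe can be sketched without the library: replace each masked span with the next sentinel and build the target as sentinel-plus-span pairs, closed by one final sentinel. This is a toy word-level illustration only — the real ``T5Tokenizer`` operates on subword pieces, and ``corrupt_spans`` is a hypothetical helper, not a ``transformers`` API:

```python
def corrupt_spans(words, spans):
    """Word-level sketch of T5 span corruption.

    words: list of tokens; spans: list of (start, end) index pairs to mask,
    in order and non-overlapping. Returns (input text, target text).
    """
    inp, tgt = [], []
    pos = 0
    for i, (start, end) in enumerate(spans):
        sentinel = f"<extra_id_{i}>"
        inp.extend(words[pos:start])   # keep unmasked words
        inp.append(sentinel)           # span replaced by one sentinel
        tgt.append(sentinel)           # target: sentinel then the removed words
        tgt.extend(words[start:end])
        pos = end
    inp.extend(words[pos:])
    tgt.append(f"<extra_id_{len(spans)}>")  # final sentinel closes the target
    return " ".join(inp), " ".join(tgt)


words = "The cute dog walks in the park".split()
# mask "cute dog" (indices 1-2) and "the" (index 5)
inp, tgt = corrupt_spans(words, [(1, 3), (5, 6)])
# inp -> "The <extra_id_0> walks in <extra_id_1> park"
# tgt -> "<extra_id_0> cute dog <extra_id_1> the <extra_id_2>"
```

The strings produced match the ``input_ids`` / ``labels`` text in the doc example above; in practice both would then be encoded with ``tokenizer.encode(..., return_tensors='pt')``.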