diff --git a/docs/source/model_summary.rst b/docs/source/model_summary.rst
index f87a488f216..5371492dea2 100644
--- a/docs/source/model_summary.rst
+++ b/docs/source/model_summary.rst
@@ -195,7 +195,7 @@ tokens in the sentence, then allows the model to use the last n tokens to predic
 with a mask, the sentence is actually fed in the model in the right order, but instead of masking the first n tokens
 for n+1, XLNet uses a mask that hides the previous tokens in some given permutation of 1,...,sequence length.
 
-XLNet also uses the same recurrence mechanism as TransformerXL to build long-term dependencies.
+XLNet also uses the same recurrence mechanism as Transformer-XL to build long-term dependencies.
 
 The library provides a version of the model for language modeling, token classification, sentence classification,
 multiple choice classification and question answering.