OpenAI GPT2
----------------------------------------------------

Overview
~~~~~~~~~~~~~~~~~~~~~

The OpenAI GPT-2 model was proposed in
`Language Models are Unsupervised Multitask Learners <https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf>`_
by Alec Radford*, Jeffrey Wu*, Rewon Child, David Luan, Dario Amodei** and Ilya Sutskever**.
It's a causal (unidirectional) transformer pre-trained using language modeling on a very large
corpus of ~40 GB of text data.

The abstract from the paper is the following:

*GPT-2 is a large transformer-based language model with 1.5 billion parameters, trained on a dataset[1]
of 8 million web pages. GPT-2 is trained with a simple objective: predict the next word, given all of the previous
words within some text. The diversity of the dataset causes this simple goal to contain naturally occurring
demonstrations of many tasks across diverse domains. GPT-2 is a direct scale-up of GPT, with more than 10X
the parameters and trained on more than 10X the amount of data.*

Tips:

- GPT-2 is a model with absolute position embeddings so it's usually advised to pad the inputs on
  the right rather than the left.
- GPT-2 was trained with a causal language modeling (CLM) objective and is therefore powerful at predicting the next
  token in a sequence. Leveraging this feature allows GPT-2 to generate syntactically coherent text, as can be
  observed in the `run_generation.py` example script.
- The PyTorch models can take `past` as input, i.e. the previously computed key/value attention pairs. Passing this
  `past` value prevents the model from re-computing values it has already computed during text generation, as
  illustrated in the sketch after this list. See `reusing the past in generative models <../quickstart.html#using-the-past>`_
  for more information on the usage of this argument.

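The snippet below is a minimal sketch of that incremental decoding loop with ``GPT2LMHeadModel``. It assumes the
tuple return format of this library version, in which the logits come first and the updated ``past`` cache second,
and it uses plain greedy decoding purely for illustration:

.. code-block:: python

    import torch
    from transformers import GPT2Tokenizer, GPT2LMHeadModel

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")
    model.eval()

    # Encode a prompt and keep a running list of generated token ids
    generated = tokenizer.encode("The Manhattan bridge")
    context = torch.tensor([generated])
    past = None

    with torch.no_grad():
        for _ in range(20):
            # Thanks to `past`, only the newly generated token is fed back in at each step
            logits, past = model(context, past=past)[:2]
            token = torch.argmax(logits[..., -1, :])
            generated.append(token.item())
            context = token.unsqueeze(0).unsqueeze(0)  # shape (batch=1, seq_len=1)

    print(tokenizer.decode(generated))
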
`Write With Transformer <https://transformer.huggingface.co/doc/gpt2-large>`__ is a webapp created and hosted by
Hugging Face showcasing the generative capabilities of several models. GPT-2 is one of them and is available in five
different sizes: small, medium, large, xl and a distilled version of the small checkpoint: distilgpt-2.

The original code can be found `here <https://openai.com/blog/better-language-models/>`_.


GPT2Config
~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.GPT2Config
    :members:

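As a minimal sketch of typical configuration usage (the ``gpt2`` checkpoint name refers to the small pretrained
model, and the defaults of ``GPT2Config`` are assumed here to match it):

.. code-block:: python

    from transformers import GPT2Config, GPT2Model

    # A fresh configuration; instantiating a model from it gives random, untrained weights
    configuration = GPT2Config()
    model = GPT2Model(configuration)

    # The configuration of a pretrained checkpoint can also be loaded and inspected directly
    configuration = GPT2Config.from_pretrained("gpt2")
    print(configuration.n_layer, configuration.n_head, configuration.n_embd)

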
GPT2Tokenizer
~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.GPT2Tokenizer
    :members: save_vocabulary

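A minimal usage sketch; since GPT-2 uses a byte-level BPE vocabulary, arbitrary text can be encoded without unknown
tokens:

.. code-block:: python

    from transformers import GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

    ids = tokenizer.encode("Hello, my dog is cute")  # list of token ids
    text = tokenizer.decode(ids)                     # round-trips back to the original string

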
GPT2TokenizerFast
~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.GPT2TokenizerFast
    :members:

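The fast tokenizer exposes the same interface, backed by the Rust `tokenizers` library, and can additionally report
where each token came from in the original string. A minimal sketch, assuming this version accepts the
``return_offsets_mapping`` flag (fast tokenizers only):

.. code-block:: python

    from transformers import GPT2TokenizerFast

    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

    encoding = tokenizer.encode_plus("Hello, my dog is cute", return_offsets_mapping=True)
    print(encoding["input_ids"])       # token ids
    print(encoding["offset_mapping"])  # (start, end) character spans per token

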
GPT2Model
~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.GPT2Model
    :members:

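A minimal sketch of running the bare model to obtain hidden states, assuming the tuple return format with the last
hidden state as the first element:

.. code-block:: python

    import torch
    from transformers import GPT2Tokenizer, GPT2Model

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2Model.from_pretrained("gpt2")

    input_ids = torch.tensor([tokenizer.encode("Hello, my dog is cute")])  # batch size 1
    outputs = model(input_ids)
    last_hidden_state = outputs[0]  # (batch_size, sequence_length, hidden_size)

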
GPT2LMHeadModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.GPT2LMHeadModel
    :members:

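A minimal language-modeling sketch; passing ``labels`` equal to the inputs lets the model compute the (shifted)
next-token prediction loss itself, again assuming the tuple return format:

.. code-block:: python

    import torch
    from transformers import GPT2Tokenizer, GPT2LMHeadModel

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")

    input_ids = torch.tensor([tokenizer.encode("Hello, my dog is cute")])
    outputs = model(input_ids, labels=input_ids)
    loss, logits = outputs[:2]  # logits: (batch_size, sequence_length, vocab_size)

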
GPT2DoubleHeadsModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.GPT2DoubleHeadsModel
    :members:

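A minimal multiple-choice sketch: the classification head reads the hidden state at the position given by
``mc_token_ids``, so an extra ``[CLS]``-style token is appended to each choice here. The token name and the embedding
resize are part of this illustrative setup, not a fixed requirement, and the newly added token would still need to be
trained:

.. code-block:: python

    import torch
    from transformers import GPT2Tokenizer, GPT2DoubleHeadsModel

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2DoubleHeadsModel.from_pretrained("gpt2")

    # Add a classification token and grow the embedding matrix accordingly
    tokenizer.add_special_tokens({"cls_token": "[CLS]"})
    model.resize_token_embeddings(len(tokenizer))

    choices = ["Hello, my dog is cute [CLS]", "Hello, my cat is cute [CLS]"]
    encoded = [tokenizer.encode(choice) for choice in choices]

    input_ids = torch.tensor(encoded).unsqueeze(0)  # (batch=1, n_choices=2, seq_len)
    mc_token_ids = torch.tensor([[ids.index(tokenizer.cls_token_id) for ids in encoded]])

    outputs = model(input_ids, mc_token_ids=mc_token_ids)
    lm_logits, mc_logits = outputs[:2]

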
TFGPT2Model
~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFGPT2Model
    :members:

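The TensorFlow classes mirror the PyTorch API. A minimal sketch with ``TFGPT2Model``, again assuming a tuple output
with the last hidden state first:

.. code-block:: python

    import tensorflow as tf
    from transformers import GPT2Tokenizer, TFGPT2Model

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = TFGPT2Model.from_pretrained("gpt2")

    input_ids = tf.constant([tokenizer.encode("Hello, my dog is cute")])  # batch size 1
    outputs = model(input_ids)
    last_hidden_state = outputs[0]  # (batch_size, sequence_length, hidden_size)

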
TFGPT2LMHeadModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFGPT2LMHeadModel
    :members:

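A matching sketch for the TensorFlow language-modeling head; only the prediction logits are read off here, and they
are assumed to be the first element of the output tuple:

.. code-block:: python

    import tensorflow as tf
    from transformers import GPT2Tokenizer, TFGPT2LMHeadModel

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = TFGPT2LMHeadModel.from_pretrained("gpt2")

    input_ids = tf.constant([tokenizer.encode("Hello, my dog is cute")])
    outputs = model(input_ids)
    logits = outputs[0]  # (batch_size, sequence_length, vocab_size)

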
TFGPT2DoubleHeadsModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFGPT2DoubleHeadsModel
    :members: