Models
----------------------------------------------------

The base classes :class:`~transformers.PreTrainedModel` and :class:`~transformers.TFPreTrainedModel` implement the
common methods for loading/saving a model either from a local file or directory, or from a pretrained model
configuration provided by the library (downloaded from HuggingFace's AWS S3 repository).
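
For example, a checkpoint can be downloaded by name or loaded back from a local directory with
:meth:`~transformers.PreTrainedModel.from_pretrained` and saved with
:meth:`~transformers.PreTrainedModel.save_pretrained`. A minimal sketch (the local path is illustrative):

.. code-block:: python

    from transformers import BertModel

    # Download the weights and configuration of a pretrained checkpoint
    model = BertModel.from_pretrained("bert-base-uncased")

    # Save them to a local directory, then reload the model from that directory
    model.save_pretrained("./my-bert")
    model = BertModel.from_pretrained("./my-bert")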

:class:`~transformers.PreTrainedModel` and :class:`~transformers.TFPreTrainedModel` also implement a few methods which
are common among all the models to:

- resize the input token embeddings when new tokens are added to the vocabulary
- prune the attention heads of the model (both are sketched in the example below).
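
A minimal sketch of both, assuming a BERT checkpoint (the added tokens and pruned heads are illustrative):

.. code-block:: python

    from transformers import BertModel, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertModel.from_pretrained("bert-base-uncased")

    # Add new tokens to the tokenizer, then resize the input embeddings to match
    tokenizer.add_tokens(["new_token_1", "new_token_2"])
    model.resize_token_embeddings(len(tokenizer))

    # Prune heads 0 and 2 of layer 0, and head 1 of layer 2
    model.prune_heads({0: [0, 2], 2: [1]})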

The other methods that are common to all the models are defined in :class:`~transformers.modeling_utils.ModuleUtilsMixin`
(for the PyTorch models) and :class:`~transformers.modeling_tf_utils.TFModelUtilsMixin` (for the TensorFlow models), and,
for text generation, in :class:`~transformers.generation_utils.GenerationMixin` (for the PyTorch models) and
:class:`~transformers.generation_tf_utils.TFGenerationMixin` (for the TensorFlow models).
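
The generation mixins provide :meth:`~transformers.generation_utils.GenerationMixin.generate`, which selects greedy
search, beam search, or sampling depending on its arguments. A minimal sketch with a GPT-2 checkpoint (the prompt and
parameters are illustrative):

.. code-block:: python

    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")

    # Encode a prompt and generate a continuation with beam search
    input_ids = tokenizer.encode("The weather is", return_tensors="pt")
    output_ids = model.generate(input_ids, max_length=20, num_beams=5, early_stopping=True)

    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))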

``PreTrainedModel``
~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.PreTrainedModel
    :members:


``ModuleUtilsMixin``
~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.modeling_utils.ModuleUtilsMixin
    :members:


``TFPreTrainedModel``
~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFPreTrainedModel
    :members:


``TFModelUtilsMixin``
~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.modeling_tf_utils.TFModelUtilsMixin
    :members:


Generative models
~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.generation_utils.GenerationMixin
    :members:

.. autoclass:: transformers.generation_tf_utils.TFGenerationMixin
    :members: