..
    Copyright 2020 The HuggingFace Team. All rights reserved.

    Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
    the License. You may obtain a copy of the License at

        http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
    an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
    specific language governing permissions and limitations under the License.

Encoder Decoder Models
-----------------------------------------------------------------------------------------------------------------------

The :class:`~transformers.EncoderDecoderModel` can be used to initialize a sequence-to-sequence model with any
pretrained autoencoding model as the encoder and any pretrained autoregressive model as the decoder.
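
For instance, a model can be warm-started from a BERT encoder and a GPT-2 decoder. A minimal sketch, assuming the
public ``bert-base-cased`` and ``gpt2`` checkpoints; any compatible encoder/decoder pair works:

.. code-block:: python

    from transformers import EncoderDecoderModel

    # warm-start a bert2gpt2 model: BERT weights initialize the encoder, GPT-2 weights the decoder
    model = EncoderDecoderModel.from_encoder_decoder_pretrained("bert-base-cased", "gpt2")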

The effectiveness of initializing sequence-to-sequence models with pretrained checkpoints for sequence generation tasks
was shown in `Leveraging Pre-trained Checkpoints for Sequence Generation Tasks <https://arxiv.org/abs/1907.12461>`__ by
Sascha Rothe, Shashi Narayan, Aliaksei Severyn.

After such an :class:`~transformers.EncoderDecoderModel` has been trained/fine-tuned, it can be saved/loaded just like
any other model (see the examples for more information).
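
A minimal save/load sketch; the directory name ``./bert2gpt2`` is arbitrary:

.. code-block:: python

    from transformers import EncoderDecoderModel

    model = EncoderDecoderModel.from_encoder_decoder_pretrained("bert-base-cased", "gpt2")
    # ... train / fine-tune the model ...
    model.save_pretrained("./bert2gpt2")  # writes the combined config and weights to a local directory
    model = EncoderDecoderModel.from_pretrained("./bert2gpt2")  # reloads the full encoder-decoder model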

An application of this architecture could be to leverage two pretrained :class:`~transformers.BertModel` as the encoder
and decoder for a summarization model, as was shown in `Text Summarization with Pretrained Encoders
<https://arxiv.org/abs/1908.08345>`__ by Yang Liu and Mirella Lapata.
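
A sketch of such a bert2bert warm start (the fine-tuning loop itself is omitted, and the checkpoint name is just an
example):

.. code-block:: python

    from transformers import BertTokenizer, EncoderDecoderModel

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    # both the encoder and the decoder are initialized from the same BERT checkpoint
    model = EncoderDecoderModel.from_encoder_decoder_pretrained("bert-base-uncased", "bert-base-uncased")

    # the decoder needs to know which tokens start and pad generated sequences
    model.config.decoder_start_token_id = tokenizer.cls_token_id
    model.config.pad_token_id = tokenizer.pad_token_id

    inputs = tokenizer("A long news article to summarize ...", return_tensors="pt")
    # after fine-tuning on a summarization dataset, generate a summary
    summary_ids = model.generate(inputs.input_ids)
    print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))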

EncoderDecoderConfig
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.EncoderDecoderConfig
    :members:

EncoderDecoderModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.EncoderDecoderModel
    :members: forward, from_encoder_decoder_pretrained

FlaxEncoderDecoderModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.FlaxEncoderDecoderModel
    :members: __call__, from_encoder_decoder_pretrained
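
Usage mirrors the PyTorch class. A minimal sketch of warm-starting a Flax bert2gpt2 model, assuming the public
``bert-base-cased`` and ``gpt2`` checkpoints:

.. code-block:: python

    from transformers import BertTokenizer, GPT2Tokenizer, FlaxEncoderDecoderModel

    # each side keeps its own tokenizer: BERT for the encoder, GPT-2 for the decoder
    encoder_tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
    decoder_tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = FlaxEncoderDecoderModel.from_encoder_decoder_pretrained("bert-base-cased", "gpt2")

    inputs = encoder_tokenizer("My friends are cool but they eat too many carbs.", return_tensors="np")
    decoder_inputs = decoder_tokenizer("My friends", return_tensors="np")
    # Flax encoder-decoder models take explicit decoder inputs
    outputs = model(input_ids=inputs.input_ids, decoder_input_ids=decoder_inputs.input_ids)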