* Make Flax GPT2 work with cross-attention
* Remove encoder->decoder projection layer
* A draft (incomplete) for FlaxEncoderDecoderModel
* Add the method from_encoder_decoder_pretrained + the docstrings
* Fix the mistakes of using EncoderDecoderModel
* Fix style
* Add FlaxEncoderDecoderModel to the library
* Fix cyclic imports
* Add FlaxEncoderDecoderModel to modeling_flax_auto.py
* Remove question comments
* add tests for FlaxEncoderDecoderModel
* add flax_encoder_decoder to the lists of ignored entries in check_repo.py
* fix missing required positional arguments
* Remove **kwargs when creating FlaxEncoderDecoderModel in from_encoder_decoder_pretrained()
Also fix the EOS/pad token issue in generation
* Fix: Use sequences from the generated_output
* Change a check from assert to raise ValueError
* Fix examples and token ids issues
* Fix missing all_cross_attentions when outputting tuple in modeling_gpt2
* Remove the changes in configuration docstrings.
* Allow bert2gpt2
* make fix-copies
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Change remaining examples to bert2gpt2
* Change the test to Bert2GPT2
* Fix examples
* Fix import
* Fix unpack bug
* Rename to FlaxEncoderDecoderModelTest and change the test to bert2gpt2
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Fix: NotImplentedError -> NotImplementedError
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* up
* finalize
Co-authored-by: ydshieh <ydshieh@user.noreply>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
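The assert-to-ValueError change listed above can be illustrated with a minimal sketch. The log does not say which check was changed, so the function names and the checked argument below are hypothetical. The general motivation is that `assert` statements are skipped when Python runs with `-O`, while an explicit `raise` always executes.

```python
def require_name_with_assert(name):
    # Hypothetical "before" form: this check disappears under `python -O`.
    assert name is not None, "a model name or path must be given"
    return name


def require_name_with_raise(name):
    # Hypothetical "after" form: the check always runs and raises a
    # specific exception type that callers can catch.
    if name is None:
        raise ValueError("a model name or path must be given")
    return name


if __name__ == "__main__":
    print(require_name_with_raise("gpt2"))
    try:
        require_name_with_raise(None)
    except ValueError as err:
        print("ValueError:", err)
```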
* Remove the creation of separate config objects and use the existing ones instead; override resize_token_embeddings from the parent class because it does not work for EncoderDecoderModel
* Roll back to the current version of the huggingface master branch
* Reworked version that ties the encoder and decoder configs to the parent EncoderDecoderModel instance
* The resize_token_embeddings override now raises an error
* review comment suggestion
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* Warn when an EncoderDecoderModel is created with an encoder or decoder config that differs from the one in its EncoderDecoderConfig
* Add a test to guard against diverging configs between the wrapper class and the wrapped classes
* Update src/transformers/models/encoder_decoder/modeling_encoder_decoder.py
* make style
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
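The config-tying changes in this group can be sketched with toy classes. This is an illustration under stated assumptions, not the library's code: every class below is a hypothetical stand-in, and the tying direction (sub-models adopt the wrapper's config objects) is an assumption drawn from the bullet wording.

```python
import warnings


class ToyConfig:
    """Hypothetical stand-in for a model config."""

    def __init__(self, vocab_size):
        self.vocab_size = vocab_size

    def to_dict(self):
        return {"vocab_size": self.vocab_size}


class ToyWrapperConfig:
    """Hypothetical stand-in for the wrapper's config."""

    def __init__(self, encoder, decoder):
        self.encoder = encoder
        self.decoder = decoder


class ToySubModel:
    """Hypothetical stand-in for a wrapped encoder or decoder."""

    def __init__(self, config):
        self.config = config

    def resize_token_embeddings(self, new_size):
        self.config.vocab_size = new_size
        return new_size


class ToyEncoderDecoder:
    def __init__(self, config, encoder, decoder):
        pairs = (("encoder", encoder, config.encoder),
                 ("decoder", decoder, config.decoder))
        for name, sub_model, sub_config in pairs:
            # Warn if the wrapped model's config diverges from the wrapper's.
            if sub_model.config.to_dict() != sub_config.to_dict():
                warnings.warn(f"{name} config differs from the wrapper config")
            # Tie: share one config object instead of keeping separate copies.
            sub_model.config = sub_config
        self.config = config
        self.encoder = encoder
        self.decoder = decoder

    def resize_token_embeddings(self, *args, **kwargs):
        # Resizing through the wrapper is refused; callers use the sub-models.
        raise NotImplementedError(
            "resize the embeddings on model.encoder or model.decoder instead"
        )


if __name__ == "__main__":
    wrapper_config = ToyWrapperConfig(ToyConfig(10), ToyConfig(20))
    model = ToyEncoderDecoder(
        wrapper_config, ToySubModel(ToyConfig(10)), ToySubModel(ToyConfig(20))
    )
    model.decoder.resize_token_embeddings(25)
    print(model.config.decoder.vocab_size)
```

Because the sub-model and the wrapper share one config object after tying, a resize done on a sub-model is visible through the wrapper's config as well.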
* Use the CI to identify failing tests
* Remove from all examples and tests
* More default switch
* Fixes
* More test fixes
* More fixes
* Last fixes hopefully
* Run on the real suite
* Fix slow tests
* Output cross-attention with decoder attention output
* Update src/transformers/modeling_bert.py
* add cross-attention for t5 and bart as well
* fix tests
* correct typo in docs
* Address Sylvain's and Sam's comments
* correct typo
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* improve unit tests
This is a sample of one test, per the request in https://github.com/huggingface/transformers/issues/5973,
before I apply it to the rest.
* batch 1
* batch 2
* batch 3
* batch 4
* batch 5
* style
* non-tf template
* last deletion of check_loss_output
This construct isn't used anymore these days.
Running python tests/test_foo.py puts the tests/ directory on
PYTHONPATH, which isn't representative of how we run tests.
Use python -m unittest tests/test_foo.py instead.