* Fix decoder not returning hidden states from the last layer
* Resolve conflict
* Change the way to gather hidden states
* Add decoder hidden states test
* Make pytest and black happy
* Remove redundant line
* remove new line
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* fix mems in xlnet
* fix use_mems
* fix use_mem_len
* fix use mems
* clean docs
* fix tf typo
* make xlnet tf for generation work
* fix tf test
* refactor use cache
* add use cache for missing models
* correct use_cache in generate
* correct use cache in tf generate
* fix tf
* correct getattr typo
* make sylvain happy
* change in docs as well
* do not apply to cookie cutter statements
* fix tf test
* make pytorch model fully backward compatible
* bart output hidden states upstream
* same w/ decoder
* add tests
* fix prophetnet
* fix gpt2 and ctrl
* fix fstm and skip test for reformer and longformer
* fix all models
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* implement support for run-time dependency version checking
* try not escaping !
* use findall that works on py36
* small tweaks
* autoformatter worship
* simplify
* shorter names
* add support for non-versioned checks
* add deps
* revert
* tokenizers not required, check version only if installed
* make a proper distutils cmd and add make target
* tqdm must be checked before tokenizers
* workaround the DistributionNotFound peculiar setup
* handle the rest of packages in setup.py
* fully sync setup.py's install_requires - to check them all
* nit
* make install_requires more readable
* typo
* Update setup.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* restyle
* add types
* simplify
* simplify2
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Support BERT relative position embeddings
* Fix typo in README.md
* Address review comment
* Fix failing tests
* [tiny] Fix style_doc.py check by adding an empty line to configuration_bert.py
* make fix copies
* fix configs of electra and albert and fix longformer
* remove copy statement from longformer
* fix albert
* fix electra
* Add bert variants forward tests for various position embeddings
* [tiny] Fix style for test_modeling_bert.py
* improve docstring
* [tiny] improve docstring and remove unnecessary dependency
* [tiny] Remove unused import
* re-add to ALBERT
* make embeddings work for ALBERT
* add test for albert
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Add early stopping patience and minimum threshold metric must improve to prevent early stopping to pytorch trainer
* Add early stopping test
* Set patience counter to 0 if best metric not defined yet
* Make early stopping a callback. Add callback event for updating the best metric for early stopping callback to trigger on.
* Run make style
* make funciton name sensible
* Improve new argument docstring wording and hope that flakey CI test passes.
* Use on_evaluation callback instead of custom. Remove some debug printing
* Move early stopping arguments and state into early stopping callback
* Run make style
* Remove old code
* Fix docs formatting. make style went rogue on me.
* Remove copied attributes and fix variable
* Add assertions on training arguments instead of mutating them. Move comment out of public docs.
* Make separate test for early stopping callback. Add test of invalid arguments.
* Run make style... I remembered before CI this time!
* appease flake8
* Add EarlyStoppingCallback to callback docs
* Make docstring EarlyStoppingCallabck match other callbacks.
* Fix typo in docs
* Make ci fail
* Try to make tests actually run?
* CI finally failing?
* Fix CI
* Revert "Fix CI"
This reverts commit ca7923be73.
* Ooops wrong one
* one more try
* Ok ok let's move this elsewhere
* Alternative to globals() (#8667)
* Alternative to globals()
* Error is raised later so return None
* Sentencepiece not installed make some tokenizers None
* Apply Lysandre wisdom
* Slightly clearer comment?
cc @sgugger
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* `disable_ngram_loss` fix for prophetnet
* add changes documentation
* fix _compute_loss to use mean reduction and -100 to masked tokens & remove unnecessary arguments
* mean label smoothing loss
* small refactor
* fix test
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
* working on LongformerForSequenceClassification
* add TFLongformerForMultipleChoice
* add TFLongformerForTokenClassification
* use add_start_docstrings_to_model_forward
* test TFLongformerForSequenceClassification
* test TFLongformerForMultipleChoice
* test TFLongformerForTokenClassification
* remove test from repo
* add test and doc for TFLongformerForSequenceClassification, TFLongformerForTokenClassification, TFLongformerForMultipleChoice
* add requested classes to modeling_tf_auto.py
update dummy_tf_objects
fix tests
fix bugs in requested classes
* pass all tests except test_inputs_embeds
* sync with master
* pass all tests except test_inputs_embeds
* pass all tests
* pass all tests
* work on test_inputs_embeds
* fix style and quality
* make multi choice work
* fix TFLongformerForTokenClassification signature
* fix TFLongformerForMultipleChoice, TFLongformerForSequenceClassification signature
* fix mult choice
* fix mc hint
* fix input embeds
* fix input embeds
* refactor input embeds
* fix copy issue
* apply sylvains changes and clean more
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Adding PrefixConstrainedLogitsProcessor
* fixing RAG and style_doc
* fixing black (v20 instead of v19)
* Improving doc in generation_logits_process.py
* Improving docs and typing in generation_utils.py
* docs improvement
* adding test and fixing doc typo
* fixing doc_len
* isort on test
* fixed test
* improve docstring a bit
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Tokenizers should be framework agnostic
* Run the slow tests
* Not testing
* Fix documentation
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Put models in subfolders
* Styling
* Fix imports in tests
* More fixes in test imports
* Sneaky hidden imports
* Fix imports in doc files
* More sneaky imports
* Finish fixing tests
* Fix examples
* Fix path for copies
* More fixes for examples
* Fix dummy files
* More fixes for example
* More model import fixes
* Is this why you're unhappy GitHub?
* Fix imports in conver command
* Use the CI to identify failing tests
* Remove from all examples and tests
* More default switch
* Fixes
* More test fixes
* More fixes
* Last fixes hopefully
* Use the CI to identify failing tests
* Remove from all examples and tests
* More default switch
* Fixes
* More test fixes
* More fixes
* Last fixes hopefully
* Run on the real suite
* Fix slow tests
* Fix passing token_type_ids during GPT2DoubleHeadsModel.generate() if used
and for GPT2LMHeadModel too
* Update tests to check token_type_ids usage in GPT2 models
* Simply insert T5Tokenizer's prepare_seq2seq_batch
* Update/Add some 'import'
* fix RunTimeError caused by '.view'
* Moves .view related error avoidance from seq2seq_trainer to inside prophetnet
* Update test_tokenization_prophetnet.py
* Format the test code with black
* Re-format the test code
* Update test_tokenization_prophetnet.py
* Add importing require_torch in the test code
* Add importing BatchEncoding in the test code
* Re-format the test code on Colab
* Fixing roberta for slow-fast tests
* WIP getting equivalence on pipelines
* slow-to-fast equivalence - working on question-answering pipeline
* optional FAISS tests
* Pipeline Q&A
* Move pipeline tests to their own test job again
* update tokenizer to add sequence id methods
* update to tokenizers 0.9.4
* set sentencepiecce as optional
* clean up squad
* clean up pipelines to use sequence_ids
* style/quality
* wording
* Switch to use_fast = True by default
* update tests for use_fast at True by default
* fix rag tokenizer test
* removing protobuf from required dependencies
* fix NER test for use_fast = True by default
* fixing example tests (Q&A examples use slow tokenizers for now)
* protobuf in main deps extras["sentencepiece"] and example deps
* fix protobug install test
* try to fix seq2seq by switching to slow tokenizers for now
* Update src/transformers/tokenization_utils_base.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/tokenization_utils_base.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update some tests
* Small update
* Apply style
* Use max_position_embeddings
* Create a fake attribute
* Create a fake attribute
* Update wrong name
* Wrong TransfoXL model file
* Keep the common tests agnostic
* Create modeling_tf_dpr.py
* Add TFDPR
* Add back TFPegasus, TFMarian, TFMBart, TFBlenderBot
last commit accidentally deleted these 4 lines, so I recover them back
* Add TFDPR
* Add TFDPR
* clean up some comments, add TF input-style doc string
* Add TFDPR
* Make return_dict=False as default
* Fix return_dict bug (in .from_pretrained)
* Add get_input_embeddings()
* Create test_modeling_tf_dpr.py
The current version is already passed all 27 tests!
Please see the test run at :
https://colab.research.google.com/drive/1czS_m9zy5k-iSJbzA_DP1k1xAAC_sdkf?usp=sharing
* fix quality
* delete init weights
* run fix copies
* fix repo consis
* del config_class, load_tf_weights
They shoud be 'pytorch only'
* add config_class back
after removing it, test failed ... so totally only removing "use_tf_weights = None" on Lysandre suggestion
* newline after .. note::
* import tf, np (Necessary for ModelIntegrationTest)
* slow_test from_pretrained with from_pt=True
At the moment we don't have TF weights (since we don't have official official TF model)
Previously, I did not run slow test, so I missed this bug
* Add simple TFDPRModelIntegrationTest
Note that this is just a test that TF and Pytorch gives approx. the same output.
However, I could not test with the official DPR repo's output yet
* upload correct tf model
* remove position_ids as missing keys
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: patrickvonplaten <patrick@huggingface.co>
* Add next sentence prediction loss computation
* Apply style
* Fix tests
* Add forgotten import
* Add forgotten import
* Use a new parameter
* Remove kwargs and use positional arguments