* create model
* add integration
* save current state
* make integration tests pass
* add one more test
* add explanation to tests
* remove from bart
* add padding
* remove unnecessary test
* make all tests pass
* re-add cookie cutter tests
* finish PyTorch
* fix attention test
* Update tests/test_modeling_common.py
* revert change
* remove unused file
* add string to doc
* save intermediate
* make tf integration tests pass
* finish tf
* fix doc
* fix docs again
* add led to doctree
* add to auto tokenizer
* added tips for led
* make style
* apply jplus statements
* correct tf longformer
* apply lysandres suggestions
* apply sylvains suggestions
* Apply suggestions from code review
* Use extlinks to point hyperlink with the version of code
* Point to version on release and master until then
* Apply style
* Correct links
* Add missing backtick
* Simple missing backtick after all.
Co-authored-by: Raghavendra Sugeeth P S <raghav-5305@raghav-5305.csez.zohocorpin.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
* add past_key_values
* add use_cache option
* make mask before cutting ids
* adjust position_ids according to past_key_values
* flatten past_key_values
* fix positional embeds
* fix _reorder_cache
* set use_cache to false when not decoder, fix attention mask init
* add test for caching
* add past_key_values for Roberta
* fix position embeds
* add caching test for roberta
* add doc
* make style
* doc, fix attention mask, test
* small fixes
* adress patrick's comments
* input_ids shouldn't start with pad token
* use_cache only when decoder
* make consistent with bert
* make copies consistent
* add use_cache to encoder
* add past_key_values to tapas attention
* apply suggestions from code review
* make coppies consistent
* add attn mask in tests
* remove copied from longformer
* apply suggestions from code review
* fix bart test
* nit
* simplify model outputs
* fix doc
* fix output ordering
* Add label smoothing in Trainer
* Add options for scheduler and Adafactor in Trainer
* Put Seq2SeqTrainer in the main lib
* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Address review comments and adapt scripts
* Documentation
* Move test not using script to tests folder
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* First commit: adding all files from tapas_v3
* Fix multiple bugs including soft dependency and new structure of the library
* Improve testing by adding torch_device to inputs and adding dependency on scatter
* Use Python 3 inheritance rather than Python 2
* First draft model cards of base sized models
* Remove model cards as they are already on the hub
* Fix multiple bugs with integration tests
* All model integration tests pass
* Remove print statement
* Add test for convert_logits_to_predictions method of TapasTokenizer
* Incorporate suggestions by Google authors
* Fix remaining tests
* Change position embeddings sizes to 512 instead of 1024
* Comment out positional embedding sizes
* Update PRETRAINED_VOCAB_FILES_MAP and PRETRAINED_POSITIONAL_EMBEDDINGS_SIZES
* Added more model names
* Fix truncation when no max length is specified
* Disable torchscript test
* Make style & make quality
* Quality
* Address CI needs
* Test the Masked LM model
* Fix the masked LM model
* Truncate when overflowing
* More much needed docs improvements
* Fix some URLs
* Some more docs improvements
* Test PyTorch scatter
* Set to slow + minify
* Calm flake8 down
* First commit: adding all files from tapas_v3
* Fix multiple bugs including soft dependency and new structure of the library
* Improve testing by adding torch_device to inputs and adding dependency on scatter
* Use Python 3 inheritance rather than Python 2
* First draft model cards of base sized models
* Remove model cards as they are already on the hub
* Fix multiple bugs with integration tests
* All model integration tests pass
* Remove print statement
* Add test for convert_logits_to_predictions method of TapasTokenizer
* Incorporate suggestions by Google authors
* Fix remaining tests
* Change position embeddings sizes to 512 instead of 1024
* Comment out positional embedding sizes
* Update PRETRAINED_VOCAB_FILES_MAP and PRETRAINED_POSITIONAL_EMBEDDINGS_SIZES
* Added more model names
* Fix truncation when no max length is specified
* Disable torchscript test
* Make style & make quality
* Quality
* Address CI needs
* Test the Masked LM model
* Fix the masked LM model
* Truncate when overflowing
* More much needed docs improvements
* Fix some URLs
* Some more docs improvements
* Add add_pooling_layer argument to TapasModel
Fix comments by @sgugger and @patrickvonplaten
* Fix issue in docs + fix style and quality
* Clean up conversion script and add task parameter to TapasConfig
* Revert the task parameter of TapasConfig
Some minor fixes
* Improve conversion script and add test for absolute position embeddings
* Improve conversion script and add test for absolute position embeddings
* Fix bug with reset_position_index_per_cell arg of the conversion cli
* Add notebooks to the examples directory and fix style and quality
* Apply suggestions from code review
* Move from `nielsr/` to `google/` namespace
* Apply Sylvain's comments
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
Co-authored-by: Rogge Niels <niels.rogge@howest.be>
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
* add model parallelism to T5EncoderModel
add model parallelism to T5EncoderModel
* remove decoder from T5EncoderModel parallelize
* uodate T5EncoderModel docs
* Extend T5ModelTest for T5EncoderModel
* fix T5Stask using range for get_device_map
* fix style
Co-authored-by: Ahmed Elnaggar <elnaggar@rostlab.informatik.tu-muenchen.de>
* rm all model cards
* Update the .rst
@sgugger it is still not super crystal clear/streamlined so let me know if any ideas to make it simpler
* Add a rootlevel README.md with simple instructions/context
* Update docs/source/model_sharing.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* make style
* rm all model cards
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* remove make on the fly linear embedding
* start refactor
* big first refactor
* save intermediate
* save intermediat
* correct mask issue
* save tests
* refactor padding masks
* make all tests pass
* further refactor
* make pegasus test pass
* fix bool if
* fix leftover tests
* continue
* bart renaming
* delete torchscript test hack
* fix imports in tests
* correct shift
* fix docs and repo cons
* re-add fix for FSTM
* typo in test
* fix typo
* fix another typo
* continue
* hot fix 2 for tf
* small fixes
* refactor types linting
* continue
* finish refactor
* fix import in tests
* better bart names
* further refactor and add test
* delete hack
* apply sylvains and lysandres commens
* small perf improv
* further perf improv
* improv perf
* fix typo
* make style
* small perf improv
* diverse beam search
* bug fixes
* bug fixes
* bug fix
* separate out diverse_beam_search function
* separate out diverse_beam_search function
* bug fix
* improve code quality
* bug fix
* bug fix
* separate out diverse beam search scorer
* code format
* code format
* code format
* code format
* add test
* code format
* documentation changes
* code quality
* add slow integration tests
* more general name
* refactor into logits processor
* add test
* avoid too much copy paste
* refactor
* add to docs
* fix-copies
* bug fix
* Revert "bug fix"
This reverts commit c99eb5a8dc.
* improve comment
* implement sylvains feedback
Co-authored-by: Ayush Jain <a.jain@sprinklr.com>
Co-authored-by: ayushtiku5 <40797286+ayushtiku5@users.noreply.github.com>
* Add TFGPT2ForSequenceClassification based on DialogRPT
* Add TFGPT2ForSequenceClassification based on DialogRPT
* TFGPT2ForSequenceClassification based on DialogRPT-refactored code, implemented review comments and added input processing
* Add TFGPT2ForSequenceClassification based on DialogRPT
* TFGPT2ForSequenceClassification based on DialogRPT-refactored code, implemented review comments and added input processing
* code refactor for latest other TF PR
* code refactor
* code refactor
* Update modeling_tf_gpt2.py
* Add badge w/ number of models on the hub
* try to apease @sgugger 😇
* not sure what this `c` was about [ci skip]
* Fix script and move stuff around
* Fix doc styling error
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
* fix mems in xlnet
* fix use_mems
* fix use_mem_len
* fix use mems
* clean docs
* fix tf typo
* make xlnet tf for generation work
* fix tf test
* refactor use cache
* add use cache for missing models
* correct use_cache in generate
* correct use cache in tf generate
* fix tf
* correct getattr typo
* make sylvain happy
* change in docs as well
* do not apply to cookie cutter statements
* fix tf test
* make pytorch model fully backward compatible
* Add early stopping patience and minimum threshold metric must improve to prevent early stopping to pytorch trainer
* Add early stopping test
* Set patience counter to 0 if best metric not defined yet
* Make early stopping a callback. Add callback event for updating the best metric for early stopping callback to trigger on.
* Run make style
* make funciton name sensible
* Improve new argument docstring wording and hope that flakey CI test passes.
* Use on_evaluation callback instead of custom. Remove some debug printing
* Move early stopping arguments and state into early stopping callback
* Run make style
* Remove old code
* Fix docs formatting. make style went rogue on me.
* Remove copied attributes and fix variable
* Add assertions on training arguments instead of mutating them. Move comment out of public docs.
* Make separate test for early stopping callback. Add test of invalid arguments.
* Run make style... I remembered before CI this time!
* appease flake8
* Add EarlyStoppingCallback to callback docs
* Make docstring EarlyStoppingCallabck match other callbacks.
* Fix typo in docs
* working on LongformerForSequenceClassification
* add TFLongformerForMultipleChoice
* add TFLongformerForTokenClassification
* use add_start_docstrings_to_model_forward
* test TFLongformerForSequenceClassification
* test TFLongformerForMultipleChoice
* test TFLongformerForTokenClassification
* remove test from repo
* add test and doc for TFLongformerForSequenceClassification, TFLongformerForTokenClassification, TFLongformerForMultipleChoice
* add requested classes to modeling_tf_auto.py
update dummy_tf_objects
fix tests
fix bugs in requested classes
* pass all tests except test_inputs_embeds
* sync with master
* pass all tests except test_inputs_embeds
* pass all tests
* pass all tests
* work on test_inputs_embeds
* fix style and quality
* make multi choice work
* fix TFLongformerForTokenClassification signature
* fix TFLongformerForMultipleChoice, TFLongformerForSequenceClassification signature
* fix mult choice
* fix mc hint
* fix input embeds
* fix input embeds
* refactor input embeds
* fix copy issue
* apply sylvains changes and clean more
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Updated the Extractive Question Answering code snippets
The Extractive Question Answering code snippets do not work anymore since the models return task-specific output objects. This commit fixes the pytorch and tensorflow examples but adding `.values()` to the model call.
* Update task_summary.rst
* Tokenizers should be framework agnostic
* Run the slow tests
* Not testing
* Fix documentation
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* <small>tiny typo</small>
* Tokenizers: ability to load from model subfolder
* use subfolder for local files as well
* Uniformize model shortcut name => model id
* from s3 => from huggingface.co
Co-authored-by: Quentin Lhoest <lhoest.q@gmail.com>
* Put models in subfolders
* Styling
* Fix imports in tests
* More fixes in test imports
* Sneaky hidden imports
* Fix imports in doc files
* More sneaky imports
* Finish fixing tests
* Fix examples
* Fix path for copies
* More fixes for examples
* Fix dummy files
* More fixes for example
* More model import fixes
* Is this why you're unhappy GitHub?
* Fix imports in conver command
* Use the CI to identify failing tests
* Remove from all examples and tests
* More default switch
* Fixes
* More test fixes
* More fixes
* Last fixes hopefully
* Use the CI to identify failing tests
* Remove from all examples and tests
* More default switch
* Fixes
* More test fixes
* More fixes
* Last fixes hopefully
* Run on the real suite
* Fix slow tests
* First addition of Flax/Jax documentation
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* make style
* Ensure input order match between Bert & Roberta
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Install dependencies "all" when building doc
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* wraps build_doc deps with ""
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Addressing @sgugger comments.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Use list to highlight JAX features.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Make style.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Let's not look to much into the future for now.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Style
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
* Create modeling_tf_dpr.py
* Add TFDPR
* Add back TFPegasus, TFMarian, TFMBart, TFBlenderBot
last commit accidentally deleted these 4 lines, so I recover them back
* Add TFDPR
* Add TFDPR
* clean up some comments, add TF input-style doc string
* Add TFDPR
* Make return_dict=False as default
* Fix return_dict bug (in .from_pretrained)
* Add get_input_embeddings()
* Create test_modeling_tf_dpr.py
The current version is already passed all 27 tests!
Please see the test run at :
https://colab.research.google.com/drive/1czS_m9zy5k-iSJbzA_DP1k1xAAC_sdkf?usp=sharing
* fix quality
* delete init weights
* run fix copies
* fix repo consis
* del config_class, load_tf_weights
They shoud be 'pytorch only'
* add config_class back
after removing it, test failed ... so totally only removing "use_tf_weights = None" on Lysandre suggestion
* newline after .. note::
* import tf, np (Necessary for ModelIntegrationTest)
* slow_test from_pretrained with from_pt=True
At the moment we don't have TF weights (since we don't have official official TF model)
Previously, I did not run slow test, so I missed this bug
* Add simple TFDPRModelIntegrationTest
Note that this is just a test that TF and Pytorch gives approx. the same output.
However, I could not test with the official DPR repo's output yet
* upload correct tf model
* remove position_ids as missing keys
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: patrickvonplaten <patrick@huggingface.co>
* [testing utils] get_auto_remove_tmp_dir default change
Now that I have been using `get_auto_remove_tmp_dir default change` for a while, I realized that the defaults aren't most optimal.
99% of the time we want the tmp dir to be empty at the beginning of the test - so changing the default to `before=True` - this shouldn't impact any tests since this feature is used only during debug.
* simplify things
* update docs
* fix doc layout
* style
* Update src/transformers/testing_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* better 3-state doc
* style
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* s/tmp/temporary/ + style
* correct the statement
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* add training tests
* correct longformer
* fix docs
* fix some tests
* fix some more train tests
* remove ipdb
* fix multiple edge case model training
* fix funnel and prophetnet
* clean gpt models
* undo renaming of albert
* Output cross-attention with decoder attention output
* Update src/transformers/modeling_bert.py
* add cross-attention for t5 and bart as well
* fix tests
* correct typo in docs
* add sylvains and sams comments
* correct typo
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Output global_attentions in Longformer models
* make style
* small refactoring
* fix tests
* make fix-copies
* add for tf as well
* remove comments in test
* make fix-copies
* make style
* add docs
* make docstring pretty
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
* first draft
* show design proposition for new generate method
* up
* make better readable
* make first version
* gpt2 tests pass
* make beam search for gpt2 work
* add first encoder-decoder code
* delete typo
* make t5 work
* save indermediate
* make bart work with beam search
* finish beam search bart / t5
* add default kwargs
* make more tests pass
* fix no bad words sampler
* some fixes and tests for all distribution processors
* fix test
* fix rag slow tests
* merge to master
* add nograd to generate
* make all slow tests pass
* speed up generate
* fix edge case bug
* small fix
* correct typo
* add type hints and docstrings
* fix typos in tests
* add beam search tests
* add tests for beam scorer
* fix test rag
* finish beam search tests
* move generation tests in seperate file
* fix generation tests
* more tests
* add aggressive generation tests
* fix tests
* add gpt2 sample test
* add more docstring
* add more docs
* finish doc strings
* apply some more of sylvains and sams comments
* fix some typos
* make fix copies
* apply lysandres and sylvains comments
* final corrections on examples
* small fix for reformer
* move the helper code into testing_utils
* port test_trainer_distributed to work with pytest
* improve docs
* simplify notes
* doc
* doc
* style
* doc
* further improvements
* torch might not be available
* real fix
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* first attempt to add AzureML callbacks
* func arg fix
* var name fix, but still won't fix error...
* fixing as in https://discuss.huggingface.co/t/how-to-integrate-an-azuremlcallback-for-logging-in-azure/1713/2
* Avoid lint check of azureml import
* black compliance
* Make isort happy
* Fix point typo in docs
* Add AzureML to Callbacks docs
* Attempt to make sphinx happy
* Format callback docs
* Make documentation style happy
* Make docs compliant to style
Co-authored-by: Davide Fiocco <davide.fiocco@frontiersin.net>
* Important files
* Styling them all
* Revert "Styling them all"
This reverts commit 7d029395fd.
* Syling them for realsies
* Fix syntax error
* Fix benchmark_utils
* More fixes
* Fix modeling auto and script
* Remove new line
* Fixes
* More fixes
* Fix more files
* Style
* Add FSMT
* More fixes
* More fixes
* More fixes
* More fixes
* Fixes
* More fixes
* More fixes
* Last fixes
* Make sphinx happy