* add test in trainer and test tokenizer saving wi
th trainer
* quality
* reverse trainer changes
* replace test in test_trainer by a test for all the tokenizers
* format
* add can_save_slow_tokenizer attribute to all tokenizers
* fix Herbert
* format
* Change comment in error
* add comments and a new assert
* Update src/transformers/models/albert/tokenization_albert_fast.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* change ValueError barthez
* change ValueError BigBird
* change ValueError Camembert
* change ValueError Mbart50
* change ValueError Pegasus
* change ValueError ReFormer
* change ValueError T5
* change ValueError RoBERTa
* XLNET fast
* Update tests/test_tokenization_common.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* change `assert` into `self.assertIn`
* format
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fix_torch_device_generate_test
* remove @
* up
* correct some bugs
* correct model
* finish speech2text extension
* up
* up
* up
* up
* Update utils/custom_init_isort.py
* up
* up
* update with tokenizer
* correct old tok
* correct old tok
* fix bug
* up
* up
* add more tests
* up
* fix docs
* up
* fix some more tests
* add better config
* correct some more things
"
* fix tests
* improve docs
* Apply suggestions from code review
* Apply suggestions from code review
* final fixes
* finalize
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* apply suggestions Lysandre and Sylvain
* apply nicos suggestions
* upload everything
* finish
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
Co-authored-by: your_github_username <your_github_email>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* added token_type_ids buffer to fix the issue #5664
* Handling the case that position_id buffer is not registered
* added token_type_ids buffer to fix the issue #5664
* modified to support device conversion when the model is traced
* registered buffer for position-ids to address issues similar to issue#5664
* added comment
* added the flag to prevent from adding the buffer into the state_dict
* Add the audio classification pipeline
* Remove autoconfig exception
* Mark ffmpeg test as slow
* Rearrange pipeline tests
* Add small test
* Replace asserts with ValueError
* Add option to add flax
* Add flax template for __init__.py
* Add flax template for .rst
* Copy TF modeling template
* Add a missing line in modeling_tf_... template
* Update first half of modeling_flax_..
* Update encoder flax template
* Copy test_modeling_tf... as test_modeling_flax...
* Replace some TF to Flax in test_modeling_flax_...
* Replace tf to np
some function might not work, like _assert_tensors_equal
* Replace remaining tf to np (might not work)
* Fix cookiecutter
* Add Flax in to_replace_... template
* Update transformers-cli add-new-model
* Save generate_flax in configuration.json
This will be read by transformers-cli
* Fix to_replace_... and cli
* Fix replace cli
* Fix cookiecutter name
* Move docstring earlier to avoid not defined error
* Fix a missing Module
* Add encoder-decoder flax template from bart
* Fix flax test
* Make style
* Fix endif
* Fix replace all "utf-8 -> unp-8"
* Update comment
* Fix flax template (add missing ..._DOCSTRING)
* Use flax_bart imports in template (was t5)
* Fix unp
* Update templates/adding_a_new_model/tests
* Revert "Fix unp"
This reverts commit dc9002a41d.
* Remove one line of copied from to suppress CI error
* Use generate_tensorflow_pytorch_and_flax
* Add a missing part
* fix typo
* fix flax config
* add examples for flax
* small rename
* correct modeling imports
* correct auto loading
* corrects some flax tests
* correct small typo
* correct as type
* finish modif
* correct more templates
* final fixes
* add file testers
* up
* make sure tests match template regex
* correct pytorch
* correct tf
* correct more tf
* correct imports
* minor error
* minor error
* correct init
* more fixes
* correct more flax tests
* correct flax test
* more fixes
* correct docs
* update
* fix
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Adding a TF variant of the DataCollatorForTokenClassification to get feedback
* Added a Numpy variant and a post_init check to fail early if a missing import is found
* Fixed call to Numpy variant
* Added a couple more of the collators
* Update src/transformers/data/data_collator.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Fixes, style pass, finished DataCollatorForSeqToSeq
* Added all the LanguageModeling DataCollators, except SOP and PermutationLanguageModeling
* Adding DataCollatorForPermutationLanguageModeling
* Style pass
* Add missing `__call__` for PLM
* Remove `post_init` checks for frameworks because the imports inside them were making us fail code quality checks
* Remove unused imports
* First attempt at some TF tests
* A second attempt to make any of those tests actually work
* TF tests, round three
* TF tests, round four
* TF tests, round five
* TF tests, all enabled!
* Style pass
* Merging tests into `test_data_collator.py`
* Merging tests into `test_data_collator.py`
* Fixing up test imports
* Fixing up test imports
* Trying shuffling the conditionals around
* Commenting out non-functional old tests
* Completed all tests for all three frameworks
* Style pass
* Fixed test typo
* Style pass
* Move standard `__call__` method to mixin
* Rearranged imports for `test_data_collator`
* Fix data collator typo "torch" -> "pt"
* Fixed the most embarrassingly obvious bug
* Update src/transformers/data/data_collator.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Renaming mixin
* Updating docs
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Dalton Walker <dalton_walker@icloud.com>
Co-authored-by: Andrew Romans <andrew.romans@hotmail.com>
* Deberta_v2 tf
* added new line at the end of file, make style
* +V2, typo
* remove never executed branch of code
* rm cmnt and fixed typo in url filter
* cleanup according to review comments
* added #Copied from
* added missing __spec__ to _LazyModule
* test __spec__ is not None after module import
* changed module_spec arg to be optional in _LazyModule
* fix style issue
* added module spec test to test_file_utils
* Correct outdated function signatures on website.
* Upgrade sphinx to 3.5.4 (latest 3.x)
* Test
* Test
* Test
* Test
* Test
* Test
* Revert unnecessary changes.
* Change sphinx version to 3.5.4"
* Test python 3.7.11