* Update the processing so bbox coords are adjusted for padding
* Just pad masks
* Tidy up, add tests
* Better tests
* Fix yolos and mark as slow for pycocotols
* Fix yolos - return_tensors
* Clarify padding and normalization behaviour
* add sudachi_projection option
* Upgrade sudachipy>=0.6.8
* add a test case for sudachi_projection
* Compatible with older versions of SudachiPy
* make fixup
* make style
* error message for unidic download
* revert jumanpp test cases
* format options for sudachi_projection
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* format options for sudachi_split_mode and sudachi_dict_type
* comment
* add tests for full_tokenizer kwargs
* pass projection arg directly
* require_sudachi_projection
* make style
* revert upgrade sudachipy
* check is_sudachi_projection_available()
* revert dependency_version_table and bugfix
* style format
* simply raise ImportError
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* simply raise ImportError
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* refactor with addedtokens decoder
* style
* get rid of lang code to id
* style
* keep some things for BC
* update tests
* add the mask token at the end of the vocab
* nits
* nits
* fix final tests
* style
* nits
* Update src/transformers/models/nllb/tokenization_nllb_fast.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* nits
* style?
* Update src/transformers/convert_slow_tokenizer.py
* make it a tad bit more custom
* ruff please stop
Co-Authored by avidale
<dale.david@mail.ru>
* Update
Co-authored-by: avidale
<dale.david@mail.ru>
* Update
Co-authored-by: avidale <dale.david@mail.ru>
* oupts
* ouft
* nites
* test
* fix the remaining failing tests
* style
* fix failing test
* ficx other test
* temp dir + test the raw init
* update test
* style
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* This is a test commit
* testing commit
* final commit with some changes
* Removed copy statement
* Fixed formatting issues
* Fixed error added past_key_values in the forward method
* Fixed a trailing whitespace. Damn the formatting rules are strict
* Added the copy statement
* [WIP] Hard error when ignoring tensors.
* Better selection/error when saving a checkpoint.
- Find all names we should normally drop (those are in the transformers
config)
- Find all disjoint tensors (for those we can safely trigger a copy to
get rid of the sharing before saving)
- Clone those disjoint tensors getting rid of the issue
- Find all identical names (those should be declared in the config
but we try to find them all anyway.)
- For all identical names:
- If they are in the config, just ignore them everything is fine
- If they are not, warn about them.
- For all remainder tensors which are shared yet neither identical NOR
disjoint. raise a hard error.
* Adding a failing test on `main` that passes here.
* We don't need to keep the subfolder logic in this test.
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Fix typos and grammar mistakes in docs and examples
* Fix typos in docstrings and comments
* Fix spelling of `tokenizer` in model tests
* Remove erroneous spaces in decorators
* Remove extra spaces in Markdown link texts
* Adding [T5/MT5/UMT5]ForTokenClassification
* Add auto mappings for T5ForTokenClassification and variants
* Adding ForTokenClassification to the list of models
* Adding attention_mask param to the T5ForTokenClassification test
* Remove outdated comment in test
* Adding EncoderOnly and Token Classification tests for MT5 and UMT5
* Fix typo in umt5 string
* Add tests for all the existing MT5 models
* Fix wrong comment in dependency_versions_table
* Reverting change to common test for _keys_to_ignore_on_load_missing
The test is correctly picking up redundant keys in _keys_to_ignore_on_load_missing.
* Removing _keys_to_ignore_on_missing from MT5 since the key is not used in the model
* Add fix-copies to MT5ModelTest
* up
* Fix more
* Correct more
* Fix more tests
* fix fast tests
* Fix more
* fix more
* push all files
* finish all
* make style
* Fix timestamp wrap
* make style
* make style
* up
* up
* up
* Fix lang detection behavior
* Fix lang detection behavior
* Add lang detection test
* Fix lang detection behavior
* make style
* Update src/transformers/models/whisper/generation_whisper.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* better error message
* make style tests
* add warning
---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* test that tied output embeddings aren't initialized on load
* don't initialize the output embeddings if we're going to tie them to the input embeddings
* Port core files + ESM (because ESM code is odd)
* Search-replace in modelling code
* Fix up transfo_xl as well
* Fix other core files + tests (still need to add correct import to tests)
* Fix cookiecutter
* make fixup, fix imports in some more core files
* Auto-add imports to tests
* Cleanup, add imports to sagemaker tests
* Use correct exception for importing tf_keras
* Fixes in modeling_tf_utils
* make fixup
* Correct version parsing code
* Ensure the pipeline tests correctly revert to float32 after each test
* Ensure the pipeline tests correctly revert to float32 after each test
* More tf.keras -> keras
* Add dtype cast
* Better imports of tf_keras
* Add a cast for tf.assign, just in case
* Fix callback imports
* Enable instantiating model with pretrained backbone weights
* Remove doc updates until changes made in modeling code
* Use load_backbone instead
* Add use_timm_backbone to the model configs
* Add missing imports and arguments
* Update docstrings
* Make sure test is properly configured
* Include recent DPT updates
* fix the function load_balancing_loss_func in Mixtral_Moe to include attention_mask
* format code using black and ruff
* skip computing mask if attention_mask=None
* add tests for load balancing loss Mixtral-Moe
* fix assert loss is different in mixtral_test
* fix pad_leng
* use assertNotAlmostEqual and print to debug
* remove print for debug
* minor updates
* reduce rtol and atol
* Enable instantiating model with pretrained backbone weights
* Update tests so backbone checkpoint isn't passed in
* Remove doc updates until changes made in modeling code
* Clarify pretrained import
* Update configs - docs and validation check
* Update src/transformers/utils/backbone_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Clarify exception message
* Update config init in tests
* Add test for when use_timm_backbone=True
* Small test updates
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* [DETA] fix freeze/unfreeze function
* Update src/transformers/models/deta/modeling_deta.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/deta/modeling_deta.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* add freeze/unfreeze test case in DETA
* fix type
* fix typo 2
* fix : enable aux and enc loss in training pipeline
* Add unsynced variables from original DETA for training
* modification for passing CI test
* make style
* make fix
* manual make fix
* change deta_modeling_test of configuration 'two_stage' default to TRUE and minor change of dist checking
* remove print
* divide configuration in DetaModel and DetaForObjectDetection
* image smaller size than 224 will give topk error
* pred_boxes and logits should be equivalent to two_stage_num_proposals
* add missing part in DetaConfig
* Update src/transformers/models/deta/modeling_deta.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* add docstring in configure and prettify TO DO part
* change distribute related code to accelerate
* Update src/transformers/models/deta/configuration_deta.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/deta/test_modeling_deta.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* protect importing accelerate
* change variable name to specific value
* wrong import
* fix aux_loss in conditional_detr
* add test aux_loss
* add aux_loss test in deta and table_transformer
* fix yolos since it doesn't have auxiliary function
* fix maskformer auxiliary_loss related code
* make style
* change param 'auxiliary_loss' to 'use_auxiliary_loss'
* change param 'auxiliary_loss' to 'use_auxiliary_loss' in tests
* make style & fix-copies, also revert yolos related parameter
* revert variable name 'use_auxiliary_loss' to 'auxiliary_loss' due to DetrConfig
* revert variable name in yolos
* revert maskformer
* add aux_loss test in maskformer
* make style
* Update src/transformers/models/yolos/configuration_yolos.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Allow non-special tokens to be added
* Add test, fix token adding code
* Revert changes to id_to_token and token_to_id
* Update the ESM tokenizer to be a bit more standardized
* Update src/transformers/models/esm/tokenization_esm.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* finalize
* make fix copies whisper
* [Tests] Make sure that we don't run tests mulitple times
* Update src/transformers/models/whisper/modeling_whisper.py
* [Tests] Make sure that we don't run tests mulitple times
* fix more
* improve
* improve
* improve further
* improve more
* improve
* fix more
* git commit and git push
* fix more
* fix more
* fix more
* New try
* Fix more whisper stuff
* Improve
* correct more
* correct more
* correct more
* Fix some tests
* Add more tests
* correct more
* correct more
* correct more
* push
* correct more
* Fix more
* Better
* without dec mask
* correct more
* clean
* save intermediate
* Fix more
* Fix VAD for large-v2
* Save new
* Correct more
* make cleaner
* correct tests
* correct src
* Finish
* Fix more
* Fix more
* finish
* Fix edge cases
* fix return_dict_in_generate
* fix all tests
* make style
* add docstrings
* add docstrings
* Fix logit processor
* make style
* fix pipeline test
* fix more style
* Apply suggestions from code review
* apply feedback Sanchit
* correct more
* Apply suggestions from code review
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* correct more
* correct more
* correct more
* Fix staticmethod
* correct more
* fix
* fix slow tests
* make style
* fix tokenizer test
* fix tokenizer test
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* finish
* finish
* revert kwargs change
---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>