* add tests
* fix whisper
* update
* nit
* add qwen2-vl
* more updates!
* better this way
* fix this one
* fix more tests
* fix final tests, hope so
* fix led
* Update tests/generation/test_utils.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* pr comments
* do not pass pixels and extra for low-mem tests, very flaky because of vision tower
---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* added sequences_scores to the output
* added beam_indices to output
* added test to check for beam_indices, sequences_scores and their shape
* removed redundant whitespaces
* make fixup
* Fix failing tensor placement in Whisper
* fix long form generation tests
* more return_timestamps=True
* make fixup
* [run_slow] whisper
* [run_slow] whisper
* Add validation for maximum sequence length in modeling_whisper.py
Added a validation check to ensure that the sequence length of labels does not exceed the maximum allowed length of 448 tokens. If the sequence length exceeds this limit, a ValueError is raised with a descriptive error message.
This change prevents the model from encountering errors or unexpected behavior due to excessively long sequences during training or fine-tuning, ensuring consistent input dimensions and improving overall robustness.
* Change exception message in src/transformers/models/whisper/modeling_whisper.py
The exception message is for whisper's label's sequence max length.
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
* Change 448 to config.max_target_positions in src/transformers/models/whisper/modeling_whisper.py
It's for whisper's config.max_target_positions.
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
* Change method's documentation in src/transformers/models/whisper/modeling_whisper.py
* Add test for maximum label's sequence length in test_modeling_whisper.py
* Add self to modeling_whisper.py
* Update test_modeling_whisper.py with respect to automatic validations
* Update modeling_whisper.py with respect to ci/circleci: check_code_quality
* Update test_modeling_whisper.py with respect to ci/circleci: check_code_quality
* Update test_modeling_whisper.py with respect to ci/circleci: tests_generate
* Update test_modeling_whisper.py with respect to ci/circleci: tests_generate
* Update test_modeling_whisper.py with respect to ci/circleci: check_code_quality
* Separate test_labels_sequence_max_length tests in test_modeling_whisper.py
* Update test_modeling_whisper.py with respect to ci/circleci: check_code_quality
* Remove assert from test_modeling_whisper.py
* Add max_target_positions to WhisperModelTester in test_modeling_whisper.py
* Update test_modeling_whisper.py with respect to ci/circleci: check_code_quality
* Update test_modeling_whisper.py with respect to ci/circleci: tests_generate
* Update test_modeling_whisper.py
* Change test_labels_sequence_max_length_error_after_changing_config in test_modeling_whisper.py
* Change self.config.max_target_positions to self.max_target_positions modeling_whisper.py
* Add new tests in test_modeling_whisper.py
* Update test_modeling_whisper.py
---------
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
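A minimal sketch of the label-length validation described above, assuming the check compares the second dimension of the labels' shape against max_target_positions. The helper name check_labels_length and the exact message wording are hypothetical and are not taken from modeling_whisper.py.

```python
def check_labels_length(labels_shape, max_target_positions=448):
    """Return the labels' sequence length, or raise if it exceeds the limit.

    `labels_shape` is assumed to be (batch_size, sequence_length).
    `max_target_positions` stands in for config.max_target_positions.
    """
    seq_len = labels_shape[1]
    if seq_len > max_target_positions:
        raise ValueError(
            f"Labels' sequence length {seq_len} cannot exceed the maximum "
            f"allowed length of {max_target_positions} tokens."
        )
    return seq_len


# A batch of 2 label sequences at exactly the limit passes the check.
check_labels_length((2, 448))
```

Reading the limit from the config rather than hard-coding 448 keeps the check valid for checkpoints that use a different max_target_positions.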
* mvp
* added test (a few models need fixes)
* fix a few test cases
* test nits
* harder test 😈
* revert changes in stablelm
* test with improved condition
* add todo
* tmp commit
* merged with main
* nits
* add todo
* final corrections
* add docs for generation compilation
* docs nits
* add tip
* PR suggestions
* add more details to the compilation docs
* fix cache positions
* cache is now init in generate; update docs
* tag test as flaky
* docs
* post rebase make fixup and other nits
* remove unintended changes
* whisper (encoder-decoder) not supported
* move token default updates to ; add tests for token defaults
* push changes
* manual rebase
* chameleon doesn't support this
* fix test_static_cache_mha_mqa_gqa (broken in another PR)
* docs: dynamic is better with end-to-end compilation
* Pass datasets trust_remote_code
* Pass trust_remote_code in more tests
* Add trust_remote_dataset_code arg to some tests
* Revert "Temporarily pin datasets upper version to fix CI"
This reverts commit b7672826ca.
* Pass trust_remote_code in librispeech_asr_dummy docstrings
* Revert "Pin datasets<2.20.0 for examples"
This reverts commit 833fc17a3e.
* Pass trust_remote_code to all examples
* Revert "Add trust_remote_dataset_code arg to some tests" to research_projects
* Pass trust_remote_code to tests
* Pass trust_remote_code to docstrings
* Fix flax examples tests requirements
* Pass trust_remote_dataset_code arg to tests
* Replace trust_remote_dataset_code with trust_remote_code in one example
* Fix duplicate trust_remote_code
* Replace args.trust_remote_dataset_code with args.trust_remote_code
* Replace trust_remote_dataset_code with trust_remote_code in parser
* Replace trust_remote_dataset_code with trust_remote_code in dataclasses
* Replace trust_remote_dataset_code with trust_remote_code arg
* Rename to test_model_common_attributes
The method name is misleading: it tests getting and setting embeddings, not attributes common to all models
* Explicitly skip
* Add support for mixing languages in a single batch
* Update docstring
* Enable different detected languages in batch
* Do not require input_features
* Test list of languages
* Fix comment
* Make init_tokens length-1 if possible, broadcast at the end
* Test for ValueError with language list of incorrect length
* Slow test for batched multilingual transcription
* fixup
* Apply suggestions from code review
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Address review, refactor
* Second attempt to move this line where it was originally
* Split test, fix a bug
---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Fix generate_with_fallback **kwargs
* Change pop to get
* Delete keys from kwargs to prevent overriding generation_config
* Revert to passing kwargs by reference, but make a (shallow) copy
* dict -> copy.copy
* Add test_whisper_longform_multi_batch_beam
* add tests for batching support
* Update src/transformers/models/fastspeech2_conformer/modeling_fastspeech2_conformer.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update src/transformers/models/fastspeech2_conformer/modeling_fastspeech2_conformer.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update tests/test_modeling_common.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update tests/test_modeling_common.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update tests/test_modeling_common.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* fixes and comments
* use cosine distance for conv models
* skip mra model testing
* Update tests/models/vilt/test_modeling_vilt.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* finalize and make style
* check model type by input names
* Update tests/models/vilt/test_modeling_vilt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* fixed batch size for all testers
* Revert "fixed batch size for all testers"
This reverts commit 525f3a0a05.
* add batch_size for all testers
* dict from model output
* do not skip layoutlm
* bring back some code from git revert
* Update tests/test_modeling_common.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/test_modeling_common.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* clean-up
* where did minus go in tolerance
* make whisper happy
* deal with consequences of losing minus
* deal with consequences of losing minus
* maskformer needs its own test for happiness
* fix more models
* tag flaky CV models from Amy's approval
* make codestyle
---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* left-padding test revisited
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* up
* Fix more
* Correct more
* Fix more tests
* fix fast tests
* Fix more
* fix more
* push all files
* finish all
* make style
* Fix timestamp wrap
* make style
* make style
* up
* up
* up
* Fix lang detection behavior
* Fix lang detection behavior
* Add lang detection test
* Fix lang detection behavior
* make style
* Update src/transformers/models/whisper/generation_whisper.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* better error message
* make style tests
* add warning
---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* finalize
* make fix copies whisper
* [Tests] Make sure that we don't run tests multiple times
* Update src/transformers/models/whisper/modeling_whisper.py
* [Tests] Make sure that we don't run tests multiple times
* fix more
* improve
* improve
* improve further
* improve more
* improve
* fix more
* git commit and git push
* fix more
* fix more
* fix more
* New try
* Fix more whisper stuff
* Improve
* correct more
* correct more
* correct more
* Fix some tests
* Add more tests
* correct more
* correct more
* correct more
* push
* correct more
* Fix more
* Better
* without dec mask
* correct more
* clean
* save intermediate
* Fix more
* Fix VAD for large-v2
* Save new
* Correct more
* make cleaner
* correct tests
* correct src
* Finish
* Fix more
* Fix more
* finish
* Fix edge cases
* fix return_dict_in_generate
* fix all tests
* make style
* add docstrings
* add docstrings
* Fix logit processor
* make style
* fix pipeline test
* fix more style
* Apply suggestions from code review
* apply feedback Sanchit
* correct more
* Apply suggestions from code review
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* correct more
* correct more
* correct more
* Fix staticmethod
* correct more
* fix
* fix slow tests
* make style
* fix tokenizer test
* fix tokenizer test
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* finish
* finish
* revert kwargs change
---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* add sdpa
* wip
* cleaning
* add ref
* yet more cleaning
* and more :)
* wip llama
* working llama
* add output_attentions=True support
* bigcode sdpa support
* fixes
* gpt-bigcode support, require torch>=2.1.1
* add falcon support
* fix conflicts falcon
* style
* fix attention_mask definition
* remove output_attentions from attnmaskconverter
* support whisper without removing any Copied from statement
* fix mbart default to eager renaming
* fix typo in falcon
* fix is_causal in SDPA
* check is_flash_attn_2_available in the models init as well in case the model is not initialized through from_pretrained
* add warnings when falling back on the manual implementation
* precise doc
* wip replace _flash_attn_enabled by config.attn_implementation
* fix typo
* add tests
* style
* add a copy.deepcopy on the config in from_pretrained, as we do not want to modify it inplace
* obey config.attn_implementation if a config is passed in from_pretrained
* fix is_torch_sdpa_available when torch is not installed
* remove dead code
* Update src/transformers/modeling_attn_mask_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/modeling_attn_mask_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/modeling_attn_mask_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/modeling_attn_mask_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/modeling_attn_mask_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/bart/modeling_bart.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* remove duplicate pretraining_tp code
* add dropout in llama
* precise comment on attn_mask
* add fmt: off for _unmask_unattended docstring
* precise num_masks comment
* nuke pretraining_tp in LlamaSDPAAttention following Arthur's suggestion
* cleanup modeling_utils
* backward compatibility
* fix style as requested
* style
* improve documentation
* test pass
* style
* add _unmask_unattended tests
* skip meaningless tests for idefics
* hard_check SDPA requirements when specifically requested
* standardize the use of XXX_ATTENTION_CLASSES
* fix SDPA bug with mem-efficient backend on CUDA when using fp32
* fix test
* rely on SDPA is_causal parameter to handle the causal mask in some cases
* fix FALCON_ATTENTION_CLASSES
* remove _flash_attn_2_enabled occurrences
* fix test
* add OPT to the list of supported flash models
* improve test
* properly test on different SDPA backends, on different dtypes & properly handle separately the pad tokens in the test
* remove remaining _flash_attn_2_enabled occurrence
* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/modeling_attn_mask_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update docs/source/en/perf_infer_gpu_one.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* remove use_attn_implementation
* fix docstring & slight bug
* make attn_implementation internal (_attn_implementation)
* typos
* fix tests
* deprecate use_flash_attention_2=True
* fix test
* add back llama that was removed by mistake
* fix tests
* remove _flash_attn_2_enabled occurrences bis
* add check & test that passed attn_implementation is valid
* fix falcon torchscript export
* fix device of mask in tests
* add tip about torch.jit.trace and move bt doc below sdpa
* fix parameterized.expand order
* move tests from test_modeling_attn_mask_utils to test_modeling_utils as a relevant test class is already there
* update sdpaattention class with the new cache
* Update src/transformers/configuration_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/bark/modeling_bark.py
* address review comments
* WIP torch.jit.trace fix. left: test both eager & sdpa
* add test for torch.jit.trace for both eager/sdpa
* fix falcon with torch==2.0 that needs to use sdpa
* fix doc
* hopefully last fix
* fix key_value_length that has no default now in mask converter
* is it flaky?
* fix speculative decoding bug
* tests do pass
* fix following #27907
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* [Whisper] Add seq gen
* [Whisper] Add seq gen
* more debug
* Fix whisper logit processor
* Improve whisper code further
* Fix more
* more debug
* more debug
* Improve further
* Add tests
* Prep for batch size > 1
* Get batch_size>1 working
* Correct more
* Add extensive tests
* more debug
* more debug
* more debug
* add more tests
* more debug
* Apply suggestions from code review
* more debug
* add comments to explain the code better
* add comments to explain the code better
* add comments to explain the code better
* Add more examples
* add comments to explain the code better
* fix more
* add comments to explain the code better
* add comments to explain the code better
* correct
* correct
* finalize
* Apply suggestions from code review
* Apply suggestions from code review
* try to stylify using ruff
* might need to remove these changes?
* use ruff format and ruff check
* use isinstance instead of type comparison
* use # fmt: skip
* use # fmt: skip
* nits
* some styling changes
* update ci job
* nits isinstance
* more files update
* nits
* more nits
* small nits
* check and format
* revert wrong changes
* actually use formatter instead of checker
* nits
* well docbuilder is overwriting this commit
* revert notebook changes
* try to nuke docbuilder
* style
* fix feature extraction test
* remove `indent-width = 4`
* fixup
* more nits
* update the ruff version that we use
* style
* nuke docbuilder styling
* leave the print for detected changes
* nits
* Remove file I/O
Co-authored-by: charliermarsh <charlie.r.marsh@gmail.com>
* style
* nits
* revert notebook changes
* Add # fmt skip when possible
* Add # fmt skip when possible
* Fix
* More ` # fmt: skip` usage
* More ` # fmt: skip` usage
* More ` # fmt: skip` usage
* Nits
* more fixes
* fix tapas
* Another way to skip
* Recommended way
* Fix two more files
* Remove asynch
---------
Co-authored-by: charliermarsh <charlie.r.marsh@gmail.com>