transformers/tests/utils
Cyril Vallez 163138a911
🚨🚨[core] Completely rewrite the masking logic for all attentions (#37866)
* start

* start having a clean 4d mask primitive

* Update mask_utils.py

* Update mask_utils.py

* switch name

* Update masking_utils.py

* add a new AttentionMask tensor class

* fix import

* nits

* fixes

* use full and quadrants

* general sdpa mask for all caches

* style

* start some tests

* tests with sliding, chunked

* add styling

* test hybrid

* Update masking_utils.py

* small temp fixes

* Update modeling_gemma2.py

* compile compatible

* Update masking_utils.py

* improve

* start making it more general

* Update masking_utils.py

* generate

* make it work with flex style primitives!

* Update masking_utils.py

* Update masking_utils.py

* Update masking_utils.py

* improve

* Update cache_utils.py

* Update masking_utils.py

* simplify - starting to look good!

* Update masking_utils.py

* name

* Update masking_utils.py

* style

* Update masking_utils.py

* Update masking_utils.py

* Update masking_utils.py

* Update masking_utils.py

* small fix for flex

* flex compile

* FA2

* Update masking_utils.py

* Escape for TGI/vLLM!

* Update masking_utils.py

* Update masking_utils.py

* Update masking_utils.py

* General case without cache

* rename

* full test on llama4

* small fix for FA2 guard with chunk

* Update modeling_gemma2.py

* post rebase cleanup

* FA2 supports static cache!

* Update modeling_flash_attention_utils.py

* Update flex_attention.py

* Update masking_utils.py

* Update masking_utils.py

* Update utils.py

* override for export

* Update executorch.py

* Update executorch.py

* Update executorch.py

* Update executorch.py

* Update masking_utils.py

* Update masking_utils.py

* output attentions

* style

* Update masking_utils.py

* Update executorch.py

* Add docstring

* Add license and put mask visualizer at the end

* Update test_modeling_common.py

* fix broken test

* Update test_modeling_gemma.py

* Update test_modeling_gemma2.py

* Use fullgraph=False with FA2

* Update utils.py

* change name

* Update masking_utils.py

* improve doc

* change name

* Update modeling_attn_mask_utils.py

* more explicit logic based on model's property

* pattern in config

* extend

* fixes

* make it better

* generalize to other test models

* fix

* Update masking_utils.py

* fix

* do not check mask equivalence if layer types are different

* executorch

* Update modeling_gemma2.py

* Update masking_utils.py

* use layer_idx instead

* adjust

* Update masking_utils.py

* test

* fix imports

* Update modeling_gemma2.py

* other test models

* Update modeling_llama4.py

* Update masking_utils.py

* improve

* simplify

* Update masking_utils.py

* typos

* typo

* fix

* Update masking_utils.py

* default DynamicCache

* remove default cache

* simplify

* Update masking_utils.py

* Update masking_utils.py

* Update masking_utils.py

* Update masking_utils.py

* simplify

* Update masking_utils.py

* Update masking_utils.py

* Update masking_utils.py

* export

* Update executorch.py

* Update executorch.py

* Update flex_attention.py

* Update executorch.py

* upstream to modular gemma 1 & 2

* Update modular_mistral.py

* switch names

* use dict

* put it in the Layer directly

* update copy model source for mask functions

* apply so many modular (hopefully 1 shot)

* use explicit dicts to make style happy

* protect import

* check docstring

* better default in hybrid caches

* qwens

* Update modular_qwen2.py

* simplify core logic!

* Update executorch.py

* qwen3 moe

* Update masking_utils.py

* Update masking_utils.py

* simplify a lot sdpa causal skip

* Update masking_utils.py

* post-rebase

* gemma3 finally

* style

* check it before

* gemma3

* More general with newer torch

* align gemma3

* Update utils.py

* Update utils.py

* Update masking_utils.py

* Update test_modeling_common.py

* Update flex_attention.py

* Update flex_attention.py

* Update flex_attention.py

* test

* executorch

* Update test_modeling_common.py

* Update masking_utils.py

* Update masking_utils.py

* Update masking_utils.py

* Update masking_utils.py

* Update executorch.py

* Update test_modeling_common.py

* fix copies

* device

* sdpa can be used without mask -> pass the torchscript tests in this case

* Use enum for check

* revert enum and add check instead

* remove broken test

* cohere2

* some doc & reorganize the Interface

* Update tensor_parallel.py

* Update tensor_parallel.py

* doc and dummy

* Update test_modeling_paligemma2.py

* Update modeling_falcon_h1.py

* Update masking_utils.py

* executorch patch

* style

* CIs

* use register in executorch

* final comments!

---------

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
2025-05-22 11:38:26 +02:00
import_structures Support for version spec in requires & arbitrary mismatching depths across folders (#37854) 2025-05-09 15:26:27 +02:00
__init__.py [Test refactor 1/5] Per-folder tests reorganization (#15725) 2022-02-23 15:46:28 -05:00
test_activations_tf.py TF: Add sigmoid activation function (#16819) 2022-04-19 16:13:08 +01:00
test_activations.py use torch.testing.assertclose instead to get more details about error in cis (#35659) 2025-01-24 16:55:28 +01:00
test_add_new_model_like.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_audio_utils.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_auto_docstring.py [AutoDocstring] Based on inspect parsing of the signature (#33771) 2025-05-08 17:46:07 -04:00
test_backbone_utils.py 🚨 out_indices always a list (#30941) 2024-05-22 15:23:04 +01:00
test_cache_utils.py 🚨🚨[core] Completely rewrite the masking logic for all attentions (#37866) 2025-05-22 11:38:26 +02:00
test_chat_template_utils.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_cli.py Transformers cli clean command (#37657) 2025-04-30 12:15:43 +01:00
test_configuration_utils.py Fix typos in strings and comments (#37799) 2025-04-28 11:39:11 +01:00
test_convert_slow_tokenizer.py Revert error back into warning for byte fallback conversion. (#22607) 2023-04-06 14:00:29 +02:00
test_deprecation.py enable utils test cases on XPU (#38005) 2025-05-09 08:45:01 +02:00
test_doc_samples.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_dynamic_module_utils.py Fix the regex in get_imports to support multiline try blocks and excepts with specific exception types (#23725) 2023-05-24 15:40:19 -04:00
test_expectations.py enable 2 llama UT cases on xpu (#37126) 2025-04-07 16:02:14 +02:00
test_feature_extraction_utils.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_file_utils.py Inheritance-based framework detection (#21784) 2023-02-27 15:31:55 +00:00
test_generic.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_hf_argparser.py Improve typing in TrainingArgument (#36944) 2025-05-21 13:54:38 +00:00
test_hub_utils.py Add test to ensure unknown exceptions reraising in utils/hub.py::cached_files() (#37651) 2025-04-22 11:38:10 +02:00
test_image_processing_utils.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_image_utils.py 🔴 Video processors as a separate class (#35206) 2025-05-12 11:55:51 +02:00
test_import_structure.py Support for version spec in requires & arbitrary mismatching depths across folders (#37854) 2025-05-09 15:26:27 +02:00
test_import_utils.py Fix test isolation for clear_import_cache utility (#36345) 2025-03-17 16:09:09 +01:00
test_logging.py Fix flaky test for log level (#21776) 2023-02-28 16:24:14 -05:00
test_model_card.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_model_debugging_utils.py Model debugger upgrades (#37391) 2025-04-18 16:45:54 +02:00
test_model_output.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_modeling_flax_utils.py [tests] remove flax-pt equivalence and cross tests (#36283) 2025-02-19 15:13:27 +00:00
test_modeling_rope_utils.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_modeling_tf_core.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_modeling_tf_utils.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_modeling_utils.py Support for transformers explicit filename (#38152) 2025-05-19 14:33:47 +02:00
test_offline.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_processing_utils.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_skip_decorators.py enable utils test cases on XPU (#38005) 2025-05-09 08:45:01 +02:00
test_tokenization_utils.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_versions_utils.py Remove old code for PyTorch, Accelerator and tokenizers (#37234) 2025-04-10 20:54:21 +02:00
test_video_utils.py [video processor] fix tests (#38104) 2025-05-14 10:24:07 +00:00
tiny_model_summary.json Add image text to text pipeline (#34170) 2024-10-31 15:48:11 -04:00