transformers/tests
Poedator 7c62e69326
GPT2Model StaticCache support (#35761)
* initial GPT2 changes

* causal_mask support

* return_legacy_cache

* cleanup

* fix1

* outputs shape fixes

* gpt2 return fix

* pkv, attn fixes

* fix dual_head

* is_causal arg fix

* decision transformer updated

* style fix

* batch_size from inputs_embeds

* DecisionTransformerModel fixes

* cross-attn support + cache warning

* x-attn @decision

* EDCache proper init

* simplified logic in `if use_cache:` for GPT2Model

* @deprecate_kwarg for DecisionTr attn fwd

* @deprecate_kwarg in gpt2

* deprecation version updated to 4.51

* kwargs in gradient_checkpointing_fn

* rename next_cache to past_key_values

* attention_mask prep

* +cache_position in GPT2DoubleHeadsModel

* undo kwargs in gradient checkpointing

* moved up `if self.gradient_checkpointing`

* consistency in decision_transformer

* pastkv, cache_pos in grad_checkpt args

* rm _reorder_cache

* output_attentions streamlined

* decision_transformer consistency

* return_legacy_cache improved

* ClvpForCausalLM used for legacy cache test now

* is_causal fixed

* attn_output cleanup

* consistency @ decision_transformer

* Updated deprecation notice version to 4.52

* upd deprecation

* consistent legacy cache code in decision transformers

* next_cache -> past_kv in decision_tr

* cache support flags in decision_transf

* rm legacy cache warning

* consistency in cache init for decision transf

* no Static Cache for Decision Transformer

---------

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
2025-04-24 14:46:35 +02:00
bettertransformer Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
deepspeed Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
extended Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
fixtures Implementation of SuperPoint and AutoModelForKeypointDetection (#28966) 2024-03-19 14:43:02 +00:00
fsdp Remove old code for PyTorch, Accelerator and tokenizers (#37234) 2025-04-10 20:54:21 +02:00
generation [tests, qwen2_5_omni] fix flaky tests (#37721) 2025-04-23 17:54:12 +01:00
models Update MllamaForConditionalGenerationIntegrationTest (#37750) 2025-04-24 14:29:46 +02:00
optimization Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
peft_integration enable several cases on XPU (#37516) 2025-04-16 11:01:04 +02:00
pipelines Process inputs directly in apply_chat_template in image-text-to-text pipeline (#35616) 2025-04-23 13:31:33 -04:00
quantization Fixing quantization tests (#37650) 2025-04-22 13:59:57 +02:00
repo_utils Simplify soft dependencies and update the dummy-creation process (#36827) 2025-04-11 11:08:36 +02:00
sagemaker Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
tensor_parallel enable tp on CPU (#36299) 2025-03-31 10:55:47 +02:00
tokenization Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
trainer enable 4 test_trainer cases on XPU (#37645) 2025-04-23 21:29:42 +02:00
utils GPT2Model StaticCache support (#35761) 2025-04-24 14:46:35 +02:00
__init__.py
test_backbone_common.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_configuration_common.py Update composition flag usage (#36263) 2025-04-09 11:48:49 +02:00
test_feature_extraction_common.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_image_processing_common.py Bridgetower fast image processor (#37373) 2025-04-16 22:39:18 +02:00
test_image_transforms.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_modeling_common.py Small fix on context manager detection (#37562) 2025-04-17 15:39:44 +02:00
test_modeling_flax_common.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_modeling_tf_common.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_pipeline_mixin.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_processing_common.py Add Qwen2.5-Omni (#36752) 2025-04-14 12:36:41 +02:00
test_sequence_feature_extraction_common.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_tokenization_common.py 🚨 🚨 Allow saving and loading multiple "raw" chat template files (#36588) 2025-04-11 16:37:23 +01:00
test_training_args.py Fix TrainingArguments.torch_empty_cache_steps post_init check (#36734) 2025-03-17 16:09:46 +01:00