transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Poedator 7c62e69326 `GPT2Model` StaticCache support (#35761 )	2025-04-24 14:46:35 +02:00

* initial GPT2 changes
* causal_mask support
* return_legacy_cache
* cleanup
* fix1
* outputs shape fixes
* gpt2 return fix
* pkv, attn fixes
* fix dual_head
* is_causal arg fix
* decision transformer updated
* style fix
* batch_size from inputs_embeds
* DecisionTransformerModel fixes
* cross-attn support + cache warning
* x-attn @decision
* EDCache proper init
* simplified logic in `if use_cache:` for GPT2Model
* @deprecate_kwarg for DecisionTr attn fwd
* @deprecate_kwarg in gpt2
* deprecation version updated to 4.51
* kwargs in gradient_checkpointing_fn
* rename next_cache to past_key_values
* attention_mask prep
* +cache_position in GPT2DoubleHeadsModel
* undo kwargs in gradient checkpointing
* moved up `if self.gradient_checkpointing`
* consistency in decision_transformer
* pastkv, cache_pos in grad_checkpt args
* rm _reorder_cache
* output_attentions streamlined
* decision_transformer consistency
* return_legacy_cache improved
* ClvpForCausalLM used for legacy cache test now
* is_causal fixed
* attn_output cleanup
* consistency @ decision_transformer
* Updated deprecation notice version to 4.52
* upd deprecation
* consistent legacy cache code in decision transformers
* next_cache -> past_kv in decision_tr
* cache support flags in decision_transf
* rm legacy cache warning
* consistency in cache init for decision transf
* no Static Cache for Decision Transformer

---------

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>

..
bettertransformer	Use Python 3.9 syntax in tests (#37343 )	2025-04-08 14:12:08 +02:00
deepspeed	Use Python 3.9 syntax in tests (#37343 )	2025-04-08 14:12:08 +02:00
extended	Use Python 3.9 syntax in tests (#37343 )	2025-04-08 14:12:08 +02:00
fixtures	Implementation of SuperPoint and AutoModelForKeypointDetection (#28966 )	2024-03-19 14:43:02 +00:00
fsdp	Remove old code for PyTorch, Accelerator and tokenizers (#37234 )	2025-04-10 20:54:21 +02:00
generation	[tests, `qwen2_5_omni`] fix flaky tests (#37721 )	2025-04-23 17:54:12 +01:00
models	Update `MllamaForConditionalGenerationIntegrationTest` (#37750 )	2025-04-24 14:29:46 +02:00
optimization	Use Python 3.9 syntax in tests (#37343 )	2025-04-08 14:12:08 +02:00
peft_integration	enable several cases on XPU (#37516 )	2025-04-16 11:01:04 +02:00
pipelines	Process inputs directly in apply_chat_template in image-text-to-text pipeline (#35616 )	2025-04-23 13:31:33 -04:00
quantization	Fixing quantization tests (#37650 )	2025-04-22 13:59:57 +02:00
repo_utils	Simplify soft dependencies and update the dummy-creation process (#36827 )	2025-04-11 11:08:36 +02:00
sagemaker	Use Python 3.9 syntax in tests (#37343 )	2025-04-08 14:12:08 +02:00
tensor_parallel	enable tp on CPU (#36299 )	2025-03-31 10:55:47 +02:00
tokenization	Use Python 3.9 syntax in tests (#37343 )	2025-04-08 14:12:08 +02:00
trainer	enable 4 test_trainer cases on XPU (#37645 )	2025-04-23 21:29:42 +02:00
utils	`GPT2Model` StaticCache support (#35761 )	2025-04-24 14:46:35 +02:00
__init__.py
test_backbone_common.py	Use Python 3.9 syntax in tests (#37343 )	2025-04-08 14:12:08 +02:00
test_configuration_common.py	Update composition flag usage (#36263 )	2025-04-09 11:48:49 +02:00
test_feature_extraction_common.py	Use Python 3.9 syntax in tests (#37343 )	2025-04-08 14:12:08 +02:00
test_image_processing_common.py	Bridgetower fast image processor (#37373 )	2025-04-16 22:39:18 +02:00
test_image_transforms.py	Use Python 3.9 syntax in tests (#37343 )	2025-04-08 14:12:08 +02:00
test_modeling_common.py	Small fix on context manager detection (#37562 )	2025-04-17 15:39:44 +02:00
test_modeling_flax_common.py	Use Python 3.9 syntax in tests (#37343 )	2025-04-08 14:12:08 +02:00
test_modeling_tf_common.py	Use Python 3.9 syntax in tests (#37343 )	2025-04-08 14:12:08 +02:00
test_pipeline_mixin.py	Use Python 3.9 syntax in tests (#37343 )	2025-04-08 14:12:08 +02:00
test_processing_common.py	Add Qwen2.5-Omni (#36752 )	2025-04-14 12:36:41 +02:00
test_sequence_feature_extraction_common.py	Use Python 3.9 syntax in tests (#37343 )	2025-04-08 14:12:08 +02:00
test_tokenization_common.py	🚨 🚨 Allow saving and loading multiple "raw" chat template files (#36588 )	2025-04-11 16:37:23 +01:00
test_training_args.py	Fix `TrainingArguments.torch_empty_cache_steps` post_init check (#36734 )	2025-03-17 16:09:46 +01:00