transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-03 12:50:06 +06:00

Peter St. John bab40c6838 [core] support tensor-valued _extra_state values in `from_pretrained` (#38155)	2025-05-28 15:38:42 +02:00

Support tensor-valued _extra_state values

TransformerEngine uses the pytorch get/set_extra_state API to store FP8 layer config information as bytes Tensor in the _extra_state entry in the state dict. With recent changes to from_pretrained, this functionality has broken and loading a model that uses this API doesn't appear to work. This PR fixes the save/load pretrained functions for extra state entries that use a pytorch tensor, and adds a (currently x-failing) test for a dictionary extra state.

Signed-off-by: Peter St. John <pstjohn@nvidia.com>
..
bettertransformer	Use Python 3.9 syntax in tests (#37343)	2025-04-08 14:12:08 +02:00
deepspeed	🚨 rm already deprecated pad_to_max_length arg (#37617)	2025-05-01 15:21:55 +02:00
extended	Add Optional to remaining types (#37808)	2025-04-28 14:20:45 +01:00
fixtures	Implementation of SuperPoint and AutoModelForKeypointDetection (#28966)	2024-03-19 14:43:02 +00:00
fsdp	Fix the fsdp config cannot work issue. (#37549)	2025-04-28 10:44:51 +02:00
generation	🚨Early-error🚨 config will error out if `output_attentions=True` and the attn implementation is wrong (#38288)	2025-05-23 17:17:38 +02:00
models	🔴[`Attention`] Attention refactor for Whisper-based models (#38235)	2025-05-28 13:32:38 +02:00
optimization	Use Python 3.9 syntax in tests (#37343)	2025-04-08 14:12:08 +02:00
peft_integration	FIX: Faulty PEFT tests (#37757)	2025-04-28 15:10:46 +02:00
pipelines	🚨Early-error🚨 config will error out if `output_attentions=True` and the attn implementation is wrong (#38288)	2025-05-23 17:17:38 +02:00
quantization	enable large_gpu and torchao cases on XPU (#38355)	2025-05-28 10:30:16 +02:00
repo_utils	Simplify soft dependencies and update the dummy-creation process (#36827)	2025-04-11 11:08:36 +02:00
sagemaker	Use Python 3.9 syntax in tests (#37343)	2025-04-08 14:12:08 +02:00
tensor_parallel	enable misc cases on XPU & use device agnostic APIs for cases in tests (#38192)	2025-05-20 10:09:01 +02:00
tokenization	Use Python 3.9 syntax in tests (#37343)	2025-04-08 14:12:08 +02:00
trainer	switch to device agnostic device calling for test cases (#38247)	2025-05-26 10:18:53 +02:00
utils	[core] support tensor-valued _extra_state values in `from_pretrained` (#38155)	2025-05-28 15:38:42 +02:00
__init__.py	GPU text generation: mMoved the encoded_prompt to correct device	2020-01-06 15:11:12 +01:00
causal_lm_tester.py	🚨 🚨 Inherited CausalLM Tests (#37590)	2025-05-23 18:29:31 +01:00
test_backbone_common.py	Use Python 3.9 syntax in tests (#37343)	2025-04-08 14:12:08 +02:00
test_configuration_common.py	Update composition flag usage (#36263)	2025-04-09 11:48:49 +02:00
test_feature_extraction_common.py	Use Python 3.9 syntax in tests (#37343)	2025-04-08 14:12:08 +02:00
test_image_processing_common.py	fix multi-image case for llava-onevision (#38084)	2025-05-21 11:50:46 +02:00
test_image_transforms.py	Fix `pad` image transform for batched inputs (#37544)	2025-05-08 10:51:15 +01:00
test_modeling_common.py	🔴[`Attention`] Attention refactor for Whisper-based models (#38235)	2025-05-28 13:32:38 +02:00
test_modeling_flax_common.py	Use Python 3.9 syntax in tests (#37343)	2025-04-08 14:12:08 +02:00
test_modeling_tf_common.py	Use Python 3.9 syntax in tests (#37343)	2025-04-08 14:12:08 +02:00
test_pipeline_mixin.py	Use Python 3.9 syntax in tests (#37343)	2025-04-08 14:12:08 +02:00
test_processing_common.py	🔴 Video processors as a separate class (#35206)	2025-05-12 11:55:51 +02:00
test_sequence_feature_extraction_common.py	Use Python 3.9 syntax in tests (#37343)	2025-04-08 14:12:08 +02:00
test_tokenization_common.py	🚨 rm already deprecated pad_to_max_length arg (#37617)	2025-05-01 15:21:55 +02:00
test_training_args.py	Fix `TrainingArguments.torch_empty_cache_steps` post_init check (#36734)	2025-03-17 16:09:46 +01:00
test_video_processing_common.py	🔴 Video processors as a separate class (#35206)	2025-05-12 11:55:51 +02:00