transformers/tests
Mario Michael Krell bde41d69b4
Correctly drop tokens in SwitchTransformer (#37123)
Previously, the identity function was used for dropped tokens
with a weight from the expert that was not applied to the hidden states.
This was misleading, because dropping a token means its expert weight is zero.
Instead of trying to fix the weight, we take an easier approach by initializing with zeros.

Fixes issue https://github.com/huggingface/transformers/issues/37017
2025-04-10 16:58:57 +02:00
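
The behaviour described in the commit message above can be illustrated with a toy top-1 mixture-of-experts layer. This is only a sketch in plain Python, not the code from the transformers repository: the function name `sparse_mlp`, its parameters, the per-expert `capacity` rule, and the two toy experts are all hypothetical. It shows the one idea the message states: the output starts as zeros, so a dropped token receives no contribution from any expert.

```python
def sparse_mlp(hidden_states, expert_index, router_probs, experts, capacity):
    """Toy top-1 mixture-of-experts layer with token dropping (hypothetical).

    hidden_states: list of token vectors (lists of floats)
    expert_index:  id of the expert chosen for each token
    router_probs:  router probability of the chosen expert, per token
    experts:       list of callables mapping a vector to a vector
    capacity:      assumed limit on tokens per expert; extra tokens are dropped
    """
    # Initialize with zeros instead of copying the input, so a dropped
    # token is not silently passed through as an identity mapping.
    next_states = [[0.0] * len(h) for h in hidden_states]
    load = [0] * len(experts)
    for t, (h, e, p) in enumerate(zip(hidden_states, expert_index, router_probs)):
        if load[e] >= capacity:
            continue  # dropped token: its row stays all zeros
        load[e] += 1
        # Routed token: expert output scaled by the router probability.
        next_states[t] = [p * v for v in experts[e](h)]
    return next_states


# Two toy experts: one doubles the input, the other adds one.
experts = [lambda h: [2.0 * v for v in h], lambda h: [v + 1.0 for v in h]]
hidden = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
out = sparse_mlp(
    hidden,
    expert_index=[0, 0, 1],
    router_probs=[0.5, 0.5, 1.0],
    experts=experts,
    capacity=1,
)
print(out)  # [[1.0, 2.0], [0.0, 0.0], [6.0, 7.0]]
```

In this example the second token is routed to expert 0 after that expert is already full, so its output row is zero. With an identity initialization the same row would have been `[3.0, 4.0]`, which is the misleading behaviour the commit message describes.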
agents Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
bettertransformer Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
deepspeed Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
extended Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
fixtures Implementation of SuperPoint and AutoModelForKeypointDetection (#28966) 2024-03-19 14:43:02 +00:00
fsdp byebye torch 2.0 (#37277) 2025-04-07 15:19:47 +02:00
generation Update composition flag usage (#36263) 2025-04-09 11:48:49 +02:00
models Correctly drop tokens in SwitchTransformer (#37123) 2025-04-10 16:58:57 +02:00
optimization Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
peft_integration Fix warning message for PEFT models in text-generation pipeline #36783 (#36887) 2025-04-09 15:36:52 +01:00
pipelines Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
quantization Quark Quantization gated repo (#37412) 2025-04-10 14:57:15 +02:00
repo_utils Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
sagemaker Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
tensor_parallel enable tp on CPU (#36299) 2025-03-31 10:55:47 +02:00
tokenization Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
trainer Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
utils [core] remove GenerationMixin inheritance by default in PreTrainedModel (#37173) 2025-04-08 16:42:05 +01:00
__init__.py
test_backbone_common.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_configuration_common.py Update composition flag usage (#36263) 2025-04-09 11:48:49 +02:00
test_feature_extraction_common.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_image_processing_common.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_image_transforms.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_modeling_common.py Allow rocm systems to run these tests (#37278) 2025-04-10 13:33:01 +02:00
test_modeling_flax_common.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_modeling_tf_common.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_pipeline_mixin.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_processing_common.py [chat-template] Unify tests and clean up 🧼 (#37275) 2025-04-10 14:42:32 +02:00
test_sequence_feature_extraction_common.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_tokenization_common.py 🚨 🚨 Setup -> setupclass conversion (#37282) 2025-04-08 17:15:37 +01:00
test_training_args.py Fix TrainingArguments.torch_empty_cache_steps post_init check (#36734) 2025-03-17 16:09:46 +01:00