transformers/tests
Hamza Benchekroun 797860c68c
feat: add flexible Liger Kernel configuration to TrainingArguments (#38911)
* feat: add flexible Liger Kernel configuration to TrainingArguments

Add support for granular Liger Kernel configuration through a new
`liger_kernel_config` parameter in TrainingArguments. This allows users
to selectively enable/disable specific kernels (rope, swiglu, cross_entropy,
etc.) instead of the current approach, which relies on the default configuration.

Features:
- Add `liger_kernel_config` dict parameter to TrainingArguments
- Support selective kernel application for all supported models
- Maintain full backward compatibility with existing `use_liger_kernel` flag
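
As a rough illustration of the behaviour listed above, the sketch below shows one way a boolean flag and an optional dict of kernel switches could be resolved into the set of kernels to apply. It is not the code added by this commit: `resolve_kernels` and `DEFAULT_KERNELS` are hypothetical names, the default values are placeholders, and rejecting unknown keys with `ValueError` is an assumption made for the example.

```python
from typing import Optional

# Hypothetical defaults for illustration only; the real defaults are
# decided by Liger Kernel for each supported model.
DEFAULT_KERNELS = {
    "rope": True,
    "swiglu": True,
    "cross_entropy": False,
    "fused_linear_cross_entropy": True,
}


def resolve_kernels(use_liger_kernel: bool,
                    liger_kernel_config: Optional[dict] = None) -> dict:
    """Return the kernel flags that would be applied (hypothetical helper)."""
    if not use_liger_kernel:
        # Flag off: nothing is patched, regardless of the config dict.
        return {}
    if liger_kernel_config is None:
        # Backward-compatible path: no config means the defaults are used.
        return dict(DEFAULT_KERNELS)
    unknown = set(liger_kernel_config) - set(DEFAULT_KERNELS)
    if unknown:
        # Assumption: unknown kernel names are reported rather than ignored.
        raise ValueError(f"Unknown kernel option(s): {sorted(unknown)}")
    # User-supplied flags override the defaults key by key.
    return {**DEFAULT_KERNELS, **liger_kernel_config}


if __name__ == "__main__":
    # Disable only rope; every other kernel keeps its default setting.
    print(resolve_kernels(True, {"rope": False}))
```

Merging over the defaults means a partial dict only changes the kernels it names, which is what keeps the plain `use_liger_kernel=True` case working unchanged.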

Example usage:
```python
TrainingArguments(
    use_liger_kernel=True,
    liger_kernel_config={
        "rope": True,
        "swiglu": True,
        "cross_entropy": False,
        "fused_linear_cross_entropy": True
    }
)
```

Closes #38905

* Address comments and update Liger section in Trainer docs
2025-06-19 15:54:08 +00:00
bettertransformer Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
deepspeed 🚨 rm already deprecated pad_to_max_length arg (#37617) 2025-05-01 15:21:55 +02:00
extended Add Optional to remaining types (#37808) 2025-04-28 14:20:45 +01:00
fixtures Implementation of SuperPoint and AutoModelForKeypointDetection (#28966) 2024-03-19 14:43:02 +00:00
fsdp Fix the fsdp config cannot work issue. (#37549) 2025-04-28 10:44:51 +02:00
generation enable misc test cases on XPU (#38852) 2025-06-18 09:20:49 +02:00
models Fix FalconMambaIntegrationTests (#38566) 2025-06-19 13:50:33 +02:00
optimization Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
peft_integration FIX: Faulty PEFT tests (#37757) 2025-04-28 15:10:46 +02:00
pipelines [BugFix] QA pipeline edge case: align_to_words=True in QuestionAnsweringPipeline can lead to duplicate answers (#38761) 2025-06-16 15:01:22 +00:00
quantization Fix HQQ model param device transfer issue (#38466) 2025-06-18 15:09:00 +02:00
repo_utils Use HF papers (#38184) 2025-06-13 11:07:09 +00:00
sagemaker Deprecate TF + JAX (#38758) 2025-06-11 17:28:06 +01:00
tensor_parallel [TP] Change command in tests to python3 (#38555) 2025-06-03 11:03:33 +00:00
tokenization Remove isort from dependencies (#38616) 2025-06-05 16:42:49 +00:00
trainer feat: add flexible Liger Kernel configuration to TrainingArguments (#38911) 2025-06-19 15:54:08 +00:00
utils More PYUP fixes (#38883) 2025-06-18 14:38:08 +01:00
__init__.py GPU text generation: mMoved the encoded_prompt to correct device 2020-01-06 15:11:12 +01:00
causal_lm_tester.py Refactor DBRX tests to use CausalLMModelTest base classes (#38475) 2025-06-13 16:22:12 +01:00
test_backbone_common.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_configuration_common.py Update composition flag usage (#36263) 2025-04-09 11:48:49 +02:00
test_feature_extraction_common.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_image_processing_common.py enable more test cases on xpu (#38572) 2025-06-06 09:29:51 +02:00
test_image_transforms.py Fix pad image transform for batched inputs (#37544) 2025-05-08 10:51:15 +01:00
test_modeling_common.py Skip sdpa tests if submodule does not support sdpa (#38907) 2025-06-19 13:11:01 +00:00
test_pipeline_mixin.py No more Tuple, List, Dict (#38797) 2025-06-17 19:37:18 +01:00
test_processing_common.py [video processors] support frame sampling within processors (#38105) 2025-06-12 09:34:30 +00:00
test_sequence_feature_extraction_common.py No more Tuple, List, Dict (#38797) 2025-06-17 19:37:18 +01:00
test_tokenization_common.py 🚨 rm already deprecated pad_to_max_length arg (#37617) 2025-05-01 15:21:55 +02:00
test_training_args.py Fix TrainingArguments.torch_empty_cache_steps post_init check (#36734) 2025-03-17 16:09:46 +01:00
test_video_processing_common.py [video processors] support frame sampling within processors (#38105) 2025-06-12 09:34:30 +00:00