transformers/tests
Jingze Shi 48a309d0d2
Support constant lr with cooldown (#35453)
* Add support for constant learning rate with cooldown

* Add more warmup and cooldown methods to `get_wsd_schedule`

* Add more warmup and decay methods to 'get_wsd_schedule'

* support num_training_steps and num_stable_steps for get_wsd_schedule

* get wsd scheduler before the `num_training_steps` decision

* fix code_quality

* Update stable branch logic

* fix code_quality

* Move the stable stage decision to `get_wsd_schedule`

* Update docstring of `get_wsd_schedule`

* Update `num_train_steps` to optional

* Update docstring of `get_wsd_schedule`

* Update src/transformers/optimization.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-10 13:21:55 +01:00
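For context on #35453: the PR extends `get_wsd_schedule` in `src/transformers/optimization.py` with support for a constant learning rate with cooldown, additional warmup/decay methods, and optional `num_training_steps`/`num_stable_steps`. The snippet below is a minimal, self-contained sketch of the warmup-stable-decay idea built on `torch.optim.lr_scheduler.LambdaLR`; the helper `wsd_lambda` and its parameter names are illustrative assumptions, not the actual `get_wsd_schedule` signature.

```python
import torch
from torch.optim.lr_scheduler import LambdaLR


def wsd_lambda(num_warmup_steps, num_stable_steps, num_decay_steps, min_lr_ratio=0.0):
    """Illustrative LambdaLR multiplier: linear warmup -> constant plateau -> linear cooldown."""

    def lr_lambda(step):
        if step < num_warmup_steps:
            # linear warmup from 0 up to the base lr
            return float(step) / max(1, num_warmup_steps)
        if step < num_warmup_steps + num_stable_steps:
            # stable phase: hold the base lr constant
            return 1.0
        # cooldown phase: linear decay toward min_lr_ratio * base lr
        decay_step = step - num_warmup_steps - num_stable_steps
        progress = min(1.0, decay_step / max(1, num_decay_steps))
        return max(min_lr_ratio, 1.0 - (1.0 - min_lr_ratio) * progress)

    return lr_lambda


model = torch.nn.Linear(8, 8)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
# e.g. 100 warmup steps, 800 constant steps, 100 cooldown steps
scheduler = LambdaLR(optimizer, lr_lambda=wsd_lambda(100, 800, 100))

for _ in range(1000):
    optimizer.step()
    scheduler.step()
```

A "constant lr with cooldown" schedule, as in the PR title, corresponds to the special case where the plateau dominates and only a short terminal cooldown (and optionally a short warmup) wraps it.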
agents use torch.testing.assert_close instead to get more details about errors in CIs (#35659) 2025-01-24 16:55:28 +01:00
bettertransformer use torch.testing.assert_close instead to get more details about errors in CIs (#35659) 2025-01-24 16:55:28 +01:00
deepspeed DeepSpeed github repo move sync (#36021) 2025-02-05 08:19:31 -08:00
extended [tests] skip tests for xpu (#33553) 2024-09-19 19:28:04 +01:00
fixtures
fsdp [tests] make cuda-only tests device-agnostic (#35607) 2025-01-13 14:48:39 +01:00
generation Fix StopStringCriteria to handle tokens above len(tokenizer) (#35797) 2025-02-06 16:53:28 +00:00
models Add Apple's Depth-Pro for depth estimation (#34583) 2025-02-10 11:32:45 +00:00
optimization Support constant lr with cooldown (#35453) 2025-02-10 13:21:55 +01:00
peft_integration use torch.testing.assert_close instead to get more details about errors in CIs (#35659) 2025-01-24 16:55:28 +01:00
pipelines Move audio top_k tests to the right file and add slow decorator (#36072) 2025-02-07 14:32:30 +00:00
quantization Fix word typos in ggml test. (#36060) 2025-02-06 15:32:40 +00:00
repo_utils Fix modular edge case + modular sorting order (#35562) 2025-01-09 17:17:52 +01:00
sagemaker Trainer - deprecate tokenizer for processing_class (#32385) 2024-10-02 14:08:46 +01:00
tokenization tokenizer train from iterator without pre_tokenizers (#35396) 2025-01-09 15:34:43 +01:00
tp Update-tp test (#35844) 2025-02-03 09:37:02 +01:00
trainer layernorm_decay_fix (#35927) 2025-02-04 11:01:49 +01:00
utils Nail in edge case of torch dtype being overridden permanently in the case of an error (#35845) 2025-02-06 09:05:23 -05:00
__init__.py
test_backbone_common.py
test_configuration_common.py Load sub-configs from composite configs (#34410) 2024-11-05 11:34:01 +01:00
test_feature_extraction_common.py
test_image_processing_common.py Refactoring of ImageProcessorFast (#35069) 2025-02-04 17:52:31 -05:00
test_image_transforms.py fix: center_crop occasionally outputs off-by-one dimension matrix (#30934) 2024-05-21 13:56:52 +01:00
test_modeling_common.py Fix model kwargs (#35875) 2025-02-06 11:35:25 -05:00
test_modeling_flax_common.py 🚨All attention refactor🚨 (#35235) 2024-12-18 16:53:39 +01:00
test_modeling_tf_common.py 🚨All attention refactor🚨 (#35235) 2024-12-18 16:53:39 +01:00
test_pipeline_mixin.py Add image text to text pipeline (#34170) 2024-10-31 15:48:11 -04:00
test_processing_common.py Chat template: update for processor (#35953) 2025-02-10 09:52:19 +01:00
test_sequence_feature_extraction_common.py
test_tokenization_common.py apply_chat_template: consistent behaviour for return_assistant_tokens_mask=True return_tensors=True (#35582) 2025-02-04 10:27:52 +01:00