transformers/tests
Jingze Shi 48a309d0d2
Support constant lr with cooldown (#35453)
* Add support for constant learning rate with cooldown

* Add more warmup and cooldown methods to `get_wsd_schedule`

* Add more warmup and decay methods to 'get_wsd_schedule'

* support num_training_steps and num_stable_steps for get_wsd_schedule

* get wsd scheduler before the `num_training_steps` decision

* fix code_quality

* Update stable branch logic

* fix code_quality

* Move the stable stage decision to `get_wsd_schedule`

* Update docstring of `get_wsd_schedule`

* Update `num_train_steps` to optional

* Update docstring of `get_wsd_schedule`

* Update src/transformers/optimization.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-10 13:21:55 +01:00
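For context on #35453: the PR extends `get_wsd_schedule` in `src/transformers/optimization.py` with support for a constant learning rate with cooldown, additional warmup/decay methods, and optional `num_training_steps`/`num_stable_steps`. The snippet below is a minimal, self-contained sketch of the warmup-stable-decay idea built on `torch.optim.lr_scheduler.LambdaLR`; the helper `wsd_lambda` and its parameter names are illustrative assumptions, not the actual `get_wsd_schedule` signature.

```python
import torch
from torch.optim.lr_scheduler import LambdaLR


def wsd_lambda(num_warmup_steps, num_stable_steps, num_decay_steps, min_lr_ratio=0.0):
    """Illustrative LambdaLR multiplier: linear warmup -> constant plateau -> linear cooldown."""

    def lr_lambda(step):
        if step < num_warmup_steps:
            # linear warmup from 0 up to the base lr
            return float(step) / max(1, num_warmup_steps)
        if step < num_warmup_steps + num_stable_steps:
            # stable phase: hold the base lr constant
            return 1.0
        # cooldown phase: linear decay toward min_lr_ratio * base lr
        decay_step = step - num_warmup_steps - num_stable_steps
        progress = min(1.0, decay_step / max(1, num_decay_steps))
        return max(min_lr_ratio, 1.0 - (1.0 - min_lr_ratio) * progress)

    return lr_lambda


model = torch.nn.Linear(8, 8)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
# e.g. 100 warmup steps, 800 constant steps, 100 cooldown steps
scheduler = LambdaLR(optimizer, lr_lambda=wsd_lambda(100, 800, 100))

for _ in range(1000):
    optimizer.step()
    scheduler.step()
```

A "constant lr with cooldown" schedule, as in the PR title, corresponds to the special case where the plateau dominates and only a short terminal cooldown (and optionally a short warmup) wraps it.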
agents use torch.testing.assert_close instead to get more details about errors in CIs (#35659) 2025-01-24 16:55:28 +01:00
bettertransformer use torch.testing.assert_close instead to get more details about errors in CIs (#35659) 2025-01-24 16:55:28 +01:00
deepspeed DeepSpeed github repo move sync (#36021) 2025-02-05 08:19:31 -08:00
extended [tests] skip tests for xpu (#33553) 2024-09-19 19:28:04 +01:00
fixtures
fsdp [tests] make cuda-only tests device-agnostic (#35607) 2025-01-13 14:48:39 +01:00
generation Fix StopStringCriteria to handle tokens above len(tokenizer) (#35797) 2025-02-06 16:53:28 +00:00
models Add Apple's Depth-Pro for depth estimation (#34583) 2025-02-10 11:32:45 +00:00
optimization Support constant lr with cooldown (#35453) 2025-02-10 13:21:55 +01:00
peft_integration use torch.testing.assert_close instead to get more details about errors in CIs (#35659) 2025-01-24 16:55:28 +01:00
pipelines Move audio top_k tests to the right file and add slow decorator (#36072) 2025-02-07 14:32:30 +00:00
quantization Fix word typos in ggml test. (#36060) 2025-02-06 15:32:40 +00:00
repo_utils Fix modular edge case + modular sorting order (#35562) 2025-01-09 17:17:52 +01:00
sagemaker Trainer - deprecate tokenizer for processing_class (#32385) 2024-10-02 14:08:46 +01:00
tokenization tokenizer train from iterator without pre_tokenizers (#35396) 2025-01-09 15:34:43 +01:00
tp Update-tp test (#35844) 2025-02-03 09:37:02 +01:00
trainer layernorm_decay_fix (#35927) 2025-02-04 11:01:49 +01:00
utils Nail in edge case of torch dtype being overridden permanently in the case of an error (#35845) 2025-02-06 09:05:23 -05:00
__init__.py
test_backbone_common.py
test_configuration_common.py Load sub-configs from composite configs (#34410) 2024-11-05 11:34:01 +01:00
test_feature_extraction_common.py
test_image_processing_common.py Refactoring of ImageProcessorFast (#35069) 2025-02-04 17:52:31 -05:00
test_image_transforms.py fix: center_crop occasionally outputs off-by-one dimension matrix (#30934) 2024-05-21 13:56:52 +01:00
test_modeling_common.py Fix model kwargs (#35875) 2025-02-06 11:35:25 -05:00
test_modeling_flax_common.py 🚨All attention refactor🚨 (#35235) 2024-12-18 16:53:39 +01:00
test_modeling_tf_common.py 🚨All attention refactor🚨 (#35235) 2024-12-18 16:53:39 +01:00
test_pipeline_mixin.py Add image text to text pipeline (#34170) 2024-10-31 15:48:11 -04:00
test_processing_common.py Chat template: update for processor (#35953) 2025-02-10 09:52:19 +01:00
test_sequence_feature_extraction_common.py
test_tokenization_common.py apply_chat_template: consistent behaviour for return_assistant_tokens_mask=True return_tensors=True (#35582) 2025-02-04 10:27:52 +01:00