
# Optimization

The `.optimization` module provides:

- an optimizer with fixed weight decay that can be used to fine-tune models,
- several learning rate schedules in the form of schedule objects that inherit from `_LRSchedule`, and
- a gradient accumulation class to accumulate the gradients of multiple batches.
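A typical fine-tuning setup pairs a PyTorch optimizer with one of these schedules. The following is a minimal sketch, using `torch.optim.AdamW` as a stand-in for any PyTorch optimizer, with a toy model and step counts chosen only for illustration:

```python
import torch
from transformers import get_linear_schedule_with_warmup

# Hypothetical model and training length, for illustration only.
model = torch.nn.Linear(10, 2)
num_training_steps = 1000

# Any PyTorch optimizer works; AdamW applies decoupled weight decay.
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5, weight_decay=0.01)

# Linear warmup over the first 100 steps, then linear decay to 0.
lr_scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=100, num_training_steps=num_training_steps
)

for step in range(num_training_steps):
    # ... forward pass and loss.backward() would go here ...
    optimizer.step()
    lr_scheduler.step()  # advance the schedule once per optimizer step
    optimizer.zero_grad()
```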

## AdamW (PyTorch)

[[autodoc]] AdamW

## AdaFactor (PyTorch)

[[autodoc]] Adafactor
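As a rough illustration, `Adafactor` is often driven by an external, fixed learning rate rather than its internal relative step sizing. The keyword arguments below mirror common usage and should be checked against the class reference above:

```python
import torch
from transformers import Adafactor

model = torch.nn.Linear(10, 2)  # hypothetical model, for illustration only

# One common configuration: disable Adafactor's internal step-size heuristics
# and supply a fixed external learning rate instead.
optimizer = Adafactor(
    model.parameters(),
    lr=1e-3,
    scale_parameter=False,
    relative_step=False,
    warmup_init=False,
)
```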

## AdamWeightDecay (TensorFlow)

[[autodoc]] AdamWeightDecay

[[autodoc]] create_optimizer
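On the TensorFlow side, `create_optimizer` bundles an `AdamWeightDecay` optimizer with a warmup-then-decay learning rate schedule and returns both. The argument names below are a sketch and should be verified against the reference above:

```python
from transformers import create_optimizer

# Returns an AdamWeightDecay optimizer plus the learning rate schedule driving it.
optimizer, lr_schedule = create_optimizer(
    init_lr=5e-5,
    num_train_steps=10_000,
    num_warmup_steps=500,
    weight_decay_rate=0.01,
)

# The optimizer can then be passed to `model.compile(optimizer=optimizer, ...)`.
```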

## Schedules

### Learning Rate Schedules (PyTorch)

[[autodoc]] SchedulerType

[[autodoc]] get_scheduler
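`get_scheduler` is a convenience wrapper that builds any of the schedules below from a name or `SchedulerType` member. A minimal sketch, again with a hypothetical model:

```python
import torch
from transformers import get_scheduler

model = torch.nn.Linear(10, 2)  # hypothetical model, for illustration only
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# "linear" selects the linear warmup/decay schedule; other names such as
# "cosine" or "constant_with_warmup" map to the schedules documented below.
lr_scheduler = get_scheduler(
    "linear",
    optimizer=optimizer,
    num_warmup_steps=100,
    num_training_steps=1000,
)
```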

[[autodoc]] get_constant_schedule

[[autodoc]] get_constant_schedule_with_warmup

[[autodoc]] get_cosine_schedule_with_warmup

[[autodoc]] get_cosine_with_hard_restarts_schedule_with_warmup

[[autodoc]] get_linear_schedule_with_warmup

[[autodoc]] get_polynomial_decay_schedule_with_warmup

[[autodoc]] get_inverse_sqrt_schedule

[[autodoc]] get_wsd_schedule
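The warmup-stable-decay (WSD) schedule splits training into a warmup phase, a stable phase at the peak learning rate, and a final decay phase. A minimal sketch, assuming the phase lengths are passed as `num_warmup_steps`, `num_stable_steps`, and `num_decay_steps` (check the reference above for the exact signature):

```python
import torch
from transformers import get_wsd_schedule

model = torch.nn.Linear(10, 2)  # hypothetical model, for illustration only
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# Assumed phase arguments: warm up for 100 steps, hold the peak rate for 800,
# then decay over the final 100 steps.
lr_scheduler = get_wsd_schedule(
    optimizer,
    num_warmup_steps=100,
    num_stable_steps=800,
    num_decay_steps=100,
)
```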

### Warmup (TensorFlow)

[[autodoc]] WarmUp
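`WarmUp` is a Keras `LearningRateSchedule` that ramps the learning rate up linearly and then hands off to another decay schedule. A sketch, assuming the constructor takes an initial rate, a decay schedule function, and a number of warmup steps (verify against the reference above):

```python
import tensorflow as tf
from transformers import WarmUp

# Decay schedule applied after warmup finishes (here: linear decay to zero).
decay = tf.keras.optimizers.schedules.PolynomialDecay(
    initial_learning_rate=5e-5, decay_steps=9_500, end_learning_rate=0.0
)

# Assumed constructor arguments; see the reference above for the exact names.
lr_schedule = WarmUp(
    initial_learning_rate=5e-5,
    decay_schedule_fn=decay,
    warmup_steps=500,
)
```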

## Gradient Strategies

### GradientAccumulator (TensorFlow)

[[autodoc]] GradientAccumulator
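As a rough sketch of how the accumulator fits into a custom TensorFlow training loop (the attribute and method names below are assumptions to be checked against the reference above):

```python
import tensorflow as tf
from transformers import GradientAccumulator

accumulator = GradientAccumulator()

def accumulate(model, loss_fn, batch, labels):
    with tf.GradientTape() as tape:
        loss = loss_fn(labels, model(batch))
    # Calling the accumulator adds this batch's gradients to the running sum.
    accumulator(tape.gradient(loss, model.trainable_variables))

def apply(model, optimizer):
    # After several accumulate() calls, apply the summed gradients once...
    optimizer.apply_gradients(zip(accumulator.gradients, model.trainable_variables))
    # ...and reset the buffers before the next accumulation cycle.
    accumulator.reset()
```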