transformers/docs/source
Younes Belkada f6261d7d81
FEAT / Optim: Add GaLore optimizer (#29588)
* add galore v1

* add import

* add tests and doc

* fix doctest

* forward contrib credits from discussions

* forward contrib credits from discussions

* Apply suggestions from code review

Co-authored-by: Zach Mueller <muellerzr@gmail.com>

* fix failing tests

* switch to `optim_target_modules` and clarify docs

* more clarification

* enhance lookup logic

* update a test to add peak memory

* add regex, all-linear and single string support

* add layer-wise optimization through DummyOptimizers and LRSchedulers

* forward contrib credits from discussions and original idea

* add a section about DDP not supported in layerwise

* Update src/transformers/trainer.py

Co-authored-by: Zach Mueller <muellerzr@gmail.com>

* fix self

* check only if layer_wise

* Update src/transformers/training_args.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* oops

* make use of intervals

* clarify comment

* add matching tests

* GaLoRe -> GaLore

* move to `get_scheduler`

* add note on docs

* add a warning

* adapt the docs a bit

* update docstring

* support original API

* Update docs/source/en/trainer.md

* slightly refactor

* Update docs/source/en/trainer.md

Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>

* Update src/transformers/training_args.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* fix args parsing and add tests

* remove warning for regex

* fix type hint

* add note about extra args

* make `is_regex` return optional

---------

Co-authored-by: Maxime <maximegmd@users.noreply.github.com>
Co-authored-by: Wing Lian <winglian@users.noreply.github.com>
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
Co-authored-by: hiyouga <hiyouga@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>
2024-03-19 11:40:23 +01:00
de Make torch xla available on GPU (#29334) 2024-03-11 14:07:16 +00:00
en FEAT / Optim: Add GaLore optimizer (#29588) 2024-03-19 11:40:23 +01:00
es [docs] Spanish translation of attention.md (#29681) 2024-03-15 11:55:35 -07:00
fr Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
hi Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
it Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
ja [docs] Remove broken ChatML format link from chat_templating.md (#29643) 2024-03-13 13:04:51 -07:00
ko Make torch xla available on GPU (#29334) 2024-03-11 14:07:16 +00:00
ms [Docs] Add missing language options and fix broken links (#28852) 2024-02-06 12:01:01 -08:00
pt Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
te Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
tr Translate index.md to Turkish (#27093) 2023-11-08 08:35:20 -05:00
zh [docs] Remove broken ChatML format link from chat_templating.md (#29643) 2024-03-13 13:04:51 -07:00
_config.py [Styling] stylify using ruff (#27144) 2023-11-16 17:43:19 +01:00