transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-06 14:20:04 +06:00

History

Younes Belkada f6261d7d81 FEAT / Optim: Add GaLore optimizer (#29588 ) * add galore v1 * add import * add tests and doc * fix doctest * forward contrib credits from discussions * forward contrib credits from discussions * Apply suggestions from code review Co-authored-by: Zach Mueller <muellerzr@gmail.com> * fix failing tests' * switch to `optim_target_modules` and clarify docs * more clarification * enhance lookup logic * update a test to add peak memory * add regex, all-linear and single string support * add layer-wise optimization through DummyOptimizers and LRSchedulers * forward contrib credits from discussions and original idea * add a section about DDP not supported in layerwise * Update src/transformers/trainer.py Co-authored-by: Zach Mueller <muellerzr@gmail.com> * fix self * check only if layer_wise * Update src/transformers/training_args.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * oops * make use of intervals * clarify comment * add matching tests * GaLoRe -> GaLore * move to `get_scheduler` * add note on docs * add a warning * adapt a bit the docs * update docstring * support original API * Update docs/source/en/trainer.md * slightly refactor * Update docs/source/en/trainer.md Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com> * Update src/transformers/training_args.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix args parsing and add tests * remove warning for regex * fix type hint * add note about extra args * make `is_regex` return optional --------- Co-authored-by: Maxime <maximegmd @users.noreply.github.com> Co-authored-by: Wing Lian <winglian @users.noreply.github.com> Co-authored-by: Zach Mueller <muellerzr@gmail.com> Co-authored-by: hiyouga <hiyouga@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>		2024-03-19 11:40:23 +01:00
..
de	Make torch xla available on GPU (#29334 )	2024-03-11 14:07:16 +00:00
en	FEAT / Optim: Add GaLore optimizer (#29588 )	2024-03-19 11:40:23 +01:00
es	[docs] Spanish translation of attention.md (#29681 )	2024-03-15 11:55:35 -07:00
fr	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
hi	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
it	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
ja	[docs] Remove broken ChatML format link from chat_templating.md (#29643 )	2024-03-13 13:04:51 -07:00
ko	Make torch xla available on GPU (#29334 )	2024-03-11 14:07:16 +00:00
ms	[Docs] Add missing language options and fix broken links (#28852 )	2024-02-06 12:01:01 -08:00
pt	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
te	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
tr	Translate index.md to Turkish (#27093 )	2023-11-08 08:35:20 -05:00
zh	[docs] Remove broken ChatML format link from chat_templating.md (#29643 )	2024-03-13 13:04:51 -07:00
_config.py	[`Styling`] stylify using ruff (#27144 )	2023-11-16 17:43:19 +01:00