Commit Graph

15 Commits

Author SHA1 Message Date
Steven Liu
01c081d138
[docs] Trainer docs (#28145)
* fsdp, debugging, gpu selection

* fix hfoption

* fix
2023-12-20 10:37:23 -08:00
Steven Liu
ebfdb9ca62
[docs] MPS (#28016)
* mps docs

* toctree
2023-12-15 13:17:29 -08:00
Steven Liu
0d63d17765
[docs] Trainer (#27986)
* first draft

* add to toctree

* edits

* feedback
2023-12-15 12:06:55 -08:00
Peter Pan
ce31508134
docs: replace torch.distributed.run by torchrun (#27528)
* docs: replace torch.distributed.run by torchrun

 `transformers` now officially support pytorch >= 1.10.
 The entrypoint `torchrun`` is present from 1.10 onwards.

Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>

* Update src/transformers/trainer.py

with @ArthurZucker's suggestion

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-11-27 16:26:33 +00:00
fxmarty
c13a43aaf2
Reflect RoCm support in the documentation (#27636)
* reflect RoCm support in the documentation

* Update docs/source/en/main_classes/trainer.md

Co-authored-by: Lysandre Debut <hi@lysand.re>

* fix review comments

* use ROCm instead of RoCm

---------

Co-authored-by: Lysandre Debut <hi@lysand.re>
2023-11-25 00:59:17 +09:00
Sourab Mangrulkar
a761d6e9a0
Refactoring Trainer, adds save_only_model arg and simplifying FSDP integration (#27652)
* add code changes

1. Refactor FSDP
2. Add `--save_only_model` option: When checkpointing, whether to only save the model, or also the optimizer, scheduler & rng state.
3. Bump up the minimum `accelerate` version to `0.21.0`

* quality

* fix quality?

* Revert "fix quality?"

This reverts commit 149330a6ab.

* fix fsdp doc strings

* fix quality

* Update src/transformers/training_args.py

Co-authored-by: Zach Mueller <muellerzr@gmail.com>

* please fix the quality issue 😅

* Apply suggestions from code review

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

* address comment

* simplify conditional check as per the comment

* update documentation

---------

Co-authored-by: Zach Mueller <muellerzr@gmail.com>
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
2023-11-24 11:40:52 +05:30
Peter Pan
e4280d650c
docs: fix 404 link (#27529)
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
2023-11-20 12:24:38 +00:00
Younes Belkada
309a90664f
[FEAT] Add Neftune into transformers Trainer (#27141)
* add v1 neftune

* use `unwrap_model` instead

* add test + docs

* Apply suggestions from code review

Co-authored-by: Zach Mueller <muellerzr@gmail.com>

* more details

* fixup

* Update docs/source/en/main_classes/trainer.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* refactor a bit

* more elaborated test

* fix unwrap issue

---------

Co-authored-by: Zach Mueller <muellerzr@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-10-31 16:03:59 +01:00
Rockerz
84724efd10
Translating en/main_classes folder docs to Japanese 🇯🇵 (#26894)
* add

* add

* add

* Add deepspeed.md

* Add

* add

* Update docs/source/ja/main_classes/callback.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/main_classes/output.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/main_classes/pipelines.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/main_classes/processors.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/main_classes/processors.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/main_classes/text_generation.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/main_classes/processors.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update  logging.md

* Update toctree.yml

* Update docs/source/ja/main_classes/deepspeed.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Add suggesitons

* m

* Update docs/source/ja/main_classes/trainer.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update toctree.yml

* Update Quantization.md

* Update docs/source/ja/_toctree.yml

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update toctree.yml

* Update docs/source/en/main_classes/deepspeed.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/main_classes/deepspeed.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-10-30 09:39:14 -07:00
Leandro von Werra
b18e31407c
add info on TRL docs (#27024)
* add info on TRL docs

* add TRL link

* tweak text

* tweak text
2023-10-24 14:56:00 +02:00
Arup De
738ecd17d8
Arde/fsdp activation checkpointing (#25771)
* add FSDP config option to enable activation-checkpointing

* update docs

* add checks and remove redundant code

* fix formatting error
2023-08-29 12:52:14 +05:30
mchau
6f041fcbb8
fix documentation for CustomTrainer (#25635)
fix doc
2023-08-21 17:23:17 +02:00
Sourab Mangrulkar
f4eb459ef2
fsdp fixes and enhancements (#24980)
* fix fsdp prepare to remove the warnings and fix excess memory usage

* Update training_args.py

* parity for FSDP+XLA

* Update trainer.py
2023-07-21 17:52:48 +05:30
statelesshz
8ba26c18cf
deprecate sharded_ddp training argument (#24825)
* deprecate fairscale's ShardedDDP

* fix code style

* roll back

* deprecate the `sharded_ddp` training argument

---------

Co-authored-by: jihuazhong <jihuazhong1@huawei.com>
2023-07-17 06:57:42 -04:00
Sylvain Gugger
eb849f6604
Migrate doc files to Markdown. (#24376)
* Rename index.mdx to index.md

* With saved modifs

* Address review comment

* Treat all files

* .mdx -> .md

* Remove special char

* Update utils/tests_fetcher.py

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

---------

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
2023-06-20 18:07:47 -04:00