8 Commits

Author SHA1 Message Date
Younes Belkada
309a90664f
[FEAT] Add Neftune into transformers Trainer (#27141)
* add v1 neftune

* use `unwrap_model` instead

* add test + docs

* Apply suggestions from code review

Co-authored-by: Zach Mueller <muellerzr@gmail.com>

* more details

* fixup

* Update docs/source/en/main_classes/trainer.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* refactor a bit

* more elaborated test

* fix unwrap issue

---------

Co-authored-by: Zach Mueller <muellerzr@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-10-31 16:03:59 +01:00
Rockerz
84724efd10
Translating en/main_classes folder docs to Japanese 🇯🇵 (#26894)
* add

* add

* add

* Add deepspeed.md

* Add

* add

* Update docs/source/ja/main_classes/callback.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/main_classes/output.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/main_classes/pipelines.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/main_classes/processors.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/main_classes/processors.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/main_classes/text_generation.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/main_classes/processors.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update logging.md

* Update toctree.yml

* Update docs/source/ja/main_classes/deepspeed.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Add suggestions

* m

* Update docs/source/ja/main_classes/trainer.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update toctree.yml

* Update Quantization.md

* Update docs/source/ja/_toctree.yml

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update toctree.yml

* Update docs/source/en/main_classes/deepspeed.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/main_classes/deepspeed.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-10-30 09:39:14 -07:00
Leandro von Werra
b18e31407c
add info on TRL docs (#27024)
* add info on TRL docs

* add TRL link

* tweak text

* tweak text
2023-10-24 14:56:00 +02:00
Arup De
738ecd17d8
Arde/fsdp activation checkpointing (#25771)
* add FSDP config option to enable activation-checkpointing

* update docs

* add checks and remove redundant code

* fix formatting error
2023-08-29 12:52:14 +05:30
mchau
6f041fcbb8
fix documentation for CustomTrainer (#25635)
fix doc
2023-08-21 17:23:17 +02:00
Sourab Mangrulkar
f4eb459ef2
fsdp fixes and enhancements (#24980)
* fix fsdp prepare to remove the warnings and fix excess memory usage

* Update training_args.py

* parity for FSDP+XLA

* Update trainer.py
2023-07-21 17:52:48 +05:30
statelesshz
8ba26c18cf
deprecate sharded_ddp training argument (#24825)
* deprecate fairscale's ShardedDDP

* fix code style

* roll back

* deprecate the `sharded_ddp` training argument

---------

Co-authored-by: jihuazhong <jihuazhong1@huawei.com>
2023-07-17 06:57:42 -04:00
Sylvain Gugger
eb849f6604
Migrate doc files to Markdown. (#24376)
* Rename index.mdx to index.md

* With saved modifs

* Address review comment

* Treat all files

* .mdx -> .md

* Remove special char

* Update utils/tests_fetcher.py

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

---------

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
2023-06-20 18:07:47 -04:00