Lysandre Debut
8010fda9bf
Removes images to put them in a dataset (#14781)
* First try
* Update instructions
2021-12-16 04:42:02 -05:00
Stas Bekman
fdf3ce2827
[doc] performance: groups of operations by compute-intensity (#14757)
2021-12-14 19:01:23 -08:00
Stas Bekman
027074f4d0
[doc] document MoE model approach and current solutions (#14725)
* document MoE model approach
* additional info from Samyam
* fix
2021-12-10 18:24:38 -08:00
Stas Bekman
1228661285
[bf16 support] tweaks (#14580)
* [bf16 support] tweaks
* corrections
Co-authored-by: Manuel R. Ciosici <manuelrciosici@gmail.com>
2021-12-08 11:33:24 -08:00
Stas Bekman
71b1bf7ea8
[trainer] add tf32-mode control (#14606)
* [trainer] add --tf32 support
* it's pt>=.17
* it's pt>=.17
* flip the default to True
* add experimental note
* simplify logic
* style
* switch to 3-state logic
* doc
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* re-style code
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-12-03 10:08:58 -08:00
Mishig Davaadorj
275402bf2b
Update doc img links (#14593)
* Update doc img links
* Rename toctree.yml -> _toctree.yml (#14594)
* Update doc img links
* Update performance.md img link
2021-12-02 09:01:35 +01:00
Stas Bekman
fbe278c76c
[doc] bf16/tf32 guide (#14579)
* [doc] bf16/tf32 guide
* expand
* expand
* Update docs/source/performance.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-12-01 14:18:58 -08:00
Stas Bekman
29dfb2dbb1
[doc] performance and parallelism updates (#14391)
* [doc] performance and parallelism doc update
* improve
* improve
2021-11-14 17:19:15 -08:00
Sylvain Gugger
27d4639779
Make gradient_checkpointing a training argument (#13657)
* Make gradient_checkpointing a training argument
* Update src/transformers/modeling_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Update src/transformers/configuration_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Fix tests
* Style
* document Gradient Checkpointing as a performance feature
* Small rename
* PoC for not using the config
* Adapt BC to new PoC
* Forgot to save
* Rollout changes to all other models
* Fix typo
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>
2021-09-22 07:51:38 -04:00
Stas Bekman
31cfcbd3e2
[doc] performance: batch sizes (#12725)
2021-07-15 09:39:34 -07:00
Stas Bekman
bfd5da8e28
[docs] performance (#12258)
* initial performance document
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* rewrites based on suggestions
* 8x multiple is for AMP only
* add contribute section
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-06-22 15:34:19 -07:00