Commit Graph

11 Commits

Author SHA1 Message Date
Lysandre Debut
8010fda9bf
Removes images to put them in a dataset (#14781)
* First try

* Update instructions
2021-12-16 04:42:02 -05:00
Stas Bekman
fdf3ce2827
[doc] performance: groups of operations by compute-intensity (#14757) 2021-12-14 19:01:23 -08:00
Stas Bekman
027074f4d0
[doc] document MoE model approach and current solutions (#14725)
* document MoE model approach

* additional info from Samyam

* fix
2021-12-10 18:24:38 -08:00
Stas Bekman
1228661285
[bf16 support] tweaks (#14580)
* [bf16 support] tweaks

* corrections

Co-authored-by: Manuel R. Ciosici <manuelrciosici@gmail.com>
2021-12-08 11:33:24 -08:00
Stas Bekman
71b1bf7ea8
[trainer] add tf32-mode control (#14606)
* [trainer] add --tf32 support

* it's pt>=.17

* it's pt>=.17

* flip the default to True

* add experimental note

* simplify logic

* style

* switch to 3-state logic

* doc

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* re-style code

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-12-03 10:08:58 -08:00
Mishig Davaadorj
275402bf2b
Update doc img links (#14593)
* Update doc img links

* Rename toctree.yml -> _toctree.yml (#14594)

* Update doc img links

* Update performance.md img link
2021-12-02 09:01:35 +01:00
Stas Bekman
fbe278c76c
[doc] bf16/tf32 guide (#14579)
* [doc] bf16/tf32 guide

* expand

* expand

* Update docs/source/performance.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-12-01 14:18:58 -08:00
Stas Bekman
29dfb2dbb1
[doc] performance and parallelism updates (#14391)
* [doc] performance and parallelism doc update

* improve

* improve
2021-11-14 17:19:15 -08:00
Sylvain Gugger
27d4639779
Make gradient_checkpointing a training argument (#13657)
* Make gradient_checkpointing a training argument

* Update src/transformers/modeling_utils.py

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Update src/transformers/configuration_utils.py

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Fix tests

* Style

* document Gradient Checkpointing as a performance feature

* Small rename

* PoC for not using the config

* Adapt BC to new PoC

* Forgot to save

* Rollout changes to all other models

* Fix typo

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>
2021-09-22 07:51:38 -04:00
Stas Bekman
31cfcbd3e2
[doc] performance: batch sizes (#12725) 2021-07-15 09:39:34 -07:00
Stas Bekman
bfd5da8e28
[docs] performance (#12258)
* initial performance document

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* rewrites based on suggestions

* 8x multiple is for AMP only

* add contribute section

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-06-22 15:34:19 -07:00