Stas Bekman
|
580dd87c55
|
[Deepspeed] add support for bf16 mode (#14569)
* [WIP] add support for bf16 mode
* prep for bf16
* prep for bf16
* fix; zero2/bf16 is ok
* check bf16 is available
* test fixes
* enable zero3_bf16
* config files
* docs
* split stage_dtype; merge back to non-dtype-specific config file
* fix doc
* cleanup
* cleanup
* bfloat16 => bf16 to match the PR changes
* s/zero_gather_fp16_weights_on_model_save/zero_gather_16bit_weights_on_model_save/; s/save_fp16_model/save_16bit_model/
* test fixes/skipping
* move
* fix
* Update docs/source/main_classes/deepspeed.mdx
* backticks
* cleanup
* cleanup
* cleanup
* new version
* add note about grad accum in bf16
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
|
2022-03-11 17:53:53 -08:00 |
|
Stas Bekman
|
b842d7277a
|
fix deepspeed tests (#15881)
* fix deepspeed tests
* style
* more fixes
|
2022-03-01 19:27:28 -08:00 |
|
Lysandre Debut
|
29c10a41d0
|
[Test refactor 1/5] Per-folder tests reorganization (#15725)
* Per-folder tests reorganization
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
Co-authored-by: Stas Bekman <stas@stason.org>
|
2022-02-23 15:46:28 -05:00 |
|
Stas Bekman
|
1eb40338ac
|
[deepspeed tests] fix summarization (#15149)
|
2022-01-13 13:48:51 -08:00 |
|
Jeff Rasley
|
d0e96c6de6
|
[deepspeed] Enable multiple test runs on single box, defer to DS_TEST_PORT if set (#14331)
* defer to DS_TEST_PORT if set
* style
Co-authored-by: Stas Bekman <stas@stason.org>
|
2021-11-08 12:40:29 -08:00 |
|
Stas Bekman
|
78f5fe1416
|
[Deepspeed] adapt multiple models, add zero_to_fp32 tests (#12477)
* zero_to_fp32 tests
* args change
* remove unnecessary work
* use transformers.trainer_utils.get_last_checkpoint
* document the new features
* cleanup
* wip
* fix fsmt
* add bert
* cleanup
* add xlm-roberta
* electra works
* cleanup
* sync
* split off the model zoo tests
* cleanup
* cleanup
* cleanup
* cleanup
* reformat
* cleanup
* casing
* deepspeed>=0.4.3
* adjust distilbert
* Update docs/source/main_classes/deepspeed.rst
* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
|
2021-07-13 12:07:32 -07:00 |
|