Aleksander Smywiński-Pohl
|
c37573806a
|
Fix typo in deepspeed documentation (#13482)
* Fix typo in deepspeed documentation
* Add missing import in deepspeed configuration
|
2021-09-08 11:24:10 -07:00 |
|
Stas Bekman
|
807b6bd160
|
[Deepspeed] warmup_ratio docs (#12830)
* [Deepspeed] warmup_ratio docs
* Update docs/source/main_classes/deepspeed.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* style
* Update docs/source/main_classes/deepspeed.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
|
2021-07-21 10:49:29 -07:00 |
|
Stas Bekman
|
5dd0c956a8
|
non-native optimizers are mostly ok with zero-offload (#12690)
|
2021-07-13 20:18:51 -07:00 |
|
Stas Bekman
|
78f5fe1416
|
[Deepspeed] adapt multiple models, add zero_to_fp32 tests (#12477)
* zero_to_fp32 tests
* args change
* remove unnecessary work
* use transformers.trainer_utils.get_last_checkpoint
* document the new features
* cleanup
* wip
* fix fsmt
* add bert
* cleanup
* add xlm-roberta
* electra works
* cleanup
* sync
* split off the model zoo tests
* cleanup
* cleanup
* cleanup
* cleanup
* reformat
* cleanup
* casing
* deepspeed>=0.4.3
* adjust distilbert
* Update docs/source/main_classes/deepspeed.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
|
2021-07-13 12:07:32 -07:00 |
|
Stas Bekman
|
7682e97702
|
[models] respect dtype of the model when instantiating it (#12316)
* [models] respect dtype of the model when instantiating it
* cleanup
* cleanup
* rework to handle non-float dtype
* fix
* switch to fp32 tiny model
* improve
* use dtype.is_floating_point
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fix the doc
* recode to use explicit torch_dtype_auto_detect, torch_dtype args
* docs and tweaks
* docs and tweaks
* docs and tweaks
* merge 2 args, add docs
* fix
* fix
* better doc
* better doc
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
|
2021-06-28 20:11:21 -07:00 |
|
Stas Bekman
|
07ae6103c3
|
[Deepspeed] new docs (#12077)
* document sub_group_size
* style
* install + issues reporting
* style
* style
* Update docs/source/main_classes/deepspeed.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* indent 4
* restore
* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
|
2021-06-23 11:07:37 -07:00 |
|
Stas Bekman
|
0e82f0cbc2
|
typo
|
2021-06-08 12:55:17 -07:00 |
|
Stas Bekman
|
32290d87f6
|
[Deepspeed] various fixes (#12058)
* replace deprecated config
* sub_group_size was too big
* complete deprecation removal
|
2021-06-08 08:36:15 -07:00 |
|
Stas Bekman
|
2c73b93099
|
[Deepspeed] Assert on mismatches between ds and hf args (#12021)
* wip
* add mismatch validation + test
* renames
* Update docs/source/main_classes/deepspeed.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* renames
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
|
2021-06-04 08:58:23 -07:00 |
|
Stas Bekman
|
640318befa
|
[deepspeed] Move code and doc into standalone files (#11984)
* move code and docs
* style
* moved
* restore
|
2021-06-02 09:56:00 -07:00 |
|
Stas Bekman
|
7ec596ecda
|
[DeepSpeed] decouple DeepSpeedConfigHF from Trainer (#11966)
* decouple DeepSpeedConfigHF from Trainer
* add LoggingLevel ctx manager; add new test
* cleanup
* add docs
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* implemented suggested renames
* formatter workaround
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
|
2021-06-01 13:24:52 -07:00 |
|