jianan-gu
34097b3304
Extend Transformers Trainer Class to Enable CPU AMP and Integrate Intel Extension for PyTorch ( #17138 )
...
* init PR
* fix import ipex
* minor fix on bf16
* refine optimizer
* refine args notes
* refine code
* refine ipex optimize args
* refine half_precision_backend
* black format
* isort format
* isort format files
* flake8 format
* doc builder format
* refine codes
* remove jit and optim bits
* black preview format
* Update src/transformers/trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* refine code
* refine notes
* Update src/transformers/trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* code refine
* add ipex ut
* add performance cpu doc
* link to the cpu doc from main perf doc
* install ipex into CI's docker
* Update perf_train_cpu.mdx
* Update docs/source/en/perf_train_cpu.mdx
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Update perf_train_cpu.mdx
* Update perf_train_cpu.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2022-06-08 09:41:57 -04:00
Animesh Jain
897a8dd89f
Support compilation via Torchdynamo, AOT Autograd, NVFuser ( #17308 )
...
* Support compilation via Torchdynamo, AOT Autograd, NVFuser
* Address comments
* Lint
* Stas comments - missing quality test
* Lintere
* Quality test
* Doc lint
* Reset CUDA peak mem
* Add CustomTrainer
* require a single gpu
Co-authored-by: Stas Bekman <stas@stason.org>
2022-05-25 11:16:09 -04:00
Stas Bekman
3601aa8fc9
[tests] fix copy-n-paste error ( #17312 )
...
* [tests] fix copy-n-paste error
* fix
2022-05-18 16:00:47 -07:00
Yih-Dar
66b3e106a1
Make TrainerHyperParameterSigOptIntegrationTest slow test ( #17288 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-05-16 14:18:09 -04:00
Antoni Baum
edcc66d27c
Remove unnecessary columns for all dataset types in Trainer
( #17166 )
...
* Remove unneeded columns for IterableDataset
* Add test
* Update trainer tests
* Edit docstring
* Lint
* Apply feedback
* Apply feedback
2022-05-11 11:11:26 -04:00
Zachary Mueller
2fbb237967
Add the auto_find_batch_size capability from Accelerate into Trainer ( #17068 )
...
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
- Adds auto_batch_size finder
- Moves training loop to an inner training loop
2022-05-09 12:29:18 -04:00
Sylvain Gugger
1c9fcd0e04
Fix RNG reload in resume training from epoch checkpoint ( #17055 )
...
* Fix RNG reload in resume training from epoch checkpoint
* Fix test
2022-05-03 10:31:24 -04:00
Sylvain Gugger
a8fa2f91f4
Make Trainer compatible with sharded checkpoints ( #17053 )
...
* Make Trainer compatible with sharded checkpoints
* Add doc
2022-05-03 09:55:10 -04:00
Manuel R. Ciosici
3104036e7f
Add support for bitsandbytes ( #15622 )
...
* Add initial BNB integration
* fixup! Add initial BNB integration
* Add bnb test decorator
* Update Adamw8bit option name
* Use the full bnb package name
* Overide bnb for all embedding layers
* Fix package name
* Formatting
* Remove unnecessary import
* Update src/transformers/trainer.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Rename AdamwBNB optimizer option
* Add training test checking that bnb memory utilization is lower
* fix merge
* fix merge; fix + extend new test
* cleanup
* expand bnb
* move all require_* candidates to testing_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>
2022-04-19 16:01:29 -04:00
code-review-doctor
a2392415e9
Some tests misusing assertTrue for comparisons fix ( #16771 )
...
* Fix issue avoid-misusing-assert-true found at https://codereview.doctor
* fix tests
* fix tf
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-04-19 14:44:08 +02:00
Sander Land
d7c8ce57d4
Avoid accessing .dataset of a DataLoader in Trainer ( #16451 )
...
* Avoid accessing .dataset of a dataloader
* style
* fix
* cleaning up, reverting some misunderstandings
* black
* add train_dataset argument to get_train_dataloader, and fix other instances of length checks
* flake8
* address comments
* fix bug
* cleanup
* add test
* Update tests/trainer/test_trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* under torch
* merge
* stylistic suggestion
Co-authored-by: Sander Land <sander@chatdesk.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-03-29 15:00:18 -04:00
Sylvain Gugger
4975002df5
Reorganize file utils ( #16264 )
...
* Split file_utils in several submodules
* Fixes
* Add back more objects
* More fixes
* Who exactly decided to import that from there?
* Second suggestion to code with code review
* Revert wront move
* Fix imports
* Adapt all imports
* Adapt all imports everywhere
* Revert this import, will fix in a separate commit
2022-03-23 10:26:33 -04:00
David Hall
5b7dcc7342
Seed _get_train_sampler's generator with arg seed to improve reproducibility ( #15961 )
...
* Seed get_train_sampler's generator with arg seed to improve reproducibility
and make the world_size<=1 code path more similar to the others
* move test file into trainer test explicitly
* dumb typo
* make style lint happy
* per discussion, switch to data_seed
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-03-08 13:45:41 -05:00
Lysandre Debut
29c10a41d0
[Test refactor 1/5] Per-folder tests reorganization ( #15725 )
...
* Per-folder tests reorganization
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
Co-authored-by: Stas Bekman <stas@stason.org>
2022-02-23 15:46:28 -05:00