* Trainer should not modify its TrainingArguments
* Add test of resumed training
* Fixes
* Non multiGPU test
* Clean Trainer state
* Add more to the state
* Documentation
* One last test
* Make resume training test more complete
* Unwanted changes
* fix ZeroDivisionError and epoch counting
* Add test for num_train_epochs calculation in trainer.py
* Remove @require_non_multigpu for test_num_train_epochs_in_training
* Don't pass sampler for iterable dataset
* Added check for test and eval dataloaders.
* Formatting
* Cleaner if nesting.
* Added test for trainer and iterable dataset
* Formatting for test
* Fixed import when torch is available only.
* Added require torch decorator to helper class
* Moved dataset class inside unittest
* Removed nested if and changed model in test
* Checking torch availability for IterableDataset
* Added data collator for XLNet language modeling and related calls
Added DataCollatorForXLNetLanguageModeling in data/data_collator.py
to generate necessary inputs for language modeling training with
XLNetLMHeadModel. Also added related arguments, logic and calls in
examples/language-modeling/run_language_modeling.py.
Resolves: #4739, #2008 (partially)
* Changed name to `DataCollatorForPermutationLanguageModeling`
Changed the name of `DataCollatorForXLNetLanguageModeling` to the more general `DataCollatorForPermutationLanguageModeling`.
Removed the `--mlm` flag requirement for the new collator and defined a separate `--plm_probability` flag for its use.
CTRL uses a CLM loss just like GPT and GPT-2, so it should work out of the box with this script (provided `past` is handled
similarly to `mems` for XLNet).
Changed calls and imports appropriately.
* Added detailed comments, changed variable names
Added more detailed comments to `DataCollatorForPermutationLanguageModeling` in `data/data_collator.py` to explain how it works. Also renamed variables to be more informative.
* Added tests for new data collator
Added tests in `tests/test_trainer.py` for DataCollatorForPermutationLanguageModeling based on those in DataCollatorForLanguageModeling. A specific test has been added to check for odd-length sequences.
* Fixed styling issues
* Adds a predict stage for GLUE tasks and generates result files that can be submitted to the gluebenchmark.com website.
* Use Split enum + always output the label name
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
* Improvements to the wandb integration
* small reorg + no global necessary
* feat(trainer): log epoch and final metrics
* Simplify logging a bit
* Fixup
* Fix crash when just running eval
Co-authored-by: Chris Van Pelt <vanpelt@gmail.com>
Co-authored-by: Boris Dayma <boris.dayma@gmail.com>
* Created using Colaboratory
* [examples] reorganize files
* remove run_tpu_glue.py as superseded by TPU support in Trainer
* Bugfix: int, not tuple
* move files around
* doc
* [tests] Add sample files for a regression task
* [HUGE] Trainer
* Feedback from @sshleifer
* Feedback from @thomwolf + logging tweak
* [file_utils] when downloading concurrently, get_from_cache will use the cached file for subsequent processes
* [glue] Use default max_seq_length of 128 like before
* [glue] move DataTrainingArguments around
* [ner] Change interface of InputExample, and align run_{tf,pl}
* Re-align the pl scripts a little bit
* ner
* [ner] Add integration test
* Fix language_modeling with API tweak
* [ci] Tweak loss target
* Don't break console output
* amp.initialize: model must be on the right device beforehand
* [multiple-choice] update for Trainer
* Re-align to 827d6d6ef0