* add pl_glue example test
* for now just test that it runs, next validate results of eval or predict?
* complete the run_pl_glue test to validate the actual outcome
* worked on my machine, CI gets less accuracy - trying higher epochs
* match run_pl.sh hparms
* more epochs?
* trying higher lr
* for now just test that the script runs to a completion
* correct the comment
* if cuda is available, add --fp16 --gpus=1 to cover more bases
* style
as discussed with @sshleifer, removing this TODO to switch to a tiny model, since it won't be able to test the results of the evaluation (i.e. the results are meaningless).
* Support for Comet.ml
* Need to import comet first
* Log this model, not the one in the backprop step
* Log args as hyperparameters; use framework to allow fine control
* Log hyperparameters with context
* Apply black formatting
* isort fix integrations
* isort fix __init__
* Update src/transformers/trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/trainer_tf.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Address review comments
* Style + Quality, remove Tensorboard import test
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
* support --lr_scheduler with multiple possibilities
* correct the error message
* add a note about supported schedulers
* cleanup
* cleanup2
* needs the argument default
* style
* add another assert in the test
* implement requested changes
* cleanups
* fix relative import
* cleanup
* enable easy checkout switch
allow having multiple repository checkouts and not needing to remember to rerun 'pip install -e .[dev]' when switching between checkouts and running tests.
* make isort happy
* examples needs one too
* Switch from return_tuple to return_dict
* Fix test
* [WIP] Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleC… (#5614)
* Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleChoice} models and tests
* AutoModels
Tiny tweaks
* Style
* Final changes before merge
* Re-order for simpler review
* Final fixes
* Addressing @sgugger's comments
* Test MultipleChoice
* Rework TF trainer (#6038)
* Fully rework training/prediction loops
* fix method name
* Fix variable name
* Fix property name
* Fix scope
* Fix method name
* Fix tuple index
* Fix tuple index
* Fix indentation
* Fix variable name
* fix eval before log
* Add drop remainder for test dataset
* Fix step number + fix logging datetime
* fix eval loss value
* use global step instead of step + fix logging at step 0
* Fix logging datetime
* Fix global_step usage
* Fix breaking loop + logging datetime
* Fix step in prediction loop
* Fix step breaking
* Fix train/test loops
* Force TF at least 2.2 for the trainer
* Use assert_cardinality to facilitate the dataset size computation
* Log steps per epoch
* Make tfds compliant with TPU
* Make tfds compliant with TPU
* Use TF dataset enumerate instead of the Python one
* revert previous commit
* Fix data_dir
* Apply style
* rebase on master
* Address Sylvain's comments
* Address Sylvain's and Lysandre comments
* Trigger CI
* Remove unused import
* Switch from return_tuple to return_dict
* Fix test
* Add recent model
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Julien Plu <plu.julien@gmail.com>
* Fully rework training/prediction loops
* fix method name
* Fix variable name
* Fix property name
* Fix scope
* Fix method name
* Fix tuple index
* Fix tuple index
* Fix indentation
* Fix variable name
* fix eval before log
* Add drop remainder for test dataset
* Fix step number + fix logging datetime
* fix eval loss value
* use global step instead of step + fix logging at step 0
* Fix logging datetime
* Fix global_step usage
* Fix breaking loop + logging datetime
* Fix step in prediction loop
* Fix step breaking
* Fix train/test loops
* Force TF at least 2.2 for the trainer
* Use assert_cardinality to facilitate the dataset size computation
* Log steps per epoch
* Make tfds compliant with TPU
* Make tfds compliant with TPU
* Use TF dataset enumerate instead of the Python one
* revert previous commit
* Fix data_dir
* Apply style
* rebase on master
* Address Sylvain's comments
* Address Sylvain's and Lysandre comments
* Trigger CI
* Remove unused import
* DataParallel fixes:
1. switched to a more precise check
- if self.args.n_gpu > 1:
+ if isinstance(model, nn.DataParallel):
2. fix tests - require the same fixup under DataParallel as the training module
* another fix
* Add deebert code
* Add readme of deebert
* Add test for deebert
Update test for Deebert
* Update DeeBert (README, class names, function refactoring); remove requirements.txt
* Format update
* Update test
* Update readme and model init methods
* Added data collator for XLNet language modeling and related calls
Added DataCollatorForXLNetLanguageModeling in data/data_collator.py
to generate necessary inputs for language modeling training with
XLNetLMHeadModel. Also added related arguments, logic and calls in
examples/language-modeling/run_language_modeling.py.
Resolves: #4739, #2008 (partially)
* Changed name to `DataCollatorForPermutationLanguageModeling`
Changed the name of `DataCollatorForXLNetLanguageModeling` to the more general `DataCollatorForPermutationLanguageModelling`.
Removed the `--mlm` flag requirement for the new collator and defined a separate `--plm_probability` flag for its use.
CTRL uses a CLM loss just like GPT and GPT-2, so should work out of the box with this script (provided `past` is taken care of
similar to `mems` for XLNet).
Changed calls and imports appropriately.
* Added detailed comments, changed variable names
Added more detailed comments to `DataCollatorForPermutationLanguageModeling` in `data/data_collator.py` to explain working. Also cleaned up variable names and made them more informative.
* Added tests for new data collator
Added tests in `tests/test_trainer.py` for DataCollatorForPermutationLanguageModeling based on those in DataCollatorForLanguageModeling. A specific test has been added to check for odd-length sequences.
* Fixed styling issues
Otherwise, if label is not specified, the following error occurs:
Traceback (most recent call last):
File "run_ner.py", line 303, in <module>
main()
File "run_ner.py", line 101, in main
model_args, data_args, training_args = parser.parse_json_file(json_file=os.path.abspath(sys.argv[1]))
File "/home/user/anaconda3/envs/bert/lib/python3.7/site-packages/transformers/hf_argparser.py", line 159, in parse_json_file
obj = dtype(**inputs)
TypeError: __init__() missing 1 required positional argument: 'labels'
* Fix the bug 'Attempted relative import with no known parent package' when using the bertabs example. Also change the used model from bertabs-finetuned-cnndm, since it seems not be accessible anymore
* Update run_summarization.py
Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
* remove references to old API in docstring - update data processors
* style
* fix tests - better type checking error messages
* better type checking
* include awesome fix by @LysandreJik for #5310
* updated doc and examples
* Add BERT Loses Patience (Patience-based Early Exit)
* update model archive
* update format
* sort import
* flake8
* Add results
* full results
* align the table
* refactor to inherit
* default per gpu eval = 1
* Formatting
* Formatting
* isort
* modify readme
* Add check
* Fix format
* Fix format
* Doc strings
* ALBERT & BERT for sequence classification don't inherit from the original anymore
* Remove incorrect comments
* Remove incorrect comments
* Remove incorrect comments
* Sync up with new code
* Sync up with new code
* Add a test
* Add a test
* Add a test
* Add a test
* Add a test
* Add a test
* Finishing up!