* doc
* [tests] Add sample files for a regression task
* [HUGE] Trainer
* Feedback from @sshleifer
* Feedback from @thomwolf + logging tweak
* [file_utils] when downloading concurrently, get_from_cache will use the cached file for subsequent processes
* [glue] Use default max_seq_length of 128 like before
* [glue] move DataTrainingArguments around
* [ner] Change interface of InputExample, and align run_{tf,pl}
* Re-align the pl scripts a little bit
* ner
* [ner] Add integration test
* Fix language_modeling with API tweak
* [ci] Tweak loss target
* Don't break console output
* amp.initialize: model must be on right device before
* [multiple-choice] update for Trainer
* Re-align to 827d6d6ef0
* Big cleanup of `glue_convert_examples_to_features`
* Use batch_encode_plus
* Cleaner wrapping of glue_convert_examples_to_features for TF
@lysandrejik
* Cleanup syntax, thanks to @mfuntowicz
* Raise explicit error in case of user error
* Using loaded checkpoint with --do_predict
Without this fix, I'm getting near-random validation performance for a trained model, and the validation performance differs per validation run. I think this happens since the `model` variable isn't set with the loaded checkpoint, so I'm using a randomly initialized model. Looking at the model activations, they differ each time I run evaluation (but they don't with this fix).
* Update checkpoint loading
* Fixing model loading
* ✨ Alter base pl transformer to use automodels
* 🐛 Add batch size env variable to function call
* 💄 Apply black code style from Makefile
* 🚚 Move lightning base out of ner directory
* ✨ Add lightning glue example
* 💄 self
* move _feature_file to base class
* ✨ Move eval logging to custom callback
* 💄 Apply black code style
* 🐛 Add parent to pythonpath, remove copy command
* 🐛 Add missing max_length kwarg