* Pad to 8x for fp16 multiple choice example (#9752)
* Pad to 8x for fp16 squad trainer example (#9752)
* Pad to 8x for fp16 ner example (#9752)
* Pad to 8x for fp16 swag example (#9752)
* Pad to 8x for fp16 qa beam search example (#9752)
* Pad to 8x for fp16 qa example (#9752)
* Pad to 8x for fp16 seq2seq example (#9752)
* Pad to 8x for fp16 glue example (#9752)
* Pad to 8x for fp16 new ner example (#9752)
* update script template #9752
* Update examples/multiple-choice/run_swag.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update examples/question-answering/run_qa.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update examples/question-answering/run_qa_beam_search.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* improve code quality #9752
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Auto-resume training from checkpoint
* Update examples/text-classification/run_glue.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Roll out to other examples
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Add new SQUAD example
* Same with a task-specific Trainer
* Address review comment.
* Small fixes
* Initial work for XLNet
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Final clean up and working XLNet script
* Test and debug
* Final working version
* Add new SQUAD example
* Same with a task-specific Trainer
* Address review comment.
* Small fixes
* Initial work for XLNet
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Final clean up and working XLNet script
* Test and debug
* Final working version
* Add tick
* Update README
* Address review comments
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>