Examples

Version 2.9 of 🤗 Transformers introduces a new Trainer class for PyTorch, and its equivalent TFTrainer for TF 2. Running the examples requires PyTorch 1.3.1+ or TensorFlow 2.1+.
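
The example scripts build these classes up from command-line arguments; purely as an illustration of the API they sit on, here is a minimal, hedged sketch of Trainer on a toy sequence-classification task (the model name, texts, labels, and output directory are placeholders, not taken from any example script):

import torch
from torch.utils.data import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

class ToyDataset(Dataset):
    """A tiny in-memory dataset of pre-tokenized examples."""
    def __init__(self, encodings, labels):
        self.encodings = encodings
        self.labels = labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, idx):
        item = {key: torch.tensor(values[idx]) for key, values in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased")

# Placeholder data; a real script loads and preprocesses an actual dataset here.
texts = ["a first example sentence", "a second example sentence"]
labels = [0, 1]
encodings = tokenizer(texts, truncation=True, padding=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="./toy-output", num_train_epochs=1),
    train_dataset=ToyDataset(encodings, labels),
)
trainer.train()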

Here is the list of all our examples:

  • grouped by task (all official examples work for multiple models)
  • with information on whether they are built on top of Trainer/TFTrainer (if not, they still work; they might just lack some features),
  • whether they also include examples for pytorch-lightning, which is a great fully-featured, general-purpose training library for PyTorch,
  • links to Colab notebooks to walk through the scripts and run them easily,
  • links to Cloud deployments to be able to deploy large-scale trainings in the Cloud with little to no setup.

This is still a work in progress; in particular, documentation is still sparse, so please contribute improvements and pull requests.

The Big Table of Tasks

Task | Example datasets | Trainer support | TFTrainer support | pytorch-lightning | Colab
language-modeling | Raw text | ✅ | - | - | Open In Colab
text-classification | GLUE, XNLI | ✅ | ✅ | ✅ | Open In Colab
token-classification | CoNLL NER | ✅ | ✅ | ✅ | -
multiple-choice | SWAG, RACE, ARC | ✅ | ✅ | - | Open In Colab
question-answering | SQuAD | ✅ | - | - | -
text-generation | - | n/a | n/a | n/a | Open In Colab
distillation | All | - | - | - | -
summarization | CNN/Daily Mail | - | - | ✅ | -
translation | WMT | - | - | ✅ | -
bertology | - | - | - | - | -
adversarial | HANS | ✅ | - | - | -

Important note

Important: to make sure you can successfully run the latest versions of the example scripts, you have to install the library from source and install some example-specific requirements. Execute the following steps in a new virtual environment:

git clone https://github.com/huggingface/transformers
cd transformers
pip install .
pip install -r ./examples/requirements.txt
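
To double-check that the source install is the one Python picks up (a quick optional sanity check, not part of the official steps):

import transformers
print(transformers.__version__)  # should match the version declared in the cloned repository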

One-click Deploy to Cloud (wip)

Azure

Deploy to Azure

Running on TPUs

When using TensorFlow, TPUs are supported out of the box as a tf.distribute.Strategy.
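
As a rough sketch of what that means in plain Keras terms (assuming a reachable TPU, e.g. a Colab TPU runtime, and TensorFlow 2.1+; the TFTrainer-based scripts handle strategy selection themselves):

import tensorflow as tf
from transformers import TFAutoModelForSequenceClassification

# Resolve and initialize the TPU system.
resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)
strategy = tf.distribute.experimental.TPUStrategy(resolver)

with strategy.scope():
    # Variables must be created (and the model compiled) inside the strategy scope
    # so they are replicated across the TPU cores.
    model = TFAutoModelForSequenceClassification.from_pretrained("bert-base-cased")
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=3e-5),
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    )

# model.fit(...) on a tf.data.Dataset of tokenized features then runs across the TPU cores.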

When using PyTorch, we support TPUs thanks to pytorch/xla. For more context and information on how to set up your TPU environment, refer to Google's documentation and to the very detailed pytorch/xla README.
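
For orientation, a minimal sketch of a single training step on one TPU core with pytorch/xla (assuming torch_xla is installed and a TPU is available; the model and data are placeholders):

import torch
import torch.nn.functional as F
import torch_xla.core.xla_model as xm

device = xm.xla_device()                    # the TPU core, exposed as a torch device
model = torch.nn.Linear(10, 2).to(device)   # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

inputs = torch.randn(8, 10).to(device)      # placeholder batch
targets = torch.randint(0, 2, (8,)).to(device)

loss = F.cross_entropy(model(inputs), targets)
loss.backward()
xm.optimizer_step(optimizer, barrier=True)  # steps the optimizer and forces XLA execution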

In this repo, we provide a very simple launcher script named xla_spawn.py that lets you run our example scripts on multiple TPU cores without any boilerplate. Just pass a --num_cores flag to this script, then your regular training script with its arguments (this is similar to the torch.distributed.launch helper for torch.distributed).

For example, for run_glue:

python examples/xla_spawn.py --num_cores 8 \
	examples/text-classification/run_glue.py \
	--model_name_or_path bert-base-cased \
	--task_name mnli \
	--data_dir ./data/glue_data/MNLI \
	--output_dir ./models/tpu \
	--overwrite_output_dir \
	--do_train \
	--do_eval \
	--num_train_epochs 1 \
	--save_steps 20000

Feedback, further use cases, and benchmarks involving TPUs are welcome; please share with the community.