1212 Commits

Author SHA1 Message Date
Stas Bekman
7c6a085ebf
pl version: examples/requirements.txt is single source of truth (#6309) 2020-08-11 10:58:54 -04:00
Stas Bekman
f6c0680d36
add pl_glue example test (#6034)
* add pl_glue example test

* for now just test that it runs, next validate results of eval or predict?

* complete the run_pl_glue test to validate the actual outcome

* worked on my machine, CI gets less accuracy - trying higher epochs

* match run_pl.sh hparams

* more epochs?

* trying higher lr

* for now just test that the script runs to completion

* correct the comment

* if cuda is available, add --fp16 --gpus=1 to cover more bases

* style
2020-08-11 03:16:52 -04:00
Sam Shleifer
b9ecd92ee4
[s2s] Script to save wmt data to disk (#6403) 2020-08-10 22:49:39 -04:00
Rohit Gupta
35eb96de4d
correct pl link in readme (#6364) 2020-08-10 03:08:46 -04:00
Stas Bekman
0830e79512
the test now works again (#6371) 2020-08-10 02:55:52 -04:00
Sam Shleifer
9a5ef83748
[s2s] fix --gpus clarg collision (#6358) 2020-08-08 21:51:37 -04:00
Suraj Patil
9bed355449
[s2s] fix label_smoothed_nll_loss (#6344) 2020-08-08 04:21:12 -04:00
Sam Shleifer
99f73bcc71
[s2s] tiny QOL improvement: run_eval prints scores (#6341) 2020-08-08 02:45:55 -04:00
Stas Bekman
322dffc6c9
remove a TODO item to use a tiny model (#6338)
as discussed with @sshleifer, removing this TODO to switch to a tiny model, since it won't be able to test the results of the evaluation (i.e. the results are meaningless).
2020-08-07 21:30:39 -04:00
zcain117
1b8a7ffcfd
Add setup for TPU CI to run every hour. (#6219)
* Add setup for TPU CI to run every hour.

* Re-organize config.yml

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2020-08-07 11:17:07 -04:00
Stas Bekman
6695450a23
[examples] consistently use --gpus, instead of --n_gpu (#6315) 2020-08-07 10:36:32 -04:00
Stas Bekman
175cd45e13
fix the shuffle argument usage and the default (#6307) 2020-08-06 20:32:28 -04:00
Bhashithe Abeysinghe
ffceef2042
[Fix] text-classification PL example (#6027)
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-08-06 15:46:43 -04:00
xujiaze13
eb2bd8d6eb
Remove redundant line in run_pl_glue.py (#6305) 2020-08-06 15:43:45 -04:00
Sam Shleifer
2804fff839
[s2s] Use prepare_translation_batch for Marian finetuning (#6293)
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-08-06 14:58:38 -04:00
Doug Blank
b923871bb7
Adds comet_ml to the list of auto-experiment loggers (#6176)
* Support for Comet.ml

* Need to import comet first

* Log this model, not the one in the backprop step

* Log args as hyperparameters; use framework to allow fine control

* Log hyperparameters with context

* Apply black formatting

* isort fix integrations

* isort fix __init__

* Update src/transformers/trainer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/trainer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/trainer_tf.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Address review comments

* Style + Quality, remove Tensorboard import test

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2020-08-06 11:31:30 -04:00
Stas Bekman
376c02e9a9
[WIP] lightning_base: support --lr_scheduler with multiple possibilities (#6232)
* support --lr_scheduler with multiple possibilities

* correct the error message

* add a note about supported schedulers

* cleanup

* cleanup2

* needs the argument default

* style

* add another assert in the test

* implement requested changes

* cleanups

* fix relative import

* cleanup
2020-08-05 09:01:17 -04:00
Sam Shleifer
57eb1cb68d
[s2s] Document better mbart finetuning command (#6229)
* Document better MT command

* improve multigpu command
2020-08-03 18:22:31 -04:00
Victor SANH
0513f8d275
correct label extraction + add note on discrepancies on trained MNLI model and HANS (#6221) 2020-08-03 15:02:51 -04:00
Sam Shleifer
b6b2f2270f
s2s: fix LR logging, remove some dead code. (#6205) 2020-08-03 10:36:26 -04:00
Stas Bekman
d8dbf3b75d
[s2s] clean up + doc (#6184)
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-08-01 14:51:07 -04:00
Stas Bekman
f250beb8aa
enable easy checkout switch (#5645)
* enable easy checkout switch

allow having multiple repository checkouts and not needing to remember to rerun 'pip install -e .[dev]' when switching between checkouts and running tests.

* make isort happy

* examples needs one too
2020-07-31 04:34:46 -04:00
Sylvain Gugger
91cb95461e
Switch from return_tuple to return_dict (#6138)
* Switch from return_tuple to return_dict

* Fix test

* [WIP] Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleC… (#5614)

* Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleChoice} models and tests

* AutoModels


Tiny tweaks

* Style

* Final changes before merge

* Re-order for simpler review

* Final fixes

* Addressing @sgugger's comments

* Test MultipleChoice

* Rework TF trainer (#6038)

* Fully rework training/prediction loops

* fix method name

* Fix variable name

* Fix property name

* Fix scope

* Fix method name

* Fix tuple index

* Fix tuple index

* Fix indentation

* Fix variable name

* fix eval before log

* Add drop remainder for test dataset

* Fix step number + fix logging datetime

* fix eval loss value

* use global step instead of step + fix logging at step 0

* Fix logging datetime

* Fix global_step usage

* Fix breaking loop + logging datetime

* Fix step in prediction loop

* Fix step breaking

* Fix train/test loops

* Force TF at least 2.2 for the trainer

* Use assert_cardinality to facilitate the dataset size computation

* Log steps per epoch

* Make tfds compliant with TPU

* Make tfds compliant with TPU

* Use TF dataset enumerate instead of the Python one

* revert previous commit

* Fix data_dir

* Apply style

* rebase on master

* Address Sylvain's comments

* Address Sylvain's and Lysandre comments

* Trigger CI

* Remove unused import

* Switch from return_tuple to return_dict

* Fix test

* Add recent model

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Julien Plu <plu.julien@gmail.com>
2020-07-30 09:17:00 -04:00
Stas Bekman
3212b8850d
[s2s] add support for overriding config params (#6149) 2020-07-30 01:09:46 -04:00
Julien Plu
54f9fbeff8
Rework TF trainer (#6038)
* Fully rework training/prediction loops

* fix method name

* Fix variable name

* Fix property name

* Fix scope

* Fix method name

* Fix tuple index

* Fix tuple index

* Fix indentation

* Fix variable name

* fix eval before log

* Add drop remainder for test dataset

* Fix step number + fix logging datetime

* fix eval loss value

* use global step instead of step + fix logging at step 0

* Fix logging datetime

* Fix global_step usage

* Fix breaking loop + logging datetime

* Fix step in prediction loop

* Fix step breaking

* Fix train/test loops

* Force TF at least 2.2 for the trainer

* Use assert_cardinality to facilitate the dataset size computation

* Log steps per epoch

* Make tfds compliant with TPU

* Make tfds compliant with TPU

* Use TF dataset enumerate instead of the Python one

* revert previous commit

* Fix data_dir

* Apply style

* rebase on master

* Address Sylvain's comments

* Address Sylvain's and Lysandre comments

* Trigger CI

* Remove unused import
2020-07-29 14:32:01 -04:00
Lysandre Debut
641b873c13
XLNet PLM Readme (#6121) 2020-07-29 11:38:15 -04:00
Sam Shleifer
92f8ce2ed6
Fix deebert tests (#6102) 2020-07-28 18:30:16 -04:00
Sam Shleifer
dafa296c95
[s2s] Delete useless method, log tokens_per_batch (#6081) 2020-07-28 11:24:23 -04:00
Stas Bekman
f0c70085c2
link to README.md (#6068)
* add a link to README.md

* Update README.md
2020-07-28 20:34:58 +08:00
Sam Shleifer
3c7fbf35a6
MBART: support summarization tasks where max_src_len > max_tgt_len (#6003)
* MBART: support summarization tasks

* fix test

* Style

* add tokenizer test
2020-07-28 08:18:11 -04:00
Sam Shleifer
7a68d40138
[s2s] Don't mention packed data in README (#6079) 2020-07-27 20:07:21 -04:00
Sam Shleifer
1e00ef681d
[s2s] dont document packing because it hurts performance (#6077) 2020-07-27 18:26:00 -04:00
Sam Shleifer
11792d7826
CL util to convert models to fp16 before upload (#5953) 2020-07-27 12:21:25 -04:00
Sam Shleifer
4302ace5bd
[pack_dataset] don't sort before packing, only pack train (#5954) 2020-07-27 12:14:23 -04:00
Suraj Patil
d1d15d6f2d
[examples (seq2seq)] fix preparing decoder_input_ids for T5 (#5994) 2020-07-27 10:10:43 -04:00
Sam Shleifer
c69ea5efc4
[CI] Don't test apex (#6021) 2020-07-24 15:34:16 -04:00
Sam Shleifer
c3206eef44
[test] partial coverage for train_mbart_enro_cc25.sh (#5976) 2020-07-22 14:34:49 -04:00
Sam Shleifer
feeb956a19
[docs] Add integration test example to copy pasta template (#5961)
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-07-22 12:48:38 -04:00
Sam Shleifer
9dab39feea
seq2seq/run_eval.py can take decoder_start_token_id (#5949) 2020-07-21 16:58:45 -04:00
Sam Shleifer
5b193b39b0
[examples/seq2seq]: add --label_smoothing option (#5919) 2020-07-21 16:51:39 -04:00
Sam Shleifer
95d1962b9c
[Doc] explaining romanian postprocessing for MBART BLEU hacking (#5943) 2020-07-21 14:12:48 -04:00
Aditya Soni
ccbf74a685
typos in seq2seq/readme (#5937) 2020-07-21 09:44:59 -04:00
Qingqing Cao
8e0bcb56ec
DataParallel fix: multi gpu evaluation (#5926)
The DataParallel training was fixed in https://github.com/huggingface/transformers/pull/5733; this commit also fixes the evaluation. It's more convenient when the user enables both `do_train` and `do_eval`.
2020-07-20 17:54:08 -04:00
Sam Shleifer
f1a4e06f1f
[Fix] seq2seq pack_dataset.py actually packs (#5913)
Huge MT speedup!
2020-07-20 15:18:26 -04:00
Stas Bekman
35cb101eae
DataParallel fixes (#5733)
* DataParallel fixes:

1. switched to a more precise check
-        if self.args.n_gpu > 1:
+        if isinstance(model, nn.DataParallel):

2. fix tests - require the same fixup under DataParallel as the training module

* another fix
2020-07-20 09:29:12 -04:00
Sam Shleifer
09a2f40684
Seq2SeqDataset uses linecache to save memory by @Pradhy729 (#5792)
Co-authored-by: Pradhy729 <49659913+Pradhy729@users.noreply.github.com>
2020-07-18 13:57:33 -04:00
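The linecache idea named in the commit above can be sketched roughly as follows. This is a minimal illustration and not the code from #5792: the class name `LineCacheDataset` and its layout are hypothetical, and the only library call assumed is the standard-library `linecache.getline`.

```python
import linecache


class LineCacheDataset:
    """Hypothetical sketch: fetch examples by line number instead of keeping a list of lines."""

    def __init__(self, path):
        self.path = str(path)
        # Count the lines once; the text itself is not stored on this object.
        with open(self.path, encoding="utf-8") as f:
            self.length = sum(1 for _ in f)

    def __len__(self):
        return self.length

    def __getitem__(self, index):
        if not 0 <= index < self.length:
            raise IndexError(index)
        # linecache line numbers are 1-based; strip the trailing newline.
        return linecache.getline(self.path, index + 1).rstrip("\n")
```

Under these assumptions, `LineCacheDataset("train.source")[0]` would return the first line of the file without the dataset object holding the file contents itself.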
Sam Shleifer
dad5e12e54
[seq2seq] distillation.py accepts trainer arguments (#5865) 2020-07-18 07:43:57 -04:00
Sam Shleifer
ba2400189b
[seq2seq] MAX_LEN env var for MT commands (#5837) 2020-07-17 22:51:31 -04:00
Nathan Raw
529850ae7b
Lightning Updates for v0.8.5 (#5798)
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-07-17 22:43:06 -04:00
Sam Shleifer
e238e3d55a
[seq2seq] Don't copy self.source in sortishsampler (#5818) 2020-07-17 01:53:25 -04:00
Sam Shleifer
283500ff9f
[seq2seq] pack_dataset.py rewrites dataset in max_tokens format (#5819) 2020-07-16 14:06:49 -04:00
Sam Shleifer
1a647abf0b
[fix] check code quality (#5772) 2020-07-15 14:59:38 -04:00
Sam Shleifer
d0486c8bc2
[cleanup] T5 test, warnings (#5761) 2020-07-15 08:23:22 -04:00
Boris Dayma
4d5a8d6557
docs(wandb): explain how to use W&B integration (#5607)
* docs(wandb): explain how to use W&B integration

fix #5262

* Also mention TensorBoard

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-07-14 05:12:33 -04:00
Julien Chaumond
201d23f285 Update The Big Table of Tasks
Co-Authored-By: Suraj Patil <surajp815@gmail.com>
Co-Authored-By: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-07-10 18:07:29 +02:00
Lysandre Debut
0533cf4706
Test XLA examples (#5583)
* Test XLA examples

* Style

* Using `require_torch_tpu`

* Style

* No need for pytest
2020-07-09 09:19:19 -04:00
Ji Xin
cfbb982974
Add DeeBERT (entropy-based early exiting for *BERT) (#5477)
* Add deebert code

* Add readme of deebert

* Add test for deebert

Update test for Deebert

* Update DeeBert (README, class names, function refactoring); remove requirements.txt

* Format update

* Update test

* Update readme and model init methods
2020-07-08 08:17:59 +08:00
Patrick von Platen
fde217c679
readme for benchmark (#5363) 2020-07-07 23:21:23 +02:00
Sam Shleifer
353b8f1e7a
Add mbart-large-cc25, support translation finetuning (#5129)
improve unittests for finetuning, especially w.r.t testing frozen parameters
fix freeze_embeds for T5
add streamlit setup.cfg
2020-07-07 13:23:01 -04:00
Patrick von Platen
4dc65591b5
[Almost all TF models] TF clean up: add missing CLM / MLM loss; fix T5 naming and keras compile (#5395)
* add first version of clm tf

* make style

* add more tests for bert

* update tf clm loss

* fix tests

* correct tf ner script

* add mlm loss

* delete bogus file

* clean tf auto model + add tests

* finish adding clm loss everywhere

* fix training in distilbert

* fix flake8

* save intermediate

* fix tf t5 naming

* remove prints

* finish up

* up

* fix tf gpt2

* fix new test utils import

* fix flake8

* keep backward compatibility

* Update src/transformers/modeling_tf_albert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_auto.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_electra.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_roberta.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_mobilebert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_auto.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_bert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_distilbert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* apply sylvains suggestions

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-07-07 18:15:53 +02:00
Suraj Patil
e49393c361
[examples] Add trainer support for question-answering (#4829)
* add SquadDataset

* add DataCollatorForQuestionAnswering

* update __init__

* add run_squad with  trainer

* add DataCollatorForQuestionAnswering in __init__

* pass data_collator to trainer

* doc tweak

* Update run_squad_trainer.py

* Update __init__.py

* Update __init__.py

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-07-07 08:57:08 -04:00
Shashank Gupta
3dcb748e31
Added data collator for permutation (XLNet) language modeling and related calls (#5522)
* Added data collator for XLNet language modeling and related calls

Added DataCollatorForXLNetLanguageModeling in data/data_collator.py
to generate necessary inputs for language modeling training with
XLNetLMHeadModel. Also added related arguments, logic and calls in
examples/language-modeling/run_language_modeling.py.

Resolves: #4739, #2008 (partially)

* Changed name to `DataCollatorForPermutationLanguageModeling`

Changed the name of `DataCollatorForXLNetLanguageModeling` to the more general `DataCollatorForPermutationLanguageModeling`.
Removed the `--mlm` flag requirement for the new collator and defined a separate `--plm_probability` flag for its use.
CTRL uses a CLM loss just like GPT and GPT-2, so should work out of the box with this script (provided `past` is taken care of
similar to `mems` for XLNet).
Changed calls and imports appropriately.

* Added detailed comments, changed variable names

Added more detailed comments to `DataCollatorForPermutationLanguageModeling` in `data/data_collator.py` to explain working. Also cleaned up variable names and made them more informative.

* Added tests for new data collator

Added tests in `tests/test_trainer.py` for DataCollatorForPermutationLanguageModeling based on those in DataCollatorForLanguageModeling. A specific test has been added to check for odd-length sequences.

* Fixed styling issues
2020-07-07 10:17:37 +02:00
Lysandre Debut
9d9b872b66
The add_space_before_punct_symbol is only for TransfoXL (#5549) 2020-07-06 12:17:05 -04:00
Sylvain Gugger
734a28a767
Clean up diffs in Trainer/TFTrainer (#5417)
* Cleanup and unify Trainer/TFTrainer

* Forgot to adapt TFTrainingArgs

* In tf scripts n_gpu -> n_replicas

* Update src/transformers/training_args.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Address review comments

* Formatting

* Fix typo

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-07-01 11:00:20 -04:00
Sam Shleifer
13deb95a40
Move tests/utils.py -> transformers/testing_utils.py (#5350) 2020-07-01 10:31:17 -04:00
Sylvain Gugger
4ade7491f4
Fix examples titles and optimization doc page (#5408) 2020-07-01 08:11:25 -04:00
Hong Xu
501040fd30
In the run_ner.py example, give the optional label arg a default value (#5326)
Otherwise, if label is not specified, the following error occurs:

	Traceback (most recent call last):
	  File "run_ner.py", line 303, in <module>
	    main()
	  File "run_ner.py", line 101, in main
	    model_args, data_args, training_args = parser.parse_json_file(json_file=os.path.abspath(sys.argv[1]))
	  File "/home/user/anaconda3/envs/bert/lib/python3.7/site-packages/transformers/hf_argparser.py", line 159, in parse_json_file
	    obj = dtype(**inputs)
	TypeError: __init__() missing 1 required positional argument: 'labels'
2020-06-30 19:45:35 -04:00
Sam Shleifer
27a7fe7a8d
examples/seq2seq: never override $WANDB_PROJECT (#5407) 2020-06-30 15:29:13 -04:00
Kevin Canwen Xu
331d8d2936
Upload DistilBART artwork (#5394) 2020-06-30 18:11:11 +08:00
MichaelJanz
9a473f1e43
Update Bertabs example to work again (#5355)
* Fix the bug 'Attempted relative import with no known parent package' when using the bertabs example. Also change the used model from bertabs-finetuned-cnndm, since it no longer seems to be accessible

* Update run_summarization.py

Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
2020-06-30 14:05:01 +08:00
Sam Shleifer
a316a6aaa8
[seq2seq docs] Move evaluation down, fix typo (#5365) 2020-06-29 10:36:04 -04:00
Patrick von Platen
4bcc35cd69
[Docs] Benchmark docs (#5360)
* first doc version

* add benchmark docs

* fix typos

* improve README

* Update docs/source/benchmarks.rst

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* fix naming and docs

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-06-29 16:08:57 +02:00
Sam Shleifer
45e26125de
save_pretrained: mkdir(exist_ok=True) (#5258)
* all save_pretrained methods mkdir if not os.path.exists
2020-06-28 14:53:47 -04:00
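The `mkdir(exist_ok=True)` pattern from the commit above can be illustrated with a small sketch. It is not the library's `save_pretrained` implementation: the helper name `save_config` and the `config.json` file name are assumptions chosen for the example, and only standard-library calls are used.

```python
import json
import os


def save_config(config, save_directory):
    """Hypothetical helper: write a config dict into save_directory, creating it if needed."""
    if os.path.isfile(save_directory):
        raise ValueError(f"{save_directory} is a file, not a directory")
    # exist_ok=True makes this a no-op when the directory already exists,
    # so callers do not need an os.path.exists check first.
    os.makedirs(save_directory, exist_ok=True)
    output_path = os.path.join(save_directory, "config.json")
    with open(output_path, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=2, sort_keys=True)
    return output_path
```

Calling the helper twice with the same directory should succeed both times, which is the behavior the commit title describes.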
Suraj Patil
12dfbd4f7a
[examples] fix example links (#5344) 2020-06-28 12:54:54 -04:00
Sam Shleifer
393b8dc09a
examples/seq2seq/run_eval.py fixes and docs (#5322) 2020-06-26 19:20:43 -04:00
Sam Shleifer
5543b30aa6
[pl_examples] default warmup steps=0 (#5316) 2020-06-26 15:03:41 -04:00
Thomas Wolf
601d4d699c
[tokenizers] Updates data processors, docstring, examples and model cards to the new API (#5308)
* remove references to old API in docstring - update data processors

* style

* fix tests - better type checking error messages

* better type checking

* include awesome fix by @LysandreJik for #5310

* updated doc and examples
2020-06-26 19:48:14 +02:00
Patrick von Platen
79a82cc06a
[Benchmarks] improve Example Plotter (#5245)
* improve plotting

* better labels

* fix time plot
2020-06-26 15:00:14 +02:00
Lysandre Debut
7cc15bdd96
Closes #5218 2020-06-25 18:19:21 -04:00
Sam Shleifer
e008d520bb
[examples/seq2seq] more README improvements (#5274) 2020-06-25 10:13:01 -04:00
Sam Shleifer
40457bcebb
examples/seq2seq supports translation (#5202) 2020-06-24 23:58:11 -04:00
Victor SANH
4965aee064
[HANS] Fix label_list for RoBERTa/BART (class flipping) (#5196)
* fix weirdness in roberta/bart for mnli trained checkpoints

* black compliance

* isort code check
2020-06-24 14:38:15 -04:00
Patrick von Platen
9fe09cec76
[Benchmark] Extend Benchmark to all model type extensions (#5241)
* add benchmark for all kinds of models

* improved import

* delete bogus files

* make style
2020-06-24 15:11:42 +02:00
Sylvain Gugger
7c41057d50
Add hugs (#5225) 2020-06-24 07:56:14 -04:00
Sylvain Gugger
5e85b324ec
Use the script in utils (#5224) 2020-06-24 07:55:58 -04:00
Kevin Canwen Xu
54e9ce785d
Fix PABEE division by zero error (#5233)
* Fix PABEE division by zero error

* patience=0 by default
2020-06-24 16:10:36 +08:00
Sam Shleifer
76e5af4cfd
[pl_examples] revert deletion of optimizer_step (#5227) 2020-06-23 16:40:45 -04:00
Sam Shleifer
f5c2a122e3
Upgrade examples to pl=0.8.1 (#5146) 2020-06-22 20:40:10 -04:00
Patrick von Platen
fa0be6d761
Benchmarks (#4912)
* finish benchmark

* fix isort

* fix setup cfg

* retab

* fix time measuring of tf graph mode

* fix tf cuda

* clean code

* better error message
2020-06-22 12:06:56 +02:00
Ilya Boytsov
bc3a0c0607
[examples] fixes arguments for summarization finetune scripts (#5157)
Authored-by: i.boytsov <i.boytsov@MAC867.local>
2020-06-21 11:51:21 -04:00
Kevin Canwen Xu
c0c577cf8f
Fix PABEE's result table (#5158) 2020-06-20 22:56:39 +08:00
Kevin Canwen Xu
2fd28d4363
Add BERT Loses Patience (Patience-based Early Exit) (#5078)
* Add BERT Loses Patience (Patience-based Early Exit)

* update model archive

* update format

* sort import

* flake8

* Add results

* full results

* align the table

* refactor to inherit

* default per gpu eval = 1

* Formatting

* Formatting

* isort

* modify readme

* Add check

* Fix format

* Fix format

* Doc strings

* ALBERT & BERT for sequence classification don't inherit from the original anymore

* Remove incorrect comments

* Remove incorrect comments

* Remove incorrect comments

* Sync up with new code

* Sync up with new code

* Add a test

* Add a test

* Add a test

* Add a test

* Add a test

* Add a test

* Finishing up!
2020-06-20 13:41:46 +08:00
Sam Shleifer
2db1e2f415
[cleanup] remove redundant code in SummarizationDataset (#5119) 2020-06-18 20:34:48 -04:00
Lysandre
efeb75b805 Remove misleading comment
closes #4958
2020-06-17 18:24:35 -04:00
Sam Shleifer
f1a3d03741
add pandas to setup.cfg (#5093) 2020-06-17 16:39:17 -04:00
Pranav Dayanand Pawar
049e14f0e3
very minor spelling correction in script command (#5090)
actual script name - counts_parameters.py
2020-06-17 16:08:43 -04:00
Sam Shleifer
043f9f51f9
[examples] SummarizationModule improvements (#4951) 2020-06-17 13:51:34 -04:00
Sylvain Gugger
cd40f6564e
Add header and fix command (#5082) 2020-06-17 11:45:05 -04:00
flozi00
af497b5672
Typo (#5069) 2020-06-16 16:46:20 -04:00
Yacine Jernite
49c5202522
Eli5 examples (#4968)
* add eli5 examples

* add dense query script

* query_di

* merging

* merging

* add_utils

* adds nearest neighbor wikipedia

* batch queries

* training_retriever

* new notebooks

* moved retriever training script

* finished wiki40b

* max_len_fix

* train_s2s

* retriever_batch_checkpointing

* cleanup

* merge

* dim_fix

* fix_indexer

* fix_wiki40b_snippets

* fix_embed_for_r

* fp32 index

* fix_sparse_q

* joint_training

* remove obsolete datasets

* add_passage_nn_results

* add_passage_nn_results

* add_batch_nn

* add_batch_nn

* add_data_scripts

* notebook

* notebook

* notebook

* fix_multi_gpu

* add_app

* full_caching

* full_caching

* notebook

* sparse_done

* images

* notebook

* add_image_gif

* with_Gif

* add_contr_image

* notebook

* notebook

* notebook

* train_functions

* notebook

* min_retrieval_length

* pandas_option

* notebook

* min_retrieval_length

* notebook

* notebook

* eval_Retriever

* notebook

* images

* notebook

* add_example

* add_example

* notebook

* fireworks

* notebook

* notebook

* joe's notebook comments

* app_update

* notebook

* notebook_link

* captions

* notebook

* assing RetriBert model

* add RetriBert to Auto

* change AutoLMHead to AutoSeq2Seq

* notebook downloads from hf models

* style_black

* style_black

* app_update

* app_update

* fix_app_update

* style

* style

* isort

* Delete WikiELI5training.ipynb

* Delete evaluate_eli5.py

* Delete WikiELI5explore.ipynb

* Delete ExploreWikiELI5Support.html

* Delete explainlikeimfive.py

* Delete wiki_snippets.py

* children before parent

* children before parent

* style_black

* style_black_only

* isort

* isort_new

* Update src/transformers/modeling_retribert.py

Co-authored-by: Julien Chaumond <chaumond@gmail.com>

* typo fixes

* app_without_asset

* cleanup

* Delete ELI5animation.gif

* Delete ELI5contrastive.svg

* Delete ELI5wiki_index.svg

* Delete choco_bis.svg

* Delete fireworks.gif

* Delete huggingface_logo.jpg

* Delete huggingface_logo.svg

* Delete Long_Form_Question_Answering_with_ELI5_and_Wikipedia.ipynb

* Delete eli5_app.py

* Delete eli5_utils.py

* readme

* Update README.md

* unused imports

* moved_info

* default_beam

* ftuned model

* disclaimer

* Update src/transformers/modeling_retribert.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* black

* add_doc

* names

* isort_Examples

* isort_Examples

* Add doc to index

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2020-06-16 16:36:58 -04:00
Sam Shleifer
c3e607496c
[cleanup] examples test_run_squad uses tiny model (#5059) 2020-06-16 14:06:45 -04:00
Sylvain Gugger
d5477baf7d
Convert hans to Trainer (#5025)
* Convert hans to Trainer

* Tick box
2020-06-16 08:06:31 -04:00
Anthony MOI
36434220fc
[HUGE] Refactoring tokenizers backend - padding - truncation - pre-tokenized pipeline - fast tokenizers - tests (#4510)
* Use tokenizers pre-tokenized pipeline

* failing pretokenized test

* Fix is_pretokenized in python

* add pretokenized tests

* style and quality

* better tests for batched pretokenized inputs

* tokenizers clean up - new padding_strategy - split the files

* [HUGE] refactoring tokenizers - padding - truncation - tests

* style and quality

* bump up required tokenizers version to 0.8.0-rc1

* switched padding/truncation API - simpler better backward compat

* updating tests for custom tokenizers

* style and quality - tests on pad

* fix QA pipeline

* fix backward compatibility for max_length only

* style and quality

* Various cleans up - add verbose

* fix tests

* update docstrings

* Fix tests

* Docs reformatted

* __call__ method documented

Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2020-06-15 17:12:51 -04:00
Sylvain Gugger
1affde2f10
Make DataCollator a callable (#5015)
* Make DataCollator a callable

* Update src/transformers/data/data_collator.py

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-06-15 11:58:33 -04:00
Stefan Schweter
d812e6d76e
NER: fix construction of input examples for RoBERTa (#4943)
* utils_ner: do not add extra sep token for RoBERTa model

* run_pl_ner: do not add extra sep token for RoBERTa model
2020-06-15 08:30:40 -04:00
Sylvain Gugger
403d309857
Hans data (#4854)
* Update hans data to be able to use Trainer

* Fixes

* Deal with tokenizers that don't have token_ids

* Clean up things

* Simplify data use

* Fix the input dict

* Formatting + proper path in README
2020-06-13 09:35:13 -04:00
VictorSanh
473808da0d update mvmt-pruning/saving_prunebert (updating torch to 1.5) 2020-06-11 19:42:45 +00:00
Sylvain Gugger
e8db8b845a
Remove unused arguments in Multiple Choice example (#4853)
* Remove unused arguments

* Formatting

* Remove second todo comment
2020-06-09 20:05:09 -04:00
songyouwei
29c36e9f36
run_pplm.py bug fix (#4867)
`is_leaf` may become `False` after `.to(device=device)` function call.
2020-06-09 19:14:27 -04:00
Sam Shleifer
f90bc44d9a
[examples] Cleanup summarization docs (#4876) 2020-06-09 17:38:28 -04:00
Amil Khare
02e5f79662
[examples] consolidate summarization examples (#4837) 2020-06-09 11:14:12 -04:00
daniel-shan
b6f365a8ed
Updates args in tf squad example. (#4820)
Co-authored-by: Daniel Shan <daniel.shan@workday.com>
2020-06-08 05:36:09 -04:00
Mr Ruben
ddf9a3dfc7
Updated path "cd examples/text-generation/pplm" (#4778)
https://github.com/huggingface/transformers/issues/4776
2020-06-05 21:16:48 -04:00
Sam Shleifer
875288b344
[isort] add matplotlib to known 3rd party dependencies (#4800) 2020-06-05 17:27:31 -04:00
Julien Chaumond
b9109f2de1 [doc] Make it clearer that text-generation does not involve training 2020-06-05 14:59:22 +02:00
Stefan Schweter
2a4b9e09c0
NER: Add new WNUT’17 example (#4681)
* ner: add preprocessing script for examples that splits longer sentences

* ner: example shell scripts use local preprocessing now

* ner: add new example section for WNUT’17 NER task. Remove old English CoNLL-03 results

* ner: satisfy black and isort
2020-06-04 19:13:17 -04:00
prajjwal1
48a05026de removed deprecated use of Variable api from pplm example 2020-06-04 18:07:49 -04:00
Jason Phang
492b352ab6
Remove unnecessary model_type arg in example (#4771) 2020-06-04 13:41:24 -04:00
Jin Young Sohn
b231a413f5
Add cache_dir to save features in GLUE + Differentiate match/mismatch for MNLI metrics (#4621)
* Glue task cleanup

* Enable writing cache to cache_dir in case dataset lives in a read-only filesystem.
* Differentiate match vs mismatch for MNLI metrics.

* Style

* Fix pytype

* Fix type

* Use cache_dir in mnli mismatch eval dataset

* Small Tweaks

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-06-02 13:40:14 -04:00
Julien Chaumond
b42586ea56
Fix CI after killing archive maps (#4724)
* 🐛 Fix model ids for BART and Flaubert
2020-06-02 10:21:09 -04:00
Julien Chaumond
d4c2cb402d
Kill model archive maps (#4636)
* Kill model archive maps

* Fixup

* Also kill model_archive_map for MaskedBertPreTrainedModel

* Unhook config_archive_map

* Tokenizers: align with model id changes

* make style && make quality

* Fix CI
2020-06-02 09:39:33 -04:00
Lysandre Debut
88762a2f8c
Specify PyTorch versions for examples (#4710) 2020-06-02 04:29:28 -04:00
Victor SANH
bf760c80b5 finish README 2020-06-01 09:23:31 -04:00
Victor SANH
9d7d9b3ae0 weird import 2020-06-01 09:23:31 -04:00
Victor SANH
2a3c88a659 Update examples/movement-pruning/README.md
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-06-01 09:23:31 -04:00
Victor SANH
4ac462bfb8 Update examples/movement-pruning/README.md
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-06-01 09:23:31 -04:00
Victor SANH
35fa0bbca0 clarify README 2020-06-01 09:23:31 -04:00
Victor SANH
cc746a5020 flake8 compliance 2020-06-01 09:23:31 -04:00
Victor SANH
b11386e158 less prints in saving prunebert 2020-06-01 09:23:31 -04:00
Victor SANH
8b5d4003ab complete README 2020-06-01 09:23:31 -04:00
Victor SANH
5c8e5b3709 complying with isort 2020-06-01 09:23:31 -04:00
Victor SANH
db2a3b2e01 space 2020-06-01 09:23:31 -04:00
Victor SANH
5f8f2d849a add floppy bert model notebook 2020-06-01 09:23:31 -04:00
Victor SANH
b41948f5cd add requirements 2020-06-01 09:23:31 -04:00
Victor SANH
fb8f4277b2 add scripts 2020-06-01 09:23:31 -04:00
Victor SANH
d489a6d3d5 add masked_run_* 2020-06-01 09:23:31 -04:00
Victor SANH
e4c07faf0a add sparsity modules 2020-06-01 09:23:31 -04:00
Patrick von Platen
96f57c9ccb
[Benchmark] Memory benchmark utils (#4198)
* improve memory benchmarking

* correct typo

* fix current memory

* check torch memory allocated

* better pytorch function

* add total cached gpu memory

* add total gpu required

* improve torch gpu usage

* update memory usage

* finalize memory tracing

* save intermediate benchmark class

* fix conflict

* improve benchmark

* improve benchmark

* finalize

* make style

* improve benchmarking

* correct typo

* make train function more flexible

* fix csv save

* better repr of bytes

* better print

* fix __repr__ bug

* finish plot script

* rename plot file

* delete csv and small improvements

* fix in plot

* fix in plot

* correct usage of timeit

* remove redundant line

* remove redundant line

* fix bug

* add hf parser tests

* add versioning and platform info

* make style

* add gpu information

* ensure backward compatibility

* finish adding all tests

* Update src/transformers/benchmark/benchmark_args.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/benchmark/benchmark_args_utils.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* delete csv files

* fix isort ordering

* add out of memory handling

* add better train memory handling

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-05-27 23:22:16 +02:00
Lysandre Debut
6a17688021
per_device instead of per_gpu/error thrown when argument unknown (#4618)
* per_device instead of per_gpu/error thrown when argument unknown

* [docs] Restore examples.md symlink

* Correct absolute links so that symlink to the doc works correctly

* Update src/transformers/hf_argparser.py

Co-authored-by: Julien Chaumond <chaumond@gmail.com>

* Warning + reorder

* Docs

* Style

* not for squad

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-05-27 11:36:55 -04:00
Hao Tan
a9aa7456ac
Add back --do_lower_case to uncased models (#4245)
The option `--do_lower_case` is currently required by the uncased models (i.e., bert-base-uncased, bert-large-uncased).

Results:
BERT-BASE without --do_lower_case:  'exact': 73.83, 'f1': 82.22
BERT-BASE with --do_lower_case:  'exact': 81.02, 'f1': 88.34
2020-05-26 21:13:07 -04:00
Antonis Maronikolakis
50d1ce411f
add DistilBERT to supported models (#4558) 2020-05-25 14:50:45 -04:00
Zhangyx
49296533ca
Adds predict stage for glue tasks, and generate result files which can be submitted to gluebenchmark.com (#4463)
* Adds predict stage for glue tasks, and generate result files which could be submitted to gluebenchmark.com website.

* Use Split enum + always output the label name

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-05-21 09:17:44 -04:00
Tobias Lee
271bedb485
[examples] fix no grad in second pruning in run_bertology (#4479)
* fix no grad in second pruning and typo

* fix prune heads attention mismatch problem

* fix

* fix

* fix

* run make style

* run make style
2020-05-21 09:17:03 -04:00
Patrick von Platen
aa925a52fa
[Tests, GPU, SLOW] fix a bunch of GPU hardcoded tests in Pytorch (#4468)
* fix gpu slow tests in pytorch

* change model to device syntax
2020-05-19 21:35:04 +02:00
Julien Chaumond
5e7fe8b585
Distributed eval: SequentialDistributedSampler + gather all results (#4243)
* Distributed eval: SequentialDistributedSampler + gather all results

* For consistency only write to disk from world_master

Close https://github.com/huggingface/transformers/issues/4272

* Working distributed eval

* Hook into scripts

* Fix #3721 again

* TPU.mesh_reduce: stay in tensor space

Thanks @jysohn23

* Just a small comment

* whitespace

* torch.hub: pip install packaging

* Add test scenarii
2020-05-18 22:02:39 -04:00
Boris Dayma
d9ece8233d
fix(run_language_modeling): use arg overwrite_cache (#4407) 2020-05-18 11:37:35 -04:00
Julien Chaumond
757baee846 Fix un-prefixed f-string
see https://github.com/huggingface/transformers/pull/4367#discussion_r426356693

Hat/tip @girishponkiya
2020-05-18 11:20:46 -04:00
Julien Chaumond
15550ce0d1 [skip ci] remove local rank 2020-05-15 17:08:38 -04:00
Lysandre Debut
edf9ac11d4
Should return overflowing information for the log (#4385) 2020-05-15 09:49:11 -04:00
Julien Chaumond
af2e6bf87c [examples] Streamline doc 2020-05-14 20:34:31 -04:00