Commit Graph

1420 Commits

Author SHA1 Message Date
Wissam Antoun
fd7b6a5274
fixed JSON error in run_qa with fp16 (#9186) 2020-12-18 07:53:23 -05:00
Manuel Romero
66a14a2f6f
Fix link to old NER fine-tuning script (#9182) 2020-12-17 19:50:01 -05:00
Stas Bekman
f06d0fadc9
[trainer] apex fixes and tests (#9180) 2020-12-17 16:49:11 -08:00
Stas Bekman
63841c559b
add tests for the new sharded ddp fairscale integration (#9177) 2020-12-17 14:24:03 -08:00
Sylvain Gugger
9a67185344
Experimental support for fairscale ShardedDDP (#9139)
* Experimental stupport for fairscale ShardedDDP

* Add import error if fairscale not available

* Address review comments

* Fix seq2seq trainer
2020-12-16 13:47:48 -05:00
Sylvain Gugger
4d48973523
Update notebook table and transformers intro notebook (#9136) 2020-12-16 10:24:31 -05:00
Patrick von Platen
640e6fe190
[Flax] Align FlaxBertForMaskedLM with BertForMaskedLM, implement from_pretrained, init (#9054)
* save intermediate

* save intermediate

* save intermediate

* correct flax bert model file

* new module / model naming

* make style

* almost finish BERT

* finish roberta

* make fix-copies

* delete keys file

* last refactor

* fixes in run_mlm_flax.py

* remove pooled from run_mlm_flax.py`

* fix gelu | gelu_new

* remove Module from inits

* splits

* dirty print

* preventing warmup_steps == 0

* smaller splits

* make fix-copies

* dirty print

* dirty print

* initial_evaluation argument

* declaration order fix

* proper model initialization/loading

* proper initialization

* run_mlm_flax improvements: improper model inputs bugfix + automatic dataset splitting + tokenizers parallelism warning + avoiding warmup_steps=0 bug

* removed tokenizers warning hack, fixed model re-initialization

* reverted training_args.py changes

* fix flax from pretrained

* improve test in flax

* apply sylvains tips

* update init

* make 0.3.0 compatible

* revert tevens changes

* revert tevens changes 2

* finalize revert

* fix bug

* add docs

* add pretrained to init

* Update src/transformers/modeling_flax_utils.py

* fix copies

* final improvements

Co-authored-by: TevenLeScao <teven.lescao@gmail.com>
2020-12-16 13:03:32 +01:00
Teven
2a7e8e1608
[Examples] Add automatic dataset splitting in language-modeling examples (#9133)
* replaced jnp.split + removing textual model inputs + ensuring warmup_steps > 0

* Add automatic dataset splitting in language-modeling examples
2020-12-15 16:02:43 -05:00
Stas Bekman
14c79c3e31
native amp leak fix landed in 1.7.1 (#9115)
update README with good news that the leak fix has been applied to pytorch-1.7.1.
2020-12-15 09:10:41 -05:00
Yoshitomo Matsubara
44c340f45f
fix a bug in eval_batch_retrieval (#9089) 2020-12-15 14:46:55 +01:00
Stas Bekman
c19d04623e
[finetune_trainer] enhancements and fixes (#9042)
* trainer and finetune_trainer enhancements and fixes

* add fallback default

* move the fixing of incorrect keys back into finetune trainer

* s/eval/val/ to match the split

* trainer can now use a different prefix than eval_ for metrics

* document new arg

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* use 'eval' as the default for metric_key_prefix

* complete adjust var names + disambiguate

* fix logger

* add clarifying comment

* add clarifying comment

* style

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/trainer.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* complete removal of optional for metric_key_prefix

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-12-14 17:45:33 -08:00
Sylvain Gugger
29e4597950
Fix min_null_pred in the run_qa script (#9067) 2020-12-11 16:26:05 -05:00
dependabot[bot]
24f6cdeab6
Bump notebook in /examples/research_projects/movement-pruning/lxmert (#9062)
Bumps [notebook](https://github.com/jupyter/jupyterhub) from 6.1.4 to 6.1.5.
- [Release notes](https://github.com/jupyter/jupyterhub/releases)
- [Changelog](https://github.com/jupyterhub/jupyterhub/blob/master/CHECKLIST-Release.md)
- [Commits](https://github.com/jupyter/jupyterhub/commits)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2020-12-11 10:32:43 -05:00
Sylvain Gugger
783d7d2629
Reorganize examples (#9010)
* Reorganize example folder

* Continue reorganization

* Change requirements for tests

* Final cleanup

* Finish regroup with tests all passing

* Copyright

* Requirements and readme

* Make a full link for the documentation

* Address review comments

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Add symlink

* Reorg again

* Apply suggestions from code review

Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>

* Adapt title

* Update to new strucutre

* Remove test

* Update READMEs

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
2020-12-11 10:07:02 -05:00
NatLun137
91ab02af28
Fix typo #9012 (#1) (#9038)
There is a tiny typo in the code "transformers/examples/language-modeling/run_mlm_wwm.py" at line 284. [Details.](https://github.com/huggingface/transformers/issues/9012)
2020-12-10 16:41:00 -05:00
Funtowicz Morgan
75627148ee
Flax Masked Language Modeling training example (#8728)
* Remove "Model" suffix from Flax models to look more 🤗

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Initial working (forward + backward) for Flax MLM training example.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Simply code

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Addressing comments, using module and moving to LM task.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Restore parameter name "module" wrongly renamed model.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Restore correct output ordering...

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Actually commit the example 😅

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Add FlaxBertModelForMaskedLM after rebasing.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Make it possible to initialize the training from scratch

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Reuse flax linen example of cross entropy loss

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Added specific data collator for flax

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Remove todo for data collator

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Added evaluation step

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Added ability to provide dtype to support bfloat16 on TPU

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Enable flax tensorboard output

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Enable jax.pmap support.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Ensure batches are correctly sized to be dispatched with jax.pmap

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Enable bfloat16 with --fp16 cmdline args

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Correctly export metrics to tensorboard

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Added dropout and ability to use it.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Effectively enable & disable during training and evaluation steps.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Oops.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Enable specifying kernel initializer scale

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Style.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Added warmup step to the learning rate scheduler.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Fix typo.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Print training loss

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Make style

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* fix linter issue (flake8)

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Fix model matching

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Fix dummies

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Fix non default dtype on Flax models

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Use the same create_position_ids_from_input_ids for FlaxRoberta

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Make Roberta attention as Bert

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* fix copy

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Wording.

Co-authored-by: Marc van Zee <marcvanzee@gmail.com>

Co-authored-by: Marc van Zee <marcvanzee@gmail.com>
2020-12-09 17:13:56 +01:00
Sylvain Gugger
447808c85f
New squad example (#8992)
* Add new SQUAD example

* Same with a task-specific Trainer

* Address review comment.

* Small fixes

* Initial work for XLNet

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Final clean up and working XLNet script

* Test and debug

* Final working version

* Add new SQUAD example

* Same with a task-specific Trainer

* Address review comment.

* Small fixes

* Initial work for XLNet

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Final clean up and working XLNet script

* Test and debug

* Final working version

* Add tick

* Update README

* Address review comments

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-12-08 14:39:29 -05:00
Sylvain Gugger
00aa9dbca2
Copyright (#8970)
* Add copyright everywhere missing

* Style
2020-12-07 18:36:34 -05:00
Sylvain Gugger
62d30e0583
Small fix to the run clm script (#8973) 2020-12-07 17:32:09 -05:00
Sylvain Gugger
7f9ccffc5b
Use word_ids to get labels in run_ner (#8962)
* Use word_ids to get labels in run_ner

* Add sanity check
2020-12-07 14:26:36 -05:00
Ethan Perez
8dfc8c7221
Don't pass in token_type_ids to BART for GLUE (#8929)
Without this fix, training a `BARTForSequenceClassification` model with `run_pl_glue.py` gives `TypeError: forward() got an unexpected keyword argument 'token_type_ids'`, because BART does not have token_type_ids. I've solved this issue in the same way as it's solved for the "distilbert" model, and I can train BART models on SNLI without errors now.
2020-12-05 09:52:16 -05:00
Stas Bekman
df311a5ccf
[seq2seq] document the caveat of leaky native amp (#8930)
* document the caveat of leaky native amp

* Update examples/seq2seq/README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-12-04 15:43:35 -08:00
Stas Bekman
4c3d98dddc
[s2s finetune_trainer] add instructions for distributed training (#8884) 2020-12-03 16:05:55 -08:00
Stas Bekman
379005c9d2
start using training_args.parallel_mode (#8882) 2020-12-01 11:40:36 -08:00
Stas Bekman
7f34d75780
[s2s trainer] fix DP mode (#8823)
* fix DP case on multi-gpu

* make executable

* test all 3 modes

* use the correct check for distributed

* dp doesn't need a special case

* restore original name

* cleanup
2020-11-30 12:55:56 -08:00
Sylvain Gugger
5530299096
Remove deprecated evalutate_during_training (#8852)
* Remove deprecated `evalutate_during_training`

* Update src/transformers/training_args_tf.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-11-30 11:12:15 -05:00
Stefan Schweter
19fa01ce2a
token-classification: use is_world_process_zero instead of deprecated is_world_master() (#8828) 2020-11-30 09:21:56 -05:00
Stas Bekman
ddf3c64654
potpurri of small fixes (#8807) 2020-11-26 14:06:27 -08:00
chutaklee
52708d2637
Fix PPLM (#8779)
* Fix pplm

* fix style

* make style

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-11-26 22:23:36 +01:00
Patrick von Platen
8f07f5c44b
Revert "finetune.py: specifying generation min_length (#8478)" (#8805)
This reverts commit 5aa361f3e5.
2020-11-26 20:12:01 +01:00
Daniel Khashabi
5aa361f3e5
finetune.py: specifying generation min_length (#8478) 2020-11-26 12:33:02 +05:30
Stas Bekman
82d443a7fd
[core] implement support for run-time dependency version checking (#8645)
* implement support for run-time dependency version checking

* try not escaping !

* use findall that works on py36

* small tweaks

* autoformatter worship

* simplify

* shorter names

* add support for non-versioned checks

* add deps

* revert

* tokenizers not required, check version only if installed

* make a proper distutils cmd and add make target

* tqdm must be checked before tokenizers

* workaround the DistributionNotFound peculiar setup

* handle the rest of packages in setup.py

* fully sync setup.py's install_requires - to check them all

* nit

* make install_requires more readable

* typo

* Update setup.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* restyle

* add types

* simplify

* simplify2

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-11-24 13:22:25 -05:00
Quentin Lhoest
a7d73cfdd4
fix rag index names in eval_rag.py example (#8730) 2020-11-24 17:04:47 +01:00
zhiheng-huang
2c83b3c38d
Support various BERT relative position embeddings (2nd) (#8276)
* Support BERT relative position embeddings

* Fix typo in README.md

* Address review comment

* Fix failing tests

* [tiny] Fix style_doc.py check by adding an empty line to configuration_bert.py

* make fix copies

* fix configs of electra and albert and fix longformer

* remove copy statement from longformer

* fix albert

* fix electra

* Add bert variants forward tests for various position embeddings

* [tiny] Fix style for test_modeling_bert.py

* improve docstring

* [tiny] improve docstring and remove unnecessary dependency

* [tiny] Remove unused import

* re-add to ALBERT

* make embeddings work for ALBERT

* add test for albert

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-11-24 14:40:53 +01:00
Sylvain Gugger
367f497dec
Fix max length in run_plm script (#8738) 2020-11-23 16:02:31 -05:00
Stas Bekman
1e45bef0a7
[trainer] make generate work with multigpu (#8716)
* make generate work with multigpu

* better fix - thanks @sgugger
2020-11-23 10:57:27 -08:00
Santiago Castro
e1f3156b21
Fix many typos (#8708) 2020-11-21 22:58:10 -05:00
Quentin Lhoest
8062fa63c5
Fix rag finetuning + add finetuning test (#8585)
* replace init_ddp_connection for index init

* style

* add finetune test

* add test data

* move generate tensors to device

* add test on EM metric

* style

* allow multi process test

* keep gloo process group for retrieval

* add multi-gpu test

* use custom accelerator

* clean test finetune

* minor

* style

* style

* typo

* use python call instead of imported main fumction

* return_dict fix in modeling_rag

* use float32 in retrieval

* store as float32 as well in the custom knowledge dataset example

* style

* rename to finetune_rag

* style

* update readme

* rename utils and callbacks to utils_rag and callbacks_rag

* fix test

* patrick's comments

* generate dummy data in the finetue test script

* remove dummy data files

* style
2020-11-20 19:05:03 +01:00
Stas Bekman
0ad45e108d
[examples/seq2seq] fix PL deprecation warning (#8577)
* fix deprecation warning

* fix
2020-11-19 21:46:04 +01:00
Sylvain Gugger
20b658607e
Fix run_ner script (#8664)
* Fix run_ner script

* Pin datasets
2020-11-19 13:59:30 -05:00
Sylvain Gugger
cb3e5c33f7
Fix a few last paths for the new repo org (#8666) 2020-11-19 11:56:42 -05:00
Matthias
a79a96ddaa
fix small typo (#8644)
Fixed a small typo on the XLNet and permutation language modelling section
2020-11-19 11:24:11 -05:00
Sylvain Gugger
4208f496ee
Better filtering of the model outputs in Trainer (#8633)
* Better filtering of the model outputs in Trainer

* Fix examples tests

* Add test for Lysandre
2020-11-19 10:43:15 -05:00
Quentin Lhoest
62cd9ce9f8
fix missing return dict (#8653) 2020-11-19 15:17:18 +01:00
Tim Isbister
28d16e7ac5
Update README.md (#8635) 2020-11-18 18:35:23 -05:00
Stas Bekman
d86d57faa3
[s2s] distillation apex breaks return_dict obj (#8631)
* apex breaks return_dict obj

* style
2020-11-18 12:51:29 -08:00
Sylvain Gugger
a0c62d2493
Fix training from scratch in new scripts (#8623) 2020-11-18 12:15:26 -05:00
Stas Bekman
cdf1b7ae82
fix to adjust for #8530 changes (#8612) 2020-11-18 10:25:00 -05:00
Stas Bekman
2819da02f7
[s2s] broken test (#8613) 2020-11-18 10:15:53 -05:00
Sylvain Gugger
dd52804f5f
Remove deprecated (#8604)
* Remove old deprecated arguments

Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>

* Remove needless imports

* Fix tests

Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
2020-11-17 15:11:29 -05:00
Stas Bekman
f0435f5a61
these should run fine on multi-gpu (#8582) 2020-11-17 14:00:41 -05:00
Julien Chaumond
042a6aa777
Tokenizers: ability to load from model subfolder (#8586)
* <small>tiny typo</small>

* Tokenizers: ability to load from model subfolder

* use subfolder for local files as well

* Uniformize model shortcut name => model id

* from s3 => from huggingface.co

Co-authored-by: Quentin Lhoest <lhoest.q@gmail.com>
2020-11-17 08:58:45 -05:00
Sylvain Gugger
c89bdfbe72
Reorganize repo (#8580)
* Put models in subfolders

* Styling

* Fix imports in tests

* More fixes in test imports

* Sneaky hidden imports

* Fix imports in doc files

* More sneaky imports

* Finish fixing tests

* Fix examples

* Fix path for copies

* More fixes for examples

* Fix dummy files

* More fixes for example

* More model import fixes

* Is this why you're unhappy GitHub?

* Fix imports in conver command
2020-11-16 21:43:42 -05:00
Sylvain Gugger
1073a2bde5
Switch return_dict to True by default. (#8530)
* Use the CI to identify failing tests

* Remove from all examples and tests

* More default switch

* Fixes

* More test fixes

* More fixes

* Last fixes hopefully

* Use the CI to identify failing tests

* Remove from all examples and tests

* More default switch

* Fixes

* More test fixes

* More fixes

* Last fixes hopefully

* Run on the real suite

* Fix slow tests
2020-11-16 11:43:00 -05:00
Thomas Wolf
f4e04cd2c6
[breaking|pipelines|tokenizers] Adding slow-fast tokenizers equivalence tests pipelines - Removing sentencepiece as a required dependency (#8073)
* Fixing roberta for slow-fast tests

* WIP getting equivalence on pipelines

* slow-to-fast equivalence - working on question-answering pipeline

* optional FAISS tests

* Pipeline Q&A

* Move pipeline tests to their own test job again

* update tokenizer to add sequence id methods

* update to tokenizers 0.9.4

* set sentencepiecce as optional

* clean up squad

* clean up pipelines to use sequence_ids

* style/quality

* wording

* Switch to use_fast = True by default

* update tests for use_fast at True by default

* fix rag tokenizer test

* removing protobuf from required dependencies

* fix NER test for use_fast = True by default

* fixing example tests (Q&A examples use slow tokenizers for now)

* protobuf in main deps extras["sentencepiece"] and example deps

* fix protobug install test

* try to fix seq2seq by switching to slow tokenizers for now

* Update src/transformers/tokenization_utils_base.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/tokenization_utils_base.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-11-15 22:50:59 +01:00
Julien Plu
27b3ff316a
Try to understand and apply Sylvain's comments (#8458) 2020-11-12 13:43:00 -05:00
zeyuyun1
924c624a46
quick fix on concatenating text to support more datasets (#8474) 2020-11-12 09:47:08 -05:00
Sumithra Bhakthavatsalam
81ebd70671
[s2s] distill t5-large -> t5-small (#8376)
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-11-11 17:58:45 -05:00
sarnoult
a38d1c7c31
Example NER script predicts on tokenized dataset (#8468)
The new run_ner.py script tries to run prediction on the input
test set `datasets["test"]`, but it should be the tokenized set
`tokenized_datasets["test"]`
2020-11-11 10:28:23 -05:00
Stas Bekman
02bdfc0251
using multi_gpu consistently (#8446)
* s|multiple_gpu|multi_gpu|g; s|multigpu|multi_gpu|g'

* doc
2020-11-10 13:23:58 -05:00
Stas Bekman
5d4972e608
[examples] better PL version check (#8429) 2020-11-10 09:33:23 -05:00
Shichao Sun
ae1cb4ec22
[s2s/distill] hparams.tokenizer_name = hparams.teacher (#8382) 2020-11-10 09:32:01 -05:00
Julien Chaumond
55e8d0cea2 Update links from s3 to huggingface.co 2020-11-10 14:03:29 +01:00
Stas Bekman
190df58560
[github CI] add a multi-gpu job for all example tests (#8341)
* add a multi-gpu job for all example tests

* run only ported tests

* rename

* explain why env is re-activated on each step

* mark all unported/checked tests with @require_torch_non_multigpu_but_fix_me

* style

* Apply suggestions from code review

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-11-09 15:47:38 -05:00
Patrick von Platen
9c83b96e62
[Tests] Add Common Test for Training + Fix a couple of bugs (#8415)
* add training tests

* correct longformer

* fix docs

* fix some tests

* fix some more train tests

* remove ipdb

* fix multiple edge case model training

* fix funnel and prophetnet

* clean gpt models

* undo renaming of albert
2020-11-09 18:24:41 +01:00
Sylvain Gugger
5c766ecb50 Fix typo 2020-11-09 11:50:51 -05:00
Sylvain Gugger
908a28894c
Add new token classification example (#8340)
* Add new token classification example

* Remove txt file

* Add test

* With actual testing done

* Less warmup is better

* Update examples/token-classification/run_ner_new.py

Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>

* Address review comments

* Fix test

* Make Lysandre happy

* Last touches and rename

* Rename in tests

* Address review comments

* More run_ner -> run_ner_old

Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
2020-11-09 11:39:55 -05:00
Sam Shleifer
ebde57acac
examples/docs: caveat that PL examples don't work on TPU (#8309) 2020-11-09 08:55:22 -05:00
Sam Shleifer
e6d9cdaafe
[s2s/distill] remove run_distiller.sh, fix xsum script (#8412) 2020-11-08 16:57:43 -05:00
Stas Bekman
66582492d3
[s2s test_finetune_trainer] failing multigpu test (#8400) 2020-11-08 16:45:40 -05:00
Stas Bekman
f62755a600
[s2s examples test] fix data path (#8398) 2020-11-08 16:44:18 -05:00
Jonathan Chang
5807ba3fa9
Fix typo (#8351) 2020-11-06 11:19:41 -05:00
Stas Bekman
9edafaebef
[s2s] test_bash_script.py - actually learn something (#8318)
* use decorator

* remove hardcoded paths

* make the test use more data and do real quality tests

* shave off 10 secs

* add --eval_beams 2, reformat

* reduce train size, use smaller custom dataset
2020-11-05 23:15:14 -05:00
Leandro von Werra
17450397a7
Docs bart training ref (#8330)
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-11-05 17:20:57 -05:00
Stas Bekman
d787935a14
[s2s] test_distributed_eval (#8315)
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-11-05 16:01:15 -05:00
Sam Shleifer
7abc1d96d1
no warn (#8329) 2020-11-05 11:42:24 -05:00
Bobby Donchev
52f44dd6d2
change TokenClassificationTask class methods to static methods (#7902)
* change TokenClassificationTask class methods to static methods

Since we do not require self in the class methods of TokenClassificationTask we should probably switch to static methods. Also, since the class TokenClassificationTask does not contain a constructor it is currently unusable as is. By switching to static methods this fixes the issue of having to document the intent of the broken class.

Also, since the get_labels and read_examples_from_file methods are ought to be implemented. Static method definitions are unchanged even after inheritance, which means that it can be overridden, similar to other class methods.

* Trigger Build

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2020-11-05 09:38:30 -05:00
Guillem García Subies
77c8f6c627
Corrected typo in readme (#8320) 2020-11-05 07:48:36 -05:00
Sylvain Gugger
9c4aa4ac1a
Clean up data collators and datasets (#8308)
* Clean up data collators and datasets

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Remove needless clone

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-11-04 17:24:49 -05:00
Manuel Romero
b1d3e95eb5
Fix path to old run_language_modeling.py script (#8302) 2020-11-04 13:17:57 -05:00
Sylvain Gugger
cf89724696
Fix validation file loading in scripts (#8298) 2020-11-04 10:42:18 -05:00
Pengzhi Gao
734afa37f6
Fix typo in language-modeling README.md (#8287) 2020-11-04 09:38:02 -05:00
Stas Bekman
1bb4bba53c
[CIs] Better reports everywhere (#8275)
* make it possible to invoke testconf.py in both test suites without crashing on having the same option added

* perl -pi -e 's|--make_reports|--make-reports|' to be consistent with other opts

* add `pytest --make-reports` to all CIs (and artifacts)

* fix
2020-11-03 16:57:12 -05:00
Patrick von Platen
068e6b5edd
make files independent (#8267) 2020-11-03 21:13:33 +01:00
Stas Bekman
cd360dcb26
[examples] minimal version requirement run-time check in PL (#8133)
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-11-03 13:17:11 -05:00
Lysandre
eb6313e823 Fix Tatoeba skip 2020-11-03 10:35:00 -05:00
Sam Shleifer
b63beb743c
Skip tatoeba tests if Tatoeba-Challenge not cloned (#8260) 2020-11-03 09:49:29 -05:00
Patrick von Platen
9f1747f999
[Seq2Seq] Correct import in Seq2Seq Trainer (#8254) 2020-11-03 07:56:41 -05:00
Sylvain Gugger
e1b1b614b1
Add line by line option to mlm/plm scripts (#8240)
* Make line by line optional in run_mlm

* Add option to disable dynamic padding

* Add option to plm too and update README

* Typos

* More typos

* Even more typos

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-11-02 12:27:04 -05:00
Patrick von Platen
9bd30f7cf4
[Seq2SeqTrainer] Move import to init to make file self-contained (#8194)
* boom boom

* reverse order
2020-11-01 23:31:55 +01:00
Sylvain Gugger
9eb3a410cd
Remove deprecated arguments from new run_clm (#8197) 2020-10-30 15:27:20 -04:00
Sylvain Gugger
cdc48ce92d
Finalize lm examples (#8188)
* Finish the cleanup of the language-modeling examples

* Update main README

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Apply suggestions from code review

Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>

* Propagate changes

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
2020-10-30 14:20:18 -04:00
wlhgtc
9a21b50614
Fix eval ref miss in Chinese WWM. (#8115)
* ADD: add whole word mask proxy for both eng and chinese

* MOD: adjust format

* MOD: reformat code

* MOD: update import

* MOD: fix bug

* MOD: add import

* MOD: fix bug

* MOD: decouple code and update readme

* MOD: reformat code

* Update examples/language-modeling/README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update examples/language-modeling/README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update examples/language-modeling/run_language_modeling.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update examples/language-modeling/run_language_modeling.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update examples/language-modeling/run_language_modeling.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update examples/language-modeling/run_language_modeling.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* change wwm to whole_word_mask

* reformat code

* reformat

* format

* Code quality

* ADD: update chinese ref readme

* MOD: small changes

* MOD: small changes2

* update readme

* fix eval ref file miss bug

* format file

* MOD: move ref code to contrib

* MOD: add delimeter check

* reformat code

* refomat code

* Update examples/language-modeling/README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-10-29 17:08:39 -04:00
Sylvain Gugger
691176283d
Add a template for examples and apply it for mlm and plm examples (#8153)
* Add a template for example scripts and apply it to mlm

* Formatting

* Fix test

* Add plm script

* Add a template for example scripts and apply it to mlm

* Formatting

* Fix test

* Add plm script

* Add a template for example scripts and apply it to mlm

* Formatting

* Fix test

* Add plm script

* Styling
2020-10-29 13:38:11 -04:00
Sam Shleifer
49e4fece5c
[s2s] distillBART docs for paper replication (#8150) 2020-10-29 12:01:15 -04:00
Sylvain Gugger
acf56408d8
Smarter prediction loop and no- -> no_ in console args (#8151)
* Smarter prediction loop and no- -> no_ in console args

* Fix test
2020-10-29 10:56:25 -04:00
Santiago Castro
969859d5f6
Fix doc errors and typos across the board (#8139)
* Fix doc errors and typos across the board

* Fix a typo

* Fix the CI

* Fix more typos

* Fix CI

* More fixes

* Fix CI

* More fixes

* More fixes
2020-10-29 10:33:33 -04:00
Stas Bekman
825925dfaa
[s2s test] cleanup (#8131) 2020-10-28 16:50:36 -04:00
Sean Naren
5e24982e58
Upgrade PyTorch Lightning to 1.0.2 (#7852)
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-10-28 14:59:14 -04:00
Sylvain Gugger
378142afdf
Rename add_start_docstrings_to_callable (#8120) 2020-10-28 13:42:31 -04:00