Commit Graph

1379 Commits

Author SHA1 Message Date
Sylvain Gugger
ec07da65e2
Update the README of the text classification example (#9237)
* Update the README of the text classification example

* Update examples/README.md

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Adapt comment from review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-12-21 15:23:40 -05:00
Teven
4eef5889ac
Adding performer fine-tuning research exampke (#9239)
* added run_mlm_performer.py research example

* make styke

* make styke

* Added a README !
2020-12-21 21:19:41 +01:00
Amog Kamsetty
a4b21cdd20
[RAG] Add Ray implementation for distributed retrieval (#9197)
* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* uncomment

* uncomment

* wip

* updates

* add docstring

* updates

* fix arg

* fixes

* add unit tests

* update readme

* update readme

* update finetune script

* update test

* add test

* add ray to test dependencies

* separate ray and ray tune

* formatting

* shutdown ray at end of test

* fix tests

* formatting

* formatting

* even more formatting

* address comments

* formatting

* add files

* Update examples/research_projects/rag/test_distributed_retriever.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* address comments

* addressing comments

Co-authored-by: Ubuntu <ubuntu@ip-172-31-21-208.us-west-2.compute.internal>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-12-21 10:39:30 +01:00
Stas Bekman
f38c4ad302
better logging and help (#9203) 2020-12-20 10:28:28 -08:00
Stas Bekman
6b850b671d
[run_glue] add speed metrics (#9198)
* add speed metrics

* suggestions
2020-12-18 17:09:30 -08:00
Aleksey Tikhonov
291974c65c
GPT-model attention heads pruning example (#9189)
* Pruning for GPT attn heads

* The code formatted according to the transformers requirements

* Update run_prune_gpt.py

* Update run_prune_gpt.py
2020-12-18 16:32:10 -05:00
Sylvain Gugger
1198ba8fba
Add timing inside Trainer (#9196)
* Add timing inside Trainer

* Fix tests

* Add n_objs for train

* Sort logs
2020-12-18 15:10:39 -05:00
Sylvain Gugger
9a25c5bd3a
Add new run_swag example (#9175)
* Add new run_swag example

* Add check

* Add sample

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Very important change to make Lysandre happy

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-12-18 14:19:24 -05:00
Manuel Romero
077a5dce32
Fix link to old SQUAD fine-tuning script (#9181) 2020-12-18 09:12:10 -05:00
Wissam Antoun
fd7b6a5274
fixed JSON error in run_qa with fp16 (#9186) 2020-12-18 07:53:23 -05:00
Manuel Romero
66a14a2f6f
Fix link to old NER fine-tuning script (#9182) 2020-12-17 19:50:01 -05:00
Stas Bekman
f06d0fadc9
[trainer] apex fixes and tests (#9180) 2020-12-17 16:49:11 -08:00
Stas Bekman
63841c559b
add tests for the new sharded ddp fairscale integration (#9177) 2020-12-17 14:24:03 -08:00
Sylvain Gugger
9a67185344
Experimental support for fairscale ShardedDDP (#9139)
* Experimental stupport for fairscale ShardedDDP

* Add import error if fairscale not available

* Address review comments

* Fix seq2seq trainer
2020-12-16 13:47:48 -05:00
Sylvain Gugger
4d48973523
Update notebook table and transformers intro notebook (#9136) 2020-12-16 10:24:31 -05:00
Patrick von Platen
640e6fe190
[Flax] Align FlaxBertForMaskedLM with BertForMaskedLM, implement from_pretrained, init (#9054)
* save intermediate

* save intermediate

* save intermediate

* correct flax bert model file

* new module / model naming

* make style

* almost finish BERT

* finish roberta

* make fix-copies

* delete keys file

* last refactor

* fixes in run_mlm_flax.py

* remove pooled from run_mlm_flax.py`

* fix gelu | gelu_new

* remove Module from inits

* splits

* dirty print

* preventing warmup_steps == 0

* smaller splits

* make fix-copies

* dirty print

* dirty print

* initial_evaluation argument

* declaration order fix

* proper model initialization/loading

* proper initialization

* run_mlm_flax improvements: improper model inputs bugfix + automatic dataset splitting + tokenizers parallelism warning + avoiding warmup_steps=0 bug

* removed tokenizers warning hack, fixed model re-initialization

* reverted training_args.py changes

* fix flax from pretrained

* improve test in flax

* apply sylvains tips

* update init

* make 0.3.0 compatible

* revert tevens changes

* revert tevens changes 2

* finalize revert

* fix bug

* add docs

* add pretrained to init

* Update src/transformers/modeling_flax_utils.py

* fix copies

* final improvements

Co-authored-by: TevenLeScao <teven.lescao@gmail.com>
2020-12-16 13:03:32 +01:00
Teven
2a7e8e1608
[Examples] Add automatic dataset splitting in language-modeling examples (#9133)
* replaced jnp.split + removing textual model inputs + ensuring warmup_steps > 0

* Add automatic dataset splitting in language-modeling examples
2020-12-15 16:02:43 -05:00
Stas Bekman
14c79c3e31
native amp leak fix landed in 1.7.1 (#9115)
update README with good news that the leak fix has been applied to pytorch-1.7.1.
2020-12-15 09:10:41 -05:00
Yoshitomo Matsubara
44c340f45f
fix a bug in eval_batch_retrieval (#9089) 2020-12-15 14:46:55 +01:00
Stas Bekman
c19d04623e
[finetune_trainer] enhancements and fixes (#9042)
* trainer and finetune_trainer enhancements and fixes

* add fallback default

* move the fixing of incorrect keys back into finetune trainer

* s/eval/val/ to match the split

* trainer can now use a different prefix than eval_ for metrics

* document new arg

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* use 'eval' as the default for metric_key_prefix

* complete adjust var names + disambiguate

* fix logger

* add clarifying comment

* add clarifying comment

* style

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/trainer.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* complete removal of optional for metric_key_prefix

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-12-14 17:45:33 -08:00
Sylvain Gugger
29e4597950
Fix min_null_pred in the run_qa script (#9067) 2020-12-11 16:26:05 -05:00
dependabot[bot]
24f6cdeab6
Bump notebook in /examples/research_projects/movement-pruning/lxmert (#9062)
Bumps [notebook](https://github.com/jupyter/jupyterhub) from 6.1.4 to 6.1.5.
- [Release notes](https://github.com/jupyter/jupyterhub/releases)
- [Changelog](https://github.com/jupyterhub/jupyterhub/blob/master/CHECKLIST-Release.md)
- [Commits](https://github.com/jupyter/jupyterhub/commits)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2020-12-11 10:32:43 -05:00
Sylvain Gugger
783d7d2629
Reorganize examples (#9010)
* Reorganize example folder

* Continue reorganization

* Change requirements for tests

* Final cleanup

* Finish regroup with tests all passing

* Copyright

* Requirements and readme

* Make a full link for the documentation

* Address review comments

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Add symlink

* Reorg again

* Apply suggestions from code review

Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>

* Adapt title

* Update to new strucutre

* Remove test

* Update READMEs

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
2020-12-11 10:07:02 -05:00
NatLun137
91ab02af28
Fix typo #9012 (#1) (#9038)
There is a tiny typo in the code "transformers/examples/language-modeling/run_mlm_wwm.py" at line 284. [Details.](https://github.com/huggingface/transformers/issues/9012)
2020-12-10 16:41:00 -05:00
Funtowicz Morgan
75627148ee
Flax Masked Language Modeling training example (#8728)
* Remove "Model" suffix from Flax models to look more 🤗

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Initial working (forward + backward) for Flax MLM training example.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Simply code

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Addressing comments, using module and moving to LM task.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Restore parameter name "module" wrongly renamed model.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Restore correct output ordering...

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Actually commit the example 😅

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Add FlaxBertModelForMaskedLM after rebasing.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Make it possible to initialize the training from scratch

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Reuse flax linen example of cross entropy loss

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Added specific data collator for flax

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Remove todo for data collator

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Added evaluation step

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Added ability to provide dtype to support bfloat16 on TPU

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Enable flax tensorboard output

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Enable jax.pmap support.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Ensure batches are correctly sized to be dispatched with jax.pmap

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Enable bfloat16 with --fp16 cmdline args

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Correctly export metrics to tensorboard

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Added dropout and ability to use it.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Effectively enable & disable during training and evaluation steps.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Oops.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Enable specifying kernel initializer scale

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Style.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Added warmup step to the learning rate scheduler.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Fix typo.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Print training loss

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Make style

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* fix linter issue (flake8)

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Fix model matching

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Fix dummies

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Fix non default dtype on Flax models

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Use the same create_position_ids_from_input_ids for FlaxRoberta

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Make Roberta attention as Bert

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* fix copy

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Wording.

Co-authored-by: Marc van Zee <marcvanzee@gmail.com>

Co-authored-by: Marc van Zee <marcvanzee@gmail.com>
2020-12-09 17:13:56 +01:00
Sylvain Gugger
447808c85f
New squad example (#8992)
* Add new SQUAD example

* Same with a task-specific Trainer

* Address review comment.

* Small fixes

* Initial work for XLNet

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Final clean up and working XLNet script

* Test and debug

* Final working version

* Add new SQUAD example

* Same with a task-specific Trainer

* Address review comment.

* Small fixes

* Initial work for XLNet

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Final clean up and working XLNet script

* Test and debug

* Final working version

* Add tick

* Update README

* Address review comments

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-12-08 14:39:29 -05:00
Sylvain Gugger
00aa9dbca2
Copyright (#8970)
* Add copyright everywhere missing

* Style
2020-12-07 18:36:34 -05:00
Sylvain Gugger
62d30e0583
Small fix to the run clm script (#8973) 2020-12-07 17:32:09 -05:00
Sylvain Gugger
7f9ccffc5b
Use word_ids to get labels in run_ner (#8962)
* Use word_ids to get labels in run_ner

* Add sanity check
2020-12-07 14:26:36 -05:00
Ethan Perez
8dfc8c7221
Don't pass in token_type_ids to BART for GLUE (#8929)
Without this fix, training a `BARTForSequenceClassification` model with `run_pl_glue.py` gives `TypeError: forward() got an unexpected keyword argument 'token_type_ids'`, because BART does not have token_type_ids. I've solved this issue in the same way as it's solved for the "distilbert" model, and I can train BART models on SNLI without errors now.
2020-12-05 09:52:16 -05:00
Stas Bekman
df311a5ccf
[seq2seq] document the caveat of leaky native amp (#8930)
* document the caveat of leaky native amp

* Update examples/seq2seq/README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-12-04 15:43:35 -08:00
Stas Bekman
4c3d98dddc
[s2s finetune_trainer] add instructions for distributed training (#8884) 2020-12-03 16:05:55 -08:00
Stas Bekman
379005c9d2
start using training_args.parallel_mode (#8882) 2020-12-01 11:40:36 -08:00
Stas Bekman
7f34d75780
[s2s trainer] fix DP mode (#8823)
* fix DP case on multi-gpu

* make executable

* test all 3 modes

* use the correct check for distributed

* dp doesn't need a special case

* restore original name

* cleanup
2020-11-30 12:55:56 -08:00
Sylvain Gugger
5530299096
Remove deprecated evalutate_during_training (#8852)
* Remove deprecated `evalutate_during_training`

* Update src/transformers/training_args_tf.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-11-30 11:12:15 -05:00
Stefan Schweter
19fa01ce2a
token-classification: use is_world_process_zero instead of deprecated is_world_master() (#8828) 2020-11-30 09:21:56 -05:00
Stas Bekman
ddf3c64654
potpurri of small fixes (#8807) 2020-11-26 14:06:27 -08:00
chutaklee
52708d2637
Fix PPLM (#8779)
* Fix pplm

* fix style

* make style

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-11-26 22:23:36 +01:00
Patrick von Platen
8f07f5c44b
Revert "finetune.py: specifying generation min_length (#8478)" (#8805)
This reverts commit 5aa361f3e5.
2020-11-26 20:12:01 +01:00
Daniel Khashabi
5aa361f3e5
finetune.py: specifying generation min_length (#8478) 2020-11-26 12:33:02 +05:30
Stas Bekman
82d443a7fd
[core] implement support for run-time dependency version checking (#8645)
* implement support for run-time dependency version checking

* try not escaping !

* use findall that works on py36

* small tweaks

* autoformatter worship

* simplify

* shorter names

* add support for non-versioned checks

* add deps

* revert

* tokenizers not required, check version only if installed

* make a proper distutils cmd and add make target

* tqdm must be checked before tokenizers

* workaround the DistributionNotFound peculiar setup

* handle the rest of packages in setup.py

* fully sync setup.py's install_requires - to check them all

* nit

* make install_requires more readable

* typo

* Update setup.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* restyle

* add types

* simplify

* simplify2

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-11-24 13:22:25 -05:00
Quentin Lhoest
a7d73cfdd4
fix rag index names in eval_rag.py example (#8730) 2020-11-24 17:04:47 +01:00
zhiheng-huang
2c83b3c38d
Support various BERT relative position embeddings (2nd) (#8276)
* Support BERT relative position embeddings

* Fix typo in README.md

* Address review comment

* Fix failing tests

* [tiny] Fix style_doc.py check by adding an empty line to configuration_bert.py

* make fix copies

* fix configs of electra and albert and fix longformer

* remove copy statement from longformer

* fix albert

* fix electra

* Add bert variants forward tests for various position embeddings

* [tiny] Fix style for test_modeling_bert.py

* improve docstring

* [tiny] improve docstring and remove unnecessary dependency

* [tiny] Remove unused import

* re-add to ALBERT

* make embeddings work for ALBERT

* add test for albert

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-11-24 14:40:53 +01:00
Sylvain Gugger
367f497dec
Fix max length in run_plm script (#8738) 2020-11-23 16:02:31 -05:00
Stas Bekman
1e45bef0a7
[trainer] make generate work with multigpu (#8716)
* make generate work with multigpu

* better fix - thanks @sgugger
2020-11-23 10:57:27 -08:00
Santiago Castro
e1f3156b21
Fix many typos (#8708) 2020-11-21 22:58:10 -05:00
Quentin Lhoest
8062fa63c5
Fix rag finetuning + add finetuning test (#8585)
* replace init_ddp_connection for index init

* style

* add finetune test

* add test data

* move generate tensors to device

* add test on EM metric

* style

* allow multi process test

* keep gloo process group for retrieval

* add multi-gpu test

* use custom accelerator

* clean test finetune

* minor

* style

* style

* typo

* use python call instead of imported main fumction

* return_dict fix in modeling_rag

* use float32 in retrieval

* store as float32 as well in the custom knowledge dataset example

* style

* rename to finetune_rag

* style

* update readme

* rename utils and callbacks to utils_rag and callbacks_rag

* fix test

* patrick's comments

* generate dummy data in the finetue test script

* remove dummy data files

* style
2020-11-20 19:05:03 +01:00
Stas Bekman
0ad45e108d
[examples/seq2seq] fix PL deprecation warning (#8577)
* fix deprecation warning

* fix
2020-11-19 21:46:04 +01:00
Sylvain Gugger
20b658607e
Fix run_ner script (#8664)
* Fix run_ner script

* Pin datasets
2020-11-19 13:59:30 -05:00
Sylvain Gugger
cb3e5c33f7
Fix a few last paths for the new repo org (#8666) 2020-11-19 11:56:42 -05:00