Commit Graph

4824 Commits

Author SHA1 Message Date
Sylvain Gugger
a8db954cda
Activate check on the CI (#6427)
* Activate check on the CI

* Fix repo inconsistencies

* Don't document too much
2020-08-12 08:42:14 -04:00
Sylvain Gugger
34fabe1697
Move prediction_loss_only to TrainingArguments (#6426) 2020-08-12 08:03:45 -04:00
Sylvain Gugger
e9c3031463
Fixes to make life easier with the nlp library (#6423)
* allow using tokenizer.pad as a collate_fn in pytorch

* allow using tokenizer.pad as a collate_fn in pytorch

* Add documentation and tests

* Make attention mask the right shape

* Better test

Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
2020-08-12 08:00:56 -04:00
Stas Bekman
87b359439f
[test] replace capsys with the more refined CaptureStderr/CaptureStdout (#6422)
* replace capsys with the more refined CaptureStderr/CaptureStdout

* Update examples/seq2seq/test_seq2seq_examples.py

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-08-12 07:54:28 -04:00
Jared T Nielsen
ac5bcf236e
Fix FFN dropout in TFAlbertLayer, and split dropout in TFAlbertAttent… (#4323)
* Fix FFN dropout in TFAlbertLayer, and split dropout in TFAlbertAttention into two separate dropout layers.

* Same dropout fixes for PyTorch.
2020-08-12 07:52:42 -04:00
Lysandre Debut
4ffea5ce2f
Disabled pabee test (#6431) 2020-08-12 02:52:50 -04:00
Rohan Rajpal
155288f04b
[model_card] rohanrajpal/bert-base-codemixed-uncased-sentiment (#6324)
* Create README.md

* Update model_cards/rohanrajpal/bert-base-codemixed-uncased-sentiment/README.md

* Update model_cards/rohanrajpal/bert-base-codemixed-uncased-sentiment/README.md

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-08-11 18:38:18 -04:00
Manuel Romero
4e6245fc7e
Create model card T5-base fine-tuned on event2Mind for Intent Prediction (#6412) 2020-08-11 18:35:27 -04:00
Manuel Romero
46e3a0a6ec
Create README.md (#6381) 2020-08-11 18:34:11 -04:00
Manuel Romero
31dfde7429
Create README.md (#6378) 2020-08-11 18:32:37 -04:00
Manuel Romero
25e29150a2
Add metadata to be indexed properly (#6380) 2020-08-11 18:32:29 -04:00
Manuel Romero
471be5f279
Change metadata to be indexed correctly (#6379) 2020-08-11 18:32:18 -04:00
Rohan Rajpal
42ee0bc63d
Create README.md (#6346)
* Create README.md

* add results on SAIL dataset

* Update model_cards/rohanrajpal/bert-base-multilingual-codemixed-cased-sentiment/README.md

Co-authored-by: Julien Chaumond <chaumond@gmail.com>

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-08-11 18:31:34 -04:00
Sam Shleifer
3f071c4b6e
[examples] add pytest dependency (#6425) 2020-08-11 17:58:09 -04:00
Stas Bekman
ece0903e11
lr_schedulers: add get_polynomial_decay_schedule_with_warmup (#6361)
* [wip] add get_polynomial_decay_schedule_with_warmup

* style

* add assert

* change lr_end to a much smaller default number

* check for exact equality

* [model_cards] electra-base-turkish-cased-ner (#6350)

* for electra-base-turkish-cased-ner

* Add metadata

Co-authored-by: Julien Chaumond <chaumond@gmail.com>

* Temporarily de-activate TPU CI

* Update modeling_tf_utils.py (#6372)

fix typo: ckeckpoint->checkpoint

* the test now works again (#6371)

* correct pl link in readme (#6364)

* refactor almost identical tests (#6339)

* refactor almost identical tests

* important to add a clear assert error message

* make the assert error even more descriptive than the original bt

* Small docfile fixes (#6328)

* Patch models (#6326)

* TFAlbertFor{TokenClassification, MultipleChoice}

* Patch models

* BERT and TF BERT info


s

* Update check_repo

* Ci GitHub caching (#6382)

* Cache Github Actions CI

* Remove useless file

* Colab button (#6389)

* Add colab button

* Add colab link for tutorials

* Fix links for open in colab (#6391)

* Update src/transformers/optimization.py

consistently use lr_end=1e-7 default

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* [wip] add get_polynomial_decay_schedule_with_warmup

* style

* add assert

* change lr_end to a much smaller default number

* check for exact equality

* Update src/transformers/optimization.py

consistently use lr_end=1e-7 default

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* remove dup (leftover from merge)

* convert the test into the new refactored format

* stick to using the current_step as is, without ++

Co-authored-by: M. Yusuf Sarıgöz <yusufsarigoz@gmail.com>
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Alexander Measure <ameasure@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-08-11 17:56:41 -04:00
cedspam
6c87b73d6b
Create README.md (#6386)
* Create README.md

* Update README.md
2020-08-11 16:56:51 -04:00
Stas Bekman
0203d6517f
[pl] restore lr logging behavior for glue, ner examples (#6314) 2020-08-11 16:27:11 -04:00
Sam Shleifer
be1520d3a3
rename prepare_translation_batch -> prepare_seq2seq_batch (#6103) 2020-08-11 15:57:07 -04:00
Sam Shleifer
66fa8ceaea
PegasusForConditionalGeneration (torch version) (#6340)
Co-authored-by: Jingqing  Zhang <jingqing.zhang15@imperial.ac.uk>
2020-08-11 14:31:23 -04:00
Stas Bekman
f6cb0f806e
[s2s] wmt download script use less ram (#6405) 2020-08-11 12:04:17 -04:00
Stas Bekman
7c6a085ebf
pl version: examples/requirements.txt is single source of truth (#6309) 2020-08-11 10:58:54 -04:00
Pranav Vadrevu
1d1d5bec1b
Create Model Card File (#6357) 2020-08-11 10:36:15 -04:00
Abed khooli
00ce881c07
Create README.md (#6413)
* Create README.md

Model card for https://huggingface.co/akhooli/gpt2-small-arabic

* Update model_cards/akhooli/gpt2-small-arabic/README.md

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-08-11 10:35:31 -04:00
Nick Doiron
3ae30787b5
switch Hindi-BERT to S3 README (#6396) 2020-08-11 10:34:22 -04:00
Abed khooli
824e651e17
Create README.md (#6397)
* Create README.md

* Update model_cards/akhooli/gpt2-small-arabic-poetry/README.md

* Update model_cards/akhooli/gpt2-small-arabic-poetry/README.md

* Update model_cards/akhooli/gpt2-small-arabic-poetry/README.md

* Update model_cards/akhooli/gpt2-small-arabic-poetry/README.md

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-08-11 09:03:23 -04:00
guillaume-be
404782912a
[Performance improvement] "Bad tokens ids" optimization (#6064)
* Optimized banned token masking

* Avoid duplicate EOS masking if in bad_words_id

* Updated mask generation to handle empty banned token list

* Addition of unit tests for the updated bad_words_ids masking

* Updated timeout handling in `test_postprocess_next_token_scores_large_bad_words_list` unit test

* Updated timeout handling in `test_postprocess_next_token_scores_large_bad_words_list` unit test (timeout does not work on Windows)

* Moving Marian import to the test context to allow TF only environments to run

* Moving imports to torch_available test

* Updated operations device and test

* Updated operations device and test

* Added docstring and comment for in-place scores modification

* Moving test to own test_generation_utils, use of lighter models for testing

* removed unneded imports in test_modeling_common

* revert formatting change for ModelTesterMixin

* Updated caching, simplified eos token id test, removed unnecessary @require_torch

* formatting compliance
2020-08-11 05:56:40 -04:00
David LaPalomento
87e124c245
Warn if debug requested without TPU fixes (#6308) (#6390)
* Warn if debug requested without TPU fixes (#6308)
Check whether a PyTorch compatible TPU is available before attempting to print TPU metrics after training has completed. This way, users who apply `--debug` without reading the documentation aren't suprised by a stacktrace.

* Style

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2020-08-11 05:31:26 -04:00
Junyuan Zheng
cdf1f7edb2
Fix tokenizer saving and loading error (#6026)
* fix tokenizer saving and loading bugs when adding AddedToken to additional special tokens

* Add tokenizer test

* Style

* Style 2

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2020-08-11 04:49:16 -04:00
Stas Bekman
83984a61c6
testing utils: capturing std streams context manager (#6231)
* testing utils: capturing std streams context manager

* style

* missing import

* add the origin of this code
2020-08-11 03:56:47 -04:00
Stas Bekman
f6c0680d36
add pl_glue example test (#6034)
* add pl_glue example test

* for now just test that it runs, next validate results of eval or predict?

* complete the run_pl_glue test to validate the actual outcome

* worked on my machine, CI gets less accuracy - trying higher epochs

* match run_pl.sh hparms

* more epochs?

* trying higher lr

* for now just test that the script runs to a completion

* correct the comment

* if cuda is available, add --fp16 --gpus=1 to cover more bases

* style
2020-08-11 03:16:52 -04:00
Pradhy729
b25cec13c5
Feed forward chunking (#6024)
* Chunked feed forward for Bert

This is an initial implementation to test applying feed forward chunking for BERT.
Will need additional modifications based on output and benchmark results.

* Black and cleanup

* Feed forward chunking in BertLayer class.

* Isort

* add chunking for all models

* fix docs

* Fix typo

Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
2020-08-11 03:12:45 -04:00
Lysandre
8a3db6b303 Add TPU testing once again 2020-08-11 08:49:37 +02:00
zcain117
f65ac1faf2
Add missing docker arg for TPU CI. (#6393) 2020-08-11 02:48:49 -04:00
Sam Shleifer
b9ecd92ee4
[s2s] Script to save wmt data to disk (#6403) 2020-08-10 22:49:39 -04:00
Patrick von Platen
00bb0b25ed
TF Longformer (#5764)
* improve names and tests longformer

* more and better tests for longformer

* add first tf test

* finalize tf basic op functions

* fix merge

* tf shape test passes

* narrow down discrepancies

* make longformer local attn tf work

* correct tf longformer

* add first global attn function

* add more global longformer func

* advance tf longformer

* finish global attn

* upload big model

* finish all tests

* correct false any statement

* fix common tests

* make all tests pass except keras save load

* fix some tests

* fix torch test import

* finish tests

* fix test

* fix torch tf tests

* add docs

* finish docs

* Update src/transformers/modeling_longformer.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/modeling_tf_longformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* apply Lysandres suggestions

* reverse to assert statement because function will fail otherwise

* applying sylvains recommendations

* Update src/transformers/modeling_longformer.py

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>

* Update src/transformers/modeling_tf_longformer.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-08-10 23:25:06 +02:00
Patrick von Platen
3425936643
[EncoderDecoderModel] add a add_cross_attention boolean to config (#6377)
* correct encoder decoder model

* Apply suggestions from code review

* apply sylvains suggestions
2020-08-10 19:46:48 +02:00
Sylvain Gugger
06bc347c97
Fix links for open in colab (#6391) 2020-08-10 11:16:17 -04:00
Sylvain Gugger
3e0fe3cf5c
Colab button (#6389)
* Add colab button

* Add colab link for tutorials
2020-08-10 11:12:29 -04:00
Lysandre Debut
79588e6fdb
Ci GitHub caching (#6382)
* Cache Github Actions CI

* Remove useless file
2020-08-10 10:39:31 -04:00
Lysandre Debut
b99098abc7
Patch models (#6326)
* TFAlbertFor{TokenClassification, MultipleChoice}

* Patch models

* BERT and TF BERT info


s

* Update check_repo
2020-08-10 10:39:17 -04:00
Sylvain Gugger
6028ed92bd
Small docfile fixes (#6328) 2020-08-10 05:37:12 -04:00
Stas Bekman
1429b920d4
refactor almost identical tests (#6339)
* refactor almost identical tests

* important to add a clear assert error message

* make the assert error even more descriptive than the original bt
2020-08-10 05:31:20 -04:00
Rohit Gupta
35eb96de4d
correct pl link in readme (#6364) 2020-08-10 03:08:46 -04:00
Stas Bekman
0830e79512
the test now works again (#6371) 2020-08-10 02:55:52 -04:00
Alexander Measure
3a556b0fb7
Update modeling_tf_utils.py (#6372)
fix typo: ckeckpoint->checkpoint
2020-08-10 02:55:11 -04:00
Lysandre
1bbc54a87c Temporarily de-activate TPU CI 2020-08-10 08:11:40 +02:00
M. Yusuf Sarıgöz
6e8a38568e
[model_cards] electra-base-turkish-cased-ner (#6350)
* for electra-base-turkish-cased-ner

* Add metadata

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-08-09 03:39:51 -04:00
Sam Shleifer
9a5ef83748
[s2s] fix --gpus clarg collision (#6358) 2020-08-08 21:51:37 -04:00
Patrick von Platen
1aec991643
[GPT2] Correct typo in docs (#6352) 2020-08-08 20:37:29 +02:00
elsanns
9f57e39f71
Add notebook on fine-tuning and interpreting Electra (#6321)
Co-authored-by: eliska <3648991+elisans@users.noreply.github.com>
2020-08-08 11:47:33 +02:00