7165 Commits

Author SHA1 Message Date
Sylvain Gugger
d7633a4e46
Add basic support for FP16 in SageMaker model parallelism (#11407)
* Add FP16 support for SageMaker MP

* Add print debugs

* Squeeze

* Remove debug statements

* Add defensive check

* Typo
2021-04-26 08:55:14 -04:00
Daniel Stancl
38a716cd41
TF BART models - Add cross_attentions to model output and fix cross-attention head masking (#10699)
* Add cross_attn_head_mask to BART

* Fix cross_attentions in TFBart-like models

* This commit enables returning of `cross_attentions`
for TFBart-like models

* It also fixes attention head masking in cross-attention module

* Update TF model templates

* Fix missing , in TF model templates

* Fix typo: congig -> config
2021-04-26 14:16:21 +02:00
Sylvain Gugger
4bd6b54fa4 Pin black to 21.4b0 2021-04-26 08:12:54 -04:00
Sylvain Gugger
c1625b3261 With style 2021-04-26 08:07:29 -04:00
Sylvain Gugger
4b72cfd958 Pin black to 20.8.b1 2021-04-26 08:06:50 -04:00
Patrick von Platen
32dbb2d954
make style (#11442) 2021-04-26 13:50:34 +02:00
Vasudev Gupta
04ab2ca639
add pooling layer support (#11439) 2021-04-26 09:05:53 +02:00
abiolaTresor
30f065890e
updating the checkpoint for GPT2ForSequenceClassification to one with classification head (#11434) 2021-04-26 10:28:51 +05:30
cronoik
35cd8eed88
EncoderDecoderConfigs should not create new objects (#11300)
* removes the creation of separate config objects and uses the existing ones instead; overwrites resize_token_embeddings from the parent class because it does not work for the EncoderDecoderModel

* rollback to current version of the huggingface master branch

* reworked version that ties the encoder and decoder config of the parent encoderdecoder instance

* overwrite of resize_token_embeddings throws an error now

* review comment suggestion

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* implemented warning in case encoderdecoder is created with differing configs of encoderdecoderconfig and decoderconfig or encoderconfig

* added test to avoid diverging configs of wrapper class and wrapped classes

* Update src/transformers/models/encoder_decoder/modeling_encoder_decoder.py

* make style

Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-04-25 11:45:46 +02:00
Daniel Stancl
f45cb66bf6
Add head_mask, decoder_head_mask, cross_head_mask to ProphetNet (#9964)
* Add head_mask & decoder_head_mask + some corrections

* Fix head masking for N-grams

* Enable test_headmasking for encoder and decod

* Fix one typo in modeling_prophetnet.py

* Enable test_headmasking for ProphetNetStandaloneDecoderModelTest
and ProphetNetStandaloneEncoderModelTest in test_modeling_prophetnet.py

* make style

* Fix cross_head_mask

* Fix attention head mask naming

* `cross_head_mask` -> `cross_attn_head_mask`

* `cross_layer_head_mask` -> `cross_attn_layer_head_mask`

* Still need to merge #10605 to master to pass the tests
2021-04-25 11:06:16 +02:00
Sylvain Gugger
52166f672e Style 2021-04-23 20:40:17 -04:00
cronoik
9cac4fab07
documentation linked to the parent class PreTrainedTokenizerFast but it should be the slow tokenizer (#11410) 2021-04-23 20:19:15 -04:00
Sylvain Gugger
b7fc043fce Merge branch 'master' of github.com:huggingface/transformers 2021-04-23 18:47:55 -04:00
Sylvain Gugger
81a6c7cd39 Use 3 workers for torch tests 2021-04-23 18:47:46 -04:00
Philip May
195bfd118a
Enable option for subword regularization in XLMRobertaTokenizer (#11149)
* enable subword regularization.

* fix tokenizer storage

* fix docstring formatting

* Update src/transformers/models/xlm_roberta/tokenization_xlm_roberta.py

Co-authored-by: Stefan Schweter <stefan@schweter.it>

* fix docstring formatting

* add test for subword regularization tokenizer

* improve comments of test

* add sp_model_kwargs

* reformat docstring to match the style

* add some more documentation

* Update src/transformers/models/xlm_roberta/tokenization_xlm_roberta.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* improve docstring

* empty commit to trigger CI

* Update src/transformers/models/xlm_roberta/tokenization_xlm_roberta.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix docstring formatting for sphinx

Co-authored-by: Stefan Schweter <stefan@schweter.it>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-04-23 17:52:31 -04:00
Sylvain Gugger
1ef152eb48
Default to accuracy metric (#11405) 2021-04-23 14:49:59 -04:00
Daniel Stancl
e3ff165aa5
Fix cross-attention head mask for Torch encoder-decoder models (#10605)
* Fix cross-attention head mask for Torch BART models

* Fix head masking for cross-attention module for the following
models: BART, Blenderbot, Blenderbot_small, M2M_100, Marian, MBart,
Pegasus

* Enable test_headmasking for M2M_100 model

* Fix cross_head_mask for FSMT, LED and T5

* This commit fixes `head_mask` for cross-attention modules
in the following models: FSMT, LED, T5

* It also contains some smaller changes in doc so that
it is perfectly clear that the shape of `cross_head_mask`
is the same as that of `decoder_head_mask`

* Update template

* Fix template for BartForCausalLM

* Fix cross_head_mask for Speech2Text models

* Fix cross_head_mask in templates

* Fix args order in BartForCausalLM template

* Fix doc in BART templates

* Make more explicit naming

* `cross_head_mask` -> `cross_attn_head_mask`

* `cross_layer_head_mask` -> `cross_attn_layer_head_mask`

* Fix doc

* make style quality

* Fix speech2text docstring
2021-04-23 18:58:06 +02:00
Sylvain Gugger
ca6b80cadb Wrong branch Sylvain... 2021-04-23 12:46:54 -04:00
Sylvain Gugger
3951fc55ee Try to trigger failure more 2021-04-23 12:44:54 -04:00
Sylvain Gugger
bd41a0f74d Style 2021-04-23 12:32:37 -04:00
Nicola De Cao
1811883e80
Fixing bug in generation (#11297)
When `inputs_embeds` is passed and `input_ids` is not (`input_ids=None`), the generation function fails because `input_ids` is created by the function when it should not be.
2021-04-23 18:24:26 +02:00
Kiran R
5c00918681
added support for exporting of t5 to onnx with past_key_values (#10651) 2021-04-23 18:14:20 +02:00
Patrick von Platen
50f4539b82
push (#11400) 2021-04-23 15:36:27 +02:00
Sylvain Gugger
bf2e0cf70b
Trainer push to hub (#11328)
* Initial support for upload to hub

* push -> upload

* Fixes + examples

* Fix torchhub test

* Torchhub test I hate you

* push_model_to_hub -> push_to_hub

* Apply mixin to other pretrained models

* Remove ABC inheritance

* Add tests

* Typo

* Run tests

* Install git-lfs

* Change approach

* Add push_to_hub to all

* Staging test suite

* Typo

* Maybe like this?

* More deps

* Cache

* Adapt name

* Quality

* MOAR tests

* Put it in testing_utils

* Docs + torchhub last hope

* Styling

* Wrong method

* Typos

* Update src/transformers/file_utils.py

Co-authored-by: Julien Chaumond <julien@huggingface.co>

* Address review comments

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

Co-authored-by: Julien Chaumond <julien@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-04-23 09:17:37 -04:00
Teven
7bc86bea68
Fixed trainer total_flos reloading in distributed mode (#11383)
* Fixed trainer total_flos reloading in distributed mode

* logging flos at the end of training
2021-04-23 07:53:33 -04:00
Patrick von Platen
74e84f1fa6
make blenderbot test slow (#11395) 2021-04-23 07:49:09 -04:00
Yoshitomo Matsubara
c3d6f33918
fixed typos (#11391) 2021-04-23 07:48:42 -04:00
Max Del
a90d3f1862
Fix typo in text (#11396) 2021-04-23 07:37:19 -04:00
Patrick von Platen
2dc2d79ac7
correct conversion (#11394) 2021-04-23 11:59:34 +02:00
Patrick von Platen
b48cf7124c
correct typo (#11393) 2021-04-23 11:34:59 +02:00
Patrick von Platen
8c9b5fcbaf
[Flax] Big FlaxBert Refactor (#11364)
* improve flax

* refactor

* typos

* Update src/transformers/modeling_flax_utils.py

* Apply suggestions from code review

* Update src/transformers/modeling_flax_utils.py

* fix typo

* improve error tolerance

* typo

* correct nasty saving bug

* fix from pretrained

* correct tree map

* add note

* correct weight tying
2021-04-23 09:53:09 +02:00
Sylvain Gugger
3ed5e97ba0
Fix Trainer with remove_unused_columns=False (#11382)
* Fix Trainer with remove_unused_columns=False

* Typo
2021-04-22 11:16:24 -04:00
PenutChen
0f3ad1507e
Fix typo (#11369) 2021-04-22 10:10:16 -04:00
Matt
2617396094
Correctly cast num_train_epochs to int (#11379) 2021-04-22 13:49:59 +01:00
Takuya Makino
881945c0b5
Add space (#11373) 2021-04-22 17:48:58 +05:30
johnson7788
5b5e4ca366
[run_translation.py] fix typo (#11372)
fix typo

Co-authored-by: johnson <johnson@github.com>
2021-04-22 17:47:11 +05:30
Patrick von Platen
58d8795d74
[Flax] Correct typo (#11374)
* finish

* fix copy
2021-04-22 13:11:44 +02:00
Patrick von Platen
880154d2e1
[Wav2Vec2] Fix special tokens for Wav2Vec2 tokenizer (#11349)
* fix wav2vec2 tok

* up
2021-04-22 12:23:08 +02:00
Sylvain Gugger
6f14eab50b Add in torchhub 2021-04-21 19:17:29 -04:00
Sylvain Gugger
ff26f8ee3a Add huggingface_hub dep for #11328 2021-04-21 19:12:58 -04:00
wlhgtc
5e04d70868
Fix token_type_ids error for big_bird model. (#11355)
* MOD: fit chinese wwm to new datasets

* MOD: move wwm to new folder

* MOD: format code

* Styling

* MOD add param and recover trainer

* MOD: add token_type_ids method for big bird

* MOD: format code

* MOD: format code

Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
2021-04-21 19:37:57 +02:00
Stas Bekman
5aaf5aac0b
[contributing doc] explain/link to good first issue (#11346)
* explain/link to good first issue

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-04-21 10:10:11 -07:00
Matt
6fe79e57d7
Move old TF text classification script to legacy (#11361)
And update README to explain the work-in-progress!
2021-04-21 17:36:18 +01:00
Patrick von Platen
50595a3336
Remove boilerplate code (#11340)
* remove boilerplate code

* adapt roberta

* correct docs

* finish refactor
2021-04-21 18:34:38 +02:00
Matt
ac588594e2
Merge new TF example script (#11360)
First of the new and more idiomatic TF examples!
2021-04-21 17:04:55 +01:00
Stas Bekman
9f72e8f4e1
[testing doc] bring doc up to date (#11359)
* bring doc up to date

* fix
2021-04-21 08:51:00 -07:00
lewtun
41f3133a3a
Extract metric_key_prefix during NotebookProgressCallback.on_evaluate (#11347)
* Pass metric_key_prefix as kwarg to on_evaluate

* Replace eval_loss with metric_key_prefix_loss

* Default to "eval" if metric_key_prefix not in kwargs

* Add kwargs to CallbackHandler.on_evaluate signature

* Revert "Add kwargs to CallbackHandler.on_evaluate signature"

This reverts commit 8d4c85ed51.

* Revert "Pass metric_key_prefix as kwarg to on_evaluate"

This reverts commit 7766bfe271.

* Extract metric_key_prefix from metrics
2021-04-21 11:12:09 -04:00
Sylvain Gugger
dabeb15292
Examples reorg (#11350)
* Base move

* Examples reorganization

* Update references

* Put back test data

* Move conftest

* More fixes

* Move test data to test fixtures

* Update path

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Address review comments and clean

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-04-21 11:11:20 -04:00
Stas Bekman
ca7ff64f5b
[deepspeed] fix resume from checkpoint (#11352)
This PR fixes a bug that most likely somehow got exposed (not caused) by https://github.com/huggingface/transformers/pull/11318 - surprisingly the same test worked just fine before that other PR.
2021-04-21 07:48:15 -07:00
Sylvain Gugger
74712e22f3
Honor contributors to models (#11329)
* Honor contributors to models

* Fix typo

* Address review comments

* Add more authors
2021-04-21 09:47:27 -04:00