Commit Graph

7608 Commits

Author SHA1 Message Date
Patrick von Platen
b4b562d834
[Wav2Vec2] Padded vectors should not allowed to be sampled (#12764)
* fix_torch_device_generate_test

* remove @

* finish

* correct script

* correct script
2021-07-16 19:07:08 +02:00
SaulLu
6e87010060
Preserve list type of additional_special_tokens in special_token_map (#12759)
* preserve type of `additional_special_tokens` in `special_token_map`

* format

* Update src/transformers/tokenization_utils_base.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-07-16 18:26:54 +02:00
Funtowicz Morgan
fbf1397bf8
Turn on eval mode when exporting to ONNX (#12758)
* Set model in eval mode when exporting to ONNX.

* Disable t5 for now.

* Disable T5 with past too.

* Style.
2021-07-16 15:09:15 +02:00
Suraj Patil
8ef3f36561
fix typos (#12757) 2021-07-16 16:44:59 +05:30
Nathan Zhou
c07334c12e
add intel-tensorflow-avx512 to the candidates (#12751) 2021-07-16 05:54:49 -04:00
Stas Bekman
6989264963
[doc] testing: how to trigger a self-push workflow (#12724)
* [testing] details of how to start self-push workflow

* style

* fix

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-07-15 16:18:56 -07:00
Patrick von Platen
a76dd7ee82
Update README.md 2021-07-16 00:16:30 +01:00
Patrick von Platen
2e9fb13fb1
[Wav2Vec2] Correctly pad mask indices for PreTraining (#12748)
* fix_torch_device_generate_test

* remove @

* start adding tests

* correct wav2vec2 pretraining

* up

* up

Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-07-15 21:40:25 +01:00
SaulLu
5f2791c7c1
Replace specific tokenizer in log message by AutoTokenizer (#12745) 2021-07-15 12:59:48 -04:00
Stas Bekman
31cfcbd3e2
[doc] performance: batch sizes (#12725) 2021-07-15 09:39:34 -07:00
Stas Bekman
68605e9db1
[doc] parallelism: Which Strategy To Use When (#12712) 2021-07-15 09:38:51 -07:00
Lysandre Debut
eb4d7ef97b
Remove framework mention (#12731) 2021-07-15 11:49:02 -04:00
Lysandre Debut
959d448b3f
Fix led torchscript (#12735)
* Don't test LED on torchscript

* Typo
2021-07-15 11:48:50 -04:00
Lysandre Debut
f03580fb02
Fix DETR integration test (#12734) 2021-07-15 11:48:37 -04:00
Lysandre Debut
f42d9dcc0e
Patch T5 device test (#12742) 2021-07-15 16:40:17 +01:00
Lysandre Debut
370be9cc38
Fix MBart failing test (#12737) 2021-07-15 16:39:35 +01:00
qqaatw
2349ac58c4
Translate README.md to Traditional Chinese (#12701)
* Add README_zh-tw.md

* Add links to each README.

* Fix a mismatched term.

* Minor improvements.

* Rename language code to be more inclusive.

* Polish terms to make them fluent.

* Remove redundant spaces.

* Fix typo.
2021-07-15 23:35:39 +08:00
Lysandre Debut
eb2e006b35
Skip test while the model is not available (#12740) 2021-07-15 09:14:12 -04:00
Lysandre Debut
8c7bd1b97b
Skip test while the model is not available (#12739) 2021-07-15 09:06:47 -04:00
Lysandre Debut
3290315a2a
Fix AutoModel tests (#12733) 2021-07-15 09:06:12 -04:00
Lysandre Debut
01cb2f25e3
LXMERT integration test typo (#12736) 2021-07-15 08:29:49 -04:00
Sylvain Gugger
199b4c5264
Init adds its own files as impacted (#12709) 2021-07-15 04:17:47 -04:00
Will Rice
6fb58d30b9
Fix typo in example (#12716) 2021-07-15 12:14:03 +05:30
Patrick von Platen
8244c5ad4f
[Flax] Correct shift labels for seq2seq models in Flax (#12720)
* fix_torch_device_generate_test

* remove @

* push

* fix marian

* fix

* up
2021-07-15 12:12:36 +05:30
Stas Bekman
1a3deae820
[trainer] release tmp memory in checkpoint load (#12718)
* [trainer] release tmp memory in checkpoint load

* Update src/transformers/trainer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-07-14 15:18:02 -07:00
Stas Bekman
a18a17d2b6
[test] split test into 4 sub-tests to avoid timeout (#12710)
* split the test into 4 sub-tests to avoid timeout

* fix decorator order
2021-07-14 13:04:58 -07:00
Suraj Patil
44f5b260fe
flax model parallel training (#12590)
* update scripts

* add copyright

* add logging

* cleanup

* add z loss

* add readme

* shard description

* update readme
2021-07-14 22:55:44 +05:30
Matt
79c57e1a07
Deprecate TFTrainer (#12706)
* Deprecate TFTrainer

* Style pass
2021-07-14 15:59:14 +01:00
Sylvain Gugger
084873b025
Only test the files impacted by changes in the diff (#12644)
* Base test

* More test

* Fix mistake

* Add a docstring change

* Add doc ignore

* Add changes

* Add recursive dep search

* Add recursive dep search

* save

* Finalize test mapping

* Fix bug

* Print prettier

* Ignore comments and empty lines

* Make script runnable from anywhere

* Need dev install

* Like that

* Adapt

* Add as artifact

* Try on torch tests

* Fix yaml error

* Install GitPython

* Apply everywhere

* Be more defensive

* Revert to all tests if something is wrong

* Install GitPython

* Test if there are tests before launching.

* Fixes

* Fixes

* Fixes

* Fixes

* Bash syntax is horrible

* Be less stupid

* Try differently

* Typo

* Typo

* Typo

* Style

* Better name

* Escape quotes

* Ignore black unhelpful re-formatting

* Not a docstring

* Deal with inits in dependency map

* Run all tests once PR is merged.

* Add last job

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Stronger dependencies gather

* Ignore empty lines too!

* Clean up

* Fix quality

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2021-07-14 10:56:55 -04:00
Funtowicz Morgan
11edecd753
Fix uninitialized variables when config.mask_feature_prob > 0 (#12705) 2021-07-14 15:30:19 +01:00
Matt
f9ac677eba
Update TF examples README (#12703)
* Update Transformers README, rename token_classification example to token-classification to be consistent with the others

* Update examples/tensorflow/README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add README for TF token classification

* Update examples/tensorflow/token-classification/README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update examples/tensorflow/token-classification/README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-07-14 15:15:25 +01:00
Patrick von Platen
f4399ec570
Update README.md 2021-07-14 12:54:31 +01:00
Funtowicz Morgan
d94773e685
Provide mask_time_indices to _mask_hidden_states to avoid double masking (#12692)
* We need to provide mask_time_indices to `_mask_hidden_states` to avoid applying the mask two times

* apply the same to wav2vec2

* Uniformize the style between hubert and wav2vec2

* fix tf as well

Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
2021-07-14 12:17:33 +01:00
Sylvain Gugger
144cea253f
Fix multiple choice doc examples (#12679) 2021-07-14 03:35:18 -04:00
Stas Bekman
5dd0c956a8
non-native optimizers are mostly ok with zero-offload (#12690) 2021-07-13 20:18:51 -07:00
yujun
4cdb7ee51d
fix #11724 (#11897) 2021-07-13 22:18:54 +01:00
Lysandre Debut
83f025125d
Add timeout to CI. (#12684)
* Global 60-300 seconds timeout

* Add verbose option

* [skip ci] typo
2021-07-13 15:13:18 -04:00
Stas Bekman
78f5fe1416
[Deepspeed] adapt multiple models, add zero_to_fp32 tests (#12477)
* zero_to_fp32 tests

* args change

* remove unnecessary work

* use transformers.trainer_utils.get_last_checkpoint

* document the new features

* cleanup

* wip

* fix fsmt

* add bert

* cleanup

* add xlm-roberta

* electra works

* cleanup

* sync

* split off the model zoo tests

* cleanup

* cleanup

* cleanup

* cleanup

* reformat

* cleanup

* casing

* deepspeed>=0.4.3

* adjust distilbert

* Update docs/source/main_classes/deepspeed.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* style

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-07-13 12:07:32 -07:00
Matt
65bf05cd18
Adding TF translation example (#12667)
* Adding TF translation example

* Fixes and style pass for TF translation example

* Remove unused postprocess_text copied from run_summarization

* Adding README

* Review fixes

* Move changes to model.config to after we've initialized the model
2021-07-13 19:08:25 +01:00
Patrick von Platen
cee2d2135f
[Flax Generation] Correct inconsistencies PyTorch/Flax (#12662)
* fix_torch_device_generate_test

* remove @

* correct greedy search

* save intertmed

* add final logits bias

* correct

* up

* add more tests

* fix another bug

* finish tests

* finish marian tests

* up

Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-07-13 18:53:30 +01:00
Stas Bekman
7a22a02a70
[tokenizer.prepare_seq2seq_batch] change deprecation to be easily actionable (#12669)
* change deprecation to be easily actionable

* Update src/transformers/tokenization_utils_base.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* rework as suggested

* one warning together

* fix format

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-07-13 09:19:04 -07:00
qqaatw
711d901c49
Fix minor docstring typos. (#12682) 2021-07-13 12:08:15 -04:00
Sylvain Gugger
90178b0cef
Add option to load a pretrained model with mismatched shapes (#12664)
* Add option to load a pretrained model with mismatched shapes

* Fail at loading when mismatched shapes in Flax

* Fix tests

* Update src/transformers/modeling_flax_utils.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Address review comments

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-07-13 10:15:15 -04:00
Patrick von Platen
7f6d375029
[Blenderbot] Fix docs (#12227)
* fix_torch_device_generate_test

* remove @

* fix docs
2021-07-13 14:17:31 +01:00
Jeroen Steggink
9519f0cd63
Wrong model is used in example, should be character instead of subword model (#12676)
* Wrong model is used, should be character instead of subword

In the original Google repo for CANINE there was mixup in the model names in the README.md, which was fixed 2 weeks ago. Since this transformer model was created before, it probably resulted in wrong use in this example.

s = subword, c = character

* canine.rst style fix

* Update docs/source/model_doc/canine.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Styling canine.rst

* Added links to model cards.

* Fixed links to model cards.

Co-authored-by: Jeroen Steggink <978411+jsteggink@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-07-13 08:40:27 -04:00
Nick Doiron
5803a2a7ac
Add ByT5 option to example run_t5_mlm_flax.py (#12634)
* Allow ByT5 type in Flax T5 script

* use T5TokenizerFast

* change up tokenizer config

* model_args

* reorder imports

* Update run_t5_mlm_flax.py
2021-07-13 13:39:57 +01:00
Lysandre Debut
9da1acaea2
**encode_plus() shouldn't run for W2V2CTC (#12655)
* **encode_plus() shouldn't run for  W2V2CTC

* Typo
2021-07-13 06:31:56 -04:00
Lysandre Debut
a6938c4721
Patch BigBird tokenization test (#12653) 2021-07-13 02:53:06 -04:00
Omar Sanseviero
c523b241c2
Update timeline for Flax event evaluation 2021-07-12 21:24:58 +02:00
Kevin Canwen Xu
dc06e43580
Fix typo in README_zh-hans.md (#12663) 2021-07-13 01:50:12 +08:00