Commit Graph

8821 Commits

Author SHA1 Message Date
Patrick von Platen
43f46aa7fd
[RAG] Fix rag from pretrained question encoder generator behavior (#11962)
* fix_torch_device_generate_test

* remove @

* fix rag from pretrained loading

* add test

* uplaod

* finish
2021-06-02 09:17:14 +01:00
dependabot[bot]
6db3a87de2
Bump urllib3 from 1.25.8 to 1.26.5 in /examples/research_projects/lxmert (#11983)
Bumps [urllib3](https://github.com/urllib3/urllib3) from 1.25.8 to 1.26.5.
- [Release notes](https://github.com/urllib3/urllib3/releases)
- [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst)
- [Commits](https://github.com/urllib3/urllib3/compare/1.25.8...1.26.5)

---
updated-dependencies:
- dependency-name: urllib3
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-06-02 03:40:20 -04:00
Stas Bekman
4ba203d9d3
[Trainer] add train loss and flops metrics reports (#11980)
* add train loss and flops metrics reports

* consistency

* add train_loss to skip keys

* restore on_train_end call timing
2021-06-01 15:58:31 -07:00
Stas Bekman
7ec596ecda
[DeepSpeed] decouple DeepSpeedConfigHF from Trainer (#11966)
* decouple DeepSpeedConfigHF from Trainer

* add LoggingLevel ctx manager; add new test

* cleanup

* add docs

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* implemented suggested renames

* formatter workaround

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-06-01 13:24:52 -07:00
Alberto Villa
1c3ab3e5d6
Typo in usage example, changed to device instead of torch_device (#11979) 2021-06-01 14:58:49 -04:00
Patrick von Platen
47a98fc4cb
ByT5 model (#11971)
* allow tf to use uneven num of layers

* add tokenizer

* finish docs

* finish docs

* Apply suggestions from code review

* include in index

* finish

* Update docs/source/model_doc/byt5.rst

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* apply sylvais suggestions

* make style

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
2021-06-01 19:07:37 +01:00
Jeoung-Minju
1eb58b4560
typo correction (#11973)
* typo correction

* type corrections
2021-06-01 12:24:59 -04:00
Stas Bekman
79712e7e7a
[deepspeed] docs (#11940)
* deepspeed docs

* cleanup

* cleanup
2021-06-01 09:21:21 -07:00
Lysandre
985d708842 Run the integration tests on schedule tests instead of master tests 2021-06-01 15:58:31 +02:00
Volodymyr Byno
9996558bff
Neptune.ai integration (#11937)
An option that turns on neptune.ai logging
--report_to 'neptune'

Additional ENV variables:
	NEPTUNE_PROJECT
	NEPTUNE_API_TOKEN
	NEPTUNE_RUN_NAME (optional)
	NEPTUNE_STOP_TIMEOUT (optional)
2021-06-01 09:40:52 -04:00
Lysandre Debut
ae6ce28f31
Authorize args when instantiating an AutoModel (#11956) 2021-06-01 09:27:54 -04:00
Philip May
fcad801825
Add regression tests for slow sentencepiece tokenizers. (#11737)
* add test_vocab_size for sentencepiece tok.

* add test_get_vocab for sentencepiece tok.

* add test_convert_token_and_id for sentencepiece tok.

* add test_tokenize_and_convert_tokens_to_string for all tok.

* improve test_tokenize_and_convert_tokens_to_string for sp. tok.

* add common tokenizer integration tests
- for albert
- for barthez

* add tokenizer integration tests to bert gen.

* add most tokenizer integration tests

* fix camembert tokenizer integration test

* add tokenizer integration test to marian

* add tokenizer integration test to reformer

* add typing and doc to tokenizer_integration_test_util

* fix tokenizer integration test of reformer

* improve test_sentencepiece_tokenize_and_convert_tokens_to_string

* empty commit to trigger CI

* fix tokenizer integration test of reformer

* remove code not needed anymore

* empty commit to trigger CI

* empty commit to trigger CI
2021-06-01 09:24:39 -04:00
Josh Tanner
c3d958b2c0
reinitialize wandb config for each hyperparameter search run (#11945) 2021-06-01 09:18:33 -04:00
Riccardo Bassani
99dbbdb91e
bugfixes training_args.py (#11922)
modified according to:
https://pytorch.org/xla/release/1.8.1/_modules/torch_xla/core/xla_model.html
2021-06-01 09:04:51 -04:00
Fan Zhang
7e73601f32
modify qa-trainer (#11872)
* modify qa-trainer

* fix flax model
2021-06-01 08:28:41 -04:00
Shamane Siri
9ec0f01b6c
RAG-2nd2end-revamp (#11893)
* initial

* code quality test

* code quality

* added test functions in test_modeling_rag.py and test_retrieval_rag.py to test end2end retreiver

* minor change in test_modeling_rag

* fixed tests

* Update examples/research_projects/rag-end2end-retriever/README.md

typo corrected as suggested by lhoestq

Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>

* Update examples/research_projects/rag-end2end-retriever/finetune_rag.py

type change suggested by lhoestq

Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>

* Update src/transformers/models/rag/retrieval_rag.py

Adding this change as mentioned by lhoestq.

Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>

* completed the minor changes suggested by the reviewers

Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>
2021-06-01 07:32:26 +01:00
Suraj Patil
ad25fd62bd
Add FlaxCLIP (#11883)
* add flax CLIP

* default input_shape

* add tests

* fix test

* fix name

* fix docs

* fix shapes

* attend at least 1 token

* flax conv to torch conv

* return floats

* fix equivalence tests

* fix import

* return attention_weights and update tests

* fix dosctrings

* address patricks comments

* input_shape arg

* add tests for get_image_features and get_text_features methods

* fix tests
2021-06-01 09:44:31 +05:30
Philip May
cfca638acb
Add MT5ForConditionalGeneration as supported arch. to summarization README (#11961)
* Add MT5ForConditionalGeneration as supported arch.

* Update README.md
2021-05-31 21:24:33 +05:30
Nicholas Vadivelu
1ab147d648
Remove redundant nn.log_softmax in run_flax_glue.py (#11920)
* Remove redundant `nn.log_softmax` in `run_flax_glue.py`

`optax.softmax_cross_entropy` expects unnormalized logits, and so it already calls `nn.log_softmax`, so I believe it is not needed here. `nn.log_softmax` is idempotent so mathematically it shouldn't have made a difference.

* Remove unused 'flax.linen' import
2021-05-31 15:29:04 +01:00
Philip May
fb60c309c6
fix assert (#11935) 2021-05-31 04:02:10 -04:00
Lysandre
04a9709c27 Remove datasets submodule 2021-05-31 09:18:49 +02:00
Lysandre Debut
8d171628fe
Test optuna and ray (#11924) 2021-05-28 07:52:01 -04:00
Jayendra
af1a10bff4
[Flax] Return Attention from BERT, ELECTRA, RoBERTa and GPT2 (#11918)
* Added logic to return attention from flax-bert model and added test cases to check that

* Added new line at the end of file to test_modeling_flax_common.py

* fixing code style

* Fixing Roberta and Elextra models too from cpoying bert

* Added temporary hack to not run test_attention_outputs for FlaxGPT2

* Returning attention weights from GPT2 and changed the tests accordingly.

* last fixes

* bump flax dependency

Co-authored-by: jayendra <jayendra@infocusp.in>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-05-28 16:16:56 +05:30
Bhadresh Savani
e1205e478a
Added Sequence Classification class in GPTNeo (#11906)
* seq classification changes

* fix tests
2021-05-28 06:27:02 -04:00
Nicolas Patry
80d712fac6
Adding new argument max_new_tokens for generate. (#11476)
* Adding new argument `max_new_tokens` for generate.

This is a proposal to add a new argument `max_new_tokens` to `generate`.
This include a `MaxNewTokensCriteria` that enables callers that don't
know about the token length ahead (like pipelines callers) to manage
more easily the length of their generated output.

* Adding a test for the user warning when both`max_length` and
`max_new_tokens` are used together.

* Removed redundant `no_grad`.
2021-05-27 14:22:58 +02:00
Josh Tanner
2dd6fb2585
Update deepspeed config to reflect hyperparameter search parameters (#11896)
* rebuild deepspeed config for hyperparameter search

* reformat code to fix style issues
2021-05-27 07:53:33 -04:00
Patrick von Platen
42fe0dc23e
Add Emotion Speech Noteboook (#11900) 2021-05-27 10:46:10 +01:00
Patrick von Platen
996a315e76
Flax Generate (#11777)
* fix_torch_device_generate_test

* remove @

* add

* indexing

* correct a couple of tests

* fix tests

* add logits processor

* finish top_k, top_p, temp

* add docs

* correct flax prng key default

* improve generate

* add generation docs

* add docs

* make style

* revert model outputs change

* make style

* correct typo

* fix tests

* fix slow test

* add raise

* finish generation

Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-05-27 00:18:17 +01:00
Avital Oliver
2df546918e
Link official Cloud TPU JAX docs (#11892) 2021-05-26 15:44:40 -04:00
joerenner
1530384e5b
changing find_batch_size to work with tokenizer outputs (#11890)
* changing find_batch_size to work with tokenizer outputs

trainer_pt_utils.find_batch_size does not recognize the batch size of BatchEncoding objects. This can cause an error when a trainer relies on find_batch_size to report the number of observed examples in the evaluation loop.

* Trigger CI

Co-authored-by: jrenner <joseph.renner@inria.fr>
2021-05-26 11:59:06 -04:00
Patrick von Platen
d5a72b6e19
[Flax] Allow dataclasses to be jitted (#11886)
* fix_torch_device_generate_test

* remove @

* change dataclasses to flax ones

* fix typo

* fix jitted tests

* fix bert & electra
2021-05-26 15:01:13 +01:00
talkhaldi
e6126e1932
Correcting comments in T5Stack to reflect correct tuple order (#11330)
* Correcting comments to reflect correct tuple order

In order to match the actual order (line 513 and 516, and as accessed in 968), I've changed the order mentioned in comments L962 and L966-967.

* Update modeling_t5.py

Updating another comment as well

* Removing extra space

* Fixing style and quality

* style & quality

* Update src/transformers/models/t5/modeling_t5.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-05-26 14:07:23 +01:00
Daniel Stancl
0b93358447
Fix usage of head masks by TF encoder-decoder models' generate() function (#11775)
* Fix Bart

* Fix Blenderbot{,_small}

* Fix LED

* Fix Marian

* Fix MBart

* Fix Pegasus

* Fix T5

* Add test for generation with head_mask

* Add a common TF test

* Override a test for the LED model as head masking is not yet properly implemented

* Remove all head_masks from input preparation for LED

* Drop masking for T5 as it needs a bit of refactor
2021-05-26 14:02:44 +01:00
francescorubbo
0b0a598452
Ensure input tensor are on device. (#11874)
The feature extractor does not create tensors on the appropriate device,
so we call `ensure_tensor_on_device` before feeding the processed inputs
to the model.
2021-05-26 04:19:37 -04:00
Ahmet Akkoç
a9c797f93d
[Wav2Vec2ForCTC] example typo fixed (#11878) 2021-05-25 17:06:14 -04:00
Stas Bekman
1b6530104d
[Examples] create model with custom config on the fly (#11798)
* create custom model on the flight

* better wording

* add update_from_string

* cleanup

* cleanup

* Update src/transformers/configuration_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* more bool options

* style

* fix logger

* add test

* add the doc

* assert on conflict of options

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-05-25 10:40:49 -07:00
Stas Bekman
6287c929c1
[lm examples] fix overflow in perplexity calc (#11855)
* fix overflow in perplexity calc

* use inf

* fix
2021-05-25 08:11:26 -07:00
Patrick von Platen
7630c11f32
[Wav2Vec2] SpecAugment Fast (#11764)
* first try

* finish
2021-05-25 13:59:52 +01:00
Sylvain Gugger
f086652b16
Add option to log only once in multinode training (#11819)
* Add option to long only once in multinode training

* Use an alternate property
2021-05-25 08:03:43 -04:00
Wang Ran (汪然)
b8344a274f
typo (#11858) 2021-05-25 04:23:46 -04:00
Shiro T
f9880f62ad
fixed a small typo in the doc (#11856) 2021-05-25 04:18:55 -04:00
Lysandre Debut
6da129cb31
Enable memory metrics in tests that need it (#11859) 2021-05-25 04:06:19 -04:00
Lysandre Debut
db0b2477cc
Add some tests to the slow suite #11860 2021-05-25 04:06:06 -04:00
Sylvain Gugger
afe479adb5
[Trainer] Report both steps and num samples per second (#11818)
* [Trainer] Report both steps and num samples per second

* Fix batch number

* Update src/transformers/trainer_utils.py

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Address review comments

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2021-05-24 19:51:42 -04:00
Nick Lane-Smith
eaab9397cd
Fix two typos in docs (#11852)
* typo2

* fix typo
2021-05-24 14:26:02 -04:00
Teven
8a2a3a25af
Fix flos single node (#11844)
* fixing flos bug/typo in non-distributed setting

* storing flos every logging_interval
2021-05-24 20:15:52 +02:00
Sylvain Gugger
adb785b0fe
Switch mem metrics flag (#11851)
* Switch mem metrics flag

* Update src/transformers/training_args.py

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2021-05-24 13:30:39 -04:00
Sylvain Gugger
fcdb85e9d2
Fix reference to XLNet (#11846) 2021-05-24 09:26:40 -04:00
Patrick von Platen
f580604157
[Flax] Fix PyTorch import error (#11839)
* fix_torch_device_generate_test

* remove @

* change pytorch import to flax import
2021-05-24 10:41:10 +01:00
Lysandre Debut
0cbddfb190
Replace double occurrences as the last step (#11367) 2021-05-24 03:38:59 -04:00