Commit Graph

7435 Commits

Author SHA1 Message Date
Stas Bekman
d25ad34c82
[CI] add dependency table sync verification (#12364)
* add dependency table sync verification

* improve the message

* improve the message

* revert

* ready to merge
2021-06-28 08:55:59 -07:00
Sylvain Gugger
57461ac0b4
Add possibility to maintain full copies of files (#12312) 2021-06-28 10:02:53 -04:00
Taha ValizadehAslani
9490d668d2
Update run_mlm.py (#12344)
Before the code could not be used for validation only because of this line:
extension = data_args.train_file.split(".")[-1]
was assuming that extension must be extracted from the training dataset. This line would run regardless of the training or validation options of the user. This would lead to an error if the user only wants to run an evaluation only and does not want to do train (because the training file does not exist). I modified it to extract extension from the training file if the user wants to do train and extract it from the validation file if the user wants to run eval. This way the code can be used for both training and validation separately.
2021-06-28 07:49:22 -04:00
Kilian Kluge
c7faf2ccc0
[Documentation] Warn that DataCollatorForWholeWordMask is limited to BertTokenizer-like tokenizers (#12371)
* Notify users that DataCollatorForWholeWordMask is limited to BertTokenier-like tokenizers

* Fix code formatting
2021-06-28 07:39:56 -04:00
Bhadresh Savani
ff5cdc086b
replace print with logger (#12368) 2021-06-26 09:31:25 -07:00
Bhadresh Savani
9a7545943d
updated example template (#12365) 2021-06-25 20:50:30 -07:00
Bhadresh Savani
539ee456d4
[Examples] Replicates the new --log_level feature to all trainer-based pytorch (#12359)
* added log_level

* fix comment

* fixed log_level

* Trigger CI

* Unfied logging

* simplified args for log_level
2021-06-25 14:58:42 -07:00
Stas Bekman
64e6098094
[trainer] add main_process_first context manager (#12351)
* main_process_first context manager

* handle multi-node, add context description

* sync desc
2021-06-25 14:58:03 -07:00
cronoik
f866425898
fixed multiplechoice tokenization (#12362)
* fixed multiplechoice tokenization

The model would have seen two sequences:
1. [CLS]prompt[SEP]prompt[SEP]
2. [CLS]choice0[SEP]choice1[SEP]
that is not correct as we want a contextualized embedding of prompt and choice

* removed outer brackets for proper sequence generation
2021-06-25 17:41:08 -04:00
Stas Bekman
4a872caef4
remove extra white space from log format (#12360) 2021-06-25 13:20:14 -07:00
Sylvain Gugger
a3daabfe14 Style 2021-06-25 15:54:31 -04:00
Kai Fricke
238521b0b6
Replace NotebookProgressReporter by ProgressReporter in Ray Tune run (#12357)
* Replace NotebookProgressReporter by ProgressReporter in Ray Tune run

* Move to local import
2021-06-25 14:12:03 -04:00
Vasudev Gupta
332a245861
Add FlaxBigBird QuestionAnswering script (#12233)
* port bigbird script

* adapt script a bit

* change location

* adapt more

* save progress

* init commit

* style

* dataset script tested

* readme add
2021-06-25 18:05:48 +01:00
jglaser
55bb4c06f7
Fix exception in prediction loop occurring for certain batch sizes (#12350)
* fix distributed_concat for scalar outputs

* Update README.md

* fixed typo (#12356)

* simplify fix with terser syntax

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Trigger CI

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: michal pitr <21157924+MichalPitr@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-06-25 10:55:15 -04:00
michal pitr
d4ce31e839
fixed typo (#12356) 2021-06-25 07:49:29 -04:00
Patrick von Platen
aa550c4a11
Update README.md 2021-06-25 11:55:51 +01:00
Marc van Zee
f2c4ce7e33
Add flax/jax quickstart (#12342) 2021-06-24 17:04:18 +01:00
Sylvain Gugger
5b1b5635d3 Document patch release v4.8.1 2021-06-24 10:15:15 -04:00
Lysandre Debut
8ef62ec9e1
Fix torchscript tests (#12336)
* Fix torchscript tests

* Better test

* Remove bogus print
2021-06-24 09:52:28 -04:00
Suraj Patil
aef3823e1a
[examples/Flax] move the examples table up (#12341) 2021-06-24 16:03:37 +05:30
Richard Liaw
7875b638cd
try-this (#12338)
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
2021-06-24 04:13:17 -04:00
Sylvain Gugger
cf3c9198aa Fix default to logging_dir lost in merge conflict 2021-06-23 16:22:29 -04:00
Stas Bekman
07ae6103c3
[Deepspeed] new docs (#12077)
* document sub_group_size

* style

* install + issues reporting

* style

* style

* Update docs/source/main_classes/deepspeed.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* indent 4

* restore

* style

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-06-23 11:07:37 -07:00
Sam Havens
3694484d0a
Update training_args.py (#12328)
mention in `save_strategy` param description that `load_best_model_at_end` can override
2021-06-23 13:39:43 -04:00
Sylvain Gugger
2150dfed31 v4.9.0.dev0 2021-06-23 13:31:19 -04:00
Sylvain Gugger
9252a5127f Release: v4.8.0 2021-06-23 13:25:56 -04:00
Patrick von Platen
468cda20f2
[Flax T5] Fix weight initialization and fix docs (#12327)
* finish t5 flax fixes

* improve naming
2021-06-23 17:39:21 +01:00
Sylvain Gugger
12a4457c56 Pin good version of huggingface_hub 2021-06-23 12:30:15 -04:00
Michael Benayoun
986ac03e37
changed modeling_fx_utils.py to utils/fx.py for clarity (#12326)
Co-authored-by: Michael Benayoun <michael@huggingface.co>
2021-06-23 18:16:24 +02:00
Lysandre
941b4442ba Temporarily revert the fill-mask improvements. 2021-06-23 17:46:24 +02:00
Lysandre Debut
4bdff2cdbe
Conda build (#12323) 2021-06-23 11:07:07 -04:00
Sylvain Gugger
9eda6b52e2
Add all XxxPreTrainedModel to the main init (#12314)
* Add all XxxPreTrainedModel to the main init

* Add to template

* Add to template bis

* Add FlaxT5
2021-06-23 10:40:54 -04:00
Sylvain Gugger
53c60babe4
Clean push to hub API (#12187)
* Clean push to hub API

* Create working dir if it does not exist

* Different tweak

* New API + all models + test Flax

* Adds the Trainer clean up

* Update src/transformers/file_utils.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Address review comments

* (nit) output types

* No need to set clone_from when folder exists

* Update src/transformers/trainer.py

Co-authored-by: Julien Chaumond <julien@huggingface.co>

* Add generated_from_trainer tag

* Update to new version

* Fixes

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Julien Chaumond <julien@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2021-06-23 10:11:19 -04:00
chenht2010
625f512d5e
[TFWav2Vec2] Fix docs (#12283)
* fix error

* make style check happy

Co-authored-by: chenhaitao <chenhaitao@qiyi.com>
2021-06-23 14:51:31 +01:00
Patrick von Platen
44739c8180
[Flax/JAX] Add how to propose projects markdown (#12311)
* fix_torch_device_generate_test

* remove @

* finish

* make style
2021-06-23 14:50:35 +01:00
Lysandre Debut
ef3dceff4a
Add mention of the huggingface_hub methods for offline mode (#12320) 2021-06-23 09:45:30 -04:00
Vasudev Gupta
e98233dde1
Flax T5 (#12150)
* copy pytorch-t5

* init

* boom boom

* forward pass same

* make generation work

* add more tests

* make test work

* finish normal tests

* make fix-copies

* finish quality

* correct slow example

* correct slow test

* version table

* upload models

* Update tests/test_modeling_flax_t5.py

* correct incorrectly deleted line

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-06-23 13:13:32 +01:00
David Fan
7d4cfa3b47
Rewrite ProphetNet to adapt converting ONNX friendly (#11981)
* Rewrite

* [ONNX] rewrite
2021-06-23 11:34:18 +01:00
Suraj Patil
c0fe3c9a7a
Flax summarization script (#12230)
* add summrization script

* fix arguments, preprocessing, metrics

* add generation and metrics

* auto model, prediction loop

* prettify

* label smoothing

* adress Sylvain and Patricks suggestions

* dynamically import shift_tokens_right

* fix shift_tokens_right_fn call
2021-06-23 15:49:30 +05:30
Daniel Stancl
26a2e36595
Add output in a dictionary for TF generate method (#12139)
* Add output args to greedy search

* Fix critical typo + make style quality

* Handle generate_beam_search

* Add dict_specific tests and fix the placement of encoder outputs

* Add  specific outputs

* Update doc

* Fix typo

* Adjust handling encoder_outputs + Fix generating for T5

* Fix generate for RAG

* Fix handling ouptut_attentions when target_mapping is not None

Take care of situations when target_mapping is provided
as there are 2-tuple of attentions

Change from:
if inputs["output_attentions"]:
    attentions = tuple(tf.transpose(t, perm(2, 3, 0, 1)) for t in attentions)

to:
if inputs["output_attentions"]:
    if inputs["target_mapping"] is not None:
        # when target_mapping is provided, there are 2-tuple of attentions
         attentions = tuple(
             tuple(tf.transpose(attn_stream, perm=(2, 3, 0, 1)) for attn_stream in t) for t in attentions
        )
    else:
        attentions = tuple(tf.transpose(t, perm=(2, 3, 0, 1)) for t in attentions)

* Rename kwargs to model_kwargs

* make style quality

* Move imports in test_modeling_tf_common.py

Move ModelOutput-related imports in test_modeling_tf_common.py
into the `is_tf_available():` statement.

* Rewrite nested if-statements

* Fix added tests
2021-06-23 10:52:11 +01:00
Nicolas Patry
d4be498441
Optimizing away the fill-mask pipeline. (#12113)
* Optimizing away the `fill-mask` pipeline.

- Don't send anything to the tokenizer unless needed. Vocab check is
much faster
- Keep BC by sending data to the tokenizer when needed. User handling warning messages will see performance benefits again
- Make `targets` and `top_k` work together better `top_k` cannot be
higher than `len(targets)` but can be smaller still.
- Actually simplify the `target_ids` in case of duplicate (it can happen
because we're parsing raw strings)
- Removed useless code to fail on empty strings. It works only if empty
string is in first position, moved to ignoring them instead.
- Changed the related tests as only the tests would fail correctly
(having incorrect value in first position)

* Make tests compatible for 2 different vocabs... (at the price of a
warning).

Co-authored-by: @EtaoinWu

* ValueError working globally

* Update src/transformers/pipelines/fill_mask.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* `tokenizer.vocab` -> `tokenizer.get_vocab()` for more compatiblity +
fallback.

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-06-23 10:38:04 +02:00
Kevin Canwen Xu
037e466b10
Add CodeCarbon Integration (#12304)
* Add optional dependency

* Add CodeCarbon integration

* Add CodeCarbon integration

* Add CodeCarbon integration

* typo
2021-06-23 14:53:09 +08:00
Stas Bekman
bfd5da8e28
[docs] performance (#12258)
* initial performance document

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* rewrites based on suggestions

* 8x multiple is for AMP only

* add contribute section

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-06-22 15:34:19 -07:00
Sylvain Gugger
1562c04e41
FlaxBartPretrainedModel -> FlaxBartPreTrainedModel (#12313) 2021-06-22 16:37:05 -04:00
Stas Bekman
ebe5413589
[trainer] 2 bug fixes and a rename (#12309)
* bug fixes and a rename

* add extended DDP test
2021-06-22 11:13:23 -07:00
Patrick von Platen
64029abe4c
[Flax] Main doc for event orga (#12305)
* fix_torch_device_generate_test

* remove @

* push

* finish

* some typos

* add more info on communication

* add suggestions
2021-06-22 18:02:52 +01:00
Kilian Kluge
032d56a435
Fix and improve documentation for LEDForConditionalGeneration (#12303)
* Replace conditional generation example (fixes #12268)

* Replace model in summarization example with finetuned checkpoint, adapt example text

* Fix typo in new summarization example

* Fix docstring formatting, add missing import statement to example
2021-06-22 09:58:13 -04:00
Suraj Patil
1498eb9888
add FlaxAutoModelForImageClassification in main init (#12298) 2021-06-22 18:26:05 +05:30
Stefan Schweter
2affeb2905
trainer_tf: adjust wandb installation command (#12291) 2021-06-22 08:47:31 -04:00
Hamid Shojanazeri
af6e01c5bc
Fix for the issue of device-id getting hardcoded for token_type_ids during Tracing [WIP] (#11252)
* registering a buffer for token_type_ids, to pass the error of device-id getting hardcoded when tracing

* sytle format

* adding persistent flag to the resgitered buffers that prevent from adding them to the state_dict and addresses the Backward compatibility issue

* adding the try catch to the fix as persistent flag is only available from PT >1.6

* adding version check

* added the condition to only use the token_type_ids buffer when its autogenerated not passed by user

* adding comments and making the conidtion where token_type_ids are None to use the registered buffer

* taking out position-embeddding from the if block

* adding comments

* handling the case if buffer for position_ids was not registered

* reverted the changes on position_ids, fix the issue with size of token_type_ids buffer, moved the modification for generated token_type_ids to Bertmodel, instead of Embeddings

* reverting the token_type_ids in case of None to the previous version

* reverting changes on position_ids adding back the if block

* changes added by running make fix-copies

* changes added by running make fix-copies and added the import version as it was getting used

* changes added by running make fix-copies

* changes added by running make fix-copies

* fixing the import format

* fixing the import format

* modified to use temp tensor for trimed and expanded token_type_ids buffer

* changes made by fix-copies after temp tensor modifications

* changes made by fix-copies after temp tensor modifications

* changes made by fix-copies after temp tensor modifications

* clean up

* clean up

* clean up

* clean up

* Nit

* Nit

* Nit

* modified according to support device conversion on traced models

* modified according to support device conversion on traced models

* modified according to support device conversion on traced models

* modified according to support device conversion on traced models

* changes based on latest in master

* Adapt templates

* Add version import

Co-authored-by: Ubuntu <ubuntu@ip-172-31-32-81.us-west-2.compute.internal>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2021-06-22 05:21:30 -04:00