Commit Graph

9852 Commits

Author SHA1 Message Date
Michael Benayoun
2e7e4280aa
Traced models serialization and torchscripting fix (#17206)
* Fix torch.jit.script and pickling issues

* Fix get_attr issues

* Fix import in function

* Fix GPT-J and T5 tracing for torch=1.11

* Gate graph surgery on torch version

* Modeling minor changes to enable TorchScripting

* Model serialization / deserialization test

* Remove _assert_is_none users
2022-05-23 17:50:40 +02:00
Maximilian Schmidt
1cd01b0af3
Fix Comet ML integration (#17381)
Callback function `on_train_end` crashed if Comet ML integration was
used but `COMET_MODE` set to `DISABLE`
2022-05-23 10:43:10 -04:00
Anugunj Naman
c86aad6110
Fix cvt docstrings (#17367) 2022-05-23 16:11:09 +02:00
ghlai9665
7b8cb26953
Correct & Improve Doctests for LayoutLMv2 (#17168)
* add inference example to LayoutLMv2ForQuestionAnswering, passing doctest

* add loss example to LayoutLMv2ForQuestionAnswering, passing doctest

* Add correct doctest for LayoutLMv2ForTokenClassification, passing doctest

* add correct doctest for LayoutLMv2ForSequenceClassification, passing test

* add correct doctest for LayoutLMv2Model, passing test

* make fixup

* fix to address review comments

* make style

* fix doctest line break issue, add to documentaiton_tests.txt, address review comments

* move comment about layoutlmv2 dependencies to the doc page

* format doc page as suggested

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* delete extraneous backtick

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-05-23 08:02:31 -04:00
Loubna Ben Allal
b48ac1a094
Fix CodeParrot training script (#17291)
* average loss over batches and accumulated steps for tracking

* fix layernorm weight decay

* use AdamW from Pytorch instead of Transformers

* add shuffling of sequences inside the batches

* add shuffling of sequences inside the batches

* add logging dir and reformat code

* fix lr tracking

* remove Mistral scaling

* keep Mistral scaling

* reformat code

* fix error

* fix error

* use shuffling function from Pytorch

* remove argument for shuffling batch sequences as it isn't optional

* update package versions and install accelerate from source

* remove unused package

* Update loss average over accumulated steps

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* Update loss average over accumulated steps

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* use one shuffle buffer argument

* compute avg_loss in one line

Co-authored-by: Loubna ben allal <loubnabenallal@gmail.com>
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
2022-05-23 12:55:35 +02:00
Daniel Stancl
b9bb417324
Fix a typo relative_postion_if_large -> relative_position_if_large (#17366) 2022-05-20 18:41:12 +02:00
Sylvain Gugger
3fd7de49f4
Pin dill to fix examples (#17368)
* Pin dill for now

* Try this version?

* force install

* Actually use dep in testing

* Try a larger pin
2022-05-20 11:00:58 -04:00
Patrick von Platen
54192058f3
[Test OPT] Add batch generation test opt (#17359)
* up

* up
2022-05-19 23:46:26 +02:00
ddobokki
48c22691e3
Fix bug in Wav2Vec2 pretrain example (#17326) 2022-05-19 22:42:44 +02:00
Nathan Dahlberg
5d6feecf16
fix for 17292 (#17293) 2022-05-19 22:21:19 +02:00
Patrick von Platen
518bd02c9b
[Generation] Fix Transition probs (#17311)
* [Draft] fix transition probs

* up

* up

* up

* make it work

* fix

* finish

* update
2022-05-19 22:17:02 +02:00
Patrick von Platen
e8714c0307
[OPT] Run test in lower precision on GPU (#17353)
* [OPT] Run test only in half precision

* up

* up

* up

* up

* finish

* fix on GPU

* Update tests/models/opt/test_modeling_opt.py
2022-05-19 22:15:36 +02:00
Nicolas Patry
2b282296f1
Adding batch_size test to QA pipeline. (#17330) 2022-05-19 14:28:12 -04:00
Nicolas Patry
a4386d7e40
[BC] Fixing usage of text pairs (#17324)
* [BC] Fixing usage of text pairs

The BC is actually preventing users from misusing the pipeline since
users could have been willing to send text pairs and the pipeline would
instead understand the thing as a batch returning bogus results.

The correct usage of text pairs is preserved in this PR even when that
makes the code clunky.

Adds support for {"text":..,, "text_pair": ...} inputs for both dataset
iteration and more explicit usage to pairs.

* Updating the doc.

* Update src/transformers/pipelines/text_classification.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/pipelines/text_classification.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update tests/pipelines/test_pipelines_text_classification.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* quality.

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2022-05-19 10:29:16 +02:00
Stas Bekman
3601aa8fc9
[tests] fix copy-n-paste error (#17312)
* [tests] fix copy-n-paste error

* fix
2022-05-18 16:00:47 -07:00
Yih-Dar
1b20c970a2
Fix ci_url might be None (#17332)
* fix

* Update utils/notification_service.py

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
2022-05-18 21:49:08 +02:00
Yih-Dar
6aad3872ce
fix (#17337)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-05-18 15:26:44 -04:00
Zachary Mueller
1762ded30a
Fix metric calculation in examples and setup tests to run on multi-gpu for no_trainer scripts (#17331)
* Fix length in no_trainer examples

* Add setup and teardown

* Use new accelerator config generator to automatically make tests able to run based on environment
2022-05-18 14:17:40 -04:00
Jader Martins
6e195eb9de
docs for typical decoding (#17186)
Co-authored-by: Jader Martins <jadermcs94@gmail.com>
2022-05-18 19:18:43 +02:00
Yih-Dar
060fe61dff
Not send successful report (#17329)
* send report only if there is any failure

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-05-18 19:07:48 +02:00
Yih-Dar
b3b9f99ed2
Fix test_t5_decoder_model_past_large_inputs (#17320)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-05-18 17:57:23 +02:00
Jingya HUANG
6da76b9c2a
Add onnx export cuda support (#17183)
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2022-05-18 17:52:13 +02:00
NielsRogge
adc0ff2502
Add CvT (#17299)
* Adding cvt files

* Adding cvt files

* changes in init file

* Adding cvt files

* changes in init file

* Style fixes

* Address comments from code review

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Format lists in docstring

* Fix copies

* Apply suggestion from code review

Co-authored-by: AnugunjNaman <anugunjjha@gmail.com>
Co-authored-by: Ayushman Singh <singhayushman13@protonmail.com>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-05-18 17:47:18 +02:00
Sylvain Gugger
4710702837 Fix style 2022-05-18 10:46:40 -04:00
mraunak
5fdb54ece7
Add Information Gain Filtration algorithm (#16953)
* Add information gain filtration algorithm

* Complying with black requirements

* Added author

* Fixed import order

* flake8 corrections

Co-authored-by: Javier Turek <javier.turek@intel.com>
2022-05-18 10:39:02 -04:00
Kamal Raj
91ede485a7
Fix typo (#17328) 2022-05-18 10:29:53 -04:00
Yih-Dar
fe28eb9452
remove (#17325)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-05-18 10:06:41 -04:00
Nicolas Patry
2cb2ea3fa1
Accepting real pytorch device as arguments. (#17318)
* Accepting real pytorch device as arguments.

* is_torch_available.
2022-05-18 10:06:24 -04:00
Nicolas Patry
1c9d1f4ca8
Updating the docs for max_seq_len in QA pipeline (#17316) 2022-05-18 15:46:12 +02:00
Patrick von Platen
60ad73448c
[T5] Fix init in TF and Flax for pretraining (#17294)
* fix init

* Apply suggestions from code review

* fix

* finish

* Update src/transformers/modeling_tf_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-05-18 15:08:56 +02:00
Joaq
7ba1d4e51f
Add type hints for ProphetNet (Pytorch) (#17223)
* added type hints to prophetnet

* reformatted with black

* fix bc black misformatted some parts

* fix imports

* fix imports

* Update src/transformers/models/prophetnet/configuration_prophetnet.py

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* update OPTIONAL type hint and docstring

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2022-05-18 13:23:47 +01:00
Carl
d6b8e9cec7
Add trajectory transformer (#17141)
* Add trajectory transformer


Fix model init


Fix end of lines for .mdx files

Add trajectory transformer model to toctree

Add forward input docs

Fix docs, remove prints, simplify prediction test

Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Update docs, more descriptive comments

Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Update readme

Small comment update and add conversion script

Rebase and reformat

Fix copies

Fix rebase, remove duplicates

Fix rebase, remove duplicates

* Remove tapex

* Remove tapex

* Remove tapex
2022-05-17 19:07:43 -04:00
Patrick von Platen
c35264007b
fix (#17310) 2022-05-17 18:34:31 -04:00
Cesare Campagnano
d9050dc768
[LED] fix global_attention_mask not being passed for generation and docs clarification about grad checkpointing (#17112)
* [LED] fixed global_attention_mask not passed for generation + docs clarification for gradient checkpointing

* LED docs clarification

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* [LED] gradient_checkpointing=True should be passed to TrainingArguments

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* [LED] docs: remove wrong word

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* [LED] docs fix typo

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-05-17 23:44:37 +02:00
Jean Vancoppenolle
bad358398a
Add support for pretraining recurring span selection to Splinter (#17247)
* Add SplinterForSpanSelection for pre-training recurring span selection.

* Formatting.

* Rename SplinterForSpanSelection to SplinterForPreTraining.

* Ensure repo consistency

* Fixup changes

* Address SplinterForPreTraining PR comments

* Incorporate feedback and derive multiple question tokens per example.

* Update src/transformers/models/splinter/modeling_splinter.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/splinter/modeling_splinter.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

Co-authored-by: Jean Vancoppenole <jean.vancoppenolle@retresco.de>
Co-authored-by: Tobias Günther <tobias.guenther@retresco.de>
Co-authored-by: Tobias Günther <github@tobigue.de>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-05-17 23:42:14 +02:00
Yih-Dar
0511305549
Add PR author in CI report + merged by info (#17298)
* Add author info to CI report

* Add merged by info

* update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-05-17 12:56:58 -04:00
Sylvain Gugger
032d63b976
Fix dummy creation script (#17304) 2022-05-17 12:56:24 -04:00
Sylvain Gugger
986dd5c5bf Fix style 2022-05-17 12:50:14 -04:00
Karim Foda
38ddab10da
Doctest longformer (#16441)
* Add initial doctring changes

* make fixup

* Add TF doc changes

* fix seq classifier output

* fix quality errors

* t

* swithc head to random init

* Fix expected outputs

* Update src/transformers/models/longformer/modeling_longformer.py

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2022-05-17 18:32:12 +02:00
Patrick von Platen
10704e1209
[Test] Fix W2V-Conformer integration test (#17303)
* [Test] Fix W2V-Conformer integration test

* correct w2v2

* up
2022-05-17 18:20:36 +02:00
regisss
28a0811652
Improve mismatched sizes management when loading a pretrained model (#17257)
- Add --ignore_mismatched_sizes argument to classification examples

- Expand the error message when loading a model whose head dimensions are different from expected dimensions
2022-05-17 17:58:14 +02:00
Patrick von Platen
1f13ba818e
correct opt (#17301) 2022-05-17 15:48:23 +02:00
Matt
349f1c85d3
Rewrite TensorFlow train_step and test_step (#17057)
* Initial commit

* Better label renaming

* Remove breakpoint before pushing (this is your job)

* Test a lot more in the Keras fit() test

* make fixup

* Clarify the case where we flatten y dicts into tensors

* Clarify the case where we flatten y dicts into tensors

* Extract label name remapping to a method
2022-05-17 14:36:23 +01:00
Matt
651e48e1e5
Fix tests of mixed precision now that experimental is deprecated (#17300)
* Fix tests of mixed precision now that experimental is deprecated

* Fix mixed precision in training_args_tf.py too
2022-05-17 14:14:17 +01:00
SaulLu
6d211429ec
fix retribert's test_torch_encode_plus_sent_to_model (#17231) 2022-05-17 14:33:13 +02:00
NielsRogge
ec7f8af106
[ConvNeXT] Fix drop_path_rate (#17280)
* Fix drop_path_rate

* Fix TF's drop path rate
2022-05-17 07:37:48 -04:00
Yih-Dar
a26ab95e30
Fix wrong PT/TF categories in CI report (#17272)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-05-17 09:32:47 +02:00
Yih-Dar
1ac2b8fa7f
Fix missing job action button in CI report (#17270)
* use matrix.machine_type

* fix job names used in job_link

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-05-17 08:31:06 +02:00
Patrick von Platen
5a9957358c
Add Wav2Vec2Conformer (#16812)
* save intermediate

* add wav2vec2 conformer

* add more code

* more

* first test passes

* make all checkpoints work

* update

* up

* more clean ups

* save clean-up

* save clean-up

* save more

* remove bogus

* finalize design conformer

* remove vision

* finish all tests

* more changes

* finish code

* add doc tests

* add slow tests

* fix autoconfig test

* up

* correct docstring

* up

* update

* fix

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>

* Update docs/source/en/model_doc/wav2vec2-conformer.mdx

* upload

* save copied from

* correct configs

* fix model outputs

* add to docs

* fix imports

* finish

* finish code

* correct copied from

* correct again

* correct make fix

* improve make fix copies

* save

* correct fix copy from

* correct init structure

* correct

* fix import

* apply suggestions

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
2022-05-17 00:43:16 +02:00
Kyungmin Lee
f0395cf58e
Fix test_model_parallelization (#17249)
* Fix test_model_parallelization

* Modify
2022-05-16 23:30:49 +02:00