9300 Commits

Author SHA1 Message Date
Yih-Dar
d481b6414d
Make Flax pt-flax equivalence test more aggressive (#15841)
* Make test_equivalence_pt_to_flax more aggressive

* Make test_equivalence_flax_to_pt more aggressive

* don't use to_tuple

* clean-up

* fix missing test cases + testing on GPU

* fix conversion

* fix `ValueError: assignment destination is read-only`

* Add type checking

* commit to revert later

* Fix

* fix

* fix device

* better naming

* clean-up

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-03-18 18:15:36 +01:00
Clara Meister
c03b6e4259
value check for typical sampling (#16165)
* value check for typical sampling

* value check for typical sampling

* change from float to int comparison

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-03-18 17:05:27 +01:00
Chan Woo Kim
fdc2e643c3
added cbs to notebooks, made copy-paste error fix in generation_utils (#16246) 2022-03-18 17:04:43 +01:00
Suraj Patil
b25b92ac4f
update jax version and re-enable some tests (#16254) 2022-03-18 16:45:39 +01:00
Johannes Kolbe
5709a20416
Add unpack_inputs decorator for ctrl (#16242)
* add unpack_inputs decorator for ctrl

* replace "past" with "past_key_values"

Co-authored-by: Johannes Kolbe <johannes.kolbe@tech.better.team>
2022-03-18 15:33:24 +00:00
Louis Owen
ddbc9ae00b
Update XLM with TF decorator (#16247)
* update XLM with tf decorator

* move to top decorator

* set unpack_inputs as top decorator

Co-authored-by: Louis Owen <yellow@Louis-Owen.local>
2022-03-18 14:07:02 +00:00
Yih-Dar
a6271967c9
Override _pad in LEDTokenizer to deal with global_attention_mask (#15940)
* Override _pad in LEDTokenizer

* Override _pad in LEDTokenizerFast

* add Copied from

* calling the super method

* add comment about -1

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-03-18 13:30:08 +01:00
Zhaofeng Wu
cb2b0276b6
Change assertion to warning when passing past_key_value to T5 encoder (#16153)
* Change assertion to warning when passing past_key_value to T5 encoder

* lint
2022-03-18 12:52:55 +01:00
Nicolas Patry
ecb4662d17
Attention mask is important in the case of batching... (#16222)
* Attention mask is important in the case of batching...

* Improve the fix.

* Making the sentences different enough that they exhibit different predictions.
2022-03-18 10:02:12 +01:00
NielsRogge
ec4e421b7d
Update expected slices for pillow > 9 (#16117)
* Update expected slices for pillow > 9

* Add expected slices depending on pillow version

* Add different slices depending on pillow version for other models

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-03-18 09:46:45 +01:00
Kshitiz Sharma
12d1f07770
integrations: mlflow: skip start_run() if a run is already active and sanity check on enabling integration (#16131)
* integrations: mlflow: skip start_run() call if a run is already active

* integrations: typo fix

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-03-17 16:39:57 -04:00
Stas Bekman
47cccb5318
[Deepspeed] non-HF Trainer doc update (#16238) 2022-03-17 13:33:55 -07:00
Patrick von Platen
8a96b0f10a
[Generate Docs] Correct docs (#16133)
* [Generate Docs] Correct docs

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2022-03-17 20:05:28 +01:00
Suraj Patil
632ff3c39e
[FlaxSpeechEncoderDecoderModel] Skip from_encoder_decoder_pretrained (#16236)
* skip the test

* fix

* fix skip
2022-03-17 20:05:14 +01:00
Boris Dayma
b6e06c845f
fix(flax): generate with logits processor/warper (#16231) 2022-03-17 19:39:16 +01:00
Johannes Kolbe
1c1e377e99
TF - add unpack_inputs decorator for marian (#16226)
* add unpack_inputs decorator

* small fix for attn_mask string

Co-authored-by: Johannes Kolbe <johannes.kolbe@tech.better.team>
2022-03-17 18:23:40 +00:00
罗崚骁(LUO Lingxiao)
81643edda5
Support PEP 563 for HfArgumentParser (#15795)
* Support PEP 563 for HfArgumentParser

* Fix issues for Python 3.6

* Add test for string literal annotation for HfArgumentParser

* Remove wrong comment

* Fix typo

* Improve code readability

* Use `isinstance` to compare types to pass quality check

* Fix style

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-03-17 13:51:37 -04:00
Suraj Patil
93d3fd8645
remove jax.ops.index (#16220) 2022-03-17 17:51:43 +01:00
Ulaş "Sophylax" Sert
8481ecefbd
Fix Type Hint of Nan/Inf Logging Filter Arg (#16227) 2022-03-17 11:05:38 -04:00
Lysandre Debut
5a6b3ccd28
Skip equivalence test for TransfoXL (#16224)
* Skip test for TransfoXL

* Single list
2022-03-17 09:03:07 -04:00
Rahul
abd503d939
TF - Adding Unpack Decorator For DPR model (#16212)
* Adding Unpack Decorator

* Adding Unpack Decorator-moved it on top
2022-03-17 12:33:02 +00:00
Francesco Saverio Zuppichini
d9b8d1a9f5
update test (#16219) 2022-03-17 08:11:55 -04:00
Li-Huai (Allan) Lin
7e0d04bed1
Fix readmes (#16217) 2022-03-17 07:47:01 -04:00
Sylvain Gugger
e1da89ccb8
Fix reproducibility in Training for PyTorch 1.11 (#16209) 2022-03-17 07:42:58 -04:00
Dayyan Smith
e5101c2e27
Fix typo (#16208) 2022-03-17 07:21:20 -04:00
Yih-Dar
25b8f9a85b
Fix FlaxRoFormerClassificationHead activation (#16168)
* fix activation

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-03-17 11:45:50 +01:00
NielsRogge
03c14a515f
[Tests] Fix DiT test (#16218)
* Fix device

* Clean up

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-03-17 10:53:57 +01:00
Lysandre Debut
73f0a5d1f6
Fixes Loss for TransfoXL when using Trainer API v2 (#16140)
* fix(transfo_xl): Fixes TransfoXL support when using Trainer.

* fix(tests): Uses losses_1 and losses_2 pattern with TransfoXL test.

* fix(transfo_xl): Adds requested changes to allow for backward compatibility.

fix(transfo_xl): Fixes code styling.

* Backward compatibility

* Update src/transformers/models/transfo_xl/modeling_transfo_xl.py

Co-authored-by: Gustavo de Rosa <gth.rosa@uol.com.br>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-03-17 05:49:24 -04:00
Francesco Saverio Zuppichini
76c74b37c1
VAN: update modules names (#16201)
* done

* done
2022-03-17 10:25:09 +01:00
João Gustavo A. Amorim
99e2982f3e
Add/type annotations/model vision (#16151)
* add types annotations for Beit (PyTorch)

* add types annotations for ViT (PyTorch)

* add types annotations for Deit (PyTorch)

* change Optional[bool] to bool into some places at Beit

* change Optional[bool] to bool into some places at ViT
2022-03-16 20:27:54 +00:00
Patrick von Platen
2410d0f8ed
Fix generation min length (#16206)
* up

* fix min lengths
2022-03-16 18:49:23 +01:00
Francesco Saverio Zuppichini
667b823b89
Swin support for any input size (#15986)
* padding done

* correctly return one attention per layer

* almost correct, attentions are not flattened, one tuple per stage

* tests green

* doc

* conversations

* reshaping hidden_states

* view in the test

* reshape_hidden_states in Encoder and Model

* new outputs with reshaped_hidden_states

* conversations

* doc

* Update docs/source/model_doc/swin.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* conversations

* fix tests

* minor changes

* resolved conversations

* attentions one per stage

* typo

* typos

* typos

* function signature

* CI

* clean up tests

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
2022-03-16 18:38:25 +01:00
Joao Gante
204c54d411
TF: add beam search tests (#16202) 2022-03-16 15:44:33 +00:00
Suraj Patil
190994573a
Fix loading CLIPVisionConfig and CLIPTextConfig (#16198)
* override from_pretrained

* add tests

* remove docstrings

* fix typo

* Trigger CI
2022-03-16 16:24:01 +01:00
Yih-Dar
09013efdf1
Update step name (#16189)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-03-16 11:19:38 -04:00
Francesco Saverio Zuppichini
36f8c42519
ResNet: update modules names (#16196)
* updated names

* fit in one line

* typo
2022-03-16 15:59:56 +01:00
John Ryan
5bdf3313ef
Adding type hints for Distilbert (#16090)
* Distilbert type - squash

* Update src/transformers/models/distilbert/modeling_distilbert.py

Undo cleanup

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Update src/transformers/models/distilbert/modeling_distilbert.py

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Update src/transformers/models/distilbert/modeling_distilbert.py

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Update src/transformers/models/distilbert/modeling_distilbert.py

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Remove type

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2022-03-16 14:54:50 +00:00
Utku Saglam
0b8b06185d
clearer model variable naming: blenderbot_small (#16194)
Co-authored-by: utku saglam <utkusaglam@utku-MacBook-Pro.local>
2022-03-16 14:03:58 +00:00
Johannes Kolbe
f06c2c2ba1
TF unpack_input decorator for convnext (#16181)
* unpack_input decorator for tf_convnext

* set unpack_input as top decorator

Co-authored-by: Johannes Kolbe <johannes.kolbe@tech.better.team>
2022-03-16 14:01:32 +00:00
Anton Lozhkov
d35e0c6247
Minor fixes to XTREME-S (#16193)
* Minor fixes

* Fix vocab union

* Update examples/research_projects/xtreme-s/README.md

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update README

* unused import

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-03-16 17:23:00 +04:00
Utku Saglam
8cc925a241
TF clearer model variable naming: blenderbot (#16192)
Co-authored-by: utku saglam <utkusaglam@utku-MacBook-Pro.local>
2022-03-16 12:37:08 +00:00
Utku Saglam
0f35cda459
TF clearer model variable naming: funnel (#16178)
Co-authored-by: utku saglam <utkusaglam@utku-MacBook-Pro.local>
2022-03-16 10:37:47 +00:00
Sanchit Gandhi
ee27b3d7df
Replace all deprecated jax.ops operations with jnp's at (#16078)
* Replace all deprecated `jax.ops` operations with jnp's `at`

* np to jnp scores

* suggested changes
2022-03-16 09:08:55 +00:00
Patrick von Platen
c2dc89be62
[Xtreme-S] fix some namings (#16183) 2022-03-16 01:21:31 +01:00
Anton Lozhkov
99fd3eb4a5
Add the XTREME-S fine-tuning example (#15985)
* CTC+classification draft

* CTC+classification draft

* style

* multilingual runs

* Fix race condition during processor.from_pretrained

* Merge covost experiments

* Add README

* Quality

* Switch to .all configs

* Fix typos
2022-03-16 00:21:06 +01:00
Sylvain Gugger
db4dd44ae3
Trigger doc build 2022-03-15 17:00:31 -04:00
Yih-Dar
ea05d67164
Fix some Flax models' hidden_states (#16167)
* fix the last element in `hidden_states`

* fix missing elements in outputs for FlaxWav2Vec2EncoderLayerStableLayerNormCollection

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-03-15 19:06:46 +01:00
Dan Tegzes
88f7c564f0
Added type hints for Reformer (#16175) 2022-03-15 17:59:59 +00:00
Jack McDonald
16399d6197
Add type annotations for Perceiver (#16174) 2022-03-15 17:56:57 +00:00
Kamal Raj
015de6f081
TF clearer model variable naming: xlnet (#16150) 2022-03-15 17:50:30 +00:00