Commit Graph

1676 Commits

Author SHA1 Message Date
Yih-Dar
f466936476
Add has_attentions to TFModelTesterMixin as done on PyTorch side (#16259)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-03-19 11:44:17 +01:00
Yih-Dar
75c666b4a8
Aggressive PT/TF equivalence test on PT side (#16250)
* Aggressive PT/TF equivalence test on PT side

* Ugly fix for `TFTapasForQuestionAnswering`

* apply review suggestions

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-03-18 18:51:24 +01:00
Yih-Dar
d481b6414d
Make Flax pt-flax equivalence test more aggressive (#15841)
* Make test_equivalence_pt_to_flax more aggressive

* Make test_equivalence_flax_to_pt more aggressive

* don't use to_tuple

* clean-up

* fix missing test cases + testing on GPU

* fix conversion

* fix `ValueError: assignment destination is read-only`

* Add type checking

* commit to revert later

* Fix

* fix

* fix device

* better naming

* clean-up

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-03-18 18:15:36 +01:00
Suraj Patil
b25b92ac4f
update jax version and re-enable some tests (#16254) 2022-03-18 16:45:39 +01:00
Nicolas Patry
ecb4662d17
Attention mask is important in the case of batching... (#16222)
* Attention mask is important in the case of batching...

* Improve the fix.

* Making the sentence different enough that they exhibit different
predictions.
2022-03-18 10:02:12 +01:00
NielsRogge
ec4e421b7d
Update expected slices for pillow > 9 (#16117)
* Update expected slices for pillow > 9

* Add expected slices depending on pillow version

* Add different slices depending on pillow version for other models

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-03-18 09:46:45 +01:00
Suraj Patil
632ff3c39e
[FlaxSpeechEncoderDecoderModel] Skip from_encoder_decoder_pretrained (#16236)
* skip the test

* fix

* fix skip
2022-03-17 20:05:14 +01:00
罗崚骁(LUO Lingxiao)
81643edda5
Support PEP 563 for HfArgumentParser (#15795)
* Support PEP 563 for HfArgumentParser

* Fix issues for Python 3.6

* Add test for string literal annotation for HfArgumentParser

* Remove wrong comment

* Fix typo

* Improve code readability

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Use `isinstance` to compare types to pass quality check

* Fix style

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-03-17 13:51:37 -04:00
Lysandre Debut
5a6b3ccd28
Skip equivalence test for TransfoXL (#16224)
* Skip test for TransfoXL

* Single list
2022-03-17 09:03:07 -04:00
Francesco Saverio Zuppichini
d9b8d1a9f5
update test (#16219) 2022-03-17 08:11:55 -04:00
NielsRogge
03c14a515f
[Tests] Fix DiT test (#16218)
* Fix device

* Clean up

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-03-17 10:53:57 +01:00
Lysandre Debut
73f0a5d1f6
Fixes Loss for TransfoXL when using Trainer API v2 (#16140)
* fix(transfo_xl): Fixes TransfoXL support when using Trainer.

* fix(tests): Uses losses_1 and losses_2 pattern with TransfoXL test.

* fix(transfo_xl): Adds requested changes to allow for backward compatibility.

fix(transfo_xl): Adds requested changes to allow for backward compatibility.

fix(transfo_xl): Fixes code styling.

* Backward compatibility

* Update src/transformers/models/transfo_xl/modeling_transfo_xl.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Gustavo de Rosa <gth.rosa@uol.com.br>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-03-17 05:49:24 -04:00
Patrick von Platen
2410d0f8ed
Fix generation min length (#16206)
* up

* fix min lengths
2022-03-16 18:49:23 +01:00
Francesco Saverio Zuppichini
667b823b89
Swin support for any input size (#15986)
* padding done

* correctly return one attention per layer

* almost correct, attentions are not flatten one tuple per stage

* tests green

* doc

* conversations

* reshaping hidden_states

* view in the test

* reshape_hidden_states in Encoder and Model

* new outputs with reshaped_hidden_states

* conversations

* doc

* Update docs/source/model_doc/swin.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* conversations

* fix tests

* minor changes

* resolved conversations

* attentions one per stage

* typo

* typos

* typos

* function signature

* CI

* clean up tests

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
2022-03-16 18:38:25 +01:00
Joao Gante
204c54d411
TF: add beam search tests (#16202) 2022-03-16 15:44:33 +00:00
Suraj Patil
190994573a
Fix loading CLIPVisionConfig and CLIPTextConfig (#16198)
* override from_pretrained

* add tests

* remove docstrings

* fix typo

* Trigger CI
2022-03-16 16:24:01 +01:00
Sanchit Gandhi
ee27b3d7df
Replace all deprecated jax.ops operations with jnp's at (#16078)
* Replace all deprecated `jax.ops` operations with jnp's `at`

* np to jnp scores

* suggested changes
2022-03-16 09:08:55 +00:00
Matt
cd4c5c9060
TF XLA greedy generation (#15786)
* First attempt at TF XLA generation

* Fix comments

* Update XLA greedy generate with direct XLA calls

* Support attention mask, prepare_inputs_for_generation no longer hardcoded for greedy

* Handle position_ids correctly

* make xla generate work for non xla case

* force using xla generate

* refactor

* more fixes

* finish cleaning

* finish

* finish

* clean gpt2 tests

* add gpt2 tests

* correct more cases

* up

* finish

* finish

* more fixes

* flake 8 stuff

* final rag fix

* Update src/transformers/models/rag/modeling_tf_rag.py

* finish t5 as well

* finish

* Update src/transformers/generation_utils.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-03-15 14:19:20 +01:00
NielsRogge
a7aca42fc4
Improve Swin for VisionEncoderDecoder (#16070)
* Add Swin2Bart test

* Fix swin tests

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-03-15 09:59:48 +01:00
Francesco Saverio Zuppichini
0a057201a9
Visual Attention Network (VAN) (#16027)
* encoder works

* addded files

* norm in stage

* convertion script

* tests

* fix copies

* make fix-copies

* fixed __init__

* make fix-copies

* fix

* shapiro test needed

* make fix-copie

* minor changes

* make style + quality

* minor refactor conversion script

* rebase + tests

* removed unused variables

* updated doc

* toctree

* CI

* doc

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* resolved conversations

* make fixup

* config passed to modules

* config passed to modules

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* conversations

* conversations

* copyrights

* normal test

* tests

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
2022-03-15 08:47:12 +01:00
Francesco Saverio Zuppichini
e3008c679f
[WIP] Resnet (#15770)
* first commit

* ResNet model correctly implemented.

basic modeling + weights conversion is done

removed unused doc

mdx file

doc and conversion script

added feature_extractor to auto

test

minor changes + style + quality

doc

test

Delete process.yml

A left over from my attempt of running circleci locally

* minor changes

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* new test format

* minor changes from conversations

* minor changes from conversations

* make style + quality

* readded the tests

* test + README

* minor changes from conversations

* error in README

* make fix-copies

* removed regression for classification head

* make quality

* fixed loss control flow

* fixed loss control flow

* resolved conversations

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* READMEs

* index.mdx

* minor changes

* updated tests and models

* unused import

* outputs

* Update docs/source/model_doc/resnet.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* added embeddings_size

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* conversation

* added push to hub

* test

* embedding_size

* make fix-copies

* resolved conversations

* CI

* changed organization

* minor changes

* CI

* minor changes

* conversations

* conversation

* doc

* tests

* removed unused docstring

* conversation

* removed unused outputs

* CI

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
2022-03-14 19:57:55 +01:00
Yih-Dar
923c35b5c5
Make TF pt-tf equivalence test more aggressive (#15839)
* Make TF pt-tf equivalence test more aggressive

* Fix for TFConvNextModelTest and TFTransfoXLModelTest

* fix kwargs for outputs

* clean-up

* Add docstring for check_outputs()

* remove: need to rename encoder-decoder

* clean-up

* send PyTorch things to the correct device

* Add back the accidentally removed test case in test_pt_tf_model_equivalence()

* Fix: change to tuple before calling check_outputs()

* Fix: tfo could be a list

* use to_tuple()

* allow tfo only to be tuple or tensor

* allow tfo to be list or tuple for now + style change

* minor fix

* remove np.copy and update comments

* tfo -> tf_output, same for pt

* Add more detailed comment

* remove the incorrect comment

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-03-14 13:31:32 +01:00
Sanchit Gandhi
2de99e6c43
Fix Loading of Flax(Speech)EncoderDecoderModel kwargs from PreTrained Encoder-Decoder Checkpoints (#16056)
* Fix Loading of Flax(Speech)EncoderDecoderModel kwargs from PreTrained Encoder-Decoder Checkpoints

* change wording
2022-03-14 10:12:29 +01:00
lewtun
6e1e88fd38
Add TFCamembertForCausalLM and ONNX integration test (#16073)
* Make Camembert great again!

* Add Camembert to TensorFlow ONNX tests
2022-03-14 08:40:42 +01:00
Stas Bekman
580dd87c55
[Deepspeed] add support for bf16 mode (#14569)
* [WIP] add support for bf16 mode

* prep for bf16

* prep for bf16

* fix; zero2/bf16 is ok

* check bf16 is available

* test fixes

* enable zero3_bf16

* config files

* docs

* split stage_dtype; merge back to non-dtype-specific config file

* fix doc

* cleanup

* cleanup

* bfloat16 => bf16 to match the PR changes

* s/zero_gather_fp16_weights_on_model_save/zero_gather_16bit_weights_on_model_save/; s/save_fp16_model/save_16bit_model/

* test fixes/skipping

* move

* fix

* Update docs/source/main_classes/deepspeed.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* backticks

* cleanup

* cleanup

* cleanup

* new version

* add note about grad accum in bf16

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-03-11 17:53:53 -08:00
Kevin Bondzio
9442b3ce31
Add soft length regulation for sequence generation (#15245)
* add possibility to softly regulate length when using sampling method in model.generate() function

* fix test config, fix formatting

* fix rag integration, fix docstyling

* fix wrong docstring

* change param to tuple, add test

* fix old param in rag_model, remove unused import

* change test according to new param

* fix formatting

* fix test case

* fix doc style

* move start_length calculation to Logitprocessor

* add possibility to softly regulate length when using sampling method in model.generate() function

* fix rag integration, fix docstyling

* fix test config, fix formatting

* change param to tuple, add test

* fix old param in rag_model, remove unused import

* add possibility to softly regulate length when using sampling method in model.generate() function

* change param to tuple, add test

* fix old param in rag_model, remove unused import

* remove unused import

* fix small errors

* fix test

* add possibility to softly regulate length when using sampling method in model.generate() function

* fix test config, fix formatting

* fix rag integration, fix docstyling

* change param to tuple, add test

* fix old param in rag_model, remove unused import

* change test according to new param

* fix test case

* move start_length calculation to Logitprocessor

* add possibility to softly regulate length when using sampling method in model.generate() function

* fix rag integration, fix docstyling

* fix test config, fix formatting

* change param to tuple, add test

* fix old param in rag_model, remove unused import

* add possibility to softly regulate length when using sampling method in model.generate() function

* fix test config, fix formatting

* fix rag integration, fix docstyling

* add possibility to softly regulate length when using sampling method in model.generate() function

* fix rag integration, fix docstyling

* change param to tuple, add test

* fix old param in rag_model, remove unused import

* fix small errors

* Update src/transformers/generation_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/generation_utils.py

* Update src/transformers/generation_utils.py

* fix docstring, add type ind model rag

* fix docstrings

* introduce seq_length variable for cleaner code

* fix black formatting

* add input_ids_seq_length to modeling_rag

* add input_ids_seq_length to test

* retrigger checks

* retrigger checks

Co-authored-by: Kevin Bondzio <kev@AIM-LAP-02.local>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Kevin Bondzio <kev@AIM-LAP-02.fritz.box>
2022-03-11 19:36:44 +01:00
Yih-Dar
b6bdb943b2
Fix a TF test name (LayoutLMModelTest) (#16061)
* fix name

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-03-11 11:22:36 +01:00
lewtun
6b09328368
Fix duplicate arguments passed to dummy inputs in ONNX export (#16045)
* Fix duplicate arguments passed to dummy inputs in ONNX export

* Fix M2M100 ONNX config

* Ensure we check PreTrained model only if torch is available

* Remove TensorFlow tests for models without PyTorch parity
2022-03-10 20:19:45 +01:00
Suraj Patil
ba21001f4c
support new marian models (#15831)
* support not sharing embeddings

* update modeling

* update tokenizer

* fix conversion script

* always use self.shared

* boom boom

* begin tests

* update tests

* fix resize_decoder_token_embeddings

* address Patrick's comments

* style

* update conversion script

* fix conversion script

* fix tokenizer

* better name target vocab

* add integration test for tokenizer with two vocabs

* style

* address Patrick's comments

* add integration test for model
2022-03-10 19:41:56 +01:00
Sanchit Gandhi
1da84ae02c
Fix Bug in Flax-Speech-Encoder-Decoder Test (#16041)
* Fix Bug in Flax-Speech-Encoder-Decoder Test

* change thresholds for CPU precision
2022-03-10 12:09:29 +01:00
NielsRogge
8d83ebdf18
[Tests] Add attentions_option to ModelTesterMixin (#15909)
* Add attentions_option to common tester

* Fix tests, apply suggestion

* Apply suggestion from code review

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-03-10 12:00:30 +01:00
NielsRogge
0835119bf3
Add Document Image Transformer (DiT) (#15984)
* Add conversion script

* Improve script

* Fix bug

* Add option to push to hub

* Add support for classification models

* Update model name

* Upload feature extractor files first

* Remove hash checking

* Fix config

* Add id2label

* Add import

* Fix id2label file name

* Fix expected shape

* Add model to README

* Improve docs

* Add integration test and fix CI

* Fix code style

* Add missing init

* Add model to SPECIAL_MODULE_TO_TEST_MAP

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-03-10 11:34:44 +01:00
Sanchit Gandhi
fde901877a
Freeze Feature Encoder in FlaxSpeechEncoderDecoder (#15997)
* Freeze Feature Encoder in FlaxSpeechEncoderDecoder

* add backprop test
2022-03-10 09:59:19 +01:00
Sanchit Gandhi
b256f3518d
Add FlaxBartForCausalLM (#15995)
* add causal lm

* add CausalLM tests

* Add FlaxBartForCausalLM

* Add EncoderDecoder model tests

* change docstring

* make repo-consistency

* suggested changes

* remove jax ops

* correction

* rename pre-trained decoder model
2022-03-09 19:53:01 +01:00
lewtun
50dd314d93
Add ONNX export for ViT (#15658)
* Add ONNX support for ViT

* Refactor to use generic preprocessor

* Add vision dep to tests

* Extend ONNX slow tests to ViT

* Add dummy image generator

* Use model_type to determine modality

* Add deprecation warnings for tokenizer argument

* Add warning when overwriting the preprocessor

* Add optional args to docstrings

* Add minimum PyTorch version to OnnxConfig

* Refactor OnnxConfig class variables from CONSTANT_NAME to snake_case

* Add reasonable value for default atol

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-03-09 17:36:59 +01:00
Yih-Dar
b7fa1e3dee
Use tiny models for get_pretrained_model in TFEncoderDecoderModelTest (#15989)
* Use tiny model for TFRembertEncoderDecoderModelTest.get_pretrained_model()

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-03-09 17:16:25 +01:00
Francesco Saverio Zuppichini
1e8f37992f
done (#16012) 2022-03-09 15:51:56 +01:00
Yih-Dar
3ea046995e
Removed an outdated check about hdf5_version (#16011)
* removed an outdated check about hdf5_version

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-03-09 14:21:23 +01:00
Nicolas Patry
f4e4ad34cc
Add ForInstanceSegmentation models to image-segmentation pipelines (#15937)
* Adding ForInstanceSegmentation to pipelines.

* Last fix `category_id` renamed to `label_id`.

* Can't be none no more.

* No `is_thing_map` anymore.
2022-03-09 10:19:05 +01:00
David Hall
5b7dcc7342
Seed _get_train_sampler's generator with arg seed to improve reproducibility (#15961)
* Seed get_train_sampler's generator with arg seed to improve reproducibility

and make the world_size<=1 code path more similar to the others

* move test file into trainer test explicitly

* dumb typo

* make style lint happy

* per discussion, switch to data_seed

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-03-08 13:45:41 -05:00
Joao Gante
70203b5937
TF generate refactor - past without encoder outputs (#15944)
* Remove packed past from generation_tf_utils

* update models with the new past format

* update template accordingly
2022-03-08 14:46:44 +00:00
Yih-Dar
72983303c5
Fix TFEncoderDecoderModelTest - Pytorch device (#15979)
* fix device

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-03-08 13:37:20 +01:00
NielsRogge
b19f3e69a0
[Tests] Fix ViTMAE integration test (#15949)
* Fix test across both cpu and gpu

* Fix typo
2022-03-08 10:49:44 +01:00
NielsRogge
9879a1d5f0
Fix LayoutLMv2 test (#15939)
* Fix LayoutLMv2 test

* Update black
2022-03-08 10:49:30 +01:00
Yih-Dar
8b9ae45549
Set scale_embedding to False in some TF tests (#15952)
* set scale_embedding to False to avoid large (> 1e-5) output differences between PT/TF

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-03-07 22:14:33 +01:00
Sanchit Gandhi
1a62b25caf
Backprop Test for Freeze FlaxWav2Vec2 Feature Encoder (#15938)
* Backprop Test for Freeze FlaxWav2Vec2 Feature Encoder

* remove jnp.ndarray type suggestion

* assert frozen grads are precisely zero
2022-03-07 18:10:15 +01:00
Chan Woo Kim
ef9c3ca348
[Bug Fix] Beam search example in docs fails & a fix (integrating max_length in BeamScorer.finalize()) (#15555)
* added the test and fix

* had left out a comment
2022-03-07 09:10:18 +01:00
Francesco Saverio Zuppichini
9932ee4b4b
made MaskFormerModelTest faster (#15942) 2022-03-04 19:11:48 +01:00
Chan Woo Kim
5c6f57ee75
Constrained Beam Search [*With* Disjunctive Decoding] (#15761)
* added classes to get started with constrained beam search

* in progress, think i can directly force tokens now but not yet with the round robin

* think now i have total control, now need to code the bank selection

* technically works as desired, need to optimize and fix design choices leading to undersirable outputs

* complete PR #1 without disjunctive decoding

* removed incorrect tests

* Delete k.txt

* Delete test.py

* Delete test.sh

* revert changes to test scripts

* genutils

* full implementation with testing, no disjunctive yet

* shifted docs

* passing all tests realistically ran locally

* removing accidentally included print statements

* fixed source of error in initial PR test

* fixing the get_device() vs device trap

* fixed documentation docstrings about constrained_beam_search

* fixed tests having failing for Speech2TextModel's floating point inputs

* fix cuda long tensor

* added examples and testing for them and founx & fixed a bug in beam_search and constrained_beam_search

* deleted accidentally added test halting code with assert False

* code reformat

* Update tests/test_generation_utils.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/test_generation_utils.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/test_generation_utils.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/test_generation_utils.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/test_generation_utils.py

* fixing based on comments on PR

* took out the testing code that should but work fails without the beam search moditification ; style changes

* fixing comments issues

* docstrings for ConstraintListState

* typo in PhrsalConstraint docstring

* docstrings improvements

* finished adding what is sort of an opinionated implementation of disjunctive generation, but it revealed errors in inner beam search logic during testing.

* fixed bug found in constrained beam search that used beam_idx that were not global across all the batches

* disjunctive constraint working 100% correctly

* passing all tests

* Accidentally included mlruns

* Update src/transformers/generation_beam_constraints.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/generation_beam_constraints.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* complete overhaul of type complexities and other nits

* strict type checks in generate()

* fixing second round of feedback by narsil

* fixed failing generation test because of type check overhaul

* generation test fail fix

* fixing test fails

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-03-04 18:18:34 +01:00
Francesco Saverio Zuppichini
040c11f6da
Tests for MaskFormerFeatureExtractor's post_process*** methods (#15929)
* proper tests for post_process*** methods in feature extractor

* mask th == 0

* Update tests/maskformer/test_feature_extraction_maskformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* make style

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-03-04 18:04:19 +01:00