Commit Graph

1163 Commits

Author SHA1 Message Date
Patrick von Platen
c1af180dfe
Add Slack notification support for doc tests (#16253)
* up

* up

* up

* fix

* yeh

* ups

* Empty test commit

* correct quicktour

* correct

* correct

* up

* up

* uP

* uP

* up

* up

* uP

* up

* up

* up

* up

* up

* up

* up

* up

* up

* up

* Update src/transformers/models/van/modeling_van.py

* finish

* apply suggestions

* remove folder

* revert to daily testing
2022-03-21 11:33:18 +01:00
Steven Liu
ffc319e7b8
Fix links in guides (#16182)
* 🖍 fix links in guides

* 🖍 apply feedback
2022-03-18 16:16:16 -05:00
Stas Bekman
47cccb5318
[Deepspeed] non-HF Trainer doc update (#16238) 2022-03-17 13:33:55 -07:00
Patrick von Platen
8a96b0f10a
[Generate Docs] Correct docs (#16133)
* [Generate Docs] Correct docs

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2022-03-17 20:05:28 +01:00
Francesco Saverio Zuppichini
667b823b89
Swin support for any input size (#15986)
* padding done

* correctly return one attention per layer

* almost correct, attentions are not flatten one tuple per stage

* tests green

* doc

* conversations

* reshaping hidden_states

* view in the test

* reshape_hidden_states in Encoder and Model

* new outputs with reshaped_hidden_states

* conversations

* doc

* Update docs/source/model_doc/swin.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* conversations

* fix tests

* minor changes

* resolved conversations

* attentions one per stage

* typo

* typos

* typos

* function signature

* CI

* clean up tests

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
2022-03-16 18:38:25 +01:00
Sylvain Gugger
4f4e5ddbcb
Framework split (#16030)
* First files

* More files

* Last files

* Style
2022-03-15 10:13:34 -04:00
Markus Sagen
bcaf566038
[Fix doc example] Fix first example for the custom_datasets tutorial (#16087)
* Fix inconsistent example variable naming

- Example code for a sequence classification in Tensorflow had spelling mistakes and incorrect and inconsistent naming
- Changed variable naming to be consistent with the two other TF examples

* Fix incorrect incorrect training examples
2022-03-15 08:17:51 -04:00
Francesco Saverio Zuppichini
0a057201a9
Visual Attention Network (VAN) (#16027)
* encoder works

* addded files

* norm in stage

* convertion script

* tests

* fix copies

* make fix-copies

* fixed __init__

* make fix-copies

* fix

* shapiro test needed

* make fix-copie

* minor changes

* make style + quality

* minor refactor conversion script

* rebase + tests

* removed unused variables

* updated doc

* toctree

* CI

* doc

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* resolved conversations

* make fixup

* config passed to modules

* config passed to modules

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* conversations

* conversations

* copyrights

* normal test

* tests

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
2022-03-15 08:47:12 +01:00
Francesco Saverio Zuppichini
e3008c679f
[WIP] Resnet (#15770)
* first commit

* ResNet model correctly implemented.

basic modeling + weights conversion is done

removed unused doc

mdx file

doc and conversion script

added feature_extractor to auto

test

minor changes + style + quality

doc

test

Delete process.yml

A left over from my attempt of running circleci locally

* minor changes

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* new test format

* minor changes from conversations

* minor changes from conversations

* make style + quality

* readded the tests

* test + README

* minor changes from conversations

* error in README

* make fix-copies

* removed regression for classification head

* make quality

* fixed loss control flow

* fixed loss control flow

* resolved conversations

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* READMEs

* index.mdx

* minor changes

* updated tests and models

* unused import

* outputs

* Update docs/source/model_doc/resnet.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* added embeddings_size

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* conversation

* added push to hub

* test

* embedding_size

* make fix-copies

* resolved conversations

* CI

* changed organization

* minor changes

* CI

* minor changes

* conversations

* conversation

* doc

* tests

* removed unused docstring

* conversation

* removed unused outputs

* CI

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
2022-03-14 19:57:55 +01:00
Omar Sanseviero
802984ad42
Fix and document Zero Shot Image Classification (#16079) 2022-03-14 08:50:36 +01:00
lewtun
6e1e88fd38
Add TFCamembertForCausalLM and ONNX integration test (#16073)
* Make Camembert great again!

* Add Camembert to TensorFlow ONNX tests
2022-03-14 08:40:42 +01:00
Stas Bekman
580dd87c55
[Deepspeed] add support for bf16 mode (#14569)
* [WIP] add support for bf16 mode

* prep for bf16

* prep for bf16

* fix; zero2/bf16 is ok

* check bf16 is available

* test fixes

* enable zero3_bf16

* config files

* docs

* split stage_dtype; merge back to non-dtype-specific config file

* fix doc

* cleanup

* cleanup

* bfloat16 => bf16 to match the PR changes

* s/zero_gather_fp16_weights_on_model_save/zero_gather_16bit_weights_on_model_save/; s/save_fp16_model/save_16bit_model/

* test fixes/skipping

* move

* fix

* Update docs/source/main_classes/deepspeed.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* backticks

* cleanup

* cleanup

* cleanup

* new version

* add note about grad accum in bf16

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-03-11 17:53:53 -08:00
Steven Liu
ae2dd42be5
Audio/vision task guides (#15808)
* 📝 first draft of audio/vision guides

*  make fixup

* 🖍 fix typo

* 🖍 close parentheses

* 🖍 apply feedback

* 🖍 apply feedback, make fixup

* 🖍 more fixup for perceiver

* 🖍 apply feedback

*  make fixup

* 🖍 fix data collator
2022-03-11 16:43:49 -06:00
Steven Liu
5b4c97d09d
Update troubleshoot guide (#16001)
* 📝 first draft

* 🖍 apply feedback

* 🖍 apply feedback
2022-03-11 13:05:44 -06:00
Sylvain Gugger
f7708e1bed Force default brnahc name via the config 2022-03-11 10:09:15 -05:00
David S. Batista
96ac7549cb
updating fine-tune classifier documentation (#16063) 2022-03-10 16:21:56 -05:00
Patrick von Platen
6ce11c2c0f
[Docs] Improve PyTorch, Flax generate API (#15988)
* Move generate docs

* up

* Update docs/source/_toctree.yml

* correct

* correct some stuff

* correct tests

* more fixes

* finish generate

* add to doc stest

* finish

* finalize

* add warning to generate method
2022-03-10 11:54:45 +01:00
NielsRogge
0835119bf3
Add Document Image Transformer (DiT) (#15984)
* Add conversion script

* Improve script

* Fix bug

* Add option to push to hub

* Add support for classification models

* Update model name

* Upload feature extractor files first

* Remove hash checking

* Fix config

* Add id2label

* Add import

* Fix id2label file name

* Fix expected shape

* Add model to README

* Improve docs

* Add integration test and fix CI

* Fix code style

* Add missing init

* Add model to SPECIAL_MODULE_TO_TEST_MAP

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-03-10 11:34:44 +01:00
Sanchit Gandhi
b256f3518d
Add FlaxBartForCausalLM (#15995)
* add causal lm

* add CausalLM tests

* Add FlaxBartForCausalLM

* Add EncoderDecoder model tests

* change docstring

* make repo-consistency

* suggested changes

* remove jax ops

* correction

* rename pre-trained decoder model
2022-03-09 19:53:01 +01:00
lewtun
50dd314d93
Add ONNX export for ViT (#15658)
* Add ONNX support for ViT

* Refactor to use generic preprocessor

* Add vision dep to tests

* Extend ONNX slow tests to ViT

* Add dummy image generator

* Use model_type to determine modality

* Add deprecation warnings for tokenizer argument

* Add warning when overwriting the preprocessor

* Add optional args to docstrings

* Add minimum PyTorch version to OnnxConfig

* Refactor OnnxConfig class variables from CONSTANT_NAME to snake_case

* Add reasonable value for default atol

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-03-09 17:36:59 +01:00
Patrick von Platen
c1aaa43935
[Doctests] Move doctests to new GPU & Fix bugs (#15969)
* test

* up

* up

* Empty test commit

* up

* update tests

* up

* fix some vision models

* correct

* correct docs

* Trigger notification

* finalize

* check

* correct quicktour

* Apply suggestions from code review

* improve doctests

* Trigger Build

* next try

* next try

* and again

* Output current clone information

* Output current clone information

* Correct path

* add tf round again

* revert to daily job

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2022-03-09 13:09:56 +01:00
Steven Liu
38cc35069c
Update training scripts docs (#15931)
* 📝 first draft

* 🖍 apply feedback

* 🖍 remove examples from toctree

* 🗑 remove examples from docs/source
2022-03-07 13:29:14 -06:00
Chan Woo Kim
5c6f57ee75
Constrained Beam Search [*With* Disjunctive Decoding] (#15761)
* added classes to get started with constrained beam search

* in progress, think i can directly force tokens now but not yet with the round robin

* think now i have total control, now need to code the bank selection

* technically works as desired, need to optimize and fix design choices leading to undersirable outputs

* complete PR #1 without disjunctive decoding

* removed incorrect tests

* Delete k.txt

* Delete test.py

* Delete test.sh

* revert changes to test scripts

* genutils

* full implementation with testing, no disjunctive yet

* shifted docs

* passing all tests realistically ran locally

* removing accidentally included print statements

* fixed source of error in initial PR test

* fixing the get_device() vs device trap

* fixed documentation docstrings about constrained_beam_search

* fixed tests having failing for Speech2TextModel's floating point inputs

* fix cuda long tensor

* added examples and testing for them and founx & fixed a bug in beam_search and constrained_beam_search

* deleted accidentally added test halting code with assert False

* code reformat

* Update tests/test_generation_utils.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/test_generation_utils.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/test_generation_utils.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/test_generation_utils.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/test_generation_utils.py

* fixing based on comments on PR

* took out the testing code that should but work fails without the beam search moditification ; style changes

* fixing comments issues

* docstrings for ConstraintListState

* typo in PhrsalConstraint docstring

* docstrings improvements

* finished adding what is sort of an opinionated implementation of disjunctive generation, but it revealed errors in inner beam search logic during testing.

* fixed bug found in constrained beam search that used beam_idx that were not global across all the batches

* disjunctive constraint working 100% correctly

* passing all tests

* Accidentally included mlruns

* Update src/transformers/generation_beam_constraints.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/generation_beam_constraints.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* complete overhaul of type complexities and other nits

* strict type checks in generate()

* fixing second round of feedback by narsil

* fixed failing generation test because of type check overhaul

* generation test fail fix

* fixing test fails

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-03-04 18:18:34 +01:00
Javier de la Rosa
01485ceec3
Add missing support for Flax XLM-RoBERTa (#15900)
* Adding Flax XLM-RoBERTa

* Add Flax to __init__

* Adding doc and dummy objects

* Add tests

* Add Flax XLM-R models autodoc

* Fix tests

* Add Flask XLM-RoBERTa to TEST_FILES_WITH_NO_COMMON_TESTS

* Update src/transformers/models/xlm_roberta/modeling_flax_xlm_roberta.py

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Update tests/xlm_roberta/test_modeling_flax_xlm_roberta.py

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Update tests/xlm_roberta/test_modeling_flax_xlm_roberta.py

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Remove test on large Flask XLM-RoBERTa

* Add tokenizer to the test

Co-authored-by: Suraj Patil <surajp815@gmail.com>
2022-03-04 14:36:28 +01:00
Nicolas Patry
89c7d9cfba
Making MaskFormerForInstanceSegmentation. (#15934)
Small adjustments.

Adding in type hint.

Last fix ?

Only include the default dict thing, not the pipelines.
2022-03-04 13:56:15 +01:00
Li-Huai (Allan) Lin
7b3bd1f21a
Fix and improve REALM fine-tuning (#15297)
* Draft

* Add test

* Update src/transformers/models/realm/modeling_realm.py

* Apply suggestion

* Add block_mask

* Update

* Update

* Add block_embedding_to

* Remove no_grad

* Use AutoTokenizer

* Remove model.to overridding
2022-03-03 14:10:15 +01:00
Patrick von Platen
439de3f7f9
[Fix link in pipeline doc] (#15906) 2022-03-03 07:43:13 -05:00
Joao Gante
baab5e7cdf
TF generate refactor - Sample (#15793)
* Add TF logits wrappers 

* Add sample method

* add tests for TF logit wrappers

* TF generate sample tests now run on CPU

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2022-03-02 16:13:54 +00:00
Francesco Saverio Zuppichini
d83d22f578
Maskformer (#15682)
* maskformer

* conflicts

* conflicts

* minor fixes

* feature extractor test fix

refactor MaskFormerLoss following conversation

MaskFormer related types should not trigger a module time import error

missed one

removed all the types that are not used

update config mapping

minor updates in the doc

resolved conversation that doesn't need a discussion

minor changes

resolved conversations

fixed DetrDecoder

* minor changes

minor changes

fixed mdx file

test feature_extractor return types

functional losses -> classes

removed the return type test for the feature extractor

minor changes + style + quality

* conflicts?

* rebase master

* readme

* added missing files

* deleded poolformers test that where in the wrong palce

* CI

* minor changes

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* resolved conversations

* minor changes

* conversations

[Unispeech] Fix slow tests (#15818)

* remove soundfile old way of loading audio

* Adapt slow test

[Barthez Tokenizer] Fix saving (#15815)

[TFXLNet] Correct tf xlnet generate (#15822)

* [TFXLNet] Correct tf xlnet

* adapt test comment

Fix the push run (#15807)

Fix semantic segmentation pipeline test (#15826)

Fix dummy_inputs() to dummy_inputs in symbolic_trace doc (#15776)

Add model specific output classes to PoolFormer model docs (#15746)

* Added model specific output classes to poolformer docs

* Fixed Segformer typo in Poolformer docs

Adding the option to return_timestamps on pure CTC ASR models. (#15792)

* Adding the option to return_timestamps on pure CTC ASR models.

* Remove `math.prod` which was introduced in Python 3.8

* int are not floats.

* Reworking the PR to support "char" vs "word" output.

* Fixup!

* Update src/transformers/pipelines/automatic_speech_recognition.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/pipelines/automatic_speech_recognition.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/pipelines/automatic_speech_recognition.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/pipelines/automatic_speech_recognition.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/pipelines/automatic_speech_recognition.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/pipelines/automatic_speech_recognition.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/pipelines/automatic_speech_recognition.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/pipelines/automatic_speech_recognition.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/pipelines/automatic_speech_recognition.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Quality.

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

HFTracer.trace should use/return self.graph to be compatible with torch.fx.Tracer (#15824)

Fix tf.concatenate + test past_key_values for TF models (#15774)

* fix wrong method name tf.concatenate

* add tests related to causal LM / decoder

* make style and quality

* clean-up

* Fix TFBertModel's extended_attention_mask when past_key_values is provided

* Fix tests

* fix copies

* More tf.int8 -> tf.int32 in TF test template

* clean-up

* Update TF test template

* revert the previous commit + update the TF test template

* Fix TF template extended_attention_mask when past_key_values is provided

* Fix some styles manually

* clean-up

* Fix ValueError: too many values to unpack in the test

* Fix more: too many values to unpack in the test

* Add a comment for extended_attention_mask when there is past_key_values

* Fix TFElectra extended_attention_mask when past_key_values is provided

* Add tests to other TF models

* Fix for TF Electra test: add prepare_config_and_inputs_for_decoder

* Fix not passing training arg to lm_head in TFRobertaForCausalLM

* Fix tests (with past) for TF Roberta

* add testing for pask_key_values for TFElectra model

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

[examples/summarization and translation] fix readme (#15833)

Add ONNX Runtime quantization for text classification notebook (#15817)

Re-enable doctests for the quicktour (#15828)

* Re-enable doctests for the quicktour

* Re-enable doctests for task_summary (#15830)

* Remove &

Framework split model report (#15825)

Add TFConvNextModel (#15750)

* feat: initial implementation of convnext in tensorflow.

* fix: sample code for the classification model.

* chore: added checked for  from the classification model.

* chore: set bias initializer in the classification head.

* chore: updated license terms.

* chore: removed ununsed imports

* feat: enabled  argument during using drop_path.

* chore: replaced tf.identity with layers.Activation(linear).

* chore: edited default checkpoint.

* fix: minor bugs in the initializations.

* partial-fix: tf model errors for loading pretrained pt weights.

* partial-fix: call method updated

* partial-fix: cross loading of weights (4x3 variables to be matched)

* chore: removed unneeded comment.

* removed playground.py

* rebasing

* rebasing and removing playground.py.

* fix: renaming TFConvNextStage conv and layer norm layers

* chore: added initializers and other minor additions.

* chore: added initializers and other minor additions.

* add: tests for convnext.

* fix: integration tester class.

* fix: issues mentioned in pr feedback (round 1).

* fix: how output_hidden_states arg is propoagated inside the network.

* feat: handling of  arg for pure cnn models.

* chore: added a note on equal contribution in model docs.

* rebasing

* rebasing and removing playground.py.

* feat: encapsulation for the convnext trunk.

* Fix variable naming; Test-related corrections; Run make fixup

* chore: added Joao as a contributor to convnext.

* rebasing

* rebasing and removing playground.py.

* rebasing

* rebasing and removing playground.py.

* chore: corrected copyright year and added comment on NHWC.

* chore: fixed the black version and ran formatting.

* chore: ran make style.

* chore: removed from_pt argument from test, ran make style.

* rebasing

* rebasing and removing playground.py.

* rebasing

* rebasing and removing playground.py.

* fix: tests in the convnext subclass, ran make style.

* rebasing

* rebasing and removing playground.py.

* rebasing

* rebasing and removing playground.py.

* chore: moved convnext test to the correct location

* fix: locations for the test file of convnext.

* fix: convnext tests.

* chore: applied  sgugger's suggestion for dealing w/ output_attentions.

* chore: added comments.

* chore: applied updated quality enviornment style.

* chore: applied formatting with quality enviornment.

* chore: revert to the previous tests/test_modeling_common.py.

* chore: revert to the original test_modeling_common.py

* chore: revert to previous states for test_modeling_tf_common.py and modeling_tf_utils.py

* fix: tests for convnext.

* chore: removed output_attentions argument from convnext config.

* chore: revert to the earlier tf utils.

* fix: output shapes of the hidden states

* chore: removed unnecessary comment

* chore: reverting to the right test_modeling_tf_common.py.

* Styling nits

Co-authored-by: ariG23498 <aritra.born2fly@gmail.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>

* minor changes

* doc fix in feature extractor

* doc

* typose

* removed detr logic from config

* removed detr logic from config

* removed num_labels

* small fix in the config

* auxilary -> auxiliary

* make style

* some test is failing

* fix a weird char in config prevending doc-builder

* retry to fix the doc-builder issue

* make style

* new try to fix the doc builder

* CI

* change weights to facebook

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: ariG23498 <aritra.born2fly@gmail.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
2022-03-02 15:48:20 +01:00
Patrick von Platen
40040727ab
[Bart] Fix implementation note doc (#15879) 2022-03-02 10:24:32 +01:00
Michael Benayoun
4bfe75bd08
M2M100 support for ONNX export (#15193)
* Add M2M100 support for ONNX export

* Delete useless imports

* Add M2M100 to tests

* Fix protobuf issue
2022-03-02 10:03:14 +01:00
Steven Liu
6ccfa2170c
Inference for multilingual models (#15836)
* 📝 first draft for multilingual models

* 🖍 make style
2022-03-01 15:10:31 -06:00
NielsRogge
c008afea3c
Add link to notebooks (#15791)
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-03-01 17:44:20 +01:00
Patrick von Platen
9863f7d228
[Benchmark tools] Deprecate all (#15848)
* [Benchmark tools] Deprecate all

* up
2022-03-01 11:26:20 +01:00
Eduardo Gonzalez Ponferrada
df5a4094a6
Add Data2Vec (#15507)
* Add data2vec model cloned from roberta

* Add checkpoint conversion script

* Fix copies

* Update docs

* Add checkpoint conversion script

* Remove fairseq data2vec_text script and fix format

* Add comment on where to get data2vec_text.py

* Remove mock implementation cheat.py and fix style

* Fix copies

* Remove TF and Flax classes from init

* Add back copy from fairseq data2vec_text.py and fix style

* Update model name in docs/source/index.mdx to be CamelCase

* Revert model name in table to lower-case to get check_table test to pass

* Update src/transformers/models/data2vec/__init__.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/data2vec/convert_data2vec_original_pytorch_checkpoint_to_pytorch.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/data2vec/modeling_data2vec.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/data2vec/modeling_data2vec.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/data2vec/modeling_data2vec.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/data2vec/modeling_data2vec.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update docs/source/model_doc/data2vec.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/model_doc/data2vec.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/auto/configuration_auto.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/data2vec/configuration_data2vec.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/data2vec/modeling_data2vec.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/data2vec/modeling_data2vec.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/data2vec/modeling_data2vec.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update tests/test_modeling_data2vec.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/data2vec/configuration_data2vec.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/data2vec/modeling_data2vec.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update documentation

* Copy-paste Data2VecConfig from BertConfig

* Update config checkpoint to point to edugp/data2vec-nlp-base. Fix style and repo-consistency

* Update config special tokens to match RoBERTa

* Split multiple assertions and add individual error messages

* Rename Data2VecModel to Data2VecForTextModel

* Add Data2Vec to _toctree.yml

* Rename Data2VecEmbeddings to Data2VecForTextEmbeddings

* Add initial Data2VecForAudio model (unfinished). Only matching fairseq's implementation up to the feature encoder (before positional encoding).

* finish audio model

* finish audio file

* Update names and fix style, quality and repo consistency

* Remove Data2VecAudioForPretraining. Add tests for Data2VecAudio, mimicking the Wav2Vec2 test suite. Fix bias initilization in positional conv layers. Move back configurations for audio and text to separate files.

* add inputs to logits to data2vec'

* correct autio models

* correct config auto

* correct tok auto

* Update utils/tests_fetcher.py

* delete unnecessary files

* delete unnecessary files

* further renaming

* make all tests pass

* finish

* remove useless test file

* Update tests/test_modeling_common.py

* Update utils/check_repo.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/data2vec/modeling_data2vec_text.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Fix copies

* Update docs

* Remove fairseq data2vec_text script and fix format

* Add comment on where to get data2vec_text.py

* Remove mock implementation cheat.py and fix style

* Fix copies

* Remove TF and Flax classes from init

* Add back copy from fairseq data2vec_text.py and fix style

* Update model name in docs/source/index.mdx to be CamelCase

* Revert model name in table to lower-case to get check_table test to pass

* Update documentation

* Update src/transformers/models/data2vec/__init__.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/data2vec/convert_data2vec_original_pytorch_checkpoint_to_pytorch.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/data2vec/modeling_data2vec.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/data2vec/modeling_data2vec.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/data2vec/modeling_data2vec.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/data2vec/modeling_data2vec.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/auto/configuration_auto.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/data2vec/configuration_data2vec.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/data2vec/modeling_data2vec.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/data2vec/modeling_data2vec.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/data2vec/modeling_data2vec.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update tests/test_modeling_data2vec.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/data2vec/configuration_data2vec.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/data2vec/modeling_data2vec.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Copy-paste Data2VecConfig from BertConfig

* Update config checkpoint to point to edugp/data2vec-nlp-base. Fix style and repo-consistency

* Update config special tokens to match RoBERTa

* Split multiple assertions and add individual error messages

* Rename Data2VecModel to Data2VecForTextModel

* Add Data2Vec to _toctree.yml

* Rename Data2VecEmbeddings to Data2VecForTextEmbeddings

* Add initial Data2VecForAudio model (unfinished). Only matching fairseq's implementation up to the feature encoder (before positional encoding).

* finish audio model

* finish audio file

* add inputs to logits to data2vec'

* Update names and fix style, quality and repo consistency

* Remove Data2VecAudioForPretraining. Add tests for Data2VecAudio, mimicking the Wav2Vec2 test suite. Fix bias initilization in positional conv layers. Move back configurations for audio and text to separate files.

* correct autio models

* correct config auto

* correct tok auto

* delete unnecessary files

* delete unnecessary files

* Update utils/tests_fetcher.py

* further renaming

* make all tests pass

* finish

* remove useless test file

* Update tests/test_modeling_common.py

* Update utils/check_repo.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/data2vec/modeling_data2vec_text.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Move data2vec tests to new structure

* Fix test imports for text tests

* Remove fairseq files

* Change paper link to arxiv

* Modify Data2Vec documentation to reflect that the encoder is not shared across the audio and text models in the current implementation.

* Update text model checkpoint to be facebook/data2vec-text-base

* Add 'Copy from' statements and update paper links and docs

* fix copy from statements

* improve copied from

* correct more copied from statements

* finish copied from stuff

* make style

* add model to README

* add to master

Co-authored-by: Eduardo Gonzalez Ponferrada <eduardo@ferrumhealth.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-03-01 11:09:20 +01:00
Sanchit Gandhi
e3342edc4e
Flax Speech-Encoder-Decoder Model (#15613)
* rebase

* Delete shift tokens func

* downsample decoder input seq len for init

* correct attention mask

* add tests

* pt flax cross test

* make fixup

* init file for import

* change pt-flax cross test threshold

* pt-flax test logits only

* move tests

* make repo-consistency

* consistent indentation

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-02-28 12:22:36 +01:00
Sayak Paul
84eaa6acf5
Add TFConvNextModel (#15750)
* feat: initial implementation of convnext in tensorflow.

* fix: sample code for the classification model.

* chore: added checked for  from the classification model.

* chore: set bias initializer in the classification head.

* chore: updated license terms.

* chore: removed ununsed imports

* feat: enabled  argument during using drop_path.

* chore: replaced tf.identity with layers.Activation(linear).

* chore: edited default checkpoint.

* fix: minor bugs in the initializations.

* partial-fix: tf model errors for loading pretrained pt weights.

* partial-fix: call method updated

* partial-fix: cross loading of weights (4x3 variables to be matched)

* chore: removed unneeded comment.

* removed playground.py

* rebasing

* rebasing and removing playground.py.

* fix: renaming TFConvNextStage conv and layer norm layers

* chore: added initializers and other minor additions.

* chore: added initializers and other minor additions.

* add: tests for convnext.

* fix: integration tester class.

* fix: issues mentioned in pr feedback (round 1).

* fix: how output_hidden_states arg is propoagated inside the network.

* feat: handling of  arg for pure cnn models.

* chore: added a note on equal contribution in model docs.

* rebasing

* rebasing and removing playground.py.

* feat: encapsulation for the convnext trunk.

* Fix variable naming; Test-related corrections; Run make fixup

* chore: added Joao as a contributor to convnext.

* rebasing

* rebasing and removing playground.py.

* rebasing

* rebasing and removing playground.py.

* chore: corrected copyright year and added comment on NHWC.

* chore: fixed the black version and ran formatting.

* chore: ran make style.

* chore: removed from_pt argument from test, ran make style.

* rebasing

* rebasing and removing playground.py.

* rebasing

* rebasing and removing playground.py.

* fix: tests in the convnext subclass, ran make style.

* rebasing

* rebasing and removing playground.py.

* rebasing

* rebasing and removing playground.py.

* chore: moved convnext test to the correct location

* fix: locations for the test file of convnext.

* fix: convnext tests.

* chore: applied  sgugger's suggestion for dealing w/ output_attentions.

* chore: added comments.

* chore: applied updated quality enviornment style.

* chore: applied formatting with quality enviornment.

* chore: revert to the previous tests/test_modeling_common.py.

* chore: revert to the original test_modeling_common.py

* chore: revert to previous states for test_modeling_tf_common.py and modeling_tf_utils.py

* fix: tests for convnext.

* chore: removed output_attentions argument from convnext config.

* chore: revert to the earlier tf utils.

* fix: output shapes of the hidden states

* chore: removed unnecessary comment

* chore: reverting to the right test_modeling_tf_common.py.

* Styling nits

Co-authored-by: ariG23498 <aritra.born2fly@gmail.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
2022-02-25 18:19:16 +01:00
Sylvain Gugger
0118c4f6a8
Re-enable doctests for the quicktour (#15828)
* Re-enable doctests for the quicktour

* Re-enable doctests for task_summary (#15830)

* Remove &
2022-02-25 17:46:38 +01:00
Tanay Mehta
7566734d6f
Add model specific output classes to PoolFormer model docs (#15746)
* Added model specific output classes to poolformer docs

* Fixed Segformer typo in Poolformer docs
2022-02-25 13:43:56 +01:00
Steven Liu
fecb08c2b8
🧼 NLP task guides (#15731)
* clean commit of changes to NLP tasks

* 🖍 apply feedback

* 📝 move tf data collator in multiple choice

Co-authored-by: Steven <stevhliu@gmail.com>
2022-02-23 13:58:33 -06:00
Julien Chaumond
32f5de10a0
[doc] custom_models: mention security features of the Hub (#15768)
* custom_models: tiny doc addition

* mention security feature earlier in the section

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2022-02-23 11:40:06 -05:00
Nicolas Patry
f9582c205a
Adding ZeroShotImageClassificationPipeline (#12119)
* [Proposal] Adding ZeroShotImageClassificationPipeline

- Based on CLIP

* WIP, Resurection in progress.

* Resurrection... achieved.

* Reword handling different `padding_value` for `feature_extractor` and
`tokenizer`.

* Thanks doc-builder !

* Adding docs + global namespace `ZeroShotImageClassificationPipeline`.

* Fixing templates.

* Make the test pass and be robust to floating error.

* Adressing suraj's comments on docs mostly.

* Tf support start.

* TF support.

* Update src/transformers/pipelines/zero_shot_image_classification.py

Co-authored-by: Suraj Patil <surajp815@gmail.com>

Co-authored-by: Suraj Patil <surajp815@gmail.com>
2022-02-23 09:41:42 +01:00
Patrick von Platen
c44d3675c2
Time stamps for CTC models (#15687)
* [Wav2Vec2 Time Stamps]

* Add first version

* add word time stamps

* Fix

* save intermediate space

* improve

* [Finish CTC Tokenizer]

* remove @

* remove @

* push

* continue with phonemes

* up

* finish PR

* up

* add example

* rename

* finish

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* correct split

* finalize

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-02-22 19:26:44 +01:00
Francesco Saverio Zuppichini
38bed912e3
added link to our writing-doc document (#15756) 2022-02-22 09:57:28 +01:00
Joao Gante
3956b133b6
TF text classification examples (#15704)
* Working example with to_tf_dataset

* updated text_classification

* more comments
2022-02-21 17:17:59 +00:00
Gunjan Chhablani
2c2a31ffbc
Add missing PLBart entry in README (#15721)
* Add missing PLBart entry in index

* Fix README

* Fix README

* Fix style

* Change to master model doc
2022-02-18 21:11:42 +01:00
Gunjan Chhablani
ae1f835028
Add PLBart (#13269)
* Init PLBART

* Add missing configuration file

* Add conversion script and configurationf ile

* Fix style

* Update modeling and conversion scripts

* Fix scale embedding in config

* Add comment

* Fix conversion script

* Add classification option to conversion script

* Fix vocab size in config doc

* Add tokenizer files from MBart50

* Allow no lang code in regular tokenizer

* Add PLBart Tokenizer Converters

* Remove mask from multi tokenizer

* Remove mask from multi tokenizer

* Change from MBart-50 to MBart tokenizer

* Fix names and modify src/tgt behavior

* Fix imports for tokenizer

* Remove <mask> from multi tokenizer

* Fix style

* Change tokenizer_class to processor_class

* Add attribute map to config class

* Update modeling file to modified MBart code

* Update configuration file to MBart style configuration

* Fix tokenizer

* Separate tokenizers

* Fix error in tokenization auto

* Copy MBart tests

* Replace with MBart tokenization tests

* Fix style

* Fix language code in multi tokenizer

* Fix configuration docs

* Add entry for plbart_multi in transformers init

* Add dummy objects and fix imports

* Fix modeling tests

* Add TODO in config

* Fix copyright year

* Fix modeling docs and test

* Fix some tokenization tests and style

* Add changes from review

* Fix copies

* Fix docs

* Fix docs

* Fix style

* Fix year

* Add changes from review

* Remove extra changes

* Fix base tokenizer and doc

* Fix style

* Fix modeling and slow tokenizer tests

* Remove Multi-tokenizer Converter and Tests

* Delete QA model and Multi Tokenizer dummy objects

* Fix repo consistency and code quality issues

* Fix example documentation

* Fix style

* Remove PLBartTokenizer from type checking in init

* Fix consistency issue

* Add changes from review

* Fix style

* Remove PLBartTokenizerFast

* Remove FastTokenizer converter

* Fix AutoTokenzier mapping

* Add plbart to toctree and fix consistency issues

* Add language codes tokenizer test

* Fix styling and doc issues

* Add fixes for failing tests

* Fix copies

* Fix failing modeling test

* Change assert to assertTrue in modeling tests
2022-02-18 14:17:09 +01:00
Francesco Saverio Zuppichini
240cc6cbdc
Adding a model, more doc for pushing to the hub (#15690)
* doc for adding a model to the hub

* run make style

* resolved conversation

* removed a line

* removed )

* Update docs/source/add_new_model.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/add_new_model.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* make style

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-02-18 09:11:18 +01:00
NielsRogge
57882177be
Add SimMIM (#15586)
* Add first draft

* Make model importable

* Make SwinForMaskedImageModeling importable

* Fix imports

* Add missing inits

* Add support for Swin

* Fix bug

* Fix bug

* Fix another bug

* Fix Swin MIM implementation

* Fix default encoder stride

* Fix Swin

* Add print statements for debugging

* Add image_size data argument

* Fix Swin

* Fix image_size

* Add print statements for debugging

* Fix print statement

* Remove print statements

* Improve reshaping of bool_masked_pos

* Add support for DeiT, fix tests

* Improve docstrings

* Apply new black version

* Improve script

* Fix bug

* Improve README

* Apply suggestions from code review

* Remove DS_Store and add to gitignore

* Apply suggestions from code review + fix BEiT Flax

* Revert BEiT changes

* Improve README

* Fix code quality

* Improve README

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-02-17 19:44:55 +01:00
Yih-Dar
92a537d938
Minor fix on README.md (#15688)
* fix README

* fix more arxiv links

* make fix-copies

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-02-17 08:38:32 -05:00