* Remove unused attributes
* Add link to blog and add clarification about input size
* Improve readability of the code
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
* Update training.mdx
Fixed Error Raised Due to Wrongly Accessing Training Sample
* Ran make style
* Revert to Old Commit
* Apply suggestions from code review
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* Draft a guide with our code quirks for new models
* Apply suggestions from code review
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
* up
* up
* up
* fix
* yeh
* ups
* Empty test commit
* correct quicktour
* correct
* correct
* up
* up
* uP
* uP
* up
* up
* uP
* up
* up
* up
* up
* up
* up
* up
* up
* up
* up
* Update src/transformers/models/van/modeling_van.py
* finish
* apply suggestions
* remove folder
* revert to daily testing
* [Generate Docs] Correct docs
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* padding done
* correctly return one attention per layer
* almost correct, attentions are not flattened: one tuple per stage
* tests green
* doc
* conversations
* reshaping hidden_states
* view in the test
* reshape_hidden_states in Encoder and Model
* new outputs with reshaped_hidden_states
* conversations
* doc
* Update docs/source/model_doc/swin.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* conversations
* fix tests
* minor changes
* resolved conversations
* attentions one per stage
* typo
* typos
* typos
* function signature
* CI
* clean up tests
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Fix inconsistent example variable naming
- Example code for sequence classification in TensorFlow had spelling mistakes and incorrect, inconsistent naming
- Changed variable naming to be consistent with the two other TF examples
* Fix incorrect training examples
* first commit
* ResNet model correctly implemented.
basic modeling + weights conversion is done
removed unused doc
mdx file
doc and conversion script
added feature_extractor to auto
test
minor changes + style + quality
doc
test
Delete process.yml
A leftover from my attempt at running CircleCI locally
* minor changes
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* new test format
* minor changes from conversations
* minor changes from conversations
* make style + quality
* readded the tests
* test + README
* minor changes from conversations
* error in README
* make fix-copies
* removed regression for classification head
* make quality
* fixed loss control flow
* fixed loss control flow
* resolved conversations
* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* READMEs
* index.mdx
* minor changes
* updated tests and models
* unused import
* outputs
* Update docs/source/model_doc/resnet.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* added embeddings_size
* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* conversation
* added push to hub
* test
* embedding_size
* make fix-copies
* resolved conversations
* CI
* changed organization
* minor changes
* CI
* minor changes
* conversations
* conversation
* doc
* tests
* removed unused docstring
* conversation
* removed unused outputs
* CI
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Add ONNX support for ViT
* Refactor to use generic preprocessor
* Add vision dep to tests
* Extend ONNX slow tests to ViT
* Add dummy image generator
* Use model_type to determine modality
* Add deprecation warnings for tokenizer argument
* Add warning when overwriting the preprocessor
* Add optional args to docstrings
* Add minimum PyTorch version to OnnxConfig
* Refactor OnnxConfig class variables from CONSTANT_NAME to snake_case
* Add reasonable value for default atol
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* test
* up
* up
* Empty test commit
* up
* update tests
* up
* fix some vision models
* correct
* correct docs
* Trigger notification
* finalize
* check
* correct quicktour
* Apply suggestions from code review
* improve doctests
* Trigger Build
* next try
* next try
* and again
* Output current clone information
* Output current clone information
* Correct path
* add tf round again
* revert to daily job
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
* added classes to get started with constrained beam search
* in progress, think i can directly force tokens now but not yet with the round robin
* think now i have total control, now need to code the bank selection
* technically works as desired, need to optimize and fix design choices leading to undesirable outputs
* complete PR #1 without disjunctive decoding
* removed incorrect tests
* Delete k.txt
* Delete test.py
* Delete test.sh
* revert changes to test scripts
* genutils
* full implementation with testing, no disjunctive yet
* shifted docs
* passing all tests realistically ran locally
* removing accidentally included print statements
* fixed source of error in initial PR test
* fixing the get_device() vs device trap
* fixed documentation docstrings about constrained_beam_search
* fixed tests failing for Speech2TextModel's floating point inputs
* fix cuda long tensor
* added examples and testing for them and found & fixed a bug in beam_search and constrained_beam_search
* deleted accidentally added test halting code with assert False
* code reformat
* Update tests/test_generation_utils.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update tests/test_generation_utils.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update tests/test_generation_utils.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update tests/test_generation_utils.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update tests/test_generation_utils.py
* fixing based on comments on PR
* took out the testing code that should work but fails without the beam search modification; style changes
* fixing comments issues
* docstrings for ConstraintListState
* typo in PhrasalConstraint docstring
* docstrings improvements
* finished adding what is sort of an opinionated implementation of disjunctive generation, but it revealed errors in inner beam search logic during testing.
* fixed bug found in constrained beam search that used beam_idx that were not global across all the batches
* disjunctive constraint working 100% correctly
* passing all tests
* Accidentally included mlruns
* Update src/transformers/generation_beam_constraints.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/generation_beam_constraints.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* complete overhaul of type complexities and other nits
* strict type checks in generate()
* fixing second round of feedback by narsil
* fixed failing generation test because of type check overhaul
* generation test fail fix
* fixing test fails
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Add TF logits wrappers
* Add sample method
* add tests for TF logit wrappers
* TF generate sample tests now run on CPU
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
* maskformer
* conflicts
* conflicts
* minor fixes
* feature extractor test fix
refactor MaskFormerLoss following conversation
MaskFormer related types should not trigger a module time import error
missed one
removed all the types that are not used
update config mapping
minor updates in the doc
resolved conversation that doesn't need a discussion
minor changes
resolved conversations
fixed DetrDecoder
* minor changes
minor changes
fixed mdx file
test feature_extractor return types
functional losses -> classes
removed the return type test for the feature extractor
minor changes + style + quality
* conflicts?
* rebase master
* readme
* added missing files
* deleted poolformer tests that were in the wrong place
* CI
* minor changes
* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* resolved conversations
* minor changes
* conversations
[Unispeech] Fix slow tests (#15818)
* remove soundfile old way of loading audio
* Adapt slow test
[Barthez Tokenizer] Fix saving (#15815)
[TFXLNet] Correct tf xlnet generate (#15822)
* [TFXLNet] Correct tf xlnet
* adapt test comment
Fix the push run (#15807)
Fix semantic segmentation pipeline test (#15826)
Fix dummy_inputs() to dummy_inputs in symbolic_trace doc (#15776)
Add model specific output classes to PoolFormer model docs (#15746)
* Added model specific output classes to poolformer docs
* Fixed Segformer typo in Poolformer docs
Adding the option to return_timestamps on pure CTC ASR models. (#15792)
* Adding the option to return_timestamps on pure CTC ASR models.
* Remove `math.prod` which was introduced in Python 3.8
* int are not floats.
* Reworking the PR to support "char" vs "word" output.
* Fixup!
* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Quality.
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
HFTracer.trace should use/return self.graph to be compatible with torch.fx.Tracer (#15824)
Fix tf.concatenate + test past_key_values for TF models (#15774)
* fix wrong method name tf.concatenate
* add tests related to causal LM / decoder
* make style and quality
* clean-up
* Fix TFBertModel's extended_attention_mask when past_key_values is provided
* Fix tests
* fix copies
* More tf.int8 -> tf.int32 in TF test template
* clean-up
* Update TF test template
* revert the previous commit + update the TF test template
* Fix TF template extended_attention_mask when past_key_values is provided
* Fix some styles manually
* clean-up
* Fix ValueError: too many values to unpack in the test
* Fix more: too many values to unpack in the test
* Add a comment for extended_attention_mask when there is past_key_values
* Fix TFElectra extended_attention_mask when past_key_values is provided
* Add tests to other TF models
* Fix for TF Electra test: add prepare_config_and_inputs_for_decoder
* Fix not passing training arg to lm_head in TFRobertaForCausalLM
* Fix tests (with past) for TF Roberta
* add testing for past_key_values for TFElectra model
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
[examples/summarization and translation] fix readme (#15833)
Add ONNX Runtime quantization for text classification notebook (#15817)
Re-enable doctests for the quicktour (#15828)
* Re-enable doctests for the quicktour
* Re-enable doctests for task_summary (#15830)
* Remove &
Framework split model report (#15825)
Add TFConvNextModel (#15750)
* feat: initial implementation of convnext in tensorflow.
* fix: sample code for the classification model.
* chore: added checked for from the classification model.
* chore: set bias initializer in the classification head.
* chore: updated license terms.
* chore: removed unused imports
* feat: enabled argument during using drop_path.
* chore: replaced tf.identity with layers.Activation(linear).
* chore: edited default checkpoint.
* fix: minor bugs in the initializations.
* partial-fix: tf model errors for loading pretrained pt weights.
* partial-fix: call method updated
* partial-fix: cross loading of weights (4x3 variables to be matched)
* chore: removed unneeded comment.
* removed playground.py
* rebasing
* rebasing and removing playground.py.
* fix: renaming TFConvNextStage conv and layer norm layers
* chore: added initializers and other minor additions.
* chore: added initializers and other minor additions.
* add: tests for convnext.
* fix: integration tester class.
* fix: issues mentioned in pr feedback (round 1).
* fix: how output_hidden_states arg is propagated inside the network.
* feat: handling of arg for pure cnn models.
* chore: added a note on equal contribution in model docs.
* rebasing
* rebasing and removing playground.py.
* feat: encapsulation for the convnext trunk.
* Fix variable naming; Test-related corrections; Run make fixup
* chore: added Joao as a contributor to convnext.
* rebasing
* rebasing and removing playground.py.
* rebasing
* rebasing and removing playground.py.
* chore: corrected copyright year and added comment on NHWC.
* chore: fixed the black version and ran formatting.
* chore: ran make style.
* chore: removed from_pt argument from test, ran make style.
* rebasing
* rebasing and removing playground.py.
* rebasing
* rebasing and removing playground.py.
* fix: tests in the convnext subclass, ran make style.
* rebasing
* rebasing and removing playground.py.
* rebasing
* rebasing and removing playground.py.
* chore: moved convnext test to the correct location
* fix: locations for the test file of convnext.
* fix: convnext tests.
* chore: applied sgugger's suggestion for dealing w/ output_attentions.
* chore: added comments.
* chore: applied updated quality environment style.
* chore: applied formatting with quality environment.
* chore: revert to the previous tests/test_modeling_common.py.
* chore: revert to the original test_modeling_common.py
* chore: revert to previous states for test_modeling_tf_common.py and modeling_tf_utils.py
* fix: tests for convnext.
* chore: removed output_attentions argument from convnext config.
* chore: revert to the earlier tf utils.
* fix: output shapes of the hidden states
* chore: removed unnecessary comment
* chore: reverting to the right test_modeling_tf_common.py.
* Styling nits
Co-authored-by: ariG23498 <aritra.born2fly@gmail.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
* minor changes
* doc fix in feature extractor
* doc
* typos
* removed detr logic from config
* removed detr logic from config
* removed num_labels
* small fix in the config
* auxilary -> auxiliary
* make style
* some test is failing
* fix a weird char in config preventing doc-builder
* retry to fix the doc-builder issue
* make style
* new try to fix the doc builder
* CI
* change weights to facebook
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: ariG23498 <aritra.born2fly@gmail.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
* Add data2vec model cloned from roberta
* Add checkpoint conversion script
* Fix copies
* Update docs
* Add checkpoint conversion script
* Remove fairseq data2vec_text script and fix format
* Add comment on where to get data2vec_text.py
* Remove mock implementation cheat.py and fix style
* Fix copies
* Remove TF and Flax classes from init
* Add back copy from fairseq data2vec_text.py and fix style
* Update model name in docs/source/index.mdx to be CamelCase
* Revert model name in table to lower-case to get check_table test to pass
* Update src/transformers/models/data2vec/__init__.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/models/data2vec/convert_data2vec_original_pytorch_checkpoint_to_pytorch.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update docs/source/model_doc/data2vec.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update docs/source/model_doc/data2vec.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/auto/configuration_auto.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/data2vec/configuration_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update tests/test_modeling_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/data2vec/configuration_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update documentation
* Copy-paste Data2VecConfig from BertConfig
* Update config checkpoint to point to edugp/data2vec-nlp-base. Fix style and repo-consistency
* Update config special tokens to match RoBERTa
* Split multiple assertions and add individual error messages
* Rename Data2VecModel to Data2VecForTextModel
* Add Data2Vec to _toctree.yml
* Rename Data2VecEmbeddings to Data2VecForTextEmbeddings
* Add initial Data2VecForAudio model (unfinished). Only matching fairseq's implementation up to the feature encoder (before positional encoding).
* finish audio model
* finish audio file
* Update names and fix style, quality and repo consistency
* Remove Data2VecAudioForPretraining. Add tests for Data2VecAudio, mimicking the Wav2Vec2 test suite. Fix bias initialization in positional conv layers. Move back configurations for audio and text to separate files.
* add inputs to logits to data2vec
* correct audio models
* correct config auto
* correct tok auto
* Update utils/tests_fetcher.py
* delete unnecessary files
* delete unnecessary files
* further renaming
* make all tests pass
* finish
* remove useless test file
* Update tests/test_modeling_common.py
* Update utils/check_repo.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/models/data2vec/modeling_data2vec_text.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Fix copies
* Update docs
* Remove fairseq data2vec_text script and fix format
* Add comment on where to get data2vec_text.py
* Remove mock implementation cheat.py and fix style
* Fix copies
* Remove TF and Flax classes from init
* Add back copy from fairseq data2vec_text.py and fix style
* Update model name in docs/source/index.mdx to be CamelCase
* Revert model name in table to lower-case to get check_table test to pass
* Update documentation
* Update src/transformers/models/data2vec/__init__.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/models/data2vec/convert_data2vec_original_pytorch_checkpoint_to_pytorch.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/models/auto/configuration_auto.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/data2vec/configuration_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update tests/test_modeling_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/data2vec/configuration_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Copy-paste Data2VecConfig from BertConfig
* Update config checkpoint to point to edugp/data2vec-nlp-base. Fix style and repo-consistency
* Update config special tokens to match RoBERTa
* Split multiple assertions and add individual error messages
* Rename Data2VecModel to Data2VecForTextModel
* Add Data2Vec to _toctree.yml
* Rename Data2VecEmbeddings to Data2VecForTextEmbeddings
* Add initial Data2VecForAudio model (unfinished). Only matching fairseq's implementation up to the feature encoder (before positional encoding).
* finish audio model
* finish audio file
* add inputs to logits to data2vec
* Update names and fix style, quality and repo consistency
* Remove Data2VecAudioForPretraining. Add tests for Data2VecAudio, mimicking the Wav2Vec2 test suite. Fix bias initialization in positional conv layers. Move back configurations for audio and text to separate files.
* correct audio models
* correct config auto
* correct tok auto
* delete unnecessary files
* delete unnecessary files
* Update utils/tests_fetcher.py
* further renaming
* make all tests pass
* finish
* remove useless test file
* Update tests/test_modeling_common.py
* Update utils/check_repo.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/models/data2vec/modeling_data2vec_text.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Move data2vec tests to new structure
* Fix test imports for text tests
* Remove fairseq files
* Change paper link to arxiv
* Modify Data2Vec documentation to reflect that the encoder is not shared across the audio and text models in the current implementation.
* Update text model checkpoint to be facebook/data2vec-text-base
* Add 'Copy from' statements and update paper links and docs
* fix copy from statements
* improve copied from
* correct more copied from statements
* finish copied from stuff
* make style
* add model to README
* add to master
Co-authored-by: Eduardo Gonzalez Ponferrada <eduardo@ferrumhealth.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* rebase
* Delete shift tokens func
* downsample decoder input seq len for init
* correct attention mask
* add tests
* pt flax cross test
* make fixup
* init file for import
* change pt-flax cross test threshold
* pt-flax test logits only
* move tests
* make repo-consistency
* consistent indentation
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* custom_models: tiny doc addition
* mention security feature earlier in the section
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* [Proposal] Adding ZeroShotImageClassificationPipeline
- Based on CLIP
* WIP, Resurrection in progress.
* Resurrection... achieved.
* Reword handling different `padding_value` for `feature_extractor` and `tokenizer`.
* Thanks doc-builder !
* Adding docs + global namespace `ZeroShotImageClassificationPipeline`.
* Fixing templates.
* Make the test pass and be robust to floating error.
* Addressing suraj's comments on docs mostly.
* Tf support start.
* TF support.
* Update src/transformers/pipelines/zero_shot_image_classification.py
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* doc for adding a model to the hub
* run make style
* resolved conversation
* removed a line
* removed )
* Update docs/source/add_new_model.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update docs/source/add_new_model.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* make style
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Added all files, PoolFormerFeatureExtractor still failing tests
* Fixed PoolFormerFeatureExtractor not being able to import
* Completed Poolformer doc
* Applied Suggested fixes
* Fixed errors in modeling_auto.py
* Fix feature extractor, convert docs to Markdown, styling of code
* Remove PoolFormer from check_repo and fix integration test
* Remove Poolformer from check_repo
* Fixed configuration_poolformer.py docs and removed inference.py from poolformer
* Ran with black v22
* Added PoolFormer to _toctree.yml
* Updated poolformer doc
* Applied suggested fixes and added on README.md
* Did make fixup and make fix-copies, tests should pass now
* Changed PoolFormer weights conversion script name and fixed README
* Applied fixes in test_modeling_poolformer.py and modeling_poolformer.py
* Added PoolFormerFeatureExtractor to AutoFeatureExtractor API
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>
* TF generate start refactor
* Add tf tests for sample generate
* re-organize
* boom boom
* Apply suggestions from code review
* re-add
* add all code
* make random greedy pass
* make encoder-decoder random work
* further improvements
* delete bogus file
* make gpt2 and t5 tests work
* finish logits tests
* correct logits processors
* correct past / encoder_outputs drama
* refactor some methods
* another fix
* refactor shape_list
* fix more shape list
* import shape_list
* finish docs
* fix imports
* make style
* correct tf utils
* Fix TFRag as well
* Apply Lysandre's and Sylvain's suggestions
* Update tests/test_generation_tf_logits_process.py
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
* Update src/transformers/tf_utils.py
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
* remove cpu according to gante
* correct logit processor
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
* Add TensorFlow support for ONNX export
* Change documentation to mention conversion with TensorFlow
* Refactor export into export_pytorch and export_tensorflow
* Check model's type instead of framework installation to choose between TF and PyTorch
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Alberto Bégué <alberto.begue@della.ai>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
* added classes to get started with constrained beam search
* in progress, think i can directly force tokens now but not yet with the round robin
* think now i have total control, now need to code the bank selection
* technically works as desired, need to optimize and fix design choices leading to undersirable outputs
* complete PR #1 without disjunctive decoding
* removed incorrect tests
* Delete k.txt
* Delete test.py
* Delete test.sh
* revert changes to test scripts
* genutils
* full implementation with testing, no disjunctive yet
* shifted docs
* passing all tests realistically ran locally
* removing accidentally included print statements
* fixed source of error in initial PR test
* fixing the get_device() vs device trap
* fixed documentation docstrings about constrained_beam_search
* fixed tests failing for Speech2TextModel's floating point inputs
* fix cuda long tensor
* added examples and testing for them and found & fixed a bug in beam_search and constrained_beam_search
* deleted accidentally added test halting code with assert False
* code reformat
* Update tests/test_generation_utils.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update tests/test_generation_utils.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update tests/test_generation_utils.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update tests/test_generation_utils.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update tests/test_generation_utils.py
* fixing based on comments on PR
* took out the testing code that should work but fails without the beam search modification; style changes
* fixing comments issues
* docstrings for ConstraintListState
* typo in PhrasalConstraint docstring
* docstrings improvements
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* PoC for a ProcessorMixin class
* Documentation
* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Roll out to other processors
* Add base feature extractor class in init
* Use args and kwargs
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Add wrapper classes
* convert inner layers to tf
* Add TF Encoder and Decoder layers
* TFSpeech2Text models
* Loadable model
* TF model with same outputs as PT model
* test skeleton
* correct tests and run the fixup
* correct attention expansion
* TFSpeech2Text past_key_values with TF format
* electra is added to onnx supported model
* add google/electra-base-generator for test onnx module
Co-authored-by: Lewis Tunstall <lewis.c.tunstall@gmail.com>
* add xlm roberta xl
* add convert xlm xl fairseq checkpoint to pytorch
* fix init and documents for xlm-roberta-xl
* fix indentation
* add test for XLM-R xl,xxl
* fix model hub name
* fix some stuff
* up
* correct init
* fix more
* fix as suggestions
* add torch_device
* fix default values of doc strings
* fix leftovers
* merge to master
* up
* correct hub names
* fix docs
* fix model
* up
* finalize
* last fix
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* add copied from
* make style
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* clean commit of changes
* apply review feedback, make edits
* fix backticks, minor formatting
* 🖍 make fixup and minor edits
* 🖍 fix # in header
* 📝 update code sample without from_pt
* 📝 final review
* Added missing code in exemplary notebook - custom datasets fine-tuning
Added missing code in tokenize_and_align_labels function in the exemplary notebook on custom datasets - token classification.
The missing code concerns adding labels for all but first token in a single word.
The added code was taken directly from huggingface official example - this [colab notebook](https://github.com/huggingface/notebooks/blob/master/transformers_doc/custom_datasets.ipynb).
* Changes requested in the review - keep the code as simple as possible
* First commit
* Add conversion script
* Make conversion script work for base model
* More improvements
* Update conversion script, works for vqa
* Add indexing argument to meshgrid
* Make conversion script work for ViltForPreTraining
* Add ViltForPreTraining to docs
* Fix device issue
* Add processor
* Add MinMaxResize to feature extractor
* Implement call method of ViltProcessor
* Fix tests
* Add integration test
* Add loss calculation for VQA
* Improve tests
* Improve some more tests
* Debug tests
* Small improvements
* Add support for attention_mask
* Remove mask_it
* Add pixel_mask
* Add tests for ViltFeatureExtractor
* Improve tests
* Add ViltForNaturalLanguageVisualReasoning
* Add ViltForNaturalLanguageVisualReasoning to conversion script
* Minor fixes
* Add support for image_embeds, update docstrings to markdown
* Update docs to markdown
* Improve conversion script
* Rename ViltForPreTraining to ViltForMaskedLM
* Improve conversion script
* Convert docstrings to markdown
* Fix code example of retrieval model
* Properly convert masked language model
* Add integration test for nlvr
* Fix code quality
* Apply suggestions from code review
* Add copied from statements
* Fix pretrained_config_archive_map
* Fix docs
* Add model to README
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Apply more suggestions from code review
* Make code more readable
* Add ViltForNaturalLanguageVisualReasoning to the tests
* Rename ViltForVisualQuestionAnswering to ViltForQuestionAnswering
* Replace pixel_values_2 by single tensor
* Add hidden_states and attentions
* Fix one more test
* Fix all tests
* Update year
* Fix rebase issues
* Fix another rebase issue
* Remove ViltForPreTraining from auto mapping
* Rename ViltForImageRetrievalTextRetrieval to ViltForImageAndTextRetrieval
* Make it possible to use BertTokenizerFast in the processor
* Use BertTokenizerFast by default
* Rename ViltForNaturalLanguageVisualReasoning, define custom model output
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* First draft
* More improvements
* More improvements
* More improvements
* Fix embeddings
* Add conversion script
* Finish conversion script
* More improvements
* Fix forward pass
* Remove print statements
* Add weights initialization
* Add initialization of decoder weights
* Add support for other models in the conversion script
* Fix patch_size for huge model
* Fix most of the tests
* Fix integration test
* Fix docs
* Fix archive_list
* Apply suggestions from code review
* Improve documentation
* Apply more suggestions
* Skip some tests due to non-deterministic behaviour
* Fix test_initialization
* Remove unnecessary initialization of nn.Embedding
* Improve docs
* Fix dummies
* Remove ViTMAEFeatureExtractor from docs
* Add model to README and table of contents
* Delete inference file
* update XLMProphetNet link
* update DPR link
* change prophetnet link
* change link MBART
* change link GPT
* update gpt2 link
* ctrl update link
* update Transformer-XL link
* Update Reformer link
* update xlnet link
* bert update link
* update albert link
* roberta update link
* update distilbert link
* update convbert link
* update XLM link
* xlm roberta update link
* update Flaubert link
* update electra link
* update funnel transformer and longformer
* bart update link
* pegasus update link
* update marianmt link
* t5 update link
* mt5 update link
* Add ONNX classes to main package
* Remove permalinks from ONNX guide
* Fix ToC entry
* Revert "Add ONNX classes to main package"
This reverts commit eb794a5b00.
* Add ONNX classes to main doc
* Fix syntax highlighting in doc
* Fix text
* Add FeaturesManager to doc
* Use paths to reference ONNX classes
* Add FeaturesManager to init
* Add missing ONNX paths
* Add IBertOnnxConfig and tests
* add all the supported features for IBERT and remove outputs in IBertOnnxConfig
* use OnnxConfig
* fix codestyle
* remove serialization.rst
* codestyle
* Start the work on TFVisionEncoderDecoderModel
* Expose TFVisionEncoderDecoderModel
* fix import
* Add modeling_tf_vision_encoder_decoder to _ignore_modules in get_model_modules()
* reorder
* Apply the fix for checkpoint loading as in #14016
* remove attention_mask + fix VISION_DUMMY_INPUTS
* A minimal change to make TF generate() work for vision models as encoder in encoder-decoder setting
* fix wrong condition: shape_list(input_ids) == 2
* add tests
* use personal TFViTModel checkpoint (for now)
* Add equivalence tests + projection layer
* style
* make sure projection layer can run
* Add examples
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Clean comments (need to work on TODOs for PyTorch models)
* Remove TF -> PT in check_pt_tf_equivalence for TFVisionEncoderDecoderModel
* fixes
* Revert changes in PT code.
* Update tests/test_modeling_tf_vision_encoder_decoder.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Add test_inference_coco_en for TF test
* fix quality
* fix name
* build doc
* add main_input_name
* Fix ckpt name in test
* fix diff between master and this PR
* fix doc
* fix style and quality
* fix missing doc
* fix labels handling
* Delete auto.rst
* Add the changes done in #14016
* fix prefix
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* make style
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Add FlaxRoFormer
* Clean code + make quality
* Fix output pooling for FlaxRoFormerForMultipleChoiceModule
* Apply suggestions from code review
* add flax model to repos
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Fix bad examples
* Add black formatting to style_doc
* Use first nonempty line
* Put it at the right place
* Don't add spaces to empty lines
* Better templates
* Deal with triple quotes in docstrings
* Result of style_doc
* Enable mdx treatment and fix code examples in MDXs
* Result of doc styler on doc source files
* Last fixes
* Break copy from
* Add ElectraForCausalLM and cover some basic tests & need to fix a few tests
* Fix bugs
* make style
* make fix-copies
* Update doc
* Change docstring to markdown format
* Remove redundant update_keys_to_ignore
* Pipeline chunks.
* Batching for Chunking pipelines ?
* Batching for `question-answering` and `zero-shot-cls`.
* Fixing for FNet.
* Making ASR a chunk pipeline.
* Chunking ASR API.
* doc style.
* Fixing ASR test.
* Fixing QA error (p_mask, padding is 1, not 0).
* Enable both vad and simple chunking.
* Max length for vad.
* remove inference mode, crashing on s2t.
* Revert ChunkPipeline for ASRpipeline.
Too many knobs for simple integration within the pipeline, better stick
to external convenience functions instead, more control to be had,
simpler pipeline and also easier to replace with other things later.
* Drop necessity for PT for these.
* Enabling generators.
* Add mic + cleanup.
* Typo.
* Typo2.
* Remove ASR work, it does not belong in this PR anymore.
* Update src/transformers/pipelines/pt_utils.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/pipelines/zero_shot_classification.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Adding many comments.
* Doc quality.
* `hidden_states` handling.
* Adding doc.
* Bad rebase.
* Autofixing docs.
* Fixing CRITICAL bug in the new Zerocls pipeline.
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* First commit to add MarianMT to ONNX
* Now MarianModel.forward() automatically generates decoder_input_ids, like BartModel.forward()
* Adjusted MarianOnnxConfig.inputs and outputs to work with seq2seq-lm feature
* Style fix
* Added support for other features for already supported models
* Partial support for causal and seq2seq models
* Partial support for causal and seq2seq models
* Add default task for MarianMT ONNX
* Remove automatic creation of decoder_input_ids
* Extend inputs and outputs for MarianMT ONNX config
* Add MarianMT to ONNX unit tests
* Refactor
* OnnxSeq2SeqConfigWithPast to support seq2seq models
* Parameterized the onnx tests
* Restored run_mlm.py
* Restored run_mlm.py
* [WIP] BART update
* BART and MBART
* Add past_key_values and fix dummy decoder inputs
Using a sequence length of 1 in generate_dummy_outputs() produces large discrepancies, presumably due to some hidden optimisations.
* Refactor MarianOnnxConfig to remove custom past_key_values logic
* Fix quality
* Revert "Revert "Added support for other features for already supported models (#14358)" (#14679)"
This reverts commit 0f4e39c559.
* is_torch_available test to avoid failing imports
* sorting parameterize parameters to solve ERROR gw0 gw1
* tests fix
* tests fix
* GPT2 with past fix
* Fixed stateful class attribute change that was breaking things when converting multiple models sequentially
* Removed onnx file
* Refactor Marian export to account for base changes
* Fix copies
* Implemented suggestions
* Extend support for causal LM
* Revert "Revert "Added support for other features for already supported models (#14358)" (#14679)"
This reverts commit 0f4e39c559.
* is_torch_available test to avoid failing imports
* sorting parameterize parameters to solve ERROR gw0 gw1
* tests fix
* tests fix
* GPT2 with past fix
* Fixed stateful class attribute change that was breaking things when converting multiple models sequentially
* Removed onnx file
* Implemented suggestions
* Fixed __init__ to resolve conflict with master
* Remove commented import
* Remove ONNX model
* Remove redundant class method
* Tidy up imports
* Fix quality
* Refactor dummy input function
* Add copied from statements to Marian config functions
* Remove false copied from comments
* Fix copy from comment
Co-authored-by: Massimiliano Bruni <massimiliano.bruni@hcl.com>
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
* PoC for conserving old links
* Do the same for other links
* remap the redirects section
* add instructions on how to move sections
* improve
Co-authored-by: Stas Bekman <stas@stason.org>