* First draft
* Add YolosForObjectDetection
* Make forward pass work
* Add mid position embeddings
* Add interpolation of position encodings
* Add expected values
* Add YOLOS to tests
* Add integration test
* Support tiny model as well
* Support all models in conversion script
* Remove mid_pe_size attribute
* Make more tests pass
* Add model to README and fix config
* Add copied from statements
* Rename base_model_prefix to vit
* Add missing YOLOS_PRETRAINED_CONFIG_ARCHIVE_MAP
* Apply suggestions from code review
* Apply more suggestions from code review
* Convert remaining checkpoints
* Improve docstrings
* Add YolosFeatureExtractor
* Add feature extractor to docs
* Add corresponding tests
* Fix style
* Fix docs
* Apply suggestion from code review
* Fix bad rebase
* Fix some more bad rebase
* Fix missing character
* Improve docs and variable names
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
* Add doctest BERT
* make fixup
* fix typo
* change checkpoints
* make fixup
* define doctest output value, update doctest for mobilebert
* solve fix-copies
* update QA target start index and end index
* change checkpoint for docs and reuse defined variable
* Update src/transformers/models/bert/modeling_tf_bert.py
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
* make fixup
* Add Doctest for Albert and Bigbird
* make fixup
* overwrite examples for Albert and Bigbird
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* update longer examples for Bigbird
* using examples from squad_v2
* print out example text
* change name token-classification-big-bird checkpoint to random
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Improve CTRL doctests
* Fix `CTRLForSequenceClassification` flakiness with inconsistent losses
* Remove unused
* Fixup
* Add CTRL to documentation_tests.txt
* Fix control code not being first
* Add output assertions
* Change from sshleifer/tiny-ctrl -> ctrl
* Run `make fixup`
* apply `list` to output logits shape for clarity
* Reduce output loss precision to make assertion more robust
* Add assertion of control code being first
* Fix docstyle
* upper case sentence following control code
* Weird bug fixes
* Add a better generation example
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
* Required the values GPTJ unfortunately cannot run the model =)
* Added the file to the doc tests
* Run Fixup and Style
* Fixed with the test versions of gptj. Ran Style and Fixup.
* Trigger ci
* A Minor Change to License
* Fixed spacing added to the benchmark_utils. Then refactored tests to const variables.
* Removed strings that were included as default parameters anyways.
Co-authored-by: ArEnSc <xx.mike.chung.xx@gmail.com>
* First Pass All Tests Pass
* WIP
* Adding file to documentation tests
* Change the base model for the example in the doc test.
* Fix Code Styling by running
make fixup
* Called Style
* Reverted to gpt2 model rather than distill gpt2
Then used a token classification model over a sequence model for an example.
* Fix Styling Issue
* Hopefully ignores the formatting issue.
Co-authored-by: ArEnSc <xx.mike.chung.xx@gmail.com>
* Fixed some bugs involving saving during epochs
* Added tests mimicking the existing examples tests
* Added in json exporting to all `no_trainer` examples for consistency
* Add TapexTokenizer
* Improve docstrings and provide option to provide answer
* Remove option for pretokenized inputs
* Add TAPEX to README
* Fix copies
* Remove option for pretokenized inputs
* Initial commit: add tapex fine-tuning examples on both table-based question answering and table-based fact verification.
* - Draft a README file for running the script and introducing some background.
- Remove unused code lines in tabfact script.
- Disable the deafult `pad_to_max_length` option which is memory-consuming.
* * Support `as_target_tokenizer` function for TapexTokenizer.
* Fix the do_lower_case behaviour of TapexTokenizer.
* Add unit tests for target scenarios and cased/uncased scenarios for both source and target.
* * Replace the label BartTokenizer with TapexTokenizer's as_target_tokenizer function.
* Fix typos in tapex example README.
* * fix the evaluation script - remove the property `task_name`
* * Make the label space more clear for tabfact tasks
* * Using a new fine-tuning script for tapex-base on tabfact.
* * Remove the lowercase code outside the tokenizer - we use the tokenizer to control whether do_lower_case
* Guarantee the hyper-parameter can be run without out-of-memory on 16GB card and report the new reproduced number on wikisql
* * Remove the default tokenizer_name option.
* Provide evaluation command.
* * Support for WikiTableQuestion dataset.
* Fix a typo in README.
* * Fix the datasets's key name in WikiTableQuestions
* Run make fixup and move test to folder
* Fix quality
* Apply suggestions from code review
* Apply suggestions from code review
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* Apply suggestions from code review
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Apply some more suggestions from code review
* Improve docstrings
* Overwrite failing test
* Improve comment in example scripts
* Fix rebase
* Add TAPEX to Auto mapping
* Add TAPEX to auto config mappings
* Put TAPEX higher than BART in auto mapping
* Add TAPEX to doc tests
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>
Co-authored-by: SivilTaram <qianlxc@outlook.com>
Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
* Created the Decision Transformer Modle
* updating tests, copy to other machine
* Added last hidden size to Decision Transformer modelling outputs
* Removed copy of original DT file
* made a temporary change to gpt2 to have it conform with the Decision Transformer version
* Updated tests
* Ignoring a file used to test the DT model
* added comments to config file
* added comments and argument descriptions to decision transformer file
* Updated doc
* Ran "make style"
* Remove old model imports
* Removed unused imports, cleaned up init file
* Update docs/source/model_doc/decision_transformer.mdx
added my username
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Reverted changes made to gpt2
* Removed datasets submodule
* Update the modeling outputs to include gpt2 attentions, hidden states and last hidden states
* Added support for return of hidden states, attentions and return dict of gpt2 model.
* Updated tests to include many of the ModelTesterMixin tests.
The following tests are skipped: test_generate_without_input_ids, test_pruning, test_resize_embeddings, test_head_masking, test_attention_outputs, test_hidden_states_output, test_inputs_embeds, test_model_common_attributes
* Added missing line to the end of gpt2 file
* Added an integration test for the Decision Transformer
Test performs and autoregressive evaluation for two time steps
* Set done and info to _ to fix failing test
* Updated integration test to be deterministic and check expected outputs
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Removed unnecessary config options
* Cleaned up commented code and old comments.
* Cleaned up commented code.
* Changed DecisionTransformer to Decision Transformer
* Added Decision Transformer to the main README file
* Added copy of GTP2 called DecisionTranformerGPT2Model
* isorted imports
* isorted imports
* Added model to non-English README files
* Ran make fix-copies and corrected some cases.
* Updated index file to include Decision Transformer
* Added gpt2 model as copy inside the Decision Transformer model file
* Added the unit test file to the list of TEST_FILES_WITH_NO_COMMON_TESTS
* Deleted redundant checkpoint files (I don't know how these got committed)
* Removed testing files. (These should have never been committed)
* Removed accidentally committed files
* Moved the Decision Transformer test to its own directory
* Add type hints for Pegasus (#16324)
* Funnel type hints (#16323)
* add pt funnel type hints
* add tf funnel type hints
* Add type hints for ProphetNet PyTorch (#16272)
* [GLPN] Improve docs (#16331)
* Add link to notebook
* Add link
* Fix bug
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
* Added type hints for Pytorch Marian calls (#16200)
* Added type hinting for forward functions in pytorch marian
* typo correction
* Removed type hints on functions from BART per Suraj Patil request
* fix import pb
* fix typo
* corrected tuple call
* ran black
* after fix-copies
Some optional tags on primitives were removed, past_key_values in MarianForCausalLM changed from Tuple of Tuple to List
* Fixing copies to roformer and pegasus
Co-authored-by: Clementine Fourrier <cfourrie@inria.fr>
Co-authored-by: matt <rocketknight1@gmail.com>
* Moved DecisionTransformOutput to modeling_decision_transformer
* Moved the example usage to research project and cleaned comments
* Made tests ignore the copy of gpt2 in Decision Transformer
* Added module output to modelling decision transformer
* removed copied gpt2 model from list of transformers models
* Updated tests and created __init__ file for new test location
* Update README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/decision_transformer/configuration_decision_transformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Removed unneeded summary type from config file
* Fixed copies
* Updated pretrained config map to refer to hopper-medium checkpoint
* done (#16340)
* Added Decision transformer to model docs
* Update src/transformers/models/decision_transformer/modeling_decision_transformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/decision_transformer/modeling_decision_transformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/decision_transformer/configuration_decision_transformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Add type annotations for Rembert/Splinter and copies (#16338)
* undo black autoformat
* minor fix to rembert forward with default
* make fix-copies, make quality
* Adding types to template model
* Removing List from the template types
* Remove `Optional` from a couple of types that don't accept `None`
Co-authored-by: matt <rocketknight1@gmail.com>
* [Bug template] Shift responsibilities for long-range (#16344)
* Fix code repetition in serialization guide (#16346)
* Adopt framework-specific blocks for content (#16342)
* ✨ refactor code samples with framework-specific blocks
* ✨ update training.mdx
* 🖍 apply feedback
* Updates the default branch from master to main (#16326)
* Updates the default branch from master to main
* Links from `master` to `main`
* Typo
* Update examples/flax/README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Updated model with custom docstring example
* Created the Decision Transformer Modle
* updating tests, copy to other machine
* Added last hidden size to Decision Transformer modelling outputs
* Removed copy of original DT file
* made a temporary change to gpt2 to have it conform with the Decision Transformer version
* Updated tests
* Ignoring a file used to test the DT model
* added comments to config file
* added comments and argument descriptions to decision transformer file
* Updated doc
* Ran "make style"
* Remove old model imports
* Removed unused imports, cleaned up init file
* Update docs/source/model_doc/decision_transformer.mdx
added my username
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Reverted changes made to gpt2
* Removed datasets submodule
* Update the modeling outputs to include gpt2 attentions, hidden states and last hidden states
* Added support for return of hidden states, attentions and return dict of gpt2 model.
* Updated tests to include many of the ModelTesterMixin tests.
The following tests are skipped: test_generate_without_input_ids, test_pruning, test_resize_embeddings, test_head_masking, test_attention_outputs, test_hidden_states_output, test_inputs_embeds, test_model_common_attributes
* Added missing line to the end of gpt2 file
* Added an integration test for the Decision Transformer
Test performs and autoregressive evaluation for two time steps
* Set done and info to _ to fix failing test
* Updated integration test to be deterministic and check expected outputs
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Removed unnecessary config options
* Cleaned up commented code and old comments.
* Cleaned up commented code.
* Changed DecisionTransformer to Decision Transformer
* Added Decision Transformer to the main README file
* Added copy of GTP2 called DecisionTranformerGPT2Model
* isorted imports
* isorted imports
* Added model to non-English README files
* Ran make fix-copies and corrected some cases.
* Updated index file to include Decision Transformer
* Added gpt2 model as copy inside the Decision Transformer model file
* Added the unit test file to the list of TEST_FILES_WITH_NO_COMMON_TESTS
* Deleted redundant checkpoint files (I don't know how these got committed)
* Removed testing files. (These should have never been committed)
* Removed accidentally committed files
* Moved the Decision Transformer test to its own directory
* Moved DecisionTransformOutput to modeling_decision_transformer
* Moved the example usage to research project and cleaned comments
* Made tests ignore the copy of gpt2 in Decision Transformer
* Added module output to modelling decision transformer
* removed copied gpt2 model from list of transformers models
* Updated tests and created __init__ file for new test location
* Update README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/decision_transformer/configuration_decision_transformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Removed unneeded summary type from config file
* Fixed copies
* Updated pretrained config map to refer to hopper-medium checkpoint
* Added Decision transformer to model docs
* Update src/transformers/models/decision_transformer/modeling_decision_transformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/decision_transformer/modeling_decision_transformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/decision_transformer/configuration_decision_transformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Updated model with custom docstring example
* Updated copies, config auto, and readme files.
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Dan Tegzes <48134725+Tegzes@users.noreply.github.com>
Co-authored-by: Adam Montgomerie <adam@avanssion.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: Clémentine Fourrier <22726840+clefourrier@users.noreply.github.com>
Co-authored-by: Clementine Fourrier <cfourrie@inria.fr>
Co-authored-by: matt <rocketknight1@gmail.com>
Co-authored-by: Francesco Saverio Zuppichini <francesco.zuppichini@gmail.com>
Co-authored-by: Jacob Dineen <54680234+jacobdineen@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
* Split file_utils in several submodules
* Fixes
* Add back more objects
* More fixes
* Who exactly decided to import that from there?
* Second suggestion to code with code review
* Revert wront move
* Fix imports
* Adapt all imports
* Adapt all imports everywhere
* Revert this import, will fix in a separate commit
* Updates the default branch from master to main
* Links from `master` to `main`
* Typo
* Update examples/flax/README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* up
* up
* up
* fix
* yeh
* ups
* Empty test commit
* correct quicktour
* correct
* correct
* up
* up
* uP
* uP
* up
* up
* uP
* up
* up
* up
* up
* up
* up
* up
* up
* up
* up
* Update src/transformers/models/van/modeling_van.py
* finish
* apply suggestions
* remove folder
* revert to daily testing
* first commit
* ResNet model correctly implemented.
basic modeling + weights conversion is done
removed unused doc
mdx file
doc and conversion script
added feature_extractor to auto
test
minor changes + style + quality
doc
test
Delete process.yml
A left over from my attempt of running circleci locally
* minor changes
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* new test format
* minor changes from conversations
* minor changes from conversations
* make style + quality
* readded the tests
* test + README
* minor changes from conversations
* error in README
* make fix-copies
* removed regression for classification head
* make quality
* fixed loss control flow
* fixed loss control flow
* resolved conversations
* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* READMEs
* index.mdx
* minor changes
* updated tests and models
* unused import
* outputs
* Update docs/source/model_doc/resnet.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* added embeddings_size
* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* conversation
* added push to hub
* test
* embedding_size
* make fix-copies
* resolved conversations
* CI
* changed organization
* minor changes
* CI
* minor changes
* conversations
* conversation
* doc
* tests
* removed unused docstring
* conversation
* removed unused outputs
* CI
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* maskformer
* conflicts
* conflicts
* minor fixes
* feature extractor test fix
refactor MaskFormerLoss following conversation
MaskFormer related types should not trigger a module time import error
missed one
removed all the types that are not used
update config mapping
minor updates in the doc
resolved conversation that doesn't need a discussion
minor changes
resolved conversations
fixed DetrDecoder
* minor changes
minor changes
fixed mdx file
test feature_extractor return types
functional losses -> classes
removed the return type test for the feature extractor
minor changes + style + quality
* conflicts?
* rebase master
* readme
* added missing files
* deleded poolformers test that where in the wrong palce
* CI
* minor changes
* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* resolved conversations
* minor changes
* conversations
[Unispeech] Fix slow tests (#15818)
* remove soundfile old way of loading audio
* Adapt slow test
[Barthez Tokenizer] Fix saving (#15815)
[TFXLNet] Correct tf xlnet generate (#15822)
* [TFXLNet] Correct tf xlnet
* adapt test comment
Fix the push run (#15807)
Fix semantic segmentation pipeline test (#15826)
Fix dummy_inputs() to dummy_inputs in symbolic_trace doc (#15776)
Add model specific output classes to PoolFormer model docs (#15746)
* Added model specific output classes to poolformer docs
* Fixed Segformer typo in Poolformer docs
Adding the option to return_timestamps on pure CTC ASR models. (#15792)
* Adding the option to return_timestamps on pure CTC ASR models.
* Remove `math.prod` which was introduced in Python 3.8
* int are not floats.
* Reworking the PR to support "char" vs "word" output.
* Fixup!
* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Quality.
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
HFTracer.trace should use/return self.graph to be compatible with torch.fx.Tracer (#15824)
Fix tf.concatenate + test past_key_values for TF models (#15774)
* fix wrong method name tf.concatenate
* add tests related to causal LM / decoder
* make style and quality
* clean-up
* Fix TFBertModel's extended_attention_mask when past_key_values is provided
* Fix tests
* fix copies
* More tf.int8 -> tf.int32 in TF test template
* clean-up
* Update TF test template
* revert the previous commit + update the TF test template
* Fix TF template extended_attention_mask when past_key_values is provided
* Fix some styles manually
* clean-up
* Fix ValueError: too many values to unpack in the test
* Fix more: too many values to unpack in the test
* Add a comment for extended_attention_mask when there is past_key_values
* Fix TFElectra extended_attention_mask when past_key_values is provided
* Add tests to other TF models
* Fix for TF Electra test: add prepare_config_and_inputs_for_decoder
* Fix not passing training arg to lm_head in TFRobertaForCausalLM
* Fix tests (with past) for TF Roberta
* add testing for pask_key_values for TFElectra model
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
[examples/summarization and translation] fix readme (#15833)
Add ONNX Runtime quantization for text classification notebook (#15817)
Re-enable doctests for the quicktour (#15828)
* Re-enable doctests for the quicktour
* Re-enable doctests for task_summary (#15830)
* Remove &
Framework split model report (#15825)
Add TFConvNextModel (#15750)
* feat: initial implementation of convnext in tensorflow.
* fix: sample code for the classification model.
* chore: added checked for from the classification model.
* chore: set bias initializer in the classification head.
* chore: updated license terms.
* chore: removed ununsed imports
* feat: enabled argument during using drop_path.
* chore: replaced tf.identity with layers.Activation(linear).
* chore: edited default checkpoint.
* fix: minor bugs in the initializations.
* partial-fix: tf model errors for loading pretrained pt weights.
* partial-fix: call method updated
* partial-fix: cross loading of weights (4x3 variables to be matched)
* chore: removed unneeded comment.
* removed playground.py
* rebasing
* rebasing and removing playground.py.
* fix: renaming TFConvNextStage conv and layer norm layers
* chore: added initializers and other minor additions.
* chore: added initializers and other minor additions.
* add: tests for convnext.
* fix: integration tester class.
* fix: issues mentioned in pr feedback (round 1).
* fix: how output_hidden_states arg is propoagated inside the network.
* feat: handling of arg for pure cnn models.
* chore: added a note on equal contribution in model docs.
* rebasing
* rebasing and removing playground.py.
* feat: encapsulation for the convnext trunk.
* Fix variable naming; Test-related corrections; Run make fixup
* chore: added Joao as a contributor to convnext.
* rebasing
* rebasing and removing playground.py.
* rebasing
* rebasing and removing playground.py.
* chore: corrected copyright year and added comment on NHWC.
* chore: fixed the black version and ran formatting.
* chore: ran make style.
* chore: removed from_pt argument from test, ran make style.
* rebasing
* rebasing and removing playground.py.
* rebasing
* rebasing and removing playground.py.
* fix: tests in the convnext subclass, ran make style.
* rebasing
* rebasing and removing playground.py.
* rebasing
* rebasing and removing playground.py.
* chore: moved convnext test to the correct location
* fix: locations for the test file of convnext.
* fix: convnext tests.
* chore: applied sgugger's suggestion for dealing w/ output_attentions.
* chore: added comments.
* chore: applied updated quality enviornment style.
* chore: applied formatting with quality enviornment.
* chore: revert to the previous tests/test_modeling_common.py.
* chore: revert to the original test_modeling_common.py
* chore: revert to previous states for test_modeling_tf_common.py and modeling_tf_utils.py
* fix: tests for convnext.
* chore: removed output_attentions argument from convnext config.
* chore: revert to the earlier tf utils.
* fix: output shapes of the hidden states
* chore: removed unnecessary comment
* chore: reverting to the right test_modeling_tf_common.py.
* Styling nits
Co-authored-by: ariG23498 <aritra.born2fly@gmail.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
* minor changes
* doc fix in feature extractor
* doc
* typose
* removed detr logic from config
* removed detr logic from config
* removed num_labels
* small fix in the config
* auxilary -> auxiliary
* make style
* some test is failing
* fix a weird char in config prevending doc-builder
* retry to fix the doc-builder issue
* make style
* new try to fix the doc builder
* CI
* change weights to facebook
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: ariG23498 <aritra.born2fly@gmail.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
* Add data2vec model cloned from roberta
* Add checkpoint conversion script
* Fix copies
* Update docs
* Add checkpoint conversion script
* Remove fairseq data2vec_text script and fix format
* Add comment on where to get data2vec_text.py
* Remove mock implementation cheat.py and fix style
* Fix copies
* Remove TF and Flax classes from init
* Add back copy from fairseq data2vec_text.py and fix style
* Update model name in docs/source/index.mdx to be CamelCase
* Revert model name in table to lower-case to get check_table test to pass
* Update src/transformers/models/data2vec/__init__.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/models/data2vec/convert_data2vec_original_pytorch_checkpoint_to_pytorch.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update docs/source/model_doc/data2vec.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update docs/source/model_doc/data2vec.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/auto/configuration_auto.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/data2vec/configuration_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update tests/test_modeling_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/data2vec/configuration_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update documentation
* Copy-paste Data2VecConfig from BertConfig
* Update config checkpoint to point to edugp/data2vec-nlp-base. Fix style and repo-consistency
* Update config special tokens to match RoBERTa
* Split multiple assertions and add individual error messages
* Rename Data2VecModel to Data2VecForTextModel
* Add Data2Vec to _toctree.yml
* Rename Data2VecEmbeddings to Data2VecForTextEmbeddings
* Add initial Data2VecForAudio model (unfinished). Only matching fairseq's implementation up to the feature encoder (before positional encoding).
* finish audio model
* finish audio file
* Update names and fix style, quality and repo consistency
* Remove Data2VecAudioForPretraining. Add tests for Data2VecAudio, mimicking the Wav2Vec2 test suite. Fix bias initilization in positional conv layers. Move back configurations for audio and text to separate files.
* add inputs to logits to data2vec'
* correct autio models
* correct config auto
* correct tok auto
* Update utils/tests_fetcher.py
* delete unnecessary files
* delete unnecessary files
* further renaming
* make all tests pass
* finish
* remove useless test file
* Update tests/test_modeling_common.py
* Update utils/check_repo.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/models/data2vec/modeling_data2vec_text.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Fix copies
* Update docs
* Remove fairseq data2vec_text script and fix format
* Add comment on where to get data2vec_text.py
* Remove mock implementation cheat.py and fix style
* Fix copies
* Remove TF and Flax classes from init
* Add back copy from fairseq data2vec_text.py and fix style
* Update model name in docs/source/index.mdx to be CamelCase
* Revert model name in table to lower-case to get check_table test to pass
* Update documentation
* Update src/transformers/models/data2vec/__init__.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/models/data2vec/convert_data2vec_original_pytorch_checkpoint_to_pytorch.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/models/auto/configuration_auto.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/data2vec/configuration_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update tests/test_modeling_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/data2vec/configuration_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Copy-paste Data2VecConfig from BertConfig
* Update config checkpoint to point to edugp/data2vec-nlp-base. Fix style and repo-consistency
* Update config special tokens to match RoBERTa
* Split multiple assertions and add individual error messages
* Rename Data2VecModel to Data2VecForTextModel
* Add Data2Vec to _toctree.yml
* Rename Data2VecEmbeddings to Data2VecForTextEmbeddings
* Add initial Data2VecForAudio model (unfinished). Only matching fairseq's implementation up to the feature encoder (before positional encoding).
* finish audio model
* finish audio file
* add inputs to logits to data2vec'
* Update names and fix style, quality and repo consistency
* Remove Data2VecAudioForPretraining. Add tests for Data2VecAudio, mimicking the Wav2Vec2 test suite. Fix bias initilization in positional conv layers. Move back configurations for audio and text to separate files.
* correct autio models
* correct config auto
* correct tok auto
* delete unnecessary files
* delete unnecessary files
* Update utils/tests_fetcher.py
* further renaming
* make all tests pass
* finish
* remove useless test file
* Update tests/test_modeling_common.py
* Update utils/check_repo.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/models/data2vec/modeling_data2vec_text.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Move data2vec tests to new structure
* Fix test imports for text tests
* Remove fairseq files
* Change paper link to arxiv
* Modify Data2Vec documentation to reflect that the encoder is not shared across the audio and text models in the current implementation.
* Update text model checkpoint to be facebook/data2vec-text-base
* Add 'Copy from' statements and update paper links and docs
* fix copy from statements
* improve copied from
* correct more copied from statements
* finish copied from stuff
* make style
* add model to README
* add to master
Co-authored-by: Eduardo Gonzalez Ponferrada <eduardo@ferrumhealth.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* rebase
* Delete shift tokens func
* downsample decoder input seq len for init
* correct attention mask
* add tests
* pt flax cross test
* make fixup
* init file for import
* change pt-flax cross test threshold
* pt-flax test logits only
* move tests
* make repo-consistency
* consistent indentation
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Add TensorFlow support for ONNX export
* Change documentation to mention conversion with Tensorflow
* Refactor export into export_pytorch and export_tensorflow
* Check model's type instead of framework installation to choose between TF and Pytorch
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Alberto Bégué <alberto.begue@della.ai>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
* fix_torch_device_generate_test
* remove @
* doc tests
* up
* up
* fix doctests
* adapt files
* finish refactor
* up
* save intermediate
* add more logic
* new change
* improve
* next try
* next try
* next try
* next try
* fix final spaces
* fix final spaces
* improve
* renaming
* correct more bugs
* finish wavlm
* add comment
* run on test runner
* finish all speech models
* adapt
* finish
* Add new model like command
* Bad doc-styler
* black and doc-styler, stop fighting!
* black and doc-styler, stop fighting!
* At last
* Clean up
* Typo
* Bad doc-styler
* Bad doc-styler
* All good maybe?
* Use constants
* Add doc and type hints
* More cleaning
* Add doc
* Fix Copied from
* Doc template
* Use typing.Pattern instead
* Framework-specific files
* Fixes
* Select frameworks clean model init
* Deal with frameworks in main init
* fixes
* Last fix
* Prompt user for info
* Delete exemple config
* Last fixes
* Add test config
* Fix bug with model_type included in each other
* Fixes
* More fixes
* More fixes
* Adapt config
* Remove print statements
* Will fix tokenization later, leave it broken for now
* Add test
* Quality
* Try this way
* Debug
* Maybe by setting the path?
* Let's try another way
* It should go better when actually passing the arg...
* Remove debug statements and style
* Fix config
* Add tests
* Test require the three backends
* intermediate commit
* Revamp pattern replacements and start work on feature extractors
* Adapt model info
* Finalize code for processors
* Fix in main init additions
* Finish questionnaire for processing classes
* Fix file name
* Fix for real
* Fix patterns
* Style
* Remove needless warnings
* Copied from should work now.
* Include Copied form in blocks
* Add test
* More fixes and tests
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Address review comment
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Refine errors for pretrained objects
* PoC to avoid using get_list_of_files
* Adapt tests to use new errors
* Quality + Fix PoC
* Revert "PoC to avoid using get_list_of_files"
This reverts commit cb93b7cae8.
* Revert "Quality + Fix PoC"
This reverts commit 3ba6d0d4ca.
* Fix doc
* Revert PoC
* Add feature extractors
* More tests and PT model
* Adapt error message
* Feature extractor tests
* TF model
* Flax model and test
* Merge flax auto tests
* Add tokenization
* Fix test
* First commit
* Add conversion script
* Make conversion script work for base model
* More improvements
* Update conversion script, works for vqa
* Add indexing argument to meshgrid
* Make conversion script work for ViltForPreTraining
* Add ViltForPreTraining to docs
* Fix device issue
* Add processor
* Add MinMaxResize to feature extractor
* Implement call method of ViltProcessor
* Fix tests
* Add integration test
* Add loss calculation for VQA
* Improve tests
* Improve some more tests
* Debug tests
* Small improvements
* Add support for attention_mask
* Remove mask_it
* Add pixel_mask
* Add tests for ViltFeatureExtractor
* Improve tests
* Add ViltForNaturalLanguageVisualReasoning
* Add ViltForNaturalLanguageVisualReasoning to conversion script
* Minor fixes
* Add support for image_embeds, update docstrings to markdown
* Update docs to markdown
* Improve conversion script
* Rename ViltForPreTraining to ViltForMaskedLM
* Improve conversion script
* Convert docstrings to markdown
* Fix code example of retrieval model
* Properly convert masked language model
* Add integration test for nlvr
* Fix code quality
* Apply suggestions from code review
* Add copied from statements
* Fix pretrained_config_archive_map
* Fix docs
* Add model to README
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Apply more suggestions from code review
* Make code more readable
* Add ViltForNaturalLanguageVisualReasoning to the tests
* Rename ViltForVisualQuestionAnswering to ViltForQuestionAnswering
* Replace pixel_values_2 by single tensor
* Add hidden_states and attentions
* Fix one more test
* Fix all tests
* Update year
* Fix rebase issues
* Fix another rebase issue
* Remove ViltForPreTraining from auto mapping
* Rename ViltForImageRetrievalTextRetrieval to ViltForImageAndTextRetrieval
* Make it possible to use BertTokenizerFast in the processor
* Use BertTokenizerFast by default
* Rename ViltForNaturalLanguageVisualReasoning, define custom model output
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Start the work on TFVisionEncoderDecoderModel
* Expose TFVisionEncoderDecoderModel
* fix import
* Add modeling_tf_vision_encoder_decoder to _ignore_modules in get_model_modules()
* reorder
* Apply the fix for checkpoint loading as in #14016
* remove attention_mask + fix VISION_DUMMY_INPUTS
* A minimal change to make TF generate() work for vision models as encoder in encoder-decoder setting
* fix wrong condition: shape_list(input_ids) == 2
* add tests
* use personal TFViTModel checkpoint (for now)
* Add equivalence tests + projection layer
* style
* make sure projection layer can run
* Add examples
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Clean comments (need to work on TODOs for PyTorch models)
* Remove TF -> PT in check_pt_tf_equivalence for TFVisionEncoderDecoderModel
* fixes
* Revert changes in PT code.
* Update tests/test_modeling_tf_vision_encoder_decoder.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Add test_inference_coco_en for TF test
* fix quality
* fix name
* build doc
* add main_input_name
* Fix ckpt name in test
* fix diff between master and this PR
* fix doc
* fix style and quality
* fix missing doc
* fix labels handling
* Delete auto.rst
* Add the changes done in #14016
* fix prefix
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* make style
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Fix bad examples
* Add black formatting to style_doc
* Use first nonempty line
* Put it at the right place
* Don't add spaces to empty lines
* Better templates
* Deal with triple quotes in docstrings
* Result of style_doc
* Enable mdx treatment and fix code examples in MDXs
* Result of doc styler on doc source files
* Last fixes
* Break copy from
* New doc styler
* Fix issue with args at the start
* Code sample fixes
* Style code examples in MDX
* Fix more patterns
* Typo
* Typo
* More patterns
* Do without black for now
* Get more info in error
* Docstring style
* Re-enable check
* Quality
* Fix add_end_docstring decorator
* Fix docstring
* Convert docstrings of all configurations and tokenizers
* Processors and fixes
* Last modeling files and fixes to models
* Pipeline modules
* Utils files
* Data submodule
* All the other files
* Style
* Missing examples
* Style again
* Fix copies
* Say bye bye to rst docstrings forever
* First draft
* Style and remove mlm
* Make forward pass work
* More improvements
* More improvements
* Fix bug
* More improvements
* More improvements
* Add PerceiverTokenizer first draft
* Improve conversion script
* More improvements
* Make conversion script work for the encoder
* Make conversion script work with local pickle files
* Style & quality, fix-copies
* Add dummy input to conversion script
* Add absolute position embeddings to TextPreProcessor
* Make forward pass of encoder work
* More improvements
* Move text preprocessor to separate script
* More improvements
* More improvements
* Add post processor
* Make MLM model work
* Style
* Add PerceiverForMaskedLM
* Add PerceiverImagePreprocessor
* Make style
* Make PerceiverForImageClassification work
* More improvements
* More improvements
* Use tokenizer in conversion script
* Use PerceiverForMaskedLM in conversion script
* Define custom PerceiverModelOutput
* Improve PerceiverAttention to make it work for both MLM and image classification
* More improvements
* More improvements
* More improvements to the conversion script
* Make conversion script work for both MLM and image classification
* Add PerceiverFeatureExtractor
* More improvements
* Style and quality
* Add center cropping
* Fix bug
* Small fix
* Add print statement
* Fix bug in image preprocessor
* Fix bug with conversion script
* Make output position embeddings an nn.Parameter layer instead of nn.Embedding
* Comment out print statements
* Add position encoding classes
* More improvements
* Use position_encoding_kwargs
* Add PerceiverForImageClassificationFourier
* Make style & quality
* Add PerceiverForImageClassificationConvProcessing
* Style & quality
* Add flow model
* Move processors to modeling file
* Make position encodings modular
* Make basic decoder use modular position encodings
* Add PerceiverForOpticalFlow to conversion script
* Add AudioPreprocessor
* Make it possible for the basic decoder to use Fourier position embeddings
* Add PerceiverForMultimodalAutoencoding
* Improve model for optical flow
* Improve _build_network_inputs method
* Add print statement
* Fix device issue
* Fix device of Fourier embeddings
* Add print statements for debugging
* Add another print statement
* Add another print statement
* Add another print statement
* Add another print statement
* Improve PerceiverAudioPreprocessor
* Improve conversion script for multimodal modal
* More improvements
* More improvements
* Improve multimodal model
* Make forward pass multimodal model work
* More improvements
* Improve tests
* Fix some more tests
* Add output dataclasses
* Make more tests pass
* Add print statements for debuggin
* Add tests for image classification
* Add PerceiverClassifierOutput
* More improvements
* Make more tests pass for the optical flow model
* Make style & quality
* Small improvements
* Don't support training for optical flow model for now
* Fix _prepare_for_class for tests
* Make more tests pass, add some docs
* Add multimodal model to tests
* Minor fixes
* Fix tests
* Improve conversion script
* Make fixup
* Remove pos_dim argument
* Fix device issue
* Potential fix for OOM
* Revert previous commit
* Fix test_initialization
* Add print statements for debugging
* Fix print statement
* Add print statement
* Add print statement
* Add print statement
* Add print statement
* Add print statement
* Add print statement
* Remove need for output_shape
* Comment out output_shape
* Remove unnecessary code
* Improve docs
* Fix make fixup
* Remove PerceiverTextProcessor from init
* Improve docs
* Small improvement
* Apply first batch of suggestions from code review
* Apply more suggestions from code review
* Update docstrings
* Define dicts beforehand for readability
* Rename task to architecture in conversion script, include PerceiverModel in tests
* Add print statements for debugging
* Fix tests on GPU
* Remove preprocessors, postprocessors and decoders from main init
* Add integration test
* Fix docs
* Replace einops by torch
* Update for new docs frontend
* Rename PerceiverForImageClassification
* Improve docs
* Improve docs
* Improve docs of PerceiverModel
* Fix some more tests
* Improve center_crop
* Add PerceiverForSequenceClassification
* Small improvements
* Fix tests
* Add integration test for optical flow model
* Clean up
* Add tests for tokenizer
* Fix tokenizer by adding special tokens properly
* Fix CI
* implement MLukeTokenizer and LukeForMaskedLM
* update tests
* update docs
* add LukeForMaskedLM to check_repo.py
* update README
* fix test and specify the entity pad id in tokenization_(m)luke
* fix EntityPredictionHeadTransform
* test: make sure model configs are jsonifiable
* fix: return python dict instead of config object
* fix: accept pretrained config and use correct class
* Re-enabling slow tests and applying them to core models only
* Re-enabling slow tests and applying them to core models only
* Add new test file to fetcher
* Remove tooslow tests from test_modeling_tf_common.py
* make style
* Style fixes
* Style fixes
* Style fixes
* Style fixes
* Adding core tests to GPT2 and BART
* Removing unused imports
Co-authored-by: niklas.fruehauf <niklas.fruehauf@sovanta.com>
Co-authored-by: matt <rocketknight1@gmail.com>
* Start PR doc
* Cleanup the quality checks and document them
* Add reference in the contributing guide
* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Rename file as per review suggestion
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Add first draft
* Make forward pass work
* Improve conversion script
* Add notebook that checks if it works
* Add BeitForSemanticSegmentation to the tests
* More improvements
* Make BeitForSemanticSegmentation consistent with Segformer
* Small bug fix
* Add BeitForSemanticSegmentation to docs
* Make sure model doesn't output hidden states when the user doesn't want to
* Make it possible to convert the large model
* Fix issue
* Fix conversion script for large model
* Add auxiliary_head option to semantic segmentation model
* Apply suggestions from @sgugger's review
* Apply suggestions from code review
* Fix failing test
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
* First draft
* Make style & quality
* Improve conversion script
* Add print statement to see actual slice
* Make absolute tolerance smaller
* Fix image classification models
* Add post_process_semantic method
* Disable padding
* Improve conversion script
* Rename to ForSemanticSegmentation, add integration test, remove post_process methods
* Improve docs
* Fix code quality
* Fix feature extractor tests
* Fix tests for image classification model
* Delete file
* Add is_torch_available to feature extractor
* Improve documentation of feature extractor methods
* Apply suggestions from @sgugger's code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Apply some more suggestions of code review
* Rebase with master
* Fix rebase issues
* Make sure model only outputs hidden states when the user wants to
* Apply suggestions from code review
* Add pad method
* Support padding of 2d images
* Add print statement
* Add print statement
* Move padding method to SegformerFeatureExtractor
* Fix issue
* Add casting of segmentation maps
* Add test for padding
* Add small note about padding
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Add cross attentions to TFGPT2Model
* Add TFEncoderDecoderModel
* Add TFBaseModelOutputWithPoolingAndCrossAttentions
* Add cross attentions to TFBertModel
* Fix past or past_key_values argument issue
* Fix generation
* Fix save and load
* Add some checks and comments
* Clean the code that deals with past keys/values
* Add kwargs to processing_inputs
* Add serving_output to TFEncoderDecoderModel
* Some cleaning + fix use_cache value issue
* Fix tests + add bert2bert/bert2gpt2 tests
* Fix more tests
* Ignore crossattention.bias when loading GPT2 weights into TFGPT2
* Fix return_dict_in_generate in tf generation
* Fix is_token_logit_eos_token bug in tf generation
* Finalize the tests after fixing some bugs
* Fix another is_token_logit_eos_token bug in tf generation
* Add/Update docs
* Add TFBertEncoderDecoderModelTest
* Clean test script
* Add TFEncoderDecoderModel to the library
* Add cross attentions to TFRobertaModel
* Add TFRobertaEncoderDecoderModelTest
* make style
* Change the way of position_ids computation
* bug fix
* Fix copies in tf_albert
* Remove some copied from and apply some fix-copies
* Remove some copied
* Add cross attentions to some other TF models
* Remove encoder_hidden_states from TFLayoutLMModel.call for now
* Make style
* Fix TFRemBertForCausalLM
* Revert the change to longformer + Remove copies
* Revert the change to albert and convbert + Remove copies
* make quality
* make style
* Add TFRembertEncoderDecoderModelTest
* make quality and fix-copies
* test TFRobertaForCausalLM
* Fixes for failed tests
* Fixes for failed tests
* fix more tests
* Fixes for failed tests
* Fix Auto mapping order
* Fix TFRemBertEncoder return value
* fix tf_rembert
* Check copies are OK
* Fix missing TFBaseModelOutputWithPastAndCrossAttentions is not defined
* Add TFEncoderDecoderModelSaveLoadTests
* fix tf weight loading
* check the change of use_cache
* Revert the change
* Add missing test_for_causal_lm for TFRobertaModelTest
* Try cleaning past
* fix _reorder_cache
* Revert some files to original versions
* Keep as many copies as possible
* Apply suggested changes - Use raise ValueError instead of assert
* Move import to top
* Fix wrong require_torch
* Replace more assert by raise ValueError
* Add test_pt_tf_model_equivalence (the test won't pass for now)
* add test for loading/saving
* finish
* finish
* Remove test_pt_tf_model_equivalence
* Update tf modeling template
* Remove pooling, added in the prev. commit, from MainLayer
* Update tf modeling test template
* Move inputs["use_cache"] = False to modeling_tf_utils.py
* Fix torch.Tensor in the comment
* fix use_cache
* Fix missing use_cache in ElectraConfig
* Add a note to from_pretrained
* Fix style
* Change test_encoder_decoder_save_load_from_encoder_decoder_from_pt
* Fix TFMLP (in TFGPT2) activation issue
* Fix None past_key_values value in serving_output
* Don't call get_encoderdecoder_model in TFEncoderDecoderModelTest.test_configuration_tie until we have a TF checkpoint on Hub
* Apply review suggestions - style for cross_attns in serving_output
* Apply review suggestions - change assert + docstrings
* break the error message to respect the char limit
* deprecate the argument past
* fix docstring style
* Update the encoder-decoder rst file
* fix Unknown interpreted text role "method"
* fix typo
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* beit-flax
* updated FLAX_BEIT_MLM_DOCSTRING
* removed bool_masked_pos from classification
* updated Copyright
* code refactoring: x -> embeddings
* updated test: rm from_pt
* Update docs/source/model_doc/beit.rst
* model code dtype updates and
other changes according to review
* relative_position_bias
revert back to pytorch design
* Properly use test_fetcher for examples
* Fake example modification
* Fake modeling file modification
* Clean fake modifications
* Run example tests for any modification.
* Add the audio classification pipeline
* Remove autoconfig exception
* Mark ffmpeg test as slow
* Rearrange pipeline tests
* Add small test
* Replace asserts with ValueError
* Add hubert classifier + tests
* Add hubert classifier + tests
* Dummies for all classification tests
* Wav2Vec2 classifier + ER test
* Fix hubert integration tests
* Add hubert IC
* Pass tests for all classification tasks on Hubert
* Pass all tests + copies
* Move models to the SUPERB org
* make flax gpt2 working with cross attention
* Remove encoder->decoder projection layer
* A draft (incomplete) for FlaxEncoderDecoderModel
* Add the method from_encoder_decoder_pretrained + the docstrings
* Fix the mistakes of using EncoderDecoderModel
* Fix style
* Add FlaxEncoderDecoderModel to the library
* Fix cyclic imports
* Add FlaxEncoderDecoderModel to modeling_flax_auto.py
* Remove question comments
* add tests for FlaxEncoderDecoderModel
* add flax_encoder_decoder to the lists of ignored entries in check_repo.py
* fix missing required positional arguments
* Remove **kwargs when creating FlaxEncoderDecoderModel in from_encoder_decoder_pretrained()
Also fix generation eos/pad tokens issue
* Fix: Use sequences from the generated_output
* Change a check from assert to raise ValueError
* Fix examples and token ids issues
* Fix missing all_cross_attentions when outputting tuple in modeling_gpt2
* Remove the changes in configuration docstrings.
* allow for bert 2 gpt2
* make fix-copies
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Change remaining examples to bert2gpt2
* Change the test to Bert2GPT2
* Fix examples
* Fix import
* Fix unpack bug
* Rename to FlaxEncoderDecoderModelTest and change the test to bert2gpt2
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Fix: NotImplentedError -> NotImplementedError
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* up
* finalize
Co-authored-by: ydshieh <ydshieh@user.noreply>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Doctests
* Limit to 4 decimals
* Try with separate PT/TF tests
* Remove test for TF
* Ellips the predictions
* Doctest continue on failure
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>