* Start the work for TFViTModel
* Convert to TF code - need to check in the follow up commits
* Clean up model code
* Expose TFViTModel
* make style
* make quality
* Add test
* make style & quality
* Fix some imports
* fix wrong usage - *kwargs => ** kwargs
* Fix Conv2D weight loading (PT->TF) issue
* Add tests for images with different sizes + fix model
* Fix some common tests for TFViTModel
* Use inputs instead of input_ids in test_compile_tf_model
* Add a comment about transpose and Conv2D in convert_tf_weight_name_to_pt_weight_name
* Avoid transpose in TFViT call
* Fix Conv2D issue in load_tf2_weights_in_pytorch_model
* Use tf.keras.layers.Conv2D instead of tf.nn.conv2d
* Using simpler heuristic to detect Conv2D layer
* Change convert_tf_weight_name_to_pt_weight_name to return TransposeType
* Check tf_weight_shape is not None before using it
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fix missing comma
* fix input dtype
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* add Beit model ouput class
* inherting from BaseModelOuputWithPooling
* updated docs if use_mean_pooling is False
* added beit specific outputs in model docs
* changed the import path
* Fix docs
Co-authored-by: Niels Rogge <niels.rogge1@gmail.com>
* Add first draft
* Make forward pass work
* Improve conversion script
* Add notebook that checks if it works
* Add BeitForSemanticSegmentation to the tests
* More improvements
* Make BeitForSemanticSegmentation consistent with Segformer
* Small bug fix
* Add BeitForSemanticSegmentation to docs
* Make sure model doesn't output hidden states when the user doesn't want to
* Make it possible to convert the large model
* Fix issue
* Fix conversion script for large model
* Add auxiliary_head option to semantic segmentation model
* Apply suggestions from @sgugger's review
* Apply suggestions from code review
* Fix failing test
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
* Add the support for the fast (rust) implementation of BlenbderbotTokenizer
* Fix a converter and a typo in a doc
* Apply the patil-suraj's suggestion
* (Nitpick) Fast tokenization -> Fast Tokenization in doc
* Apply the SaulLu's suggestion
* Apply Narsil's suggestion to fix test pipelines
* Add encoder_no_repeat_ngram_size according to the Narsil's suggestion
* Revert the last (unnecessary) commit
* Override pipeline config for Blenderbot to allow for larger pos. emb.
* make fix-copies
* First draft
* Make style & quality
* Improve conversion script
* Add print statement to see actual slice
* Make absolute tolerance smaller
* Fix image classification models
* Add post_process_semantic method
* Disable padding
* Improve conversion script
* Rename to ForSemanticSegmentation, add integration test, remove post_process methods
* Improve docs
* Fix code quality
* Fix feature extractor tests
* Fix tests for image classification model
* Delete file
* Add is_torch_available to feature extractor
* Improve documentation of feature extractor methods
* Apply suggestions from @sgugger's code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Apply some more suggestions of code review
* Rebase with master
* Fix rebase issues
* Make sure model only outputs hidden states when the user wants to
* Apply suggestions from code review
* Add pad method
* Support padding of 2d images
* Add print statement
* Add print statement
* Move padding method to SegformerFeatureExtractor
* Fix issue
* Add casting of segmentation maps
* Add test for padding
* Add small note about padding
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* unispeech
* add copy from
* remove hubert copy from
* finish for today
* add unispeech-sat
* adapt more
* up
* up
* up
* up
* add modeling
* add tests
* up
* up
* finish
* up
* Apply suggestions from code review
* up
* up
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* up
* up
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Add API to register a new object in auto classes
* Fix test
* Documentation
* Add to tokenizers and test
* Add cleanup after tests
* Be more careful
* Move import
* Move import
* Cleanup in TF test too
* Add consistency check
* Add documentation
* Style
* Update docs/source/model_doc/auto.rst
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/models/auto/auto_factory.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* First draft
* Update self-attention of RoBERTa as proposition
* Improve conversion script
* Add TrOCR decoder-only model
* More improvements
* Make forward pass with pretrained weights work
* More improvements
* Some more improvements
* More improvements
* Make conversion work
* Clean up print statements
* Add documentation, processor
* Add test files
* Small improvements
* Some more improvements
* Make fix-copies, improve docs
* Make all vision encoder decoder model tests pass
* Make conversion script support other models
* Update URL for OCR image
* Update conversion script
* Fix style & quality
* Add support for the large-printed model
* Fix some issues
* Add print statement for debugging
* Add print statements for debugging
* Make possible fix for sinusoidal embedding
* Further debugging
* Potential fix v2
* Add more print statements for debugging
* Add more print statements for debugging
* Deubg more
* Comment out print statements
* Make conversion of large printed model possible, address review comments
* Make it possible to convert the stage1 checkpoints
* Clean up code, apply suggestions from code review
* Apply suggestions from code review, use Microsoft models in tests
* Rename encoder_hidden_size to cross_attention_hidden_size
* Improve docs
* Add cross attentions to TFGPT2Model
* Add TFEncoderDecoderModel
* Add TFBaseModelOutputWithPoolingAndCrossAttentions
* Add cross attentions to TFBertModel
* Fix past or past_key_values argument issue
* Fix generation
* Fix save and load
* Add some checks and comments
* Clean the code that deals with past keys/values
* Add kwargs to processing_inputs
* Add serving_output to TFEncoderDecoderModel
* Some cleaning + fix use_cache value issue
* Fix tests + add bert2bert/bert2gpt2 tests
* Fix more tests
* Ignore crossattention.bias when loading GPT2 weights into TFGPT2
* Fix return_dict_in_generate in tf generation
* Fix is_token_logit_eos_token bug in tf generation
* Finalize the tests after fixing some bugs
* Fix another is_token_logit_eos_token bug in tf generation
* Add/Update docs
* Add TFBertEncoderDecoderModelTest
* Clean test script
* Add TFEncoderDecoderModel to the library
* Add cross attentions to TFRobertaModel
* Add TFRobertaEncoderDecoderModelTest
* make style
* Change the way of position_ids computation
* bug fix
* Fix copies in tf_albert
* Remove some copied from and apply some fix-copies
* Remove some copied
* Add cross attentions to some other TF models
* Remove encoder_hidden_states from TFLayoutLMModel.call for now
* Make style
* Fix TFRemBertForCausalLM
* Revert the change to longformer + Remove copies
* Revert the change to albert and convbert + Remove copies
* make quality
* make style
* Add TFRembertEncoderDecoderModelTest
* make quality and fix-copies
* test TFRobertaForCausalLM
* Fixes for failed tests
* Fixes for failed tests
* fix more tests
* Fixes for failed tests
* Fix Auto mapping order
* Fix TFRemBertEncoder return value
* fix tf_rembert
* Check copies are OK
* Fix missing TFBaseModelOutputWithPastAndCrossAttentions is not defined
* Add TFEncoderDecoderModelSaveLoadTests
* fix tf weight loading
* check the change of use_cache
* Revert the change
* Add missing test_for_causal_lm for TFRobertaModelTest
* Try cleaning past
* fix _reorder_cache
* Revert some files to original versions
* Keep as many copies as possible
* Apply suggested changes - Use raise ValueError instead of assert
* Move import to top
* Fix wrong require_torch
* Replace more assert by raise ValueError
* Add test_pt_tf_model_equivalence (the test won't pass for now)
* add test for loading/saving
* finish
* finish
* Remove test_pt_tf_model_equivalence
* Update tf modeling template
* Remove pooling, added in the prev. commit, from MainLayer
* Update tf modeling test template
* Move inputs["use_cache"] = False to modeling_tf_utils.py
* Fix torch.Tensor in the comment
* fix use_cache
* Fix missing use_cache in ElectraConfig
* Add a note to from_pretrained
* Fix style
* Change test_encoder_decoder_save_load_from_encoder_decoder_from_pt
* Fix TFMLP (in TFGPT2) activation issue
* Fix None past_key_values value in serving_output
* Don't call get_encoderdecoder_model in TFEncoderDecoderModelTest.test_configuration_tie until we have a TF checkpoint on Hub
* Apply review suggestions - style for cross_attns in serving_output
* Apply review suggestions - change assert + docstrings
* break the error message to respect the char limit
* deprecate the argument past
* fix docstring style
* Update the encoder-decoder rst file
* fix Unknown interpreted text role "method"
* fix typo
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Make gradient_checkpointing a training argument
* Update src/transformers/modeling_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Update src/transformers/configuration_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Fix tests
* Style
* document Gradient Checkpointing as a performance feature
* Small rename
* PoC for not using the config
* Adapt BC to new PoC
* Forgot to save
* Rollout changes to all other models
* Fix typo
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>
* beit-flax
* updated FLAX_BEIT_MLM_DOCSTRING
* removed bool_masked_pos from classification
* updated Copyright
* code refactoring: x -> embeddings
* updated test: rm from_pt
* Update docs/source/model_doc/beit.rst
* model code dtype updates and
other changes according to review
* relative_position_bias
revert back to pytorch design
* Init FNet
* Update config
* Fix config
* Update model classes
* Update tokenizers to use sentencepiece
* Fix errors in model
* Fix defaults in config
* Remove position embedding type completely
* Fix typo and take only real numbers
* Fix type vocab size in configuration
* Add projection layer to embeddings
* Fix position ids bug in embeddings
* Add minor changes
* Add conversion script and remove CausalLM vestiges
* Fix conversion script
* Fix conversion script
* Remove CausalLM Test
* Update checkpoint names to dummy checkpoints
* Add tokenizer mapping
* Fix modeling file and corresponding tests
* Add tokenization test file
* Add PreTraining model test
* Make style and quality
* Make tokenization base tests work
* Update docs
* Add FastTokenizer tests
* Fix fast tokenizer special tokens
* Fix style and quality
* Remove load_tf_weights vestiges
* Add FNet to main README
* Fix configuration example indentation
* Comment tokenization slow test
* Fix style
* Add changes from review
* Fix style
* Remove bos and eos tokens from tokenizers
* Add tokenizer slow test, TPU transforms, NSP
* Add scipy check
* Add scipy availabilty check to test
* Fix tokenizer and use correct inputs
* Remove remaining TODOs
* Fix tests
* Fix tests
* Comment Fourier Test
* Uncomment Fourier Test
* Change to google checkpoint
* Add changes from review
* Fix activation function
* Fix model integration test
* Add more integration tests
* Add comparison steps to MLM integration test
* Fix style
* Add masked tokenization fix
* Improve mask tokenization fix
* Fix index docs
* Add changes from review
* Fix issue
* Fix failing import in test
* some more fixes
* correct fast tokenizer
* finalize
* make style
* Remove additional tokenization logic
* Set do_lower_case to False
* Allow keeping accents
* Fix tokenization test
* Fix FNet Tokenizer Fast
* fix tests
* make style
* Add tips to FNet docs
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
* [docs] update dead quickstart link on resuing past for GPT2
Thed dead link have been replaced by two links of forward and call methods of the GPT2 class for torch and tensorflow respectively.
* [docs] fix formatting for gpt2 page update
* fix_torch_device_generate_test
* remove @
* up
* correct some bugs
* correct model
* finish speech2text extension
* up
* up
* up
* up
* Update utils/custom_init_isort.py
* up
* up
* update with tokenizer
* correct old tok
* correct old tok
* fix bug
* up
* up
* add more tests
* up
* fix docs
* up
* fix some more tests
* add better config
* correct some more things
"
* fix tests
* improve docs
* Apply suggestions from code review
* Apply suggestions from code review
* final fixes
* finalize
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* apply suggestions Lysandre and Sylvain
* apply nicos suggestions
* upload everything
* finish
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
Co-authored-by: your_github_username <your_github_email>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Add the audio classification pipeline
* Remove autoconfig exception
* Mark ffmpeg test as slow
* Rearrange pipeline tests
* Add small test
* Replace asserts with ValueError
* Deberta_v2 tf
* added new line at the end of file, make style
* +V2, typo
* remove never executed branch of code
* rm cmnt and fixed typo in url filter
* cleanup according to review comments
* added #Copied from