* Start the work for TFViTModel
* Convert to TF code - need to check in the follow up commits
* Clean up model code
* Expose TFViTModel
* make style
* make quality
* Add test
* make style & quality
* Fix some imports
* fix wrong usage - *kwargs => **kwargs
* Fix Conv2D weight loading (PT->TF) issue
* Add tests for images with different sizes + fix model
* Fix some common tests for TFViTModel
* Use inputs instead of input_ids in test_compile_tf_model
* Add a comment about transpose and Conv2D in convert_tf_weight_name_to_pt_weight_name
* Avoid transpose in TFViT call
* Fix Conv2D issue in load_tf2_weights_in_pytorch_model
* Use tf.keras.layers.Conv2D instead of tf.nn.conv2d
* Use a simpler heuristic to detect Conv2D layers
* Change convert_tf_weight_name_to_pt_weight_name to return TransposeType
* Check tf_weight_shape is not None before using it
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fix missing comma
* fix input dtype
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* correct order of overflowing tokens for LayoutLMv2 tokenizer
* test to check order of overflowing_tokens for a seq of input_ids
* fix up quality
* added suggested changes
* add check that tests the bbox sequence
* pair_input test added
* pass quality test
* check bbox sequence added
* unittest method
* comments added
* add overflowing bbox test
* improved "seq_1"
Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>
* improve code quality
Co-authored-by: SaulLu <lucilesaul.com@gmail.com>
Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>
* minor modification to the wav2vec2 modeling file to support tensor-parallelism with DeepSpeed on this HuggingFace model
* refine the comments
* synch changes
* fix comments
* refine comments
* fix format
* Start PR doc
* Cleanup the quality checks and document them
* Add reference in the contributing guide
* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Rename file as per review suggestion
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Fix issue #13327: Wrong weight initialization for TF T5 model
* run black formatter
* fix typo
* remove my name tag from comments
Co-authored-by: Shirron <dan.shirron@intel.com>
* Adding support for `truncation` parameter on `feature-extraction`
pipeline.
Fixes #14183
* Fixing tests on ibert, longformer, and roberta.
* Rebase fix.
* minimal fixes to run DataCollatorForWholeWordMask with return_tensors="np" and return_tensors="tf"
* more consistent implementation for numpy_mask_tokens
* Add cross attentions to TFGPT2Model
* change to is_pt_tf_cross_test
* A minor correction to a comment
* Remove n_ctx when creating self.crossattention
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* add Beit model output class
* inheriting from BaseModelOutputWithPooling
* updated docs if use_mean_pooling is False
* added beit specific outputs in model docs
* changed the import path
* Fix docs
Co-authored-by: Niels Rogge <niels.rogge1@gmail.com>
* check test_configuration_tie
* Fix test_configuration_tie
* make test slow again
* Remove property and use model.module.bind
* revert to slow test
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* Add first draft
* Make forward pass work
* Improve conversion script
* Add notebook that checks if it works
* Add BeitForSemanticSegmentation to the tests
* More improvements
* Make BeitForSemanticSegmentation consistent with Segformer
* Small bug fix
* Add BeitForSemanticSegmentation to docs
* Make sure model doesn't output hidden states when the user doesn't want them
* Make it possible to convert the large model
* Fix issue
* Fix conversion script for large model
* Add auxiliary_head option to semantic segmentation model
* Apply suggestions from @sgugger's review
* Apply suggestions from code review
* Fix failing test
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>