* Start the work on TFVisionEncoderDecoderModel
* Expose TFVisionEncoderDecoderModel
* fix import
* Add modeling_tf_vision_encoder_decoder to _ignore_modules in get_model_modules()
* reorder
* Apply the fix for checkpoint loading as in #14016
* remove attention_mask + fix VISION_DUMMY_INPUTS
* A minimal change to make TF generate() work for vision models as encoder in encoder-decoder setting
* fix wrong condition: shape_list(input_ids) == 2
* add tests
* use personal TFViTModel checkpoint (for now)
* Add equivalence tests + projection layer
* style
* make sure projection layer can run
* Add examples
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Clean comments (need to work on TODOs for PyTorch models)
* Remove TF -> PT in check_pt_tf_equivalence for TFVisionEncoderDecoderModel
* fixes
* Revert changes in PT code.
* Update tests/test_modeling_tf_vision_encoder_decoder.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Add test_inference_coco_en for TF test
* fix quality
* fix name
* build doc
* add main_input_name
* Fix ckpt name in test
* fix diff between master and this PR
* fix doc
* fix style and quality
* fix missing doc
* fix labels handling
* Delete auto.rst
* Add the changes done in #14016
* fix prefix
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* make style
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Fix bad examples
* Add black formatting to style_doc
* Use first nonempty line
* Put it at the right place
* Don't add spaces to empty lines
* Better templates
* Deal with triple quotes in docstrings
* Result of style_doc
* Enable mdx treatment and fix code examples in MDXs
* Result of doc styler on doc source files
* Last fixes
* Break copy from
* New doc styler
* Fix issue with args at the start
* Code sample fixes
* Style code examples in MDX
* Fix more patterns
* Typo
* Typo
* More patterns
* Do without black for now
* Get more info in error
* Docstring style
* Re-enable check
* Quality
* Fix add_end_docstring decorator
* Fix docstring
* Convert docstrings of all configurations and tokenizers
* Processors and fixes
* Last modeling files and fixes to models
* Pipeline modules
* Utils files
* Data submodule
* All the other files
* Style
* Missing examples
* Style again
* Fix copies
* Say bye bye to rst docstrings forever
* First draft
* Style and remove mlm
* Make forward pass work
* More improvements
* More improvements
* Fix bug
* More improvements
* More improvements
* Add PerceiverTokenizer first draft
* Improve conversion script
* More improvements
* Make conversion script work for the encoder
* Make conversion script work with local pickle files
* Style & quality, fix-copies
* Add dummy input to conversion script
* Add absolute position embeddings to TextPreProcessor
* Make forward pass of encoder work
* More improvements
* Move text preprocessor to separate script
* More improvements
* More improvements
* Add post processor
* Make MLM model work
* Style
* Add PerceiverForMaskedLM
* Add PerceiverImagePreprocessor
* Make style
* Make PerceiverForImageClassification work
* More improvements
* More improvements
* Use tokenizer in conversion script
* Use PerceiverForMaskedLM in conversion script
* Define custom PerceiverModelOutput
* Improve PerceiverAttention to make it work for both MLM and image classification
* More improvements
* More improvements
* More improvements to the conversion script
* Make conversion script work for both MLM and image classification
* Add PerceiverFeatureExtractor
* More improvements
* Style and quality
* Add center cropping
* Fix bug
* Small fix
* Add print statement
* Fix bug in image preprocessor
* Fix bug with conversion script
* Make output position embeddings an nn.Parameter layer instead of nn.Embedding
* Comment out print statements
* Add position encoding classes
* More improvements
* Use position_encoding_kwargs
* Add PerceiverForImageClassificationFourier
* Make style & quality
* Add PerceiverForImageClassificationConvProcessing
* Style & quality
* Add flow model
* Move processors to modeling file
* Make position encodings modular
* Make basic decoder use modular position encodings
* Add PerceiverForOpticalFlow to conversion script
* Add AudioPreprocessor
* Make it possible for the basic decoder to use Fourier position embeddings
* Add PerceiverForMultimodalAutoencoding
* Improve model for optical flow
* Improve _build_network_inputs method
* Add print statement
* Fix device issue
* Fix device of Fourier embeddings
* Add print statements for debugging
* Add another print statement
* Add another print statement
* Add another print statement
* Add another print statement
* Improve PerceiverAudioPreprocessor
* Improve conversion script for multimodal model
* More improvements
* More improvements
* Improve multimodal model
* Make forward pass multimodal model work
* More improvements
* Improve tests
* Fix some more tests
* Add output dataclasses
* Make more tests pass
* Add print statements for debugging
* Add tests for image classification
* Add PerceiverClassifierOutput
* More improvements
* Make more tests pass for the optical flow model
* Make style & quality
* Small improvements
* Don't support training for optical flow model for now
* Fix _prepare_for_class for tests
* Make more tests pass, add some docs
* Add multimodal model to tests
* Minor fixes
* Fix tests
* Improve conversion script
* Make fixup
* Remove pos_dim argument
* Fix device issue
* Potential fix for OOM
* Revert previous commit
* Fix test_initialization
* Add print statements for debugging
* Fix print statement
* Add print statement
* Add print statement
* Add print statement
* Add print statement
* Add print statement
* Add print statement
* Remove need for output_shape
* Comment out output_shape
* Remove unnecessary code
* Improve docs
* Fix make fixup
* Remove PerceiverTextProcessor from init
* Improve docs
* Small improvement
* Apply first batch of suggestions from code review
* Apply more suggestions from code review
* Update docstrings
* Define dicts beforehand for readability
* Rename task to architecture in conversion script, include PerceiverModel in tests
* Add print statements for debugging
* Fix tests on GPU
* Remove preprocessors, postprocessors and decoders from main init
* Add integration test
* Fix docs
* Replace einops by torch
* Update for new docs frontend
* Rename PerceiverForImageClassification
* Improve docs
* Improve docs
* Improve docs of PerceiverModel
* Fix some more tests
* Improve center_crop
* Add PerceiverForSequenceClassification
* Small improvements
* Fix tests
* Add integration test for optical flow model
* Clean up
* Add tests for tokenizer
* Fix tokenizer by adding special tokens properly
* Fix CI
* implement MLukeTokenizer and LukeForMaskedLM
* update tests
* update docs
* add LukeForMaskedLM to check_repo.py
* update README
* fix test and specify the entity pad id in tokenization_(m)luke
* fix EntityPredictionHeadTransform
* test: make sure model configs are jsonifiable
* fix: return python dict instead of config object
* fix: accept pretrained config and use correct class
* Re-enabling slow tests and applying them to core models only
* Re-enabling slow tests and applying them to core models only
* Add new test file to fetcher
* Remove tooslow tests from test_modeling_tf_common.py
* make style
* Style fixes
* Style fixes
* Style fixes
* Style fixes
* Adding core tests to GPT2 and BART
* Removing unused imports
Co-authored-by: niklas.fruehauf <niklas.fruehauf@sovanta.com>
Co-authored-by: matt <rocketknight1@gmail.com>
* Start PR doc
* Cleanup the quality checks and document them
* Add reference in the contributing guide
* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Rename file as per review suggestion
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Add first draft
* Make forward pass work
* Improve conversion script
* Add notebook that checks if it works
* Add BeitForSemanticSegmentation to the tests
* More improvements
* Make BeitForSemanticSegmentation consistent with Segformer
* Small bug fix
* Add BeitForSemanticSegmentation to docs
* Make sure model doesn't output hidden states when the user doesn't want to
* Make it possible to convert the large model
* Fix issue
* Fix conversion script for large model
* Add auxiliary_head option to semantic segmentation model
* Apply suggestions from @sgugger's review
* Apply suggestions from code review
* Fix failing test
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
* First draft
* Make style & quality
* Improve conversion script
* Add print statement to see actual slice
* Make absolute tolerance smaller
* Fix image classification models
* Add post_process_semantic method
* Disable padding
* Improve conversion script
* Rename to ForSemanticSegmentation, add integration test, remove post_process methods
* Improve docs
* Fix code quality
* Fix feature extractor tests
* Fix tests for image classification model
* Delete file
* Add is_torch_available to feature extractor
* Improve documentation of feature extractor methods
* Apply suggestions from @sgugger's code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Apply some more suggestions of code review
* Rebase with master
* Fix rebase issues
* Make sure model only outputs hidden states when the user wants to
* Apply suggestions from code review
* Add pad method
* Support padding of 2d images
* Add print statement
* Add print statement
* Move padding method to SegformerFeatureExtractor
* Fix issue
* Add casting of segmentation maps
* Add test for padding
* Add small note about padding
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Add cross attentions to TFGPT2Model
* Add TFEncoderDecoderModel
* Add TFBaseModelOutputWithPoolingAndCrossAttentions
* Add cross attentions to TFBertModel
* Fix past or past_key_values argument issue
* Fix generation
* Fix save and load
* Add some checks and comments
* Clean the code that deals with past keys/values
* Add kwargs to processing_inputs
* Add serving_output to TFEncoderDecoderModel
* Some cleaning + fix use_cache value issue
* Fix tests + add bert2bert/bert2gpt2 tests
* Fix more tests
* Ignore crossattention.bias when loading GPT2 weights into TFGPT2
* Fix return_dict_in_generate in tf generation
* Fix is_token_logit_eos_token bug in tf generation
* Finalize the tests after fixing some bugs
* Fix another is_token_logit_eos_token bug in tf generation
* Add/Update docs
* Add TFBertEncoderDecoderModelTest
* Clean test script
* Add TFEncoderDecoderModel to the library
* Add cross attentions to TFRobertaModel
* Add TFRobertaEncoderDecoderModelTest
* make style
* Change the way of position_ids computation
* bug fix
* Fix copies in tf_albert
* Remove some copied from and apply some fix-copies
* Remove some copied
* Add cross attentions to some other TF models
* Remove encoder_hidden_states from TFLayoutLMModel.call for now
* Make style
* Fix TFRemBertForCausalLM
* Revert the change to longformer + Remove copies
* Revert the change to albert and convbert + Remove copies
* make quality
* make style
* Add TFRembertEncoderDecoderModelTest
* make quality and fix-copies
* test TFRobertaForCausalLM
* Fixes for failed tests
* Fixes for failed tests
* fix more tests
* Fixes for failed tests
* Fix Auto mapping order
* Fix TFRemBertEncoder return value
* fix tf_rembert
* Check copies are OK
* Fix "TFBaseModelOutputWithPastAndCrossAttentions is not defined" error
* Add TFEncoderDecoderModelSaveLoadTests
* fix tf weight loading
* check the change of use_cache
* Revert the change
* Add missing test_for_causal_lm for TFRobertaModelTest
* Try cleaning past
* fix _reorder_cache
* Revert some files to original versions
* Keep as many copies as possible
* Apply suggested changes - Use raise ValueError instead of assert
* Move import to top
* Fix wrong require_torch
* Replace more assert by raise ValueError
* Add test_pt_tf_model_equivalence (the test won't pass for now)
* add test for loading/saving
* finish
* finish
* Remove test_pt_tf_model_equivalence
* Update tf modeling template
* Remove pooling, added in the prev. commit, from MainLayer
* Update tf modeling test template
* Move inputs["use_cache"] = False to modeling_tf_utils.py
* Fix torch.Tensor in the comment
* fix use_cache
* Fix missing use_cache in ElectraConfig
* Add a note to from_pretrained
* Fix style
* Change test_encoder_decoder_save_load_from_encoder_decoder_from_pt
* Fix TFMLP (in TFGPT2) activation issue
* Fix None past_key_values value in serving_output
* Don't call get_encoderdecoder_model in TFEncoderDecoderModelTest.test_configuration_tie until we have a TF checkpoint on Hub
* Apply review suggestions - style for cross_attns in serving_output
* Apply review suggestions - change assert + docstrings
* break the error message to respect the char limit
* deprecate the argument past
* fix docstring style
* Update the encoder-decoder rst file
* fix Unknown interpreted text role "method"
* fix typo
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* beit-flax
* updated FLAX_BEIT_MLM_DOCSTRING
* removed bool_masked_pos from classification
* updated Copyright
* code refactoring: x -> embeddings
* updated test: rm from_pt
* Update docs/source/model_doc/beit.rst
* model code dtype updates and other changes according to review
* relative_position_bias: revert back to PyTorch design
* Properly use test_fetcher for examples
* Fake example modification
* Fake modeling file modification
* Clean fake modifications
* Run example tests for any modification.
* Add the audio classification pipeline
* Remove autoconfig exception
* Mark ffmpeg test as slow
* Rearrange pipeline tests
* Add small test
* Replace asserts with ValueError
* Add hubert classifier + tests
* Add hubert classifier + tests
* Dummies for all classification tests
* Wav2Vec2 classifier + ER test
* Fix hubert integration tests
* Add hubert IC
* Pass tests for all classification tasks on Hubert
* Pass all tests + copies
* Move models to the SUPERB org
* make flax gpt2 working with cross attention
* Remove encoder->decoder projection layer
* A draft (incomplete) for FlaxEncoderDecoderModel
* Add the method from_encoder_decoder_pretrained + the docstrings
* Fix the mistakes of using EncoderDecoderModel
* Fix style
* Add FlaxEncoderDecoderModel to the library
* Fix cyclic imports
* Add FlaxEncoderDecoderModel to modeling_flax_auto.py
* Remove question comments
* add tests for FlaxEncoderDecoderModel
* add flax_encoder_decoder to the lists of ignored entries in check_repo.py
* fix missing required positional arguments
* Remove **kwargs when creating FlaxEncoderDecoderModel in from_encoder_decoder_pretrained()
Also fix generation eos/pad tokens issue
* Fix: Use sequences from the generated_output
* Change a check from assert to raise ValueError
* Fix examples and token ids issues
* Fix missing all_cross_attentions when outputting tuple in modeling_gpt2
* Remove the changes in configuration docstrings.
* allow for bert 2 gpt2
* make fix-copies
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Change remaining examples to bert2gpt2
* Change the test to Bert2GPT2
* Fix examples
* Fix import
* Fix unpack bug
* Rename to FlaxEncoderDecoderModelTest and change the test to bert2gpt2
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Fix: NotImplentedError -> NotImplementedError
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* up
* finalize
Co-authored-by: ydshieh <ydshieh@user.noreply>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Doctests
* Limit to 4 decimals
* Try with separate PT/TF tests
* Remove test for TF
* Ellips the predictions
* Doctest continue on failure
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
* Initial work
* All auto models
* All tf auto models
* All flax auto models
* Tokenizers
* Add feature extractors
* Fix typos
* Fix other typo
* Use the right config
* Remove old mapping names and update logic in AutoTokenizer
* Update check_table
* Fix copies and check_repo script
* Fix last test
* Add back name
* clean up
* Update template
* Update template
* Forgot a )
* Use alternative to fixup
* Fix TF model template
* Address review comments
* Address review comments
* Style
* First pass
* Make conversion script work
* Improve conversion script
* Fix bug, conversion script working
* Improve conversion script, implement BEiTFeatureExtractor
* Make conversion script work based on URL
* Improve conversion script
* Add tests, add documentation
* Fix bug in conversion script
* Fix another bug
* Add support for converting masked image modeling model
* Add support for converting masked image modeling
* Fix bug
* Add print statement for debugging
* Fix another bug
* Make conversion script finally work for masked image modeling models
* Move id2label for datasets to JSON files on the hub
* Make sure id's are read in as integers
* Add integration tests
* Make style & quality
* Fix test, add BEiT to README
* Apply suggestions from @sgugger's review
* Apply suggestions from code review
* Make quality
* Replace nielsr by microsoft in tests, add docs
* Rename BEiT to Beit
* Minor fix
* Fix docs of BeitForMaskedImageModeling
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Base test
* More test
* Fix mistake
* Add a docstring change
* Add doc ignore
* Add changes
* Add recursive dep search
* Add recursive dep search
* save
* Finalize test mapping
* Fix bug
* Print prettier
* Ignore comments and empty lines
* Make script runnable from anywhere
* Need dev install
* Like that
* Adapt
* Add as artifact
* Try on torch tests
* Fix yaml error
* Install GitPython
* Apply everywhere
* Be more defensive
* Revert to all tests if something is wrong
* Install GitPython
* Test if there are tests before launching.
* Fixes
* Fixes
* Fixes
* Fixes
* Bash syntax is horrible
* Be less stupid
* Try differently
* Typo
* Typo
* Typo
* Style
* Better name
* Escape quotes
* Ignore black unhelpful re-formatting
* Not a docstring
* Deal with inits in dependency map
* Run all tests once PR is merged.
* Add last job
* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Stronger dependencies gather
* Ignore empty lines too!
* Clean up
* Fix quality
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* [WIP] Model card defaults
* finetuned_from default value
* Add all mappings to the mapping file
* Be more defensive on finetuned_from arg
* Add default task tag
* Separate tags from tasks
* Edge case for dataset
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* [WIP] Add TFWav2Vec2Model
Work in progress for adding a tensorflow version of Wav2Vec2
* feedback changes
* small fix
* Test Feedback Round 1
* Add SpecAugment and CTC Loss
* correct spec augment mask creation
* docstring and correct copyright
* correct bugs
* remove bogus file
* finish tests correction
* del unnecessary layers
* Update src/transformers/models/wav2vec2/modeling_tf_wav2vec2.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* make style
* correct final bug
* Feedback Changes
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Squash all commits of modeling_detr_v7 branch into one
* Improve docs
* Fix tests
* Style
* Improve docs some more and fix most tests
* Fix slow tests of ViT, DeiT and DETR
* Improve replacement of batch norm
* Restructure timm backbone forward
* Make DetrForSegmentation support any timm backbone
* Fix name of output
* Address most comments by @LysandreJik
* Give better names for variables
* Conditional imports + timm in setup.py
* Address additional comments by @sgugger
* Make style, add require_timm and require_vision to tests
* Remove train_backbone attribute of DetrConfig, add methods to freeze/unfreeze backbone
* Add png files to fixtures
* Fix type hint
* Add timm to workflows
* Add `BatchNorm2d` to the weight initialization
* Fix retain_grad test
* Replace model checkpoints by Facebook namespace
* Fix name of checkpoint in test
* Add user-friendly message when scipy is not available
* Address most comments by @patrickvonplaten
* Remove return_intermediate_layers attribute of DetrConfig and simplify Joiner
* Better initialization
* Scipy is necessary to get sklearn metrics
* Rename TimmBackbone to DetrTimmConvEncoder and rename DetrJoiner to DetrConvModel
* Make style
* Improve docs and add 2 community notebooks
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
* Make quality scripts work when one backend is missing.
* Check env variable is properly set
* Add default
* With print statements
* Fix typo
* Set env variable
* Remove debug code
* Rebase with master
* Minor bug fix in docs
* Copy files from adding_luke_v2 and improve docs
* change the default value of use_entity_aware_attention to True
* remove word_hidden_states
* fix head models
* fix tests
* fix the conversion script
* add integration tests for the pretrained large model
* improve docstring
* Improve docs, make style
* fix _init_weights for pytorch 1.8
* improve docs
* fix tokenizer to construct entity sequence with [MASK] entity when entities=None
* Make fix-copies
* Make style & quality
* Bug fixes
* Add LukeTokenizer to init
* Address most comments by @patil-suraj and @LysandreJik
* rename _compute_extended_attention_mask to get_extended_attention_mask
* add comments to LukeSelfAttention
* fix the documentation of the tokenizer
* address comments by @patil-suraj, @LysandreJik, and @sgugger
* improve docs
* Make style, quality and fix-copies
* Improve docs
* fix docs
* add "entity_span_classification" task
* update example code for LukeForEntitySpanClassification
* improve docs
* improve docs
* improve the code example in luke.rst
* rename the classification layer in LukeForEntityClassification from typing to classifier
* add bias to the classifier in LukeForEntitySpanClassification
* update docs to use fine-tuned hub models in code examples of the head models
* update the example sentences
* Make style & quality
* Add require_torch to tokenizer tests
* Add require_torch to tokenizer tests
* Address comments by @sgugger and add community notebooks
* Make fix-copies
Co-authored-by: Ikuya Yamada <ikuya@ikuya.net>
* AutoFeatureExtractor
* Init and first tests
* Tests
* Damn you gitignore
* Quality
* Defensive test for when not all backends are here
* Use pattern for Speech2Text models
* Initial script
* Add script to properly sort imports in init.
* Add to the CI
* Update utils/custom_init_isort.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Separate scripts that change content from quality
* Move class_mapping_update to style_checks
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Apply black before checking copies
* Fix for class methods
* Deal with lonely brackets
* Remove debug and add forward changes
* Separate copies and fix test
* Add black as a test dependency
* Examples version update
* Refactor a bit
* All version updates
* Fixes
* README cleanup
* Post-release/patch
* Fixes
* More fixes
* Tests
* More fixes
* Moar fixes
* Make commands and update setup
* Replace spaces with weird tabs
* Fix test
* Style
* Tests run on Docker
Co-authored-by: Morgan <funtowiczmo@gmail.com>
* Comments from code review
* Reply to itself
* Dependencies
Co-authored-by: Morgan <funtowiczmo@gmail.com>
* Create modeling_tf_dpr.py
* Add TFDPR
* Add back TFPegasus, TFMarian, TFMBart, TFBlenderBot
The last commit accidentally deleted these 4 lines, so I restored them
* Add TFDPR
* Add TFDPR
* clean up some comments, add TF input-style doc string
* Add TFDPR
* Make return_dict=False as default
* Fix return_dict bug (in .from_pretrained)
* Add get_input_embeddings()
* Create test_modeling_tf_dpr.py
The current version already passes all 27 tests!
Please see the test run at:
https://colab.research.google.com/drive/1czS_m9zy5k-iSJbzA_DP1k1xAAC_sdkf?usp=sharing
* fix quality
* delete init weights
* run fix copies
* fix repo consistency
* del config_class, load_tf_weights
They should be 'pytorch only'
* add config_class back
After removing it, the tests failed, so only "use_tf_weights = None" is removed, per Lysandre's suggestion
* newline after .. note::
* import tf, np (Necessary for ModelIntegrationTest)
* slow_test from_pretrained with from_pt=True
At the moment we don't have TF weights (since we don't have an official TF model)
Previously, I did not run the slow tests, so I missed this bug
* Add simple TFDPRModelIntegrationTest
Note that this is just a test that TF and PyTorch give approximately the same output.
However, I could not test with the official DPR repo's output yet
* upload correct tf model
* remove position_ids as missing keys
* create modeling_tf_rag
* add tests for tf
* add tf tests
* revert wrong pt commit
* further refactor
* further refactor
* refactor
* Update modeling_tf_rag.py
- input_processing
- fix prepare_input_for_generation (mostly fix generate bug)
- bring back from_pretrained hack in order to test generate
* delete colab pieces of code
* Showcase greedy "generate"
Temporarily change from beam_search test to greedy_search test to showcase that TF and PT do get equivalent output.
* cosmetic update
* correct typos
* update
* push some progress
* make easy check
* fix rag save from pretrained
* Update src/transformers/modeling_tf_utils.py
* remove commented out lines
* delete unnecessary lines
* add simple test case for nq_checkpoint
Add nq_checkpoint test to show that current version without hack still fails
* temporarily put ugly hack back again
* Add TFRagSequenceForGeneration!!
* __init__.py , import TFRagSequenceForGeneration
* Add TFRagSequence tests!
* rag init.py - add TFRagSequenceForGeneration
* fix from_pretrained
* fix prepare_inputs_for_generation
* Beam search for RagToken!
* minor clean up
* add tf.cast in TFRagModel
* More tf.cast
* Add all remaining tests (still have issues)
* delete all T5 related
* make style
* fix load weight prefix
* fix bart
* fix return_dict for tf_rag
make all tests pass .. Hooray
* fix some tests
* fix code quality
* fix quality check
* finish tests tf rag
* add tf rag to docs
* remove TFT5 from docstring
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* remove TFT5 from docstring
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Delete outdated comments
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* improve doc strings
* add generative model classes
* fix adjust token logic
* refactor generate for TFRag
* using shape_list, not _get_shape
Co-authored-by: Julien Plu <plu.julien@gmail.com>
* axis=[1]->axis=1
* delete NEED_HELP comment
* improve readability
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* improve readability
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* improve readability
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Indicating model is in a developing state in docstrings
As suggested by Julien
* small last changes
* apply sylvains suggestions
* finish tf rag
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: patrickvonplaten <patrick@huggingface.co>
Co-authored-by: Julien Plu <plu.julien@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Add check-ops script
* Finish implementing check_tf_ops and start the test
* Make the test mandatory only for BERT
* Update tf_ops folder
* Remove useless classes
* Add the ONNX test for GPT2 and BART
* Add an onnxruntime slow test + better opset flexibility
* Fix test + apply style
* fix tests
* Switch min opset from 12 to 10
* Update src/transformers/file_utils.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Fix GPT2
* Remove extra shape_list usage
* Fix GPT2
* Address Morgan's comments
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Main init work
* Add version
* Change from absolute to relative imports
* Fix imports
* One more typo
* More typos
* Styling
* Make quality script pass
* Add necessary replace in template
* Fix typos
* Spaces are ignored in replace for some reason
* Forgot one model.
* Fixes for import
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
* Add documentation
* Styling
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
* first commit
* change phobert to phoBERT as per author in overview
* v3 and v4 both run on the same code, so there is no need to differentiate them
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* create model
* add integration
* save current state
* make integration tests pass
* add one more test
* add explanation to tests
* remove from bart
* add padding
* remove unnecessary test
* make all tests pass
* re-add cookie cutter tests
* finish PyTorch
* fix attention test
* Update tests/test_modeling_common.py
* revert change
* remove unused file
* add string to doc
* save intermediate
* make tf integration tests pass
* finish tf
* fix doc
* fix docs again
* add led to doctree
* add to auto tokenizer
* added tips for led
* make style
* apply jplus statements
* correct tf longformer
* apply lysandres suggestions
* apply sylvains suggestions
* Apply suggestions from code review
* remove make on the fly linear embedding
* start refactor
* big first refactor
* save intermediate
* save intermediate
* correct mask issue
* save tests
* refactor padding masks
* make all tests pass
* further refactor
* make pegasus test pass
* fix bool if
* fix leftover tests
* continue
* bart renaming
* delete torchscript test hack
* fix imports in tests
* correct shift
* fix docs and repo cons
* re-add fix for FSMT
* typo in test
* fix typo
* fix another typo
* continue
* hot fix 2 for tf
* small fixes
* refactor types linting
* continue
* finish refactor
* fix import in tests
* better bart names
* further refactor and add test
* delete hack
* apply Sylvain's and Lysandre's comments
* small perf improv
* further perf improv
* improv perf
* fix typo
* make style
* small perf improv
* Add badge w/ number of models on the hub
* try to appease @sgugger 😇
* not sure what this `c` was about [ci skip]
* Fix script and move stuff around
* Fix doc styling error
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
* Put models in subfolders
* Styling
* Fix imports in tests
* More fixes in test imports
* Sneaky hidden imports
* Fix imports in doc files
* More sneaky imports
* Finish fixing tests
* Fix examples
* Fix path for copies
* More fixes for examples
* Fix dummy files
* More fixes for example
* More model import fixes
* Is this why you're unhappy GitHub?
* Fix imports in convert command
* Create modeling_tf_dpr.py
* Add TFDPR
* Add back TFPegasus, TFMarian, TFMBart, TFBlenderBot
The last commit accidentally deleted these 4 lines, so I restored them
* Add TFDPR
* Add TFDPR
* clean up some comments, add TF input-style doc string
* Add TFDPR
* Make return_dict=False as default
* Fix return_dict bug (in .from_pretrained)
* Add get_input_embeddings()
* Create test_modeling_tf_dpr.py
The current version already passes all 27 tests!
Please see the test run at:
https://colab.research.google.com/drive/1czS_m9zy5k-iSJbzA_DP1k1xAAC_sdkf?usp=sharing
* fix quality
* delete init weights
* run fix copies
* fix repo consistency
* del config_class, load_tf_weights
They should be 'pytorch only'
* add config_class back
after removing it, the tests failed ... so in total only "use_tf_weights = None" is removed, on Lysandre's suggestion
* newline after .. note::
* import tf, np (Necessary for ModelIntegrationTest)
* slow_test from_pretrained with from_pt=True
At the moment we don't have TF weights (since we don't have an official TF model)
Previously, I did not run the slow tests, so I missed this bug
* Add simple TFDPRModelIntegrationTest
Note that this is just a test that TF and PyTorch give approximately the same output.
However, I could not test with the official DPR repo's output yet
* upload correct tf model
* remove position_ids as missing keys
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: patrickvonplaten <patrick@huggingface.co>
* fix typo
* rm use_cdn & references, and implement new hf_bucket_url
* I'm pretty sure we don't need to `read` this file
* same here
* [BIG] file_utils.networking: do not gobble up errors anymore
* Fix CI 😇
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Tiny doc tweak
* Add doc + pass kwarg everywhere
* Add more tests and explain
cc @sshleifer let me know if better
Co-Authored-By: Sam Shleifer <sshleifer@gmail.com>
* Also implement revision in pipelines
In the case where we're passing a task name or a string model identifier
* Fix CI 😇
* Fix CI
* [hf_api] new methods + command line implem
* make style
* Final endpoints post-migration
* Fix post-migration
* Py3.6 compat
cc @stefan-it
Thank you @stas00
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
* Important files
* Styling them all
* Revert "Styling them all"
This reverts commit 7d029395fd.
* Styling them for realsies
* Fix syntax error
* Fix benchmark_utils
* More fixes
* Fix modeling auto and script
* Remove new line
* Fixes
* More fixes
* Fix more files
* Style
* Add FSMT
* More fixes
* More fixes
* More fixes
* More fixes
* Fixes
* More fixes
* More fixes
* Last fixes
* Make sphinx happy
* splitting fast and slow tokenizers [WIP]
* [WIP] splitting sentencepiece and tokenizers dependencies
* update dummy objects
* add name_or_path to models and tokenizers
* prefix added to file names
* prefix
* styling + quality
* splitting all the tokenizer files - sorting sentencepiece based ones
* update tokenizer version up to 0.9.0
* remove hard dependency on sentencepiece 🎉
* and removed hard dependency on tokenizers 🎉
* update conversion script
* update missing models
* fixing tests
* move test_tokenization_fast to main tokenization tests - fix bugs
* bump up tokenizers
* fix bert_generation
* update and fix several tokenizers
* keep sentencepiece in deps for now
* fix funnel and deberta tests
* fix fsmt
* fix marian tests
* fix layoutlm
* fix squeezebert and gpt2
* fix T5 tokenization
* fix xlnet tests
* style
* fix mbart
* bump up tokenizers to 0.9.2
* fix model tests
* fix tf models
* fix seq2seq examples
* fix tests without sentencepiece
* fix slow => fast conversion without sentencepiece
* update auto and bert generation tests
* fix mbart tests
* fix auto and common test without tokenizers
* fix tests without tokenizers
* clean up tests, lighten up when tokenizers + sentencepiece are both off
* style quality and tests fixing
* add sentencepiece to doc/examples reqs
* leave sentencepiece on for now
* style quality split herbert and fix pegasus
* WIP Herbert fast
* add sample_text_no_unicode and fix herbert tokenization
* skip FSMT example test for now
* fix style
* fix fsmt in example tests
* update following Lysandre and Sylvain's comments
* Update src/transformers/testing_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/testing_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/tokenization_utils_base.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/tokenization_utils_base.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
In `tests/test_utils_check_copies.py` I was intermittently getting:
```
utils/check_copies.py:52
/mnt/nvme1/code/transformers-comet/utils/check_copies.py:52: DeprecationWarning: invalid escape sequence \s
while line_index < len(lines) and re.search(f"^{indent}(class|def)\s+{name}", lines[line_index]) is None:
```
So this should fix it.
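The warning appears because `\s` sits inside a non-raw f-string, where Python reads it as an invalid string escape. A minimal sketch of one way to silence it, assuming the fix is to make the pattern a raw f-string (the helper name `find_definition` and the sample lines are hypothetical and not taken from `utils/check_copies.py`):

```python
import re


def find_definition(lines, name, indent=""):
    """Return the index of the first `class`/`def` line for `name`, or None.

    Hypothetical helper for illustration; only the regex mirrors the line
    quoted in the warning above.
    """
    # The rf prefix keeps `\s` as a regex token instead of a string escape,
    # while still interpolating {indent} and {name}.
    pattern = rf"^{indent}(class|def)\s+{name}"
    for index, line in enumerate(lines):
        if re.search(pattern, line) is not None:
            return index
    return None


sample = ["import os", "class BertModel:", "    def forward(self):"]
print(find_definition(sample, "BertModel"))
print(find_definition(sample, "forward", indent="    "))
```

Doubling the backslash (`\\s`) in a plain f-string would work as well; the raw prefix just keeps the regex readable.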
* PoC on RAG
* Format class name/obj name
* Better name in message
* PoC on one TF model
* Add PyTorch and TF dummy objects + script
* Treat scikit-learn
* Bad copy pastes
* Typo
* Copy code from Bert to Roberta and add safeguard script
* Fix docstring
* Comment code
* Formatting
* Update src/transformers/modeling_roberta.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Add test and fix bugs
* Fix style and make new command
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Initial model
* Fix upsampling
* Add special cls token id and test
* Formatting
* Test and first FunnelTokenizerFast
* Common tests
* Fix the check_repo script and document Funnel
* Doc fixes
* Add all models
* Write doc
* Fix test
* Fix copyright
* Forgot some layers can be repeated
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/modeling_funnel.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Address review comments
* Update src/transformers/modeling_funnel.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Address review comments
* Update src/transformers/modeling_funnel.py
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
* Slow integration test
* Make small integration test
* Formatting
* Add checkpoint and separate classification head
* Formatting
* Expand list, fix link and add in pretrained models
* Styling
* Add the model in all summaries
* Typo fixes
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
* Add a script to check all models are tested and documented
* Apply suggestions from code review
Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
* Address comments
Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
This change is mostly autogenerated with:
$ python -m autoflake --in-place --recursive examples templates transformers utils hubconf.py setup.py
I made minor changes in the generated diff.
This is the result of:
$ black --line-length 119 examples templates transformers utils hubconf.py setup.py
There are a lot of fairly long lines in the project. As a consequence, I'm
picking the longest widely accepted line length, 119 characters.
This is also Thomas' preference, because it allows for explicit variable
names, to make the code easier to understand.
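
To get a feel for how many lines a given limit would affect before reformatting, a small stand-alone check can help. This is only a sketch: the function name `overlong_lines` and the sample text are hypothetical, and it is not part of black or of this repository.

```python
def overlong_lines(source, limit=119):
    """Yield (line_number, length) for each line longer than `limit`."""
    for number, line in enumerate(source.splitlines(), start=1):
        if len(line) > limit:
            yield number, len(line)


# One short line followed by one deliberately long line.
example = "x = 1\n" + "y = " + "1 + " * 40 + "1\n"
print(list(overlong_lines(example)))
```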