* Improve docs of DeiT and ViT, add community notebook
* Add gitignore for test_samples
* Add notebook with Trainer
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Added Big Bird Fast Tokenizer initial file
* style fixes
* flake fixes
* Added big bird fast tokenizer to init files
* Added big bird fast to Auto tokenization
* fix styles
* minor quality fixes
* Added initial test code
* Fix SpmConverter when precompiled_charsmap doesn't exist
* fixed post processor
* minor style fix
* minor fix input names
* Actually fix identity normalization
* style
* Added token type ids to fast tokenizer
* style
* flake fix
* fix copies
Co-authored-by: Anthony MOI <m.anthony.moi@gmail.com>
* Add the ImageClassificationPipeline
* Code review
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
* Have `load_image` at the module level
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
* add electra model to flax
* Remove Electra Next Sentence Prediction model added by mistake
* fix parameter sharing and loosen equality threshold
* fix styling issues
* add mistaken removen imports
* fix electra table
* Add FlaxElectra to automodels and fixe docs
* fix issues pointed out the PR
* fix flax electra to comply with latest changes
* remove stale class
* add copied from
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* add flax roberta
* make style
* correct initialiazation
* modify model to save weights
* fix copied from
* fix copied from
* correct some more code
* add more roberta models
* Apply suggestions from code review
* merge from master
* finish
* finish docs
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
* Rebase with master
* Minor bug fix in docs
* Copy files from adding_luke_v2 and improve docs
* change the default value of use_entity_aware_attention to True
* remove word_hidden_states
* fix head models
* fix tests
* fix the conversion script
* add integration tests for the pretrained large model
* improve docstring
* Improve docs, make style
* fix _init_weights for pytorch 1.8
* improve docs
* fix tokenizer to construct entity sequence with [MASK] entity when entities=None
* Make fix-copies
* Make style & quality
* Bug fixes
* Add LukeTokenizer to init
* Address most comments by @patil-suraj and @LysandreJik
* rename _compute_extended_attention_mask to get_extended_attention_mask
* add comments to LukeSelfAttention
* fix the documentation of the tokenizer
* address comments by @patil-suraj, @LysandreJik, and @sgugger
* improve docs
* Make style, quality and fix-copies
* Improve docs
* fix docs
* add "entity_span_classification" task
* update example code for LukeForEntitySpanClassification
* improve docs
* improve docs
* improve the code example in luke.rst
* rename the classification layer in LukeForEntityClassification from typing to classifier
* add bias to the classifier in LukeForEntitySpanClassification
* update docs to use fine-tuned hub models in code examples of the head models
* update the example sentences
* Make style & quality
* Add require_torch to tokenizer tests
* Add require_torch to tokenizer tests
* Address comments by @sgugger and add community notebooks
* Make fix-copies
Co-authored-by: Ikuya Yamada <ikuya@ikuya.net>
* prep for deepspeed==0.3.16
* new version
* too soon
* support and test fp32 mode
* troubleshooting doc start
* workaround no longer needed
* add fp32 doc
* style
* cleanup, add tf32 note
* clarify
* release was made
* Adding `AutomaticSpeechRecognitionPipeline`.
- Because we added everything to enable this pipeline, we probably
should add it to `transformers`.
- This PR tries to limit the scope and focuses only on the pipeline part
(what should go in, and out).
- The tests are very specific for S2T and Wav2vec2 to make sure both
architectures are supported by the pipeline. We don't use the mixin for
tests right now, because that requires more work in the `pipeline`
function (will be done in a follow up PR).
- Unsure about the "helper" function `ffmpeg_read`. It makes a lot of
sense from a user perspective, it does not add any additional
dependencies (as in hard dependency, because users can always use their
own load mechanism). Meanwhile, it feels slightly clunky to have so much
optional preprocessing.
- The pipeline is not done to support streaming audio right now.
Future work:
- Add `automatic-speech-recognition` as a `task`. And add the
FeatureExtractor.from_pretrained within `pipeline` function.
- Add small models within tests
- Add the Mixin to tests.
- Make the logic between ForCTC vs ForConditionalGeneration better.
* Update tests/test_pipelines_automatic_speech_recognition.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Adding docs + main import + type checking + LICENSE.
* Doc style !.
* Fixing TYPE_HINT.
* Specifying waveform shape in the docs.
* Adding asserts + specify in the documentation the shape of the input
np.ndarray.
* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Adding require to tests + move the `feature_extractor` doc.
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Initial support for upload to hub
* push -> upload
* Fixes + examples
* Fix torchhub test
* Torchhub test I hate you
* push_model_to_hub -> push_to_hub
* Apply mixin to other pretrained models
* Remove ABC inheritance
* Add tests
* Typo
* Run tests
* Install git-lfs
* Change approach
* Add push_to_hub to all
* Staging test suite
* Typo
* Maybe like this?
* More deps
* Cache
* Adapt name
* Quality
* MOAR tests
* Put it in testing_utils
* Docs + torchhub last hope
* Styling
* Wrong method
* Typos
* Update src/transformers/file_utils.py
Co-authored-by: Julien Chaumond <julien@huggingface.co>
* Address review comments
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Julien Chaumond <julien@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Base move
* Examples reorganization
* Update references
* Put back test data
* Move conftest
* More fixes
* Move test data to test fixtures
* Update path
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Address review comments and clean
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Start writing BERT-Japanese doc
* Fix typo, Update toctree
* Modify model file to use comment for document, Add examples
* Clean bert_japanese by make style
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Split a big code block into two
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Add prefix >>> to all lines in code blocks
* Clean bert_japanese by make fixup
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* First draft of deit
* More improvements
* Remove DeiTTokenizerFast from init
* Conversion script works
* Add DeiT to ViT conversion script
* Add tests, add head model, add support for deit in vit conversion script
* Update model checkpoint names
* Update image_mean and image_std, set resample to bicubic
* Improve docs
* Docs improvements
* Add DeiTForImageClassificationWithTeacher to init
* Address comments by @sgugger
* Improve feature extractors
* Make fix-copies
* Minor fixes
* Address comments by @patil-suraj
* All models uploaded
* Fix tests
* Remove labels argument from DeiTForImageClassificationWithTeacher
* Fix-copies, style and quality
* Fix tests
* Fix typo
* Multiple docs improvements
* More docs fixes
* Added documentation for data collator.
* Update docs/source/data_collator.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Added documentation for data collator.
* Added documentation for the data collator.
* Merge branch 'doc_DataCollator' of C:\Users\mahii\PycharmProjects\transformers with conflicts.
* Update documentation for the data collator.
* Update documentation for the data collator.
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Amna <A.A.Ahmad@student.tudelft.nl>
* Add a special tokenizer for CPM model
* make style
* fix
* Add docs
* styles
* cpm doc
* fix ci
* fix the overview
* add test
* make style
* typo
* Custom tokenizer flag
* Add REAMDE.md
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
* make fairscale and deepspeed setup extras
* fix default
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* no reason not to ask for the good version
* update the CIs
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* AutoFeatureExtractor
* Init and first tests
* Tests
* Damn you gitignore
* Quality
* Defensive test for when not all backends are here
* Use pattern for Speech2Text models
* Documentation about loading a fast tokenizer within Transformers
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Squash all commits into one
* Update ViTFeatureExtractor to use image_utils instead of torchvision
* Remove torchvision and add Pillow
* Small docs improvement
* Address most comments by @sgugger
* Fix tests
* Clean up conversion script
* Pooler first draft
* Fix quality
* Improve conversion script
* Make style and quality
* Make fix-copies
* Minor docs improvements
* Should use fix-copies instead of manual handling
* Revert "Should use fix-copies instead of manual handling"
This reverts commit fd4e591bce.
* Place ViT in alphabetical order
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Added embeddings layer
* Added layoutlm layers, main model, maskedlm and token classification classes
* Added model classes to tf auto models
* Added model to PT to TF conversion script
* Added model to doc README
* Added tests
* Removed unused imports
* Added layoutlm model, test, and doc for sequence classification, and fix imports in __init__.py
* Made tests pass!
* Fixed typos in imports and docs
* Fixed a typo in embeddings layer
* Removed imports
* Fixed formatting issues, imports, tests
* Added layoutlm layers, main model, maskedlm and token classification classes
* Added model classes to tf auto models
* Added model to PT to TF conversion script
* Removed unused imports
* Added layoutlm model, test, and doc for sequence classification, and fix imports in __init__.py
* Made tests pass!
* Fixed typos in imports and docs
* Removed imports
* Fixed small formatting issues
* Removed duplicates import from main __init__.py
* Chnaged deafult arg to true for adding pooling layer to tf layoutlm
* Fixed formatting issues
* Style
* Added copied from to classes copied from bert
* Fixed doc strings examples to work with layoutlm inputs
* Removed PyTorch reference in doc strings example
* Added integration tests
* Cleaned up initialization file
* Updated model checkpoint identifiers
* Fixed imports
Co-authored-by: Amir Tahmasbi <amir@ehsai.ca>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
* [doc] [testing] extend -k section
This PR adds more examples on using `pytest -k` - I always forget that I want to use `-k A OR B` when I want several tests - I keep trying AND and it doesn't match any.
* style
* pass hf optimizer and scheduler to deepspeed if not specified in ds config
* pass hf optimizer and scheduler to deepspeed if not specified in ds config
* update
* make init_deepspeed support config dict
* fix docstring formatting
* clean up trainer's comments
* add new tests
* fix type
* composit argparse doesn't work
* style
* add a new test, rename others
* document new functionality
* complete tests, add docs
* style
* correct level
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* add new methods to the doc
* must tell DS we are using a non-native optimizer
* add protection against cpu_offload + HF optimizer combo
* fix the cli overrides
* sync docs + tests
* restore AdamW
* better docs
* need new version
* no longer needed
* remove outdate information
* refactor duplicated code
Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* [WIP] Adding new parameter to `generate`: `max_time`.
Generation by tokens number is sometimes a bit clunky because we don't
know how many tokens are good enough or even how many tokens are in
the payload (for pipelines users for instance). This leads to hard
to understand behavior.
This PR proposes a new argument `max_time` which is a float of seconds
for the allowed time for `generate` to run on.
Ideally combinations of `max_tokens=None`, `max_time=2` could be used to
generate as many tokens as possible within time budget.
NB: Another possible approach consists of passing a callback to `generate`
putting the caller in charge of the actual decision of when to stop
generating tokens. It opens the door to 'which args should we pass'
to this callback. It's hard to imagine other use-cases for this
early stopping behavior than time (that are not already covered by
parameters of generate)
* Revamp with StoppingCriteria
* Removing deprecated mentions.
* Forgot arguments to stopping criteria.
* Readding max_length it's not just used as a stopping criteria.
* Default value for `stopping_criteria`.
* Address @patrickvonplaten comments.
- More docstrings
- Actual doc
- Include in global namespace
- Remove TF work.
* Put back `max_length` (deprecation different PR).
* Doc quality.
* Fixing old behavior without `stopping_criteria` but with `max_length`.
Making sure we don't break that in the future.
* Adding more tests for possible inconsistencies between
`max_length` and `stopping_criteria`.
* Fixing the torch imports.
* Create modeling_tf_dpr.py
* Add TFDPR
* Add back TFPegasus, TFMarian, TFMBart, TFBlenderBot
last commit accidentally deleted these 4 lines, so I recover them back
* Add TFDPR
* Add TFDPR
* clean up some comments, add TF input-style doc string
* Add TFDPR
* Make return_dict=False as default
* Fix return_dict bug (in .from_pretrained)
* Add get_input_embeddings()
* Create test_modeling_tf_dpr.py
The current version is already passed all 27 tests!
Please see the test run at :
https://colab.research.google.com/drive/1czS_m9zy5k-iSJbzA_DP1k1xAAC_sdkf?usp=sharing
* fix quality
* delete init weights
* run fix copies
* fix repo consis
* del config_class, load_tf_weights
They shoud be 'pytorch only'
* add config_class back
after removing it, test failed ... so totally only removing "use_tf_weights = None" on Lysandre suggestion
* newline after .. note::
* import tf, np (Necessary for ModelIntegrationTest)
* slow_test from_pretrained with from_pt=True
At the moment we don't have TF weights (since we don't have official official TF model)
Previously, I did not run slow test, so I missed this bug
* Add simple TFDPRModelIntegrationTest
Note that this is just a test that TF and Pytorch gives approx. the same output.
However, I could not test with the official DPR repo's output yet
* upload correct tf model
* remove position_ids as missing keys
* create modeling_tf_rag
* add tests for tf
* add tf tests
* revert wrong pt commit
* further refactor
* further refactor
* refactor
* Update modeling_tf_rag.py
- input_processing
- fix prepare_input_for_generation (mostly fix generate bug)
- bring back from_pretrained hack in order to test generate
* delete colab pieces of code
* Show case of greedy "generate"
Temporarily change from beam_search test to greedy_search test to show case that TF and PT do get equivalent output.
* cosmetic update
* correct typos
* update
* push some progress
* make easy check
* fix rag save from pretrained
* Update src/transformers/modeling_tf_utils.py
* remove commented out lines
* delete unnecessary lines
* add simple test case for nq_checkpoint
Add nq_checkpoint test to show that current version without hack still fails
* temporarily put ugly hack back again
* Add TFRagSequenceForGeneration!!
* __init__.py , import TFRagSequenceForGeneration
* Add TFRagSequence tests!
* rag init.py - add TFRagSequenceForGeneration
* fix from_pretrained
* fix prepare_inputs_for_generation
* Beam search for RagToken!
* minor clean up
* add tf.cast in TFRagModel
* More tf.cast
* Add all remaining tests (still have issues)
* delete all T5 related
* make style
* fix load weight prefix
* fix bart
* fix return_dict for tf_rag
make all tests pass .. Hooray
* fix some tests
* fix code quality
* fix qualtiy check
* finish tests tf rag
* add tf rag to docs
* remove TFT5 from docstring
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* remove TFT5 from docstring
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Delete outdated comments
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* improve doc strings
* add generative model classes
* fix adjust token logic
* refactor generate for TFRag
* using shape_list, not _get_shape
Co-authored-by: Julien Plu <plu.julien@gmail.com>
* axis=[1]->axis=1
* delete NEED_HELP comment
* improve readability
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* improve readability
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* improve readability
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Indicating model is in a developing state in docstrings
As suggested by Julien
* small last changes
* apply sylvains suggestions
* finish tf rag
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: patrickvonplaten <patrick@huggingface.co>
Co-authored-by: Julien Plu <plu.julien@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* offline mode start
* add specific values
* fix fallback
* add test
* better values check and range
* test that actually works
* document the offline mode
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* more strict check
* cleaner test
* pt-only test
* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* feat: Adds three definitions to glossary from @cronoik
Needed a definition for transformer which in turn needed 2 more definitions
To do with issue https://github.com/huggingface/transformers/issues/9078
* fix: Adjusts definition of neural network to make it easier to read
* Introduce save_strategy training argument
* deprecate EvaluationStrategy
* collapse EvaluationStrategy and LoggingStrategy into a single
IntervalStrategy enum
* modify tests to use modified enum
* Fix None in add_token_positions - issue #10210
Fix None in add_token_positions related to the issue #10210
* add_token_positions fix None values in end_positions vector
add_token_positions fix None in end_positions vector as proposed by @joeddav
* expand install instructions
* fix
* white space
* rewrite as discussed in the PR
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* change the wording to encourage issue report
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Add LayoutLMForSequenceClassification and integration tests
Improve docs
Add LayoutLM notebook to list of community notebooks
* Make style & quality
* Address comments by @sgugger, @patrickvonplaten and @LysandreJik
* Fix rebase with master
* Reformat in one line
* Improve code examples as requested by @patrickvonplaten
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Define new output dataclasses for greedy generation
* Add output_[...] flags in greedy generation methods
Added output_attentions, output_hidden_states, output_scores flags in
generate and greedy_search methods in GenerationMixin.
* [WIP] Implement logic and tests for output flags in generation
* Update GreedySearchOutput classes & docstring
* Implement greedy search output accumulation logic
Update greedy_search unittests
Fix generate method return value docstring
Properly init flags with the default config
* Update configuration to add output_scores flag
* Fix test_generation_utils
Sort imports and fix isinstance tests for GreedySearchOutputs
* Fix typo in generation_utils
* Add return_dict_in_generate for backwards compatibility
* Add return_dict_in_generate flag in config
* Fix tyPo in configuration
* Fix handling of attentions and hidden_states flags
* Make style & quality
* first attempt attentions
* some corrections
* improve tests
* special models requires special test
* disable xlm test for now
* clean tests
* fix for tf
* isort
* Add output dataclasses for other generation methods
* Add logic to return dict in sample generation
* Complete test for sample generation
- Pass output_attentions and output_hidden_states flags to encoder in
encoder-decoder models
- Fix import satements order in test_generation_utils file
* Add logic to return dict in sample generation
- Refactor tests to avoid using self.assertTrue, which provides
scarce information when the test fails
- Add tests for the three beam_search methods: vanilla, sample and
grouped
* Style doc
* Fix copy-paste error in generation tests
* Rename logits to scores and refactor
* Refactor group_beam_search for consistency
* make style
* add sequences_scores
* fix all tests
* add docs
* fix beam search finalize test
* correct docstring
* clean some files
* Made suggested changes to the documentation
* Style doc ?
* Style doc using the Python util
* Update src/transformers/generation_utils.py
* fix empty lines
* fix all test
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* first commit
* change phobert to phoBERT as per author in overview
* v3 and v4 both runs on same code hence there is no need to differentiate them
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* create model
* add integration
* save current state
* make integration tests pass
* add one more test
* add explanation to tests
* remove from bart
* add padding
* remove unnecessary test
* make all tests pass
* re-add cookie cutter tests
* finish PyTorch
* fix attention test
* Update tests/test_modeling_common.py
* revert change
* remove unused file
* add string to doc
* save intermediate
* make tf integration tests pass
* finish tf
* fix doc
* fix docs again
* add led to doctree
* add to auto tokenizer
* added tips for led
* make style
* apply jplus statements
* correct tf longformer
* apply lysandres suggestions
* apply sylvains suggestions
* Apply suggestions from code review
* Use extlinks to point hyperlink with the version of code
* Point to version on release and master until then
* Apply style
* Correct links
* Add missing backtick
* Simple missing backtick after all.
Co-authored-by: Raghavendra Sugeeth P S <raghav-5305@raghav-5305.csez.zohocorpin.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
* add past_key_values
* add use_cache option
* make mask before cutting ids
* adjust position_ids according to past_key_values
* flatten past_key_values
* fix positional embeds
* fix _reorder_cache
* set use_cache to false when not decoder, fix attention mask init
* add test for caching
* add past_key_values for Roberta
* fix position embeds
* add caching test for roberta
* add doc
* make style
* doc, fix attention mask, test
* small fixes
* adress patrick's comments
* input_ids shouldn't start with pad token
* use_cache only when decoder
* make consistent with bert
* make copies consistent
* add use_cache to encoder
* add past_key_values to tapas attention
* apply suggestions from code review
* make coppies consistent
* add attn mask in tests
* remove copied from longformer
* apply suggestions from code review
* fix bart test
* nit
* simplify model outputs
* fix doc
* fix output ordering
* Add label smoothing in Trainer
* Add options for scheduler and Adafactor in Trainer
* Put Seq2SeqTrainer in the main lib
* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Address review comments and adapt scripts
* Documentation
* Move test not using script to tests folder
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* First commit: adding all files from tapas_v3
* Fix multiple bugs including soft dependency and new structure of the library
* Improve testing by adding torch_device to inputs and adding dependency on scatter
* Use Python 3 inheritance rather than Python 2
* First draft model cards of base sized models
* Remove model cards as they are already on the hub
* Fix multiple bugs with integration tests
* All model integration tests pass
* Remove print statement
* Add test for convert_logits_to_predictions method of TapasTokenizer
* Incorporate suggestions by Google authors
* Fix remaining tests
* Change position embeddings sizes to 512 instead of 1024
* Comment out positional embedding sizes
* Update PRETRAINED_VOCAB_FILES_MAP and PRETRAINED_POSITIONAL_EMBEDDINGS_SIZES
* Added more model names
* Fix truncation when no max length is specified
* Disable torchscript test
* Make style & make quality
* Quality
* Address CI needs
* Test the Masked LM model
* Fix the masked LM model
* Truncate when overflowing
* More much needed docs improvements
* Fix some URLs
* Some more docs improvements
* Test PyTorch scatter
* Set to slow + minify
* Calm flake8 down
* First commit: adding all files from tapas_v3
* Fix multiple bugs including soft dependency and new structure of the library
* Improve testing by adding torch_device to inputs and adding dependency on scatter
* Use Python 3 inheritance rather than Python 2
* First draft model cards of base sized models
* Remove model cards as they are already on the hub
* Fix multiple bugs with integration tests
* All model integration tests pass
* Remove print statement
* Add test for convert_logits_to_predictions method of TapasTokenizer
* Incorporate suggestions by Google authors
* Fix remaining tests
* Change position embeddings sizes to 512 instead of 1024
* Comment out positional embedding sizes
* Update PRETRAINED_VOCAB_FILES_MAP and PRETRAINED_POSITIONAL_EMBEDDINGS_SIZES
* Added more model names
* Fix truncation when no max length is specified
* Disable torchscript test
* Make style & make quality
* Quality
* Address CI needs
* Test the Masked LM model
* Fix the masked LM model
* Truncate when overflowing
* More much needed docs improvements
* Fix some URLs
* Some more docs improvements
* Add add_pooling_layer argument to TapasModel
Fix comments by @sgugger and @patrickvonplaten
* Fix issue in docs + fix style and quality
* Clean up conversion script and add task parameter to TapasConfig
* Revert the task parameter of TapasConfig
Some minor fixes
* Improve conversion script and add test for absolute position embeddings
* Improve conversion script and add test for absolute position embeddings
* Fix bug with reset_position_index_per_cell arg of the conversion cli
* Add notebooks to the examples directory and fix style and quality
* Apply suggestions from code review
* Move from `nielsr/` to `google/` namespace
* Apply Sylvain's comments
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
Co-authored-by: Rogge Niels <niels.rogge@howest.be>
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: sgugger <sylvain.gugger@gmail.com>