* Add TapexTokenizer
* Improve docstrings and provide option to provide answer
* Remove option for pretokenized inputs
* Add TAPEX to README
* Fix copies
* Remove option for pretokenized inputs
* Initial commit: add tapex fine-tuning examples on both table-based question answering and table-based fact verification.
* - Draft a README file for running the script and introducing some background.
- Remove unused code lines in tabfact script.
- Disable the deafult `pad_to_max_length` option which is memory-consuming.
* * Support `as_target_tokenizer` function for TapexTokenizer.
* Fix the do_lower_case behaviour of TapexTokenizer.
* Add unit tests for target scenarios and cased/uncased scenarios for both source and target.
* * Replace the label BartTokenizer with TapexTokenizer's as_target_tokenizer function.
* Fix typos in tapex example README.
* * fix the evaluation script - remove the property `task_name`
* * Make the label space more clear for tabfact tasks
* * Using a new fine-tuning script for tapex-base on tabfact.
* * Remove the lowercase code outside the tokenizer - we use the tokenizer to control whether do_lower_case
* Guarantee the hyper-parameter can be run without out-of-memory on 16GB card and report the new reproduced number on wikisql
* * Remove the default tokenizer_name option.
* Provide evaluation command.
* * Support for WikiTableQuestion dataset.
* Fix a typo in README.
* * Fix the datasets's key name in WikiTableQuestions
* Run make fixup and move test to folder
* Fix quality
* Apply suggestions from code review
* Apply suggestions from code review
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* Apply suggestions from code review
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Apply some more suggestions from code review
* Improve docstrings
* Overwrite failing test
* Improve comment in example scripts
* Fix rebase
* Add TAPEX to Auto mapping
* Add TAPEX to auto config mappings
* Put TAPEX higher than BART in auto mapping
* Add TAPEX to doc tests
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>
Co-authored-by: SivilTaram <qianlxc@outlook.com>
Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
* 📝 add image/vision classification and asr
* 🖍 minor formatting fixes
* Fixed a typo in legacy seq2seq_trainer.py (#16531)
* Add ONNX export for BeiT (#16498)
* Add beit onnx conversion support
* Updated docs
* Added cross reference to ViT ONNX config
* call on_train_end when trial is pruned (#16536)
* Type hints added (#16529)
* Fix Bart type hints (#16297)
* Add type hints to PLBart PyTorch
* Remove pending merge conflicts
* Fix PLBart Type Hints
* Add changes from review
* Add VisualBert type hints (#16544)
* Adding missing type hints for mBART model (PyTorch) (#16429)
* added type hints for mbart tensorflow tf implementation
* Adding missing type hints for mBART model
Tensorflow Implementation model added with missing type hints
* Missing Type hints - correction
For TF model
* Code fixup using make quality tests
* Hint types - typo error
* make fix-copies and make fixup
* type hints
* updated files
* type hints update
* making dependent modesls coherent
Co-authored-by: matt <rocketknight1@gmail.com>
* Remove MBart subclass of XLMRoberta in tokenzier docs (#16546)
* Remove MBart subclass of XLMRoberta in tokenzier
* Fix style
* Copy docs from MBart50 tokenizer
* Use random_attention_mask for TF tests (#16517)
* use random_attention_mask for TF tests
* Fix for TFCLIP test (for now).
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* Improve code example (#16450)
Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home>
* Pin tokenizers version <0.13 (#16539)
* Pin tokenizers version <0.13
* Style
* Add code samples for TF speech models (#16494)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* [FlaxSpeechEncoderDecoder] Fix dtype bug (#16581)
* [FlaxSpeechEncoderDecoder] Fix dtype bug
* more fixes
* Making the impossible to connect error actually report the right URL. (#16446)
* Fix flax import in __init__.py: modeling_xglm -> modeling_flax_xglm (#16556)
* Add utility to find model labels (#16526)
* Add utility to find model labels
* Use it in the Trainer
* Update src/transformers/utils/generic.py
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
* Quality
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
* Enable doc in Spanish (#16518)
* Reorganize doc for multilingual support
* Fix style
* Style
* Toc trees
* Adapt templates
* Add use_auth to load_datasets for private datasets to PT and TF examples (#16521)
* fix formatting and remove use_auth
* Add use_auth_token to Flax examples
* add a test checking the format of `convert_tokens_to_string`'s output (#16540)
* add new tests
* add comment to overridden tests
* TF: Finalize `unpack_inputs`-related changes (#16499)
* Add unpack_inputs to remaining models
* removed kwargs to `call()` in TF models
* fix TF T5 tests
* [SpeechEncoderDecoderModel] Correct Encoder Last Hidden State Output (#16586)
* initialize the default rank set on TrainerState (#16530)
* initialize the default rank set on TrainerState
* fix style
* Trigger doc build
* Fix CI: test_inference_for_pretraining in ViTMAEModelTest (#16591)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* add a template to add missing tokenization test (#16553)
* add a template to add missing tokenization test
* add cookiecutter setting
* improve doc
* Update templates/adding_a_missing_tokenization_test/README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* made _load_pretrained_model_low_mem static + bug fix (#16548)
* handle torch_dtype in low cpu mem usage (#16580)
* [Doctests] Correct filenaming (#16599)
* [Doctests] Correct filenaming
* improve quicktour
* make style
* Adding new train_step logic to make things less confusing for users (#15994)
* Adding new train_step logic to make things less confusing for users
* DO NOT ASK WHY WE NEED THAT SUBCLASS
* Metrics now working, at least for single-output models with type annotations!
* Updates and TODOs for the new train_step
* Make fixup
* Temporary test workaround until T5 has types
* Temporary test workaround until T5 has types
* I think this actually works! Needs a lot of tests though
* MAke style/quality
* Revert changes to T5 tests
* Deleting the aforementioned unmentionable subclass
* Deleting the aforementioned unmentionable subclass
* Adding a Keras API test
* Style fixes
* Removing unneeded TODO and comments
* Update test_step too
* Stop trying to compute metrics with the dummy_loss, patch up test
* Make style
* make fixup
* Docstring cleanup
* make fixup
* make fixup
* Stop expanding 1D input tensors when using dummy loss
* Adjust T5 test given the new compile()
* make fixup
* Skipping test for convnext
* Removing old T5-specific Keras test now that we have a common one
* make fixup
* make fixup
* Only skip convnext test on CPU
* Update src/transformers/modeling_tf_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/modeling_tf_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Avoiding TF import issues
* make fixup
* Update compile() to support TF 2.3
* Skipping model.fit() on template classes for now
* Skipping model.fit() on template class tests for now
* Replace ad-hoc solution with find_labels
* make fixup
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Adding missing type hints for BigBird model (#16555)
* added type hints for mbart tensorflow tf implementation
* Adding missing type hints for mBART model
Tensorflow Implementation model added with missing type hints
* Missing Type hints - correction
For TF model
* Code fixup using make quality tests
* Hint types - typo error
* make fix-copies and make fixup
* type hints
* updated files
* type hints update
* making dependent modesls coherent
* Type hints for BigBird
* removing typos
Co-authored-by: matt <rocketknight1@gmail.com>
* [deepspeed] fix typo, adjust config name (#16597)
* 🖍 apply feedback
Co-authored-by: Cathy <815244047@qq.com>
Co-authored-by: Jim Rohrer <jrohrer1@gmail.com>
Co-authored-by: Ferdinand Schlatt <fschlatt@gmail.com>
Co-authored-by: Dahlbomii <101373053+Dahlbomii@users.noreply.github.com>
Co-authored-by: Gunjan Chhablani <chhablani.gunjan@gmail.com>
Co-authored-by: Rishav Chandra Varma <rishavchandra.v16@iiits.in>
Co-authored-by: matt <rocketknight1@gmail.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home>
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: Daniel Stancl <46073029+stancld@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
Co-authored-by: Karim Foda <35491698+KMFODA@users.noreply.github.com>
Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: Andres Codas <andrescodas@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
Co-authored-by: Francesco Saverio Zuppichini <francesco.zuppichini@gmail.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* first proposal
* replace model outputs in various models
* conflicts
* docstring
* update poolformer
* minor change in docstring
* CI
* removed poolformer specific outputs from doc
* removed convnext specific outputs from doc
* CI
* weird char in segformer
* conversations
* reverted docstring for BaseModelOutputWithPooling
* update outputs
* changed docstring in BaseModelOutput
* updated docstring in modeling outputs
* typos :)
* fixed typo after copy & paste it all around
* CI
* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* segformer
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* ported TFViTMAEIntermediate and TFViTMAEOutput.
* added TFViTMAEModel and TFViTMAEDecoder.
* feat: added a noise argument in the implementation for reproducibility.
* feat: vit mae models with an additional noise argument for reproducibility.
Co-authored-by: ariG23498 <aritra.born2fly@gmail.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* fix confusing PIL instructions
As stated in the documentation
[here](https://pillow.readthedocs.io/en/stable/handbook/image-file-formats.html?highlight=pdf#write-only-formats),
PIL can only write PDF's, not read them. Remove references to reading
PDF's via PIL from this page to avoid confusion.
* mention PDF in doc examples using PIL
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Be explicit: PDFs must be converted to images
* fix formatting
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Created the Decision Transformer Modle
* updating tests, copy to other machine
* Added last hidden size to Decision Transformer modelling outputs
* Removed copy of original DT file
* made a temporary change to gpt2 to have it conform with the Decision Transformer version
* Updated tests
* Ignoring a file used to test the DT model
* added comments to config file
* added comments and argument descriptions to decision transformer file
* Updated doc
* Ran "make style"
* Remove old model imports
* Removed unused imports, cleaned up init file
* Update docs/source/model_doc/decision_transformer.mdx
added my username
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Reverted changes made to gpt2
* Removed datasets submodule
* Update the modeling outputs to include gpt2 attentions, hidden states and last hidden states
* Added support for return of hidden states, attentions and return dict of gpt2 model.
* Updated tests to include many of the ModelTesterMixin tests.
The following tests are skipped: test_generate_without_input_ids, test_pruning, test_resize_embeddings, test_head_masking, test_attention_outputs, test_hidden_states_output, test_inputs_embeds, test_model_common_attributes
* Added missing line to the end of gpt2 file
* Added an integration test for the Decision Transformer
Test performs and autoregressive evaluation for two time steps
* Set done and info to _ to fix failing test
* Updated integration test to be deterministic and check expected outputs
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Removed unnecessary config options
* Cleaned up commented code and old comments.
* Cleaned up commented code.
* Changed DecisionTransformer to Decision Transformer
* Added Decision Transformer to the main README file
* Added copy of GTP2 called DecisionTranformerGPT2Model
* isorted imports
* isorted imports
* Added model to non-English README files
* Ran make fix-copies and corrected some cases.
* Updated index file to include Decision Transformer
* Added gpt2 model as copy inside the Decision Transformer model file
* Added the unit test file to the list of TEST_FILES_WITH_NO_COMMON_TESTS
* Deleted redundant checkpoint files (I don't know how these got committed)
* Removed testing files. (These should have never been committed)
* Removed accidentally committed files
* Moved the Decision Transformer test to its own directory
* Add type hints for Pegasus (#16324)
* Funnel type hints (#16323)
* add pt funnel type hints
* add tf funnel type hints
* Add type hints for ProphetNet PyTorch (#16272)
* [GLPN] Improve docs (#16331)
* Add link to notebook
* Add link
* Fix bug
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
* Added type hints for Pytorch Marian calls (#16200)
* Added type hinting for forward functions in pytorch marian
* typo correction
* Removed type hints on functions from BART per Suraj Patil request
* fix import pb
* fix typo
* corrected tuple call
* ran black
* after fix-copies
Some optional tags on primitives were removed, past_key_values in MarianForCausalLM changed from Tuple of Tuple to List
* Fixing copies to roformer and pegasus
Co-authored-by: Clementine Fourrier <cfourrie@inria.fr>
Co-authored-by: matt <rocketknight1@gmail.com>
* Moved DecisionTransformOutput to modeling_decision_transformer
* Moved the example usage to research project and cleaned comments
* Made tests ignore the copy of gpt2 in Decision Transformer
* Added module output to modelling decision transformer
* removed copied gpt2 model from list of transformers models
* Updated tests and created __init__ file for new test location
* Update README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/decision_transformer/configuration_decision_transformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Removed unneeded summary type from config file
* Fixed copies
* Updated pretrained config map to refer to hopper-medium checkpoint
* done (#16340)
* Added Decision transformer to model docs
* Update src/transformers/models/decision_transformer/modeling_decision_transformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/decision_transformer/modeling_decision_transformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/decision_transformer/configuration_decision_transformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Add type annotations for Rembert/Splinter and copies (#16338)
* undo black autoformat
* minor fix to rembert forward with default
* make fix-copies, make quality
* Adding types to template model
* Removing List from the template types
* Remove `Optional` from a couple of types that don't accept `None`
Co-authored-by: matt <rocketknight1@gmail.com>
* [Bug template] Shift responsibilities for long-range (#16344)
* Fix code repetition in serialization guide (#16346)
* Adopt framework-specific blocks for content (#16342)
* ✨ refactor code samples with framework-specific blocks
* ✨ update training.mdx
* 🖍 apply feedback
* Updates the default branch from master to main (#16326)
* Updates the default branch from master to main
* Links from `master` to `main`
* Typo
* Update examples/flax/README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Updated model with custom docstring example
* Created the Decision Transformer Modle
* updating tests, copy to other machine
* Added last hidden size to Decision Transformer modelling outputs
* Removed copy of original DT file
* made a temporary change to gpt2 to have it conform with the Decision Transformer version
* Updated tests
* Ignoring a file used to test the DT model
* added comments to config file
* added comments and argument descriptions to decision transformer file
* Updated doc
* Ran "make style"
* Remove old model imports
* Removed unused imports, cleaned up init file
* Update docs/source/model_doc/decision_transformer.mdx
added my username
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Reverted changes made to gpt2
* Removed datasets submodule
* Update the modeling outputs to include gpt2 attentions, hidden states and last hidden states
* Added support for return of hidden states, attentions and return dict of gpt2 model.
* Updated tests to include many of the ModelTesterMixin tests.
The following tests are skipped: test_generate_without_input_ids, test_pruning, test_resize_embeddings, test_head_masking, test_attention_outputs, test_hidden_states_output, test_inputs_embeds, test_model_common_attributes
* Added missing line to the end of gpt2 file
* Added an integration test for the Decision Transformer
Test performs and autoregressive evaluation for two time steps
* Set done and info to _ to fix failing test
* Updated integration test to be deterministic and check expected outputs
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Removed unnecessary config options
* Cleaned up commented code and old comments.
* Cleaned up commented code.
* Changed DecisionTransformer to Decision Transformer
* Added Decision Transformer to the main README file
* Added copy of GTP2 called DecisionTranformerGPT2Model
* isorted imports
* isorted imports
* Added model to non-English README files
* Ran make fix-copies and corrected some cases.
* Updated index file to include Decision Transformer
* Added gpt2 model as copy inside the Decision Transformer model file
* Added the unit test file to the list of TEST_FILES_WITH_NO_COMMON_TESTS
* Deleted redundant checkpoint files (I don't know how these got committed)
* Removed testing files. (These should have never been committed)
* Removed accidentally committed files
* Moved the Decision Transformer test to its own directory
* Moved DecisionTransformOutput to modeling_decision_transformer
* Moved the example usage to research project and cleaned comments
* Made tests ignore the copy of gpt2 in Decision Transformer
* Added module output to modelling decision transformer
* removed copied gpt2 model from list of transformers models
* Updated tests and created __init__ file for new test location
* Update README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/decision_transformer/configuration_decision_transformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Removed unneeded summary type from config file
* Fixed copies
* Updated pretrained config map to refer to hopper-medium checkpoint
* Added Decision transformer to model docs
* Update src/transformers/models/decision_transformer/modeling_decision_transformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/decision_transformer/modeling_decision_transformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/decision_transformer/configuration_decision_transformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Updated model with custom docstring example
* Updated copies, config auto, and readme files.
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Dan Tegzes <48134725+Tegzes@users.noreply.github.com>
Co-authored-by: Adam Montgomerie <adam@avanssion.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: Clémentine Fourrier <22726840+clefourrier@users.noreply.github.com>
Co-authored-by: Clementine Fourrier <cfourrie@inria.fr>
Co-authored-by: matt <rocketknight1@gmail.com>
Co-authored-by: Francesco Saverio Zuppichini <francesco.zuppichini@gmail.com>
Co-authored-by: Jacob Dineen <54680234+jacobdineen@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
* Updates the default branch from master to main
* Links from `master` to `main`
* Typo
* Update examples/flax/README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Add Flaubert to ONNX to make it available for conversion.
* Fixed features for FlauBERT. fixup command remove flaubert to docs list.
Co-authored-by: ChainYo <t.chaigneau.tc@gmail.com>
* Remove unused attributes
* Add link to blog and add clarification about input size
* Improve readability of the code
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
* Update training.mdx
Fixed Error Raised Due to Wrongly Accessing Training Sample
* Ran make style
* Revert to Old Commit
* Apply suggestions from code review
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* Draft a guide with our code quirks for new models
* Apply suggestions from code review
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* up
* up
* up
* fix
* yeh
* ups
* Empty test commit
* correct quicktour
* correct
* correct
* up
* up
* uP
* uP
* up
* up
* uP
* up
* up
* up
* up
* up
* up
* up
* up
* up
* up
* Update src/transformers/models/van/modeling_van.py
* finish
* apply suggestions
* remove folder
* revert to daily testing
* [Generate Docs] Correct docs
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>