* create model
* add integration
* save current state
* make integration tests pass
* add one more test
* add explanation to tests
* remove from bart
* add padding
* remove unnecessary test
* make all tests pass
* re-add cookie cutter tests
* finish PyTorch
* fix attention test
* Update tests/test_modeling_common.py
* revert change
* remove unused file
* add string to doc
* save intermediate
* make tf integration tests pass
* finish tf
* fix doc
* fix docs again
* add led to doctree
* add to auto tokenizer
* added tips for led
* make style
* apply jplus statements
* correct tf longformer
* apply lysandres suggestions
* apply sylvains suggestions
* Apply suggestions from code review
* Use extlinks to point hyperlink with the version of code
* Point to version on release and master until then
* Apply style
* Correct links
* Add missing backtick
* Simple missing backtick after all.
Co-authored-by: Raghavendra Sugeeth P S <raghav-5305@raghav-5305.csez.zohocorpin.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
* add past_key_values
* add use_cache option
* make mask before cutting ids
* adjust position_ids according to past_key_values
* flatten past_key_values
* fix positional embeds
* fix _reorder_cache
* set use_cache to false when not decoder, fix attention mask init
* add test for caching
* add past_key_values for Roberta
* fix position embeds
* add caching test for roberta
* add doc
* make style
* doc, fix attention mask, test
* small fixes
* adress patrick's comments
* input_ids shouldn't start with pad token
* use_cache only when decoder
* make consistent with bert
* make copies consistent
* add use_cache to encoder
* add past_key_values to tapas attention
* apply suggestions from code review
* make coppies consistent
* add attn mask in tests
* remove copied from longformer
* apply suggestions from code review
* fix bart test
* nit
* simplify model outputs
* fix doc
* fix output ordering
* Add label smoothing in Trainer
* Add options for scheduler and Adafactor in Trainer
* Put Seq2SeqTrainer in the main lib
* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Address review comments and adapt scripts
* Documentation
* Move test not using script to tests folder
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* First commit: adding all files from tapas_v3
* Fix multiple bugs including soft dependency and new structure of the library
* Improve testing by adding torch_device to inputs and adding dependency on scatter
* Use Python 3 inheritance rather than Python 2
* First draft model cards of base sized models
* Remove model cards as they are already on the hub
* Fix multiple bugs with integration tests
* All model integration tests pass
* Remove print statement
* Add test for convert_logits_to_predictions method of TapasTokenizer
* Incorporate suggestions by Google authors
* Fix remaining tests
* Change position embeddings sizes to 512 instead of 1024
* Comment out positional embedding sizes
* Update PRETRAINED_VOCAB_FILES_MAP and PRETRAINED_POSITIONAL_EMBEDDINGS_SIZES
* Added more model names
* Fix truncation when no max length is specified
* Disable torchscript test
* Make style & make quality
* Quality
* Address CI needs
* Test the Masked LM model
* Fix the masked LM model
* Truncate when overflowing
* More much needed docs improvements
* Fix some URLs
* Some more docs improvements
* Test PyTorch scatter
* Set to slow + minify
* Calm flake8 down
* First commit: adding all files from tapas_v3
* Fix multiple bugs including soft dependency and new structure of the library
* Improve testing by adding torch_device to inputs and adding dependency on scatter
* Use Python 3 inheritance rather than Python 2
* First draft model cards of base sized models
* Remove model cards as they are already on the hub
* Fix multiple bugs with integration tests
* All model integration tests pass
* Remove print statement
* Add test for convert_logits_to_predictions method of TapasTokenizer
* Incorporate suggestions by Google authors
* Fix remaining tests
* Change position embeddings sizes to 512 instead of 1024
* Comment out positional embedding sizes
* Update PRETRAINED_VOCAB_FILES_MAP and PRETRAINED_POSITIONAL_EMBEDDINGS_SIZES
* Added more model names
* Fix truncation when no max length is specified
* Disable torchscript test
* Make style & make quality
* Quality
* Address CI needs
* Test the Masked LM model
* Fix the masked LM model
* Truncate when overflowing
* More much needed docs improvements
* Fix some URLs
* Some more docs improvements
* Add add_pooling_layer argument to TapasModel
Fix comments by @sgugger and @patrickvonplaten
* Fix issue in docs + fix style and quality
* Clean up conversion script and add task parameter to TapasConfig
* Revert the task parameter of TapasConfig
Some minor fixes
* Improve conversion script and add test for absolute position embeddings
* Improve conversion script and add test for absolute position embeddings
* Fix bug with reset_position_index_per_cell arg of the conversion cli
* Add notebooks to the examples directory and fix style and quality
* Apply suggestions from code review
* Move from `nielsr/` to `google/` namespace
* Apply Sylvain's comments
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
Co-authored-by: Rogge Niels <niels.rogge@howest.be>
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
* add model parallelism to T5EncoderModel
add model parallelism to T5EncoderModel
* remove decoder from T5EncoderModel parallelize
* uodate T5EncoderModel docs
* Extend T5ModelTest for T5EncoderModel
* fix T5Stask using range for get_device_map
* fix style
Co-authored-by: Ahmed Elnaggar <elnaggar@rostlab.informatik.tu-muenchen.de>
* rm all model cards
* Update the .rst
@sgugger it is still not super crystal clear/streamlined so let me know if any ideas to make it simpler
* Add a rootlevel README.md with simple instructions/context
* Update docs/source/model_sharing.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* make style
* rm all model cards
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* remove make on the fly linear embedding
* start refactor
* big first refactor
* save intermediate
* save intermediat
* correct mask issue
* save tests
* refactor padding masks
* make all tests pass
* further refactor
* make pegasus test pass
* fix bool if
* fix leftover tests
* continue
* bart renaming
* delete torchscript test hack
* fix imports in tests
* correct shift
* fix docs and repo cons
* re-add fix for FSTM
* typo in test
* fix typo
* fix another typo
* continue
* hot fix 2 for tf
* small fixes
* refactor types linting
* continue
* finish refactor
* fix import in tests
* better bart names
* further refactor and add test
* delete hack
* apply sylvains and lysandres commens
* small perf improv
* further perf improv
* improv perf
* fix typo
* make style
* small perf improv
* diverse beam search
* bug fixes
* bug fixes
* bug fix
* separate out diverse_beam_search function
* separate out diverse_beam_search function
* bug fix
* improve code quality
* bug fix
* bug fix
* separate out diverse beam search scorer
* code format
* code format
* code format
* code format
* add test
* code format
* documentation changes
* code quality
* add slow integration tests
* more general name
* refactor into logits processor
* add test
* avoid too much copy paste
* refactor
* add to docs
* fix-copies
* bug fix
* Revert "bug fix"
This reverts commit c99eb5a8dc.
* improve comment
* implement sylvains feedback
Co-authored-by: Ayush Jain <a.jain@sprinklr.com>
Co-authored-by: ayushtiku5 <40797286+ayushtiku5@users.noreply.github.com>
* Add TFGPT2ForSequenceClassification based on DialogRPT
* Add TFGPT2ForSequenceClassification based on DialogRPT
* TFGPT2ForSequenceClassification based on DialogRPT-refactored code, implemented review comments and added input processing
* Add TFGPT2ForSequenceClassification based on DialogRPT
* TFGPT2ForSequenceClassification based on DialogRPT-refactored code, implemented review comments and added input processing
* code refactor for latest other TF PR
* code refactor
* code refactor
* Update modeling_tf_gpt2.py
* Add badge w/ number of models on the hub
* try to apease @sgugger 😇
* not sure what this `c` was about [ci skip]
* Fix script and move stuff around
* Fix doc styling error
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
* fix mems in xlnet
* fix use_mems
* fix use_mem_len
* fix use mems
* clean docs
* fix tf typo
* make xlnet tf for generation work
* fix tf test
* refactor use cache
* add use cache for missing models
* correct use_cache in generate
* correct use cache in tf generate
* fix tf
* correct getattr typo
* make sylvain happy
* change in docs as well
* do not apply to cookie cutter statements
* fix tf test
* make pytorch model fully backward compatible
* Add early stopping patience and minimum threshold metric must improve to prevent early stopping to pytorch trainer
* Add early stopping test
* Set patience counter to 0 if best metric not defined yet
* Make early stopping a callback. Add callback event for updating the best metric for early stopping callback to trigger on.
* Run make style
* make funciton name sensible
* Improve new argument docstring wording and hope that flakey CI test passes.
* Use on_evaluation callback instead of custom. Remove some debug printing
* Move early stopping arguments and state into early stopping callback
* Run make style
* Remove old code
* Fix docs formatting. make style went rogue on me.
* Remove copied attributes and fix variable
* Add assertions on training arguments instead of mutating them. Move comment out of public docs.
* Make separate test for early stopping callback. Add test of invalid arguments.
* Run make style... I remembered before CI this time!
* appease flake8
* Add EarlyStoppingCallback to callback docs
* Make docstring EarlyStoppingCallabck match other callbacks.
* Fix typo in docs
* working on LongformerForSequenceClassification
* add TFLongformerForMultipleChoice
* add TFLongformerForTokenClassification
* use add_start_docstrings_to_model_forward
* test TFLongformerForSequenceClassification
* test TFLongformerForMultipleChoice
* test TFLongformerForTokenClassification
* remove test from repo
* add test and doc for TFLongformerForSequenceClassification, TFLongformerForTokenClassification, TFLongformerForMultipleChoice
* add requested classes to modeling_tf_auto.py
update dummy_tf_objects
fix tests
fix bugs in requested classes
* pass all tests except test_inputs_embeds
* sync with master
* pass all tests except test_inputs_embeds
* pass all tests
* pass all tests
* work on test_inputs_embeds
* fix style and quality
* make multi choice work
* fix TFLongformerForTokenClassification signature
* fix TFLongformerForMultipleChoice, TFLongformerForSequenceClassification signature
* fix mult choice
* fix mc hint
* fix input embeds
* fix input embeds
* refactor input embeds
* fix copy issue
* apply sylvains changes and clean more
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Updated the Extractive Question Answering code snippets
The Extractive Question Answering code snippets do not work anymore since the models return task-specific output objects. This commit fixes the pytorch and tensorflow examples but adding `.values()` to the model call.
* Update task_summary.rst
* Tokenizers should be framework agnostic
* Run the slow tests
* Not testing
* Fix documentation
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* <small>tiny typo</small>
* Tokenizers: ability to load from model subfolder
* use subfolder for local files as well
* Uniformize model shortcut name => model id
* from s3 => from huggingface.co
Co-authored-by: Quentin Lhoest <lhoest.q@gmail.com>
* Put models in subfolders
* Styling
* Fix imports in tests
* More fixes in test imports
* Sneaky hidden imports
* Fix imports in doc files
* More sneaky imports
* Finish fixing tests
* Fix examples
* Fix path for copies
* More fixes for examples
* Fix dummy files
* More fixes for example
* More model import fixes
* Is this why you're unhappy GitHub?
* Fix imports in conver command
* Use the CI to identify failing tests
* Remove from all examples and tests
* More default switch
* Fixes
* More test fixes
* More fixes
* Last fixes hopefully
* Use the CI to identify failing tests
* Remove from all examples and tests
* More default switch
* Fixes
* More test fixes
* More fixes
* Last fixes hopefully
* Run on the real suite
* Fix slow tests
* First addition of Flax/Jax documentation
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* make style
* Ensure input order match between Bert & Roberta
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Install dependencies "all" when building doc
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* wraps build_doc deps with ""
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Addressing @sgugger comments.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Use list to highlight JAX features.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Make style.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Let's not look to much into the future for now.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Style
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
* Create modeling_tf_dpr.py
* Add TFDPR
* Add back TFPegasus, TFMarian, TFMBart, TFBlenderBot
last commit accidentally deleted these 4 lines, so I recover them back
* Add TFDPR
* Add TFDPR
* clean up some comments, add TF input-style doc string
* Add TFDPR
* Make return_dict=False as default
* Fix return_dict bug (in .from_pretrained)
* Add get_input_embeddings()
* Create test_modeling_tf_dpr.py
The current version is already passed all 27 tests!
Please see the test run at :
https://colab.research.google.com/drive/1czS_m9zy5k-iSJbzA_DP1k1xAAC_sdkf?usp=sharing
* fix quality
* delete init weights
* run fix copies
* fix repo consis
* del config_class, load_tf_weights
They shoud be 'pytorch only'
* add config_class back
after removing it, test failed ... so totally only removing "use_tf_weights = None" on Lysandre suggestion
* newline after .. note::
* import tf, np (Necessary for ModelIntegrationTest)
* slow_test from_pretrained with from_pt=True
At the moment we don't have TF weights (since we don't have official official TF model)
Previously, I did not run slow test, so I missed this bug
* Add simple TFDPRModelIntegrationTest
Note that this is just a test that TF and Pytorch gives approx. the same output.
However, I could not test with the official DPR repo's output yet
* upload correct tf model
* remove position_ids as missing keys
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: patrickvonplaten <patrick@huggingface.co>
* [testing utils] get_auto_remove_tmp_dir default change
Now that I have been using `get_auto_remove_tmp_dir default change` for a while, I realized that the defaults aren't most optimal.
99% of the time we want the tmp dir to be empty at the beginning of the test - so changing the default to `before=True` - this shouldn't impact any tests since this feature is used only during debug.
* simplify things
* update docs
* fix doc layout
* style
* Update src/transformers/testing_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* better 3-state doc
* style
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* s/tmp/temporary/ + style
* correct the statement
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* add training tests
* correct longformer
* fix docs
* fix some tests
* fix some more train tests
* remove ipdb
* fix multiple edge case model training
* fix funnel and prophetnet
* clean gpt models
* undo renaming of albert
* Output cross-attention with decoder attention output
* Update src/transformers/modeling_bert.py
* add cross-attention for t5 and bart as well
* fix tests
* correct typo in docs
* add sylvains and sams comments
* correct typo
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Output global_attentions in Longformer models
* make style
* small refactoring
* fix tests
* make fix-copies
* add for tf as well
* remove comments in test
* make fix-copies
* make style
* add docs
* make docstring pretty
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
* first draft
* show design proposition for new generate method
* up
* make better readable
* make first version
* gpt2 tests pass
* make beam search for gpt2 work
* add first encoder-decoder code
* delete typo
* make t5 work
* save indermediate
* make bart work with beam search
* finish beam search bart / t5
* add default kwargs
* make more tests pass
* fix no bad words sampler
* some fixes and tests for all distribution processors
* fix test
* fix rag slow tests
* merge to master
* add nograd to generate
* make all slow tests pass
* speed up generate
* fix edge case bug
* small fix
* correct typo
* add type hints and docstrings
* fix typos in tests
* add beam search tests
* add tests for beam scorer
* fix test rag
* finish beam search tests
* move generation tests in seperate file
* fix generation tests
* more tests
* add aggressive generation tests
* fix tests
* add gpt2 sample test
* add more docstring
* add more docs
* finish doc strings
* apply some more of sylvains and sams comments
* fix some typos
* make fix copies
* apply lysandres and sylvains comments
* final corrections on examples
* small fix for reformer
* move the helper code into testing_utils
* port test_trainer_distributed to work with pytest
* improve docs
* simplify notes
* doc
* doc
* style
* doc
* further improvements
* torch might not be available
* real fix
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* first attempt to add AzureML callbacks
* func arg fix
* var name fix, but still won't fix error...
* fixing as in https://discuss.huggingface.co/t/how-to-integrate-an-azuremlcallback-for-logging-in-azure/1713/2
* Avoid lint check of azureml import
* black compliance
* Make isort happy
* Fix point typo in docs
* Add AzureML to Callbacks docs
* Attempt to make sphinx happy
* Format callback docs
* Make documentation style happy
* Make docs compliant to style
Co-authored-by: Davide Fiocco <davide.fiocco@frontiersin.net>
* Important files
* Styling them all
* Revert "Styling them all"
This reverts commit 7d029395fd.
* Syling them for realsies
* Fix syntax error
* Fix benchmark_utils
* More fixes
* Fix modeling auto and script
* Remove new line
* Fixes
* More fixes
* Fix more files
* Style
* Add FSMT
* More fixes
* More fixes
* More fixes
* More fixes
* Fixes
* More fixes
* More fixes
* Last fixes
* Make sphinx happy
* Fix minor typos
Fix minor typos in the docs.
* Update docs/source/preprocessing.rst
Clearer data structure description.
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Add MLflow integration class
Add integration code for MLflow in integrations.py along with the code
that checks that MLflow is installed.
* Add MLflowCallback import
Add import of MLflowCallback in trainer.py
* Handle model argument
Allow the callback to handle model argument and store model config items as hyperparameters.
* Log parameters to MLflow in batches
MLflow cannot log more than a hundred parameters at once.
Code added to split the parameters into batches of 100 items and log the batches one by one.
* Fix style
* Add docs on MLflow callback
* Fix issue with unfinished runs
The "fluent" api used in MLflow integration allows only one run to be active at any given moment. If the Trainer is disposed off and a new one is created, but the training is not finished, it will refuse to log the results when the next trainer is created.
* Add MLflow integration class
Add integration code for MLflow in integrations.py along with the code
that checks that MLflow is installed.
* Add MLflowCallback import
Add import of MLflowCallback in trainer.py
* Handle model argument
Allow the callback to handle model argument and store model config items as hyperparameters.
* Log parameters to MLflow in batches
MLflow cannot log more than a hundred parameters at once.
Code added to split the parameters into batches of 100 items and log the batches one by one.
* Fix style
* Add docs on MLflow callback
* Fix issue with unfinished runs
The "fluent" api used in MLflow integration allows only one run to be active at any given moment. If the Trainer is disposed off and a new one is created, but the training is not finished, it will refuse to log the results when the next trainer is created.
* slow tests should be slow
* exception note
* style
* integrate LysandreJik's notes with some expansions
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* another slow test
* fix link, and prose
* clarify.
* note from Sam
* typo
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* add CustomHFIndex
* typo in config
* update tests
* add custom dataset example
* clean script
* update test data
* minor in test
* docs
* docs
* style
* fix imports
* allow to pass the indexed dataset directly
* update tests
* use multiset DPR
* address thom and patrick's comments
* style
* update dpr tokenizer
* add output_dir flag in use_own_knowledge_dataset.py
* allow custom datasets in examples/rag/finetune.py
* add test for custom dataset in distributed rag retriever
* splitting fast and slow tokenizers [WIP]
* [WIP] splitting sentencepiece and tokenizers dependencies
* update dummy objects
* add name_or_path to models and tokenizers
* prefix added to file names
* prefix
* styling + quality
* spliting all the tokenizer files - sorting sentencepiece based ones
* update tokenizer version up to 0.9.0
* remove hard dependency on sentencepiece 🎉
* and removed hard dependency on tokenizers 🎉
* update conversion script
* update missing models
* fixing tests
* move test_tokenization_fast to main tokenization tests - fix bugs
* bump up tokenizers
* fix bert_generation
* update ad fix several tokenizers
* keep sentencepiece in deps for now
* fix funnel and deberta tests
* fix fsmt
* fix marian tests
* fix layoutlm
* fix squeezebert and gpt2
* fix T5 tokenization
* fix xlnet tests
* style
* fix mbart
* bump up tokenizers to 0.9.2
* fix model tests
* fix tf models
* fix seq2seq examples
* fix tests without sentencepiece
* fix slow => fast conversion without sentencepiece
* update auto and bert generation tests
* fix mbart tests
* fix auto and common test without tokenizers
* fix tests without tokenizers
* clean up tests lighten up when tokenizers + sentencepiece are both off
* style quality and tests fixing
* add sentencepiece to doc/examples reqs
* leave sentencepiece on for now
* style quality split hebert and fix pegasus
* WIP Herbert fast
* add sample_text_no_unicode and fix hebert tokenization
* skip FSMT example test for now
* fix style
* fix fsmt in example tests
* update following Lysandre and Sylvain's comments
* Update src/transformers/testing_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/testing_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/tokenization_utils_base.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/tokenization_utils_base.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Add Documentation for GPT-1 Classification
* Add GPT-1 with Classification head
* Add tests for GPT-1 Classification
* Add GPT-1 For Classification to auto models
* Remove authorized missing keys, change checkpoint to openai-gpt
* [WIP] SP tokenizers
* fixing tests for T5
* WIP tokenizers
* serialization
* update T5
* WIP T5 tokenization
* slow to fast conversion script
* Refactoring to move tokenzier implementations inside transformers
* Adding gpt - refactoring - quality
* WIP adding several tokenizers to the fast world
* WIP Roberta - moving implementations
* update to dev4 switch file loading to in-memory loading
* Updating and fixing
* advancing on the tokenizers - updating do_lower_case
* style and quality
* moving forward with tokenizers conversion and tests
* MBart, T5
* dumping the fast version of transformer XL
* Adding to autotokenizers + style/quality
* update init and space_between_special_tokens
* style and quality
* bump up tokenizers version
* add protobuf
* fix pickle Bert JP with Mecab
* fix newly added tokenizers
* style and quality
* fix bert japanese
* fix funnel
* limite tokenizer warning to one occurence
* clean up file
* fix new tokenizers
* fast tokenizers deep tests
* WIP adding all the special fast tests on the new fast tokenizers
* quick fix
* adding more fast tokenizers in the fast tests
* all tokenizers in fast version tested
* Adding BertGenerationFast
* bump up setup.py for CI
* remove BertGenerationFast (too early)
* bump up tokenizers version
* Clean old docstrings
* Typo
* Update following Lysandre comments
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
* Initial callback proposal
* Finish various callbacks
* Post-rebase conflicts
* Fix tests
* Don't use something that's not set
* Documentation
* Remove unwanted print.
* Document all models can work
* Add tests + small fixes
* Update docs/source/internal/trainer_utils.rst
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Address review comments
* Fix TF tests
* Real fix this time
* This one should work
* Fix typo
* Really fix typo
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* configuration_squeezebert.py
thin wrapper around bert tokenizer
fix typos
wip sb model code
wip modeling_squeezebert.py. Next step is to get the multi-layer-output interface working
set up squeezebert to use BertModelOutput when returning results.
squeezebert documentation
formatting
allow head mask that is an array of [None, ..., None]
docs
docs cont'd
path to vocab
docs and pointers to cloud files (WIP)
line length and indentation
squeezebert model cards
formatting of model cards
untrack modeling_squeezebert_scratchpad.py
update aws paths to vocab and config files
get rid of stub of NSP code, and advise users to pretrain with mlm only
fix rebase issues
redo rebase of modeling_auto.py
fix issues with code formatting
more code format auto-fixes
move squeezebert before bert in tokenization_auto.py and modeling_auto.py because squeezebert inherits from bert
tests for squeezebert modeling and tokenization
fix typo
move squeezebert before bert in modeling_auto.py to fix inheritance problem
disable test_head_masking, since squeezebert doesn't yet implement head masking
fix issues exposed by the test_modeling_squeezebert.py
fix an issue exposed by test_tokenization_squeezebert.py
fix issue exposed by test_modeling_squeezebert.py
auto generated code style improvement
issue that we inherited from modeling_xxx.py: SqueezeBertForMaskedLM.forward() calls self.cls(), but there is no self.cls, and I think the goal was actually to call self.lm_head()
update copyright
resolve failing 'test_hidden_states_output' and remove unused encoder_hidden_states and encoder_attention_mask
docs
add integration test. rename squeezebert-mnli --> squeezebert/squeezebert-mnli
autogenerated formatting tweaks
integrate feedback from patrickvonplaten and sgugger to programming style and documentation strings
* tiny change to order of imports
* Clean up model documentation
* Formatting
* Preparation work
* Long lines
* Main work on rst files
* Cleanup all config files
* Syntax fix
* Clean all tokenizers
* Work on first models
* Models beginning
* FaluBERT
* All PyTorch models
* All models
* Long lines again
* Fixes
* More fixes
* Update docs/source/model_doc/bert.rst
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update docs/source/model_doc/electra.rst
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Last fixes
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* skip decorators: docs, tests, bugs
* another important note
* style
* bloody style
* add @pytest.mark.parametrize
* add note
* no idea what it wants :(
* added rag WIP
* path fix
* Formatting / renaming prior to actual work
* added rag WIP
* path fix
* Formatting / renaming prior to actual work
* added rag WIP
* path fix
* Formatting / renaming prior to actual work
* added rag WIP
* Formatting / renaming prior to actual work
* First commit
* improve comments
* Retrieval evaluation scripts
* refactor to include modeling outputs + MPI retriever
* Fix rag-token model + refactor
* Various fixes + finetuning logic
* use_bos fix
* Retrieval refactor
* Finetuning refactoring and cleanup
* Add documentation and cleanup
* Remove set_up_rag_env.sh file
* Fix retrieval wit HF index
* Fix import errors
* Fix quality errors
* Refactor as per suggestions in https://github.com/huggingface/transformers/pull/6813#issuecomment-687208867
* fix quality
* Fix RAG Sequence generation
* minor cleanup plus initial tests
* fix test
* fix tests 2
* Comments fix
* post-merge fixes
* Improve readme + post-rebase refactor
* Extra dependencied for tests
* Fix tests
* Fix tests 2
* Refactor test requirements
* Fix tests 3
* Post-rebase refactor
* rename nlp->datasets
* RAG integration tests
* add tokenizer to slow integration test and allow retriever to run on cpu
* add tests; fix position ids warning
* change structure
* change structure
* add from encoder generator
* save working solution
* make all integration tests pass
* add RagTokenizer.save/from_pretrained and RagRetriever.save/from_pretrained
* don't save paths
* delete unnecessary imports
* pass config to AutoTokenizer.from_pretrained for Rag tokenizers
* init wiki_dpr only once
* hardcode legacy index and passages paths (todo: add the right urls)
* finalize config
* finalize retriver api and config api
* LegacyIndex index download refactor
* add dpr to autotokenizer
* make from pretrained more flexible
* fix ragfortokengeneration
* small name changes in tokenizer
* add labels to models
* change default index name
* add retrieval tests
* finish token generate
* align test with previous version and make all tests pass
* add tests
* finalize tests
* implement thoms suggestions
* add first version of test
* make first tests work
* make retriever platform agnostic
* naming
* style
* add legacy index URL
* docstrings + simple retrieval test for distributed
* clean model api
* add doc_ids to retriever's outputs
* fix retrieval tests
* finish model outputs
* finalize model api
* fix generate problem for rag
* fix generate for other modles
* fix some tests
* save intermediate
* set generate to default
* big refactor generate
* delete rag_api
* correct pip faiss install
* fix auto tokenization test
* fix faiss install
* fix test
* move the distributed logic to examples
* model page
* docs
* finish tests
* fix dependencies
* fix import in __init__
* Refactor eval_rag and finetune scripts
* start docstring
* add psutil to test
* fix tf test
* move require torch to top
* fix retrieval test
* align naming
* finish automodel
* fix repo consistency
* test ragtokenizer save/load
* add rag model output docs
* fix ragtokenizer save/load from pretrained
* fix tokenizer dir
* remove torch in retrieval
* fix docs
* fixe finetune scripts
* finish model docs
* finish docs
* remove auto model for now
* add require torch
* remove solved todos
* integrate sylvains suggestions
* sams comments
* correct mistake on purpose
* improve README
* Add generation test cases
* fix rag token
* clean token generate
* fix test
* add note to test
* fix attention mask
* add t5 test for rag
* Fix handling prefix in finetune.py
* don't overwrite index_name
Co-authored-by: Patrick Lewis <plewis@fb.com>
Co-authored-by: Aleksandra Piktus <piktus@devfair0141.h2.fair>
Co-authored-by: Aleksandra Piktus <piktus@learnfair5102.h2.fair>
Co-authored-by: Aleksandra Piktus <piktus@learnfair5067.h2.fair>
Co-authored-by: Your Name <you@example.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Quentin Lhoest <lhoest.q@gmail.com>
* ready for PR
* cleanup
* correct FSMT_PRETRAINED_MODEL_ARCHIVE_LIST
* fix
* perfectionism
* revert change from another PR
* odd, already committed this one
* non-interactive upload workaround
* backup the failed experiment
* store langs in config
* workaround for localizing model path
* doc clean up as in https://github.com/huggingface/transformers/pull/6956
* style
* back out debug mode
* document: run_eval.py --num_beams 10
* remove unneeded constant
* typo
* re-use bart's Attention
* re-use EncoderLayer, DecoderLayer from bart
* refactor
* send to cuda and fp16
* cleanup
* revert (moved to another PR)
* better error message
* document run_eval --num_beams
* solve the problem of tokenizer finding the right files when model is local
* polish, remove hardcoded config
* add a note that the file is autogenerated to avoid losing changes
* prep for org change, remove unneeded code
* switch to model4.pt, update scores
* s/python/bash/
* missing init (but doesn't impact the finetuned model)
* cleanup
* major refactor (reuse-bart)
* new model, new expected weights
* cleanup
* cleanup
* full link
* fix model type
* merge porting notes
* style
* cleanup
* have to create a DecoderConfig object to handle vocab_size properly
* doc fix
* add note (not a public class)
* parametrize
* - add bleu scores integration tests
* skip test if sacrebleu is not installed
* cache heavy models/tokenizers
* some tweaks
* remove tokens that aren't used
* more purging
* simplify code
* switch to using decoder_start_token_id
* add doc
* Revert "major refactor (reuse-bart)"
This reverts commit 226dad15ca.
* decouple from bart
* remove unused code #1
* remove unused code #2
* remove unused code #3
* update instructions
* clean up
* move bleu eval to examples
* check import only once
* move data+gen script into files
* reuse via import
* take less space
* add prepare_seq2seq_batch (auto-tested)
* cleanup
* recode test to use json instead of yaml
* ignore keys not needed
* use the new -y in transformers-cli upload -y
* [xlm tok] config dict: fix str into int to match definition (#7034)
* [s2s] --eval_max_generate_length (#7018)
* Fix CI with change of name of nlp (#7054)
* nlp -> datasets
* More nlp -> datasets
* Woopsie
* More nlp -> datasets
* One last
* extending to support allen_nlp wmt models
- allow a specific checkpoint file to be passed
- more arg settings
- scripts for allen_nlp models
* sync with changes
* s/fsmt-wmt/wmt/ in model names
* s/fsmt-wmt/wmt/ in model names (p2)
* s/fsmt-wmt/wmt/ in model names (p3)
* switch to a better checkpoint
* typo
* make non-optional args such - adjust tests where possible or skip when there is no other choice
* consistency
* style
* adjust header
* cards moved (model rename)
* use best custom hparams
* update info
* remove old cards
* cleanup
* s/stas/facebook/
* update scores
* s/allen_nlp/allenai/
* url maps aren't needed
* typo
* move all the doc / build /eval generators to their own scripts
* cleanup
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* fix indent
* duplicated line
* style
* use the correct add_start_docstrings
* oops
* resizing can't be done with the core approach, due to 2 dicts
* check that the arg is a list
* style
* style
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Initial model
* Fix upsampling
* Add special cls token id and test
* Formatting
* Test and fist FunnelTokenizerFast
* Common tests
* Fix the check_repo script and document Funnel
* Doc fixes
* Add all models
* Write doc
* Fix test
* Initial model
* Fix upsampling
* Add special cls token id and test
* Formatting
* Test and fist FunnelTokenizerFast
* Common tests
* Fix the check_repo script and document Funnel
* Doc fixes
* Add all models
* Write doc
* Fix test
* Fix copyright
* Forgot some layers can be repeated
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/modeling_funnel.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Address review comments
* Update src/transformers/modeling_funnel.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Address review comments
* Update src/transformers/modeling_funnel.py
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
* Slow integration test
* Make small integration test
* Formatting
* Add checkpoint and separate classification head
* Formatting
* Expand list, fix link and add in pretrained models
* Styling
* Add the model in all summaries
* Typo fixes
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
Tested in a local build of the docs.
e.g. Just above https://huggingface.co/transformers/task_summary.html#causal-language-modeling
Copy will copy the full code, e.g.
for token in top_5_tokens:
print(sequence.replace(tokenizer.mask_token, tokenizer.decode([token])))
Instead of currently only:
for token in top_5_tokens:
>>> for token in top_5_tokens:
... print(sequence.replace(tokenizer.mask_token, tokenizer.decode([token])))
Distilled models are smaller than the models they mimic. Using them instead of the large versions would help reduce our carbon footprint.
Distilled models are smaller than the models they mimic. Using them instead of the large versions would help increase our carbon footprint.
Distilled models are smaller than the models they mimic. Using them instead of the large versions would help decrease our carbon footprint.
Distilled models are smaller than the models they mimic. Using them instead of the large versions would help offset our carbon footprint.
Distilled models are smaller than the models they mimic. Using them instead of the large versions would help improve our carbon footprint.
Docs for the option fix:
https://sphinx-copybutton.readthedocs.io/en/latest/
* [doc] multiple corrections to "Summary of the tasks"
* fix indentation
* correction
* fix links, add links to examples/seq2seq/README.md instead of non-existing script
* [doc] make the text more readable, fix some typos, add some disambiguation
* Update docs/source/glossary.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Generation doc
* MBartForConditionalGeneration (#6441)
* add MBartForConditionalGeneration
* style
* rebase and fixes
* add mbart test in TEST_FILES_WITH_NO_COMMON_TESTS
* fix docs
* don't ignore mbart
* doc
* fix mbart fairseq link
* put mbart before bart
* apply doc suggestions
* Use hash to clean the test dirs (#6475)
* Use hash to clean the test dirs
* Use hash to clean the test dirs
* Use hash to clean the test dirs
* fix
* [EncoderDecoder] Add Cross Attention for GPT2 (#6415)
* add cross attention layers for gpt2
* make gpt2 cross attention work
* finish bert2gpt2
* add explicit comments
* remove attention mask since not yet supported
* revert attn mask in pipeline
* Update src/transformers/modeling_gpt2.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/modeling_encoder_decoder.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Sort unique_no_split_tokens to make it deterministic (#6461)
* change unique_no_split_tokens's type to set
* use sorted list instead of set
* style
* Import accuracy_score (#6480)
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Address comments
* Styling
* Generation doc
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Address comments
* Styling
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>
Co-authored-by: gijswijnholds <gijswijnholds@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* improve names and tests longformer
* more and better tests for longformer
* add first tf test
* finalize tf basic op functions
* fix merge
* tf shape test passes
* narrow down discrepancies
* make longformer local attn tf work
* correct tf longformer
* add first global attn function
* add more global longformer func
* advance tf longformer
* finish global attn
* upload big model
* finish all tests
* correct false any statement
* fix common tests
* make all tests pass except keras save load
* fix some tests
* fix torch test import
* finish tests
* fix test
* fix torch tf tests
* add docs
* finish docs
* Update src/transformers/modeling_longformer.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/modeling_tf_longformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* apply Lysandres suggestions
* reverse to assert statement because function will fail otherwise
* applying sylvains recommendations
* Update src/transformers/modeling_longformer.py
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
* Update src/transformers/modeling_tf_longformer.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
* Add a script to check all models are tested and documented
* Apply suggestions from code review
Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
* Address comments
Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
* TF outputs and test on BERT
* Albert to DistilBert
* All remaining TF models except T5
* Documentation
* One file forgotten
* TF outputs and test on BERT
* Albert to DistilBert
* All remaining TF models except T5
* Documentation
* One file forgotten
* Add new models and fix issues
* Quality improvements
* Add T5
* A bit of cleanup
* Fix for slow tests
* Style
* Replace mecab-python3 with fugashi
This replaces mecab-python3 with fugashi for Japanese tokenization. I am
the maintainer of both projects.
Both projects are MeCab wrappers, so the underlying C++ code is the
same. fugashi is the newer wrapper and doesn't use SWIG, so for basic
use of the MeCab API it's easier to use.
This code insures the use of a version of ipadic installed via pip,
which should make versioning and tracking down issues easier.
fugashi has wheels for Windows, OSX, and Linux, which will help with
issues with installing old versions of mecab-python3 on Windows.
Compared to mecab-python3, because fugashi doesn't use SWIG, it doesn't
require a C++ runtime to be installed on Windows.
In adding this change I removed some code dealing with `cursor`,
`token_start`, and `token_end` variables. These variables didn't seem to
be used for anything, it is unclear to me why they were there.
I ran the tests and they passed, though I couldn't figure out how to run
the slow tests (`--runslow` gave an error) and didn't try testing with
Tensorflow.
* Style fix
* Remove unused variable
Forgot to delete this...
* Adapt doc with install instructions
* Fix typo
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Add onnxruntime transformers optimization support
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Added Optimization section in ONNX/ONNXRuntime documentation.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Improve note reference
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Fixing imports order.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Add warning about different level of optimization between torch and tf export.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Address @LysandreJik wording suggestion
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Address @LysandreJik wording suggestion
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Always optimize model before quantization for maximum performances.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Address comments on the documentation.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Improve TensorFlow optimization message as suggested by @yufenglee
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Removed --optimize parameter
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Warn the user about current quantization limitation when model is larger than 2GB.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Trigger CI for last check
* Small change in print for the optimization section.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* initial commit for pipeline implementation
Addition of input processing and history concatenation
* Conversation pipeline tested and working for single & multiple conversation inputs
* Added docstrings for dialogue pipeline
* Addition of dialogue pipeline integration tests
* Delete test_t5.py
* Fixed max code length
* Updated styling
* Fixed test broken by formatting tools
* Removed unused import
* Added unit test for DialoguePipeline
* Fixed Tensorflow compatibility
* Fixed multi-framework support using framework flag
* - Fixed docstring
- Added `min_length_for_response` as an initialization parameter
- Renamed `*args` to `conversations`, `conversations` being a `Conversation` or a `List[Conversation]`
- Updated truncation to truncate entire segments of conversations, instead of cutting in the middle of a user/bot input
* - renamed pipeline name from dialogue to conversational
- removed hardcoded default value of 1000 and use config.max_length instead
- added `append_response` and `set_history` method to the Conversation class to avoid direct fields mutation
- fixed bug in history truncation method
* - Updated ConversationalPipeline to accept only active conversations (otherwise a ValueError is raised)
* - Simplified input tensor conversion
* - Updated attention_mask value for Tensorflow compatibility
* - Updated last dialogue reference to conversational & fixed integration tests
* Fixed conflict with master
* Updates following review comments
* Updated formatting
* Added Conversation and ConversationalPipeline to the library __init__, addition of docstrings for Conversation, added both to the docs
* Update src/transformers/pipelines.py
Updated docsting following review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Switch from return_tuple to return_dict
* Fix test
* [WIP] Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleC… (#5614)
* Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleChoice} models and tests
* AutoModels
Tiny tweaks
* Style
* Final changes before merge
* Re-order for simpler review
* Final fixes
* Addressing @sgugger's comments
* Test MultipleChoice
* Rework TF trainer (#6038)
* Fully rework training/prediction loops
* fix method name
* Fix variable name
* Fix property name
* Fix scope
* Fix method name
* Fix tuple index
* Fix tuple index
* Fix indentation
* Fix variable name
* fix eval before log
* Add drop remainder for test dataset
* Fix step number + fix logging datetime
* fix eval loss value
* use global step instead of step + fix logging at step 0
* Fix logging datetime
* Fix global_step usage
* Fix breaking loop + logging datetime
* Fix step in prediction loop
* Fix step breaking
* Fix train/test loops
* Force TF at least 2.2 for the trainer
* Use assert_cardinality to facilitate the dataset size computation
* Log steps per epoch
* Make tfds compliant with TPU
* Make tfds compliant with TPU
* Use TF dataset enumerate instead of the Python one
* revert previous commit
* Fix data_dir
* Apply style
* rebase on master
* Address Sylvain's comments
* Address Sylvain's and Lysandre comments
* Trigger CI
* Remove unused import
* Switch from return_tuple to return_dict
* Fix test
* Add recent model
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Julien Plu <plu.julien@gmail.com>
* Added capability to quantize a model while exporting through ONNX.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
We do not support multiple extensions
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Reformat files
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* More quality
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Ensure test_generate_identified_name compares the same object types
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Added documentation everywhere on ONNX exporter
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Use pathlib.Path instead of plain-old string
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Use f-string everywhere
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Use the correct parameters for black formatting
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Use Python 3 super() style.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Use packaging.version to ensure installed onnxruntime version match requirements
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Fixing imports sorting order.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Missing raise(s)
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Added quantization documentation
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Fix some spelling.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Fix bad list header format
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Move torchscript and add ONNX documentation under modle_export
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Let's follow guidelines by the gurus: Renamed torchscript.rst to serialization.rst
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Remove previously introduced tree element
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* WIP doc
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* ONNX documentation
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Fix invalid link
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Improve spelling
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Final wording pass
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Work on tokenizer summary
* Finish tutorial
* Link to it
* Apply suggestions from code review
Co-authored-by: Anthony MOI <xn1t0x@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Add vocab definition
Co-authored-by: Anthony MOI <xn1t0x@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* remove references to old API in docstring - update data processors
* style
* fix tests - better type checking error messages
* better type checking
* include awesome fix by @LysandreJik for #5310
* updated doc and examples
* All done
* Link to the tutorial
* Typo fixes
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
* Add metnion of the return_xxx args
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
* Quicktour part 1
* Update
* All done
* Typos
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
* Address comments in quick tour
* Update docs/source/quicktour.rst
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update from feedback
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* ElectraForQuestionAnswering
* udate __init__
* add test for electra qa model
* add ElectraForQuestionAnswering in auto models
* add ElectraForQuestionAnswering in all_model_classes
* fix outputs, input_ids defaults to None
* add ElectraForQuestionAnswering in docs
* remove commented line
* [hf_api] Attach all unknown attributes for future-proof compatibility
* [Pipeline] NerPipeline is really a TokenClassificationPipeline
* modelcard.py: I don't think we need to force the download
* Remove config, tokenizer from SUPPORTED_TASKS as we're moving to one model = one weight + one tokenizer
* FillMaskPipeline: also output token in string form
* TextClassificationPipeline: option to return all scores, not just the argmax
* Update docs/source/main_classes/pipelines.rst
* Kill model archive maps
* Fixup
* Also kill model_archive_map for MaskedBertPreTrainedModel
* Unhook config_archive_map
* Tokenizers: align with model id changes
* make style && make quality
* Fix CI
* better api
* improve automatic setting of global attention mask
* fix longformer bug
* fix global attention mask in test
* fix global attn mask flatten
* fix slow tests
* update docstring
* update docs and make more robust
* improve attention mask
* add multiple choice for longformer
* add models to docs
* adapt docstring
* add test to longformer
* add longformer for mc in init and modeling auto
* fix tests
* first commit
* bug fixes
* better examples
* undo padding
* remove wrong VOCAB_FILES_NAMES
* License
* make style
* make isort happy
* unit tests
* integration test
* make `black` happy by undoing `isort` changes!!
* lint
* no need for the padding value
* batch_size not bsz
* remove unused type casting
* seqlen not seq_len
* staticmethod
* `bert` selfattention instead of `n2`
* uint8 instead of bool + lints
* pad inputs_embeds using embeddings not a constant
* black
* unit test with padding
* fix unit tests
* remove redundant unit test
* upload model weights
* resolve todo
* simpler _mask_invalid_locations without lru_cache + backward compatible masked_fill_
* increase unittest coverage
"Migrating from pytorch-transformers to transformers" is missing in the main document. It is available in the main `readme` thought. Just move it to the document.
* Created using Colaboratory
* [examples] reorganize files
* remove run_tpu_glue.py as superseded by TPU support in Trainer
* Bugfix: int, not tuple
* move files around
* first copy & past commit from Bert and morgans LSH code
* add easy way to compare to trax original code
* translate most of function
* make trax lsh self attention deterministic with numpy seed + copy paste code
* add same config
* add same config
* make layer init work
* implemented hash_vectors function for lsh attention
* continue reformer translation
* hf LSHSelfAttentionLayer gives same output as trax layer
* refactor code
* refactor code
* refactor code
* refactor
* refactor + add reformer config
* delete bogus file
* split reformer attention layer into two layers
* save intermediate step
* save intermediate step
* make test work
* add complete reformer block layer
* finish reformer layer
* implement causal and self mask
* clean reformer test and refactor code
* fix merge conflicts
* fix merge conflicts
* update init
* fix device for GPU
* fix chunk length init for tests
* include morgans optimization
* improve memory a bit
* improve comment
* factorize num_buckets
* better testing parameters
* make whole model work
* make lm model work
* add t5 copy paste tokenizer
* add chunking feed forward
* clean config
* add improved assert statements
* make tokenizer work
* improve test
* correct typo
* extend config
* add complexer test
* add new axial position embeddings
* add local block attention layer
* clean tests
* refactor
* better testing
* save intermediate progress
* clean test file
* make shorter input length work for model
* allow variable input length
* refactor
* make forward pass for pretrained model work
* add generation possibility
* finish dropout and init
* make style
* refactor
* add first version of RevNet Layers
* make forward pass work and add convert file
* make uploaded model forward pass work
* make uploaded model forward pass work
* refactor code
* add namedtuples and cache buckets
* correct head masks
* refactor
* made reformer more flexible
* make style
* remove set max length
* add attention masks
* fix up tests
* fix lsh attention mask
* make random seed optional for the moment
* improve memory in reformer
* add tests
* make style
* make sure masks work correctly
* detach gradients
* save intermediate
* correct backprob through gather
* make style
* change back num hashes
* rename to labels
* fix rotation shape
* fix detach
* update
* fix trainer
* fix backward dropout
* make reformer more flexible
* fix conflict
* fix
* fix
* add tests for fixed seed in reformer layer
* fix trainer typo
* fix typo in activations
* add fp16 tests
* add fp16 training
* support fp16
* correct gradient bug in reformer
* add fast gelu
* re-add dropout for embedding dropout
* better naming
* better naming
* renaming
* finalize test branch
* finalize tests
* add more tests
* finish tests
* fix
* fix type trainer
* fix fp16 tests
* fix tests
* fix tests
* fix tests
* fix issue with dropout
* fix dropout seeds
* correct random seed on gpu
* finalize random seed for dropout
* finalize random seed for dropout
* remove duplicate line
* correct half precision bug
* make style
* refactor
* refactor
* docstring
* remove sinusoidal position encodings for reformer
* move chunking to modeling_utils
* make style
* clean config
* make style
* fix tests
* fix auto tests
* pretrained models
* fix docstring
* update conversion file
* Update pretrained_models.rst
* fix rst
* fix rst
* update copyright
* fix test path
* fix test path
* fix small issue in test
* include reformer in generation tests
* add docs for axial position encoding
* finish docs
* Update convert_reformer_trax_checkpoint_to_pytorch.py
* remove isort
* include sams comments
* remove wrong comment in utils
* correct typos
* fix typo
* Update reformer.rst
* applied morgans optimization
* make style
* make gpu compatible
* remove bogus file
* big test refactor
* add example for chunking
* fix typo
* add to README
* Add GenerationPipeline
* Fix parameter names
* Correct parameter __call__ parameters
* Add model type attribute and correct function calls for prepare_input
* Take out trailing commas from init attributes
* Remove unnecessary tokenization line
* Implement support for multiple text inputs
* Apply generation support for multiple input text prompts
* Take out tensor coersion
* Take out batch index
* Add text prompt to return sequence
* Squeeze token tensore before decoding
* Return only a single list of sequences if only one prompt was used
* Correct results variable name
* Add GenerationPipeline to SUPPORTED_TASKS with the alias , initalized w GPT2
* Registedred AutoModelWithLMHead for both pt and t
* Update docstring for GenerationPipeline
* Add kwargs parameter to mode.generate
* Take out kwargs parameter after all
* Add generation pipeline example in pipeline docstring
* Fix max length by squeezing tokens tensor
* Apply ensure_tensor_on_device to pytorch tensor
* Include generation step in torch.no_grad
* Take out input from prepare_xlm_input and set 'en' as default xlm_language
* Apply framework specific encoding during prepare_input
* Format w make style
* Move GenerationPipeline import to follow proper import sorting
* Take out training comma from generation dict
* Apply requested changes
* Change name to TextGenerationPipeline
* Apply TextGenerationPipeline rename to __init___
* Changing alias to
* Set input mapping as input to ensure_tensor_on_device
* Fix assertion placement
* Add test_text_generation
* Add TextGenerationPipeline to PipelineCommonTests
* Take out whitespace
* Format __init__ w black
* Fix __init__ style
* Forman __init___
* Add line to end of __init__
* Correct model tokenizer set for test_text_generation
* Ensure to return list of list, not list of string (to pass test)
* Limit test models to only 3 to limit runtime to address circleCI timeout error
* Update src/transformers/pipelines.py
Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/pipelines.py
Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/pipelines.py
Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/pipelines.py
Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/pipelines.py
Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>
* Update tests/test_pipelines.py
Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/pipelines.py
Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/pipelines.py
Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/pipelines.py
Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>
* Remove argument docstring, __init__, add additional __call__ arguments, and reformat results to list of dict
* Fix blank result list
* Add TextGenerationPipeline to pipelines.rst
* Update src/transformers/pipelines.py
Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/pipelines.py
Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>
* Fix typos from adding PADDING_TEXT_TOKEN_LENGTH
* Fix incorrectly moved result list
* Update src/transformers/pipelines.py
Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/pipelines.py
* Update src/transformers/pipelines.py
* Update src/transformers/pipelines.py
* Update src/transformers/pipelines.py
* Update src/transformers/pipelines.py
* Update src/transformers/pipelines.py
* Update src/transformers/pipelines.py
* Update src/transformers/pipelines.py
* Update src/transformers/pipelines.py
* Update src/transformers/pipelines.py
* Update src/transformers/pipelines.py
* Update src/transformers/pipelines.py
Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>
* Add back generation line and make style
* Take out blank whitespace
* Apply new alis, text-generation, to test_pipelines
* Fix text generation alias in test
* Update src/transformers/pipelines.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
* First pass on utility classes and python tokenizers
* finishing cleanup pass
* style and quality
* Fix tests
* Updating following @mfuntowicz comment
* style and quality
* Fix Roberta
* fix batch_size/seq_length inBatchEncoding
* add alignement methods + tests
* Fix OpenAI and Transfo-XL tokenizers
* adding trim_offsets=True default for GPT2 et RoBERTa
* style and quality
* fix tests
* add_prefix_space in roberta
* bump up tokenizers to rc7
* style
* unfortunately tensorfow does like these - removing shape/seq_len for now
* Update src/transformers/tokenization_utils.py
Co-Authored-By: Stefan Schweter <stefan@schweter.it>
* Adding doc and docstrings
* making flake8 happy
Co-authored-by: Stefan Schweter <stefan@schweter.it>
* Add clear description of how to train T5
* correct docstring in T5
* correct typo
* correct docstring format
* update t5 model docs
* implement collins feedback
* fix typo and add more explanation for sentinal tokens
* delete unnecessary todos
* passing
* Undo stupid chg
* docs
* undo rename
* delete-cruft
* only import if you have torch
* Dont rely on dict ordering
* Fix dict ordering upstream
* docstring link
* docstring link
* remove trailing comma for 3.5 compat
* new name
* delegate kwarging
* Update kwargs
* memory benchmark rss
* have both forward pass and line-by-line mem tracing
* cleaned up tracing
* refactored and cleaning up API
* no f-strings yet...
* add GPU mem logging
* fix GPU memory monitoring
* style and quality
* clean up and doc
* update with comments
* Switching to python 3.6+
* fix quality
* Pipeline doc initial commit
* pipeline abstraction
* Remove modelcard argument from pipeline
* Task-specific pipelines can be instantiated with no model or tokenizer
* All pipelines doc
* Usage: Sequence Classification & Question Answering
* Pipeline example
* Language modeling
* TensorFlow code for Sequence classification
* Custom TF/PT toggler in docs
* QA + LM for TensorFlow
* Finish Usage for both PyTorch and TensorFlow
* Addressing Julien's comments
* More assertive
* cleanup
* Favicon
- added favicon option in conf.py along with the favicon image
- udpated 🤗 logo. slightly smaller and should appear more consistent across editing programs (no more tongue on the outside of the mouth)
Co-authored-by: joshchagani <joshua@joshuachagani.com>
Use -e only in docs targeted at contributors.
If a user copy-pastes command line with [--editable], they will hit
an error. If they don't know the --editable option, we're giving them
a choice to make before they can move forwards, but this isn't a choice
they need to make right now.
* Switch to plain unittest for skipping slow tests.
Add a RUN_SLOW environment variable for running them.
* Switch to plain unittest for PyTorch dependency.
* Switch to plain unittest for TensorFlow dependency.
* Avoid leaking open files in the test suite.
This prevents spurious warnings when running tests.
* Fix unicode warning on Python 2 when running tests.
The warning was:
UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
* Support running PyTorch tests on a GPU.
Reverts 27e015bd.
* Tests no longer require pytest.
* Make tests pass on cuda