* NerPipeline (TokenClassification) now outputs offsets of words
- Currently the offsets are missing, forcing the user to pattern-match
the "word" against the input, which is not always feasible.
For instance if a sentence contains the same word twice, then there
is no way to know which is which.
- This PR proposes to fix that by adding 2 new keys to this
pipeline's outputs, "start" and "end", which correspond to the string
offsets of the word. That means we should always have the
invariant:
```python
input[entity["start"]: entity["end"]] == entity["word"]
# the label lives under entity["entity_group"], or entity["entity"] if not grouped
```
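For illustration, here is a hedged sketch of how the new keys could be consumed (the pipeline call and printed values are illustrative, not part of this PR):
```python
from transformers import pipeline

# illustrative usage; "grouped_entities" and the printed values are examples, not real output
ner = pipeline("ner", grouped_entities=True)
text = "My name is Clara and I live in Berkeley."
for entity in ner(text):
    # "start"/"end" are character offsets into the original input string
    print(entity["entity_group"], text[entity["start"]: entity["end"]])
```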
* Fixing doc style
* Slightly increase tolerance between pytorch and flax output
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* test_multiple_sentences doesn't require torch
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Simplify parameterization on "jit" to use boolean rather than str
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Use `require_torch` on `test_multiple_sentences` because we pull the weight from the hub.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Rename "jit" parameter to "use_jit" for (hopefully) making it self-documenting.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Remove pytest.mark.parametrize which seems to fail in some circumstances
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Fix unused imports.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Fix style.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Give default parameters values for traced model.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Review comment: Change sentences to sequences
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Fix decoder not returning hidden states from the last layer
* Resolve conflict
* Change the way to gather hidden states
* Add decoder hidden states test
* Make pytest and black happy
* Remove redundant line
* remove new line
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* fix mems in xlnet
* fix use_mems
* fix use_mem_len
* fix use mems
* clean docs
* fix tf typo
* make xlnet tf for generation work
* fix tf test
* refactor use cache
* add use cache for missing models
* correct use_cache in generate
* correct use cache in tf generate
* fix tf
* correct getattr typo
* make sylvain happy
* change in docs as well
* do not apply to cookie cutter statements
* fix tf test
* make pytorch model fully backward compatible
* bart output hidden states upstream
* same w/ decoder
* add tests
* fix prophetnet
* fix gpt2 and ctrl
* fix fsmt and skip test for reformer and longformer
* fix all models
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* implement support for run-time dependency version checking
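As a rough sketch of what such a runtime check can look like (not the actual helper added here; `require_version` is an illustrative name):
```python
# illustrative sketch of a runtime dependency check
import pkg_resources

def require_version(requirement: str):
    """Raise a helpful error if `requirement` (e.g. "tokenizers>=0.9.4") is not satisfied."""
    try:
        pkg_resources.require(requirement)
    except pkg_resources.DistributionNotFound:
        # the package is not installed at all
        raise ImportError(f"{requirement} is required but not installed")
    except pkg_resources.VersionConflict as exc:
        raise ImportError(f"{requirement} is required but {exc.dist} is installed")
```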
* try not escaping !
* use findall that works on py36
* small tweaks
* autoformatter worship
* simplify
* shorter names
* add support for non-versioned checks
* add deps
* revert
* tokenizers not required, check version only if installed
* make a proper distutils cmd and add make target
* tqdm must be checked before tokenizers
* work around the peculiar DistributionNotFound setup
* handle the rest of packages in setup.py
* fully sync setup.py's install_requires - to check them all
* nit
* make install_requires more readable
* typo
* Update setup.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* restyle
* add types
* simplify
* simplify2
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Support BERT relative position embeddings
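A hedged usage sketch, assuming the feature is selected through a `position_embedding_type` field on the config (the value names below are assumptions):
```python
from transformers import BertConfig, BertModel

# assumption: relative position embeddings are chosen via the config
config = BertConfig(position_embedding_type="relative_key")  # or e.g. "relative_key_query"
model = BertModel(config)
```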
* Fix typo in README.md
* Address review comment
* Fix failing tests
* [tiny] Fix style_doc.py check by adding an empty line to configuration_bert.py
* make fix copies
* fix configs of electra and albert and fix longformer
* remove copy statement from longformer
* fix albert
* fix electra
* Add bert variants forward tests for various position embeddings
* [tiny] Fix style for test_modeling_bert.py
* improve docstring
* [tiny] improve docstring and remove unnecessary dependency
* [tiny] Remove unused import
* re-add to ALBERT
* make embeddings work for ALBERT
* add test for albert
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Add early stopping patience and a minimum threshold the metric must improve by to avoid triggering early stopping, to the pytorch trainer
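A minimal sketch of the patience/threshold logic described here (standalone illustration; names and defaults are assumptions, not the Trainer API):
```python
# illustrative sketch of early-stopping-with-patience logic
class EarlyStopper:
    def __init__(self, patience: int = 3, threshold: float = 0.0):
        self.patience = patience      # evaluations without improvement before stopping
        self.threshold = threshold    # minimum improvement needed to reset the counter
        self.best_metric = None
        self.counter = 0

    def step(self, metric: float, greater_is_better: bool = True) -> bool:
        """Return True when training should stop."""
        improved = (
            self.best_metric is None
            or (greater_is_better and metric > self.best_metric + self.threshold)
            or (not greater_is_better and metric < self.best_metric - self.threshold)
        )
        if improved:
            self.best_metric = metric
            self.counter = 0
        else:
            self.counter += 1
        return self.counter >= self.patience
```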
* Add early stopping test
* Set patience counter to 0 if best metric not defined yet
* Make early stopping a callback. Add callback event for updating the best metric for early stopping callback to trigger on.
* Run make style
* make function name sensible
* Improve new argument docstring wording and hope that flaky CI test passes.
* Use on_evaluation callback instead of custom. Remove some debug printing
* Move early stopping arguments and state into early stopping callback
* Run make style
* Remove old code
* Fix docs formatting. make style went rogue on me.
* Remove copied attributes and fix variable
* Add assertions on training arguments instead of mutating them. Move comment out of public docs.
* Make separate test for early stopping callback. Add test of invalid arguments.
* Run make style... I remembered before CI this time!
* appease flake8
* Add EarlyStoppingCallback to callback docs
* Make docstring of EarlyStoppingCallback match other callbacks.
* Fix typo in docs
* Make ci fail
* Try to make tests actually run?
* CI finally failing?
* Fix CI
* Revert "Fix CI"
This reverts commit ca7923be73.
* Ooops wrong one
* one more try
* Ok ok let's move this elsewhere
* Alternative to globals() (#8667)
* Alternative to globals()
* Error is raised later so return None
* Sentencepiece not installed makes some tokenizers None
* Apply Lysandre wisdom
* Slightly clearer comment?
cc @sgugger
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* `disable_ngram_loss` fix for prophetnet
* add changes documentation
* fix _compute_loss to use mean reduction and -100 for masked tokens & remove unnecessary arguments
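For context, this is the standard PyTorch pattern for a mean-reduced loss that ignores -100 labels (shown for illustration only, not the PR's code):
```python
import torch
import torch.nn.functional as F

logits = torch.randn(2, 5, 320)              # (batch, seq_len, vocab_size)
labels = torch.randint(0, 320, (2, 5))
labels[:, -2:] = -100                         # positions labelled -100 are ignored

# ignore_index skips masked tokens; reduction="mean" averages over the remaining ones
loss = F.cross_entropy(
    logits.view(-1, logits.size(-1)), labels.view(-1), ignore_index=-100, reduction="mean"
)
```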
* mean label smoothing loss
* small refactor
* fix test
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
* working on LongformerForSequenceClassification
* add TFLongformerForMultipleChoice
* add TFLongformerForTokenClassification
* use add_start_docstrings_to_model_forward
* test TFLongformerForSequenceClassification
* test TFLongformerForMultipleChoice
* test TFLongformerForTokenClassification
* remove test from repo
* add test and doc for TFLongformerForSequenceClassification, TFLongformerForTokenClassification, TFLongformerForMultipleChoice
* add requested classes to modeling_tf_auto.py
update dummy_tf_objects
fix tests
fix bugs in requested classes
* pass all tests except test_inputs_embeds
* sync with master
* pass all tests except test_inputs_embeds
* pass all tests
* pass all tests
* work on test_inputs_embeds
* fix style and quality
* make multi choice work
* fix TFLongformerForTokenClassification signature
* fix TFLongformerForMultipleChoice, TFLongformerForSequenceClassification signature
* fix mult choice
* fix mc hint
* fix input embeds
* fix input embeds
* refactor input embeds
* fix copy issue
* apply Sylvain's changes and clean up more
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Adding PrefixConstrainedLogitsProcessor
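For illustration, a hedged example of driving constrained generation with a per-step callback that returns the allowed token ids (model name and the toy constraint are placeholders):
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# toy constraint: only ever allow the tokens of " Paris"
allowed = tokenizer(" Paris", add_special_tokens=False).input_ids

def prefix_allowed_tokens_fn(batch_id, input_ids):
    # called at every generation step; must return a list of allowed token ids
    return allowed

inputs = tokenizer("The capital of France is", return_tensors="pt")
out = model.generate(
    **inputs, max_length=12, prefix_allowed_tokens_fn=prefix_allowed_tokens_fn
)
print(tokenizer.decode(out[0]))
```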
* fixing RAG and style_doc
* fixing black (v20 instead of v19)
* Improving doc in generation_logits_process.py
* Improving docs and typing in generation_utils.py
* docs improvement
* adding test and fixing doc typo
* fixing doc_len
* isort on test
* fixed test
* improve docstring a bit
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Tokenizers should be framework agnostic
* Run the slow tests
* Not testing
* Fix documentation
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Put models in subfolders
* Styling
* Fix imports in tests
* More fixes in test imports
* Sneaky hidden imports
* Fix imports in doc files
* More sneaky imports
* Finish fixing tests
* Fix examples
* Fix path for copies
* More fixes for examples
* Fix dummy files
* More fixes for example
* More model import fixes
* Is this why you're unhappy GitHub?
* Fix imports in convert command
* Use the CI to identify failing tests
* Remove from all examples and tests
* More default switch
* Fixes
* More test fixes
* More fixes
* Last fixes hopefully
* Use the CI to identify failing tests
* Remove from all examples and tests
* More default switch
* Fixes
* More test fixes
* More fixes
* Last fixes hopefully
* Run on the real suite
* Fix slow tests
* Fix passing token_type_ids during GPT2DoubleHeadsModel.generate() if used
and for GPT2LMHeadModel too
* Update tests to check token_type_ids usage in GPT2 models
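A hedged example of what the fixed behaviour enables (extra kwargs such as token_type_ids forwarded through generate; the zeros below are purely illustrative):
```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
# assumption: token_type_ids passed to generate() are now forwarded to the model at each step
token_type_ids = torch.zeros_like(inputs["input_ids"])
output = model.generate(inputs["input_ids"], token_type_ids=token_type_ids, max_length=20)
print(tokenizer.decode(output[0]))
```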
* Simply insert T5Tokenizer's prepare_seq2seq_batch
* Update/Add some 'import'
* fix RunTimeError caused by '.view'
* Moves .view related error avoidance from seq2seq_trainer to inside prophetnet
* Update test_tokenization_prophetnet.py
* Format the test code with black
* Re-format the test code
* Update test_tokenization_prophetnet.py
* Add importing require_torch in the test code
* Add importing BatchEncoding in the test code
* Re-format the test code on Colab
* Fixing roberta for slow-fast tests
* WIP getting equivalence on pipelines
* slow-to-fast equivalence - working on question-answering pipeline
* optional FAISS tests
* Pipeline Q&A
* Move pipeline tests to their own test job again
* update tokenizer to add sequence id methods
* update to tokenizers 0.9.4
* set sentencepiece as optional
* clean up squad
* clean up pipelines to use sequence_ids
* style/quality
* wording
* Switch to use_fast = True by default
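For reference, a small hedged example of what the new default means for callers (model name is a placeholder):
```python
from transformers import AutoTokenizer

fast_tok = AutoTokenizer.from_pretrained("bert-base-cased")                   # fast by default
slow_tok = AutoTokenizer.from_pretrained("bert-base-cased", use_fast=False)   # opt back into slow
print(fast_tok.is_fast, slow_tok.is_fast)  # True False (when a fast implementation exists)
```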
* update tests for use_fast at True by default
* fix rag tokenizer test
* removing protobuf from required dependencies
* fix NER test for use_fast = True by default
* fixing example tests (Q&A examples use slow tokenizers for now)
* protobuf in main deps extras["sentencepiece"] and example deps
* fix protobuf install test
* try to fix seq2seq by switching to slow tokenizers for now
* Update src/transformers/tokenization_utils_base.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/tokenization_utils_base.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update some tests
* Small update
* Apply style
* Use max_position_embeddings
* Create a fake attribute
* Create a fake attribute
* Update wrong name
* Wrong TransfoXL model file
* Keep the common tests agnostic
* Create modeling_tf_dpr.py
* Add TFDPR
* Add back TFPegasus, TFMarian, TFMBart, TFBlenderBot
last commit accidentally deleted these 4 lines, so I am adding them back
* Add TFDPR
* Add TFDPR
* clean up some comments, add TF input-style doc string
* Add TFDPR
* Make return_dict=False as default
* Fix return_dict bug (in .from_pretrained)
* Add get_input_embeddings()
* Create test_modeling_tf_dpr.py
The current version has already passed all 27 tests!
Please see the test run at:
https://colab.research.google.com/drive/1czS_m9zy5k-iSJbzA_DP1k1xAAC_sdkf?usp=sharing
* fix quality
* delete init weights
* run fix copies
* fix repo consis
* del config_class, load_tf_weights
They should be 'pytorch only'
* add config_class back
after removing it, a test failed ... so only removing "use_tf_weights = None" per Lysandre's suggestion
* newline after .. note::
* import tf, np (Necessary for ModelIntegrationTest)
* slow_test from_pretrained with from_pt=True
At the moment we don't have TF weights (since we don't have an official TF model)
Previously, I did not run slow test, so I missed this bug
* Add simple TFDPRModelIntegrationTest
Note that this is just a test that TF and PyTorch give approximately the same output.
However, I could not test with the official DPR repo's output yet
* upload correct tf model
* remove position_ids as missing keys
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: patrickvonplaten <patrick@huggingface.co>
* Add next sentence prediction loss computation
* Apply style
* Fix tests
* Add forgotten import
* Add forgotten import
* Use a new parameter
* Remove kwargs and use positional arguments
* fix typo
* rm use_cdn & references, and implement new hf_bucket_url
* I'm pretty sure we don't need to `read` this file
* same here
* [BIG] file_utils.networking: do not gobble up errors anymore
* Fix CI 😇
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Tiny doc tweak
* Add doc + pass kwarg everywhere
* Add more tests and explain
cc @sshleifer let me know if better
Co-Authored-By: Sam Shleifer <sshleifer@gmail.com>
* Also implement revision in pipelines
In the case where we're passing a task name or a string model identifier
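A hedged usage sketch (model identifier and revision value are placeholders):
```python
from transformers import pipeline

# "revision" can point at a branch, tag, or commit hash on the model hub
nlp = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
    revision="main",
)
print(nlp("I love this!"))
```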
* Fix CI 😇
* Fix CI
* [hf_api] new methods + command line implem
* make style
* Final endpoints post-migration
* Fix post-migration
* Py3.6 compat
cc @stefan-it
Thank you @stas00
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
* add training tests
* correct longformer
* fix docs
* fix some tests
* fix some more train tests
* remove ipdb
* fix multiple edge case model training
* fix funnel and prophetnet
* clean gpt models
* undo renaming of albert
* Add new token classification example
* Remove txt file
* Add test
* With actual testing done
* Less warmup is better
* Update examples/token-classification/run_ner_new.py
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
* Address review comments
* Fix test
* Make Lysandre happy
* Last touches and rename
* Rename in tests
* Address review comments
* More run_ner -> run_ner_old
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
* Output cross-attention with decoder attention output
* Update src/transformers/modeling_bert.py
* add cross-attention for t5 and bart as well
* fix tests
* correct typo in docs
* add Sylvain's and Sam's comments
* correct typo
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Make Trainer evaluation handle dynamic seq_length
* Document behavior.
* Fix test
* Better fix
* Fixes for realsies this time
* Address review comments
* Without forgetting to save...
* Output global_attentions in Longformer models
* make style
* small refactoring
* fix tests
* make fix-copies
* add for tf as well
* remove comments in test
* make fix-copies
* make style
* add docs
* make docstring pretty
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
- The issue is that with previous code we would have the following:
```python
qa_pipeline = (...)
qa_pipeline(question="Where was he born ?", context="")
-> IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)
```
The goal here is to improve this to actually return a ValueError
wherever possible.
While at it, I tried to simplify QuestionArgumentHandler's code to
make it smaller and more compact while keeping backward compatibility.
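With the change, callers can handle the failure explicitly, e.g. (hedged sketch):
```python
from transformers import pipeline

qa_pipeline = pipeline("question-answering")
try:
    qa_pipeline(question="Where was he born ?", context="")
except ValueError as err:
    # an empty context is now rejected with a clear error instead of an opaque IndexError
    print(f"invalid input: {err}")
```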
* Bug fix: NER pipeline shouldn't group separate entities of same type
* style fix
* [Bug Fix] Shouldn't group entities that are both 'B' even if they are same type
(B-type1 B-type1) != (B-type1 I-type1)
[Bug Fix] add an option `ignore_subwords` to ignore subsequent ##wordpieces in predictions, because some models train on only the first token of a word and not on the subsequent wordpieces (the BERT NER default), so it makes sense to do the same at inference time.
The simplest fix is to just group the subwords with the first wordpiece.
[TODO] how to handle ignored scores? just set them to 0 and calculate a zero-invariant mean?
[TODO] handle a different wordpiece_prefix than ## ? possible approaches:
get it from the tokenizer? but currently most tokenizers don't have a wordpiece_prefix property?
have an _is_subword(token)
[Feature add] added an option `skip_special_tokens`, because it was harder to remove them after grouping.
[Additional Changes] remove B/I prefix on returned grouped_entities
[Feature Request/TODO] Return indexes?
[Bug TODO] can't use fast tokenizer with grouped_entities ('BertTokenizerFast' object has no attribute 'convert_tokens_to_string')
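A hedged example of the options discussed above (the printed dictionary is illustrative, not actual model output):
```python
from transformers import pipeline

# group contiguous B-/I- tokens of the same type and fold ##wordpieces into the first wordpiece
ner = pipeline("ner", grouped_entities=True, ignore_subwords=True)
for entity in ner("Hugging Face is based in New York City"):
    print(entity)  # e.g. {"entity_group": "ORG", "word": "Hugging Face", "score": ..., ...}
```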
* use offset_mapping to fix [UNK] token problem
* ignore score for subwords
* modify ner_pipeline test
* modify ner_pipeline test
* modify ner_pipeline test
* ner_pipeline change ignore_subwords default to true
* add ner_pipeline ignore_subword=False test case
* fix offset_mapping index
* fix style again duh
* change is_subword and convert_tokens_to_string logic
* merge tests with new test structure
* change test names
* remove old tests
* ner tests for fast tokenizer
* fast tokenizers have convert_tokens_to_string
* Fix the incorrect merge
Co-authored-by: Ceyda Cinarel <snu-ceyda@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
* make it possible to invoke conftest.py in both test suites without crashing on having the same option added
* perl -pi -e 's|--make_reports|--make-reports|' to be consistent with other opts
* add `pytest --make-reports` to all CIs (and artifacts)
* fix
* Updated ConversationalPipeline to work with encoder-decoder models (e.g. BlenderBot)
* Addition of integration test for EncoderDecoder conversation model
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* [FIX] TextGenerationPipeline is currently broken.
It's most likely due to #8180.
What's missing is a multi vs single string handler at the beginning of
the pipe.
And also there was no testing of this pipeline.
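The missing handler amounts to a normalization step like the following (a sketch, not the code added in this fix):
```python
def normalize_inputs(text_inputs):
    # accept either a single string or a list of strings, always return a list
    if isinstance(text_inputs, str):
        return [text_inputs]
    if isinstance(text_inputs, (list, tuple)) and all(isinstance(t, str) for t in text_inputs):
        return list(text_inputs)
    raise ValueError("expected a string or a list of strings")
```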
* Fixing Conversational tests too.
* first draft
* show design proposition for new generate method
* up
* make better readable
* make first version
* gpt2 tests pass
* make beam search for gpt2 work
* add first encoder-decoder code
* delete typo
* make t5 work
* save intermediate
* make bart work with beam search
* finish beam search bart / t5
* add default kwargs
* make more tests pass
* fix no bad words sampler
* some fixes and tests for all distribution processors
* fix test
* fix rag slow tests
* merge to master
* add nograd to generate
* make all slow tests pass
* speed up generate
* fix edge case bug
* small fix
* correct typo
* add type hints and docstrings
* fix typos in tests
* add beam search tests
* add tests for beam scorer
* fix test rag
* finish beam search tests
* move generation tests into a separate file
* fix generation tests
* more tests
* add aggressive generation tests
* fix tests
* add gpt2 sample test
* add more docstring
* add more docs
* finish doc strings
* apply some more of Sylvain's and Sam's comments
* fix some typos
* make fix copies
* apply Lysandre's and Sylvain's comments
* final corrections on examples
* small fix for reformer
* Replace swish with silu
* revert nn.silu to nn.swish due to older version
* simplify optimized silu conditional and fix format
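The version-dependent selection mentioned here boils down to something like this sketch (transformers' actual helper may differ):
```python
import torch
from packaging import version

if version.parse(torch.__version__) >= version.parse("1.7.0"):
    # torch >= 1.7 ships an optimized native implementation
    silu = torch.nn.functional.silu
else:
    def silu(x):
        # fallback "swish" for older torch versions
        return x * torch.sigmoid(x)
```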
* Update activations.py
* Update activations_tf.py
* Update modeling_flax_utils.py
* Update modeling_openai.py
* add swish testcase
* add pytorch swish testcase
* Add more robust python version check
* more formatting fixes
Co-authored-by: TFUsers <TFUsers@gmail.com>
* Test TF GPU CI
* Change cache
* Fix missing torch requirement
* Fix some model tests
Style
* LXMERT
* MobileBERT
* Longformer skip test
* XLNet
* The rest of the tests
* RAG goes OOM in multi gpu setup
* YAML test files
* Last fixes
* Skip doctests
* Fill mask tests
* Yaml files
* Last test fix
* Style
* Update cache
* Change ONNX tests to slow + use tiny model
* move the helper code into testing_utils
* port test_trainer_distributed to work with pytest
* improve docs
* simplify notes
* doc
* doc
* style
* doc
* further improvements
* torch might not be available
* real fix
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* better reports
* a whole bunch of reports in their own files
* clean up
* improvements
* github artifacts experiment
* style
* complete the report generator with multiple improvements/fixes
* fix
* save all reports under one dir to easy upload
* can remove temp failing tests
* doc fix
* some cleanup
* WIP refactoring pipeline tests - switching to fast tokenizers
* fix dialog pipeline and fill-mask
* refactoring pipeline tests backbone
* make large tests slow
* fix tests (tf Bart inactive for now)
* fix doc...
* clean up for merge
* fixing tests - remove bart from summarization until there is TF
* fix quality and RAG
* Add new translation pipeline tests - fix JAX tests
* only slow for dialog
* Fixing the missing TF-BART imports in modeling_tf_auto
* spin out pipeline tests in separate CI job
* adding pipeline test to CI YAML
* add slow pipeline tests
* speed up the tf and pt joint test to avoid redoing all the standalone pt and tf tests
* Update src/transformers/tokenization_utils_base.py
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
* Update src/transformers/pipelines.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/pipelines.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/testing_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* add require_torch and require_tf in is_pt_tf_cross_test
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Start simplification
* More progress
* Finished script
* Address comments and update tests instructions
* Wrong test
* Accept files as inputs and fix test
* Update src/transformers/trainer_utils.py
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
* Fix labels and add combined score
* Add special labels
* Update TPU command
* Revert to old label strategy
* Use model labels
* Fix for STS-B
* Styling
* Apply suggestions from code review
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
* Code styling
* Fix review comments
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
* Actually make the "translation", "translation_XX_to_YY" task behave correctly.
Background:
- Currently "translation_cn_to_ar" does not work. (only 3 pairs are
supported)
- Some models contain in their config the correct values for the (src,
tgt) pair they can translate. It's usually just one pair, and we can
infer it automatically from `model.config.task_specific_params`. If
it's not defined, we can probably still load the TranslationPipeline
nevertheless.
Proposed fix:
- A simplified version of what could become a more general
`parametrized` task. "translation" + (src, tgt) in this instance
is what we need in the general case. The way we go about it for now
is simply parsing "translation_XX_to_YY". If more cases of parametrized tasks arise,
we should preferably move towards something closer to what `datasets` proposes,
which is having a secondary argument `task_options` that is close
to what that task requires.
- Should be backward compatible in all cases; for instance
`pipeline(task="translation_en_to_de")` should work out of the box.
- Should provide a warning when a specific translation pair has been
selected on behalf of the user using
`model.config.task_specific_params`.
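As a sketch of the parsing described above (illustrative only):
```python
import re

def parse_translation_task(task: str):
    # "translation_XX_to_YY" -> ("translation", "XX", "YY"); plain "translation" -> no pair
    match = re.fullmatch(r"translation(?:_([a-z]{2})_to_([a-z]{2}))?", task)
    if match is None:
        raise KeyError(f"unknown task {task}")
    src, tgt = match.group(1), match.group(2)
    return "translation", src, tgt

print(parse_translation_task("translation_en_to_de"))  # ('translation', 'en', 'de')
```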
* Update src/transformers/pipelines.py
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
* fix config save
* add test
* add config class variable and another test
* line break
* fix fsmt and typo
* god am I making many errors today :-/
* Update src/transformers/configuration_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* slow tests should be slow
* exception note
* style
* integrate LysandreJik's notes with some expansions
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* another slow test
* fix link, and prose
* clarify.
* note from Sam
* typo
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* make the save_load special key tests common
* handle mbart
* cleaner solution
* fix
* move test_save_load_missing_keys back into fsmt for now
* restore
* style
* add marian
* add pegasus
* blenderbot
* revert - no static embed
* add CustomHFIndex
* typo in config
* update tests
* add custom dataset example
* clean script
* update test data
* minor in test
* docs
* docs
* style
* fix imports
* allow to pass the indexed dataset directly
* update tests
* use multiset DPR
* address thom and patrick's comments
* style
* update dpr tokenizer
* add output_dir flag in use_own_knowledge_dataset.py
* allow custom datasets in examples/rag/finetune.py
* add test for custom dataset in distributed rag retriever
* fix 5990
* accommodate an iterable dataset without a predefined length
* support it as one use case: provide max_steps, and NO num_epochs
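In practice this means configuring training roughly as follows (hedged example; values are placeholders):
```python
from transformers import TrainingArguments

# with an iterable dataset that has no __len__, specify max_steps instead of num_train_epochs
args = TrainingArguments(
    output_dir="out",
    max_steps=1000,          # required when the dataset length is unknown
    per_device_train_batch_size=8,
)
```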
* This is a merge of master and PR 5995
* fix trainer test under TF
* fix only for torch
* TF trainer untouched
* trainer tests are skipped when no torch
* address comments
* fix quality checks
* remove torch.dataset from test_trainer
* unnecessary inheritance
* RegressionDataset implements all needed methods __len__ and __getitem__
* fix quality checks
* restore RegressionDataset
* was wrongly under is_torch_available()
* WIP flax bert
* Initial commit Bert Jax/Flax implementation.
* Embeddings working and equivalent to PyTorch.
* Move embeddings in its own module BertEmbeddings
* Added jax.jit annotation on forward call
* BertEncoder on par with PyTorch ! :D
* Add BertPooler on par with PyTorch !!
* Working Jax+Flax implementation of BertModel with < 1e-5 differences on the last layer.
* Fix pooled output to take only the first token of the sequence.
* Refactoring to use BertConfig from transformers.
* Renamed FXBertModel to FlaxBertModel
* Model is now initialized in FlaxBertModel constructor and reused.
* WIP JaxPreTrainedModel
* Cleaning up the code of FlaxBertModel
* Added ability to load Flax model saved through save_pretrained()
* Added ability to convert Pytorch Bert model to FlaxBert
* FlaxBert can now load every Pytorch Bert model with on-the-fly conversion
* Fix hardcoded shape values in conversion scripts.
* Improve the way we handle LayerNorm conversion from PyTorch to Flax.
* Added positional embeddings as parameter of BertModel with default to np.arange.
* Let's roll FlaxRoberta !
* Fix missing position_ids parameters on predict for Bert
* Flax backend now supports batched inputs
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Make it possible to load a msgpacked model when converting from pytorch, as a last resort.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Moved save_pretrained to Jax base class along with more constructor parameters.
* Use specialized, model-dependent conversion function.
* Expose `is_flax_available` in file_utils.
* Added unittest for Flax models.
* Added run_tests_flax to the CI.
* Introduce FlaxAutoModel
* Added more unittests
* Flax models reference the _MODEL_ARCHIVE_MAP from the PyTorch model.
* Addressing review comments.
* Expose seed in both Bert and Roberta
* Fix typo suggested by @stefan-it
Co-Authored-By: Stefan Schweter <stefan@schweter.it>
* Attempt to make style
* Attempt to make style in tests too
* Added jax & jaxlib to the flax optional dependencies.
* Attempt to fix flake8 warnings ...
* Redo black again and again
* When black and flake8 fight each other for a space ... 💥💥💥
* Try removing trailing comma to make both black and flake happy!
* Fix invalid is_<framework>_available call, thanks @LysandreJik 🎉
* Fix another invalid import in flax_roberta test
* Bump and pin flax release to 0.1.0.
* Make flake8 happy, remove unused jax import
* Change the type of the catch for msgpack.
* Remove unused import.
* Put seed as optional constructor parameter.
* trigger ci again
* Fix too many parameters in BertAttention.
* Formatting.
* Simplify Flax unittests to avoid machine crashes.
* Fix invalid number of arguments when raising issue for an unknown model.
* Address @bastings' comment in PR, moving the jax.jit-decorated function outside of __call__
* Fix incorrect path to require_flax/require_pytorch functions.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Attempt to make style.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Correct rebasing of circle-ci dependencies
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Fix import sorting.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Fix unused imports.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Again import sorting...
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Installing missing nlp dependency for flax unittests.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Fix loading of model for Flax implementations.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* jit the inner function call to make JAX-compatible
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Format !
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Flake one more time 🎶
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Rewrites BERT in Flax to the new Linen API (#7211)
* Rewrite Flax HuggingFace PR to Linen
* Some fixes
* Fix tests
* Fix CI with change of name of nlp (#7054)
* nlp -> datasets
* More nlp -> datasets
* Woopsie
* More nlp -> datasets
* One last
* Expose `is_flax_available` in file_utils.
* Added run_tests_flax to the CI.
* Attempt to make style
* trigger ci again
* Fix import sorting.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Revert "Rewrites BERT in Flax to the new Linen API (#7211)"
This reverts commit 23703a5eb3364e26a1cbc3ee34b4710d86a674b0.
* Remove jnp.lax references
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Make style.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Reintroduce Linen changes ...
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Make style.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Use jax native's gelu function.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Renaming BertModel to BertModule to highlight the fact this is the Flax Module object.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Rewrite FlaxAutoModel test to not rely on pretrained_model_archive_map
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Remove unused variable in BertModule.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Remove unused variable in BertModule again
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Attempt to have is_flax_available working again.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Introduce JAX TensorType
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Improve ImportError message when trying to convert to various TensorType format.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Makes Flax model jittable.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Ensure flax models are jittable in unittests.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Remove unused imports.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Ensure jax imports are guarded behind is_flax_available.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Make style.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Make style again
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Make style again again
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Make style again again again
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Update src/transformers/file_utils.py
Co-authored-by: Marc van Zee <marcvanzee@gmail.com>
* Bump flax to its latest version
Co-authored-by: Marc van Zee <marcvanzee@gmail.com>
* Bump jax version to at least 0.2.0
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Style.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Update the unittest to use TensorType.JAX
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* isort import in tests.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Match new flax parameters name "params"
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Remove unused imports.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Add flax models to transformers __init__
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Attempt to address all CI related comments.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Correct circle.yml indent.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Correct circle.yml indent (2)
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Remove coverage from flax tests
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Addressing many naming suggestions from comments
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Simplify for-loop logic to iterate over layers in FlaxBertLayerCollection
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* use f-string syntax for formatting logs.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Use config property from FlaxPreTrainedModel.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* use "cls_token" instead of "first_token" variable name.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* use "hidden_state" instead of "h" variable name.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Correct class reference in docstring to link to Flax related modules.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Added HF + Google Flax team copyright.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Make Roberta independent from Bert
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Move activation functions to flax_utils.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Move activation functions to flax_utils for bert.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Added docstring for BERT
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Update import for Bert and Roberta tokenizers
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Make style.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* fix-copies
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Correct FlaxRobertaLayer to match PyTorch.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Use the same store_artifact for flax unittest
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Style.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Make sure gradient are disabled only locally for flax unittest using torch equivalence.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Use relative imports
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
Co-authored-by: Stefan Schweter <stefan@schweter.it>
Co-authored-by: Marc van Zee <marcvanzee@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Propagate n_docs as a parameter to all RagModel-related functions, defaulting to self.config.n_docs
* Make the n_docs parameter default to None in the marginalize function
* Fixing code quality issues
* Handle the special case when the generator is an instance of T5PreTrainedModel, which does not have n_docs as a parameter
* T5PreTrainedModel does not have n_docs as a parameter
* Addressing review comment
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Correcting comment by addressing review comment
* Adding assert statement verifying that n_docs is correctly set. n_docs should be the same for both retriever and generator.
* Fixing flake8 reported issue
* Correcting test datasets for rag
* Use doc_scores instead of context_input_ids to check the assert, as in RagSequenceForGeneration context_input_ids can be null
* doc_scores' second dimension holds the number of retrieved docs
* Changing assert comment
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* splitting fast and slow tokenizers [WIP]
* [WIP] splitting sentencepiece and tokenizers dependencies
* update dummy objects
* add name_or_path to models and tokenizers
* prefix added to file names
* prefix
* styling + quality
* splitting all the tokenizer files - sorting sentencepiece based ones
* update tokenizer version up to 0.9.0
* remove hard dependency on sentencepiece 🎉
* and removed hard dependency on tokenizers 🎉
* update conversion script
* update missing models
* fixing tests
* move test_tokenization_fast to main tokenization tests - fix bugs
* bump up tokenizers
* fix bert_generation
* update and fix several tokenizers
* keep sentencepiece in deps for now
* fix funnel and deberta tests
* fix fsmt
* fix marian tests
* fix layoutlm
* fix squeezebert and gpt2
* fix T5 tokenization
* fix xlnet tests
* style
* fix mbart
* bump up tokenizers to 0.9.2
* fix model tests
* fix tf models
* fix seq2seq examples
* fix tests without sentencepiece
* fix slow => fast conversion without sentencepiece
* update auto and bert generation tests
* fix mbart tests
* fix auto and common test without tokenizers
* fix tests without tokenizers
* clean up tests, lighten up when tokenizers + sentencepiece are both off
* style quality and tests fixing
* add sentencepiece to doc/examples reqs
* leave sentencepiece on for now
* style/quality, split herbert and fix pegasus
* WIP Herbert fast
* add sample_text_no_unicode and fix herbert tokenization
* skip FSMT example test for now
* fix style
* fix fsmt in example tests
* update following Lysandre and Sylvain's comments
* Update src/transformers/testing_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/testing_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/tokenization_utils_base.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/tokenization_utils_base.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* HerBERT transformer model for Polish language understanding.
* HerbertTokenizerFast generated with HerbertConverter
* Herbert base and large model cards
* Herbert model cards with tags
* Herbert tensorflow models
* Herbert model tests based on Bert test suit
* src/transformers/tokenization_herbert.py edited online with Bitbucket
* src/transformers/tokenization_herbert.py edited online with Bitbucket
* docs/source/model_doc/herbert.rst edited online with Bitbucket
* Herbert tokenizer tests and bug fixes
* src/transformers/configuration_herbert.py edited online with Bitbucket
* Copyrights and tests for TFHerbertModel
* model_cards/allegro/herbert-base-cased/README.md edited online with Bitbucket
* model_cards/allegro/herbert-large-cased/README.md edited online with Bitbucket
* Bug fixes after testing
* Reformat modified_only_fixup
* Proper order of configuration
* Herbert proper documentation formatting
* Formatting with make modified_only_fixup
* Dummies fixed
* Adding missing models to documentation
* Removing HerBERT model as it is a simple extension of BERT
* Update model_cards/allegro/herbert-base-cased/README.md
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
* Update model_cards/allegro/herbert-large-cased/README.md
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
* HerbertTokenizer deprecated configuration removed
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
* Improving Pipelines by defaulting to framework='tf' when
pytorch seems unavailable.
* Actually changing the default resolution order to account for model
defaults
Adding a new test for each pipeline to check that pipeline(task) works
without manually specifying the framework.
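The resolution order roughly amounts to the following sketch (not the actual helper):
```python
from transformers import is_tf_available, is_torch_available

def infer_framework(model=None):
    # prefer the framework the (already loaded) model belongs to, else whatever is installed
    if model is not None:
        return "tf" if model.__class__.__name__.startswith("TF") else "pt"
    if is_torch_available():
        return "pt"
    if is_tf_available():
        return "tf"
    raise RuntimeError("Neither PyTorch nor TensorFlow is installed")
```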
* Add Documentation for GPT-1 Classification
* Add GPT-1 with Classification head
* Add tests for GPT-1 Classification
* Add GPT-1 For Classification to auto models
* Remove authorized missing keys, change checkpoint to openai-gpt
* Reintroduce clean_text call which was removed by mistake in #4723
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Added unittest for clean_text parameter on Bert tokenizer.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Better unittest name.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Adapt unittest to use untrained tokenizer.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Code quality + update test
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
* [WIP] SP tokenizers
* fixing tests for T5
* WIP tokenizers
* serialization
* update T5
* WIP T5 tokenization
* slow to fast conversion script
* Refactoring to move tokenizer implementations inside transformers
* Adding gpt - refactoring - quality
* WIP adding several tokenizers to the fast world
* WIP Roberta - moving implementations
* update to dev4, switch file loading to in-memory loading
* Updating and fixing
* advancing on the tokenizers - updating do_lower_case
* style and quality
* moving forward with tokenizers conversion and tests
* MBart, T5
* dumping the fast version of transformer XL
* Adding to autotokenizers + style/quality
* update init and space_between_special_tokens
* style and quality
* bump up tokenizers version
* add protobuf
* fix pickle Bert JP with Mecab
* fix newly added tokenizers
* style and quality
* fix bert japanese
* fix funnel
* limit tokenizer warning to one occurrence
* clean up file
* fix new tokenizers
* fast tokenizers deep tests
* WIP adding all the special fast tests on the new fast tokenizers
* quick fix
* adding more fast tokenizers in the fast tests
* all tokenizers in fast version tested
* Adding BertGenerationFast
* bump up setup.py for CI
* remove BertGenerationFast (too early)
* bump up tokenizers version
* Clean old docstrings
* Typo
* Update following Lysandre comments
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
* Initial callback proposal
* Finish various callbacks
* Post-rebase conflicts
* Fix tests
* Don't use something that's not set
* Documentation
* Remove unwanted print.
* Document all models can work
* Add tests + small fixes
* Update docs/source/internal/trainer_utils.rst
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Address review comments
* Fix TF tests
* Real fix this time
* This one should work
* Fix typo
* Really fix typo
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* configuration_squeezebert.py
thin wrapper around bert tokenizer
fix typos
wip sb model code
wip modeling_squeezebert.py. Next step is to get the multi-layer-output interface working
set up squeezebert to use BertModelOutput when returning results.
squeezebert documentation
formatting
allow head mask that is an array of [None, ..., None]
docs
docs cont'd
path to vocab
docs and pointers to cloud files (WIP)
line length and indentation
squeezebert model cards
formatting of model cards
untrack modeling_squeezebert_scratchpad.py
update aws paths to vocab and config files
get rid of stub of NSP code, and advise users to pretrain with mlm only
fix rebase issues
redo rebase of modeling_auto.py
fix issues with code formatting
more code format auto-fixes
move squeezebert before bert in tokenization_auto.py and modeling_auto.py because squeezebert inherits from bert
tests for squeezebert modeling and tokenization
fix typo
move squeezebert before bert in modeling_auto.py to fix inheritance problem
disable test_head_masking, since squeezebert doesn't yet implement head masking
fix issues exposed by the test_modeling_squeezebert.py
fix an issue exposed by test_tokenization_squeezebert.py
fix issue exposed by test_modeling_squeezebert.py
auto generated code style improvement
issue that we inherited from modeling_xxx.py: SqueezeBertForMaskedLM.forward() calls self.cls(), but there is no self.cls, and I think the goal was actually to call self.lm_head()
update copyright
resolve failing 'test_hidden_states_output' and remove unused encoder_hidden_states and encoder_attention_mask
docs
add integration test. rename squeezebert-mnli --> squeezebert/squeezebert-mnli
autogenerated formatting tweaks
integrate feedback from patrickvonplaten and sgugger to programming style and documentation strings
* tiny change to order of imports
* Trainer should not modify its TrainingArguments
* Trainer should not modify its TrainingArguments
* Trainer should not modify its TrainingArguments
* Add test of resumed training
* Fixes
* Non multiGPU test
* Clean Trainer state
* Add more to the state
* Documentation
* One last test
* Make resume training test more complete
* Unwanted changes
* GPT2 gradient checkpointing
* find_unused_parameters removed if checkpointing
* find_unused_parameters removed if checkpointing
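A hedged example of enabling it (the `gradient_checkpointing` config flag is an assumption based on the commit titles above):
```python
from transformers import GPT2Config, GPT2LMHeadModel

# assumption: gradient checkpointing is toggled through the model config here
config = GPT2Config.from_pretrained("gpt2", gradient_checkpointing=True)
model = GPT2LMHeadModel.from_pretrained("gpt2", config=config)
```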
* Update src/transformers/configuration_gpt2.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Added a test for generation with checkpointing
* Update src/transformers/configuration_gpt2.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>