* Add deebert code
* Add readme of deebert
* Add test for deebert
Update test for Deebert
* Update DeeBert (README, class names, function refactoring); remove requirements.txt
* Format update
* Update test
* Update readme and model init methods
* Default decoder inputs to encoder ones for T5 if neither are specified.
* Fixing typo, now all tests are passing.
* Changing einsum to operations supported by onnx
* Adding a test to ensure T5 can be exported to onnx op>9
* Modified test for onnx export to make it faster
* Styling changes.
* Styling changes.
* Changing notation for matrix multiplication
Co-authored-by: Abel Riboulot <tkai@protomail.com>
* Added data collator for XLNet language modeling and related calls
Added DataCollatorForXLNetLanguageModeling in data/data_collator.py
to generate necessary inputs for language modeling training with
XLNetLMHeadModel. Also added related arguments, logic and calls in
examples/language-modeling/run_language_modeling.py.
Resolves: #4739, #2008 (partially)
* Changed name to `DataCollatorForPermutationLanguageModeling`
Changed the name of `DataCollatorForXLNetLanguageModeling` to the more general `DataCollatorForPermutationLanguageModelling`.
Removed the `--mlm` flag requirement for the new collator and defined a separate `--plm_probability` flag for its use.
CTRL uses a CLM loss just like GPT and GPT-2, so should work out of the box with this script (provided `past` is taken care of
similar to `mems` for XLNet).
Changed calls and imports appropriately.
* Added detailed comments, changed variable names
Added more detailed comments to `DataCollatorForPermutationLanguageModeling` in `data/data_collator.py` to explain working. Also cleaned up variable names and made them more informative.
* Added tests for new data collator
Added tests in `tests/test_trainer.py` for DataCollatorForPermutationLanguageModeling based on those in DataCollatorForLanguageModeling. A specific test has been added to check for odd-length sequences.
* Fixed styling issues
* Exposing prepare_for_model for both slow & fast tokenizers
* Update method signature
* The traditional style commit
* Hide the warnings behind the verbose flag
* update default truncation strategy and prepare_for_model
* fix tests and prepare_for_models methods
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
* Make QA pipeline supports models with more than 2 outputs such as BART assuming start/end are the two first outputs.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* When using the new padding/truncation paradigm setting padding="max_length" + max_length=X actually pads the input up to max_length.
This result in every sample going through QA pipelines to be of size 384 whatever the actual input size is making the overall pipeline very slow.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Mask padding & question before applying softmax. Softmax has been refactored to operate in log space for speed and stability.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Format.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Use PaddingStrategy.LONGEST instead of DO_NOT_PAD
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Revert "When using the new padding/truncation paradigm setting padding="max_length" + max_length=X actually pads the input up to max_length."
This reverts commit 1b00a9a2
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Trigger CI after unattended failure
* Trigger CI
* Work on tokenizer summary
* Finish tutorial
* Link to it
* Apply suggestions from code review
Co-authored-by: Anthony MOI <xn1t0x@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Add vocab definition
Co-authored-by: Anthony MOI <xn1t0x@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>