* Add return lengths
* make pad a bit more flexible so it can be used as collate_fn
* check all kwargs sent to encoding method are known
* fixing kwargs in encodings
* New AddedToken class in python
This class lets you specify custom tokenization behavior for some special tokens. It is used in particular for GPT2 and Roberta, to control how whitespace is stripped around special tokens.
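As a rough illustration of the whitespace behavior described above, here is a minimal standalone sketch. It does not use the library class: `SketchAddedToken` and `split_on_token` are hypothetical names introduced only for this example, and the `lstrip`/`rstrip` flags are assumed to mean "absorb whitespace on that side of the token".

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchAddedToken:
    """Hypothetical stand-in for an added-token description (not the library class)."""

    content: str
    lstrip: bool = False  # assumed meaning: absorb whitespace left of the token
    rstrip: bool = False  # assumed meaning: absorb whitespace right of the token


def split_on_token(text, token):
    """Split text around the token, optionally swallowing adjacent whitespace."""
    left = r"\s*" if token.lstrip else ""
    right = r"\s*" if token.rstrip else ""
    pattern = left + "(" + re.escape(token.content) + ")" + right
    # re.split keeps the captured group (the token itself) in the result;
    # empty pieces at the edges are dropped.
    return [piece for piece in re.split(pattern, text) if piece]


mask_plain = SketchAddedToken("<mask>")
mask_lstrip = SketchAddedToken("<mask>", lstrip=True)

plain_pieces = split_on_token("Hello <mask> world", mask_plain)
lstrip_pieces = split_on_token("Hello <mask> world", mask_lstrip)
```

With `lstrip=True` the space before `<mask>` is attached to the token match and dropped, so the preceding piece no longer ends in a space.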
* style and quality
* switched to huggingface tokenizers library for AddedTokens
* upgrade to tokenizers 0.8.0-rc3 - update API to use AddedToken state
* style and quality
* do not raise an error on additional or unused kwargs for tokenize() but only a warning
* transfo-xl pretrained model requires torch
* Update src/transformers/tokenization_utils.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Quicktour part 1
* Update
* All done
* Typos
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
* Address comments in quick tour
* Update docs/source/quicktour.rst
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update from feedback
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Cleaner warning when loading pretrained models
This makes the logging messages more explicit when using the various `from_pretrained` methods. These messages are now also emitted with `logging.warning`, because this is a common source of silent mistakes.
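A minimal sketch of the idea, assuming the warnings concern weights that differ between the model and the loaded checkpoint. `report_key_mismatch`, the logger name, and the message wording are hypothetical and are not the code in `modeling_utils.py`.

```python
import logging

logger = logging.getLogger("pretrained_loading_sketch")  # hypothetical logger name


def report_key_mismatch(model_keys, checkpoint_keys, model_name="model"):
    """Warn about weights that differ between a model and a checkpoint.

    Hypothetical helper for illustration; returns (missing, unexpected).
    """
    missing = sorted(set(model_keys) - set(checkpoint_keys))
    unexpected = sorted(set(checkpoint_keys) - set(model_keys))
    if unexpected:
        # Checkpoint weights with no counterpart in the model are ignored.
        logger.warning(
            "Some checkpoint weights were not used when initializing %s: %s",
            model_name,
            unexpected,
        )
    if missing:
        # Model weights absent from the checkpoint keep their fresh initialization.
        logger.warning(
            "Some weights of %s were not initialized from the checkpoint: %s",
            model_name,
            missing,
        )
    return missing, unexpected


missing, unexpected = report_key_mismatch(
    model_keys=["encoder.weight", "classifier.weight"],
    checkpoint_keys=["encoder.weight", "lm_head.weight"],
)
```

Using the warning level means the messages show up under Python's default logging configuration, whereas info-level messages would be hidden unless the user raised the verbosity.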
* Update src/transformers/modeling_utils.py
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
* Update src/transformers/modeling_utils.py
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
* style and quality
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
* fix #5081 and improve backward compatibility (slightly)
* add nlp to setup.cfg - style and quality
* align default to previous default
* remove test that doesn't generalize
* add support for gradient checkpointing in BERT
* fix unit tests
* isort
* black
* workaround for `torch.utils.checkpoint.checkpoint` not accepting bool
* Revert "workaround for `torch.utils.checkpoint.checkpoint` not accepting bool"
This reverts commit 5eb68bb804.
* workaround for `torch.utils.checkpoint.checkpoint` not accepting bool
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Configure all models to use output_hidden_states as argument passed to forward()
* Pass all tests
* Remove cast_bool_to_primitive in TF Flaubert model
* correct tf xlnet
* add pytorch test
* add tf test
* Fix broken tests
* Refactor output_hidden_states for mobilebert
* Reset and remerge to master
Co-authored-by: Joseph Liu <joseph.liu@coinflex.com>
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>