transformers/docs/source/main_classes
Thomas Wolf 827d6d6ef0
Cleanup fast tokenizers integration (#3706)
* First pass on utility classes and python tokenizers

* finishing cleanup pass

* style and quality

* Fix tests

* Updating following @mfuntowicz comment

* style and quality

* Fix Roberta

* fix batch_size/seq_length in BatchEncoding

* add alignment methods + tests

* Fix OpenAI and Transfo-XL tokenizers

* adding trim_offsets=True default for GPT2 and RoBERTa

* style and quality

* fix tests

* add_prefix_space in roberta

* bump up tokenizers to rc7

* style

* unfortunately tensorflow doesn't like these - removing shape/seq_len for now

* Update src/transformers/tokenization_utils.py

Co-Authored-By: Stefan Schweter <stefan@schweter.it>

* Adding doc and docstrings

* making flake8 happy

Co-authored-by: Stefan Schweter <stefan@schweter.it>
2020-04-18 13:43:57 +02:00
..
configuration.rst GPU text generation: Moved the encoded_prompt to correct device 2020-01-06 15:11:12 +01:00
model.rst GPU text generation: Moved the encoded_prompt to correct device 2020-01-06 15:11:12 +01:00
optimizer_schedules.rst TF ALBERT + TF Utilities + Fix warnings 2020-01-23 09:38:45 -05:00
pipelines.rst Add Summarization to Pipelines (#3128) 2020-03-17 18:04:21 -04:00
processors.rst [examples] rename run_lm_finetuning to run_language_modeling 2020-02-07 09:15:28 -05:00
tokenizer.rst Cleanup fast tokenizers integration (#3706) 2020-04-18 13:43:57 +02:00