transformers/docs/source
Thomas Wolf 9aeacb58ba
Adding Fast tokenizers for SentencePiece based tokenizers - Breaking: remove Transfo-XL fast tokenizer (#7141)
* [WIP] SP tokenizers

* fixing tests for T5

* WIP tokenizers

* serialization

* update T5

* WIP T5 tokenization

* slow to fast conversion script

* Refactoring to move tokenizer implementations inside transformers

* Adding gpt - refactoring - quality

* WIP adding several tokenizers to the fast world

* WIP Roberta - moving implementations

* update to dev4, switch file loading to in-memory loading

* Updating and fixing

* advancing on the tokenizers - updating do_lower_case

* style and quality

* moving forward with tokenizers conversion and tests

* MBart, T5

* dumping the fast version of transformer XL

* Adding to autotokenizers + style/quality

* update init and space_between_special_tokens

* style and quality

* bump up tokenizers version

* add protobuf

* fix pickle Bert JP with Mecab

* fix newly added tokenizers

* style and quality

* fix bert japanese

* fix funnel

* limit tokenizer warning to one occurrence

* clean up file

* fix new tokenizers

* fast tokenizers deep tests

* WIP adding all the special fast tests on the new fast tokenizers

* quick fix

* adding more fast tokenizers in the fast tests

* all tokenizers in fast version tested

* Adding BertGenerationFast

* bump up setup.py for CI

* remove BertGenerationFast (too early)

* bump up tokenizers version

* Clean old docstrings

* Typo

* Update following Lysandre comments

Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
2020-10-08 11:32:16 +02:00
_static The toggle actually sticks (#7586) 2020-10-05 11:23:57 -04:00
imgs Guide to fixed-length model perplexity evaluation (#5449) 2020-07-07 16:04:15 -06:00
internal Trainer callbacks (#7596) 2020-10-07 10:50:21 -04:00
main_classes Trainer callbacks (#7596) 2020-10-07 10:50:21 -04:00
model_doc Adding Fast tokenizers for SentencePiece based tokenizers - Breaking: remove Transfo-XL fast tokenizer (#7141) 2020-10-08 11:32:16 +02:00
benchmarks.rst Models doc (#7345) 2020-09-23 13:20:45 -04:00
bertology.rst Models doc (#7345) 2020-09-23 13:20:45 -04:00
conf.py Release: v3.3.1 2020-09-29 14:17:34 -04:00
contributing.md Update installation page and add contributing to the doc (#5084) 2020-06-17 14:01:10 -04:00
converting_tensorflow_models.rst Models doc (#7345) 2020-09-23 13:20:45 -04:00
custom_datasets.rst Models doc (#7345) 2020-09-23 13:20:45 -04:00
examples.md per_device instead of per_gpu/error thrown when argument unknown (#4618) 2020-05-27 11:36:55 -04:00
favicon.ico Adding usage examples for common tasks (#2850) 2020-02-25 13:48:24 -05:00
glossary.rst Models doc (#7345) 2020-09-23 13:20:45 -04:00
index.rst Blenderbot (#7418) 2020-10-07 19:09:23 -04:00
installation.md Make transformers install check positive (#7473) 2020-09-30 07:44:40 -04:00
migration.md Add hugs (#5225) 2020-06-24 07:56:14 -04:00
model_sharing.rst docs: fix model sharing file names (#5855) 2020-09-28 08:17:30 -04:00
model_summary.rst Document RAG again (#7377) 2020-09-28 08:31:46 -04:00
multilingual.rst Models doc (#7345) 2020-09-23 13:20:45 -04:00
notebooks.md Update notebooks (#3620) 2020-04-06 14:32:39 -04:00
perplexity.rst Models doc (#7345) 2020-09-23 13:20:45 -04:00
philosophy.rst Models doc (#7345) 2020-09-23 13:20:45 -04:00
preprocessing.rst Models doc (#7345) 2020-09-23 13:20:45 -04:00
pretrained_models.rst docs(pretrained_models): fix num parameters (#7575) 2020-10-05 07:50:56 -04:00
quicktour.rst Models doc (#7345) 2020-09-23 13:20:45 -04:00
serialization.rst Models doc (#7345) 2020-09-23 13:20:45 -04:00
task_summary.rst Add forgotten return_dict argument in the docs (#7483) 2020-10-01 04:41:29 -04:00
testing.rst Models doc (#7345) 2020-09-23 13:20:45 -04:00
tokenizer_summary.rst Models doc (#7345) 2020-09-23 13:20:45 -04:00
training.rst Models doc (#7345) 2020-09-23 13:20:45 -04:00