Mirror of https://github.com/huggingface/transformers.git
Synced 2025-07-14 18:18:24 +06:00
* Use tokenizers pre-tokenized pipeline
* failing pretokenized test
* Fix is_pretokenized in python
* add pretokenized tests
* style and quality
* better tests for batched pretokenized inputs
* tokenizers clean up - new padding_strategy - split the files
* [HUGE] refactoring tokenizers - padding - truncation - tests
* style and quality
* bump up required tokenizers version to 0.8.0-rc1
* switched padding/truncation API - simpler better backward compat
* updating tests for custom tokenizers
* style and quality - tests on pad
* fix QA pipeline
* fix backward compatibility for max_length only
* style and quality
* Various clean-ups - add verbose
* fix tests
* update docstrings
* Fix tests
* Docs reformatted
* __call__ method documented

Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>

_static
imgs
main_classes
model_doc
benchmarks.md
bertology.rst
conf.py
converting_tensorflow_models.rst
examples.md
favicon.ico
glossary.rst
index.rst
installation.md
migration.md
model_sharing.md
multilingual.rst
notebooks.md
pretrained_models.rst
quickstart.md
serialization.rst
summary.rst
torchscript.rst
usage.rst