
Utilities for Tokenizers
This page lists all the utility functions used by the tokenizers, mainly the class [~tokenization_utils_base.PreTrainedTokenizerBase] that implements the common methods between [PreTrainedTokenizer] and [PreTrainedTokenizerFast] and the mixin [~tokenization_utils_base.SpecialTokensMixin].
Most of those are only useful if you are studying the code of the tokenizers in the library.
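
The relationship described above (a mixin for special tokens, a shared base class holding the common methods, and two concrete tokenizer flavors) can be pictured with a small toy model. This is only a sketch of the pattern, not the library's code: every class name prefixed with `Toy`, the `_tokenize` hook, and the way special tokens are wrapped around the output are assumptions made for illustration.

```python
import re
from typing import Dict, List


class ToySpecialTokensMixin:
    """Hypothetical mixin: only stores named special tokens."""

    def __init__(self, **special_tokens: str) -> None:
        self.special_tokens_map: Dict[str, str] = dict(special_tokens)


class ToyTokenizerBase(ToySpecialTokensMixin):
    """Hypothetical shared base: owns the public entry point (__call__)."""

    def __call__(self, text: str) -> List[str]:
        # Shared logic lives here; only the splitting step differs per subclass.
        tokens = self._tokenize(text)
        cls = self.special_tokens_map.get("cls_token")
        sep = self.special_tokens_map.get("sep_token")
        return ([cls] if cls else []) + tokens + ([sep] if sep else [])

    def _tokenize(self, text: str) -> List[str]:
        raise NotImplementedError


class ToySlowTokenizer(ToyTokenizerBase):
    """Toy 'slow' flavor: plain whitespace splitting."""

    def _tokenize(self, text: str) -> List[str]:
        return text.split()


class ToyFastTokenizer(ToyTokenizerBase):
    """Toy 'fast' flavor: regex splitting that separates punctuation."""

    def _tokenize(self, text: str) -> List[str]:
        return re.findall(r"\w+|[^\w\s]", text)


slow = ToySlowTokenizer(cls_token="[CLS]", sep_token="[SEP]")
fast = ToyFastTokenizer(cls_token="[CLS]", sep_token="[SEP]")
print(slow("hello, world"))
print(fast("hello, world"))
```

Both toy subclasses inherit `__call__` and the special-token storage unchanged, which is the point of putting the common methods in one base class.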
PreTrainedTokenizerBase
autodoc tokenization_utils_base.PreTrainedTokenizerBase - __call__ - all
SpecialTokensMixin
autodoc tokenization_utils_base.SpecialTokensMixin
Enums and namedtuples
autodoc tokenization_utils_base.TruncationStrategy
autodoc tokenization_utils_base.CharSpan
autodoc tokenization_utils_base.TokenSpan
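
To give a feel for how small these types are, here is a standard-library sketch that mirrors their shape without importing the library. The definitions below are stand-ins written for illustration, assuming an enum of string-valued truncation modes and two `(start, end)` named tuples with an exclusive `end`; the helper `whitespace_char_spans` is hypothetical. The generated reference entries above remain the authority on the real definitions.

```python
from enum import Enum
from typing import List, NamedTuple


class TruncationStrategy(Enum):
    """Stand-in enum: the possible truncation modes as string values."""

    ONLY_FIRST = "only_first"
    ONLY_SECOND = "only_second"
    LONGEST_FIRST = "longest_first"
    DO_NOT_TRUNCATE = "do_not_truncate"


class CharSpan(NamedTuple):
    """Stand-in: a character range in the original string (end is exclusive)."""

    start: int
    end: int


class TokenSpan(NamedTuple):
    """Stand-in: a token range in an encoded sequence (end is exclusive)."""

    start: int
    end: int


def whitespace_char_spans(text: str) -> List[CharSpan]:
    """Hypothetical helper: character span of each whitespace-separated word."""
    spans: List[CharSpan] = []
    pos = 0
    for word in text.split():
        start = text.index(word, pos)  # search from the end of the previous word
        end = start + len(word)
        spans.append(CharSpan(start, end))
        pos = end
    return spans


text = "Hello tokenizer world"
spans = whitespace_char_spans(text)
second = spans[1]
print(second, repr(text[second.start:second.end]))

# Enum members can be looked up from their string value.
strategy = TruncationStrategy("longest_first")
print(strategy)
```

Because the spans are named tuples, they unpack like ordinary tuples (`start, end = second`) while still offering readable attribute access.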