Utilities for Tokenizers
-----------------------------------------------------------------------------------------------------------------------

This page lists all the utility functions used by the tokenizers, mainly the class
:class:`~transformers.tokenization_utils_base.PreTrainedTokenizerBase`, which implements the methods shared by
:class:`~transformers.PreTrainedTokenizer` and :class:`~transformers.PreTrainedTokenizerFast`, and the mixin
:class:`~transformers.tokenization_utils_base.SpecialTokensMixin`.

Most of these are only useful if you are studying the code of the tokenizers in the library.
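The relationship described above can be pictured with a small, self-contained sketch. The classes below are toy
stand-ins, not the library's code: every name ending in ``Sketch``, the default special tokens and the whitespace
splitting rules are assumptions made for illustration. The sketch only shows the general shape, where a mixin holds the
special tokens, a base class implements the shared ``__call__`` on top of it, and each concrete tokenizer supplies its
own tokenization step.

.. code-block:: python

    import re


    class SpecialTokensMixinSketch:
        """Toy stand-in for a special-tokens mixin (hypothetical, not the library class)."""

        def __init__(self, pad_token="[PAD]", unk_token="[UNK]"):
            self.pad_token = pad_token
            self.unk_token = unk_token


    class TokenizerBaseSketch(SpecialTokensMixinSketch):
        """Toy stand-in for a shared base class: the common logic lives here."""

        def __call__(self, text, max_length=None):
            tokens = self._tokenize(text)  # supplied by each subclass
            if max_length is not None:
                tokens = tokens[:max_length]  # truncate to at most max_length tokens
                # pad with the mixin's pad token up to max_length
                tokens = tokens + [self.pad_token] * (max_length - len(tokens))
            return tokens

        def _tokenize(self, text):
            raise NotImplementedError


    class SlowTokenizerSketch(TokenizerBaseSketch):
        def _tokenize(self, text):
            return text.split()  # plain string splitting


    class FastTokenizerSketch(TokenizerBaseSketch):
        _pattern = re.compile(r"\S+")

        def _tokenize(self, text):
            return self._pattern.findall(text)  # same result through a different backend


    slow = SlowTokenizerSketch()
    fast = FastTokenizerSketch()
    padded = slow("hello world", max_length=4)
    same = fast("hello world", max_length=4) == padded

Because truncation and padding are written once in the base class, both toy tokenizers behave identically from the
caller's point of view even though their tokenization steps differ.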

PreTrainedTokenizerBase
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.tokenization_utils_base.PreTrainedTokenizerBase
    :special-members: __call__
    :members:


SpecialTokensMixin
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.tokenization_utils_base.SpecialTokensMixin
    :members:


Enums and namedtuples
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.tokenization_utils_base.ExplicitEnum

.. autoclass:: transformers.tokenization_utils_base.PaddingStrategy

.. autoclass:: transformers.tokenization_utils_base.TensorType

.. autoclass:: transformers.tokenization_utils_base.TruncationStrategy

.. autoclass:: transformers.tokenization_utils_base.CharSpan

.. autoclass:: transformers.tokenization_utils_base.TokenSpan
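As a rough illustration of the pattern behind the classes listed in this section, here is a self-contained sketch built
on the standard library only. It does not import the library: the ``...Sketch`` names, the member values, the error
message and the end-exclusive span convention are assumptions chosen for the example, so check the generated reference
for the real definitions. The idea is an enum that reports the accepted values when given an unknown one, plus a
lightweight named tuple describing a span.

.. code-block:: python

    from enum import Enum
    from typing import NamedTuple


    class ExplicitEnumSketch(Enum):
        """Enum whose failed lookup lists the accepted values (hypothetical stand-in)."""

        @classmethod
        def _missing_(cls, value):
            # Called by Enum when no member has the requested value.
            valid = [member.value for member in cls]
            raise ValueError(f"{value!r} is not a valid {cls.__name__}, please select one of {valid}")


    class PaddingStrategySketch(ExplicitEnumSketch):
        # Member values are assumptions for illustration.
        LONGEST = "longest"
        MAX_LENGTH = "max_length"
        DO_NOT_PAD = "do_not_pad"


    class CharSpanSketch(NamedTuple):
        """Character span in a string: start inclusive, end exclusive (assumed convention)."""

        start: int
        end: int


    strategy = PaddingStrategySketch("max_length")  # lookup by value
    span = CharSpanSketch(start=0, end=5)
    word = "hello world"[span.start:span.end]

    try:
        PaddingStrategySketch("bogus")
    except ValueError as err:
        message = str(err)

Raising from ``_missing_`` keeps the error close to the call site and tells the caller which strings are accepted, which
is convenient when a strategy is passed around as a plain string argument.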