mirror of
https://github.com/huggingface/transformers.git
synced 2025-07-31 18:22:34 +06:00
Add note to resize token embeddings matrix when adding new tokens to voc (#10331)
parent
19e737b93e
commit
e73a3e1891
@ -971,6 +971,12 @@ class SpecialTokensMixin:
        Add a list of new tokens to the tokenizer class. If the new tokens are not in the vocabulary, they are added to
        it with indices starting from length of the current vocabulary.

        .. Note::
            When adding new tokens to the vocabulary, you should make sure to also resize the token embedding matrix of
            the model so that its embedding matrix matches the tokenizer.

            In order to do that, please use the :meth:`~transformers.PreTrainedModel.resize_token_embeddings` method.

        Args:
            new_tokens (:obj:`str`, :obj:`tokenizers.AddedToken` or a list of `str` or :obj:`tokenizers.AddedToken`):
                Tokens are only added if they are not already in the vocabulary. :obj:`tokenizers.AddedToken` wraps a
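
The note in this hunk can be illustrated with a small toy sketch. It does not use the transformers library: `ToyTokenizer` and `resize_embeddings` are hypothetical stand-ins written in plain Python, assumed here only to show why ids handed out to new tokens fall outside an embedding matrix that has not been resized. With the real library, the pattern usually shown is `tokenizer.add_tokens(...)` followed by `model.resize_token_embeddings(len(tokenizer))`.

```python
import random


class ToyTokenizer:
    """Hypothetical stand-in for a tokenizer; not the transformers API."""

    def __init__(self, vocab):
        # Map each token to its index in the vocabulary.
        self.vocab = {tok: i for i, tok in enumerate(vocab)}

    def __len__(self):
        return len(self.vocab)

    def add_tokens(self, new_tokens):
        # Only tokens missing from the vocabulary are added; each new token
        # gets the next index, starting from the current vocabulary length.
        added = 0
        for tok in new_tokens:
            if tok not in self.vocab:
                self.vocab[tok] = len(self.vocab)
                added += 1
        return added


def resize_embeddings(matrix, new_size, dim, rng):
    """Return a matrix with new_size rows: existing rows are copied,
    any extra rows are randomly initialised (toy initialisation)."""
    resized = [row[:] for row in matrix[:new_size]]
    while len(resized) < new_size:
        resized.append([rng.gauss(0.0, 0.02) for _ in range(dim)])
    return resized


rng = random.Random(0)
dim = 4
tokenizer = ToyTokenizer(["[PAD]", "[UNK]", "hello", "world"])

# One embedding row per token in the original vocabulary.
embeddings = [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(len(tokenizer))]

# "hello" already exists, so only two of the three tokens are added.
num_added = tokenizer.add_tokens(["new_tok1", "hello", "new_tok2"])
new_id = tokenizer.vocab["new_tok2"]

# Before resizing, the new id points past the end of the embedding matrix.
out_of_range_before = new_id >= len(embeddings)

# Resize so that the matrix has one row per token again.
old_rows = [row[:] for row in embeddings]
embeddings = resize_embeddings(embeddings, len(tokenizer), dim, rng)

print(num_added, new_id, out_of_range_before, len(embeddings))
```

In this sketch the rows for the original tokens are left untouched and only the added rows are freshly initialised, which is the behaviour the note asks the caller to arrange on the model side after extending the vocabulary.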