Mirror of https://github.com/huggingface/transformers.git (synced 2025-08-01 18:51:14 +06:00)
Update chat template warnings/guides (#27634)
* Update default ChatML template
* Update docs/warnings
* Update docs/source/en/chat_templating.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Slight rework

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
parent ce31508134
commit 74a3cebfa5
@@ -376,7 +376,10 @@ input formats. Our default template for models that don't have a class-specific
 ```
 
 If you like this one, here it is in one-liner form, ready to copy into your code. The one-liner also includes
-handy support for "generation prompts" - see the next section for more!
+handy support for [generation prompts](#what-are-generation-prompts), but note that it doesn't add BOS or EOS tokens!
+If your model expects those, they won't be added automatically by `apply_chat_template` - in other words, the
+text will be tokenized with `add_special_tokens=False`. This is to avoid potential conflicts between the template and
+the `add_special_tokens` logic. If your model expects special tokens, make sure to add them to the template!
 
 ```
 tokenizer.chat_template = "{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% for message in messages %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}"
@@ -1786,7 +1786,7 @@ class PreTrainedTokenizerBase(SpecialTokensMixin, PushToHubMixin):
 """
 logger.warning_once(
     "\nNo chat template is defined for this tokenizer - using a default chat template "
-    "that implements the ChatML format. If the default is not appropriate for "
+    "that implements the ChatML format (without BOS/EOS tokens!). If the default is not appropriate for "
     "your model, please set `tokenizer.chat_template` to an appropriate template. "
     "See https://huggingface.co/docs/transformers/main/chat_templating for more information.\n"
 )
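To make the note about BOS/EOS tokens concrete, here is a minimal sketch in plain Python of what the ChatML one-liner in the first hunk renders. `render_chatml` and its `bos_token` parameter are hypothetical names introduced for illustration and are not part of transformers. Where a BOS token belongs depends on the model, and `<s>` is only a placeholder.

```python
from typing import Dict, List


def render_chatml(
    messages: List[Dict[str, str]],
    add_generation_prompt: bool = False,
    bos_token: str = "",
) -> str:
    """Hand-written equivalent of the ChatML one-liner template (illustrative only).

    `bos_token` is an assumption added for this sketch: the template in the
    commit emits no BOS/EOS tokens, so it defaults to an empty string.
    """
    text = bos_token  # empty by default, matching the template in the commit
    for message in messages:
        # Same concatenation as the template's for-loop body.
        text += "<|im_start|>" + message["role"] + "\n" + message["content"] + "<|im_end|>" + "\n"
    if add_generation_prompt:
        # Opens an assistant turn so the model continues from here.
        text += "<|im_start|>assistant\n"
    return text


chat = [
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi there."},
]

# Default behaviour: no BOS/EOS anywhere in the rendered text.
print(render_chatml(chat, add_generation_prompt=True))

# A model that expects a BOS token needs it written out explicitly.
print(render_chatml(chat, bos_token="<s>"))
```

With a real tokenizer, the equivalent step would be to write the token into the template string itself, since the rendered text is tokenized with `add_special_tokens=False`.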