Clarify stride option (#22684)

* Clarify stride option

* formatting
Luc CAILLIAU 2023-04-11 15:06:54 +02:00 committed by GitHub
parent 0224aaf67f
commit 06b05d4575
GPG Key ID: 4AEE18F83AFDEB23

@@ -68,7 +68,8 @@ class AggregationStrategy(ExplicitEnum):
same entity together in the predictions or not.
stride (`int`, *optional*):
If stride is provided, the pipeline is applied on all the text. The text is split into chunks of size
-model_max_length. Works only with fast tokenizers and `aggregation_strategy` different from `NONE`.
+model_max_length. Works only with fast tokenizers and `aggregation_strategy` different from `NONE`. The
+value of this argument defines the number of overlapping tokens between chunks.
aggregation_strategy (`str`, *optional*, defaults to `"none"`):
The strategy to fuse (or not) tokens based on the model prediction.
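
For illustration only, here is a minimal sketch of what "number of overlapping tokens between chunks" means. It is not the pipeline's or the tokenizer's actual implementation: `split_with_stride` is a hypothetical helper, the tokens are plain integers, and special tokens are ignored. It assumes each chunk holds at most `max_length` tokens and that consecutive chunks share exactly `stride` tokens.

```python
from typing import List, Sequence


def split_with_stride(tokens: Sequence[int], max_length: int, stride: int) -> List[List[int]]:
    """Split `tokens` into windows of at most `max_length` tokens.

    Hypothetical helper for illustration: consecutive windows share
    `stride` tokens, so each new window starts `max_length - stride`
    tokens after the previous one.
    """
    if not 0 <= stride < max_length:
        raise ValueError("stride must satisfy 0 <= stride < max_length")

    step = max_length - stride  # how far the window advances each time
    chunks: List[List[int]] = []
    start = 0
    while True:
        chunks.append(list(tokens[start:start + max_length]))
        # Stop once the current window reaches the end of the sequence.
        if start + max_length >= len(tokens):
            break
        start += step
    return chunks


if __name__ == "__main__":
    for chunk in split_with_stride(list(range(10)), max_length=4, stride=2):
        print(chunk)
```

With `stride=0` the windows do not overlap at all; a larger `stride` gives each chunk more shared context with its neighbour, at the cost of more chunks to process.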