Mirror of https://github.com/huggingface/transformers.git (synced 2025-07-31 02:02:21 +06:00)
new length penalty docstring (#19006)
This commit is contained in:
parent f89f16a51e
commit 4157e3cd7e
@@ -148,7 +148,10 @@ class PretrainedConfig(PushToHubMixin):
Parameter for repetition penalty that will be used by default in the `generate` method of the model. 1.0
means no penalty.
length_penalty (`float`, *optional*, defaults to 1):
- Exponential penalty to the length that will be used by default in the `generate` method of the model.
+ Exponential penalty to the length that is used with beam-based generation. It is applied as an exponent to
+ the sequence length, which in turn is used to divide the score of the sequence. Since the score is the log
+ likelihood of the sequence (i.e. negative), `length_penalty` > 0.0 promotes longer sequences, while
+ `length_penalty` < 0.0 encourages shorter sequences.
no_repeat_ngram_size (`int`, *optional*, defaults to 0) -- Value that will be used by default in the
`generate` method of the model for `no_repeat_ngram_size`. If set to int > 0, all ngrams of that size can
only occur once.
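For context, a minimal sketch of the scoring rule the new wording describes (illustration only, written for this note and not taken from the commit): a finished hypothesis' summed log-likelihood is divided by its length raised to `length_penalty`, which is why a positive value promotes longer sequences.

# Illustrative sketch of the rule described above; the function name is chosen
# for the example and is not part of the diff.
def beam_score(sum_logprobs: float, length: int, length_penalty: float) -> float:
    # The summed log-likelihood (a negative number) is divided by
    # length ** length_penalty before hypotheses are compared.
    return sum_logprobs / (length**length_penalty)

# With length_penalty > 0.0 the divisor grows with length, so the (negative)
# score of a longer hypothesis moves closer to zero and it is promoted:
print(beam_score(-4.0, length=5, length_penalty=2.0))    # -0.16
print(beam_score(-6.0, length=10, length_penalty=2.0))   # -0.06 -> preferred
# With length_penalty < 0.0 the divisor shrinks with length, so length is punished:
print(beam_score(-6.0, length=10, length_penalty=-1.0))  # -60.0 -> disfavored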
@@ -138,9 +138,10 @@ class BeamSearchScorer(BeamScorer):
Defines the device type (*e.g.*, `"cpu"` or `"cuda"`) on which this instance of `BeamSearchScorer` will be
allocated.
length_penalty (`float`, *optional*, defaults to 1.0):
- Exponential penalty to the length. 1.0 means no penalty. Set to values < 1.0 in order to encourage the
- model to generate shorter sequences, to a value > 1.0 in order to encourage the model to produce longer
- sequences.
+ Exponential penalty to the length that is used with beam-based generation. It is applied as an exponent to
+ the sequence length, which in turn is used to divide the score of the sequence. Since the score is the log
+ likelihood of the sequence (i.e. negative), `length_penalty` > 0.0 promotes longer sequences, while
+ `length_penalty` < 0.0 encourages shorter sequences.
do_early_stopping (`bool`, *optional*, defaults to `False`):
Whether to stop the beam search when at least `num_beams` sentences are finished per batch or not.
num_beam_hyps_to_keep (`int`, *optional*, defaults to 1):
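A short usage sketch for the parameters documented in this hunk; the concrete values are arbitrary examples and the snippet is only meant to show how the documented keywords fit together, not to replace the full beam-search loop.

# Illustrative construction of a BeamSearchScorer with the arguments documented
# above (values are arbitrary; not taken from this commit).
import torch
from transformers import BeamSearchScorer

scorer = BeamSearchScorer(
    batch_size=2,
    num_beams=4,
    device=torch.device("cpu"),
    length_penalty=1.2,       # > 0.0: finished hypotheses are divided by length**1.2
    do_early_stopping=False,  # keep searching until num_beams hypotheses per batch are done
    num_beam_hyps_to_keep=1,
)
print(scorer.is_done)  # no hypotheses have been finalized yet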
@@ -405,9 +406,10 @@ class ConstrainedBeamSearchScorer(BeamScorer):
Defines the device type (*e.g.*, `"cpu"` or `"cuda"`) on which this instance of `BeamSearchScorer` will be
allocated.
length_penalty (`float`, *optional*, defaults to 1.0):
- Exponential penalty to the length. 1.0 means no penalty. Set to values < 1.0 in order to encourage the
- model to generate shorter sequences, to a value > 1.0 in order to encourage the model to produce longer
- sequences.
+ Exponential penalty to the length that is used with beam-based generation. It is applied as an exponent to
+ the sequence length, which in turn is used to divide the score of the sequence. Since the score is the log
+ likelihood of the sequence (i.e. negative), `length_penalty` > 0.0 promotes longer sequences, while
+ `length_penalty` < 0.0 encourages shorter sequences.
do_early_stopping (`bool`, *optional*, defaults to `False`):
Whether to stop the beam search when at least `num_beams` sentences are finished per batch or not.
num_beam_hyps_to_keep (`int`, *optional*, defaults to 1):
@@ -455,10 +455,10 @@ class TFGenerationMixin:
eos_token_id (`int`, *optional*):
The id of the *end-of-sequence* token.
length_penalty (`float`, *optional*, defaults to 1.0):
- Exponential penalty to the length. 1.0 means no penalty.
-
- Set to values < 1.0 in order to encourage the model to generate shorter sequences, to a value > 1.0 in
- order to encourage the model to produce longer sequences.
+ Exponential penalty to the length that is used with beam-based generation. It is applied as an exponent
+ to the sequence length, which in turn is used to divide the score of the sequence. Since the score is
+ the log likelihood of the sequence (i.e. negative), `length_penalty` > 0.0 promotes longer sequences,
+ while `length_penalty` < 0.0 encourages shorter sequences.
no_repeat_ngram_size (`int`, *optional*, defaults to 0):
If set to int > 0, all ngrams of that size can only occur once.
bad_words_ids(`List[int]`, *optional*):
@@ -1419,10 +1419,10 @@ class TFGenerationMixin:
eos_token_id (`int`, *optional*):
The id of the *end-of-sequence* token.
length_penalty (`float`, *optional*, defaults to 1.0):
- Exponential penalty to the length. 1.0 means no penalty.
-
- Set to values < 1.0 in order to encourage the model to generate shorter sequences, to a value > 1.0 in
- order to encourage the model to produce longer sequences.
+ Exponential penalty to the length that is used with beam-based generation. It is applied as an exponent
+ to the sequence length, which in turn is used to divide the score of the sequence. Since the score is
+ the log likelihood of the sequence (i.e. negative), `length_penalty` > 0.0 promotes longer sequences,
+ while `length_penalty` < 0.0 encourages shorter sequences.
no_repeat_ngram_size (`int`, *optional*, defaults to 0):
If set to int > 0, all ngrams of that size can only occur once.
bad_words_ids(`List[int]`, *optional*):
@@ -2657,7 +2657,10 @@ class TFGenerationMixin:
eos_token_id (`int`, *optional*):
The id of the *end-of-sequence* token.
length_penalty (`float`, *optional*, defaults to 1.0):
- Exponential penalty to the length. 1.0 means no penalty.
+ Exponential penalty to the length that is used with beam-based generation. It is applied as an exponent
+ to the sequence length, which in turn is used to divide the score of the sequence. Since the score is
+ the log likelihood of the sequence (i.e. negative), `length_penalty` > 0.0 promotes longer sequences,
+ while `length_penalty` < 0.0 encourages shorter sequences.
early_stopping (`bool`, *optional*, defaults to `False`):
Whether to stop the beam search when at least `num_beams` sentences are finished per batch or not.
logits_processor (`[TFLogitsProcessorList]`, *optional*):
@@ -1005,9 +1005,10 @@ class GenerationMixin:
eos_token_id (`int`, *optional*, defaults to `model.config.eos_token_id`):
The id of the *end-of-sequence* token.
length_penalty (`float`, *optional*, defaults to `model.config.length_penalty` or 1.0 if the config does not set any value):
- Exponential penalty to the length. 1.0 means that the beam score is penalized by the sequence length.
- 0.0 means no penalty. Set to values < 0.0 in order to encourage the model to generate longer
- sequences, to a value > 0.0 in order to encourage the model to produce shorter sequences.
+ Exponential penalty to the length that is used with beam-based generation. It is applied as an exponent
+ to the sequence length, which in turn is used to divide the score of the sequence. Since the score is
+ the log likelihood of the sequence (i.e. negative), `length_penalty` > 0.0 promotes longer sequences,
+ while `length_penalty` < 0.0 encourages shorter sequences.
no_repeat_ngram_size (`int`, *optional*, defaults to `model.config.no_repeat_ngram_size` or 0 if the config does not set any value):
If set to int > 0, all ngrams of that size can only occur once.
encoder_no_repeat_ngram_size (`int`, *optional*, defaults to `model.config.encoder_no_repeat_ngram_size` or 0 if the config does not set any value):
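For convenience, an illustrative generate() call that passes the arguments documented in this hunk; the checkpoint, input text, and parameter values are example assumptions, not part of the commit.

# Example only: beam search with an explicit length_penalty and
# no_repeat_ngram_size. Checkpoint and input are arbitrary choices.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large-cnn")

inputs = tokenizer(
    "The tower is 324 metres tall, about the same height as an 81-storey building.",
    return_tensors="pt",
)
summary_ids = model.generate(
    inputs["input_ids"],
    num_beams=4,
    length_penalty=2.0,      # > 0.0 promotes longer summaries (score divided by length**2.0)
    no_repeat_ngram_size=3,  # no 3-gram may appear twice
    early_stopping=True,
)
print(tokenizer.batch_decode(summary_ids, skip_special_tokens=True))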
@@ -107,7 +107,10 @@ class FSMTConfig(PretrainedConfig):
Number of beams for beam search that will be used by default in the `generate` method of the model. 1 means
no beam search.
length_penalty (`float`, *optional*, defaults to 1)
- Exponential penalty to the length that will be used by default in the `generate` method of the model.
+ Exponential penalty to the length that is used with beam-based generation. It is applied as an exponent to
+ the sequence length, which in turn is used to divide the score of the sequence. Since the score is the log
+ likelihood of the sequence (i.e. negative), `length_penalty` > 0.0 promotes longer sequences, while
+ `length_penalty` < 0.0 encourages shorter sequences.
early_stopping (`bool`, *optional*, defaults to `False`)
Flag that will be used by default in the `generate` method of the model. Whether to stop the beam search
when at least `num_beams` sentences are finished per batch or not.
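As the FSMTConfig wording notes, the value stored on the config only supplies the default picked up by `generate`; a brief illustration follows (the checkpoint and value are arbitrary examples, not from the diff).

# Example only: a length_penalty saved on the config becomes generate()'s default.
from transformers import FSMTConfig

config = FSMTConfig.from_pretrained("facebook/wmt19-en-de")
config.length_penalty = 1.1  # > 0.0: favor longer translations unless generate() overrides it
print(config.length_penalty)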
@@ -1463,10 +1463,10 @@ class RagTokenForGeneration(RagPreTrainedModel):
eos_token_id (`int`, *optional*):
The id of the *end-of-sequence* token.
length_penalty (`float`, *optional*, defaults to 1.0):
- Exponential penalty to the length. 1.0 means no penalty.
-
- Set to values < 1.0 in order to encourage the model to generate shorter sequences, to a value > 1.0 in
- order to encourage the model to produce longer sequences.
+ Exponential penalty to the length that is used with beam-based generation. It is applied as an exponent
+ to the sequence length, which in turn is used to divide the score of the sequence. Since the score is
+ the log likelihood of the sequence (i.e. negative), `length_penalty` > 0.0 promotes longer sequences,
+ while `length_penalty` < 0.0 encourages shorter sequences.
no_repeat_ngram_size (`int`, *optional*, defaults to 0):
If set to int > 0, all ngrams of that size can only occur once.
encoder_no_repeat_ngram_size (`int`, *optional*, defaults to 0):
@@ -1054,10 +1054,10 @@ class TFRagTokenForGeneration(TFRagPreTrainedModel, TFCausalLanguageModelingLoss
eos_token_id (`int`, *optional*):
The id of the *end-of-sequence* token.
length_penalty (`float`, *optional*, defaults to 1.0):
- Exponential penalty to the length. 1.0 means no penalty.
-
- Set to values < 1.0 in order to encourage the model to generate shorter sequences, to a value > 1.0 in
- order to encourage the model to produce longer sequences.
+ Exponential penalty to the length that is used with beam-based generation. It is applied as an exponent
+ to the sequence length, which in turn is used to divide the score of the sequence. Since the score is
+ the log likelihood of the sequence (i.e. negative), `length_penalty` > 0.0 promotes longer sequences,
+ while `length_penalty` < 0.0 encourages shorter sequences.
no_repeat_ngram_size (`int`, *optional*, defaults to 0):
If set to int > 0, all ngrams of that size can only occur once.
bad_words_ids(`List[int]`, *optional*):