Mirror of https://github.com/huggingface/transformers.git (synced 2025-07-31 18:22:34 +06:00)
Make doc regarding masked indices more clear.

Signed-off-by: Romain Keramitas <r.keramitas@gmail.com>

parent 43114b89ba
commit 569da80ced
@@ -597,7 +597,7 @@ class AlbertForMaskedLM(AlbertPreTrainedModel):
     r"""
     **masked_lm_labels**: (`optional`) ``torch.LongTensor`` of shape ``(batch_size, sequence_length)``:
         Labels for computing the masked language modeling loss.
-        Indices should be in ``[-1, 0, ..., config.vocab_size]`` (see ``input_ids`` docstring)
+        Indices should be in ``[-100, 0, ..., config.vocab_size]`` (see ``input_ids`` docstring)
         Tokens with indices set to ``-100`` are ignored (masked), the loss is only computed for the tokens with labels
         in ``[0, ..., config.vocab_size]``
 
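The ``-100`` sentinel these docstrings settle on is the default ``ignore_index`` of PyTorch's cross-entropy loss, which is why ``-1`` was wrong. A minimal standalone sketch with toy shapes (not the actual model code) showing that only non-``-100`` positions contribute to the loss:

```python
import torch
import torch.nn.functional as F

# Toy vocabulary of 10 tokens; a batch of one sequence of length 4.
vocab_size = 10
logits = torch.randn(1, 4, vocab_size)

# Only position 2 carries a label; every other position is set to -100,
# so it is ignored (masked) when the loss is computed.
masked_lm_labels = torch.tensor([[-100, -100, 7, -100]])

# F.cross_entropy skips targets equal to ignore_index (default -100),
# matching the corrected docstring range [-100, 0, ..., config.vocab_size].
loss = F.cross_entropy(logits.view(-1, vocab_size), masked_lm_labels.view(-1))

# The result equals the loss over the single labelled position.
manual = F.cross_entropy(logits[0, 2].unsqueeze(0), torch.tensor([7]))
```

With ``-1`` instead of ``-100``, the same call would instead raise an index error or silently train on the padding positions, depending on the loss configuration.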
@@ -826,7 +826,7 @@ class BertForPreTraining(BertPreTrainedModel):
     r"""
     **masked_lm_labels**: (`optional`) ``torch.LongTensor`` of shape ``(batch_size, sequence_length)``:
         Labels for computing the masked language modeling loss.
-        Indices should be in ``[-1, 0, ..., config.vocab_size]`` (see ``input_ids`` docstring)
+        Indices should be in ``[-100, 0, ..., config.vocab_size]`` (see ``input_ids`` docstring)
         Tokens with indices set to ``-100`` are ignored (masked), the loss is only computed for the tokens with labels
         in ``[0, ..., config.vocab_size]``
     **next_sentence_label**: (`optional`) ``torch.LongTensor`` of shape ``(batch_size,)``:
@@ -916,12 +916,12 @@ class BertForMaskedLM(BertPreTrainedModel):
     r"""
     **masked_lm_labels**: (`optional`) ``torch.LongTensor`` of shape ``(batch_size, sequence_length)``:
         Labels for computing the masked language modeling loss.
-        Indices should be in ``[-1, 0, ..., config.vocab_size]`` (see ``input_ids`` docstring)
+        Indices should be in ``[-100, 0, ..., config.vocab_size]`` (see ``input_ids`` docstring)
         Tokens with indices set to ``-100`` are ignored (masked), the loss is only computed for the tokens with labels
         in ``[0, ..., config.vocab_size]``
     **lm_labels**: (`optional`) ``torch.LongTensor`` of shape ``(batch_size, sequence_length)``:
         Labels for computing the left-to-right language modeling loss (next word prediction).
-        Indices should be in ``[-1, 0, ..., config.vocab_size]`` (see ``input_ids`` docstring)
+        Indices should be in ``[-100, 0, ..., config.vocab_size]`` (see ``input_ids`` docstring)
         Tokens with indices set to ``-100`` are ignored (masked), the loss is only computed for the tokens with labels
         in ``[0, ..., config.vocab_size]``
 
@@ -167,7 +167,7 @@ class CamembertForMaskedLM(RobertaForMaskedLM):
     r"""
     **masked_lm_labels**: (`optional`) ``torch.LongTensor`` of shape ``(batch_size, sequence_length)``:
         Labels for computing the masked language modeling loss.
-        Indices should be in ``[-1, 0, ..., config.vocab_size]`` (see ``input_ids`` docstring)
+        Indices should be in ``[-100, 0, ..., config.vocab_size]`` (see ``input_ids`` docstring)
         Tokens with indices set to ``-100`` are ignored (masked), the loss is only computed for the tokens with labels
         in ``[0, ..., config.vocab_size]``
 
@@ -444,7 +444,7 @@ class CTRLLMHeadModel(CTRLPreTrainedModel):
     **labels**: (`optional`) ``torch.LongTensor`` of shape ``(batch_size, sequence_length)``:
         Labels for language modeling.
         Note that the labels **are shifted** inside the model, i.e. you can set ``lm_labels = input_ids``
-        Indices are selected in ``[-1, 0, ..., config.vocab_size]``
+        Indices are selected in ``[-100, 0, ..., config.vocab_size]``
         All labels set to ``-100`` are ignored (masked), the loss is only
         computed for labels in ``[0, ..., config.vocab_size]``
 
@@ -496,7 +496,7 @@ class DistilBertForMaskedLM(DistilBertPreTrainedModel):
     r"""
     **masked_lm_labels**: (`optional`) ``torch.LongTensor`` of shape ``(batch_size, sequence_length)``:
         Labels for computing the masked language modeling loss.
-        Indices should be in ``[-1, 0, ..., config.vocab_size]`` (see ``input_ids`` docstring)
+        Indices should be in ``[-100, 0, ..., config.vocab_size]`` (see ``input_ids`` docstring)
         Tokens with indices set to ``-100`` are ignored (masked), the loss is only computed for the tokens with labels
         in ``[0, ..., config.vocab_size]``
 
@@ -513,7 +513,7 @@ class GPT2LMHeadModel(GPT2PreTrainedModel):
     **labels**: (`optional`) ``torch.LongTensor`` of shape ``(batch_size, sequence_length)``:
         Labels for language modeling.
         Note that the labels **are shifted** inside the model, i.e. you can set ``lm_labels = input_ids``
-        Indices are selected in ``[-1, 0, ..., config.vocab_size]``
+        Indices are selected in ``[-100, 0, ..., config.vocab_size]``
         All labels set to ``-100`` are ignored (masked), the loss is only
         computed for labels in ``[0, ..., config.vocab_size]``
 
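For the causal models touched here (CTRL, GPT-2, OpenAI GPT), the docstrings note that labels **are shifted** inside the model, so you can pass ``labels = input_ids`` directly. A hedged sketch of that internal shift with toy tensors (illustrative only, not the library's actual forward code):

```python
import torch
import torch.nn.functional as F

vocab_size = 10
input_ids = torch.tensor([[3, 5, 2, 8]])  # toy sequence
logits = torch.randn(1, 4, vocab_size)    # stand-in for the model's output

# As the docstrings say, labels can simply be the input ids; the model
# shifts them internally so that position t predicts token t+1.
labels = input_ids
shift_logits = logits[..., :-1, :].contiguous()
shift_labels = labels[..., 1:].contiguous()

# Any label set to -100 would be ignored here too (default ignore_index),
# which is what the docstring range [-100, 0, ..., config.vocab_size] refers to.
loss = F.cross_entropy(shift_logits.view(-1, vocab_size), shift_labels.view(-1))
```

After the shift, the last input position has no target, and the first token is never predicted, which is exactly why no manual offsetting of ``labels`` is needed.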
@@ -490,7 +490,7 @@ class OpenAIGPTLMHeadModel(OpenAIGPTPreTrainedModel):
     **labels**: (`optional`) ``torch.LongTensor`` of shape ``(batch_size, sequence_length)``:
         Labels for language modeling.
         Note that the labels **are shifted** inside the model, i.e. you can set ``labels = input_ids``
-        Indices are selected in ``[-1, 0, ..., config.vocab_size]``
+        Indices are selected in ``[-100, 0, ..., config.vocab_size]``
         All labels set to ``-100`` are ignored (masked), the loss is only
         computed for labels in ``[0, ..., config.vocab_size]``
 
@@ -578,7 +578,7 @@ class OpenAIGPTDoubleHeadsModel(OpenAIGPTPreTrainedModel):
     **lm_labels**: (`optional`) ``torch.LongTensor`` of shape ``(batch_size, sequence_length)``:
         Labels for language modeling.
         Note that the labels **are shifted** inside the model, i.e. you can set ``lm_labels = input_ids``
-        Indices are selected in ``[-1, 0, ..., config.vocab_size]``
+        Indices are selected in ``[-100, 0, ..., config.vocab_size]``
         All labels set to ``-100`` are ignored (masked), the loss is only
         computed for labels in ``[0, ..., config.vocab_size]``
     **mc_labels**: (`optional`) ``torch.LongTensor`` of shape ``(batch_size)``:
@@ -223,7 +223,7 @@ class RobertaForMaskedLM(BertPreTrainedModel):
     r"""
     **masked_lm_labels**: (`optional`) ``torch.LongTensor`` of shape ``(batch_size, sequence_length)``:
         Labels for computing the masked language modeling loss.
-        Indices should be in ``[-1, 0, ..., config.vocab_size]`` (see ``input_ids`` docstring)
+        Indices should be in ``[-100, 0, ..., config.vocab_size]`` (see ``input_ids`` docstring)
         Tokens with indices set to ``-100`` are ignored (masked), the loss is only computed for the tokens with labels
         in ``[0, ..., config.vocab_size]``
 