Documentation additions

Repository: https://github.com/huggingface/transformers.git
Parent: 912a377e90
Commit: 1dc43e56c9
@@ -48,3 +48,4 @@ The library currently contains PyTorch implementations, pre-trained model weight
    model_doc/xlm
    model_doc/xlnet
    model_doc/roberta
+   model_doc/distilbert
docs/source/model_doc/distilbert.rst (new file, 43 lines)
@@ -0,0 +1,43 @@
DistilBERT
----------------------------------------------------

``DistilBertConfig``
~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: pytorch_transformers.DistilBertConfig
    :members:


``DistilBertTokenizer``
~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: pytorch_transformers.DistilBertTokenizer
    :members:


``DistilBertModel``
~~~~~~~~~~~~~~~~~~~~

.. autoclass:: pytorch_transformers.DistilBertModel
    :members:


``DistilBertForMaskedLM``
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: pytorch_transformers.DistilBertForMaskedLM
    :members:


``DistilBertForSequenceClassification``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: pytorch_transformers.DistilBertForSequenceClassification
    :members:


``DistilBertForQuestionAnswering``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: pytorch_transformers.DistilBertForQuestionAnswering
    :members:
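A minimal usage sketch for the classes documented in this new page, assuming the standard ``from_pretrained`` loading API and the usual tokenizer helpers; the tokenizer choice follows the "use BertTokenizer(`bert-base-uncased`)" note in the DistilBERT docstrings further down::

    import torch
    from pytorch_transformers import BertTokenizer, DistilBertModel

    # Load tokenizer and model; 'distilbert-base-uncased' is the shortcut name
    # listed in the pretrained-models table added by this commit.
    tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
    model = DistilBertModel.from_pretrained('distilbert-base-uncased')
    model.eval()

    # DistilBERT keeps BERT's [CLS] ... [SEP] convention but takes no token_type_ids.
    tokens = [tokenizer.cls_token] + tokenizer.tokenize("Hello, my dog is cute") + [tokenizer.sep_token]
    input_ids = torch.tensor([tokenizer.convert_tokens_to_ids(tokens)])

    with torch.no_grad():
        last_hidden_state = model(input_ids)[0]  # (batch_size, sequence_length, hidden_size)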
@@ -111,5 +111,13 @@ Here is the full list of the currently provided pretrained models together with
 |                   |                                                            | | ``roberta-large`` fine-tuned on `MNLI <http://www.nyu.edu/projects/bowman/multinli/>`__.                            |
 |                   |                                                            |   (see `details <https://github.com/pytorch/fairseq/tree/master/examples/roberta>`__)                                 |
 +-------------------+------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+
+| DistilBERT        | ``distilbert-base-uncased``                                | | 6-layer, 768-hidden, 12-heads, 66M parameters                                                                        |
+|                   |                                                            | | The DistilBERT model distilled from the BERT model `bert-base-uncased` checkpoint                                   |
+|                   |                                                            |   (see `details <https://medium.com/@victorsanh/8cf3380435b5>`__)                                                     |
+|                   +------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+
+|                   | ``distilbert-base-uncased-distilled-squad``                | | 6-layer, 768-hidden, 12-heads, 66M parameters                                                                        |
+|                   |                                                            | | The DistilBERT model distilled from the BERT model `bert-base-uncased` checkpoint, with an additional linear layer. |
+|                   |                                                            |   (see `details <https://medium.com/@victorsanh/8cf3380435b5>`__)                                                     |
++-------------------+------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+

 .. <https://huggingface.co/pytorch-transformers/examples.html>`__
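A short sketch of loading the SQuAD-distilled checkpoint listed above; the class name comes from the new documentation page, and the assumption that the model returns ``(start_logits, end_logits)`` first mirrors the BERT question-answering head::

    import torch
    from pytorch_transformers import BertTokenizer, DistilBertForQuestionAnswering

    tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
    model = DistilBertForQuestionAnswering.from_pretrained('distilbert-base-uncased-distilled-squad')
    model.eval()

    question, context = "Who wrote the report?", "The report was written by the documentation team."
    tokens = ([tokenizer.cls_token] + tokenizer.tokenize(question) + [tokenizer.sep_token]
              + tokenizer.tokenize(context) + [tokenizer.sep_token])
    input_ids = torch.tensor([tokenizer.convert_tokens_to_ids(tokens)])

    with torch.no_grad():
        start_logits, end_logits = model(input_ids)[:2]

    # Take the most likely answer span (no handling of ties or impossible spans here).
    start, end = start_logits.argmax().item(), end_logits.argmax().item()
    print(" ".join(tokens[start:end + 1]))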
@@ -433,7 +433,7 @@ DISTILBERT_START_DOCSTRING = r"""

    Here are the differences between the interface of Bert and DistilBert:

-    - DistilBert doesn't have `token_type_ids`, you don't need to indicate which token belong to which segment. Just separate your segments with the separation token `tokenizer.sep_token` (or `[SEP]`)
+    - DistilBert doesn't have `token_type_ids`, you don't need to indicate which token belongs to which segment. Just separate your segments with the separation token `tokenizer.sep_token` (or `[SEP]`)
    - DistilBert doesn't have options to select the input positions (`position_ids` input). This could be added if necessary though, just let's us know if you need this option.

    For more information on DistilBERT, please refer to our
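To make the segment-separation point concrete, a small sketch using the ``cls_token`` and ``sep_token`` attributes that the docstring above refers to (the example sentences are illustrative)::

    from pytorch_transformers import BertTokenizer

    tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

    # Two segments are joined with the separation token; no token_type_ids are built.
    segment_a = tokenizer.tokenize("How old are you?")
    segment_b = tokenizer.tokenize("I am six years old.")
    tokens = [tokenizer.cls_token] + segment_a + [tokenizer.sep_token] + segment_b + [tokenizer.sep_token]
    input_ids = tokenizer.convert_tokens_to_ids(tokens)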
@@ -450,9 +450,9 @@ DISTILBERT_START_DOCSTRING = r"""

DISTILBERT_INPUTS_DOCSTRING = r"""
    Inputs:
-        **input_ids**L ``torch.LongTensor`` of shape ``(batch_size, sequence_length)``:
-            Indices oof input sequence tokens in the vocabulary.
-            The input sequences should start with `[CLS]` and `[SEP]` tokens.
+        **input_ids** ``torch.LongTensor`` of shape ``(batch_size, sequence_length)``:
+            Indices of input sequence tokens in the vocabulary.
+            The input sequences should start with `[CLS]` and end with `[SEP]` tokens.

            For now, ONLY BertTokenizer(`bert-base-uncased`) is supported and you should use this tokenizer when using DistilBERT.
        **attention_mask**: (`optional`) ``torch.LongTensor`` of shape ``(batch_size, sequence_length)``:
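A minimal sketch of building the two inputs described here, ``input_ids`` and ``attention_mask``, for a padded batch; it assumes pad id 0, which is the `[PAD]` index in the `bert-base-uncased` vocabulary::

    import torch
    from pytorch_transformers import BertTokenizer, DistilBertModel

    tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
    model = DistilBertModel.from_pretrained('distilbert-base-uncased')

    sentences = ["Hello world!", "A slightly longer second sentence."]
    batch = [tokenizer.convert_tokens_to_ids(
                 [tokenizer.cls_token] + tokenizer.tokenize(s) + [tokenizer.sep_token])
             for s in sentences]

    # Pad to the longest sequence; real tokens get mask 1, padding gets mask 0.
    max_len = max(len(ids) for ids in batch)
    input_ids = torch.tensor([ids + [0] * (max_len - len(ids)) for ids in batch])
    attention_mask = torch.tensor([[1] * len(ids) + [0] * (max_len - len(ids)) for ids in batch])

    outputs = model(input_ids, attention_mask=attention_mask)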