Documentation additions

LysandreJik 2019-08-28 09:37:27 -04:00
parent 912a377e90
commit 1dc43e56c9
4 changed files with 56 additions and 4 deletions


@@ -48,3 +48,4 @@ The library currently contains PyTorch implementations, pre-trained model weight
model_doc/xlm
model_doc/xlnet
model_doc/roberta
model_doc/distilbert


@@ -0,0 +1,43 @@
DistilBERT
----------------------------------------------------

``DistilBertConfig``
~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: pytorch_transformers.DistilBertConfig
    :members:

``DistilBertTokenizer``
~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: pytorch_transformers.DistilBertTokenizer
    :members:

``DistilBertModel``
~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: pytorch_transformers.DistilBertModel
    :members:

``DistilBertForMaskedLM``
~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: pytorch_transformers.DistilBertForMaskedLM
    :members:

``DistilBertForSequenceClassification``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: pytorch_transformers.DistilBertForSequenceClassification
    :members:

``DistilBertForQuestionAnswering``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: pytorch_transformers.DistilBertForQuestionAnswering
    :members:
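
A minimal usage sketch (a sketch only, assuming the ``distilbert-base-uncased`` shortcut name listed in the pretrained models table and the usual ``from_pretrained`` API):

.. code-block:: python

    import torch
    from pytorch_transformers import DistilBertModel, DistilBertTokenizer

    # Assumed shortcut name; see the pretrained models table for the available checkpoints.
    tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-uncased')
    model = DistilBertModel.from_pretrained('distilbert-base-uncased')

    # DistilBERT expects a [CLS] ... [SEP] sequence and takes no token_type_ids.
    tokens = ["[CLS]"] + tokenizer.tokenize("hello, my dog is cute") + ["[SEP]"]
    input_ids = torch.tensor([tokenizer.convert_tokens_to_ids(tokens)])

    with torch.no_grad():
        last_hidden_state = model(input_ids)[0]  # (batch_size, sequence_length, hidden_size)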


@@ -111,5 +111,13 @@ Here is the full list of the currently provided pretrained models together with
| | | | ``roberta-large`` fine-tuned on `MNLI <http://www.nyu.edu/projects/bowman/multinli/>`__. |
| | | (see `details <https://github.com/pytorch/fairseq/tree/master/examples/roberta>`__) |
+-------------------+------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+
| DistilBERT | ``distilbert-base-uncased`` | | 6-layer, 768-hidden, 12-heads, 66M parameters |
| | | | The DistilBERT model distilled from the BERT model `bert-base-uncased` checkpoint |
| | | (see `details <https://medium.com/@victorsanh/8cf3380435b5>`__) |
| +------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+
| | ``distilbert-base-uncased-distilled-squad`` | | 6-layer, 768-hidden, 12-heads, 66M parameters |
| | | | The DistilBERT model distilled from the BERT model `bert-base-uncased` checkpoint, with an additional linear layer. |
| | | (see `details <https://medium.com/@victorsanh/8cf3380435b5>`__) |
+-------------------+------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+
.. <https://huggingface.co/pytorch-transformers/examples.html>`__
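
A short sketch of how these shortcut names would typically be passed to ``from_pretrained`` (assuming the standard API):

.. code-block:: python

    from pytorch_transformers import DistilBertModel, DistilBertForQuestionAnswering

    # Base distilled checkpoint: 6-layer, 768-hidden, 12-heads, 66M parameters.
    base_model = DistilBertModel.from_pretrained('distilbert-base-uncased')

    # The ``distilled-squad`` checkpoint, loaded with its additional linear (question answering) layer.
    qa_model = DistilBertForQuestionAnswering.from_pretrained('distilbert-base-uncased-distilled-squad')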


@@ -433,7 +433,7 @@ DISTILBERT_START_DOCSTRING = r"""
Here are the differences between the interface of Bert and DistilBert (a short sketch follows this list):
- DistilBert doesn't have `token_type_ids`, you don't need to indicate which token belongs to which segment. Just separate your segments with the separation token `tokenizer.sep_token` (or `[SEP]`)
- DistilBert doesn't have options to select the input positions (`position_ids` input). This could be added if necessary though, just let us know if you need this option.
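
As an illustration of the two points above (a sketch only, assuming the `bert-base-uncased` tokenizer recommended in the inputs docstring below):

.. code-block:: python

    import torch
    from pytorch_transformers import BertTokenizer, DistilBertModel

    tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
    model = DistilBertModel.from_pretrained('distilbert-base-uncased')

    # Two segments are simply separated with the [SEP] token instead of token_type_ids.
    tokens = (["[CLS]"] + tokenizer.tokenize("how old are you?") + ["[SEP]"]
              + tokenizer.tokenize("i am six years old.") + ["[SEP]"])
    input_ids = torch.tensor([tokenizer.convert_tokens_to_ids(tokens)])

    outputs = model(input_ids)  # no token_type_ids / position_ids arguments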
For more information on DistilBERT, please refer to our
@@ -450,9 +450,9 @@ DISTILBERT_START_DOCSTRING = r"""
DISTILBERT_INPUTS_DOCSTRING = r"""
Inputs:
**input_ids** ``torch.LongTensor`` of shape ``(batch_size, sequence_length)``:
Indices of input sequence tokens in the vocabulary.
The input sequences should start with `[CLS]` and end with `[SEP]` tokens.
For now, ONLY BertTokenizer(`bert-base-uncased`) is supported and you should use this tokenizer when using DistilBERT.
**attention_mask**: (`optional`) ``torch.LongTensor`` of shape ``(batch_size, sequence_length)``: