Add model and doc badges (#4811)

* Add badges for models and docs
Sylvain Gugger 2020-06-05 18:45:42 -04:00 committed by GitHub
parent 4ab7424597
commit 56d5d160cd


@@ -50,6 +50,15 @@ that at each position, the model can only look at the tokens before in the atten
Original GPT
----------------------------------------------
.. raw:: html
<a href="https://huggingface.co/models?filter=openai-gpt">
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-openai--gpt-blueviolet">
</a>
<a href="/model_doc/gpt">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-openai--gpt-blueviolet">
</a>
`Improving Language Understanding by Generative Pre-Training <https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf>`_,
Alec Radford et al.
@@ -58,11 +67,18 @@ The first autoregressive model based on the transformer architecture, pretrained
The library provides versions of the model for language modeling and multitask language modeling/multiple choice
classification.
More information in this :doc:`model documentation </model_doc/gpt>`.
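
A minimal sketch of loading those two variants with the library's ``OpenAIGPTLMHeadModel`` and ``OpenAIGPTDoubleHeadsModel`` classes (the ``openai-gpt`` checkpoint name and the prompt are illustrative; exact output types depend on the library version):

.. code-block:: python

    from transformers import OpenAIGPTTokenizer, OpenAIGPTLMHeadModel, OpenAIGPTDoubleHeadsModel

    tokenizer = OpenAIGPTTokenizer.from_pretrained("openai-gpt")

    # Language modeling head: predicts the next token at each position.
    lm_model = OpenAIGPTLMHeadModel.from_pretrained("openai-gpt")

    # Double heads: language modeling plus multiple choice classification.
    mc_model = OpenAIGPTDoubleHeadsModel.from_pretrained("openai-gpt")

    input_ids = tokenizer.encode("Hello, my dog is cute", return_tensors="pt")
    outputs = lm_model(input_ids)  # the first output holds the next-token logits
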
GPT-2
----------------------------------------------
.. raw:: html
<a href="https://huggingface.co/models?filter=gpt2">
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-gpt2-blueviolet">
</a>
<a href="/model_doc/gpt2">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-gpt2-blueviolet">
</a>
`Language Models are Unsupervised Multitask Learners <https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf>`_,
Alec Radford et al.
@@ -72,11 +88,18 @@ more).
The library provides versions of the model for language modeling and multitask language modeling/multiple choice
classification.
More information in this :doc:`model documentation </model_doc/gpt2>`.
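
As a quick illustration of the language modeling use, a generation sketch with the base ``gpt2`` checkpoint (the sampling parameters are illustrative only):

.. code-block:: python

    from transformers import GPT2Tokenizer, GPT2LMHeadModel

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")

    input_ids = tokenizer.encode("The Transformer architecture", return_tensors="pt")
    # Continue the prompt one predicted token at a time.
    generated = model.generate(input_ids, max_length=40, do_sample=True, top_k=50)
    print(tokenizer.decode(generated[0], skip_special_tokens=True))
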
CTRL
----------------------------------------------
.. raw:: html
<a href="https://huggingface.co/models?filter=ctrl">
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-ctrl-blueviolet">
</a>
<a href="/model_doc/ctrl">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-ctrl-blueviolet">
</a>
`CTRL: A Conditional Transformer Language Model for Controllable Generation <https://arxiv.org/abs/1909.05858>`_,
Nitish Shirish Keskar et al.
@@ -86,11 +109,18 @@ wikipedia article, a book or a movie review.
The library provides a version of the model for language modeling only.
More information in this :doc:`model documentation </model_doc/ctrl>`.
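
Since generation is steered by a control code prepended to the prompt, a minimal sketch (``Books`` is assumed here as an example control code; see the paper and the model documentation for the full list):

.. code-block:: python

    from transformers import CTRLTokenizer, CTRLLMHeadModel

    tokenizer = CTRLTokenizer.from_pretrained("ctrl")
    model = CTRLLMHeadModel.from_pretrained("ctrl")

    # The control code at the start of the prompt conditions the style of the generation.
    input_ids = tokenizer.encode("Books Once upon a time", return_tensors="pt")
    generated = model.generate(input_ids, max_length=40, repetition_penalty=1.2)
    print(tokenizer.decode(generated[0]))
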
Transformer-XL
----------------------------------------------
.. raw:: html
<a href="https://huggingface.co/models?filter=transfo-xl">
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-transfo--xl-blueviolet">
</a>
<a href="/model_doc/transformerxl">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-transfo--xl-blueviolet">
</a>
`Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context <https://arxiv.org/abs/1901.02860>`_,
Zihang Dai et al.
@@ -108,13 +138,20 @@ adjustments in the way attention scores are computed.
The library provides a version of the model for language modeling only.
More information in this :doc:`model documentation </model_doc/transformerxl>`.
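
A sketch of the recurrence mechanism: the model returns the hidden states of the segment it just processed, and those can be passed back as memory for the next segment (the ``transfo-xl-wt103`` checkpoint and the output indexing below are assumptions to illustrate the idea):

.. code-block:: python

    from transformers import TransfoXLTokenizer, TransfoXLLMHeadModel

    tokenizer = TransfoXLTokenizer.from_pretrained("transfo-xl-wt103")
    model = TransfoXLLMHeadModel.from_pretrained("transfo-xl-wt103")

    first_ids = tokenizer.encode("The history of natural language processing", return_tensors="pt")
    outputs = model(first_ids)
    mems = outputs[1]  # hidden states of the previous segment, kept as memory

    # The next segment attends to the memory of the previous one.
    next_ids = tokenizer.encode("began in the 1950s", return_tensors="pt")
    outputs = model(next_ids, mems=mems)
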
.. _reformer:
Reformer
----------------------------------------------
.. raw:: html
<a href="https://huggingface.co/models?filter=reformer">
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-reformer-blueviolet">
</a>
<a href="/model_doc/reformer">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-reformer-blueviolet">
</a>
`Reformer: The Efficient Transformer <https://arxiv.org/abs/2001.04451>`_,
Nikita Kitaev et al.
@@ -138,11 +175,18 @@ pretraining yet, though.
The library provides a version of the model for language modeling only.
More information in this :doc:`model documentation </model_doc/reformer>`.
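
A minimal sketch with the checkpoint trained on Crime and Punishment (one of the few public Reformer checkpoints at the time of writing; the prompt is illustrative):

.. code-block:: python

    from transformers import ReformerTokenizer, ReformerModelWithLMHead

    tokenizer = ReformerTokenizer.from_pretrained("google/reformer-crime-and-punishment")
    model = ReformerModelWithLMHead.from_pretrained("google/reformer-crime-and-punishment")

    # LSH attention and reversible layers keep memory usage low on long inputs.
    input_ids = tokenizer.encode("A few months later", return_tensors="pt")
    generated = model.generate(input_ids, max_length=100, do_sample=True)
    print(tokenizer.decode(generated[0]))
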
XLNet
----------------------------------------------
.. raw:: html
<a href="https://huggingface.co/models?filter=xlnet">
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-xlnet-blueviolet">
</a>
<a href="/model_doc/xlnet">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-xlnet-blueviolet">
</a>
`XLNet: Generalized Autoregressive Pretraining for Language Understanding <https://arxiv.org/abs/1906.08237>`_,
Zhilin Yang et al.
@@ -156,20 +200,27 @@ XLNet also uses the same recurrence mechanism as TransformerXL to build long-ter
The library provides a version of the model for language modeling, token classification, sentence classification,
multiple choice classification and question answering.
More information in this :doc:`model documentation </model_doc/xlnet>`.
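
As one example among the tasks listed above, a sketch of the sequence classification head (the ``xlnet-base-cased`` checkpoint is the base English model; the classification head stays randomly initialized until fine-tuned):

.. code-block:: python

    from transformers import XLNetTokenizer, XLNetForSequenceClassification

    tokenizer = XLNetTokenizer.from_pretrained("xlnet-base-cased")
    model = XLNetForSequenceClassification.from_pretrained("xlnet-base-cased", num_labels=2)

    input_ids = tokenizer.encode("This movie was great!", return_tensors="pt")
    outputs = model(input_ids)  # the first output holds the classification logits
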
.. _autoencoding-models:
Autoencoding models
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
As mentioned before, these models rely on the encoder part of the original transformer and use no mask so the model can
look at all the tokens in the attention heads. For pretraining, inputs are a corrupted version of the sentence, usually
obtained by masking tokens, and targets are the original sentences.
BERT
----------------------------------------------
.. raw:: html
<a href="https://huggingface.co/models?filter=bert">
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-bert-blueviolet">
</a>
<a href="/model_doc/bert">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-bert-blueviolet">
</a>
`BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding <https://arxiv.org/abs/1810.04805>`_,
Jacob Devlin et al.
@@ -187,11 +238,18 @@ they are not related. The model has to predict if the sentences are consecutive
The library provides a version of the model for language modeling (traditional or masked), next sentence prediction,
token classification, sentence classification, multiple choice classification and question answering.
More information in this :doc:`model documentation </model_doc/bert>`.
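
A sketch of the masked language modeling use, filling in a masked token (the ``bert-base-uncased`` checkpoint and the example sentence are illustrative):

.. code-block:: python

    from transformers import BertTokenizer, BertForMaskedLM

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertForMaskedLM.from_pretrained("bert-base-uncased")

    text = f"The capital of France is {tokenizer.mask_token}."
    input_ids = tokenizer.encode(text, return_tensors="pt")
    logits = model(input_ids)[0]  # scores of shape (batch, sequence, vocab)

    # Pick the most likely token for the masked position.
    mask_index = (input_ids[0] == tokenizer.mask_token_id).nonzero().item()
    predicted_id = logits[0, mask_index].argmax().item()
    print(tokenizer.decode([predicted_id]))
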
ALBERT
----------------------------------------------
.. raw:: html
<a href="https://huggingface.co/models?filter=albert">
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-albert-blueviolet">
</a>
<a href="/model_doc/albert">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-albert-blueviolet">
</a>
`ALBERT: A Lite BERT for Self-supervised Learning of Language Representations <https://arxiv.org/abs/1909.11942>`_,
Zhenzhong Lan et al.
@@ -209,11 +267,18 @@ Same as BERT but with a few tweaks:
The library provides a version of the model for masked language modeling, token classification, sentence
classification, multiple choice classification and question answering.
More information in this :doc:`model documentation </model_doc/albert>`.
RoBERTa
----------------------------------------------
.. raw:: html
<a href="https://huggingface.co/models?filter=roberta">
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-roberta-blueviolet">
</a>
<a href="/model_doc/roberta">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-roberta-blueviolet">
</a>
`RoBERTa: A Robustly Optimized BERT Pretraining Approach <https://arxiv.org/abs/1907.11692>`_,
Yinhan Liu et al.
@@ -228,11 +293,18 @@ Same as BERT with better pretraining tricks:
The library provides a version of the model for masked language modeling, token classification, sentence
classification, multiple choice classification and question answering.
More information in this :doc:`model documentation </model_doc/roberta>`.
DistilBERT
----------------------------------------------
.. raw:: html
<a href="https://huggingface.co/models?filter=distilbert">
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-distilbert-blueviolet">
</a>
<a href="/model_doc/distilbert">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-distilbert-blueviolet">
</a>
`DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter <https://arxiv.org/abs/1910.01108>`_,
Victor Sanh et al.
@@ -246,11 +318,18 @@ the same probabilities as the larger model. The actual objective is a combinatio
The library provides a version of the model for masked language modeling, token classification, sentence classification
and question answering.
More information in this :doc:`model documentation </model_doc/distilbert>`.
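
Because it is small and fast, DistilBERT is a common choice for inference pipelines; a sketch of the sequence classification use through a sentiment checkpoint (``distilbert-base-uncased-finetuned-sst-2-english`` is assumed here):

.. code-block:: python

    from transformers import pipeline

    # Wraps a DistilBERT checkpoint fine-tuned on SST-2 for sentiment classification.
    classifier = pipeline(
        "sentiment-analysis",
        model="distilbert-base-uncased-finetuned-sst-2-english",
    )
    print(classifier("Transformers are remarkably easy to use."))
    # e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
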
XLM
----------------------------------------------
.. raw:: html
<a href="https://huggingface.co/models?filter=xlm">
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-xlm-blueviolet">
</a>
<a href="/model_doc/xlm">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-xlm-blueviolet">
</a>
`Cross-lingual Language Model Pretraining <https://arxiv.org/abs/1901.07291>`_, Guillaume Lample and Alexis Conneau
A transformer model trained on several languages. There are three different types of training for this model and the
@@ -274,11 +353,18 @@ language.
The library provides a version of the model for language modeling, token classification, sentence classification and
question answering.
More information in this :doc:`model documentation </model_doc/xlm>`.
XLM-RoBERTa
----------------------------------------------
.. raw:: html
<a href="https://huggingface.co/models?filter=xlm-roberta">
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-xlm--roberta-blueviolet">
</a>
<a href="/model_doc/xlmroberta">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-xlm--roberta-blueviolet">
</a>
`Unsupervised Cross-lingual Representation Learning at Scale <https://arxiv.org/abs/1911.02116>`_, Alexis Conneau et
al.
@@ -289,22 +375,36 @@ masked language modeling on sentences coming from one language. However, the mod
The library provides a version of the model for masked language modeling, token classification, sentence
classification, multiple choice classification and question answering.
More information in this :doc:`model documentation </model_doc/xlmroberta>`.
FlauBERT
----------------------------------------------
.. raw:: html
<a href="https://huggingface.co/models?filter=flaubert">
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-flaubert-blueviolet">
</a>
<a href="/model_doc/flaubert">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-flaubert-blueviolet">
</a>
`FlauBERT: Unsupervised Language Model Pre-training for French <https://arxiv.org/abs/1912.05372>`_, Hang Le et al.
Like RoBERTa, without the sentence ordering prediction (so just trained on the MLM objective).
The library provides a version of the model for language modeling and sentence classification.
More information in this :doc:`model documentation </model_doc/flaubert>`.
ELECTRA
----------------------------------------------
.. raw:: html
<a href="https://huggingface.co/models?filter=electra">
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-electra-blueviolet">
</a>
<a href="/model_doc/electra">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-electra-blueviolet">
</a>
`ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators <https://arxiv.org/abs/2003.10555>`_,
Kevin Clark et al.
@@ -317,13 +417,20 @@ traditional GAN setting) then the ELECTRA model is trained for a few steps.
The library provides a version of the model for masked language modeling, token classification and sentence
classification.
More information in this :doc:`model documentation </model_doc/electra>`.
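
A sketch of loading the two pretraining halves separately (Google's small checkpoints are used as examples; the discriminator outputs one score per token indicating whether it thinks that token was replaced):

.. code-block:: python

    from transformers import ElectraTokenizer, ElectraForMaskedLM, ElectraForPreTraining

    tokenizer = ElectraTokenizer.from_pretrained("google/electra-small-discriminator")

    # The generator is a small masked language model...
    generator = ElectraForMaskedLM.from_pretrained("google/electra-small-generator")

    # ...and the discriminator predicts, for every token, whether it is original or replaced.
    discriminator = ElectraForPreTraining.from_pretrained("google/electra-small-discriminator")

    input_ids = tokenizer.encode("The quick brown fox jumps over the lazy dog", return_tensors="pt")
    scores = discriminator(input_ids)[0]  # one replacement score per token
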
.. _longformer:
Longformer
----------------------------------------------
.. raw:: html
<a href="https://huggingface.co/models?filter=longformer">
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-longformer-blueviolet">
</a>
<a href="/model_doc/longformer">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-longformer-blueviolet">
</a>
`Longformer: The Long-Document Transformer <https://arxiv.org/abs/2004.05150>`_, Iz Beltagy et al.
A transformer model replacing the attention matrices by sparse matrices to go faster. Often, the local context (e.g.,
@@ -339,9 +446,6 @@ pretraining yet, though.
The library provides a version of the model for masked language modeling, token classification, sentence
classification, multiple choice classification and question answering.
More information in this :doc:`model documentation </model_doc/longformer>`.
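
A minimal sketch with the ``allenai/longformer-base-4096`` checkpoint, which accepts sequences of up to 4,096 tokens (how global attention is assigned to selected tokens is covered in the model documentation):

.. code-block:: python

    from transformers import LongformerTokenizer, LongformerModel

    tokenizer = LongformerTokenizer.from_pretrained("allenai/longformer-base-4096")
    model = LongformerModel.from_pretrained("allenai/longformer-base-4096")

    # A document far longer than the 512 tokens a standard BERT-like encoder accepts.
    long_text = " ".join(["Deep learning is fun."] * 500)
    input_ids = tokenizer.encode(long_text, return_tensors="pt", truncation=True, max_length=4096)
    outputs = model(input_ids)  # the first output holds the hidden states for every token
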
.. _seq-to-seq-models:
Sequence-to-sequence models
@@ -352,8 +456,17 @@ As mentioned before, these models keep both the encoder and the decoder of the o
BART
----------------------------------------------
.. raw:: html
<a href="https://huggingface.co/models?filter=bart">
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-bart-blueviolet">
</a>
<a href="/model_doc/bart">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-bart-blueviolet">
</a>
`BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
<https://arxiv.org/abs/1910.13461>`_, Mike Lewis et al.
Sequence-to-sequence model with an encoder and a decoder. The encoder is fed a corrupted version of the tokens, and the
decoder is fed the original tokens (but has a mask to hide future words, like a regular transformer decoder). For the encoder, on the
@@ -367,22 +480,36 @@ pretraining tasks, a composition of the following transformations are applied:
The library provides a version of this model for conditional generation and sequence classification.
More information in this :doc:`model documentation </model_doc/bart>`.
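
As an example of the conditional generation use, a summarization sketch with the CNN/DailyMail fine-tuned checkpoint (``facebook/bart-large-cnn``; the beam search settings are illustrative):

.. code-block:: python

    from transformers import BartTokenizer, BartForConditionalGeneration

    tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-cnn")
    model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")

    article = (
        "The tower is 324 metres tall, about the same height as an 81-storey building, "
        "and the tallest structure in Paris. It was the first structure to reach a "
        "height of 300 metres."
    )
    input_ids = tokenizer.encode(article, return_tensors="pt")
    # The decoder generates the summary token by token, conditioned on the encoded article.
    summary_ids = model.generate(input_ids, num_beams=4, max_length=60, early_stopping=True)
    print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
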
MarianMT
----------------------------------------------
.. raw:: html
<a href="https://huggingface.co/models?filter=marian">
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-marian-blueviolet">
</a>
<a href="/model_doc/marian">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-marian-blueviolet">
</a>
`Marian: Fast Neural Machine Translation in C++ <https://arxiv.org/abs/1804.00344>`_, Marcin Junczys-Dowmunt et al.
A framework for translation models, using the same models as BART.
The library provides a version of this model for conditional generation.
More information in this :doc:`model documentation </model_doc/marian>`.
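
A translation sketch with one of the Helsinki-NLP Opus-MT checkpoints (English to German here; there is one checkpoint per language pair):

.. code-block:: python

    from transformers import MarianTokenizer, MarianMTModel

    model_name = "Helsinki-NLP/opus-mt-en-de"
    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)

    input_ids = tokenizer.encode("I love reading about transformers.", return_tensors="pt")
    translated = model.generate(input_ids)
    print(tokenizer.decode(translated[0], skip_special_tokens=True))
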
T5
----------------------------------------------
.. raw:: html
<a href="https://huggingface.co/models?filter=t5">
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-t5-blueviolet">
</a>
<a href="/model_doc/t5">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-t5-blueviolet">
</a>
`Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer <https://arxiv.org/abs/1910.10683>`_,
Colin Raffel et al.
@@ -403,8 +530,6 @@ input becomes “My <x> very <y> .” and the target is “<x> dog is <y> . <z>
The library provides a version of this model for conditional generation.
More information in this :doc:`model documentation </model_doc/t5>`.
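
Because every task is cast as text-to-text, the task is selected by a textual prefix; a minimal sketch with the ``t5-small`` checkpoint (the prefix and prompt are illustrative):

.. code-block:: python

    from transformers import T5Tokenizer, T5ForConditionalGeneration

    tokenizer = T5Tokenizer.from_pretrained("t5-small")
    model = T5ForConditionalGeneration.from_pretrained("t5-small")

    # The prefix tells the model which task to perform.
    input_ids = tokenizer.encode(
        "translate English to German: The house is wonderful.", return_tensors="pt"
    )
    output_ids = model.generate(input_ids)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
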
.. _multimodal-models:
Multimodal models