From 5668fdb09e1bcd888930c1ff242bf200649da39c Mon Sep 17 00:00:00 2001
From: Noah Trenaman
Date: Fri, 9 Oct 2020 02:16:58 -0700
Subject: [PATCH] Update XLM-RoBERTa details (#7669)

---
 docs/source/pretrained_models.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/source/pretrained_models.rst b/docs/source/pretrained_models.rst
index 812b5f894f9..517ca7cd2d7 100644
--- a/docs/source/pretrained_models.rst
+++ b/docs/source/pretrained_models.rst
@@ -294,10 +294,10 @@ For a list that includes community-uploaded models, refer to `https://huggingfac
 |                    | ``t5-11B``                                                 | | ~11B parameters with 24-layers, 1024-hidden-state, 65536 feed-forward hidden-state, 128-heads,                                       |
 |                    |                                                            | | Trained on English text: the Colossal Clean Crawled Corpus (C4)                                                                      |
 +--------------------+------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+
-| XLM-RoBERTa        | ``xlm-roberta-base``                                       | | ~125M parameters with 12-layers, 768-hidden-state, 3072 feed-forward hidden-state, 8-heads,                                          |
+| XLM-RoBERTa        | ``xlm-roberta-base``                                       | | ~270M parameters with 12-layers, 768-hidden-state, 3072 feed-forward hidden-state, 8-heads,                                          |
 |                    |                                                            | | Trained on on 2.5 TB of newly created clean CommonCrawl data in 100 languages                                                        |
 |                    +------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+
-|                    | ``xlm-roberta-large``                                      | | ~355M parameters with 24-layers, 1027-hidden-state, 4096 feed-forward hidden-state, 16-heads,                                        |
+|                    | ``xlm-roberta-large``                                      | | ~550M parameters with 24-layers, 1024-hidden-state, 4096 feed-forward hidden-state, 16-heads,                                        |
 |                    |                                                            | | Trained on 2.5 TB of newly created clean CommonCrawl data in 100 languages                                                           |
 +--------------------+------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+
 | FlauBERT           | ``flaubert/flaubert_small_cased``                          | | 6-layer, 512-hidden, 8-heads, 54M parameters                                                                                         |