[model_cards] Add language metadata to existing model cards

This will enable filtering on language (amongst other tags) on the website

cc @loretoparisi, @stefan-it, @HenrykBorzymowski, @marma
This commit is contained in:
Julien Chaumond 2020-02-10 17:42:42 -05:00
parent ba498eac38
commit 95bac8dabb
13 changed files with 49 additions and 1 deletions
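As a rough illustration of what this metadata enables, here is a minimal sketch of parsing the YAML front matter added in these diffs and filtering cards by language. This is not the website's implementation; the `model_cards/**/README.md` layout and the `read_front_matter` helper are assumptions made for the example.

```python
import glob
import yaml  # requires PyYAML

def read_front_matter(path):
    """Return the YAML block between the leading '---' fences, or {} if absent."""
    with open(path, encoding="utf-8") as f:
        text = f.read()
    if not text.startswith("---"):
        return {}
    parts = text.split("---", 2)
    if len(parts) < 3:
        return {}
    return yaml.safe_load(parts[1]) or {}

# Hypothetical layout: one README.md per model under model_cards/.
cards = {path: read_front_matter(path)
         for path in glob.glob("model_cards/**/README.md", recursive=True)}

# Filter on the newly added `language` key, e.g. all Italian cards.
italian_cards = [path for path, meta in cards.items()
                 if meta.get("language") == "italian"]
print(italian_cards)
```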


@ -1,3 +1,7 @@
---
language: swedish
---
# Swedish BERT Models
The National Library of Sweden / KBLab releases three pretrained language models based on BERT and ALBERT. The models are trained on approximately 15-20 GB of text (200M sentences, 3000M tokens) from various sources (books, news, government publications, Swedish Wikipedia and internet forums), with the aim of providing a representative BERT model for Swedish text. A more complete description will be published later.
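Not part of the card diff, but as a usage sketch: loading one of these Swedish models with `transformers`. The hub identifier `KB/bert-base-swedish-cased` is an assumption (the excerpt above does not list the exact names); substitute the identifiers from the published cards.

```python
from transformers import AutoModel, AutoTokenizer

# Assumed identifier; check the card / KBLab's hub page for the exact names.
model_id = "KB/bert-base-swedish-cased"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

inputs = tokenizer("Kungliga biblioteket ligger i Stockholm.", return_tensors="pt")
outputs = model(**inputs)
# Recent transformers versions return an output object with named fields.
print(outputs.last_hidden_state.shape)
```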


@ -1,3 +1,7 @@
---
language: swedish
---
# Swedish BERT Models
The National Library of Sweden / KBLab releases three pretrained language models based on BERT and ALBERT. The models are trained on approximately 15-20 GB of text (200M sentences, 3000M tokens) from various sources (books, news, government publications, Swedish Wikipedia and internet forums), with the aim of providing a representative BERT model for Swedish text. A more complete description will be published later.


@ -1,3 +1,7 @@
---
language: swedish
---
# Swedish BERT Models
The National Library of Sweden / KBLab releases three pretrained language models based on BERT and ALBERT. The models are trained on approximately 15-20 GB of text (200M sentences, 3000M tokens) from various sources (books, news, government publications, Swedish Wikipedia and internet forums), with the aim of providing a representative BERT model for Swedish text. A more complete description will be published later.


@ -1,3 +1,7 @@
---
language: italian
---
# UmBERTo Commoncrawl Cased
[UmBERTo](https://github.com/musixmatchresearch/umberto) is a RoBERTa-based language model trained on large Italian corpora, using two innovative approaches: SentencePiece and Whole Word Masking. Now available at [huggingface.co/Musixmatch/umberto-commoncrawl-cased-v1](https://huggingface.co/Musixmatch/umberto-commoncrawl-cased-v1).
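A short usage sketch for this card: masked-word prediction with the `fill-mask` pipeline. The model identifier comes from the link above; the example sentence and the rest are illustrative only.

```python
from transformers import pipeline

# Identifier taken from the card's link; UmBERTo is RoBERTa-based, so the
# generic fill-mask pipeline applies.
fill_mask = pipeline("fill-mask", model="Musixmatch/umberto-commoncrawl-cased-v1")

sentence = f"Umberto Eco è stato un grande {fill_mask.tokenizer.mask_token}."
for prediction in fill_mask(sentence):
    print(prediction["token_str"], prediction["score"])
```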


@ -1,3 +1,7 @@
---
language: italian
---
# UmBERTo Wikipedia Uncased
[UmBERTo](https://github.com/musixmatchresearch/umberto) is a RoBERTa-based language model trained on large Italian corpora, using two innovative approaches: SentencePiece and Whole Word Masking. Now available at [huggingface.co/Musixmatch/umberto-commoncrawl-cased-v1](https://huggingface.co/Musixmatch/umberto-commoncrawl-cased-v1).


@ -1,5 +1,5 @@
---
thumbnail: https://github.com/JetRunner/BERT-of-Theseus/blob/master/bert-of-theseus.png?raw=true
thumbnail: https://raw.githubusercontent.com/JetRunner/BERT-of-Theseus/master/bert-of-theseus.png
---
# BERT-of-Theseus


@ -1,3 +1,7 @@
---
language: german
---
# 🤗 + 📚 dbmdz German BERT models
In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State


@ -1,3 +1,7 @@
---
language: german
---
# 🤗 + 📚 dbmdz German BERT models
In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State


@ -1,3 +1,7 @@
---
language: italian
---
# 🤗 + 📚 dbmdz BERT models
In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State


@ -1,3 +1,7 @@
---
language: italian
---
# 🤗 + 📚 dbmdz BERT models
In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State


@ -1,3 +1,7 @@
---
language: italian
---
# 🤗 + 📚 dbmdz BERT models
In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State


@ -1,3 +1,7 @@
---
language: italian
---
# 🤗 + 📚 dbmdz BERT models
In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State


@ -1,3 +1,7 @@
---
language: dutch
---
# Multilingual + Dutch SQuAD2.0
This model is the multilingual model provided by the Google research team, fine-tuned on a Dutch Q&A downstream task.
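A usage sketch for this card: extractive question answering with the `question-answering` pipeline. The hub identifier below is an assumption (the excerpt does not name it); use the identifier given on the actual model card.

```python
from transformers import pipeline

# Assumed identifier for the Dutch SQuAD2.0 fine-tune; substitute the id
# listed on the model card if it differs.
qa = pipeline(
    "question-answering",
    model="henryk/bert-base-multilingual-cased-finetuned-dutch-squad2",
)

result = qa(
    question="Waar woon ik?",
    context="Mijn naam is Wolfgang en ik woon in Berlijn.",
)
print(result["answer"], result["score"])
```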