From 939328111bf872954b48177eada5f06c05f79fce Mon Sep 17 00:00:00 2001
From: Junyi_Li
Date: Thu, 19 Mar 2020 11:07:23 +0800
Subject: [PATCH] Create README.md roberta_chinese_base card

---
 .../clue/roberta_chinese_base/README.md | 31 +++++++++++++++++++
 1 file changed, 31 insertions(+)
 create mode 100644 model_cards/clue/roberta_chinese_base/README.md

diff --git a/model_cards/clue/roberta_chinese_base/README.md b/model_cards/clue/roberta_chinese_base/README.md
new file mode 100644
index 00000000000..b0fcb124c1e
--- /dev/null
+++ b/model_cards/clue/roberta_chinese_base/README.md
@@ -0,0 +1,31 @@
+## roberta_chinese_base
+
+### Overview
+
+**Language model:** roberta-base
+**Model size:** 392M
+**Language:** Chinese
+**Training data:** [CLUECorpusSmall](https://github.com/CLUEbenchmark/CLUECorpus2020)
+**Eval data:** [CLUE dataset](https://github.com/CLUEbenchmark/CLUE)
+
+### Results
+
+For results on downstream tasks like text classification, please refer to [this repository](https://github.com/CLUEbenchmark/CLUE).
+
+### Usage
+
+**NOTE:** You have to call **BertTokenizer** instead of RobertaTokenizer !!!
+
+```
+import torch
+from transformers import BertTokenizer, BertModel
+tokenizer = BertTokenizer.from_pretrained("clue/roberta_chinese_base")
+roberta = BertModel.from_pretrained("clue/roberta_chinese_base")
+```
+
+### About CLUE benchmark
+
+Organization of Language Understanding Evaluation benchmark for Chinese: tasks & datasets, baselines, pre-trained Chinese models, corpus and leaderboard.
+
+Github: https://github.com/CLUEbenchmark
+Website: https://www.cluebenchmarks.com/
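
The Usage section of the README added in this patch stops after loading the tokenizer and model. As a minimal sketch of the next step, the snippet below runs one sentence through the model and returns its last-layer hidden states. The helper names `load_clue_roberta` and `encode` are hypothetical and not part of the patch; the sketch assumes a recent `transformers` release in which the tokenizer object is callable, and that `torch` is installed.

```python
MODEL_ID = "clue/roberta_chinese_base"  # model id taken from the README in this patch


def load_clue_roberta(model_id=MODEL_ID):
    """Load the checkpoint with the BERT classes, as the README's NOTE asks.

    Hypothetical helper, not part of the patch.
    """
    # Imported lazily so that defining this function does not require
    # transformers to be installed.
    from transformers import BertModel, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained(model_id)
    model = BertModel.from_pretrained(model_id)
    model.eval()  # turn off dropout for inference
    return tokenizer, model


def encode(text, tokenizer, model):
    """Return the last-layer hidden states for a single sentence.

    Hypothetical helper, not part of the patch.
    """
    import torch

    # Assumes a transformers version where the tokenizer is callable.
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():  # no gradients needed for feature extraction
        outputs = model(**inputs)
    # Index 0 is the last hidden state: (batch, sequence_length, hidden_size).
    return outputs[0]
```

Calling `tokenizer, model = load_clue_roberta()` downloads the weights on first use; `encode("今天天气很好", tokenizer, model)` should then return a tensor with one row per input token, including the special tokens that the tokenizer adds.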