transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Julien Chaumond b23d3a5ad4 [model_cards] Switch all languages codes to ISO-639-{1,2,3}		2020-07-15 18:59:20 +02:00
language: zh

roberta_chinese_large

Overview

Language model: roberta-large
Model size: 1.2G
Language: Chinese
Training data: CLUECorpusSmall
Eval data: CLUE dataset

Results

For results on downstream tasks like text classification, please refer to this repository.

Usage

NOTE: Load this model with BertTokenizer (and BertModel), not RobertaTokenizer.

import torch
from transformers import BertTokenizer, BertModel

# Despite the "roberta" name, load this checkpoint with the Bert* classes.
tokenizer = BertTokenizer.from_pretrained("clue/roberta_chinese_large")
roberta = BertModel.from_pretrained("clue/roberta_chinese_large")
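
The note above is easier to remember with a picture of what a BERT-style tokenizer does to Chinese text: each Chinese character becomes its own token and is looked up in a flat vocabulary, whereas RobertaTokenizer expects a byte-level BPE vocabulary with a merges file. The following is a minimal pure-Python sketch of that character-splitting idea, not the transformers implementation. It assumes the checkpoint ships a BERT-style vocabulary, which is an inference from the note rather than something this card states. `TOY_VOCAB`, `is_cjk`, `tokenize`, and `encode` are hypothetical names invented for illustration, and the real BertTokenizer also applies normalization and WordPiece splitting, which are omitted here.

```python
from typing import List

# Illustrative sketch only: not the transformers implementation.
# TOY_VOCAB is a hypothetical stand-in for the model's real vocabulary.
TOY_VOCAB = {
    "[PAD]": 0, "[UNK]": 1, "[CLS]": 2, "[SEP]": 3,
    "你": 4, "好": 5, "世": 6, "界": 7,
}


def is_cjk(ch: str) -> bool:
    """Return True for characters in the main CJK Unified Ideographs block."""
    return 0x4E00 <= ord(ch) <= 0x9FFF


def tokenize(text: str) -> List[str]:
    """Emit each CJK character as its own token; keep other non-space runs whole."""
    tokens: List[str] = []
    buffer = ""
    for ch in text:
        if is_cjk(ch) or ch.isspace():
            # A CJK character or whitespace ends any pending non-CJK run.
            if buffer:
                tokens.append(buffer)
                buffer = ""
            if is_cjk(ch):
                tokens.append(ch)
        else:
            buffer += ch
    if buffer:
        tokens.append(buffer)
    return tokens


def encode(text: str) -> List[int]:
    """Wrap tokens in [CLS] ... [SEP] and map them to ids, falling back to [UNK]."""
    tokens = ["[CLS]"] + tokenize(text) + ["[SEP]"]
    return [TOY_VOCAB.get(tok, TOY_VOCAB["[UNK]"]) for tok in tokens]


print(tokenize("你好 世界"))  # ['你', '好', '世', '界']
print(encode("你好,世界"))    # [2, 4, 5, 1, 6, 7, 3]
```

In the second call the comma is not in the toy vocabulary, so it maps to the `[UNK]` id. With the real model, the `tokenizer` object loaded above handles all of this, including the special tokens.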

About CLUE benchmark

CLUE (Chinese Language Understanding Evaluation) is an organization that maintains a language understanding benchmark for Chinese: tasks and datasets, baselines, pre-trained Chinese models, a corpus, and a leaderboard.

GitHub: https://github.com/CLUEbenchmark
Website: https://www.cluebenchmarks.com/