language | thumbnail |
---|---|
es | https://i.imgur.com/jgBdimh.png |
Spanish BERT (BETO) + NER
This model is a version of the Spanish BERT cased (BETO) fine-tuned on NER-C for the NER downstream task.
Details of the downstream task (NER) - Dataset
I preprocessed the dataset and split it into train / dev sets (80/20).
Dataset | # Examples |
---|---|
Train | 8.7 K |
Dev | 2.2 K |
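The preprocessing script itself is not shown here, so the following is only a minimal sketch of how an 80/20 train / dev split could be done. The function name `split_train_dev`, the fixed seed, and the toy sentence list are assumptions for illustration, not the author's actual code.

```python
import random


def split_train_dev(examples, dev_fraction=0.2, seed=42):
    """Shuffle a list of examples and split it into (train, dev).

    Hypothetical helper: the real preprocessing may have split differently.
    """
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # deterministic shuffle
    n_dev = int(round(len(shuffled) * dev_fraction))
    return shuffled[n_dev:], shuffled[:n_dev]


# Toy stand-in for tagged sentences; the real corpus is not reproduced here.
sentences = [f"sentence-{i}" for i in range(100)]
train, dev = split_train_dev(sentences)
print(len(train), len(dev))  # prints: 80 20
```

Fixing the seed keeps the split reproducible, so scores on the dev set stay comparable between runs.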
Labels covered:
B-LOC
B-MISC
B-ORG
B-PER
I-LOC
I-MISC
I-ORG
I-PER
O
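These labels follow the BIO tagging scheme: `B-` marks the first token of an entity, `I-` a continuation token, and `O` a token outside any entity. As a rough sketch of how such tags can be merged back into entity spans, assuming a hypothetical helper `bio_to_spans` and a hand-tagged example sentence (not model output):

```python
def bio_to_spans(tokens, tags):
    """Group BIO-tagged tokens into (entity_type, text) spans."""
    spans = []
    current_type, current_tokens = None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always opens a new span; close any open one first.
            if current_tokens:
                spans.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # Continuation of the currently open span.
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open span:
            # close the open span and skip this token.
            if current_tokens:
                spans.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = None, []
    if current_tokens:
        spans.append((current_type, " ".join(current_tokens)))
    return spans


# Hand-tagged illustration, not produced by the model.
tokens = ["Manuel", "Romero", "vive", "en", "España"]
tags = ["B-PER", "I-PER", "O", "O", "B-LOC"]
spans = bio_to_spans(tokens, tags)
print(spans)  # prints: [('PER', 'Manuel Romero'), ('LOC', 'España')]
```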
Metrics on evaluation set:
Metric | Score |
---|---|
F1 | 90.17 |
Precision | 89.86 |
Recall | 90.47 |
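F1 is the harmonic mean of precision and recall, so the three numbers can be cross-checked. A small sketch, with `f1_score` as an illustrative helper rather than the evaluation script that produced the table:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same scale as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


f1 = f1_score(89.86, 90.47)
print(round(f1, 2))  # prints: 90.16
```

The result agrees with the reported 90.17 to within 0.01; the small gap is plausibly because the precision and recall in the table are themselves rounded.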
Comparison:
Model | F1 score | Size (MB) |
---|---|---|
bert-base-spanish-wwm-cased (BETO) | 88.43 | 421 |
bert-spanish-cased-finetuned-ner (this one) | 90.17 | 420 |
Best Multilingual BERT | 87.38 | 681 |
TinyBERT-spanish-uncased-finetuned-ner | 70.00 | 55 |
Model in action
Fast usage with pipelines:
from transformers import pipeline

# Load the fine-tuned model. The tokenizer is passed as a (name, kwargs) tuple
# so that the slow (pure Python) tokenizer is used.
nlp_ner = pipeline(
    "ner",
    model="mrm8488/bert-spanish-cased-finetuned-ner",
    tokenizer=(
        "mrm8488/bert-spanish-cased-finetuned-ner",
        {"use_fast": False},
    ),
)

text = "Mis amigos están pensando viajar a Londres este verano"
nlp_ner(text)
# Output: [{'entity': 'B-LOC', 'score': 0.9998720288276672, 'word': 'Londres'}]
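The pipeline returns a list of dictionaries with `entity`, `score`, and `word` keys, as in the output above. One way to consume that list is to keep only confident predictions. This is a sketch: the helper name `entities_above` and the 0.9 threshold are assumptions, and the sample list is copied from the output shown above rather than produced by running the model.

```python
def entities_above(predictions, threshold=0.9):
    """Return (word, entity) pairs whose score is at least `threshold`."""
    return [
        (p["word"], p["entity"])
        for p in predictions
        if p["score"] >= threshold
    ]


# Copied from the example output above; no model is loaded here.
predictions = [
    {"entity": "B-LOC", "score": 0.9998720288276672, "word": "Londres"}
]
print(entities_above(predictions))  # prints: [('Londres', 'B-LOC')]
```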
Created by Manuel Romero/@mrm8488
Made with ♥ in Spain