From 2c7749784c2581cff57fdfbecfc16fb4a11a45e0 Mon Sep 17 00:00:00 2001
From: Manuel Romero
Date: Mon, 2 Mar 2020 19:31:58 +0100
Subject: [PATCH] Update README.md

- Add usage example
- Update metrics

---
 .../README.md | 56 ++++++++++++-------
 1 file changed, 37 insertions(+), 19 deletions(-)

diff --git a/model_cards/mrm8488/bert-spanish-cased-finetuned-ner/README.md b/model_cards/mrm8488/bert-spanish-cased-finetuned-ner/README.md
index aa32ca6fde6..e50b8cfd0ce 100644
--- a/model_cards/mrm8488/bert-spanish-cased-finetuned-ner/README.md
+++ b/model_cards/mrm8488/bert-spanish-cased-finetuned-ner/README.md
@@ -19,22 +19,29 @@ I preprocessed the dataset and splitted it as train / dev (80/20)
 | Dev   | 2.2 K |
 
-- [Fine-tune on NER script](https://github.com/huggingface/transformers/blob/master/examples/run_ner.py)
+- [Fine-tune on NER script provided by Hugging Face](https://github.com/huggingface/transformers/blob/master/examples/run_ner.py)
 
-```bash
-!export NER_DIR='/content/ner_dataset'
-!python /content/transformers/examples/run_ner.py \
-  --model_type bert \
-  --model_name_or_path dccuchile/bert-base-spanish-wwm-cased \
-  --do_train \
-  --do_eval \
-  --data_dir '/content/ner_dataset' \
-  --num_train_epochs 15.0 \
-  --max_seq_length 384 \
-  --output_dir /content/model_output \
-  --save_steps 5000 \
+- Labels covered:
 ```
+B-LOC
+B-MISC
+B-ORG
+B-PER
+I-LOC
+I-MISC
+I-ORG
+I-PER
+O
+```
+
+## Metrics on evaluation set:
+
+| Metric    |   Score   |
+| :-------: | :-------: |
+| F1        | **90.17** |
+| Precision | **89.86** |
+| Recall    | **90.47** |
 
 ## Comparison:
 
@@ -44,13 +51,24 @@
 | [bert-spanish-cased-finetuned-ner (this one)](https://huggingface.co/mrm8488/bert-spanish-cased-finetuned-ner) | **89.65** |
 | Best Multilingual BERT | 87.38 |
 
-```
-   ***** All metrics on Eval results *****
+## Model in action
 
-f1 = 0.8965040489828165
-loss = 0.11504213575173258
-precision = 0.893679858239811
-recall = 0.8993461462254805
+Fast usage with **pipelines**:
+
+```python
+from transformers import pipeline
+
+nlp_ner = pipeline(
+    "ner",
+    model="mrm8488/bert-spanish-cased-finetuned-ner",
+    tokenizer=(
+        "mrm8488/bert-spanish-cased-finetuned-ner",
+        {"use_fast": False}
+))
+
+nlp_ner("Mis amigos están pensando en viajar a Londres este verano")
+
+# Output: [{'entity': 'B-LOC', 'score': 0.9998720288276672, 'word': 'Londres'}]
 ```
 
 > Created by [Manuel Romero/@mrm8488](https://twitter.com/mrm8488)
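For readers who want the same result without the `pipeline` helper, here is a minimal sketch using the standard `AutoTokenizer` / `AutoModelForTokenClassification` classes. It assumes PyTorch and a recent `transformers` release; the example sentence, the `model_id` variable name, and the naive sub-word handling are illustrative only and are not part of the patch above.

```python
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

model_id = "mrm8488/bert-spanish-cased-finetuned-ner"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)

# Illustrative input; any Spanish sentence works here.
text = "Mis amigos están pensando en viajar a Londres este verano"

# Encode, run the token-classification head, and take the best label per token.
input_ids = tokenizer.encode(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(input_ids)
logits = outputs[0]  # shape: (1, seq_len, num_labels)
label_ids = logits.argmax(dim=-1)[0].tolist()

tokens = tokenizer.convert_ids_to_tokens(input_ids[0].tolist())
for token, label_id in zip(tokens, label_ids):
    label = model.config.id2label[label_id]
    if label != "O" and token not in tokenizer.all_special_tokens:
        print(token, label)  # e.g. "Londres B-LOC"
```

The `ner` pipeline shown in the card does essentially this internally, and additionally attaches a softmax score to each predicted entity.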