Update README.md (model_card) (#4424)

- Add a citation.
- Fix the table of the BLUE benchmark: the first version was not displayed correctly on https://huggingface.co/seiya/oubiobert-base-uncased. Could you please confirm that this fix allows it to display correctly?
---
tags:
- pytorch
- exbert
license: apache-2.0
---

# ouBioBERT-Base, Uncased
Bidirectional Encoder Representations from Transformers for Biomedical Text Mining by Osaka University (ouBioBERT) is a language model based on the BERT-Base (Devlin, et al., 2019) architecture. We pre-trained ouBioBERT on PubMed abstracts from the PubMed baseline (ftp://ftp.ncbi.nlm.nih.gov/pubmed/baseline) via our method.
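The pre-trained weights can be loaded through the standard Hugging Face `transformers` API. A minimal sketch, assuming the checkpoint ID `seiya/oubiobert-base-uncased` from the model page this card accompanies:

```python
# Minimal usage sketch: load ouBioBERT as a masked language model.
# The checkpoint ID "seiya/oubiobert-base-uncased" is the model this card describes.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("seiya/oubiobert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("seiya/oubiobert-base-uncased")

text = "Coronavirus disease (COVID-19) is caused by [MASK]."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Decode the highest-scoring token at the masked position.
mask_index = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero().item()
predicted_id = logits[0, mask_index].argmax().item()
print(tokenizer.decode([predicted_id]))
```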

The details of the pre-training procedure can be found in Wada, et al. (2020).

## Evaluation

We evaluated the performance of ouBioBERT on the Biomedical Language Understanding Evaluation (BLUE) benchmark (Peng, et al., 2019). The numbers are the mean (standard deviation) across five different random seeds.

| Dataset         | Task Type                    |          Score |
|:----------------|:-----------------------------|---------------:|
| MedSTS          | Sentence similarity          |     84.9 (0.6) |
| BIOSSES         | Sentence similarity          |     92.3 (0.8) |
| BC5CDR-disease  | Named-entity recognition     |     87.4 (0.1) |
| BC5CDR-chemical | Named-entity recognition     |     93.7 (0.2) |
| ShARe/CLEFE     | Named-entity recognition     |     80.1 (0.4) |
| DDI             | Relation extraction          |     81.1 (1.5) |
| ChemProt        | Relation extraction          |     75.0 (0.3) |
| i2b2 2010       | Relation extraction          |     74.0 (0.8) |
| HoC             | Document classification      |     86.4 (0.5) |
| MedNLI          | Inference                    |     83.6 (0.7) |
| **Total**       | Macro average of the scores  | **83.8 (0.3)** |

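As a rough cross-check of the table, the unweighted mean of the ten dataset scores lands close to the reported total. Note the paper's macro average may aggregate the scores differently (e.g. by task), so this is only illustrative:

```python
# Illustrative cross-check: unweighted mean of the ten dataset scores above.
scores = [84.9, 92.3, 87.4, 93.7, 80.1, 81.1, 75.0, 74.0, 86.4, 83.6]
mean = sum(scores) / len(scores)
print(f"{mean:.2f}")  # close to the reported total of 83.8
```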
## Code for Fine-tuning

We made the source code for fine-tuning freely available at [our repository](https://github.com/sy-wada/blue_benchmark_with_transformers).

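For orientation, fine-tuning this checkpoint on a BLUE-style classification task follows the usual `transformers` pattern. This is only a sketch with placeholder sentences and a placeholder label count; the repository linked above contains the actual scripts and hyperparameters:

```python
# Sketch of a sequence-classification fine-tuning setup (placeholder inputs and
# label count; see the linked repository for the real BLUE fine-tuning code).
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "seiya/oubiobert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# A fresh classification head is initialized on top of the pre-trained encoder.
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=3)

batch = tokenizer(
    ["Aspirin interacts with warfarin.", "No interaction was observed."],
    padding=True,
    truncation=True,
    return_tensors="pt",
)
outputs = model(**batch)  # outputs.logits has shape (batch_size, num_labels)
```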
## Citation

If you use our work in your research, please kindly cite the following paper:

```bibtex
@misc{2005.07202,
  Author = {Shoya Wada and Toshihiro Takeda and Shiro Manabe and Shozo Konishi and Jun Kamohara and Yasushi Matsumura},
  Title = {A pre-training technique to localize medical BERT and enhance BioBERT},
  Year = {2020},
  Eprint = {arXiv:2005.07202},
}
```

<a href="https://huggingface.co/exbert/?model=seiya/oubiobert-base-uncased&sentence=Coronavirus%20disease%20(COVID-19)%20is%20caused%20by%20SARS-COV2%20and%20represents%20the%20causative%20agent%20of%20a%20potentially%20fatal%20disease%20that%20is%20of%20great%20global%20public%20health%20concern.">