Creating a readme for ALBERT in Mongolian (#4603)

Here I am uploading a Mongolian masked language model (ALBERT) to your platform. https://en.wikipedia.org/wiki/Mongolia
2025-08-01 10:41:07 +06:00 · 2020-05-27 04:54:42 +08:00 · a801c7fd74
commit a801c7fd74
parent 6458c0e268
1 changed files with 55 additions and 0 deletions
--- a/model_cards/bayartsogt/albert-mongolian/README.md
+++ b/model_cards/bayartsogt/albert-mongolian/README.md
@@ -0,0 +1,55 @@
 # ALBERT-Mongolian
 [pretraining repo link](https://github.com/bayartsogt-ya/albert-mongolian)
 ## Model description
 Here we provide a pretrained ALBERT model and a trained SentencePiece model for Mongolian text. The training data consists of the Mongolian Wikipedia corpus from Wikipedia Downloads and a Mongolian news corpus.
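
A minimal sketch of loading the model for masked-token prediction with the Hugging Face `transformers` library. The hub identifier `bayartsogt/albert-mongolian` is an assumption inferred from this model card's path, and the helper names `load_fill_mask` and `demo` and the example sentence are hypothetical, chosen only for illustration.

```python
# Assumed hub id, inferred from the model card path; adjust if it differs.
MODEL_ID = "bayartsogt/albert-mongolian"


def load_fill_mask(model_id: str = MODEL_ID):
    """Build a fill-mask pipeline (hypothetical helper; downloads weights on first use)."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AlbertForMaskedLM, AlbertTokenizer, pipeline

    tokenizer = AlbertTokenizer.from_pretrained(model_id)
    model = AlbertForMaskedLM.from_pretrained(model_id)
    return pipeline("fill-mask", model=model, tokenizer=tokenizer)


def demo():
    """Print candidate tokens for one masked position in an illustrative sentence."""
    fill_mask = load_fill_mask()
    # Roughly: "The capital of Mongolia is [MASK] city."
    text = f"Монгол улсын нийслэл {fill_mask.tokenizer.mask_token} хот."
    for prediction in fill_mask(text):
        print(prediction["token_str"], round(prediction["score"], 3))
```

Calling `demo()` requires network access plus the `transformers` and `sentencepiece` packages; the predictions depend on the published weights, so no output is shown here.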
 ## Evaluation Result
 ```
 loss = 1.7478163
 masked_lm_accuracy = 0.6838185
 masked_lm_loss = 1.6687671
 sentence_order_accuracy = 0.998125
 sentence_order_loss = 0.007942731
 ```
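
As a rough aid for reading these numbers, the masked-LM loss can be converted to a perplexity. This sketch assumes the reported loss is a mean cross-entropy measured in nats (natural logarithm), which the output above does not state explicitly.

```python
import math

# Values copied from the evaluation output above.
masked_lm_loss = 1.6687671
masked_lm_accuracy = 0.6838185

# Assumption: the loss is a mean cross-entropy in nats, so perplexity = exp(loss).
perplexity = math.exp(masked_lm_loss)

print(f"masked-LM perplexity: {perplexity:.2f}")
print(f"masked-LM accuracy:   {masked_lm_accuracy:.1%}")
```

Under that assumption the perplexity comes out at about 5.3, meaning the model is on average about as uncertain as a uniform choice among five or so tokens at each masked position.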
 ## Fine-tuning Result on Eduge Dataset
 ```
                precision    recall  f1-score   support
  байгал орчин       0.83      0.76      0.80       483
     боловсрол       0.79      0.75      0.77       420
         спорт       0.98      0.96      0.97      1391
     технологи       0.85      0.83      0.84       543
       улс төр       0.88      0.87      0.87      1336
    урлаг соёл       0.89      0.94      0.91       726
         хууль       0.87      0.83      0.85       840
   эдийн засаг       0.80      0.84      0.82      1265
    эрүүл мэнд       0.84      0.90      0.87       562
      accuracy                           0.87      7566
     macro avg       0.86      0.85      0.86      7566
  weighted avg       0.87      0.87      0.87      7566
 ```
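
The summary rows of the report can be checked against the per-class rows. The sketch below recomputes the macro and support-weighted F1 averages from the rounded values in the table, so small rounding differences against the original unrounded scores are possible. The variable names are illustrative.

```python
# (precision, recall, f1, support) per class, copied from the report above.
rows = {
    "байгал орчин": (0.83, 0.76, 0.80, 483),
    "боловсрол": (0.79, 0.75, 0.77, 420),
    "спорт": (0.98, 0.96, 0.97, 1391),
    "технологи": (0.85, 0.83, 0.84, 543),
    "улс төр": (0.88, 0.87, 0.87, 1336),
    "урлаг соёл": (0.89, 0.94, 0.91, 726),
    "хууль": (0.87, 0.83, 0.85, 840),
    "эдийн засаг": (0.80, 0.84, 0.82, 1265),
    "эрүүл мэнд": (0.84, 0.90, 0.87, 562),
}

total_support = sum(support for _, _, _, support in rows.values())

# Macro average: every class counts equally, regardless of its size.
macro_f1 = sum(f1 for _, _, f1, _ in rows.values()) / len(rows)

# Weighted average: each class contributes in proportion to its support.
weighted_f1 = sum(f1 * support for _, _, f1, support in rows.values()) / total_support

print(f"total support: {total_support}")
print(f"macro F1:      {macro_f1:.2f}")
print(f"weighted F1:   {weighted_f1:.2f}")
```

The macro average sits slightly below the weighted one because the two smallest classes (байгал орчин and боловсрол) have the lowest F1 scores, while the large спорт class has the highest.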
 ## Reference
 1. [ALBERT - official repo](https://github.com/google-research/albert)
 2. [WikiExtractor](https://github.com/attardi/wikiextractor)
 3. [Mongolian BERT](https://github.com/tugstugi/mongolian-bert)
 4. [ALBERT - Japanese](https://github.com/alinear-corp/albert-japanese)
 5. [Mongolian Text Classification](https://github.com/sharavsambuu/mongolian-text-classification)
 6. [You et al., "Large Batch Optimization for Deep Learning: Training BERT in 76 minutes"](https://arxiv.org/abs/1904.00962)
 ## Citation
 ```
@misc{albert-mongolian,
  author = {Bayartsogt Yadamsuren},
  title = {ALBERT Pretrained Model on Mongolian Datasets},
  year = {2020},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/bayartsogt-ya/albert-mongolian/}}
 }
 ```
 ## For More Information
 Please contact bayartsogtyadamsuren@icloud.com