Create README.md (#5872)

2025-07-21 13:38:31 +06:00 · 2020-07-21 19:20:21 +02:00 · 2020-07-21 19:20:21 +02:00 · 783a0c7ee9
commit 783a0c7ee9
parent e7844d60c2
1 changed file with 50 additions and 0 deletions
--- a/model_cards/jannesg/takalane_nso_roberta/README.md
+++ b/model_cards/jannesg/takalane_nso_roberta/README.md
@@ -0,0 +1,50 @@
+---
+language: 
+- nso
+thumbnail: https://pbs.twimg.com/media/EVjR6BsWoAAFaq5.jpg
+tags:
+- nso
+- fill-mask
+- pytorch
+- roberta
+- lm-head
+- masked-lm
+license: mit
+---
+
+# Takalani Sesame - Northern Sotho 🇿🇦
+
+<img src="https://pbs.twimg.com/media/EVjR6BsWoAAFaq5.jpg" width="600"/> 
+
+## Model description
+
+Takalani Sesame (named after the South African version of Sesame Street) is a project that aims to promote the use of South African languages in NLP and, in particular, to explore techniques that help low-resource languages match the performance of higher-resourced languages around the world.
+
+## Intended uses & limitations
+
+#### How to use
+
+```python
+from transformers import AutoTokenizer, AutoModelForMaskedLM
+
+tokenizer = AutoTokenizer.from_pretrained("jannesg/takalane_nso_roberta")
+# AutoModelForMaskedLM replaces the deprecated AutoModelWithLMHead for masked-LM checkpoints.
+model = AutoModelForMaskedLM.from_pretrained("jannesg/takalane_nso_roberta")
+```
+
+#### Limitations and bias
+
+Updates will be added continuously to improve performance.
+
+## Training data
+
+Data collected from [https://wortschatz.uni-leipzig.de/en](https://wortschatz.uni-leipzig.de/en) <br/>
+**Sentences:** 4746
+
+## Training procedure
+
+No preprocessing was applied. The model was trained with the standard Hugging Face hyperparameters.
+
+## Author
+
+Jannes Germishuys [website](http://jannesgg.github.io)
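
The "How to use" snippet in the README only loads the tokenizer and model. As a rough sketch of how the checkpoint might then be queried for its `fill-mask` task, the example below masks one word in a sentence and asks the `transformers` fill-mask pipeline for candidates. The helper names `mask_word` and `predict_masked` are hypothetical and not part of the README, and the `<mask>` token is assumed from the usual RoBERTa tokenizer defaults; in real use, `tokenizer.mask_token` is the safer source.

```python
from typing import List, Tuple

MODEL_NAME = "jannesg/takalane_nso_roberta"  # checkpoint named in the README


def mask_word(sentence: str, target: str, mask_token: str = "<mask>") -> str:
    """Replace the first whole-word occurrence of `target` with `mask_token`.

    Hypothetical helper. Words are split on whitespace, so runs of
    whitespace in the input are collapsed to single spaces.
    """
    words = sentence.split()
    if target not in words:
        raise ValueError(f"{target!r} does not occur in the sentence")
    words[words.index(target)] = mask_token
    return " ".join(words)


def predict_masked(masked_sentence: str, top_k: int = 5) -> List[Tuple[str, float]]:
    """Return (token, score) pairs for the single masked position.

    Hypothetical helper. It downloads the checkpoint on first use,
    so it needs network access and the `transformers` package.
    """
    from transformers import pipeline  # imported lazily: heavy dependency

    fill_mask = pipeline("fill-mask", model=MODEL_NAME)
    # `top_k` is the argument name in recent transformers releases;
    # older releases called it `topk`.
    results = fill_mask(masked_sentence, top_k=top_k)
    return [(r["token_str"], r["score"]) for r in results]
```

A call would look like `predict_masked(mask_word(sentence, word))`, where `sentence` is a Northern Sotho sentence of your choosing and `word` is one of its whitespace-separated words. With only 4746 training sentences, the returned scores are best read as rough suggestions.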