Arxiv README (#2747)

* Arxiv README

* ArXiv-NLP readme
This commit is contained in:
Lysandre Debut 2020-02-05 15:26:28 -05:00 committed by GitHub
parent eae8ee0389
commit 33d3072e1c
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
2 changed files with 14 additions and 0 deletions

View File

@ -0,0 +1,7 @@
# ArXiv-NLP GPT-2 checkpoint
This is a GPT-2 small checkpoint for PyTorch. It is the official `gpt2-small` fine-tuned to ArXiv paper on the computational linguistics field.
## Training data
This model was trained on a subset of ArXiv papers that were parsed from PDF to txt. The resulting data is made of 80MB of text from the computational linguistics (cs.CL) field.

View File

@ -0,0 +1,7 @@
# ArXiv GPT-2 checkpoint
This is a GPT-2 small checkpoint for PyTorch. It is the official `gpt2-small` finetuned to ArXiv paper on physics fields.
## Training data
This model was trained on a subset of ArXiv papers that were parsed from PDF to txt. The resulting data is made of 130MB of text, mostly from quantum physics (quant-ph) and other physics sub-fields.