diff --git a/model_cards/lysandre/arxiv-nlp/README.md b/model_cards/lysandre/arxiv-nlp/README.md
new file mode 100644
index 00000000000..dfb295ab3b7
--- /dev/null
+++ b/model_cards/lysandre/arxiv-nlp/README.md
@@ -0,0 +1,7 @@
+# ArXiv-NLP GPT-2 checkpoint
+
+This is a GPT-2 small checkpoint for PyTorch. It is the official `gpt2-small` fine-tuned on ArXiv papers from the computational linguistics field.
+
+## Training data
+
+This model was trained on a subset of ArXiv papers that were converted from PDF to plain text. The resulting data consists of 80 MB of text from the computational linguistics (cs.CL) field.
\ No newline at end of file
diff --git a/model_cards/lysandre/arxiv/README.md b/model_cards/lysandre/arxiv/README.md
new file mode 100644
index 00000000000..2996ef75a4a
--- /dev/null
+++ b/model_cards/lysandre/arxiv/README.md
@@ -0,0 +1,7 @@
+# ArXiv GPT-2 checkpoint
+
+This is a GPT-2 small checkpoint for PyTorch. It is the official `gpt2-small` fine-tuned on ArXiv papers from physics fields.
+
+## Training data
+
+This model was trained on a subset of ArXiv papers that were converted from PDF to plain text. The resulting data consists of 130 MB of text, mostly from quantum physics (quant-ph) and other physics sub-fields.