From 33d3072e1c54bcd235447b98c6dea1b4cb71234c Mon Sep 17 00:00:00 2001
From: Lysandre Debut
Date: Wed, 5 Feb 2020 15:26:28 -0500
Subject: [PATCH] Arxiv README (#2747)

* Arxiv README

* ArXiv-NLP readme
---
 model_cards/lysandre/arxiv-nlp/README.md | 7 +++++++
 model_cards/lysandre/arxiv/README.md     | 7 +++++++
 2 files changed, 14 insertions(+)
 create mode 100644 model_cards/lysandre/arxiv-nlp/README.md
 create mode 100644 model_cards/lysandre/arxiv/README.md

diff --git a/model_cards/lysandre/arxiv-nlp/README.md b/model_cards/lysandre/arxiv-nlp/README.md
new file mode 100644
index 00000000000..dfb295ab3b7
--- /dev/null
+++ b/model_cards/lysandre/arxiv-nlp/README.md
@@ -0,0 +1,7 @@
+# ArXiv-NLP GPT-2 checkpoint
+
+This is a GPT-2 small checkpoint for PyTorch. It is the official `gpt2-small` fine-tuned on ArXiv papers from the computational linguistics field.
+
+## Training data
+
+This model was trained on a subset of ArXiv papers that were parsed from PDF to plain text. The resulting dataset contains 80MB of text from the computational linguistics (cs.CL) field.
\ No newline at end of file
diff --git a/model_cards/lysandre/arxiv/README.md b/model_cards/lysandre/arxiv/README.md
new file mode 100644
index 00000000000..2996ef75a4a
--- /dev/null
+++ b/model_cards/lysandre/arxiv/README.md
@@ -0,0 +1,7 @@
+# ArXiv GPT-2 checkpoint
+
+This is a GPT-2 small checkpoint for PyTorch. It is the official `gpt2-small` fine-tuned on ArXiv papers from physics fields.
+
+## Training data
+
+This model was trained on a subset of ArXiv papers that were parsed from PDF to plain text. The resulting dataset contains 130MB of text, mostly from quantum physics (quant-ph) and other physics sub-fields.