transformers/docs/source/en/model_doc/gpt_neo.md
Sylvain Gugger eb849f6604
Migrate doc files to Markdown. (#24376)
* Rename index.mdx to index.md

* With saved modifs

* Address review comment

* Treat all files

* .mdx -> .md

* Remove special char

* Update utils/tests_fetcher.py

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

---------

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
2023-06-20 18:07:47 -04:00

2.8 KiB

GPT Neo

Overview

The GPTNeo model was released in the EleutherAI/gpt-neo repository by Sid Black, Stella Biderman, Leo Gao, Phil Wang and Connor Leahy. It is a GPT2 like causal language model trained on the Pile dataset.

The architecture is similar to GPT2 except that GPT Neo uses local attention in every other layer with a window size of 256 tokens.

This model was contributed by valhalla.

Generation

The generate() method can be used to generate text using GPT Neo model.

>>> from transformers import GPTNeoForCausalLM, GPT2Tokenizer

>>> model = GPTNeoForCausalLM.from_pretrained("EleutherAI/gpt-neo-1.3B")
>>> tokenizer = GPT2Tokenizer.from_pretrained("EleutherAI/gpt-neo-1.3B")

>>> prompt = (
...     "In a shocking finding, scientists discovered a herd of unicorns living in a remote, "
...     "previously unexplored valley, in the Andes Mountains. Even more surprising to the "
...     "researchers was the fact that the unicorns spoke perfect English."
... )

>>> input_ids = tokenizer(prompt, return_tensors="pt").input_ids

>>> gen_tokens = model.generate(
...     input_ids,
...     do_sample=True,
...     temperature=0.9,
...     max_length=100,
... )
>>> gen_text = tokenizer.batch_decode(gen_tokens)[0]

Documentation resources

GPTNeoConfig

autodoc GPTNeoConfig

GPTNeoModel

autodoc GPTNeoModel - forward

GPTNeoForCausalLM

autodoc GPTNeoForCausalLM - forward

GPTNeoForQuestionAnswering

autodoc GPTNeoForQuestionAnswering - forward

GPTNeoForSequenceClassification

autodoc GPTNeoForSequenceClassification - forward

GPTNeoForTokenClassification

autodoc GPTNeoForTokenClassification - forward

FlaxGPTNeoModel

autodoc FlaxGPTNeoModel - call

FlaxGPTNeoForCausalLM

autodoc FlaxGPTNeoForCausalLM - call