GPT Neo
Overview
The GPT Neo model was released in the EleutherAI/gpt-neo repository by Sid Black, Stella Biderman, Leo Gao, Phil Wang and Connor Leahy. It is a GPT2-like causal language model trained on the Pile dataset.
The architecture is similar to GPT2 except that GPT Neo uses local attention in every other layer with a window size of 256 tokens.
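To make the difference between the two layer types concrete, here is a minimal pure-Python sketch of the attention masks involved. It is an illustration, not the library's implementation: the helper names `attention_mask` and `layer_attention_types` are hypothetical, and two assumptions are made for the sketch: that the alternation starts with a global layer, and that a window of `w` tokens means a position attends to itself and the `w - 1` positions before it.

```python
def attention_mask(seq_len, window=None):
    """Build a seq_len x seq_len boolean matrix.

    mask[i][j] is True when position i may attend to position j.
    With window=None this is an ordinary causal (global) mask; with an
    integer window it is a causal mask restricted to the last `window`
    positions, counting position i itself.
    """
    mask = []
    for i in range(seq_len):
        row = []
        for j in range(seq_len):
            allowed = j <= i  # causal: never look at future tokens
            if window is not None:
                allowed = allowed and (i - j) < window  # stay inside the window
            row.append(allowed)
        mask.append(row)
    return mask


def layer_attention_types(num_layers, pattern=("global", "local")):
    """Assign an attention type to each layer by cycling through `pattern`."""
    return [pattern[i % len(pattern)] for i in range(num_layers)]


if __name__ == "__main__":
    # Hypothetical toy sizes; GPT Neo itself uses a window of 256 tokens.
    print(layer_attention_types(4))
    for row in attention_mask(5, window=2):
        print("".join("x" if allowed else "." for allowed in row))
```

With a window as small as 2, each row of the local mask has at most two allowed positions, whereas the global mask allows every earlier position.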
This model was contributed by valhalla.
Usage example
The generate() method can be used to generate text with the GPT Neo model.
>>> from transformers import GPTNeoForCausalLM, GPT2Tokenizer
>>> # Load the pretrained 1.3B checkpoint and its tokenizer
>>> model = GPTNeoForCausalLM.from_pretrained("EleutherAI/gpt-neo-1.3B")
>>> tokenizer = GPT2Tokenizer.from_pretrained("EleutherAI/gpt-neo-1.3B")
>>> prompt = (
...     "In a shocking finding, scientists discovered a herd of unicorns living in a remote, "
...     "previously unexplored valley, in the Andes Mountains. Even more surprising to the "
...     "researchers was the fact that the unicorns spoke perfect English."
... )
>>> # Encode the prompt as PyTorch tensors
>>> input_ids = tokenizer(prompt, return_tensors="pt").input_ids
>>> # Sample a continuation; max_length counts the prompt tokens as well
>>> gen_tokens = model.generate(
...     input_ids,
...     do_sample=True,
...     temperature=0.9,
...     max_length=100,
... )
>>> # Decode the generated token ids back into text
>>> gen_text = tokenizer.batch_decode(gen_tokens)[0]
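The example above samples with `do_sample=True` and `temperature=0.9`. As a rough illustration of what a temperature does to next-token probabilities, the sketch below rescales a toy list of logits before applying a softmax and then draws one index. It is a standalone approximation of the general technique, not the code path `generate()` runs; the function names `softmax_with_temperature` and `sample_token` and the toy logits are hypothetical.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities after dividing them by `temperature`.

    Temperatures below 1 sharpen the distribution, temperatures above 1
    flatten it.
    """
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature=1.0, rng=random):
    """Draw one token index according to the temperature-scaled probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.0]  # hypothetical scores for three tokens
    print(softmax_with_temperature(toy_logits, temperature=1.0))
    print(softmax_with_temperature(toy_logits, temperature=0.9))
    print(sample_token(toy_logits, temperature=0.9, rng=random.Random(0)))
```

A temperature of 0.9 moves slightly more probability onto the highest-scoring token than a temperature of 1.0 would, so sampled text is a little more conservative while still varying between runs.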
Resources
GPTNeoConfig
autodoc GPTNeoConfig
GPTNeoModel
autodoc GPTNeoModel - forward
GPTNeoForCausalLM
autodoc GPTNeoForCausalLM - forward
GPTNeoForQuestionAnswering
autodoc GPTNeoForQuestionAnswering - forward
GPTNeoForSequenceClassification
autodoc GPTNeoForSequenceClassification - forward
GPTNeoForTokenClassification
autodoc GPTNeoForTokenClassification - forward
FlaxGPTNeoModel
autodoc FlaxGPTNeoModel - call
FlaxGPTNeoForCausalLM
autodoc FlaxGPTNeoForCausalLM - call