PyTorch TensorFlow Flax SDPA FlashAttention

GPT

GPT (Generative Pre-trained Transformer) focuses on effectively learning text representations and transferring them to downstream tasks. The model pretrains a Transformer decoder to predict the next word and is then fine-tuned on labeled data.
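
To make the pretraining objective concrete, here is a minimal, library-free sketch of how next-word prediction pairs each prefix of a sequence with the token that follows it. The helper name `next_token_pairs` and the whitespace tokenization are illustrative assumptions, not part of Transformers; the actual model uses a BPE tokenizer and is trained with a cross-entropy loss over these targets.

```python
def next_token_pairs(tokens):
    """Return (context, target) pairs for left-to-right next-token prediction.

    Hypothetical helper for illustration only; not a Transformers API.
    """
    # The model sees tokens[:i] and is trained to predict tokens[i].
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


# Whitespace splitting stands in for the real BPE tokenizer (an assumption).
tokens = "the future of AI is bright".split()
for context, target in next_token_pairs(tokens):
    print(" ".join(context), "->", target)
```

Fine-tuning reuses the same pretrained decoder and swaps the next-word targets for task labels.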

GPT can generate high-quality text, and after fine-tuning it is well suited to a variety of natural language understanding tasks such as textual entailment, question answering, semantic similarity, and document classification.

You can find all the original GPT checkpoints under the OpenAI community organization.

Tip

Click on the GPT models in the right sidebar for more examples of how to apply GPT to different language tasks.

The example below demonstrates how to generate text with [Pipeline], [AutoModel], and from the command line.

# Pipeline
import torch
from transformers import pipeline

generator = pipeline(task="text-generation", model="openai-community/openai-gpt", torch_dtype=torch.float16, device=0)
output = generator("The future of AI is", max_length=50, do_sample=True)
print(output[0]["generated_text"])

# AutoModel
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("openai-community/openai-gpt")
model = AutoModelForCausalLM.from_pretrained("openai-community/openai-gpt", torch_dtype=torch.float16)

inputs = tokenizer("The future of AI is", return_tensors="pt")
outputs = model.generate(**inputs, max_length=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

# Command line
echo -e "The future of AI is" | transformers run --task text-generation --model openai-community/openai-gpt --device 0

Notes

  • Inputs should be padded on the right because GPT uses absolute position embeddings.
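
The sketch below illustrates why the padding side matters with absolute position embeddings: position ids count 0, 1, 2, ... from the left, so right padding keeps every real token at the position it would occupy unpadded, while left padding shifts it. `pad_batch`, `first_real_position`, the pad id `0`, and the token ids are hypothetical choices for illustration, not Transformers APIs; in practice the tokenizer's `padding_side` setting controls this.

```python
def pad_batch(sequences, pad_id=0, side="right"):
    """Pad token-id lists to equal length; return (input_ids, attention_mask)."""
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        padding = [pad_id] * (max_len - len(seq))
        mask_pad = [0] * len(padding)
        if side == "right":
            input_ids.append(list(seq) + padding)
            attention_mask.append([1] * len(seq) + mask_pad)
        else:
            input_ids.append(padding + list(seq))
            attention_mask.append(mask_pad + [1] * len(seq))
    return input_ids, attention_mask


def first_real_position(mask):
    """Absolute position index of the first non-padding token."""
    return mask.index(1)


batch = [[11, 12, 13, 14], [21, 22]]  # made-up token ids
_, right_mask = pad_batch(batch, side="right")
_, left_mask = pad_batch(batch, side="left")

# Right padding: the short sequence still starts at position 0.
print(first_real_position(right_mask[1]))
# Left padding: the same sequence now starts at position 2.
print(first_real_position(left_mask[1]))
```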

OpenAIGPTConfig

autodoc OpenAIGPTConfig

OpenAIGPTModel

autodoc OpenAIGPTModel

  • forward

OpenAIGPTLMHeadModel

autodoc OpenAIGPTLMHeadModel

  • forward

OpenAIGPTDoubleHeadsModel

autodoc OpenAIGPTDoubleHeadsModel

  • forward

OpenAIGPTForSequenceClassification

autodoc OpenAIGPTForSequenceClassification

  • forward

OpenAIGPTTokenizer

autodoc OpenAIGPTTokenizer

OpenAIGPTTokenizerFast

autodoc OpenAIGPTTokenizerFast

TFOpenAIGPTModel

autodoc TFOpenAIGPTModel

  • call

TFOpenAIGPTLMHeadModel

autodoc TFOpenAIGPTLMHeadModel

  • call

TFOpenAIGPTDoubleHeadsModel

autodoc TFOpenAIGPTDoubleHeadsModel

  • call

TFOpenAIGPTForSequenceClassification

autodoc TFOpenAIGPTForSequenceClassification

  • call