transformers/docs/source/en/model_doc/bigbird_pegasus.md
Drew Ross fe35eca7bd
Update BigBirdPegasus model card (#39104)
* Update igbird_pegasus.md

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-06-30 10:42:56 -07:00

9.3 KiB

PyTorch

BigBirdPegasus

BigBirdPegasus is an encoder-decoder (sequence-to-sequence) transformer model for long-input summarization. It extends the BigBird architecture with an additional pretraining objective borrowed from Pegasus called gap sequence generation (GSG). Whole sentences are masked and the model has to fill in the gaps in the document. BigBirdPegasus's ability to keep track of long contexts makes it effective at summarizing lengthy inputs, surpassing the performance of base Pegasus models.

You can find all the original BigBirdPegasus checkpoints under the Google organization.

Tip

This model was contributed by vasudevgupta.

Click on the BigBirdPegasus models in the right sidebar for more examples of how to apply BigBirdPegasus to different language tasks.

The example below demonstrates how to summarize text with [Pipeline], [AutoModel], and from the command line.

import torch
from transformers import pipeline

pipeline = pipeline(
    task="summarization",
    model="google/bigbird-pegasus-large-arxiv",
    torch_dtype=torch.float32,
    device=0
)
pipeline("""Plants are among the most remarkable and essential life forms on Earth, possessing a unique ability to produce their own food through a process known as photosynthesis. This complex biochemical process is fundamental not only to plant life but to virtually all life on the planet.
Through photosynthesis, plants capture energy from sunlight using a green pigment called chlorophyll, which is located in specialized cell structures called chloroplasts. In the presence of light, plants absorb carbon dioxide from the atmosphere through small pores in their leaves called stomata, and take in water from the soil through their root systems.
These ingredients are then transformed into glucose, a type of sugar that serves as a source of chemical energy, and oxygen, which is released as a byproduct into the atmosphere. The glucose produced during photosynthesis is not just used immediately; plants also store it as starch or convert it into other organic compounds like cellulose, which is essential for building their cellular structure.
This energy reserve allows them to grow, develop leaves, produce flowers, bear fruit, and carry out various physiological processes throughout their lifecycle.""")
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained(
    "google/bigbird-pegasus-large-arxiv"
)
model = AutoModelForSeq2SeqLM.from_pretrained(
    "google/bigbird-pegasus-large-arxiv",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

input_text = """Plants are among the most remarkable and essential life forms on Earth, possessing a unique ability to produce their own food through a process known as photosynthesis. This complex biochemical process is fundamental not only to plant life but to virtually all life on the planet.
Through photosynthesis, plants capture energy from sunlight using a green pigment called chlorophyll, which is located in specialized cell structures called chloroplasts. In the presence of light, plants absorb carbon dioxide from the atmosphere through small pores in their leaves called stomata, and take in water from the soil through their root systems.
These ingredients are then transformed into glucose, a type of sugar that serves as a source of chemical energy, and oxygen, which is released as a byproduct into the atmosphere. The glucose produced during photosynthesis is not just used immediately; plants also store it as starch or convert it into other organic compounds like cellulose, which is essential for building their cellular structure.
This energy reserve allows them to grow, develop leaves, produce flowers, bear fruit, and carry out various physiological processes throughout their lifecycle."""
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")

output = model.generate(**input_ids, cache_implementation="static")
print(tokenizer.decode(output[0], skip_special_tokens=True))
echo -e "Plants are among the most remarkable and essential life forms on Earth, possessing a unique ability to produce their own food through a process known as photosynthesis. This complex biochemical process is fundamental not only to plant life but to virtually all life on the planet. Through photosynthesis, plants capture energy from sunlight using a green pigment called chlorophyll, which is located in specialized cell structures called chloroplasts." | transformers-cli run --task summarization --model google/bigbird-pegasus-large-arxiv --device 0

Quantization reduces the memory burden of large models by representing the weights in a lower precision. Refer to the Quantization overview for more available quantization backends.

The example below uses bitsandbytes to only quantize the weights to int4.

import torch
from transformers import BitsAndBytesConfig, AutoModelForSeq2SeqLM, AutoTokenizer

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_type="nf4"
)
model = AutoModelForSeq2SeqLM.from_pretrained(
    "google/bigbird-pegasus-large-arxiv",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    quantization_config=quantization_config
)

tokenizer = AutoTokenizer.from_pretrained(
    "google/bigbird-pegasus-large-arxiv"
)

input_text = """Plants are among the most remarkable and essential life forms on Earth, possessing a unique ability to produce their own food through a process known as photosynthesis. This complex biochemical process is fundamental not only to plant life but to virtually all life on the planet.
Through photosynthesis, plants capture energy from sunlight using a green pigment called chlorophyll, which is located in specialized cell structures called chloroplasts. In the presence of light, plants absorb carbon dioxide from the atmosphere through small pores in their leaves called stomata, and take in water from the soil through their root systems.
These ingredients are then transformed into glucose, a type of sugar that serves as a source of chemical energy, and oxygen, which is released as a byproduct into the atmosphere. The glucose produced during photosynthesis is not just used immediately; plants also store it as starch or convert it into other organic compounds like cellulose, which is essential for building their cellular structure.
This energy reserve allows them to grow, develop leaves, produce flowers, bear fruit, and carry out various physiological processes throughout their lifecycle."""
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")

output = model.generate(**input_ids, cache_implementation="static")
print(tokenizer.decode(output[0], skip_special_tokens=True))

Notes

  • BigBirdPegasus also uses the [PegasusTokenizer].
  • Inputs should be padded on the right because BigBird uses absolute position embeddings.
  • BigBirdPegasus supports original_full and block_sparse attention. If the input sequence length is less than 1024, it is recommended to use original_full since sparse patterns don't offer much benefit for smaller inputs.
  • The current implementation uses window size of 3 blocks and 2 global blocks, only supports the ITC-implementation, and doesn't support num_random_blocks=0.
  • The sequence length must be divisible by the block size.

Resources

Read the Understanding BigBird's Block Sparse Attention blog post for more details about how BigBird's attention works.

BigBirdPegasusConfig

autodoc BigBirdPegasusConfig - all

BigBirdPegasusModel

autodoc BigBirdPegasusModel - forward

BigBirdPegasusForConditionalGeneration

autodoc BigBirdPegasusForConditionalGeneration - forward

BigBirdPegasusForSequenceClassification

autodoc BigBirdPegasusForSequenceClassification - forward

BigBirdPegasusForQuestionAnswering

autodoc BigBirdPegasusForQuestionAnswering - forward

BigBirdPegasusForCausalLM

autodoc BigBirdPegasusForCausalLM - forward