PyTorch TensorFlow Flax FlashAttention SDPA

# BART

[BART](https://huggingface.co/papers/1910.13461) is a sequence-to-sequence model that combines the pretraining objectives from BERT and GPT. It's pretrained by corrupting text in different ways, like deleting words, shuffling sentences, or masking tokens, and then learning to undo the corruption. The encoder encodes the corrupted document, and the decoder reconstructs the original text. As it learns to recover the original text, BART gets really good at both understanding and generating language.
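
The corruption schemes are easier to picture in code. The sketch below shows toy versions of three of the noising transforms described above (token deletion, sentence permutation, and text infilling). The helper names and the naive whitespace splitting are illustrative only; this is not the actual pretraining pipeline.

```py
import random

# Toy sketches of BART's noising transforms, for intuition only.

def delete_tokens(text, p=0.15):
    # token deletion: randomly drop a fraction of the tokens
    tokens = text.split()
    kept = [t for t in tokens if random.random() > p]
    return " ".join(kept)

def shuffle_sentences(text):
    # sentence permutation: shuffle the order of the sentences
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    random.shuffle(sentences)
    return ". ".join(sentences) + "."

def mask_span(text, span_len=2):
    # text infilling: replace a contiguous span with a single <mask> token
    tokens = text.split()
    start = random.randrange(max(1, len(tokens) - span_len))
    return " ".join(tokens[:start] + ["<mask>"] + tokens[start + span_len:])

corrupted = mask_span(shuffle_sentences("Plants absorb light. They create energy."))
print(corrupted)  # BART is trained to reconstruct the original text from this
```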

You can find all the original BART checkpoints under the [AI at Meta](https://huggingface.co/facebook) organization.

The example below demonstrates how to predict the `<mask>` token with [`Pipeline`], [`AutoModel`], and from the command line.

<hfoptions id="usage">
<hfoption id="Pipeline">

```py
import torch
from transformers import pipeline

pipeline = pipeline(
    task="fill-mask",
    model="facebook/bart-large",
    torch_dtype=torch.float16,
    device=0
)
pipeline("Plants create <mask> through a process known as photosynthesis.")
```

</hfoption>
<hfoption id="AutoModel">

```py
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "facebook/bart-large",
)
model = AutoModelForMaskedLM.from_pretrained(
    "facebook/bart-large",
    torch_dtype=torch.float16,
    device_map="auto",
    attn_implementation="sdpa"
)
inputs = tokenizer("Plants create <mask> through a process known as photosynthesis.", return_tensors="pt").to("cuda")

with torch.no_grad():
    outputs = model(**inputs)
    predictions = outputs.logits

# locate the <mask> position and take the highest-scoring token
masked_index = torch.where(inputs["input_ids"] == tokenizer.mask_token_id)[1]
predicted_token_id = predictions[0, masked_index].argmax(dim=-1)
predicted_token = tokenizer.decode(predicted_token_id)

print(f"The predicted token is: {predicted_token}")
```

</hfoption>
<hfoption id="transformers-cli">

```bash
echo -e "Plants create <mask> through a process known as photosynthesis." | transformers-cli run --task fill-mask --model facebook/bart-large --device 0
```

</hfoption>
</hfoptions>

## Notes

- Inputs should be padded on the right because BART uses absolute position embeddings.
- The [facebook/bart-large-cnn](https://huggingface.co/facebook/bart-large-cnn) checkpoint doesn't include `mask_token_id`, which means it can't perform mask-filling tasks.
- BART doesn't use `token_type_ids` for sequence classification. Use [`BartTokenizer`] or [`~PreTrainedTokenizerBase.encode`] to get the proper splitting.
- The forward pass of [`BartModel`] creates the `decoder_input_ids` if they're not passed. This can be different from other model APIs, but it is a useful feature for mask-filling tasks (see the first example after this list).
- Model predictions are intended to be identical to the original implementation when `forced_bos_token_id=0`. This only works if the text passed to `fairseq.encode` begins with a space.
- [`~GenerationMixin.generate`] should be used for conditional generation tasks like summarization (see the second example after this list).
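
A minimal sketch of the `decoder_input_ids` behavior noted above: [`BartModel`] runs with only `input_ids`, building the decoder inputs internally by shifting the input to the right.

```py
import torch
from transformers import AutoTokenizer, BartModel

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large")
model = BartModel.from_pretrained("facebook/bart-large")

inputs = tokenizer("Plants create oxygen through photosynthesis.", return_tensors="pt")

# decoder_input_ids are not passed; the forward pass creates them internally
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.last_hidden_state.shape)
```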
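
And a short summarization example using [`~GenerationMixin.generate`] with the [facebook/bart-large-cnn](https://huggingface.co/facebook/bart-large-cnn) checkpoint; the input text and the length and beam settings here are arbitrary choices for illustration.

```py
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large-cnn")

text = (
    "Plants are among the most remarkable organisms on Earth. Through "
    "photosynthesis they convert sunlight, water, and carbon dioxide "
    "into oxygen and energy-rich sugars that sustain most life."
)
inputs = tokenizer(text, return_tensors="pt")

# generate() runs the decoder loop for conditional generation
summary_ids = model.generate(**inputs, num_beams=4, min_length=20, max_length=60)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```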

## BartConfig

[[autodoc]] BartConfig
    - all

## BartTokenizer

[[autodoc]] BartTokenizer
    - all

## BartTokenizerFast

[[autodoc]] BartTokenizerFast
    - all

## BartModel

[[autodoc]] BartModel
    - forward

## BartForConditionalGeneration

[[autodoc]] BartForConditionalGeneration
    - forward

## BartForSequenceClassification

[[autodoc]] BartForSequenceClassification
    - forward

## BartForQuestionAnswering

[[autodoc]] BartForQuestionAnswering
    - forward

## BartForCausalLM

[[autodoc]] BartForCausalLM
    - forward

## TFBartModel

[[autodoc]] TFBartModel
    - call

## TFBartForConditionalGeneration

[[autodoc]] TFBartForConditionalGeneration
    - call

## TFBartForSequenceClassification

[[autodoc]] TFBartForSequenceClassification
    - call

## FlaxBartModel

[[autodoc]] FlaxBartModel
    - __call__
    - encode
    - decode

## FlaxBartForConditionalGeneration

[[autodoc]] FlaxBartForConditionalGeneration
    - __call__
    - encode
    - decode

## FlaxBartForSequenceClassification

[[autodoc]] FlaxBartForSequenceClassification
    - __call__
    - encode
    - decode

## FlaxBartForQuestionAnswering

[[autodoc]] FlaxBartForQuestionAnswering
    - __call__
    - encode
    - decode

## FlaxBartForCausalLM

[[autodoc]] FlaxBartForCausalLM
    - __call__