
# BART
BART is a sequence-to-sequence model that combines the pretraining objectives from BERT and GPT. It’s pretrained by corrupting text in different ways like deleting words, shuffling sentences, or masking tokens and learning how to fix it. The encoder encodes the corrupted document and the corrupted text is fixed by the decoder. As it learns to recover the original text, BART gets really good at both understanding and generating language.
You can find all the original BART checkpoints under the AI at Meta organization.
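To make the denoising objective concrete, the sketch below shows a single, illustrative text-infilling step (this is not the original pretraining code): a span of the input is replaced with a `<mask>` token and [`BartForConditionalGeneration`] learns to reconstruct the uncorrupted sentence. The example sentence and the masked span are arbitrary choices for illustration.

```py
from transformers import AutoTokenizer, BartForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large")

# corrupted input: a span is replaced with a single <mask> token (text infilling)
corrupted = "Plants create <mask> a process known as photosynthesis."
# target: the original, uncorrupted sentence
original = "Plants create energy through a process known as photosynthesis."

inputs = tokenizer(corrupted, return_tensors="pt")
labels = tokenizer(original, return_tensors="pt").input_ids

# the decoder inputs are derived from the labels internally; the loss measures how
# well the model reconstructs the original text from the corrupted input
outputs = model(**inputs, labels=labels)
print(outputs.loss)
```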
The example below demonstrates how to predict the `[MASK]` token with [`Pipeline`], [`AutoModel`], and from the command line.
```py
import torch
from transformers import pipeline

pipeline = pipeline(
    task="fill-mask",
    model="facebook/bart-large",
    torch_dtype=torch.float16,
    device=0
)
pipeline("Plants create <mask> through a process known as photosynthesis.")
```
```py
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "facebook/bart-large",
)
model = AutoModelForMaskedLM.from_pretrained(
    "facebook/bart-large",
    torch_dtype=torch.float16,
    device_map="auto",
    attn_implementation="sdpa"
)
inputs = tokenizer("Plants create <mask> through a process known as photosynthesis.", return_tensors="pt").to("cuda")

with torch.no_grad():
    outputs = model(**inputs)
    predictions = outputs.logits

masked_index = torch.where(inputs["input_ids"] == tokenizer.mask_token_id)[1]
predicted_token_id = predictions[0, masked_index].argmax(dim=-1)
predicted_token = tokenizer.decode(predicted_token_id)

print(f"The predicted token is: {predicted_token}")
```
```bash
echo -e "Plants create <mask> through a process known as photosynthesis." | transformers-cli run --task fill-mask --model facebook/bart-large --device 0
```
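Building on the [`AutoModel`] example above, you can also inspect the most likely candidates for the masked position rather than only the single best token. This optional sketch reuses the `predictions`, `masked_index`, and `tokenizer` variables from that example; the choice of `k=5` is arbitrary.

```py
# reuses predictions, masked_index, and tokenizer from the AutoModel example above
top_k = torch.topk(predictions[0, masked_index], k=5, dim=-1)
for token_id in top_k.indices[0]:
    print(tokenizer.decode(token_id))
```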
## Notes
- Inputs should be padded on the right because BART uses absolute position embeddings.
- The `facebook/bart-large-cnn` checkpoint doesn't include `mask_token_id`, which means it can't perform mask-filling tasks.
- BART doesn't use `token_type_ids` for sequence classification. Use [`BartTokenizer`] or [`~PreTrainedTokenizerBase.encode`] to get the proper splitting.
- The forward pass of [`BartModel`] creates the `decoder_input_ids` if they're not passed. This can be different from other model APIs, but it is a useful feature for mask-filling tasks.
- Model predictions are intended to be identical to the original implementation when `forced_bos_token_id=0`. This only works if the text passed to `fairseq.encode` begins with a space.
- [`~GenerationMixin.generate`] should be used for conditional generation tasks like summarization (see the sketch after this list).
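As a rough illustration of the last two notes, the sketch below runs summarization with [`~GenerationMixin.generate`] on the `facebook/bart-large-cnn` checkpoint. The input text and generation parameters are arbitrary choices for illustration, not recommended settings.

```py
from transformers import AutoTokenizer, BartForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")

article = (
    "Photosynthesis is the process by which green plants use sunlight to synthesize "
    "nutrients from carbon dioxide and water. It generally involves the green pigment "
    "chlorophyll and generates oxygen as a byproduct."
)
inputs = tokenizer(article, return_tensors="pt")

# generate() builds the decoder inputs internally; forced_bos_token_id=0 corresponds
# to the behavior described in the notes above
summary_ids = model.generate(**inputs, max_new_tokens=60, num_beams=4, forced_bos_token_id=0)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```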
## BartConfig

[[autodoc]] BartConfig - all

## BartTokenizer

[[autodoc]] BartTokenizer - all

## BartTokenizerFast

[[autodoc]] BartTokenizerFast - all

## BartModel

[[autodoc]] BartModel - forward

## BartForConditionalGeneration

[[autodoc]] BartForConditionalGeneration - forward

## BartForSequenceClassification

[[autodoc]] BartForSequenceClassification - forward

## BartForQuestionAnswering

[[autodoc]] BartForQuestionAnswering - forward

## BartForCausalLM

[[autodoc]] BartForCausalLM - forward

## TFBartModel

[[autodoc]] TFBartModel - call

## TFBartForConditionalGeneration

[[autodoc]] TFBartForConditionalGeneration - call

## TFBartForSequenceClassification

[[autodoc]] TFBartForSequenceClassification - call

## FlaxBartModel

[[autodoc]] FlaxBartModel - __call__ - encode - decode

## FlaxBartForConditionalGeneration

[[autodoc]] FlaxBartForConditionalGeneration - __call__ - encode - decode

## FlaxBartForSequenceClassification

[[autodoc]] FlaxBartForSequenceClassification - __call__ - encode - decode

## FlaxBartForQuestionAnswering

[[autodoc]] FlaxBartForQuestionAnswering - __call__ - encode - decode

## FlaxBartForCausalLM

[[autodoc]] FlaxBartForCausalLM - __call__