PyTorch TensorFlow Flax SDPA

RoBERTa

RoBERTa improves on BERT by retraining it with a better pretraining recipe, demonstrating that BERT was undertrained and that training design is as important as architecture. The key changes include dynamic masking, packing full sentences without the next-sentence prediction objective, larger batches, and a byte-level BPE tokenizer.
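
Dynamic masking means the masked positions are re-sampled every time a sequence is seen, rather than fixed once during preprocessing. The snippet below is only an illustrative sketch of that idea using DataCollatorForLanguageModeling, not the original pretraining code.

```py
from transformers import AutoTokenizer, DataCollatorForLanguageModeling

tokenizer = AutoTokenizer.from_pretrained("FacebookAI/roberta-base")
# the collator re-samples which tokens to mask every time it builds a batch
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=True, mlm_probability=0.15)

encoding = tokenizer("Plants create energy through a process known as photosynthesis.")
batch = collator([encoding])                     # call again and different positions are masked
print(tokenizer.decode(batch["input_ids"][0]))
print(batch["labels"][0])                        # -100 everywhere except the masked positions
```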

You can find all the original RoBERTa checkpoints under the Facebook AI organization.

Tip

Click on the RoBERTa models in the right sidebar for more examples of how to apply RoBERTa to different language tasks.

The example below demonstrates how to predict the <mask> token with [Pipeline], [AutoModel], and from the command line.

Pipeline

```py
import torch
from transformers import pipeline

pipeline = pipeline(
    task="fill-mask",
    model="FacebookAI/roberta-base",
    torch_dtype=torch.float16,
    device=0
)
pipeline("Plants create <mask> through a process known as photosynthesis.")
```

AutoModel

```py
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "FacebookAI/roberta-base",
)
model = AutoModelForMaskedLM.from_pretrained(
    "FacebookAI/roberta-base",
    torch_dtype=torch.float16,
    device_map="auto",
    attn_implementation="sdpa"
)
inputs = tokenizer("Plants create <mask> through a process known as photosynthesis.", return_tensors="pt").to("cuda")

with torch.no_grad():
    outputs = model(**inputs)
    predictions = outputs.logits

# find the <mask> position and decode the highest-scoring token
masked_index = torch.where(inputs['input_ids'] == tokenizer.mask_token_id)[1]
predicted_token_id = predictions[0, masked_index].argmax(dim=-1)
predicted_token = tokenizer.decode(predicted_token_id)

print(f"The predicted token is: {predicted_token}")
```

transformers-cli

```bash
echo -e "Plants create <mask> through a process known as photosynthesis." | transformers-cli run --task fill-mask --model FacebookAI/roberta-base --device 0
```

Notes

  • RoBERTa doesn't have token_type_ids, so you don't need to indicate which token belongs to which segment. Separate your segments with the separation token tokenizer.sep_token or </s>, as shown in the sketch below.
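
A minimal sketch of the behavior described above: passing a sentence pair to the tokenizer inserts the separator for you and returns no token_type_ids (the sentences here are only for illustration).

```py
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("FacebookAI/roberta-base")

# passing two segments inserts the separator tokens automatically
encoding = tokenizer("RoBERTa has no segment embeddings.", "Segments are joined with separator tokens.")
print(encoding.keys())                          # no token_type_ids in the output
print(tokenizer.decode(encoding["input_ids"]))  # <s> ... </s></s> ... </s>
```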

RobertaConfig

autodoc RobertaConfig

RobertaTokenizer

autodoc RobertaTokenizer - build_inputs_with_special_tokens - get_special_tokens_mask - create_token_type_ids_from_sequences - save_vocabulary

RobertaTokenizerFast

autodoc RobertaTokenizerFast - build_inputs_with_special_tokens

RobertaModel

autodoc RobertaModel - forward

RobertaForCausalLM

autodoc RobertaForCausalLM - forward

RobertaForMaskedLM

autodoc RobertaForMaskedLM - forward

RobertaForSequenceClassification

autodoc RobertaForSequenceClassification - forward

RobertaForMultipleChoice

autodoc RobertaForMultipleChoice - forward

RobertaForTokenClassification

autodoc RobertaForTokenClassification - forward

RobertaForQuestionAnswering

autodoc RobertaForQuestionAnswering - forward

TFRobertaModel

autodoc TFRobertaModel - call

TFRobertaForCausalLM

autodoc TFRobertaForCausalLM - call

TFRobertaForMaskedLM

autodoc TFRobertaForMaskedLM - call

TFRobertaForSequenceClassification

autodoc TFRobertaForSequenceClassification - call

TFRobertaForMultipleChoice

autodoc TFRobertaForMultipleChoice - call

TFRobertaForTokenClassification

autodoc TFRobertaForTokenClassification - call

TFRobertaForQuestionAnswering

autodoc TFRobertaForQuestionAnswering - call

FlaxRobertaModel

autodoc FlaxRobertaModel - call

FlaxRobertaForCausalLM

autodoc FlaxRobertaForCausalLM - call

FlaxRobertaForMaskedLM

autodoc FlaxRobertaForMaskedLM - call

FlaxRobertaForSequenceClassification

autodoc FlaxRobertaForSequenceClassification - call

FlaxRobertaForMultipleChoice

autodoc FlaxRobertaForMultipleChoice - call

FlaxRobertaForTokenClassification

autodoc FlaxRobertaForTokenClassification - call

FlaxRobertaForQuestionAnswering

autodoc FlaxRobertaForQuestionAnswering - call