PyTorch TensorFlow Flax SDPA
# BERT [BERT](https://huggingface.co/papers/1810.04805) 是一个在无标签的文本数据上预训练的双向 transformer,用于预测句子中被掩码的(masked) token,以及预测一个句子是否跟随在另一个句子之后。其主要思想是,在预训练过程中,通过随机掩码一些 token,让模型利用左右上下文的信息预测它们,从而获得更全面深入的理解。此外,BERT 具有很强的通用性,其学习到的语言表示可以通过额外的层或头进行微调,从而适配其他下游 NLP 任务。 你可以在 [BERT](https://huggingface.co/collections/google/bert-release-64ff5e7a4be99045d1896dbc) 集合下找到 BERT 的所有原始 checkpoint。 > [!TIP] > 点击右侧边栏中的 BERT 模型,以查看将 BERT 应用于不同语言任务的更多示例。 下面的示例演示了如何使用 [`Pipeline`], [`AutoModel`] 和命令行预测 `[MASK]` token。 ```py import torch from transformers import pipeline pipeline = pipeline( task="fill-mask", model="google-bert/bert-base-uncased", torch_dtype=torch.float16, device=0 ) pipeline("Plants create [MASK] through a process known as photosynthesis.") ``` ```py import torch from transformers import AutoModelForMaskedLM, AutoTokenizer tokenizer = AutoTokenizer.from_pretrained( "google-bert/bert-base-uncased", ) model = AutoModelForMaskedLM.from_pretrained( "google-bert/bert-base-uncased", torch_dtype=torch.float16, device_map="auto", attn_implementation="sdpa" ) inputs = tokenizer("Plants create [MASK] through a process known as photosynthesis.", return_tensors="pt").to("cuda") with torch.no_grad(): outputs = model(**inputs) predictions = outputs.logits masked_index = torch.where(inputs['input_ids'] == tokenizer.mask_token_id)[1] predicted_token_id = predictions[0, masked_index].argmax(dim=-1) predicted_token = tokenizer.decode(predicted_token_id) print(f"The predicted token is: {predicted_token}") ``` ```bash echo -e "Plants create [MASK] through a process known as photosynthesis." | transformers-cli run --task fill-mask --model google-bert/bert-base-uncased --device 0 ``` ## 注意 - 输入内容应在右侧进行填充,因为 BERT 使用绝对位置嵌入。 ## BertConfig [[autodoc]] BertConfig - all ## BertTokenizer [[autodoc]] BertTokenizer - build_inputs_with_special_tokens - get_special_tokens_mask - create_token_type_ids_from_sequences - save_vocabulary ## BertTokenizerFast [[autodoc]] BertTokenizerFast ## BertModel [[autodoc]] BertModel - forward ## BertForPreTraining [[autodoc]] BertForPreTraining - forward ## BertLMHeadModel [[autodoc]] BertLMHeadModel - forward ## BertForMaskedLM [[autodoc]] BertForMaskedLM - forward ## BertForNextSentencePrediction [[autodoc]] BertForNextSentencePrediction - forward ## BertForSequenceClassification [[autodoc]] BertForSequenceClassification - forward ## BertForMultipleChoice [[autodoc]] BertForMultipleChoice - forward ## BertForTokenClassification [[autodoc]] BertForTokenClassification - forward ## BertForQuestionAnswering [[autodoc]] BertForQuestionAnswering - forward ## TFBertTokenizer [[autodoc]] TFBertTokenizer ## TFBertModel [[autodoc]] TFBertModel - call ## TFBertForPreTraining [[autodoc]] TFBertForPreTraining - call ## TFBertModelLMHeadModel [[autodoc]] TFBertLMHeadModel - call ## TFBertForMaskedLM [[autodoc]] TFBertForMaskedLM - call ## TFBertForNextSentencePrediction [[autodoc]] TFBertForNextSentencePrediction - call ## TFBertForSequenceClassification [[autodoc]] TFBertForSequenceClassification - call ## TFBertForMultipleChoice [[autodoc]] TFBertForMultipleChoice - call ## TFBertForTokenClassification [[autodoc]] TFBertForTokenClassification - call ## TFBertForQuestionAnswering [[autodoc]] TFBertForQuestionAnswering - call ## FlaxBertModel [[autodoc]] FlaxBertModel - __call__ ## FlaxBertForPreTraining [[autodoc]] FlaxBertForPreTraining - __call__ ## FlaxBertForCausalLM [[autodoc]] FlaxBertForCausalLM - __call__ ## FlaxBertForMaskedLM [[autodoc]] FlaxBertForMaskedLM - __call__ ## FlaxBertForNextSentencePrediction [[autodoc]] FlaxBertForNextSentencePrediction - __call__ ## FlaxBertForSequenceClassification [[autodoc]] FlaxBertForSequenceClassification - __call__ ## FlaxBertForMultipleChoice [[autodoc]] FlaxBertForMultipleChoice - __call__ ## FlaxBertForTokenClassification [[autodoc]] FlaxBertForTokenClassification - __call__ ## FlaxBertForQuestionAnswering [[autodoc]] FlaxBertForQuestionAnswering - __call__ ## Bert specific outputs [[autodoc]] models.bert.modeling_bert.BertForPreTrainingOutput [[autodoc]] models.bert.modeling_tf_bert.TFBertForPreTrainingOutput [[autodoc]] models.bert.modeling_flax_bert.FlaxBertForPreTrainingOutput