From e3af4fec91bdac3381e07f6e60fb8b74c3957c11 Mon Sep 17 00:00:00 2001
From: Christopher Akiki
Date: Mon, 17 Mar 2025 17:07:51 +0100
Subject: [PATCH] [MINOR:TYPO] Update hubert.md (#36733)

* [MINOR:TYPO] Update hubert.md

- typo fix (wave2vec instead of hubert)
- make code snippet copiable and runnable

* Run tests
---
 docs/source/en/model_doc/hubert.md | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/docs/source/en/model_doc/hubert.md b/docs/source/en/model_doc/hubert.md
index 432e127c786..67e7d78beb6 100644
--- a/docs/source/en/model_doc/hubert.md
+++ b/docs/source/en/model_doc/hubert.md
@@ -71,9 +71,10 @@ pip install -U flash-attn --no-build-isolation
 
 Below is an expected speedup diagram comparing the pure inference time between the native implementation in transformers of `facebook/hubert-large-ls960-ft`, the flash-attention-2 and the sdpa (scale-dot-product-attention) version. We show the average speedup obtained on the `librispeech_asr` `clean` validation split:
 
 ```python
->>> from transformers import Wav2Vec2Model
+>>> from transformers import HubertModel
+>>> import torch
 
-model = Wav2Vec2Model.from_pretrained("facebook/hubert-large-ls960-ft", torch_dtype=torch.float16, attn_implementation="flash_attention_2").to(device)
+>>> model = HubertModel.from_pretrained("facebook/hubert-large-ls960-ft", torch_dtype=torch.float16, attn_implementation="flash_attention_2").to("cuda")
 ...
 ```
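
For reference, below is a minimal sketch (not part of the patch) of how the corrected snippet could be exercised end to end. It assumes a CUDA GPU with flash-attn installed and the `datasets` library available; the feature-extractor usage and the small LibriSpeech sample split are illustrative choices, not taken from the commit.

```python
# Sketch only: assumes a CUDA GPU, flash-attn, and `datasets` are installed.
# The dummy LibriSpeech split and feature-extractor handling are illustrative.
import torch
from datasets import load_dataset
from transformers import AutoFeatureExtractor, HubertModel

device = "cuda"

# Corrected class from the patch: HubertModel instead of Wav2Vec2Model,
# loaded in fp16 with the FlashAttention-2 kernels.
model = HubertModel.from_pretrained(
    "facebook/hubert-large-ls960-ft",
    torch_dtype=torch.float16,
    attn_implementation="flash_attention_2",
).to(device)

# One short 16 kHz clip from a small LibriSpeech sample set.
dataset = load_dataset(
    "hf-internal-testing/librispeech_asr_dummy", "clean", split="validation"
)
feature_extractor = AutoFeatureExtractor.from_pretrained("facebook/hubert-large-ls960-ft")
inputs = feature_extractor(
    dataset[0]["audio"]["array"], sampling_rate=16_000, return_tensors="pt"
)
input_values = inputs.input_values.to(device, dtype=torch.float16)

# Forward pass; swap attn_implementation="sdpa" above to compare against the
# scaled-dot-product-attention path mentioned in the docs.
with torch.no_grad():
    hidden_states = model(input_values).last_hidden_state

print(hidden_states.shape)  # (batch, num_frames, hidden_size)
```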