From e3af4fec91bdac3381e07f6e60fb8b74c3957c11 Mon Sep 17 00:00:00 2001
From: Christopher Akiki
Date: Mon, 17 Mar 2025 17:07:51 +0100
Subject: [PATCH] [MINOR:TYPO] Update hubert.md (#36733)

* [MINOR:TYPO] Update hubert.md

- typo fix (wave2vec instead of hubert)
- make code snippet copiable and runnable

* Run tests
---
 docs/source/en/model_doc/hubert.md | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/docs/source/en/model_doc/hubert.md b/docs/source/en/model_doc/hubert.md
index 432e127c786..67e7d78beb6 100644
--- a/docs/source/en/model_doc/hubert.md
+++ b/docs/source/en/model_doc/hubert.md
@@ -71,9 +71,10 @@ pip install -U flash-attn --no-build-isolation
 
 Below is an expected speedup diagram comparing the pure inference time between the native implementation in transformers of `facebook/hubert-large-ls960-ft`, the flash-attention-2 and the sdpa (scale-dot-product-attention) version. We show the average speedup obtained on the `librispeech_asr` `clean` validation split:
 
 ```python
->>> from transformers import Wav2Vec2Model
+>>> from transformers import HubertModel
+>>> import torch
 
-model = Wav2Vec2Model.from_pretrained("facebook/hubert-large-ls960-ft", torch_dtype=torch.float16, attn_implementation="flash_attention_2").to(device)
+>>> model = HubertModel.from_pretrained("facebook/hubert-large-ls960-ft", torch_dtype=torch.float16, attn_implementation="flash_attention_2").to("cuda")
 ...
 ```
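
For reference, below is a minimal sketch (not part of the patch) of how the corrected snippet could be exercised end to end. It assumes a CUDA GPU with flash-attn installed and the `datasets` library available; the feature-extractor usage and the small LibriSpeech sample split are illustrative choices, not taken from the commit.

```python
# Sketch only: assumes a CUDA GPU, flash-attn, and `datasets` are installed.
# The dummy LibriSpeech split and feature-extractor handling are illustrative.
import torch
from datasets import load_dataset
from transformers import AutoFeatureExtractor, HubertModel

device = "cuda"

# Corrected class from the patch: HubertModel instead of Wav2Vec2Model,
# loaded in fp16 with the FlashAttention-2 kernels.
model = HubertModel.from_pretrained(
    "facebook/hubert-large-ls960-ft",
    torch_dtype=torch.float16,
    attn_implementation="flash_attention_2",
).to(device)

# One short 16 kHz clip from a small LibriSpeech sample set.
dataset = load_dataset(
    "hf-internal-testing/librispeech_asr_dummy", "clean", split="validation"
)
feature_extractor = AutoFeatureExtractor.from_pretrained("facebook/hubert-large-ls960-ft")
inputs = feature_extractor(
    dataset[0]["audio"]["array"], sampling_rate=16_000, return_tensors="pt"
)
input_values = inputs.input_values.to(device, dtype=torch.float16)

# Forward pass; swap attn_implementation="sdpa" above to compare against the
# scaled-dot-product-attention path mentioned in the docs.
with torch.no_grad():
    hidden_states = model(input_values).last_hidden_state

print(hidden_states.shape)  # (batch, num_frames, hidden_size)
```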