[MINOR:TYPO] Update hubert.md (#36733)

* [MINOR:TYPO] Update hubert.md

- typo fix (wav2vec instead of hubert)
- make code snippet copiable and runnable

* Run tests
Christopher Akiki 2025-03-17 17:07:51 +01:00 committed by GitHub
parent c8a2b25f91
commit e3af4fec91


@@ -71,9 +71,10 @@ pip install -U flash-attn --no-build-isolation
 Below is an expected speedup diagram comparing the pure inference time between the native implementation in transformers of `facebook/hubert-large-ls960-ft`, the flash-attention-2 and the sdpa (scale-dot-product-attention) version. We show the average speedup obtained on the `librispeech_asr` `clean` validation split:
 ```python
->>> from transformers import Wav2Vec2Model
+>>> from transformers import HubertModel
+>>> import torch
-model = Wav2Vec2Model.from_pretrained("facebook/hubert-large-ls960-ft", torch_dtype=torch.float16, attn_implementation="flash_attention_2").to(device)
+>>> model = HubertModel.from_pretrained("facebook/hubert-large-ls960-ft", torch_dtype=torch.float16, attn_implementation="flash_attention_2").to("cuda")
 ...
 ```
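For reference, a self-contained version of the corrected snippet is sketched below. This is a minimal sketch, not the documented example verbatim: it assumes a CUDA GPU with flash-attn installed (substitute `attn_implementation="sdpa"` if it is not), and the one-second silent waveform merely stands in for real 16 kHz speech.

```python
# Runnable sketch of the corrected snippet (assumes a CUDA GPU; flash-attn
# must be installed for attn_implementation="flash_attention_2", otherwise
# substitute "sdpa").
import torch
from transformers import AutoFeatureExtractor, HubertModel

feature_extractor = AutoFeatureExtractor.from_pretrained("facebook/hubert-large-ls960-ft")
model = HubertModel.from_pretrained(
    "facebook/hubert-large-ls960-ft",
    torch_dtype=torch.float16,
    attn_implementation="flash_attention_2",
).to("cuda")

# One second of 16 kHz silence stands in for real speech input.
inputs = feature_extractor([0.0] * 16000, sampling_rate=16000, return_tensors="pt")
input_values = inputs.input_values.to("cuda", torch.float16)

with torch.no_grad():
    hidden_states = model(input_values).last_hidden_state
print(hidden_states.shape)  # e.g. torch.Size([1, 49, 1024])
```

Loading in `torch.float16` matters here: FlashAttention-2 kernels only support half-precision inputs (fp16/bf16), so the dtype argument is required for the flash-attention path, not just a memory optimization.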