[MINOR:TYPO] Update hubert.md (#36733)
* [MINOR:TYPO] Update hubert.md
  - typo fix (wave2vec instead of hubert)
  - make code snippet copiable and runnable
* Run tests
This commit is contained in: parent c8a2b25f91 · commit e3af4fec91
````diff
@@ -71,9 +71,10 @@ pip install -U flash-attn --no-build-isolation
 
 Below is an expected speedup diagram comparing the pure inference time between the native implementation in transformers of `facebook/hubert-large-ls960-ft`, the flash-attention-2 and the sdpa (scale-dot-product-attention) version. We show the average speedup obtained on the `librispeech_asr` `clean` validation split:
 
 ```python
->>> from transformers import Wav2Vec2Model
+>>> from transformers import HubertModel
+>>> import torch
 
-model = Wav2Vec2Model.from_pretrained("facebook/hubert-large-ls960-ft", torch_dtype=torch.float16, attn_implementation="flash_attention_2").to(device)
+>>> model = HubertModel.from_pretrained("facebook/hubert-large-ls960-ft", torch_dtype=torch.float16, attn_implementation="flash_attention_2").to("cuda")
 ...
 ```
````
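For reference, here is a minimal, self-contained sketch of the kind of comparison the paragraph above describes: loading the same `facebook/hubert-large-ls960-ft` checkpoint with the `flash_attention_2` and `sdpa` backends and timing one forward pass. It assumes a CUDA GPU and an installed `flash-attn` package, and it uses a random dummy waveform rather than the `librispeech_asr` `clean` split, so the numbers are illustrative only and not the benchmark behind the speedup diagram.

```python
import torch
from transformers import HubertModel

device = "cuda"  # assumes a CUDA GPU is available

# Dummy raw waveform: batch of 1, ~1 second of 16 kHz audio (illustrative only,
# not the librispeech_asr clean split used for the official speedup diagram).
dummy_input = torch.randn(1, 16000, dtype=torch.float16, device=device)

for impl in ("flash_attention_2", "sdpa"):  # flash_attention_2 requires flash-attn
    model = HubertModel.from_pretrained(
        "facebook/hubert-large-ls960-ft",
        torch_dtype=torch.float16,
        attn_implementation=impl,
    ).to(device)
    model.eval()

    # Time a single forward pass with CUDA events.
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    with torch.no_grad():
        start.record()
        _ = model(dummy_input)
        end.record()
    torch.cuda.synchronize()
    print(f"{impl}: {start.elapsed_time(end):.1f} ms")
```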