[docs] Typos - Single GPU efficient training features (#38964)

* Typos

- corrected bf16 training argument
- corrected header for SDPA

* improved readability for SDPA, as suggested by @stevhliu

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Author: casinca, committed 2025-06-23 21:33:10 +02:00 via GitHub
Parent: f9be71b34d
Commit: 21cb353b7b


@@ -31,7 +31,7 @@ Refer to the table below to quickly help you identify the features relevant to your training scenario.
| data preloading | yes | no |
| torch_empty_cache_steps | no | yes |
| torch.compile | yes | no |
| PEFT | no | yes |
-| scaled dot production attention (SDPA) | yes | yes |
+| scaled dot product attention (SDPA) | yes | yes |

## Trainer
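
The SDPA row refers to PyTorch's `torch.nn.functional.scaled_dot_product_attention`. As a minimal sketch of what the table row means in practice (the checkpoint name is only illustrative), it can be requested when loading a model:

```py
from transformers import AutoModelForCausalLM

# Minimal sketch: ask Transformers to use PyTorch's scaled dot product
# attention kernels. "gpt2" is only an illustrative checkpoint; any
# SDPA-capable model works.
model = AutoModelForCausalLM.from_pretrained("gpt2", attn_implementation="sdpa")
```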
@@ -128,7 +128,7 @@ fp16 isn't memory-optimized because the gradients that are computed in fp16 are
[bf16](https://cloud.google.com/blog/products/ai-machine-learning/bfloat16-the-secret-to-high-performance-on-cloud-tpus) trades off some precision for a much larger dynamic range, which is helpful for avoiding overflow and underflow errors. You can use bf16 without adding any loss scaling methods like you would with fp16. bf16 is supported by NVIDIA's Ampere architecture or newer.

-Configure [`~TrainingArguments.fp16`] in [`TrainingArguments`] to enable mixed precision training with the bf16 data type.
+Configure [`~TrainingArguments.bf16`] in [`TrainingArguments`] to enable mixed precision training with the bf16 data type.
```py
from transformers import TrainingArguments
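
# Minimal sketch completing the truncated example above
# (argument values are illustrative, not from the commit):
training_args = TrainingArguments(
    output_dir="./output",
    bf16=True,  # enable mixed precision training with the bf16 data type
)
```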