diff --git a/docs/source/en/performance.mdx b/docs/source/en/performance.mdx
index d75d62001c9..8b6d45cde64 100644
--- a/docs/source/en/performance.mdx
+++ b/docs/source/en/performance.mdx
@@ -30,7 +30,7 @@ Training transformer models efficiently requires an accelerator such as a GPU or
 Training large models on a single GPU can be challenging but there are a number of tools and methods that make it feasible. In this section methods such as mixed precision training, gradient accumulation and checkpointing, efficient optimizers, as well as strategies to determine the best batch size are discussed.
 
-[Go to single GPU training section](perf_train_gpu_single)
+[Go to single GPU training section](perf_train_gpu_one)
 
 ### Multi-GPU