From 91f3e9422fb2dc01cdc9d13913b36489fcbc0b51 Mon Sep 17 00:00:00 2001
From: regisss <15324346+regisss@users.noreply.github.com>
Date: Tue, 29 Apr 2025 14:28:06 -0600
Subject: [PATCH] Add Intel Gaudi doc (#37855)

* Add Intel Gaudi doc

* Use "TIP" instead of "NOTE"

* Address comments from reviews
---
 docs/source/en/_toctree.yml        |  2 ++
 docs/source/en/perf_train_gaudi.md | 34 ++++++++++++++++++++++++++++++
 2 files changed, 36 insertions(+)
 create mode 100644 docs/source/en/perf_train_gaudi.md

diff --git a/docs/source/en/_toctree.yml b/docs/source/en/_toctree.yml
index bf503f96be0..14cf7340ea7 100644
--- a/docs/source/en/_toctree.yml
+++ b/docs/source/en/_toctree.yml
@@ -149,6 +149,8 @@
         title: TPU
       - local: perf_train_special
         title: Apple Silicon
+      - local: perf_train_gaudi
+        title: Intel Gaudi
       - local: perf_hardware
         title: Build your own machine
       title: Hardware
diff --git a/docs/source/en/perf_train_gaudi.md b/docs/source/en/perf_train_gaudi.md
new file mode 100644
index 00000000000..2ba792d484a
--- /dev/null
+++ b/docs/source/en/perf_train_gaudi.md
@@ -0,0 +1,34 @@

# Intel Gaudi

The Intel Gaudi AI accelerator family includes [Intel Gaudi 1](https://habana.ai/products/gaudi/), [Intel Gaudi 2](https://habana.ai/products/gaudi2/), and [Intel Gaudi 3](https://habana.ai/products/gaudi3/). Each server is equipped with 8 devices, known as Habana Processing Units (HPUs), providing 128GB of memory on Gaudi 3, 96GB on Gaudi 2, and 32GB on the first-gen Gaudi. For more details on the underlying hardware architecture, check out the [Gaudi Architecture](https://docs.habana.ai/en/latest/Gaudi_Overview/Gaudi_Architecture.html) overview.

[`TrainingArguments`], [`Trainer`] and [`Pipeline`] detect and set the backend device to `hpu` if an Intel Gaudi device is available. No additional changes are required to enable training and inference on your device.

Some modeling code in Transformers is not optimized for HPU lazy mode. If you encounter any errors, set the environment variable below to use eager mode:

```bash
PT_HPU_LAZY_MODE=0
```

In some cases, you'll also need to enable int64 support to avoid casting issues with long integers:

```bash
PT_ENABLE_INT64_SUPPORT=1
```

Refer to the [Gaudi docs](https://docs.habana.ai/en/latest/index.html) for more details.

> [!TIP]
> For training and inference with Gaudi-optimized model implementations, we recommend using [Optimum for Intel Gaudi](https://huggingface.co/docs/optimum/main/en/habana/index).
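
To make the automatic device selection described above concrete, here is a minimal sketch of inference and device inspection on a Gaudi machine. It assumes the Habana PyTorch bridge (the Gaudi software stack) is installed; the model name, the tiny prompt, and the choice to set the two environment variables from Python before importing Transformers are illustrative, not requirements of this page.

```python
import os

# Optional workarounds described above; set them before importing torch/transformers
# so the Habana PyTorch bridge picks them up. Only needed if you actually hit errors.
os.environ.setdefault("PT_HPU_LAZY_MODE", "0")         # fall back to eager mode
os.environ.setdefault("PT_ENABLE_INT64_SUPPORT", "1")  # avoid long-integer casting issues

from transformers import TrainingArguments, pipeline

# TrainingArguments resolves the backend device automatically; on a machine with an
# Intel Gaudi device this reports "hpu" without any extra configuration.
args = TrainingArguments(output_dir="./out")
print(args.device)

# Pipeline also places the model on the detected device; no `device` argument needed.
generator = pipeline("text-generation", model="gpt2")
print(generator("Intel Gaudi accelerators", max_new_tokens=20)[0]["generated_text"])
```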
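For the Gaudi-optimized path mentioned in the tip, the sketch below roughly follows the Optimum for Intel Gaudi quickstart: [`Trainer`] and [`TrainingArguments`] are swapped for their Gaudi counterparts. Treat the specific values (the `Habana/bert-base-uncased` Gaudi configuration, lazy mode, the toy SST-2 subset) as placeholders from that quickstart rather than settings this page prescribes.

```python
from datasets import load_dataset
from optimum.habana import GaudiTrainer, GaudiTrainingArguments
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Tiny illustrative dataset; any tokenized dataset with labels works the same way.
dataset = load_dataset("glue", "sst2", split="train[:1%]")
dataset = dataset.map(
    lambda ex: tokenizer(ex["sentence"], truncation=True, padding="max_length"),
    batched=True,
)

# GaudiTrainingArguments mirrors TrainingArguments, plus Gaudi-specific flags.
training_args = GaudiTrainingArguments(
    output_dir="./out",
    use_habana=True,                               # run on HPU
    use_lazy_mode=True,                            # HPU lazy/graph mode
    gaudi_config_name="Habana/bert-base-uncased",  # hosted Gaudi config (mixed precision, fused ops)
)

# GaudiTrainer is a drop-in replacement for Trainer on Gaudi devices.
trainer = GaudiTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
)
trainer.train()
```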