From 91f3e9422fb2dc01cdc9d13913b36489fcbc0b51 Mon Sep 17 00:00:00 2001
From: regisss <15324346+regisss@users.noreply.github.com>
Date: Tue, 29 Apr 2025 14:28:06 -0600
Subject: [PATCH] Add Intel Gaudi doc (#37855)

* Add Intel Gaudi doc

* Use "TIP" instead of "NOTE"

* Address comments from reviews
---
 docs/source/en/_toctree.yml        |  2 ++
 docs/source/en/perf_train_gaudi.md | 34 ++++++++++++++++++++++++++++++
 2 files changed, 36 insertions(+)
 create mode 100644 docs/source/en/perf_train_gaudi.md

diff --git a/docs/source/en/_toctree.yml b/docs/source/en/_toctree.yml
index bf503f96be0..14cf7340ea7 100644
--- a/docs/source/en/_toctree.yml
+++ b/docs/source/en/_toctree.yml
@@ -149,6 +149,8 @@
         title: TPU
       - local: perf_train_special
         title: Apple Silicon
+      - local: perf_train_gaudi
+        title: Intel Gaudi
       - local: perf_hardware
         title: Build your own machine
       title: Hardware
diff --git a/docs/source/en/perf_train_gaudi.md b/docs/source/en/perf_train_gaudi.md
new file mode 100644
index 00000000000..2ba792d484a
--- /dev/null
+++ b/docs/source/en/perf_train_gaudi.md
@@ -0,0 +1,34 @@

# Intel Gaudi

The Intel Gaudi AI accelerator family includes [Intel Gaudi 1](https://habana.ai/products/gaudi/), [Intel Gaudi 2](https://habana.ai/products/gaudi2/), and [Intel Gaudi 3](https://habana.ai/products/gaudi3/). Each server is equipped with 8 devices, known as Habana Processing Units (HPUs), providing 128GB of memory on Gaudi 3, 96GB on Gaudi 2, and 32GB on the first-gen Gaudi. For more details on the underlying hardware architecture, check out the [Gaudi Architecture](https://docs.habana.ai/en/latest/Gaudi_Overview/Gaudi_Architecture.html) overview.

[`TrainingArguments`], [`Trainer`] and [`Pipeline`] detect and set the backend device to `hpu` if an Intel Gaudi device is available. No additional changes are required to enable training and inference on your device.

Some modeling code in Transformers is not optimized for HPU lazy mode. If you encounter any errors, set the environment variable below to use eager mode:

```bash
PT_HPU_LAZY_MODE=0
```

In some cases, you'll also need to enable int64 support to avoid casting issues with long integers:

```bash
PT_ENABLE_INT64_SUPPORT=1
```

Refer to the [Gaudi docs](https://docs.habana.ai/en/latest/index.html) for more details.

> [!TIP]
> For training and inference with Gaudi-optimized model implementations, we recommend using [Optimum for Intel Gaudi](https://huggingface.co/docs/optimum/main/en/habana/index).
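
To make the automatic device selection described above concrete, here is a minimal sketch of inference and device inspection on a Gaudi machine. It assumes the Habana PyTorch bridge (the Gaudi software stack) is installed; the model name, the tiny prompt, and the choice to set the two environment variables from Python before importing Transformers are illustrative, not requirements of this page.

```python
import os

# Optional workarounds described above; set them before importing torch/transformers
# so the Habana PyTorch bridge picks them up. Only needed if you actually hit errors.
os.environ.setdefault("PT_HPU_LAZY_MODE", "0")         # fall back to eager mode
os.environ.setdefault("PT_ENABLE_INT64_SUPPORT", "1")  # avoid long-integer casting issues

from transformers import TrainingArguments, pipeline

# TrainingArguments resolves the backend device automatically; on a machine with an
# Intel Gaudi device this reports "hpu" without any extra configuration.
args = TrainingArguments(output_dir="./out")
print(args.device)

# Pipeline also places the model on the detected device; no `device` argument needed.
generator = pipeline("text-generation", model="gpt2")
print(generator("Intel Gaudi accelerators", max_new_tokens=20)[0]["generated_text"])
```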
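For the Gaudi-optimized path mentioned in the tip, the sketch below roughly follows the Optimum for Intel Gaudi quickstart: [`Trainer`] and [`TrainingArguments`] are swapped for their Gaudi counterparts. Treat the specific values (the `Habana/bert-base-uncased` Gaudi configuration, lazy mode, the toy SST-2 subset) as placeholders from that quickstart rather than settings this page prescribes.

```python
from datasets import load_dataset
from optimum.habana import GaudiTrainer, GaudiTrainingArguments
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Tiny illustrative dataset; any tokenized dataset with labels works the same way.
dataset = load_dataset("glue", "sst2", split="train[:1%]")
dataset = dataset.map(
    lambda ex: tokenizer(ex["sentence"], truncation=True, padding="max_length"),
    batched=True,
)

# GaudiTrainingArguments mirrors TrainingArguments, plus Gaudi-specific flags.
training_args = GaudiTrainingArguments(
    output_dir="./out",
    use_habana=True,                               # run on HPU
    use_lazy_mode=True,                            # HPU lazy/graph mode
    gaudi_config_name="Habana/bert-base-uncased",  # hosted Gaudi config (mixed precision, fused ops)
)

# GaudiTrainer is a drop-in replacement for Trainer on Gaudi devices.
trainer = GaudiTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
)
trainer.train()
```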