transformers/docs/source/en/quantization
Jerry Zhang 86777b5e2f
Support AOPerModuleConfig and include_embedding (#37802)
* Support `AOPerModuleConfig` and include_embedding

Summary:
This PR adds support for per-module configuration for torchao.
It also adds per-module quantization examples:

1. Quantizing different layers with different quantization configs
2. Skipping quantization for certain layers
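
The two cases above can be sketched in plain Python. This is an illustration only, not the torchao or transformers implementation: the `resolve_config` helper, the `"_default"` fallback key, the module names, and the string config labels are all assumptions made for this example.

```python
from typing import Dict, Optional

# Hypothetical fallback key used for modules without an explicit entry.
DEFAULT_KEY = "_default"


def resolve_config(module_name: str, per_module: Dict[str, Optional[str]]) -> Optional[str]:
    """Pick the config for a module.

    An exact match on the module name wins; otherwise the default entry
    is used. A value of None means "do not quantize this module".
    """
    if module_name in per_module:
        return per_module[module_name]
    return per_module.get(DEFAULT_KEY)


# Illustrative mapping: string labels stand in for real config objects.
per_module = {
    "_default": "int4_weight_only",             # case 1: most layers
    "model.embed_tokens": "int8_weight_only",   # case 1: a different config for one layer
    "model.layers.0.self_attn.q_proj": None,    # case 2: skip this layer
}

for name in (
    "model.layers.1.mlp.up_proj",
    "model.embed_tokens",
    "model.layers.0.self_attn.q_proj",
):
    print(name, "->", resolve_config(name, per_module))
```

Using None as an explicit "skip" value keeps both cases in a single mapping, so a module that is absent from the mapping (falls back to the default) is distinct from one that is listed but excluded.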

Test Plan:
python tests/quantization/torchao_integration/test_torchao.py -k test_include_embedding
python tests/quantization/torchao_integration/test_torchao.py -k test_per_module_config_skip

* format

* format

* include embedding: remove input embedding from modules not to convert

* more docs

* Update docs/source/en/quantization/torchao.md

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

* Update src/transformers/quantizers/quantizer_torchao.py

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

* Update src/transformers/quantizers/quantizer_torchao.py

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

---------

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-04-30 20:16:29 +02:00
aqlm.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
auto_round.md Fix auto-round hfoption (#37759) 2025-04-24 18:19:38 +02:00
awq.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
bitnet.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
bitsandbytes.md Refactor bitsandbytes doc (#37668) 2025-04-22 16:13:25 +02:00
compressed_tensors.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
concept_guide.md Update quantization docs (#37439) 2025-04-16 15:44:53 +02:00
contribute.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
eetq.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
fbgemm_fp8.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
finegrained_fp8.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
gptq.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
higgs.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
hqq.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
optimum.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
overview.md Add AutoRound quantization support (#37393) 2025-04-22 13:56:54 +02:00
quanto.md fix typos in the docs directory (#36639) 2025-03-11 09:41:41 -07:00
quark.md Support loading Quark quantized models in Transformers (#36372) 2025-03-20 15:40:51 +01:00
selecting.md Update quantization docs (#37439) 2025-04-16 15:44:53 +02:00
spqr.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
torchao.md Support AOPerModuleConfig and include_embedding (#37802) 2025-04-30 20:16:29 +02:00
vptq.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00