transformers/tests/quantization
Jerry Zhang 86777b5e2f
Support AOPerModuleConfig and include_embedding (#37802)
* Support `AOPerModuleConfig` and include_embedding

Summary:
This PR adds support for per-module configuration for torchao.
It also adds per-module quantization examples:

1. Quantizing different layers with different quantization configs
2. Skipping quantization for certain layers
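
The two cases above can be sketched in plain Python. This is only an illustration of the lookup idea, not the torchao or transformers implementation: `resolve_config` and `plan` are hypothetical helpers, the config values are placeholder strings rather than real torchao config objects, and the `_default` fallback key and the "`None` means skip" convention are assumptions made for this sketch.

```python
from typing import Optional

# Placeholder type: real configs would be torchao config objects, not strings.
QuantConfig = str


def resolve_config(
    fqn: str, per_module: dict[str, Optional[QuantConfig]]
) -> Optional[QuantConfig]:
    """Pick the config for a module by its fully qualified name (hypothetical helper).

    An exact match wins; otherwise fall back to the "_default" entry (assumed key).
    A value of None means "leave this module unquantized" (assumed convention).
    """
    if fqn in per_module:
        return per_module[fqn]
    return per_module.get("_default")


def plan(
    module_names: list[str], per_module: dict[str, Optional[QuantConfig]]
) -> dict[str, Optional[QuantConfig]]:
    """Map every module name to the config it would receive (hypothetical helper)."""
    return {name: resolve_config(name, per_module) for name in module_names}


# Case 1: different layers get different configs.
# Case 2: a layer mapped to None is skipped.
per_module = {
    "_default": "int4_weight_only",
    "model.layers.0.self_attn.q_proj": "int8_weight_only",
    "model.layers.0.self_attn.k_proj": None,
}

modules = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.self_attn.k_proj",
    "model.layers.1.mlp.up_proj",
]

for name, cfg in plan(modules, per_module).items():
    print(f"{name}: {cfg if cfg is not None else 'skipped'}")
```

In this sketch `q_proj` gets its own config, `k_proj` is skipped, and every other module falls back to the default entry.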

Test Plan:
python tests/quantization/torchao_integration/test_torchao.py -k test_include_embedding
python tests/quantization/torchao_integration/test_torchao.py -k test_per_module_config_skip

* format

* format

* include embedding: remove input embedding from modules not to convert

* more docs

* Update docs/source/en/quantization/torchao.md

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

* Update src/transformers/quantizers/quantizer_torchao.py

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

* Update src/transformers/quantizers/quantizer_torchao.py

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

---------

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-04-30 20:16:29 +02:00
aqlm_integration Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
autoawq Fixing quantization tests (#37650) 2025-04-22 13:59:57 +02:00
autoround Fix typos in strings and comments (#37799) 2025-04-28 11:39:11 +01:00
bitnet_integration Add Bitnet model (#37742) 2025-04-28 15:08:46 +02:00
bnb Fix typos in strings and comments (#37799) 2025-04-28 11:39:11 +01:00
compressed_tensors_integration Fix: Unexpected Keys, Improve run_compressed, Rename Test Folder (#37077) 2025-04-04 21:30:11 +02:00
eetq_integration Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
fbgemm_fp8 Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
finegrained_fp8 Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
ggml Fixing quantization tests (#37650) 2025-04-22 13:59:57 +02:00
gptq Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
higgs Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
hqq Fixes hqq by following a new path for bias parameter in pre_quantized models (#37530) 2025-04-16 13:58:14 +02:00
quanto_integration Fixing quantization tests (#37650) 2025-04-22 13:59:57 +02:00
quark_integration Fix typos in strings and comments (#37799) 2025-04-28 11:39:11 +01:00
spqr_integration Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
torchao_integration Support AOPerModuleConfig and include_embedding (#37802) 2025-04-30 20:16:29 +02:00
vptq_integration Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00