Mirror of https://github.com/huggingface/transformers.git
* Support `AOPerModuleConfig` and include_embedding

  Summary:
  This PR adds support for per-module configuration for torchao.

  Per-module quantization examples were also added:
  1. Quantizing different layers with different quantization configs
  2. Skipping quantization for certain layers

  Test Plan:
  python tests/quantization/torchao_integration/test_torchao.py -k test_include_embedding
  python tests/quantization/torchao_integration/test_torchao.py -k test_per_module_config_skip

  Reviewers:
  Subscribers:
  Tasks:
  Tags:

* format

* format

* include embedding; remove input embedding from modules not to convert

* more docs

* Update docs/source/en/quantization/torchao.md

  Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

* Update src/transformers/quantizers/quantizer_torchao.py

  Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

* Update src/transformers/quantizers/quantizer_torchao.py

  Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

---------

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
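A minimal sketch of the per-module configuration workflow this PR describes. It assumes `AOPerModuleConfig` accepts a dict mapping module fully-qualified names to torchao configs, that a `"_default"` key covers all remaining modules and `None` skips quantization, and that the checkpoint and module names shown are only illustrative; exact import paths and keys may differ across torchao versions.

```python
# Sketch only: AOPerModuleConfig semantics ("_default" key, None = skip)
# and the module FQNs below are assumptions based on the PR description.
from torchao.quantization import AOPerModuleConfig, Int4WeightOnlyConfig, Int8WeightOnlyConfig
from transformers import AutoModelForCausalLM, TorchAoConfig

per_module = AOPerModuleConfig({
    "model.embed_tokens": None,                                  # skip quantization for the embedding (assumed FQN)
    "model.layers.0.self_attn.q_proj": Int8WeightOnlyConfig(),   # a different config for one specific layer
    "_default": Int4WeightOnlyConfig(),                          # int4 weight-only for everything else
})

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B",          # illustrative checkpoint
    torch_dtype="auto",
    device_map="auto",
    quantization_config=TorchAoConfig(quant_type=per_module),
)
```

The same mechanism covers both examples the PR lists: mapping several module names to different configs quantizes layers differently, while mapping a name to no config leaves that layer unquantized.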
Files in `docs/source/en/quantization/`:

- aqlm.md
- auto_round.md
- awq.md
- bitnet.md
- bitsandbytes.md
- compressed_tensors.md
- concept_guide.md
- contribute.md
- eetq.md
- fbgemm_fp8.md
- finegrained_fp8.md
- gptq.md
- higgs.md
- hqq.md
- optimum.md
- overview.md
- quanto.md
- quark.md
- selecting.md
- spqr.md
- torchao.md
- vptq.md