transformers/tests/quantization
jiqing-feng b916efcb3c
Enables CPU AWQ model with IPEX version. (#33460)
* enable cpu awq ipex linear

* add doc for cpu awq with ipex kernel

* add tests for cpu awq

* fix code style

* fix doc and tests

* Update docs/source/en/quantization/awq.md

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update tests/quantization/autoawq/test_awq.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* fix comments

* fix log

* fix log

* fix style

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-10-04 16:25:10 +02:00
..
aqlm_integration Cache: use batch_size instead of max_batch_size (#32657) 2024-08-16 11:48:45 +01:00
autoawq Enables CPU AWQ model with IPEX version. (#33460) 2024-10-04 16:25:10 +02:00
bnb Enable BNB multi-backend support (#31098) 2024-09-24 03:40:56 -06:00
compressed_tensor HFQuantizer implementation for compressed-tensors library (#31704) 2024-09-25 14:31:38 +02:00
eetq_integration [FEAT]: EETQ quantizer support (#30262) 2024-04-22 20:38:58 +01:00
fbgemm_fp8 Fix FbgemmFp8Linear not preserving tensor shape (#33239) 2024-09-11 13:26:44 +02:00
ggml Add falcon gguf (#33437) 2024-10-02 14:10:39 +02:00
gptq 🚨 Remove dataset with restrictive license (#31452) 2024-06-17 17:56:51 +01:00
hqq Hqq serialization (#33141) 2024-09-30 14:47:18 +02:00
quanto_integration [Quantization] Switch to optimum-quanto (#31732) 2024-10-02 15:14:34 +02:00
torchao_integration Add TorchAOHfQuantizer (#32306) 2024-08-14 16:14:24 +02:00