transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 10:12:23 +06:00

jiqing-feng b916efcb3c Enables CPU AWQ model with IPEX version. (#33460)	2024-10-04 16:25:10 +02:00
* enable cpu awq ipex linear
* add doc for cpu awq with ipex kernel
* add tests for cpu awq
* fix code style
* fix doc and tests
* Update docs/source/en/quantization/awq.md
* Update tests/quantization/autoawq/test_awq.py
* fix comments
* fix log
* fix log
* fix style
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

aqlm_integration	Cache: use `batch_size` instead of `max_batch_size` (#32657)	2024-08-16 11:48:45 +01:00
autoawq	Enables CPU AWQ model with IPEX version. (#33460)	2024-10-04 16:25:10 +02:00
bnb	Enable BNB multi-backend support (#31098)	2024-09-24 03:40:56 -06:00
compressed_tensor	HFQuantizer implementation for compressed-tensors library (#31704)	2024-09-25 14:31:38 +02:00
eetq_integration	[FEAT]: EETQ quantizer support (#30262)	2024-04-22 20:38:58 +01:00
fbgemm_fp8	Fix `FbgemmFp8Linear` not preserving tensor shape (#33239)	2024-09-11 13:26:44 +02:00
ggml	Add falcon gguf (#33437)	2024-10-02 14:10:39 +02:00
gptq	🚨 Remove dataset with restrictive license (#31452)	2024-06-17 17:56:51 +01:00
hqq	Hqq serialization (#33141)	2024-09-30 14:47:18 +02:00
quanto_integration	[Quantization] Switch to optimum-quanto (#31732)	2024-10-02 15:14:34 +02:00
torchao_integration	Add TorchAOHfQuantizer (#32306)	2024-08-14 16:14:24 +02:00