transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-05 22:00:09 +06:00

Jerry Zhang 78d78cdf8a Add TorchAOHfQuantizer (#32306)	2024-08-14 16:14:24 +02:00

* Add TorchAOHfQuantizer
  Summary: Enable loading torchao quantized model in huggingface.
  Test Plan: local test
  Reviewers: Subscribers: Tasks: Tags:
* Fix a few issues
* style
* Added tests and addressed some comments about dtype conversion
* fix torch_dtype warning message
* fix tests
* style
* TorchAOConfig -> TorchAoConfig
* enable offload + fix memory with multi-gpu
* update torchao version requirement to 0.4.0
* better comments
* add torch.compile to torchao README, add perf number link

Co-authored-by: Marc Sun <marc@huggingface.co>

aqlm.md	Docs / Quantization: refactor quantization documentation (#30942)	2024-05-23 14:31:52 +02:00
awq.md	docs: fix broken link (#31370)	2024-06-12 11:33:00 +01:00
bitsandbytes.md	Docs / Quantization: refactor quantization documentation (#30942)	2024-05-23 14:31:52 +02:00
contribute.md	Docs / Quantization: refactor quantization documentation (#30942)	2024-05-23 14:31:52 +02:00
eetq.md	Docs / Quantization: refactor quantization documentation (#30942)	2024-05-23 14:31:52 +02:00
fbgemm_fp8.md	Add new quant method (#32047)	2024-07-22 20:21:59 +02:00
gptq.md	Docs / Quantization: refactor quantization documentation (#30942)	2024-05-23 14:31:52 +02:00
hqq.md	Docs / Quantization: refactor quantization documentation (#30942)	2024-05-23 14:31:52 +02:00
optimum.md	Docs / Quantization: refactor quantization documentation (#30942)	2024-05-23 14:31:52 +02:00
overview.md	Add TorchAOHfQuantizer (#32306)	2024-08-14 16:14:24 +02:00
quanto.md	Docs / Quantization: refactor quantization documentation (#30942)	2024-05-23 14:31:52 +02:00
torchao.md	Add TorchAOHfQuantizer (#32306)	2024-08-14 16:14:24 +02:00