transformers/docs/source/en/quantization
Jerry Zhang 78d78cdf8a
Add TorchAOHfQuantizer (#32306)
* Add TorchAOHfQuantizer

Summary:
Enable loading torchao-quantized models in Hugging Face Transformers.

Test Plan:
local test

* Fix a few issues

* style

* Added tests and addressed some comments about dtype conversion

* fix torch_dtype warning message

* fix tests

* style

* TorchAOConfig -> TorchAoConfig

* enable offload + fix memory with multi-gpu

* update torchao version requirement to 0.4.0

* better comments

* add torch.compile to torchao README, add perf number link

---------

Co-authored-by: Marc Sun <marc@huggingface.co>
2024-08-14 16:14:24 +02:00
aqlm.md Docs / Quantization: refactor quantization documentation (#30942) 2024-05-23 14:31:52 +02:00
awq.md docs: fix broken link (#31370) 2024-06-12 11:33:00 +01:00
bitsandbytes.md Docs / Quantization: refactor quantization documentation (#30942) 2024-05-23 14:31:52 +02:00
contribute.md Docs / Quantization: refactor quantization documentation (#30942) 2024-05-23 14:31:52 +02:00
eetq.md Docs / Quantization: refactor quantization documentation (#30942) 2024-05-23 14:31:52 +02:00
fbgemm_fp8.md Add new quant method (#32047) 2024-07-22 20:21:59 +02:00
gptq.md Docs / Quantization: refactor quantization documentation (#30942) 2024-05-23 14:31:52 +02:00
hqq.md Docs / Quantization: refactor quantization documentation (#30942) 2024-05-23 14:31:52 +02:00
optimum.md Docs / Quantization: refactor quantization documentation (#30942) 2024-05-23 14:31:52 +02:00
overview.md Add TorchAOHfQuantizer (#32306) 2024-08-14 16:14:24 +02:00
quanto.md Docs / Quantization: refactor quantization documentation (#30942) 2024-05-23 14:31:52 +02:00
torchao.md Add TorchAOHfQuantizer (#32306) 2024-08-14 16:14:24 +02:00