mirror of https://github.com/huggingface/transformers.git synced 2025-07-04 05:10:06 +06:00

Docs / Quantization: refactor quantization documentation (#30942 )

* refactor quant docs

* delete file

* rename to overview

* fix

* fix table

* fix

* add content

* fix library versions

* fix table

* fix table

* fix table

* fix table

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* replace to quantization_config

* fix aqlm snippet

* add DLAI courses

* fix

* fix table

* fix bulet points

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

2024-05-23 14:31:52 +02:00

1.1 KiB

Raw Blame History

Optimum

The Optimum library supports quantization for Intel, Furiosa, ONNX Runtime, GPTQ, and lower-level PyTorch quantization functions. Consider using Optimum for quantization if you're using specific and optimized hardware like Intel CPUs, Furiosa NPUs or a model accelerator like ONNX Runtime.

1.1 KiB Raw Blame History

Optimum

1.1 KiB

Raw Blame History