* Add support for loading GPTQ models on CPU

  Currently, a GPTQ-quantized model can be loaded only on a CUDA device. The attribute `gptq_supports_cpu` checks whether the installed auto_gptq version is one that supports CPU loading. The larger model variants are hard to load, run, or trace on a GPU, which is the rationale for adding this attribute.

  Signed-off-by: Vivek Khandelwal <vivek@nod-labs.com>

* Update quantization.md
* Update quantization.md
* Update quantization.md
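A minimal sketch of the kind of version gate the commit describes, followed by how a CPU load might then look. The version threshold `0.4.2` and the model id below are illustrative assumptions, not values confirmed by this commit:

```python
import importlib.metadata

from packaging import version
from transformers import AutoModelForCausalLM, AutoTokenizer

def gptq_supports_cpu() -> bool:
    """Sketch of the check described above: CPU loading of GPTQ models is
    treated as supported only when the installed auto_gptq is recent enough.
    The threshold here is an assumption for illustration."""
    try:
        installed = version.parse(importlib.metadata.version("auto-gptq"))
    except importlib.metadata.PackageNotFoundError:
        return False
    return installed > version.parse("0.4.2")  # assumed minimum version

if gptq_supports_cpu():
    model_id = "TheBloke/Llama-2-7B-GPTQ"  # illustrative GPTQ checkpoint
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # With CPU support available, the quantized weights can be placed on
    # the CPU instead of requiring a CUDA device.
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="cpu")
```

If the gate returns `False`, falling back to a CUDA device (the previously required path) keeps older auto_gptq installs working.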
Directory listing: `de/`, `en/`, `es/`, `fr/`, `hi/`, `it/`, `ja/`, `ko/`, `ms/`, `pt/`, `te/`, `zh/`, `_config.py`