transformers/tests/quantization

Latest commit f5247aca01 by mobicham
Hqq serialization (#33141)
* HQQ model serialization attempt

* fix hqq dispatch and unexpected keys

* style

* remove check_old_param

* revert to check HQQLinear in quantizer_hqq.py

* revert to check HQQLinear in quantizer_hqq.py

* update HqqConfig default params

* make ci happy

* make ci happy

* revert to HQQLinear check in quantizer_hqq.py

* check hqq_min version 0.2.0

* set axis=1 as default in quantization_config.py

* validate_env with hqq>=0.2.0 version message

* deprecated hqq kwargs message

* make ci happy

* remove run_expected_keys_check hack + bump to 0.2.1 min hqq version

* fix unexpected_keys hqq update

* add pre_quantized check

* add update_expected_keys to base quantizer

* ci base.py fix?

* ci base.py fix?

* fix "quantization typo" src/transformers/utils/quantization_config.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix post merge

---------

Co-authored-by: Marc Sun <marc@huggingface.co>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-09-30 14:47:18 +02:00
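
For context, the PR summarized in the commit log above makes HQQ-quantized models serializable through the standard save_pretrained/from_pretrained round trip, bumps the minimum hqq version to 0.2.1, and makes axis=1 the HqqConfig default. Below is a minimal sketch of the resulting workflow; the checkpoint name and output directory are illustrative placeholders, not taken from the PR, and exact HqqConfig defaults should be checked against the installed transformers version.

```python
# Minimal sketch of the HQQ serialization round trip; assumes hqq>=0.2.1
# is installed and a CUDA device is available.
from transformers import AutoModelForCausalLM, HqqConfig

# axis=1 is the default quantization axis after this PR
quant_config = HqqConfig(nbits=4, group_size=64)

# Quantize on the fly at load time
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B",  # placeholder checkpoint
    torch_dtype="auto",
    device_map="cuda",
    quantization_config=quant_config,
)

# With serialization support, the already-quantized weights can be saved
# and reloaded directly, skipping re-quantization from the fp16 weights.
model.save_pretrained("./llama-hqq-4bit")
reloaded = AutoModelForCausalLM.from_pretrained(
    "./llama-hqq-4bit",
    device_map="cuda",
)
```
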
Directory           | Last commit                                                        | Date
aqlm_integration    | Cache: use batch_size instead of max_batch_size (#32657)           | 2024-08-16 11:48:45 +01:00
autoawq             | Skip tests properly (#31308)                                       | 2024-06-26 21:59:08 +01:00
bnb                 | Enable BNB multi-backend support (#31098)                          | 2024-09-24 03:40:56 -06:00
compressed_tensor   | HFQuantizer implementation for compressed-tensors library (#31704) | 2024-09-25 14:31:38 +02:00
eetq_integration    | [FEAT]: EETQ quantizer support (#30262)                            | 2024-04-22 20:38:58 +01:00
fbgemm_fp8          | Fix FbgemmFp8Linear not preserving tensor shape (#33239)           | 2024-09-11 13:26:44 +02:00
ggml                | Add gguf support for bloom (#33473)                                | 2024-09-27 12:13:40 +02:00
gptq                | 🚨 Remove dataset with restrictive license (#31452)                | 2024-06-17 17:56:51 +01:00
hqq                 | Hqq serialization (#33141)                                         | 2024-09-30 14:47:18 +02:00
quanto_integration  | Skip tests properly (#31308)                                       | 2024-06-26 21:59:08 +01:00
torchao_integration | Add TorchAOHfQuantizer (#32306)                                    | 2024-08-14 16:14:24 +02:00