transformers/tests/quantization
George e4e404fdd0
Run model as compressed/uncompressed mode (#34719)
* draft, run model as compressed/uncompressed mode

* draft

* run run_compressed=False

* run_compressed as attr

* set run_compressed=False using quantization_config

* remove redundant line

* make is_qat_trainable dependent on run_compressed status

* add tests

* lint

* full in docstring

* add decompress

* comments

* decompress if model is compressed and not run_compressed

* apply_quant_config logic fix -- populate statedict properly

* comments

* remove non-compressed model

* make is_compressed as property

* cosmetic

* run apply_quant_config for non-compressed models -- populate scales and zeropoints

* add pathway for decompressing sparse models

* typo on is_quantization_compressed

* lint

* fix typo
2024-12-13 08:23:31 +01:00
..
aqlm_integration Skipping aqlm non working inference tests till fix merged (#34865) 2024-11-26 11:09:30 +01:00
autoawq Enables CPU AWQ model with IPEX version. (#33460) 2024-10-04 16:25:10 +02:00
bitnet_integration Fix : BitNet tests (#34895) 2024-11-25 16:47:14 +01:00
bnb [CI] Fix bnb quantization tests with accelerate>=1.2.0 (#35172) 2024-12-09 13:55:16 -05:00
compressed_tensor Run model as compressed/uncompressed mode (#34719) 2024-12-13 08:23:31 +01:00
eetq_integration Fix typo in EETQ Tests (#35160) 2024-12-09 14:13:36 +01:00
fbgemm_fp8 Fix FbgemmFp8Linear not preserving tensor shape (#33239) 2024-09-11 13:26:44 +02:00
ggml Fix failing GGML test (#34871) 2024-11-25 18:04:52 +01:00
gptq 🚨 Remove dataset with restrictive license (#31452) 2024-06-17 17:56:51 +01:00
hqq Hqq serialization (#33141) 2024-09-30 14:47:18 +02:00
quanto_integration [Quantization] Switch to optimum-quanto (#31732) 2024-10-02 15:14:34 +02:00
torchao_integration Fix CI by tweaking torchao tests (#34832) 2024-11-20 20:28:51 +01:00