transformers/tests/quantization/compressed_tensor
George e4e404fdd0
Run model as compressed/uncompressed mode (#34719)
* draft, run model as compressed/uncompressed mode

* draft

* run run_compressed=False

* run_compressed as attr

* set run_compressed=False using quantization_config (see the sketch after this log)

* remove redundant line

* make is_qat_trainable dependent on run_compressed status

* add tests

* lint

* fill in docstring

* add decompress

* comments

* decompress if model is compressed and not run_compressed

* apply_quant_config logic fix -- populate state_dict properly

* comments

* remove non-compressed model

* make is_compressed as property

* cosmetic

* run apply_quant_config for non-compressed models -- populate scales and zero points

* add pathway for decompressing sparse models

* fix typo in is_quantization_compressed

* lint

* fix typo
2024-12-13 08:23:31 +01:00
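In brief, this commit lets a checkpoint quantized with compressed-tensors be loaded either as-is (compressed) or decompressed on load, controlled by the run_compressed flag on the quantization config. A minimal sketch of that usage follows, assuming a transformers version that includes this change; the checkpoint name is hypothetical and only illustrative.

```python
from transformers import AutoModelForCausalLM, CompressedTensorsConfig

# run_compressed=False asks the quantizer to decompress the checkpoint while
# loading, so the model runs with uncompressed (dense) weights; leaving it at
# its default keeps the weights in their compressed form for inference.
quantization_config = CompressedTensorsConfig(run_compressed=False)

model = AutoModelForCausalLM.from_pretrained(
    "nm-testing/tinyllama-w8a8-compressed",  # hypothetical checkpoint name
    quantization_config=quantization_config,
)
```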
__init__.py HFQuantizer implementation for compressed-tensors library (#31704) 2024-09-25 14:31:38 +02:00
test_compressed_tensors.py HFQuantizer implementation for compressed-tensors library (#31704) 2024-09-25 14:31:38 +02:00
test_load_sparse_model.py Run model as compressed/uncompressed mode (#34719) 2024-12-13 08:23:31 +01:00
test_run_compressed_model.py Run model as compressed/uncompressed mode (#34719) 2024-12-13 08:23:31 +01:00