Marc Sun
|
28de2f4de3
|
[Quantization] Quanto quantizer (#29023)
* start integration
* fix
* add and debug tests
* update tests
* make pytorch serialization work
* compatible with device_map and offload
* fix tests
* make style
* add ref
* guard against safetensors
* add float8 and style
* fix is_serializable
* Fix shard_checkpoint compatibility with quanto
* more tests
* docs
* adjust memory
* better
* style
* pass tests
* Update src/transformers/modeling_utils.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* add is_safe_serialization instead
* Update src/transformers/quantizers/quantizer_quanto.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* add QbitsTensor tests
* fix tests
* simplify activation list
* Update docs/source/en/quantization.md
Co-authored-by: David Corvoysier <david.corvoysier@gmail.com>
* better comment
* Update tests/quantization/quanto_integration/test_quanto.py
Co-authored-by: David Corvoysier <david.corvoysier@gmail.com>
* Update tests/quantization/quanto_integration/test_quanto.py
Co-authored-by: David Corvoysier <david.corvoysier@gmail.com>
* find and fix edge case
* Update docs/source/en/quantization.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* pass weights_only_kwarg instead
* fix shard_checkpoint loading
* simplify update_missing_keys
* Update tests/quantization/quanto_integration/test_quanto.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* recursion to get all tensors
* block serialization
* skip serialization tests
* fix
* change by cuda:0 for now
* fix regression
* update device_map
* fix doc
* add notebook
* update torch_dtype
* update doc
* typo
* typo
* remove comm
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: David Corvoysier <david.corvoysier@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Younes Belkada <younesbelkada@gmail.com>
|
2024-03-15 11:51:29 -04:00 |
|