transformers/tests/quantization
Latest commit: Add Gemma2 GGUF support (#34002)
Author: Yijun Lee (e5fd865eba), 2025-01-03 14:50:07 +01:00
Co-authored-by: Isotr0py <2037008807@qq.com>

* initial setup for ggml.py
* initial setup of GGUFGemma2Converter class
* Add gemma2 model to gguf.md doc
* Partial work on GGUF_TENSOR_MAPPING
* initial setup of GGUF_TENSOR_MAPPING for Gemma2
* refactor: rename GemmaConvert class to GemmaConverter for naming consistency
* feat: complete gemma2 tensor mapping implementation
* feat: add initial implementation of GGUFGemmaConverter
* feat: complete GGUFGemmaConverter implementation
* feat: add test code for gemma2
* refactor: minor code cleanup
* refactor: minor code cleanup
* fix: resolve suggestions
* Update tests/quantization/ggml/test_ggml.py
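With this commit, Gemma2 GGUF checkpoints can be loaded through the same `gguf_file` path that gguf.md documents for other architectures: the GGUF tensor names are translated via GGUF_TENSOR_MAPPING and the weights are dequantized into a regular transformers model. A minimal sketch, assuming a hypothetical Hub repo and file name (any Gemma2 GGUF checkpoint should work the same way):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed example repo and quantized file name; substitute any Gemma2 GGUF checkpoint.
model_id = "bartowski/gemma-2-2b-it-GGUF"
gguf_file = "gemma-2-2b-it-Q4_K_M.gguf"

# The gguf_file argument tells transformers to read the GGUF file, dequantize its
# tensors, and build a standard Gemma2 model and tokenizer from them.
tokenizer = AutoTokenizer.from_pretrained(model_id, gguf_file=gguf_file)
model = AutoModelForCausalLM.from_pretrained(model_id, gguf_file=gguf_file)

inputs = tokenizer("Hello, Gemma2!", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The test added in tests/quantization/ggml/test_ggml.py presumably exercises this same round trip: load the GGUF weights, generate, and compare against expected text.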
| Directory | Latest commit | Date |
|---|---|---|
| aqlm_integration | Skipping aqlm non working inference tests till fix merged (#34865) | 2024-11-26 11:09:30 +01:00 |
| autoawq | Enables CPU AWQ model with IPEX version. (#33460) | 2024-10-04 16:25:10 +02:00 |
| bitnet_integration | Fix : BitNet tests (#34895) | 2024-11-25 16:47:14 +01:00 |
| bnb | Fix new BNB test failures (#35345) | 2025-01-02 11:24:52 +01:00 |
| compressed_tensor | Run model as compressed/uncompressed mode (#34719) | 2024-12-13 08:23:31 +01:00 |
| eetq_integration | Fix typo in EETQ Tests (#35160) | 2024-12-09 14:13:36 +01:00 |
| fbgemm_fp8 | Fix FbgemmFp8Linear not preserving tensor shape (#33239) | 2024-09-11 13:26:44 +02:00 |
| ggml | Add Gemma2 GGUF support (#34002) | 2025-01-03 14:50:07 +01:00 |
| gptq | 🚨 Remove dataset with restrictive license (#31452) | 2024-06-17 17:56:51 +01:00 |
| higgs | HIGGS Quantization Support (#34997) | 2024-12-23 16:54:49 +01:00 |
| hqq | Hqq serialization (#33141) | 2024-09-30 14:47:18 +02:00 |
| quanto_integration | [Quantization] Switch to optimum-quanto (#31732) | 2024-10-02 15:14:34 +02:00 |
| torchao_integration | Fix CI by tweaking torchao tests (#34832) | 2024-11-20 20:28:51 +01:00 |
| vptq_integration | Fix : VPTQ test (#35394) | 2024-12-23 16:27:46 +01:00 |