transformers/tests/quantization
Lysandre Debut a42844955f
Loading GGUF files support (#30391)
* Adds support for loading GGUF files

Co-authored-by: Younes Belkada <younesbelkada@gmail.com>
Co-authored-by: 99991 <99991@users.noreply.github.com>

* add q2_k q3_k q5_k support from @99991

* fix tests

* Update doc

* Style

* Docs

* fix CI

* Update docs/source/en/gguf.md

* Update docs/source/en/gguf.md

* Compute merges

* change logic

* add comment for clarity

* add comment for clarity

* Update src/transformers/models/auto/tokenization_auto.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* change logic

* Update src/transformers/modeling_utils.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* change

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/modeling_gguf_pytorch_utils.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* put back comment

* add comment about mistral

* comments and added tests

* fix inconsistent type

* more

* fix tokenizer

* Update src/transformers/modeling_utils.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* address comments about tests and tokenizer + add added_tokens

* from_gguf -> gguf_file

* replace on docs too

---------

Co-authored-by: Younes Belkada <younesbelkada@gmail.com>
Co-authored-by: 99991 <99991@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-15 14:28:20 +02:00
Name                Last commit                                               Date
aqlm_integration    Cache: Static cache as a standalone object (#30476)       2024-04-30 16:37:19 +01:00
autoawq             [awq] replace scale when we have GELU (#30074)            2024-05-13 11:41:03 +02:00
bnb                 [bnb] Fix offload test (#30039)                           2024-04-05 13:11:28 +02:00
eetq_integration    [FEAT]: EETQ quantizer support (#30262)                   2024-04-22 20:38:58 +01:00
ggml                Loading GGUF files support (#30391)                       2024-05-15 14:28:20 +02:00
gptq                [GPTQ] Fix test (#28018)                                  2024-01-15 11:22:54 -05:00
hqq                 Quantization / HQQ: Fix HQQ tests on our runner (#30668)  2024-05-06 11:33:52 +02:00
quanto_integration  [Quantization] Quanto quantizer (#29023)                  2024-03-15 11:51:29 -04:00