Alazar
96429e74a8
Add support for GGUF Phi-3 (#31844)
* Update docs for GGUF supported models
* Add tensor mappings and define class GGUFPhi3Converter
* Fix tokenizer
* Working version
* Attempt to fix some CI failures
* Run ruff format
* Add vocab, merges, decoder methods like LlamaConverter
* Resolve conflicts since Qwen2Moe was added to gguf
- I missed one spot when resolving the conflict
- I also made a mistake in tests_ggml.py; it has now been fixed to match
the version on master.
2024-09-10 13:32:38 +02:00
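The converter work above (tensor mappings for Phi-3, and for Qwen2/Qwen2Moe below) revolves around renaming GGUF tensors to their transformers parameter names. A minimal sketch of that idea, assuming the common llama.cpp naming scheme; the specific name pairs and the `{bid}` placeholder convention here are illustrative, not the exact tables transformers ships:

```python
import re

# Illustrative subset of a GGUF -> transformers tensor-name mapping.
# The real per-architecture tables in transformers cover many more tensors.
GGUF_TO_HF = {
    "token_embd.weight": "model.embed_tokens.weight",
    "output_norm.weight": "model.norm.weight",
    "blk.{bid}.attn_q.weight": "model.layers.{bid}.self_attn.q_proj.weight",
    "blk.{bid}.attn_k.weight": "model.layers.{bid}.self_attn.k_proj.weight",
    "blk.{bid}.ffn_down.weight": "model.layers.{bid}.mlp.down_proj.weight",
}

def rename_tensor(gguf_name):
    """Map one GGUF tensor name to its transformers equivalent, or None."""
    # Swap the numeric block index for the {bid} placeholder used as the key.
    templated = re.sub(r"\bblk\.(\d+)\.", "blk.{bid}.", gguf_name)
    target = GGUF_TO_HF.get(templated)
    if target is None:
        return None
    m = re.search(r"\bblk\.(\d+)\.", gguf_name)
    return target.format(bid=m.group(1)) if m else target

print(rename_tensor("blk.3.attn_q.weight"))
# -> model.layers.3.self_attn.q_proj.weight
```

A per-architecture converter class (as in the GGUFPhi3Converter commit above) would additionally handle tokenizer vocab, merges, and config translation on top of such a weight-name map.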
Vladislav Bronzov
5d11de4a2f
Add Qwen2Moe GGUF loading support (#33264)
* update gguf doc, config and tensor mapping
* add qwen2moe architecture support, GGUFQwen2MoeConverter and q4 unit tests
* apply code style fixes
* reformat files
* assign GGUFQwen2Converter to qwen2_moe
2024-09-05 17:42:03 +02:00
Isotr0py
edeca4387c
🚨 Support dequantization for most GGML types (#32625)
* use gguf internal dequantize
* add Q5_0 test
* add iq1 test
* add remaining tests
* remove duplicated test
* update docs
* add gguf version limit
* make style
* update gguf import catch
* revert vocab_size patch
* make style
* use GGUF_MIN_VERSION everywhere
2024-09-03 12:58:14 +02:00
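The commit above delegates dequantization to helpers in the `gguf` package rather than reimplementing each GGML type. To illustrate what such a helper does, here is a toy numpy re-implementation for the simplest quantized type, Q8_0 (blocks of 32 int8 weights sharing one fp16 scale); this is a sketch of the block format, not the code transformers actually calls:

```python
import numpy as np

QK8_0 = 32  # elements per Q8_0 block in GGML

def dequantize_q8_0(raw):
    """Dequantize GGML Q8_0 data: each 34-byte block is one fp16 scale
    followed by 32 int8 quants; weight = scale * quant."""
    block_size = 2 + QK8_0  # 2-byte fp16 scale + 32 int8 values
    buf = np.frombuffer(raw, dtype=np.uint8).reshape(-1, block_size)
    scales = buf[:, :2].copy().view(np.float16).astype(np.float32)  # (n, 1)
    quants = buf[:, 2:].view(np.int8).astype(np.float32)            # (n, 32)
    return (scales * quants).reshape(-1)

def quantize_q8_0(values):
    """Inverse helper for round-trip testing (absmax scaling to int8)."""
    out = bytearray()
    for block in values.reshape(-1, QK8_0):
        scale = np.abs(block).max() / 127.0 or 1.0
        out += np.float16(scale).tobytes()
        out += np.round(block / scale).astype(np.int8).tobytes()
    return bytes(out)
```

The more elaborate K-quant and IQ types covered by the PR follow the same pattern with nested sub-block scales, which is exactly why reusing the `gguf` package's tested dequantize routines is preferable to maintaining parallel implementations.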
Isotr0py
e4628434d8
Add Qwen2 GGUF loading support (#31175)
* add qwen2 gguf support
* Update docs
* fix qwen2 tokenizer
* add qwen2 gguf test
* fix typo in qwen2 gguf test
* format code
* Remove mistral, clarify the error message
* format code
* add typing and update docstring
2024-06-03 14:55:10 +01:00
Lysandre Debut
a42844955f
Loading GGUF files support (#30391)
* Adds support for loading GGUF files
Co-authored-by: Younes Belkada <younesbelkada@gmail.com>
Co-authored-by: 99991 <99991@users.noreply.github.com>
* add q2_k q3_k q5_k support from @99991
* fix tests
* Update doc
* Style
* Docs
* fix CI
* Update docs/source/en/gguf.md
* Update docs/source/en/gguf.md
* Compute merges
* change logic
* add comment for clarity
* add comment for clarity
* Update src/transformers/models/auto/tokenization_auto.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* change logic
* Update src/transformers/modeling_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* change
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/modeling_gguf_pytorch_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* put back comment
* add comment about mistral
* comments and added tests
* fix inconsistent type
* more
* fix tokenizer
* Update src/transformers/modeling_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* address comments about tests and tokenizer + add added_tokens
* from_gguf -> gguf_file
* replace on docs too
---------
Co-authored-by: Younes Belkada <younesbelkada@gmail.com>
Co-authored-by: 99991 <99991@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-15 14:28:20 +02:00
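The loader introduced in this PR (the `gguf_file` argument to `from_pretrained`) starts by reading the GGUF container before any tensor mapping or dequantization happens. A minimal sketch of the fixed-size GGUF file header, little-endian per the GGUF spec, independent of transformers; the synthetic counts below are illustrative:

```python
import struct

GGUF_MAGIC = b"GGUF"

def parse_gguf_header(data):
    """Parse the fixed GGUF header: 4-byte magic, uint32 version,
    uint64 tensor count, uint64 metadata KV count (all little-endian).

    Sketch for illustration; real loaders (e.g. gguf.GGUFReader) go on
    to parse the metadata key/values and tensor infos that follow."""
    magic, version, n_tensors, n_kv = struct.unpack_from("<4sIQQ", data, 0)
    if magic != GGUF_MAGIC:
        raise ValueError(f"not a GGUF file (magic={magic!r})")
    return {"version": version, "n_tensors": n_tensors, "n_kv": n_kv}

# Build a synthetic header to demonstrate a round trip.
header = struct.pack("<4sIQQ", GGUF_MAGIC, 3, 291, 24)
print(parse_gguf_header(header))
# -> {'version': 3, 'n_tensors': 291, 'n_kv': 24}
```

In transformers itself this all stays behind one call, e.g. `AutoModelForCausalLM.from_pretrained(repo_id, gguf_file=filename)` (the final `from_gguf -> gguf_file` rename above is the public API).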