Yijun Lee
e5fd865eba
Add Gemma2 GGUF support (#34002)
* initial setup for ggml.py
* initial setup of GGUFGemma2Converter class
* Add gemma2 model to gguf.md doc
* Partial work on GGUF_TENSOR_MAPPING
* initial setup of GGUF_TENSOR_MAPPING for Gemma2
* refactor: rename GemmaConvert class to GemmaConverter for naming consistency
* feat: complete gemma2 tensor mapping implementation
* feat: add initial implementation of GGUFGemmaConverter
* feat: complete GGUFGemmaConverter implementation
* feat: add test code for gemma2
* refactor: minor code cleanup
* refactor: minor code cleanup
* fix: resolve suggestions
* Update tests/quantization/ggml/test_ggml.py
Co-authored-by: Isotr0py <2037008807@qq.com>
---------
Co-authored-by: Isotr0py <2037008807@qq.com>
2025-01-03 14:50:07 +01:00
farrosalferro
c57eafdaa1
Add Nemotron GGUF Loading Support (#34725)
* Add Nemotron GGUF Loading Support
* fix the Nemotron architecture assignment
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-11-21 11:37:34 +01:00
Vladislav Bronzov
5251fe6271
Add GGUF for Mamba (#34200)
* add mamba architecture for gguf
* add logic for weights conversion, some fixes and refactoring
* add lm_head layers, unit test refactoring
* more fixes for tests
* remove lm_head creation
* remove unused comments
2024-10-30 16:52:17 +01:00
김준재
dd267fca72
Add T5 GGUF loading support (#33389)
* add: GGUFT5Converter
* add: tensormapping for t5
* add: test code for t5
* fix: Remove whitespace from blank line
* add: t5 fp16 tests
* fix: whitespace formatting
* fix: minor formatting
* fix: test every weight
2024-10-24 15:10:59 +02:00
Vladislav Bronzov
cb5ca3265f
Add GGUF for starcoder2 (#34094)
* add starcoder2 arch support for gguf
* fix q6 test
2024-10-14 10:22:49 +02:00
Vladislav Bronzov
c9afee5392
Add gguf support for gpt2 (#34044)
* add gpt2 gguf support
* add doc change
* small refactoring
2024-10-10 13:42:18 +02:00
Vladislav Bronzov
faa0f63b93
Add gguf support for StableLM (#33793)
* add stablelm gguf architecture support
* add additional quantization tests
* resolve merge conflict, add weight conversion tests for fp16
2024-10-09 12:16:13 +02:00
g-prz
fe484726aa
Add falcon gguf (#33437)
* feat(gguf): add falcon q2 k
* fix(gguf): remove useless renaming
* feat(gguf): separate falcon 7b and 40b
* feat(gguf): apply fixup
* fix(test): error rebase
* feat(gguf): add fp16 weight comparison for falcon
* feat(gguf): test weight of all layers
* test(gguf): add falcon 40b under skip decorator
* feat(gguf): quick example for extracting model size
2024-10-02 14:10:39 +02:00
pogpog
b77846a6e6
Fix link in gguf.md (#33768)
Change hyphen to underscore in the URL linking to convert_hf_to_gguf.py
2024-09-30 20:17:33 +02:00
Vladislav Bronzov
9d200cfbee
Add gguf support for bloom (#33473)
* add bloom arch support for gguf
* apply format
* small refactoring, bug fix in GGUF_TENSOR_MAPPING naming
* optimize bloom GGUF_TENSOR_MAPPING
* implement reverse reshaping for bloom gguf
* add qkv weights test
* add q_8 test for bloom
2024-09-27 12:13:40 +02:00
Alazar
96429e74a8
Add support for GGUF Phi-3 (#31844)
* Update docs for GGUF supported models
* Add tensor mappings and define class GGUFPhi3Converter
* Fix tokenizer
* Working version
* Attempt to fix some CI failures
* Run ruff format
* Add vocab, merges, decoder methods like LlamaConverter
* Resolve conflicts since Qwen2Moe was added to gguf
- I missed one spot when resolving the conflict
- I also made a mistake in test_ggml.py; it has now been fixed to match its master version
2024-09-10 13:32:38 +02:00
Vladislav Bronzov
5d11de4a2f
Add Qwen2Moe GGUF loading support (#33264)
* update gguf doc, config and tensor mapping
* add qwen2moe architecture support, GGUFQwen2MoeConverter and q4 unit tests
* apply code style fixes
* reformat files
* assign GGUFQwen2Converter to qwen2_moe
2024-09-05 17:42:03 +02:00
Isotr0py
edeca4387c
🚨 Support dequantization for most GGML types (#32625)
* use gguf internal dequantize
* add Q5_0 test
* add iq1 test
* add remaining tests
* remove duplicated test
* update docs
* add gguf version limit
* make style
* update gguf import catch
* revert vocab_size patch
* make style
* use GGUF_MIN_VERSION everywhere
2024-09-03 12:58:14 +02:00
Isotr0py
e4628434d8
Add Qwen2 GGUF loading support (#31175)
* add qwen2 gguf support
* Update docs
* fix qwen2 tokenizer
* add qwen2 gguf test
* fix typo in qwen2 gguf test
* format code
* Remove mistral, clarify the error message
* format code
* add typing and update docstring
2024-06-03 14:55:10 +01:00
Lysandre Debut
a42844955f
Loading GGUF files support (#30391)
* Adds support for loading GGUF files
Co-authored-by: Younes Belkada <younesbelkada@gmail.com>
Co-authored-by: 99991 <99991@users.noreply.github.com>
* add q2_k q3_k q5_k support from @99991
* fix tests
* Update doc
* Style
* Docs
* fix CI
* Update docs/source/en/gguf.md
* Update docs/source/en/gguf.md
* Compute merges
* change logic
* add comment for clarity
* add comment for clarity
* Update src/transformers/models/auto/tokenization_auto.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* change logic
* Update src/transformers/modeling_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* change
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/modeling_gguf_pytorch_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* put back comment
* add comment about mistral
* comments and added tests
* fix inconsistent type
* more
* fix tokenizer
* Update src/transformers/modeling_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* address comments about tests and tokenizer + add added_tokens
* from_gguf -> gguf_file
* replace on docs too
---------
Co-authored-by: Younes Belkada <younesbelkada@gmail.com>
Co-authored-by: 99991 <99991@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-15 14:28:20 +02:00