Marc Sun
cb384dcd7a
Fix gguf docs (#36601)
* update
* doc
* update
* Update docs/source/en/gguf.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* fix
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-03-11 15:29:14 +01:00
Steven Liu
c0f8d055ce
[docs] Redesign (#31757)
* toctree
* not-doctested.txt
* collapse sections
* feedback
* update
* rewrite get started sections
* fixes
* fix
* loading models
* fix
* customize models
* share
* fix link
* contribute part 1
* contribute pt 2
* fix toctree
* tokenization pt 1
* Add new model (#32615)
* v1 - working version
* fix
* fix
* fix
* fix
* rename to correct name
* fix title
* fixup
* rename files
* fix
* add copied from on tests
* rename to `FalconMamba` everywhere and fix bugs
* fix quantization + accelerate
* fix copies
* add `torch.compile` support
* fix tests
* fix tests and add slow tests
* copies on config
* merge the latest changes
* fix tests
* add few lines about instruct
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix
* fix tests
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* "to be not" -> "not to be" (#32636)
* "to be not" -> "not to be"
* Update sam.md
* Update trainer.py
* Update modeling_utils.py
* Update test_modeling_utils.py
* Update test_modeling_utils.py
* fix hfoption tag
* tokenization pt. 2
* image processor
* fix toctree
* backbones
* feature extractor
* fix file name
* processor
* update not-doctested
* update
* make style
* fix toctree
* revision
* make fixup
* fix toctree
* fix
* make style
* fix hfoption tag
* pipeline
* pipeline gradio
* pipeline web server
* add pipeline
* fix toctree
* not-doctested
* prompting
* llm optims
* fix toctree
* fixes
* cache
* text generation
* fix
* chat pipeline
* chat stuff
* xla
* torch.compile
* cpu inference
* toctree
* gpu inference
* agents and tools
* gguf/tiktoken
* finetune
* toctree
* trainer
* trainer pt 2
* optims
* optimizers
* accelerate
* parallelism
* fsdp
* update
* distributed cpu
* hardware training
* gpu training
* gpu training 2
* peft
* distrib debug
* deepspeed 1
* deepspeed 2
* chat toctree
* quant pt 1
* quant pt 2
* fix toctree
* fix
* fix
* quant pt 3
* quant pt 4
* serialization
* torchscript
* scripts
* tpu
* review
* model addition timeline
* modular
* more reviews
* reviews
* fix toctree
* reviews reviews
* continue reviews
* more reviews
* modular transformers
* more review
* zamba2
* fix
* all frameworks
* pytorch
* supported model frameworks
* flashattention
* rm check_table
* not-doctested.txt
* rm check_support_list.py
* feedback
* updates/feedback
* review
* feedback
* fix
* update
* feedback
* updates
* update
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
2025-03-03 10:33:46 -08:00
Yijun Lee
e5fd865eba
Add Gemma2 GGUF support (#34002)
* initial setup for ggml.py
* initial setup of GGUFGemma2Converter class
* Add gemma2 model to gguf.md doc
* Partial work on GGUF_TENSOR_MAPPING
* initial setup of GGUF_TENSOR_MAPPING for Gemma2
* refactor: rename GemmaConvert class to GemmaConverter for naming consistency
* feat: complete gemma2 tensor mapping implementation
* feat: add initial implementation of GGUFGemmaConverter
* feat: complete GGUFGemmaConverter implementation
* feat: add test code for gemma2
* refactor: minor code cleanup
* refactor: minor code cleanup
* fix: resolve suggestions
* Update tests/quantization/ggml/test_ggml.py
Co-authored-by: Isotr0py <2037008807@qq.com>
---------
Co-authored-by: Isotr0py <2037008807@qq.com>
2025-01-03 14:50:07 +01:00
farrosalferro
c57eafdaa1
Add Nemotron GGUF Loading Support (#34725)
* Add Nemotron GGUF Loading Support
* fix the Nemotron architecture assignation
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-11-21 11:37:34 +01:00
Vladislav Bronzov
5251fe6271
Add GGUF for Mamba (#34200)
* add mamba architecture for gguf
* add logic for weights conversion, some fixes and refactoring
* add lm_head layers, unit test refactoring
* more fixes for tests
* remove lm_head creation
* remove unused comments
2024-10-30 16:52:17 +01:00
김준재
dd267fca72
Add T5 GGUF loading support (#33389)
* add: GGUFT5Converter
* add: tensormapping for t5
* add: test code for t5
* fix: Remove whitespace from blank line
* add: t5 fp16 tests
* fix: whitespace formatting
* fix: minor formatting
* fix: testing every weights
2024-10-24 15:10:59 +02:00
Vladislav Bronzov
cb5ca3265f
Add GGUF for starcoder2 (#34094)
* add starcoder2 arch support for gguf
* fix q6 test
2024-10-14 10:22:49 +02:00
Vladislav Bronzov
c9afee5392
Add gguf support for gpt2 (#34044)
* add gpt2 gguf support
* add doc change
* small refactoring
2024-10-10 13:42:18 +02:00
Vladislav Bronzov
faa0f63b93
Add gguf support for StableLM (#33793)
* add stablelm gguf architecture support
* add additional quantization tests
* resolve merge conflict, add weight conversion tests for fp16
2024-10-09 12:16:13 +02:00
g-prz
fe484726aa
Add falcon gguf (#33437)
* feat(gguf): add falcon q2 k
* fix(gguf): remove useless renaming
* feat(gguf): separate falcon 7b and 40b
* feat(gguf): apply fixup
* fix(test): error rebase
* feat(gguf): add fp16 weight comparison for falcon
* feat(gguf): test weight of all layers
* test(gguf): add falcon 40b under skip decorator
* feat(gguf): quick example for extracting model size
2024-10-02 14:10:39 +02:00
pogpog
b77846a6e6
Fix link in gguf.md (#33768)
Change hyphen to underscore for URL in link to convert_hf_to_gguf.py
2024-09-30 20:17:33 +02:00
Vladislav Bronzov
9d200cfbee
Add gguf support for bloom (#33473)
* add bloom arch support for gguf
* apply format
* small refactoring, bug fix in GGUF_TENSOR_MAPPING naming
* optimize bloom GGUF_TENSOR_MAPPING
* implement reverse reshaping for bloom gguf
* add qkv weights test
* add q_8 test for bloom
2024-09-27 12:13:40 +02:00
Alazar
96429e74a8
Add support for GGUF Phi-3 (#31844)
* Update docs for GGUF supported models
* Add tensor mappings and define class GGUFPhi3Converter
* Fix tokenizer
* Working version
* Attempt to fix some CI failures
* Run ruff format
* Add vocab, merges, decoder methods like LlamaConverter
* Resolve conflicts since Qwen2Moe was added to gguf
- I missed one place when resolving the conflict
- I also made a mistake with tests_ggml.py, which has now been fixed to reflect
its master version.
2024-09-10 13:32:38 +02:00
Vladislav Bronzov
5d11de4a2f
Add Qwen2Moe GGUF loading support (#33264)
* update gguf doc, config and tensor mapping
* add qwen2moe architecture support, GGUFQwen2MoeConverter and q4 unit tests
* apply code style fixes
* reformat files
* assign GGUFQwen2Converter to qwen2_moe
2024-09-05 17:42:03 +02:00
Isotr0py
edeca4387c
🚨 Support dequantization for most GGML types (#32625)
* use gguf internal dequantize
* add Q5_0 test
* add iq1 test
* add remaining tests
* remove duplicated test
* update docs
* add gguf version limit
* make style
* update gguf import catch
* revert vocab_size patch
* make style
* use GGUF_MIN_VERSION everywhere
2024-09-03 12:58:14 +02:00
Isotr0py
e4628434d8
Add Qwen2 GGUF loading support (#31175)
* add qwen2 gguf support
* Update docs
* fix qwen2 tokenizer
* add qwen2 gguf test
* fix typo in qwen2 gguf test
* format code
* Remove mistral, clarify the error message
* format code
* add typing and update docstring
2024-06-03 14:55:10 +01:00
Lysandre Debut
a42844955f
Loading GGUF files support (#30391)
* Adds support for loading GGUF files
Co-authored-by: Younes Belkada <younesbelkada@gmail.com>
Co-authored-by: 99991 <99991@users.noreply.github.com>
* add q2_k q3_k q5_k support from @99991
* fix tests
* Update doc
* Style
* Docs
* fix CI
* Update docs/source/en/gguf.md
* Update docs/source/en/gguf.md
* Compute merges
* change logic
* add comment for clarity
* add comment for clarity
* Update src/transformers/models/auto/tokenization_auto.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* change logic
* Update src/transformers/modeling_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* change
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/modeling_gguf_pytorch_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* put back comment
* add comment about mistral
* comments and added tests
* fix inconsistent type
* more
* fix tokenizer
* Update src/transformers/modeling_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* address comments about tests and tokenizer + add added_tokens
* from_gguf -> gguf_file
* replace on docs too
---------
Co-authored-by: Younes Belkada <younesbelkada@gmail.com>
Co-authored-by: 99991 <99991@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-15 14:28:20 +02:00