Commit Graph

37 Commits

Author SHA1 Message Date
Yao Matrix
89542fb81c
enable more test cases on xpu (#38572)
* enable glm4 integration cases on XPU, set xpu expectation for blip2

Signed-off-by: Matrix YAO <matrix.yao@intel.com>

* more

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* refine wording

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* refine test case names

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* run

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* add gemma2 and chameleon

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix review comments

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

---------

Signed-off-by: Matrix YAO <matrix.yao@intel.com>
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
2025-06-06 09:29:51 +02:00
Mohamed Mekkouri
38c406844e
Fixing quantization tests (#37650)
* fix

* style

* add capability check
2025-04-22 13:59:57 +02:00
Isotr0py
c69e23455d
Support loading Gemma3 QAT GGUF models (#37649)
* fix gemma3 qat gguf support

Signed-off-by: isotr0py <2037008807@qq.com>

* update test

Signed-off-by: isotr0py <2037008807@qq.com>

* make ruff happy

Signed-off-by: isotr0py <2037008807@qq.com>

---------

Signed-off-by: isotr0py <2037008807@qq.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-04-22 11:23:17 +02:00
Isotr0py
6daec12d0b
Add GGUF support to Gemma3 Text backbone (#37424)
* add gemma3 gguf support

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix typo and add gguf limit

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix a typo

Signed-off-by: Isotr0py <2037008807@qq.com>

* add vision conversion test

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix typos

Signed-off-by: Isotr0py <2037008807@qq.com>

---------

Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-04-10 17:15:43 +02:00
cyyever
1e6b546ea6
Use Python 3.9 syntax in tests (#37343)
Signed-off-by: cyy <cyyever@outlook.com>
2025-04-08 14:12:08 +02:00
David LaPalomento
b45cf0e90a
Guard against unset resolved_archive_file (#35628)
* archive_file may not be specified
When loading a pre-trained model from a gguf file, resolved_archive_file may not be set. Guard against that case in the safetensors availability check.

* Remap partial disk offload to cpu for GGUF files
GGUF files don't support disk offload so attempt to remap them to the CPU when device_map is auto. If device_map is anything else but None, raise a NotImplementedError.

* Don't remap auto device_map and raise RuntimeError
If device_map=auto and modules are selected for disk offload, don't attempt to map them to any other device. Raise a runtime error when a GGUF model is configured to map any modules to disk.

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-14 14:44:31 +01:00
湛露先生
1590c66430
Fix words typos in ggml test. (#36060)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2025-02-06 15:32:40 +00:00
Isotr0py
e57b459997
Split and clean up GGUF quantization tests (#35502)
* clean up ggml test

Signed-off-by: Isotr0py <2037008807@qq.com>

* port remaining tests

Signed-off-by: Isotr0py <2037008807@qq.com>

* further cleanup

Signed-off-by: Isotr0py <2037008807@qq.com>

* format

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix broken tests

Signed-off-by: Isotr0py <2037008807@qq.com>

* update comment

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix

Signed-off-by: Isotr0py <2037008807@qq.com>

* reorganize tests

Signed-off-by: Isotr0py <2037008807@qq.com>

* k-quants use qwen2.5-0.5B

Signed-off-by: Isotr0py <2037008807@qq.com>

* move ggml tokenization test

Signed-off-by: Isotr0py <2037008807@qq.com>

* remove dead code

Signed-off-by: Isotr0py <2037008807@qq.com>

* add assert for serilization test

Signed-off-by: Isotr0py <2037008807@qq.com>

* use str for parameterize

Signed-off-by: Isotr0py <2037008807@qq.com>

---------

Signed-off-by: Isotr0py <2037008807@qq.com>
2025-01-27 15:46:57 +01:00
Arthur
b912f5ee43
use torch.testing.assertclose instead to get more details about error in cis (#35659)
* use torch.testing.assertclose instead to get more details about error in cis

* fix

* style

* test_all

* revert for I bert

* fixes and updates

* more image processing fixes

* more image processors

* fix mamba and co

* style

* less strick

* ok I won't be strict

* skip and be done

* up
2025-01-24 16:55:28 +01:00
Mohamed Mekkouri
a7738f5a89
Fix : Nemotron tokenizer for GGUF format (#35836)
fix nemotron gguf
2025-01-22 12:28:40 +01:00
Mohamed Mekkouri
dbd8474125
Fix : BLOOM tie_word_embeddings in GGUF (#35812)
* fix bloom ggml

* fix falcon output

* make style
2025-01-21 15:35:54 +01:00
Mohamed Mekkouri
b80e334e71
Skip Falcon 7B GGML Test (#35783)
skip test
2025-01-20 15:00:34 +01:00
Mohamed Mekkouri
a11041ffad
Fix : add require_read_token for gemma2 gated model (#35687)
fix gemma2 gated model test
2025-01-14 11:47:05 +01:00
Mohamed Mekkouri
df2a812e95
Fix expected output for ggml test (#35686)
fix expected output
2025-01-14 11:46:55 +01:00
Yijun Lee
e5fd865eba
Add Gemma2 GGUF support (#34002)
* initial setup for ggml.py

* initial setup of GGUFGemma2Converter class

* Add gemma2 model to gguf.md doc

* Partial work on GGUF_TENSOR_MAPPING

* initial setup of GGUF_TENSOR_MAPPING for Gemma2

* refactor: rename GemmaConvert class to GemmaConverter for naming consistency

* feat: complete gemma2 tensor mapping implementation

* feat: add initial implementation of GGUFGemmaConverter

* feat: complete GGUFGemmaConverter implementation

* feat: add test code for gemma2

* refactor: minor code cleanup

* refactor: minor code cleanup

* fix: resolve suggestions

* Update tests/quantization/ggml/test_ggml.py

Co-authored-by: Isotr0py <2037008807@qq.com>

---------

Co-authored-by: Isotr0py <2037008807@qq.com>
2025-01-03 14:50:07 +01:00
Mohamed Mekkouri
85eb339231
Fix : model used to test ggml conversion of Falcon-7b is incorrect (#35083)
fixing test model
2024-12-16 13:21:44 +01:00
Mohamed Mekkouri
890ea7de93
Fix failling GGML test (#34871)
fix_test
2024-11-25 18:04:52 +01:00
farrosalferro
c57eafdaa1
Add Nemotron GGUF Loading Support (#34725)
* Add Nemotron GGUF Loading Support

* fix the Nemotron architecture assignation

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-11-21 11:37:34 +01:00
Isotr0py
e83aaaa86b
Fix use_parallel_residual and qkv_bias for StableLM GGUF config extraction (#34450)
* fix stablelm qkv_bias

* fix stablelm qkv_bias and use_parallel_residual

* remove original_model.config for stablelm gguf test
2024-11-05 18:26:20 +01:00
Vladislav Bronzov
5251fe6271
Add GGUF for Mamba (#34200)
* add mamba architecture for gguf

* add logic for weights conversion, some fixes and refactoring

* add lm_head layers, unit test refactoring

* more fixes for tests

* remove lm_head creation

* remove unused comments
2024-10-30 16:52:17 +01:00
김준재
dd267fca72
Add T5 GGUF loading support (#33389)
* add: GGUFT5Converter

* add: tensormapping for t5

* add: test code for t5

* fix: Remove whitespace from blank line

* add: t5 fp16 tests

* fix: whitespace formatting

* fix: minor formatting

* fix: testing every weights
2024-10-24 15:10:59 +02:00
Vladislav Bronzov
cb5ca3265f
Add GGUF for starcoder2 (#34094)
* add starcoder2 arch support for gguf

* fix q6 test
2024-10-14 10:22:49 +02:00
Vladislav Bronzov
c9afee5392
Add gguf support for gpt2 (#34044)
* add gpt2 gguf support

* add doc change

* small refactoring
2024-10-10 13:42:18 +02:00
Vladislav Bronzov
faa0f63b93
Add gguf support for StableLM (#33793)
* add stablelm gguf architecture support

* add additional quantization tests

* resolve merge conflict, add weight conversion tests for fp16
2024-10-09 12:16:13 +02:00
Vladislav Bronzov
22e102ad98
Bug fix gguf qwen2moe (#33940)
* fix qwen2moe tensors mapping, add unit tests

* add expert tensor split logic, test refactoring

* small params refactoring

* add comment to tensor reshaping
2024-10-05 16:19:01 +02:00
g-prz
fe484726aa
Add falcon gguf (#33437)
* feat(gguf): add falcon q2 k

* fix(gguf): remove useless renaming

* feat(gguf): seperate falcon 7b and 40b

* feat(gguf): apply fixup

* fix(test): error rebase

* feat(gguf): add fp16 weight comparison for falcon

* feat(gguf): test weight of all layers

* test(gguf): add falcon 40b under skip decorator

* feat(gguf): quick example for extracting model size
2024-10-02 14:10:39 +02:00
Vladislav Bronzov
9d200cfbee
Add gguf support for bloom (#33473)
* add bloom arch support for gguf

* apply format

* small refactoring, bug fix in GGUF_TENSOR_MAPPING naming

* optimize bloom GGUF_TENSOR_MAPPING

* implement reverse reshaping for bloom gguf

* add qkv weights test

* add q_8 test for bloom
2024-09-27 12:13:40 +02:00
Alazar
96429e74a8
Add support for GGUF Phi-3 (#31844)
* Update docs for GGUF supported models

* Add tensor mappings and define class GGUFPhi3Converter

* Fix tokenizer

* Working version

* Attempt to fix some CI failures

* Run ruff format

* Add vocab, merges, decoder methods like LlamaConverter

* Resolve conflicts since Qwen2Moe was added to gguf

- I missed one place when resolving conflict
- I also made a mistake with tests_ggml.py and now has been fixed to reflect
its master version.
2024-09-10 13:32:38 +02:00
Vladislav Bronzov
5d11de4a2f
Add Qwen2Moe GGUF loading support (#33264)
* update gguf doc, config and tensor mapping

* add qwen2moe architecture support, GGUFQwen2MoeConverter and q4 unit tests

* apply code style fixes

* reformat files

* assign GGUFQwen2Converter to qwen2_moe
2024-09-05 17:42:03 +02:00
Isotr0py
edeca4387c
🚨 Support dequantization for most GGML types (#32625)
* use gguf internal dequantize

* add Q5_0 test

* add iq1 test

* add remained test

* remove duplicated test

* update docs

* add gguf version limit

* make style

* update gguf import catch

* revert vocab_size patch

* make style

* use GGUF_MIN_VERSION everywhere
2024-09-03 12:58:14 +02:00
Penut Chen
1c122a46dc
Support dequantizing GGUF FP16 format (#31783)
* support gguf fp16

* support gguf bf16 with pytorch

* add gguf f16 test

* remove bf16
2024-07-24 17:59:59 +02:00
Ita Zaporozhets
a1844a3209
gguf conversion add_prefix_space=None for llama3 (#31937)
* gguf conversion forces add_prefix_space=False for llama3, this is not required and forces from_slow, which fails. changing to None + test

* typo

* clean test
2024-07-23 11:45:54 +02:00
Penut Chen
ac946aac25
Fix the incorrect permutation of gguf (#31788)
* Fix the incorrect permutation of gguf

* rename num_kv_heads

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* add typing to num_kv_heads

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* rename variables

* refactor permute function name

* update the expected text of the llama3 q4 test

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-07-16 08:20:34 +02:00
Younes Belkada
6d4306160a
GGUF: Fix llama 3 GGUF (#31358)
* Create push-important-models.yml

* llama3 support for GGUF

* fixup

* Update src/transformers/integrations/ggml.py

* fix pre-tokenizer

* fix

* fix

* fix

* fix

* fix

* fix

* address final comment

* handle special tokens + add tests
2024-06-20 14:29:58 +02:00
Albert Villanova del Moral
a14b055b65
Pass datasets trust_remote_code (#31406)
* Pass datasets trust_remote_code

* Pass trust_remote_code in more tests

* Add trust_remote_dataset_code arg to some tests

* Revert "Temporarily pin datasets upper version to fix CI"

This reverts commit b7672826ca.

* Pass trust_remote_code in librispeech_asr_dummy docstrings

* Revert "Pin datasets<2.20.0 for examples"

This reverts commit 833fc17a3e.

* Pass trust_remote_code to all examples

* Revert "Add trust_remote_dataset_code arg to some tests" to research_projects

* Pass trust_remote_code to tests

* Pass trust_remote_code to docstrings

* Fix flax examples tests requirements

* Pass trust_remote_dataset_code arg to tests

* Replace trust_remote_dataset_code with trust_remote_code in one example

* Fix duplicate trust_remote_code

* Replace args.trust_remote_dataset_code with args.trust_remote_code

* Replace trust_remote_dataset_code with trust_remote_code in parser

* Replace trust_remote_dataset_code with trust_remote_code in dataclasses

* Replace trust_remote_dataset_code with trust_remote_code arg
2024-06-17 17:29:13 +01:00
Isotr0py
e4628434d8
Add Qwen2 GGUF loading support (#31175)
* add qwen2 gguf support

* Update docs

* fix qwen2 tokenizer

* add qwen2 gguf test

* fix typo in qwen2 gguf test

* format code

* Remove mistral, clarify the error message

* format code

* add typing and update docstring
2024-06-03 14:55:10 +01:00
Lysandre Debut
a42844955f
Loading GGUF files support (#30391)
* Adds support for loading GGUF files

Co-authored-by: Younes Belkada <younesbelkada@gmail.com>
Co-authored-by: 99991 <99991@users.noreply.github.com>

* add q2_k q3_k q5_k support from @99991

* fix tests

* Update doc

* Style

* Docs

* fix CI

* Update docs/source/en/gguf.md

* Update docs/source/en/gguf.md

* Compute merges

* change logic

* add comment for clarity

* add comment for clarity

* Update src/transformers/models/auto/tokenization_auto.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* change logic

* Update src/transformers/modeling_utils.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* change

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/modeling_gguf_pytorch_utils.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* put back comment

* add comment about mistral

* comments and added tests

* fix unconsistent type

* more

* fix tokenizer

* Update src/transformers/modeling_utils.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* address comments about tests and tokenizer + add added_tokens

* from_gguf -> gguf_file

* replace on docs too

---------

Co-authored-by: Younes Belkada <younesbelkada@gmail.com>
Co-authored-by: 99991 <99991@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-15 14:28:20 +02:00