Commit Graph

18471 Commits

Author SHA1 Message Date
Joao Gante
e90d55ebcc
[Tests] add min_new_tokens to prevent flaky length checks (#37175) 2025-04-02 15:24:00 +01:00
Matt
cbfa14823b
No more dtype_byte_size() (#37144)
* No more dtype_byte_size()

* Remove function once again

* Fix rebase cruft

* Trigger tests
2025-04-02 14:58:38 +01:00
cyyever
7613cf1a45
Add py.typed (#37022) 2025-04-02 14:17:27 +01:00
cyyever
32c12aaec3
[3/N] Use pyupgrade --py39-plus to improve code (#36936)
Use pyupgrade --py39-plus to improve code

Signed-off-by: cyy <cyyever@outlook.com>
2025-04-02 14:16:06 +01:00
cyyever
764ab0d46a
Merge tensor operations with device transfer operations (#37097)
* Merge operations with to

Signed-off-by: cyy <cyyever@outlook.com>

* Use dtype

Signed-off-by: cyy <cyyever@outlook.com>

---------

Signed-off-by: cyy <cyyever@outlook.com>
2025-04-02 14:15:23 +01:00
湛露先生
c94c6ed397
Fix some code annotation typos. (#37102)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2025-04-02 14:00:41 +01:00
Dan Saattrup Nielsen
e94d607c8b
fix: Add 'image-text-to-text' to TASK_MAPPING (#37107)
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-04-02 14:51:03 +02:00
Yih-Dar
adfc91cd46
Try to avoid/reduce some remaining CI job failures (#37202)
* try

* try

* Update tests/pipelines/test_pipelines_video_classification.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-04-02 14:39:57 +02:00
Xavier Dupré
6f5dc9c82e
Fixes DynamicCache export issues due to control flow and inplace modifications (#36652)
* Remove unnecessary masked_fill in deberta models

* Enable some code when exporting but not compiling

* add missing import

* style

* replace if by torch.cond

* style

* use numel

* style

* add unit tests

* style

* change empty value for dynamic cache

* replace != [] by numel()

* fix import issue

* style
2025-04-02 12:04:40 +01:00
Jerry Zhang
a165458901
Add device workaround for int4 weight only quantization after API update (#36980)
* merge

* fix import

* format

* reformat

* reformat

---------

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-04-02 12:42:22 +02:00
Yih-Dar
ed95493ce0
Skip code 307 in RequestCounter (#36953)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-02 11:35:46 +02:00
Raushan Turganbay
211e4dc9a4
[chat-template] fix video loading (#37146)
* fix

* add video

* trigger

* push new images

* fix tests

* revert

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-02 11:27:50 +02:00
Bowen Bao
800510c67b
[doc] Fix link for Quark quantization page (#37179)
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-04-01 20:57:38 +02:00
Cyril Vallez
41f5c3216c
Revert #37031 (#37178)
Update modeling_utils.py
2025-04-01 19:48:15 +02:00
Cyril Vallez
bc2dea3f54
Fix meta state dict loading with quantizers (#37136)
Update modeling_utils.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-04-01 18:45:58 +02:00
Yih-Dar
35253076f4
Avoid pipeline test failing related to Hub call (#37170)
* cls

* cls

* cls

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-01 18:22:45 +02:00
Yufeng Xu
bf41e54fc8
Fixes the inconsistency of the optionality of attention_mask (#37153)
* debugging issue 36758

* debugging issue 36758

* debugging issue 36758

* updated attn_mask type specification in _flash_attention_forward

* removed pdb

* added a blank line

* removed indentation
2025-04-01 15:31:10 +01:00
Pavel Iakubovskii
3249c5dc15
Refactor attention for SigLIP based models (#36981)
* Update Siglip attention implementation

* Update tests for Siglip

* Remove one level of indentation

* Update test to be more specific

* Fixup

* Idefics2

* Idefics3

* Emu3

* SmolVLM

* Phi4 (just init small update)

* Idefics2 (test fix)

* Update siglip2 tests

* Update eager

* trigger

* Clean up

* Transfer inputs to device in test

* Fixing test

* Fixing test

* Revert contiguous

* Remove unused is_flash_attn_2_available

* Move flaky to specific models
2025-04-01 15:37:25 +02:00
Yao Matrix
24e311f42b
fix XPU UT error case brought by RNG difference between XPU and CUDA (#37121)
* fix XPU UT error case brought by RNG difference between XPU and CUDA

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* enable tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits and tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits_bf16 on xpu

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* Revert "enable tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits and tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits_bf16 on xpu"

This reverts commit 3ef83a4f02.

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
2025-04-01 13:52:55 +01:00
Tom Aarsen
897ff9af0e
[ModernBERT] Never save 'reference_compile' config; should be set based on end user (#36305)
* Never save 'reference_compile' config; should be set based on end user

* Reformat (I ran 'make style' from the wrong env)

* Use pop instead of del

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Use pop instead of del

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2025-04-01 14:14:39 +02:00
Tugsbayasgalan Manlaibaatar
c0bd8048a5
Make canine model exportable by removing unnecessary complicated logic (#37124) 2025-04-01 12:31:12 +01:00
Ilyas Moutawwakil
60b75d99b6
Only count num items in batch when needed (#36867)
only count num items when needed
2025-04-01 12:30:39 +02:00
Qizhi Chen
fac70ff3c0
Convert _VALID_DICT_FIELDS to class attribute for shared dict parsing in subclasses (#36736)
* make _VALID_DICT_FIELDS as a class attribute

* fix test case about TrainingArguments
2025-04-01 12:29:12 +02:00
Guang Yang
ae34bd75fd
Use public export API on torch 2.5 and future (#36781)
Co-authored-by: Guang Yang <guangyang@fb.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-04-01 10:47:38 +01:00
Yao Matrix
8f6b27eb5c
enable test_assisted_decoding_in_different_gpu test on XPU (#37120)
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-04-01 11:22:59 +02:00
jiqing-feng
737cbd2109
Fix llava xpu tests. (#37130)
* fix llava 4bit xpu test

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix llava 4bit xpu test

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-04-01 11:10:13 +02:00
jiqing-feng
3a6ab46a0b
add gpt2 test on XPU (#37028)
* add gpt2 test on XPU

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* auto dtype has been fixed

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* convert model to train mode

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-04-01 11:09:29 +02:00
Yaswanth Gali
4b13a02920
Fix std initialization in Idefics variants (#37100)
* Nit 😅

* Another one

* fix

* run ci

* revert change
2025-04-01 09:18:54 +02:00
cyyever
786d9c5ed9
Fix more inefficient PT operations (#37060)
* Fix inefficient operations

* Remove cpu() call

* Reorder detach()

* Reorder detach()

* tolist without detach

* item without detach

* Update src/transformers/models/rag/modeling_rag.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update tests/models/encodec/test_modeling_encodec.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Use detach().cpu().numpy

* Revert some numpy operations

* More fixes

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-03-31 16:31:24 +01:00
Pavel Iakubovskii
a1e389e637
Refactor return_dict logic to remove complicated if/else paths (#36794)
* SAM

* CLIP

* SigLIP

* GOT-OCR2 (depends on SAM)

* SigLIP2 (depends on SigLIP)

* trigger tests

* Fix SAM

* Fix missed indexing, use named attributes

* Llama

* Aria

* Bamba

* Update llama: missed outputs return type

* (fixup) Aria

* DiffLlama

* Emu3

* Gemma

* Gemma2

* Paligemma

* Fix paligemma

* Gemma3

* GLM

* Helium

* JetMoe

* Jamba

* Mistral

* Mistral

* Mixtral

* Nemotron

* Olmo

* Olmo2

* Persimmon

* Phi

* Phi3

* PhiMoe

* Qwen2

* Qwen2_moe

* StableLM

* Starcoder2

* Add return_dict decorator

* SAM

* Update decorator: compile, export, trace - friendly

* Llama (decorator)

* SAM (decorator)

* Add decorator `can_return_tuple`

* Llama

* Update to decorator

* Update CLIP

* Update decorator to store `_is_top_level_module` in self

* Update decorator to correctly handle compile/export

* Remove is_torchdynamo_compiling constraint, all work fine with self attribute assignment

* Typing

* GPT NeoX

* Fixup

* Fix attribute Granite

* Fix return type mixtral

* Update Gemma3

* Fix Cohere and Cohere2

* Fixup

* Fix corner case for Phi4, when activation is shared

* (fix-copies) deepseekv3, phi4

* Fixup

* Apply to qwen3/qwen3_moe

* Fix
2025-03-31 16:23:37 +01:00
Cyril Vallez
f304318f5f
Remove low_cpu_mem_usage and _fast_init (#36963)
* Remove low_cpu_mem_usage and _fast_init

* Update deepspeed.py

* Update modeling_utils.py

* remove the first 2 tests everywhere

* Update test_modeling_common.py

* remove what was remaining about fast_init

* fix logic and simplify

* mismatched keys logic update

* Update modeling_utils.py

* Update modeling_utils.py

* Update modeling_utils.py

* Update modeling_utils.py

* fix 2 models init_weights

* extend to others

* remove grad

* Update modeling_fsmt.py

* init weights in tests

* style

* Update test_modeling_fsmt.py

* more old models

* fix more init_weights

* copies

* fix

* style

* Update modeling_lxmert.py

* fix inits

* more and more

* more

* should finalize

* style

* Update modeling_dinov2_with_registers.py

* fix

* Update modeling_encoder_decoder.py

* fix

* style

* Update modeling_lxmert.py

* post rebase cleanup

* Update modeling_informer.py

* back to start for device

* fix

* add test to detect all failing cases correctly

* Update test_modeling_common.py

* fix

* fix

* sam

* style

* Update modeling_maskformer_swin.py

* CIs

* CIs

* remove test - will add it on separate PR

* fix

* fix

* Update modeling_sam.py

* CIs

* CIs

* CIs

* convnext

* suggestions

* CIs

* fix copies after merge

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-03-31 17:18:43 +02:00
Raushan Turganbay
8805600406
[qwen3] fix generation tests (#37142)
* do not skip tests

* fix qwen3-moe as well

* fixup

* fixup
2025-03-31 16:33:41 +02:00
Zhen
e686fed635
[Feature] Support using FlashAttention2 on Ascend NPU (#36696)
* [Feature] Support using flash-attention on Ascend NPU

* Fix qwen3 and qwen3_moe modular conversion mismatch
2025-03-31 16:12:58 +02:00
Yih-Dar
a03cee7a1d
skip (#37141)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-31 15:38:40 +02:00
Guang Yang
3b07ca78bb
Export T5 (encoder-decoder) to ExecuTorch (#36486)
Co-authored-by: Guang Yang <guangyang@fb.com>
2025-03-31 12:10:26 +02:00
Fanli Lin
475664e2c6
[tests] remove cuda-only test marker in AwqConfigTest (#37032)
* enable on xpu

* add xpu support

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-03-31 11:53:02 +02:00
Armaghan Shakir
0710e9b1e8
Create and Expose SamVisionModel as public for better accessibility (#36493)
* move encoder below

* auto modeling

* write SamVisionTester

* fix vision attention shape

* fix SamVisionTest

* minor changes to SamVisionTest

* Revert "fix vision attention shape"

This reverts commit d2a4083ae5.

* fix attention output shape in new tests

* remove encoder examples

* run modular on got_ocr2

* code formatting

* fix got_ocr2

* ruff fixes

* code quality

* add sam_vision in auto modeling and auto configuration

* remove composite test

* updated index.md

* add TFSamVisionEncoder to __init__

* fix public TFSamVisionEncoder

* remove outdated todo comment

* set test_torch_exportable

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* rename: VisionEncoder -> VisionModel

* bring back original SamVisionEncoder

* rename back: VisionEncoderOutput -> VisionModelOutput

* undo changes in SamModelTester

* reuse SamVisionEncoder in SamVisionModel

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-03-31 11:45:07 +02:00
cyyever
f99c279d20
Remove deprecated code (#37059)
* Remove deprecated code

* fix get_loading_attributes

* fix error

* skip test

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-03-31 11:15:35 +02:00
Robin Kahlow
d1efaf0318
RWKV: fix mask warning typo (#37114)
rwkv: fix mask warning typo
2025-03-31 11:07:51 +02:00
Thien Tran
19919689b2
Fix Gemma3 embedding scaling (#37109)
fix gemma3 embedding
2025-03-31 11:04:02 +02:00
huismiling
d0b65bb479
[MLU] Fix FA2 check error, remove deepspeed-mlu deps. (#36159)
* add Cambricon MLUs support

* fix mlu device rng state

* up for quality check

* up mlu to support fp16

* fix mlu device dependency error

* fix mlu device dependency error

* enable mlu device for bf16

* fix mlu device memory tracker

* Cambricon support SDPA and flash_attn

* MLU devices: checks if `mlu` is available via a `cndev-based` check which won't trigger the drivers and leave mlu

* Fix mlu FA2 check. Remove deepspeed-mlu check. add mlu tests support.

* fix testing errors.

* Merge branch 'hf/main' into main

* fix get_device_count error.

* fix mlu testing utils.

* fix code quality and style.

* switch to @require_torch_multi_accelerator
2025-03-31 11:02:49 +02:00
jiqing-feng
ad63d20dff
fix whisper re-compile (#36712)
* fix whisper re-compile

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix copy

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix comment

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix copies

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* revert useless changes

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-03-31 11:01:51 +02:00
jiqing-feng
286393fbb1
enable tp on CPU (#36299)
* enable tp on CPU

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* get rank from cpu

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* enable TP tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix comment

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* em print

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix model id

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix conflict

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix index and add doc

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-03-31 10:55:47 +02:00
Qubitium-ModelCloud
4705b04c74
Fix 4090/ada not detected as having FP8 support (#37067)
fix 4090/ada not detected as having FP8 support

Signed-off-by: Qubitium <qubitium@modelcloud.ai>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-03-31 10:53:48 +02:00
efsotr
2b4734bd49
Support passing flash_attn_kwargs when gradient_checkpointing is enabled (#37037)
* support passing flash_attn_kwargs when gradient_checkpointing is enabled

* make modeling_deepspeek_v3.py consistent with modular_deepseek_v3.py
2025-03-31 10:53:02 +02:00
Yuan Wu
bd41b9c1ac
Gaudi: Fix the pipeline failed issue with hpu device (#36990)
* Gaudi: fix the issue of is_torch_hpu_available() returns false

Signed-off-by: yuanwu <yuan.wu@intel.com>

* Fix make fixup

Signed-off-by: yuanwu <yuan.wu@intel.com>

* Add comments for the implicit behavior of import

Signed-off-by: yuanwu <yuan.wu@intel.com>

* Update src/transformers/utils/import_utils.py

* Update src/transformers/utils/import_utils.py

---------

Signed-off-by: yuanwu <yuan.wu@intel.com>
Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
2025-03-31 10:23:47 +02:00
Bo Zheng
6acd5aecb3
Adding Qwen3 and Qwen3MoE (#36878)
* Initial commit for Qwen3

* fix and add tests for qwen3 & qwen3_moe

* rename models for tests.

* fix

* fix

* fix and add docs.

* fix model name in docs.

* simplify modular and fix configuration issues

* Fix the red CI: ruff was updated

* revert ruff, version was wrong

* fix qwen3moe.

* fix

* make sure MOE can load

* fix copies

---------

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
2025-03-31 09:50:49 +02:00
MinJu-Ha
0d6a60fe55
🌐 [i18n-KO] Translated qwen2_vl.md to Korean (#36750)
* fix: manual edits

* fix: resolve suggestions

* Update toctree.yml
2025-03-30 15:00:27 -07:00
Yih-Dar
b7fc2daf8b
Kenlm (#37091)
* kenlm

* kenlm

* kenlm

* kenlm

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-28 21:42:54 +01:00
Joao Gante
bab605dd04
[Cache] rename dtype attribute 🚨 🚨 (#37044)
* yoink

* same pattern in all cache
2025-03-28 19:08:02 +01:00