Matt
3d133cc557
Stop DOSing the Hub in the CI ( #37209 )
...
* As the title suggests, stop hammering the same files
* make fixup
* Use shutil instead of pathlib
2025-04-02 17:19:33 +01:00
Joao Gante
e90d55ebcc
[Tests] add min_new_tokens to prevent flaky length checks ( #37175 )
2025-04-02 15:24:00 +01:00
Matt
cbfa14823b
No more dtype_byte_size() ( #37144 )
...
* No more dtype_byte_size()
* Remove function once again
* Fix rebase cruft
* Trigger tests
2025-04-02 14:58:38 +01:00
cyyever
7613cf1a45
Add py.typed ( #37022 )
2025-04-02 14:17:27 +01:00
cyyever
32c12aaec3
[3/N] Use pyupgrade --py39-plus to improve code ( #36936 )
...
Use pyupgrade --py39-plus to improve code
Signed-off-by: cyy <cyyever@outlook.com>
2025-04-02 14:16:06 +01:00
cyyever
764ab0d46a
Merge tensor operations with device transfer operations ( #37097 )
...
* Merge operations with to
Signed-off-by: cyy <cyyever@outlook.com>
* Use dtype
Signed-off-by: cyy <cyyever@outlook.com>
---------
Signed-off-by: cyy <cyyever@outlook.com>
2025-04-02 14:15:23 +01:00
湛露先生
c94c6ed397
Fix some code annotation typos. ( #37102 )
...
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2025-04-02 14:00:41 +01:00
Dan Saattrup Nielsen
e94d607c8b
fix: Add 'image-text-to-text' to TASK_MAPPING ( #37107 )
...
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-04-02 14:51:03 +02:00
Yih-Dar
adfc91cd46
Try to avoid/reduce some remaining CI job failures ( #37202 )
...
* try
* try
* Update tests/pipelines/test_pipelines_video_classification.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-04-02 14:39:57 +02:00
Xavier Dupré
6f5dc9c82e
Fixes DynamicCache export issues due to control flow and inplace modifications ( #36652 )
...
* Remove unnecessary masked_fill in deberta models
* Enable some code when exporting but not compiling
* add missing import
* style
* replace if by torch.cond
* style
* use numel
* style
* add unit tests
* style
* change empty value for dynamic cache
* replace != [] by numel()
* fix import issue
* style
2025-04-02 12:04:40 +01:00
Jerry Zhang
a165458901
Add device workaround for int4 weight only quantization after API update ( #36980 )
...
* merge
* fix import
* format
* reformat
* reformat
---------
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-04-02 12:42:22 +02:00
Yih-Dar
ed95493ce0
Skip code 307 in RequestCounter ( #36953 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-02 11:35:46 +02:00
Raushan Turganbay
211e4dc9a4
[chat-template] fix video loading ( #37146 )
...
* fix
* add video
* trigger
* push new images
* fix tests
* revert
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-02 11:27:50 +02:00
Bowen Bao
800510c67b
[doc] Fix link for Quark quantization page ( #37179 )
...
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-04-01 20:57:38 +02:00
Cyril Vallez
41f5c3216c
Revert #37031 ( #37178 )
...
Update modeling_utils.py
2025-04-01 19:48:15 +02:00
Cyril Vallez
bc2dea3f54
Fix meta state dict loading with quantizers ( #37136 )
...
Update modeling_utils.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-04-01 18:45:58 +02:00
Yih-Dar
35253076f4
Avoid pipeline test failing related to Hub call ( #37170 )
...
* cls
* cls
* cls
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-01 18:22:45 +02:00
Yufeng Xu
bf41e54fc8
Fixes the inconsistency of the optionality of attention_mask ( #37153 )
...
* debugging issue 36758
* debugging issue 36758
* debugging issue 36758
* updated attn_mask type specification in _flash_attention_forward
* removed pdb
* added a blank line
* removed indentation
2025-04-01 15:31:10 +01:00
Pavel Iakubovskii
3249c5dc15
Refactor attention for SigLIP based models ( #36981 )
...
* Update Siglip attention implementation
* Update tests for Siglip
* Remove one level of indentation
* Update test to be more specific
* Fixup
* Idefics2
* Idefics3
* Emu3
* SmolVLM
* Phi4 (just init small update)
* Idefics2 (test fix)
* Update siglip2 tests
* Update eager
* trigger
* Clean up
* Transfer inputs to device in test
* Fixing test
* Fixing test
* Revert contiguous
* Remove unused is_flash_attn_2_available
* Move flaky to specific models
2025-04-01 15:37:25 +02:00
Yao Matrix
24e311f42b
fix XPU UT error case brought by RNG difference btw XPU and CUDA ( #37121 )
...
* fix XPU UT error case brought by RNG difference btw XPU and CUDA
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
* enable tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits and tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits_bf16 on xpu
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
* Revert "enable tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits and tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits_bf16 on xpu"
This reverts commit 3ef83a4f02.
---------
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
2025-04-01 13:52:55 +01:00
Tom Aarsen
897ff9af0e
[ModernBERT] Never save 'reference_compile' config; should be set based on end user ( #36305 )
...
* Never save 'reference_compile' config; should be set based on end user
* Reformat (I ran 'make style' from the wrong env)
* Use pop instead of del
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
* Use pop instead of del
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
---------
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2025-04-01 14:14:39 +02:00
Tugsbayasgalan Manlaibaatar
c0bd8048a5
Make canine model exportable by removing unnecessarily complicated logic ( #37124 )
2025-04-01 12:31:12 +01:00
Ilyas Moutawwakil
60b75d99b6
Only count num items in batch when needed ( #36867 )
...
only count num items when needed
2025-04-01 12:30:39 +02:00
Qizhi Chen
fac70ff3c0
Convert _VALID_DICT_FIELDS to class attribute for shared dict parsing in subclasses ( #36736 )
...
* make _VALID_DICT_FIELDS as a class attribute
* fix test case about TrainingArguments
2025-04-01 12:29:12 +02:00
Guang Yang
ae34bd75fd
Use public export API on torch 2.5 and future ( #36781 )
...
Co-authored-by: Guang Yang <guangyang@fb.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-04-01 10:47:38 +01:00
Yao Matrix
8f6b27eb5c
enable test_assisted_decoding_in_different_gpu test on XPU ( #37120 )
...
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-04-01 11:22:59 +02:00
jiqing-feng
737cbd2109
Fix llava xpu tests. ( #37130 )
...
* fix llava 4bit xpu test
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix llava 4bit xpu test
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix format
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix format
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-04-01 11:10:13 +02:00
jiqing-feng
3a6ab46a0b
add gpt2 test on XPU ( #37028 )
...
* add gpt2 test on XPU
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* auto dtype has been fixed
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* convert model to train mode
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-04-01 11:09:29 +02:00
Yaswanth Gali
4b13a02920
Fix std initialization in Idefics variants ( #37100 )
...
* Nit 😅
* Another one
* fix
* run ci
* revert change
2025-04-01 09:18:54 +02:00
cyyever
786d9c5ed9
Fix more inefficient PT operations ( #37060 )
...
* Fix inefficient operations
* Remove cpu() call
* Reorder detach()
* Reorder detach()
* tolist without detach
* item without detach
* Update src/transformers/models/rag/modeling_rag.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update tests/models/encodec/test_modeling_encodec.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Use detach().cpu().numpy
* Revert some numpy operations
* More fixes
---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-03-31 16:31:24 +01:00
Pavel Iakubovskii
a1e389e637
Refactor return_dict logic to remove complicated if/else paths ( #36794 )
...
* SAM
* CLIP
* SigLIP
* GOT-OCR2 (depends on SAM)
* SigLIP2 (depends on SigLIP)
* trigger tests
* Fix SAM
* Fix missed indexing, use named attributes
* Llama
* Aria
* Bamba
* Update llama: missed outputs return type
* (fixup) Aria
* DiffLlama
* Emu3
* Gemma
* Gemma2
* Paligemma
* Fix paligemma
* Gemma3
* GLM
* Helium
* JetMoe
* Jamba
* Mistral
* Mistral
* Mixtral
* Nemotron
* Olmo
* Olmo2
* Persimmon
* Phi
* Phi3
* PhiMoe
* Qwen2
* Qwen2_moe
* StableLM
* Starcoder2
* Add return_dict decorator
* SAM
* Update decorator: compile, export, trace - friendly
* Llama (decorator)
* SAM (decorator)
* Add decorator `can_return_tuple`
* Llama
* Update to decorator
* Update CLIP
* Update decorator to store `_is_top_level_module` in self
* Update decorator to correctly handle compile/export
* Remove is_torchdynamo_compiling constraint, all work fine with self attribute assignment
* Typing
* GPT NeoX
* Fixup
* Fix attribute Granite
* Fix return type mixtral
* Update Gemma3
* Fix Cohere and Cohere2
* Fixup
* Fix corner case for Phi4, when activation is shared
* (fix-copies) deepseekv3, phi4
* Fixup
* Apply to qwen3/qwen3_moe
* Fix
2025-03-31 16:23:37 +01:00
Cyril Vallez
f304318f5f
Remove low_cpu_mem_usage and _fast_init ( #36963 )
...
* Remove low_cpu_mem_usage and _fast_init
* Update deepspeed.py
* Update modeling_utils.py
* remove the first 2 tests everywhere
* Update test_modeling_common.py
* remove what was remaining about fast_init
* fix logic and simplify
* mismatched keys logic update
* Update modeling_utils.py
* Update modeling_utils.py
* Update modeling_utils.py
* Update modeling_utils.py
* fix 2 models init_weights
* extend to others
* remove grad
* Update modeling_fsmt.py
* init weights in tests
* style
* Update test_modeling_fsmt.py
* more old models
* fix more init_weights
* copies
* fix
* style
* Update modeling_lxmert.py
* fix inits
* more and more
* more
* should finalize
* style
* Update modeling_dinov2_with_registers.py
* fix
* Update modeling_encoder_decoder.py
* fix
* style
* Update modeling_lxmert.py
* post rebase cleanup
* Update modeling_informer.py
* back to start for device
* fix
* add test to detect all failing cases correctly
* Update test_modeling_common.py
* fix
* fix
* sam
* style
* Update modeling_maskformer_swin.py
* CIs
* CIs
* remove test - will add it on separate PR
* fix
* fix
* Update modeling_sam.py
* CIs
* CIs
* CIs
* convnext
* suggestions
* CIs
* fix copies after merge
---------
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-03-31 17:18:43 +02:00
Raushan Turganbay
8805600406
[qwen3] fix generation tests ( #37142 )
...
* do not skip tests
* fix qwen3-moe as well
* fixup
* fixup
2025-03-31 16:33:41 +02:00
Zhen
e686fed635
[Feature] Support using FlashAttention2 on Ascend NPU ( #36696 )
...
* [Feature] Support using flash-attention on Ascend NPU
* Fix qwen3 and qwen3_moe modular conversion mismatch
2025-03-31 16:12:58 +02:00
Yih-Dar
a03cee7a1d
skip ( #37141 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-31 15:38:40 +02:00
Guang Yang
3b07ca78bb
Export T5 (encoder-decoder) to ExecuTorch ( #36486 )
...
Co-authored-by: Guang Yang <guangyang@fb.com>
2025-03-31 12:10:26 +02:00
Fanli Lin
475664e2c6
[tests] remove cuda-only test marker in AwqConfigTest ( #37032 )
...
* enable on xpu
* add xpu support
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-03-31 11:53:02 +02:00
Armaghan Shakir
0710e9b1e8
Create and Expose SamVisionModel as public for better accessibility ( #36493 )
...
* move encoder below
* auto modeling
* write SamVisionTester
* fix vision attention shape
* fix SamVisionTest
* minor changes to SamVisionTest
* Revert "fix vision attention shape"
This reverts commit d2a4083ae5.
* fix attention output shape in new tests
* remove encoder examples
* run modular on got_ocr2
* code formatting
* fix got_ocr2
* ruff fixes
* code quality
* add sam_vision in auto modeling and auto configuration
* remove composite test
* updated index.md
* add TFSamVisionEncoder to __init__
* fix public TFSamVisionEncoder
* remove outdated todo comment
* set test_torch_exportable
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* rename: VisionEncoder -> VisionModel
* bring back original SamVisionEncoder
* rename back: VisionEncoderOutput -> VisionModelOutput
* undo changes in SamModelTester
* reuse SamVisionEncoder in SamVisionModel
---------
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-03-31 11:45:07 +02:00
cyyever
f99c279d20
Remove deprecated code ( #37059 )
...
* Remove deprecated code
* fix get_loading_attributes
* fix error
* skip test
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-03-31 11:15:35 +02:00
Robin Kahlow
d1efaf0318
RWKV: fix mask warning typo ( #37114 )
...
rwkv: fix mask warning typo
2025-03-31 11:07:51 +02:00
Thien Tran
19919689b2
Fix Gemma3 embedding scaling ( #37109 )
...
fix gemma3 embedding
2025-03-31 11:04:02 +02:00
huismiling
d0b65bb479
[MLU] Fix FA2 check error, remove deepspeed-mlu deps. ( #36159 )
...
* add Cambricon MLUs support
* fix mlu device rng state
* up for quality check
* up mlu to support fp16
* fix mlu device dependency error
* fix mlu device dependency error
* enable mlu device for bf16
* fix mlu device memory tracker
* Cambricon support SDPA and flash_attn
* MLU devices: Checks if `mlu` is available via a `cndev-based` check which won't trigger the drivers and leave mlu
* Fix mlu FA2 check. Remove deepspeed-mlu check. add mlu tests support.
* fix testing errors.
* Merge branch 'hf/main' into main
* fix get_device_count error.
* fix mlu testing utils.
* fix code quality and style.
* switch to @require_torch_multi_accelerator
2025-03-31 11:02:49 +02:00
jiqing-feng
ad63d20dff
fix whisper re-compile ( #36712 )
...
* fix whisper re-compile
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix copy
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix comment
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix copies
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* revert useless changes
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-03-31 11:01:51 +02:00
jiqing-feng
286393fbb1
enable tp on CPU ( #36299 )
...
* enable tp on CPU
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* get rank from cpu
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* update
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* enable TP tests
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix comment
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* em print
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix model id
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix conflict
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix index and add doc
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-03-31 10:55:47 +02:00
Qubitium-ModelCloud
4705b04c74
Fix 4090/ada not detected as having FP8 support ( #37067 )
...
fix 4090/ada not detected as having FP8 support
Signed-off-by: Qubitium <qubitium@modelcloud.ai>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-03-31 10:53:48 +02:00
efsotr
2b4734bd49
Support passing flash_attn_kwargs when gradient_checkpointing is enabled ( #37037 )
...
* support passing flash_attn_kwargs when gradient_checkpointing is enabled
* make modeling_deepseek_v3.py consistent with modular_deepseek_v3.py
2025-03-31 10:53:02 +02:00
Yuan Wu
bd41b9c1ac
Gaudi: Fix the pipeline failed issue with hpu device ( #36990 )
...
* Gaudi: fix the issue of is_torch_hpu_available() returns false
Signed-off-by: yuanwu <yuan.wu@intel.com>
* Fix make fixup
Signed-off-by: yuanwu <yuan.wu@intel.com>
* Add comments for the implicit behavior of import
Signed-off-by: yuanwu <yuan.wu@intel.com>
* Update src/transformers/utils/import_utils.py
* Update src/transformers/utils/import_utils.py
---------
Signed-off-by: yuanwu <yuan.wu@intel.com>
Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
2025-03-31 10:23:47 +02:00
Bo Zheng
6acd5aecb3
Adding Qwen3 and Qwen3MoE ( #36878 )
...
* Initial commit for Qwen3
* fix and add tests for qwen3 & qwen3_moe
* rename models for tests.
* fix
* fix
* fix and add docs.
* fix model name in docs.
* simplify modular and fix configuration issues
* Fix the red CI: ruff was updated
* revert ruff, version was wrong
* fix qwen3moe.
* fix
* make sure MOE can load
* fix copies
---------
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
2025-03-31 09:50:49 +02:00
MinJu-Ha
0d6a60fe55
🌐 [i18n-KO] Translated qwen2_vl.md to Korean ( #36750 )
...
* fix: manual edits
* fix: resolve suggestions
* Update toctree.yml
2025-03-30 15:00:27 -07:00
Yih-Dar
b7fc2daf8b
Kenlm ( #37091 )
...
* kenlm
* kenlm
* kenlm
* kenlm
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-28 21:42:54 +01:00