eustlb
fb8e6c50e4
[audio utils] fix fft_bin_width computation ( #36603 )
...
* fix fft_bin_width computation
* update docstring + enforce correct params
* update test with correct value
* update test
* update feature extractors for concerned models
* update
* make
* update docstring
* update docstring
2025-03-27 15:20:02 +01:00
Raushan Turganbay
e97c760006
[chat templates] support loading audio from video ( #36955 )
...
* add audio from video
* typos
* delete print
* comments
2025-03-27 14:46:11 +01:00
Pavel Iakubovskii
c7bc79bd2a
Fixup for distill_any_depth conversion script ( #37043 )
...
* Fixup
* trigger
2025-03-27 13:29:25 +00:00
Sungyoon Jeong
d1eafe8d4e
Optimize to_py_obj for python-native numeric lists and scalars ( #36885 )
...
* Optimize to_py_obj for python-native numeric lists and scalars
* Fix bug that tuple is not converted to list
* Try np.array for more robust type checking
* Apply review and add tests for to_py_obj
2025-03-27 14:16:46 +01:00
jiqing-feng
0e56fb69a2
fix pegasus init weights and other copied models ( #36844 )
...
* fix pegasus init weights
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix the rest of models
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix test
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix informer init
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* init weight before checking
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix roformer tests
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix roformer tests
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-03-27 14:14:30 +01:00
Parteek
7e813f9cf0
Add Distill Any Depth ( #36614 )
...
* Added conversion Script
* Update src/transformers/models/depth_anything/convert_distill_any_depth_to_hf.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Updated Conversion Script
* Update src/transformers/models/depth_anything/convert_distill_any_depth_to_hf.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
---------
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-03-27 13:10:03 +00:00
Mohamed Mekkouri
92429057d9
Skip FP8 linear tests for device capability < 9.0 ( #37008 )
...
* skip fp8 linear
* add capability check
* format
2025-03-27 12:38:37 +01:00
hoshi-hiyouga
279c2e302a
remove redundant code in trainer ( #36994 )
...
* Update optimization.py
* Update optimization.py
2025-03-27 11:35:15 +01:00
Yih-Dar
d13c390d01
Mark 2 tests as flaky for now ( #37038 )
...
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-27 10:59:47 +01:00
Kyle Sayers
d6d930a64b
[Modeling] Load FP8 safetensors such as DeepSeek ( #36828 )
...
support loading fp8
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-03-27 10:47:10 +01:00
Michael Goin
927ce1d39f
Fix PixtralProcessor patch_size when spatial_merge_size is used ( #37019 )
2025-03-27 10:46:23 +01:00
Abu Bakr Soliman
49b5ab6a27
Support QuestionAnswering Module for ModernBert based models. ( #35566 )
...
* push ModernBertForQuestionAnswering
* update ModernBertForQuestionAnswering
* update __init__ loading
* set imports for ModernBertForQuestionAnswering
* update ModernBertForQuestionAnswering
* remove debugging logs
* update init_weights method
* remove custom initialization for ModernBertForQuestionAnswering
* apply make fix-copies
* apply make style
* apply make fix-copies
* append ModernBertForQuestionAnswering to the pipeline supported models
* remove unused file
* remove invalid autoload value
* update en/model_doc/modernbert.md
* apply make fixup command
* make fixup
* Update dummies
* update usage tips for ModernBertForQuestionAnswering
* update usage tips for ModernBertForQuestionAnswering
* add init
* add lint
* add consistency
* update init test
* change text to trigger stuck text
* use self.loss_function instead of custom loss
By @Cyrilvallez
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
* Update modeling_modernbert.py
make comparable commit to even it out
* Match whitespace
* whitespace
---------
Co-authored-by: Matt <rocketknight1@gmail.com>
Co-authored-by: Orion Weller <wellerorion@gmail.com>
Co-authored-by: Orion Weller <31665361+orionw@users.noreply.github.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-03-26 21:24:18 +01:00
Yao Matrix
5b08db8844
fix transformers_cli import relative path issue ( #36989 )
...
* fix transformers_cli relative import path issue
Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
* fix style
Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
---------
Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-03-26 18:45:56 +00:00
Steven Liu
3a8ec8c467
[docs] Attention mask image ( #36970 )
...
add image
2025-03-26 10:11:34 -07:00
cyyever
2b550c47b2
Remove deprecated training arguments ( #36946 )
...
* Remove deprecated training arguments
* More fixes
* More fixes
* More fixes
2025-03-26 16:44:48 +00:00
Afanti
44715225e3
fix typos in the code comments and error messages ( #36993 )
...
* chore: enhance code comments
* chore: enhance code comments
* chore: enhance code comments
* chore: enhance code comments
* chore: enhance code comments
* chore: enhance code comments
* chore: enhance code comments
2025-03-26 16:09:48 +00:00
Marc Sun
79d6f9fd70
Log the correct learning rate ( #36973 )
...
* fix learning rate log
* fix lr log
* add lr
2025-03-26 16:52:00 +01:00
Mohamed Mekkouri
13d36e89fe
Fix device_map check for ggml files ( #37003 )
...
fix
2025-03-26 16:24:57 +01:00
Josh Marshall
021006e1b0
Fix removing "cpu" from frozenset in bitsandbytes.py to allow better ROCm support. ( #36975 )
...
* Fix removing "cpu" from frozenset in bitsandbytes.py to allow better ROCm support.
Related to https://github.com/bitsandbytes-foundation/bitsandbytes/issues/1573 and https://github.com/huggingface/transformers/issues/36949 , this resolves a bug in allowing ROCm/HIP support in bitsandbytes.
* Related to bitsandbytes-foundation/bitsandbytes#1573 and huggingface#36949 , this resolves a bug in the bitsandbytes integration, allowing ROCm/HIP support in bitsandbytes.
---------
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-03-26 16:18:08 +01:00
Cyril Vallez
788e1092e9
Allow easy registration of custom attention functions ( #36889 )
...
* Update modeling_utils.py
* style
* Update modeling_utils.py
* Update modeling_utils.py
* Update modeling_utils.py
* Update modeling_utils.py
* Update modeling_utils.py
* Update modeling_utils.py
* add to init
* Update modeling_utils.py
* style
* update
* Update modeling_utils.py
* Update modeling_utils.py
* style
* Add some doc
* Update _toctree.yml
* readd it for tgi/vllm compat
* CIs
* CIs
2025-03-26 16:15:06 +01:00
ivarflakstad
ad5d40de9c
Fix get_device_properties ( #36997 )
...
Remove remnant self from get_device_properties
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-03-26 15:46:34 +01:00
cyyever
8084b26294
Fix Optional type annotation ( #36841 )
...
* Fix annotation
* Update src/transformers/generation/candidate_generator.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update src/transformers/generation/utils.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update src/transformers/generation/utils.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-03-26 13:53:44 +00:00
Yih-Dar
b56d8f07e4
Install networkx==3.2.1 manually in some CircleCI jobs after #36957 ( #37000 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-26 14:49:09 +01:00
cyyever
78afa1c537
Use torch.expm1 ( #36995 )
2025-03-26 13:06:33 +00:00
Yih-Dar
181d453069
byebye CircleCI TF jobs ( #36998 )
...
* byebye tf jobs
* byebye tf jobs
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-26 12:49:50 +01:00
cyyever
e7139d06f5
Fix tensor dtype mismatch ( #36985 )
...
* Fix tensor dtype mismatch
* update
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-26 10:37:46 +01:00
Yoni Gozlan
be37d34f44
🚨 Deprecate legacy argument for image-text-to-text models and adopt new behavior by default ( #36307 )
...
* deprecate legacy argument and adopt new behavior by default
* revert back modification git
2025-03-25 17:32:17 -04:00
Yih-Dar
ab4656f6b7
update bot comment again ( #36974 )
...
update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-25 19:42:09 +01:00
cyyever
ba531278ca
Add ruff target-version ( #36971 )
2025-03-25 19:41:25 +01:00
Steven Liu
a844297088
[docs] Fix image link ( #36869 )
...
* fix image link
* fix
* update
* fix
2025-03-25 11:34:21 -07:00
cyyever
d68a91aebf
Remove extra tensor clone in PyTorch code ( #36748 )
...
* Use detach().clone()
* Eliminate contiguous()
* Merge clone and other calls with to
* Merge clone and other calls with to
2025-03-25 17:42:15 +00:00
Yih-Dar
121830ab47
update examples after ruff being updated ( #36972 )
...
* update
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-25 18:15:47 +01:00
Sai-Suraj-27
a41677a68b
Updated docker files to use uv for installing packages ( #36957 )
...
* Updated docker files to use uv pip install as uv is blazingly fast.
* Removed -y flag for uv pip uninstall.
* Passed --no-build-isolation flag
---------
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-03-25 18:12:51 +01:00
NargiT
3dce98a437
typo fixed in README_fr.md ( #36951 )
2025-03-25 09:29:36 -07:00
湛露先生
ebd2029483
Change GPUS to GPUs ( #36945 )
...
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-03-25 17:25:39 +01:00
Yih-Dar
69632aadb7
Update after #36962 ( #36965 )
...
update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-25 16:16:06 +01:00
Yih-Dar
c6814b4ee8
Update ruff to 0.11.2 ( #36962 )
...
* update
* update
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-25 16:00:11 +01:00
Joao Gante
bc1c90a755
[Utils] torch version checks optionally accept dev versions ( #36847 )
2025-03-25 10:58:58 +00:00
Marc Sun
80b4c5dcc9
Fix cuda index issue in cache allocator ( #36937 )
...
fix
2025-03-25 11:51:41 +01:00
Raushan Turganbay
0f733110a6
Support return_tensors in audio chat templates ( #34601 )
...
* add audio chat templates
* update
* update
* nit
* green ci
* we don't care about the order anymore
* clean up after rebase
* overridden tests rename
* rename shieldgemma also
* one more rename
* require_read_token
* remove images/videos
* retrigger CI flaky
2025-03-25 11:08:47 +01:00
Afanti
19085c28da
fix typos in the tests directory ( #36932 )
...
* chore: fix typos in test codes
* chore: fix typos in test codes
* chore: fix typos in test codes
* chore: fix typos in test codes
* chore: fix typos in test codes
* chore: fix typos in test codes
* chore: fix typos in test codes
* chore: fix typos in test codes
* chore: format codes
2025-03-25 10:49:24 +01:00
Guang Yang
69bcb86c58
Export for Phi4-mini ( #36780 )
...
* Export for Phi4-mini
* Update tests/models/phi3/test_modeling_phi3.py
---------
Co-authored-by: Guang Yang <guangyang@fb.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-03-25 10:46:38 +01:00
Mohamed Mekkouri
be2c0e7bff
Fixing _pre_quantization_dtype when torch_dtype is None ( #36930 )
...
fix
2025-03-25 10:43:27 +01:00
Cyril Vallez
4303d88c09
Add Phi4 multimodal ( #36939 )
...
* raw start
* update
* update
* add to imports
* update
* up
* simplify configs
* clean configs
* style
* typos
* Update convert_phi4_multimodal_weights_to_hf.py
* Update convert_phi4_multimodal_weights_to_hf.py
* fix
* up
* up
* up
* Update convert_phi4_multimodal_weights_to_hf.py
* Update convert_phi4_multimodal_weights_to_hf.py
* up
* up
* up
* Update feature_extraction_phi4_multimodal.py
* up
* up
* up
* up
* up
* simplify configs
* typo
* cut code
* typo
* typo
* typo
* re
* typo
* up
* up
* up
* add tests
* fix
* fix
* Update test_modeling_phi4_multimodal.py
* up
* Update test_modeling_phi4_multimodal.py
* doc
* fix
* up
* up
* up
* up
* up
* up
* simplify
* up
* simplify
* config docstrings
* cleanup
* clean
* typo
* typo
* fix
* Update phi4_multimodal.md
* fix
* fix
* Update test_modeling_phi4_multimodal.py
* update
* simplify reshapes and permutes
* up
* simplify special tokens
* simplify processor a lot
* Update processing_phi4_multimodal.py
* Update processing_phi4_multimodal.py
* switch to fast processor
* image processor
* Update image_processing_phi4_multimodal_fast.py
* add lora extraction to converter
* Update convert_phi4_multimodal_weights_to_hf.py
* Update __init__.py
* add AudioInput type in audio_utils
* rewrite feature_extraction: support torch batched FFT
* input_audio_embeds -> audio_input_features, input_image_embeds -> image_pixel_values
* test update
* not mono channel warning update
* remove auto maps from processor
* kwargs dispatch in processor
* simplify kwargs dispatch
* simplify merging
* remove default sampling rate
* style
* Update test_modeling_phi4_multimodal.py
* update doc
* doc
* torch only feature extractor
* make fake tokens adjustable
* Update feature_extraction_phi4_multimodal.py
* fix
* Update processing_phi4_multimodal.py
* simplify mask
* last touch
* fix copies
* style
* Update audio_utils.py
* style
* Update feature_extraction_phi4_multimodal.py
* Update __init__.py
* docstrings
* copies
* fix all checks
* back to fix-copies
* trigger CIs
* Update feature_extraction_phi4_multimodal.py
* improve tests with multimodal inputs
* trigger CIs
---------
Co-authored-by: Eustache Le Bihan <eulebihan@gmail.com>
2025-03-25 09:55:21 +01:00
Raushan Turganbay
47e5432805
Deprecate #36741 and map Causal to Conditional ( #36917 )
...
* deprecate the prev fix
* reword warning and update docs
* reword warning
* tests
* don't bloat `get_text_config()`
2025-03-25 09:13:56 +01:00
Mohamed Mekkouri
2b8a15cc3f
Disallow Offload to disk for gguf files ( #36933 )
...
update
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-03-24 19:30:01 +01:00
Yoni Gozlan
91455c1825
Fix processor kwargs qwen2 vl ( #36890 )
...
* Fix qwen2_vl and qwen2_5_vl processors custom images kwargs
* change version warning
2025-03-24 13:19:26 -04:00
gautham
48385aa4f4
Added support for seed in DataCollatorForWholeWordMask ( #36903 )
...
* Added support for seed in `DataCollatorForWholeWordMask`, and also wrote tests.
Also fixed bugs where the code hardcoded values for mask replacement probability and random replacement probability, instead of using the values passed by the user.
* formatting issues
* Used better way to generate seed in TF. Made tests more consistent.
2025-03-24 16:57:17 +00:00
Yih-Dar
5932606d8e
More precise comment ( #36935 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-24 17:03:09 +01:00
Pavel Iakubovskii
2be2984462
Fix pytorch deform attn path ( #36923 )
...
* Fix pytorch path for DeformableAttention
* Apply for GroundingDino
2025-03-24 15:58:51 +00:00