Commit Graph

19383 Commits

Author SHA1 Message Date
youngrok cha
a5cc7a67d7
[bug] fix llava processor to calculate unpadding size correctly (#37988)
* fix llava processor to calculate unpad size correctly

* repo consistency

* Revert "repo consistency" & "setUp in llava family"

This reverts commit 26a50af8db.

* add edge case test for padding & unpadding

* compute unpadding size from original size

* make test config explicit

* Revert "compute unpadding size from original size"

This reverts commit 752cd27ad9.

* Revert "add edge case test for padding & unpadding"

This reverts commit ccbd094d69.

* revert unpad logic

* remove irrelevant tests

* model test

* remove processor from model test

---------

Co-authored-by: jaycha <jaycha@ncsoft.com>
2025-05-13 13:49:09 +00:00
Chris
67b3d45eb6
Fix past_key_values type hint in model output types (#37953)
* F: Fix type hint.

* F: Use Cache type.

* F: Sort import.

* U: Format.

* U: Address reviews.
2025-05-13 13:36:49 +00:00
Eva Koroleva
07feaad8fb
Fix bug in prefill_chunk_size that ignores disable_compile flag (#38067)
Fix bug in prefill_chunk_size implementation that ignores disable_compile flag
2025-05-13 13:23:23 +00:00
Raushan Turganbay
e40f301f1f
[smolvlm] skip the test (#38099)
skip the test
2025-05-13 12:50:43 +00:00
ivarflakstad
e27d230ddd
Disable report callbacks for certain training tests (#38088)
* Disable report callbacks for certain training tests

* Disable report callbacks for test_auto_batch_size_finder
2025-05-13 14:49:55 +02:00
Bongseok Lee
ab65ba47ad
fix: Propagate lr_scheduler_kwargs options to create LR Scheduler when LayerWiseDummyOptimizer is used (#34559)
fix: fix get_scheduler
2025-05-13 13:56:45 +02:00
Fanli Lin
8fb60bf6be
add timeout for downloading the librispeech_asr dataset (#38073)
* add timeout

* change 10 to 60
2025-05-13 11:50:12 +01:00
Yih-Dar
3ad35d0bca
update require_read_token (#38093)
* update require_read_token

* new repo

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-13 12:07:07 +02:00
Yoni Gozlan
e3b70b0d1c
Refactor image processor phi4 (#36976)
* refactor image processor phi4

* nits fast image proc

* add image tests phi4

* Fix image processing tests

* update integration tests

* remove revision and add comment in integration tests
2025-05-12 15:13:40 -04:00
Yih-Dar
4143f94d51
uninstall kernels from docker images (#38083)
uninstall kernels

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-12 18:03:47 +02:00
Shiyu
a63cb7578e
update seed_worker to set seed based on worker_id and rank (#37980)
* update seed_worker to set seed based on worker_id and rank

* test case

* set output_dir as remove tmp dir
2025-05-12 15:59:16 +00:00
efsotr
e387821a96
Fix tot update in trainer (#37923)
* fix total updates in epoch

* add test; fix max_steps

* replace with multi-gpu decorator
2025-05-12 17:45:24 +02:00
Weipeng Jiang
f0e975c6cf
fix the inconsist docstring in apply_chat_template (#38069)
The commit (5cf11e5ab9) fixed the type hints for the parameter `tools` in apply_chat_template, but the docstring was not changed.
2025-05-12 16:32:01 +01:00
Junlin Zhou
31791b16a1
chore(qwen2): display warning log only when sliding window attention … (#36316)
* chore(qwen2): display warning log only when sliding window attention is enabled

* Align modeling_qwen2.py and modular_qwen2.py

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2025-05-12 16:31:44 +01:00
ivarflakstad
8ea72d12a2
Fix mt5 test on AMD devices (#38081) 2025-05-12 16:59:00 +02:00
谭九鼎
5c85018072
docs: fix md style (#38057) 2025-05-12 15:56:31 +01:00
ivarflakstad
7eaa90b87b
Add AMD expectation to test_gpt2_sample (#38079) 2025-05-12 16:51:21 +02:00
Pavel Iakubovskii
4220039b29
Fix OneFormer integration test (#38016)
* Fix integration tests

* format
2025-05-12 16:02:41 +02:00
Joao Gante
8efe3a9d77
[chat] generate parameterization powered by GenerationConfig and UX-related changes (#38047)
* accept arbitrary kwargs

* move user commands to a separate fn

* work with generation config files

* rm cmmt

* docs

* base generate flag doc section

* nits

* nits

* nits

* no <br>

* better basic args description
2025-05-12 14:04:41 +01:00
Raushan Turganbay
a5c6172c81
[VLM] fix loading issues (#38051)
* fix qwen2-vl loading

* fix a few nore models

* delete print

* fix copies
2025-05-12 10:14:04 +00:00
Raushan Turganbay
a31fa218ad
🔴 Video processors as a separate class (#35206)
* initial design

* update all video processors

* add tests

* need to add qwen2-vl (not tested yet)

* add qwen2-vl in auto map

* fix copies

* isort

* resolve confilicts kinda

* nit:

* qwen2-vl is happy now

* qwen2-5 happy

* other models are happy

* fix copies

* fix tests

* add docs

* CI green now?

* add more tests

* even more changes + tests

* doc builder fail

* nit

* Update src/transformers/models/auto/processing_auto.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* small update

* imports correctly

* dump, otherwise this is getting unmanagebale T-T

* dump

* update

* another update

* update

* tests

* move

* modular

* docs

* test

* another update

* init

* remove flakiness in tests

* fixup

* clean up and remove commented lines

* docs

* skip this one!

* last fix after rebasing

* run fixup

* delete slow files

* remove unnecessary tests + clean up a bit

* small fixes

* fix tests

* more updates

* docs

* fix tests

* update

* style

* fix qwen2-5-vl

* fixup

* fixup

* unflatten batch when preparing

* dump, come back soon

* add docs and fix some tests

* how to guard this with new dummies?

* chat templates in qwen

* address some comments

* remove `Fast` suffix

* fixup

* oops should be imported from transforms

* typo in requires dummies

* new model added with video support

* fixup once more

* last fixup I hope

* revert image processor name + comments

* oh, this is why fetch test is failing

* fix tests

* fix more tests

* fixup

* add new models: internvl, smolvlm

* update docs

* imprt once

* fix failing tests

* do we need to guard it here again, why?

* new model was added, update it

* remove testcase from tester

* fix tests

* make style

* not related CI fail, lets' just fix here

* mark flaky for now, filas 15 out of 100

* style

* maybe we can do this way?

* don't download images in setup class

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-05-12 11:55:51 +02:00
Arjuna Sky Kok
716819b830
fix(conversion): Fix size mismatch error during TF->PT model loading (#38014) 2025-05-10 11:11:07 +00:00
Yao Matrix
8f08318769
enable generation fsdp/utils cases on XPU (#38009)
* enable generation fsdp/utils test cases on XPU

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* xx

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* use backend_xx APIs

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

---------

Signed-off-by: Yao Matrix <matrix.yao@intel.com>
2025-05-09 20:52:41 +00:00
Pavel Iakubovskii
87e971e14d
Fix linalg.norm for CovnNextV2 (#38015)
Fix norm
2025-05-09 17:44:28 +01:00
Cyril Vallez
aaed2f5577
Fix cache update! (#38046)
* fix slicing

* better fix
2025-05-09 17:54:48 +02:00
Mikhail Moskovchenko
7f1a97bae3
Fix reduce-labels in BEIT Fast Image Processor (#38042)
* Fixed reduce-labels

* Little doc fix

* Change docstring
2025-05-09 11:51:46 -04:00
Yih-Dar
9f9020fed3
Re-Enable Trigger CircleCI via GitHub Actions when "ready for review" (#37885) (#38041)
* check actions

* trigger CI

* check actions

* finally

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-09 16:57:54 +02:00
Lysandre Debut
23d79cea75
Support for version spec in requires & arbitrary mismatching depths across folders (#37854)
* Support for version spec in requires & arbitrary mismatching depths

* Quality

* Testing
2025-05-09 15:26:27 +02:00
François REMY
774dc274ac
Do not erase a cache_position passed explicitly to generate(), if there is one (#37986)
Do not erase a cache_position initialization passed explicitly to generate(), if there is one.

But: Let initialization replace cache_position if it's set to None. I assume that if the value is explicitly passed but None, we should initialize anyway.
2025-05-09 10:56:21 +00:00
Yih-Dar
0010b41524
Disable Trigger CircleCI via GitHub Actions when ready for review` (#38038)
disable

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-09 12:27:53 +02:00
Yih-Dar
d498528800
Trigger CircleCI via GitHub Actions when ready for review (#37885)
* update

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-09 11:45:03 +02:00
Yih-Dar
66e696ee15
[Temporary] Log some information in some pytest/pluggy internal places (#37996)
log pytest info

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-09 11:06:37 +02:00
Yao Matrix
a72cb31434
enable utils test cases on XPU (#38005)
* enable utils test cases on XPU

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* Update tests/utils/test_skip_decorators.py

Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>

* fix comment

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

---------

Signed-off-by: Yao Matrix <matrix.yao@intel.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
2025-05-09 08:45:01 +02:00
Yao Matrix
1dfad4beb2
make mistral3 pass on xpu (#37882)
* enabled mistral3 test cases on XPU

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* calibrate A100 expectation

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* update

* update

* update

* update

* update

* update

---------

Signed-off-by: Yao Matrix <matrix.yao@intel.com>
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-09 06:41:11 +00:00
Wing Lian
121f7037c7
fix document masking for chunked attention (#37429)
* fix document masking for chunked attention

* remove accidental debugging sum
2025-05-09 08:22:00 +02:00
Arthur
5f5ccfdc54
[AutoDocstring] Based on inspect parsing of the signature (#33771)
* delete common docstring

* nit

* updates

* push

* fixup

* move stuff around fixup

* no need for dataclas

* damn nice modular

* add auto class docstring

* style

* modular update

* import autodocstring

* fixup

* maybe add original doc!

* more cleanup

* remove class do cas well

* update

* nits

* more celanup

* fix

* wups

* small check

* updatez

* some fixes

* fix doc

* update

* nits

* try?

* nit

* some updates

* a little bit better

* where ever we did not have help we are not really adding it!

* revert llama config

* small fixes and small tests

* test

* fixup

* more fix-copies

* updates

* updates

* fix doc building

* style

* small fixes

* nits

* fix-copies

* fix merge issues faster

* fix merge conf

* nits jamba

* ?

* working autodoc for model class and forward except returns and example

* support return section and unpack kwargs description

* nits and cleanup

* fix-copies

* fix-copies

* nits

* Add support for llava-like models

* fixup

* add class args subset support

* add examples inferred from automodel/pipelines

* update ruff

* autodocstring for Aria, Albert + fixups

* Fix empty return blocks

* fix copies

* fix copies

* add autodoc for all fast image processors + align, altclip

* fix copies

* add auto_doc for audio_spectrogram, auto_former, bark, bamba

* Drastically improve speed + add bart beit bert

* add autodoc to all bert-like models

* Fix broken doc

* fix copies

* fix auto_docstring after merge

* add autodoc to models

* add models

* add models

* add models and improve support for optional, and custom shape in args docstring

* update fast image processors

* refactor auto_method_docstring in args_doc

* add models and fix docstring parsing

* add models

* add models

* remove debugging

* add models

* add fix_auto_docstrings and improve args_docs

* add support for additional_info in args docstring

* refactor (almost) all models

* fix check docstring

* fix -copies

* fill in all missing docstrings

* fix copies

* fix qwen3 moe docstring

* add documentation

* add back labels

* update docs and fix can_return_tuple in modular files

* fix LongformerForMaskedLM docstring

* add auto_docstring to _toctree

* remove auto_docstring tests temporarily

* fix copyrights new files

* fix can_return_tuple granite hybrid

* fix fast beit

* Fix empty config doc

* add support for COMMON_CUSTOM_ARGS in check_docstrings and add missing models

* fix code block not closed flava

* fix can_return_tuple sam hq

* Fix Flaubert dataclass

---------

Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-05-08 17:46:07 -04:00
jiqing-feng
d231f5a7d4
update bnb tests (#38011)
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-05-08 20:35:24 +00:00
Yao Matrix
b3db4ddb22
enable mamba2 integration cases on xpu (#38006)
* enable mamba2 integration cases on XPU

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

---------

Signed-off-by: Yao Matrix <matrix.yao@intel.com>
2025-05-08 19:48:09 +00:00
Fanli Lin
c7c2f08994
make test_speculative_decoding_non_distil device-agnostic (#38010)
* make device-agnostic

* use condition

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-05-08 19:19:47 +00:00
Raushan Turganbay
d23aae2b8c
[VLMs] support attention backends (#37576)
* update models

* why rename

* return attn weights when sdpa

* fixes

* fix attn implementation composite

* fix moshi

* add message

* add typings

* use explicitly all flags for each attn type

* fix some tests

* import what is needed

* kosmos on main has ew attention already, yay

* new models in main, run fixup

* won't fix kosmos yet

* fix-copies

* clean up after rebasing

* fix tests

* style

* dont cast attns to fp32

* did we update ruff? oke, let's just do what it asks

* fix pixtral after rebase
2025-05-08 18:18:54 +02:00
Tomek
e296c63cd4
Fix wording in torchscript.md (#38004)
Fix wording in torchscript.md
2025-05-08 16:47:45 +01:00
Yufeng Xu
1c65aef923
Fix incorrect installation instructions (for issue #37476) (#37640)
* debugging issue 36758

* debugging issue 36758

* debugging issue 36758

* updated attn_mask type specification in _flash_attention_forward

* removed pdb

* added a blank line

* removed indentation

* update constants

* remove unnecessary files

* created installation script, modified README

* modified requirements and install.sh

* undo irrelevant changes

* removed blank line

* fixing installation guide

* modified README, python requirements, and install script

* removed tests_otuput

* modified README

* discarded installation script and python<3.13 requirement
2025-05-08 16:32:58 +01:00
Yih-Dar
f2909e024c
Skip test_push_to_hub_with_saves_each_epoch for now (#38022)
* update

* trigger CI

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-08 16:26:24 +02:00
Joao Gante
f2b59c6173
[caches] Raise exception on offloaded static caches + multi device (#37974)
* skip tests on >1 gpu

* add todo
2025-05-08 14:37:36 +01:00
Joao Gante
4279057d70
[CI] remove duplicated message on GH comment to run slow tests (#37970)
duplicated msg
2025-05-08 14:35:54 +01:00
Yih-Dar
3390534f36
Print commit SHA on slack message for new model notification. (#38019)
add commit info

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-08 15:26:19 +02:00
Pavel Iakubovskii
9f8fffed3c
Fix Optional typing (#38018)
* Fix

* trigger
2025-05-08 14:51:45 +02:00
Yuanyuan Chen
06c16de3d3
Enable RUF013 to enforce optional typing (#37266)
* Enable RUF013 for Optional typing

Signed-off-by: cyy <cyyever@outlook.com>

* Add Optional to types

* Format code

Signed-off-by: cyy <cyyever@outlook.com>

---------

Signed-off-by: cyy <cyyever@outlook.com>
2025-05-08 12:39:56 +02:00
Aurélien Lac
f6664ee713
Add ALL_ATTENTION_FUNCTIONS compatibility for Pixtral model (#37960)
* Add ALL_ATTENTION_FUNCTIONS compatibility for Pixtral model

* Fix invalid operand type

* Allow image_sizes to be optional in forward pass to fit tests

Disallow using sdpa and output_attentions

* Disallow using sdpa with output_attentions

* Delete useless comments, use eager attention from smolvlm, use pattern from mistral

* add _supports_attention_backend

* use kwargs instead of position_ids

---------

Co-authored-by: aurelien.lac <aurelien.lac@lighton.ai>
2025-05-08 12:13:13 +02:00
Sebastiaan Vermeulen
015b6dfbf8
Fix pad image transform for batched inputs (#37544)
* fix

* add batch dimension to expected output
2025-05-08 10:51:15 +01:00