Fanli Lin
8fb60bf6be
add timeout for downloading the librispeech_asr dataset ( #38073 )
...
* add timeout
* change 10 to 60
2025-05-13 11:50:12 +01:00
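A minimal sketch of how such a download timeout can be raised in a test, assuming the relevant knob is huggingface_hub's `HF_HUB_DOWNLOAD_TIMEOUT` environment variable (default 10 seconds) and the usual `hf-internal-testing/librispeech_asr_dummy` test dataset; the PR itself may wire the timeout in differently:

```python
import os

# Assumption: the flaky step is the Hub download behind load_dataset. huggingface_hub reads
# HF_HUB_DOWNLOAD_TIMEOUT (default 10s) when fetching files, so raising it to 60s mirrors the
# "change 10 to 60" note above. Set it before the Hub client is imported.
os.environ["HF_HUB_DOWNLOAD_TIMEOUT"] = "60"

from datasets import load_dataset

ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
print(len(ds))
```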
Yih-Dar
3ad35d0bca
update require_read_token
( #38093 )
...
* update require_read_token
* new repo
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-13 12:07:07 +02:00
Yoni Gozlan
e3b70b0d1c
Refactor image processor phi4 ( #36976 )
...
* refactor image processor phi4
* nits fast image proc
* add image tests phi4
* Fix image processing tests
* update integration tests
* remove revision and add comment in integration tests
2025-05-12 15:13:40 -04:00
Yih-Dar
4143f94d51
uninstall kernels from docker images ( #38083 )
...
uninstall kernels
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-12 18:03:47 +02:00
Shiyu
a63cb7578e
update seed_worker to set seed based on worker_id and rank ( #37980 )
...
* update seed_worker to set seed based on worker_id and rank
* test case
* set output_dir as remove tmp dir
2025-05-12 15:59:16 +00:00
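A rough sketch of the idea behind this change, assuming a helper of this shape (the exact signature and formula in the PR may differ): each dataloader worker derives its seed from both its `worker_id` and the process `rank`, so workers with the same id on different ranks do not replay identical augmentation streams.

```python
from functools import partial

import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import set_seed


def seed_worker(worker_id: int, num_workers: int, rank: int) -> None:
    # Hypothetical sketch: mix the base seed with both rank and worker_id so every
    # (rank, worker) pair gets a distinct, reproducible seed.
    init_seed = torch.initial_seed() % 2**32
    worker_seed = (init_seed + num_workers * rank + worker_id) % 2**32
    set_seed(worker_seed)


rank = 0  # would come from the distributed environment, e.g. torch.distributed.get_rank()
dataset = TensorDataset(torch.arange(8))
loader = DataLoader(
    dataset,
    num_workers=2,
    # DataLoader only passes worker_id to worker_init_fn, so the other arguments are bound here.
    worker_init_fn=partial(seed_worker, num_workers=2, rank=rank),
)
```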
efsotr
e387821a96
Fix total updates in trainer ( #37923 )
...
* fix total updates in epoch
* add test; fix max_steps
* replace with multi-gpu decorator
2025-05-12 17:45:24 +02:00
Weipeng Jiang
f0e975c6cf
fix the inconsistent docstring in apply_chat_template ( #38069 )
...
The commit (5cf11e5ab9) fixed the type hints for the parameter `tools` in apply_chat_template, but the docstring was not changed.
2025-05-12 16:32:01 +01:00
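For context on the `tools` parameter whose annotation and docstring are being brought back in line, this is the documented usage pattern; the checkpoint below is just one example of a model with a tool-aware chat template:

```python
from transformers import AutoTokenizer


def get_current_temperature(location: str) -> float:
    """Get the current temperature at a location.

    Args:
        location: The location to get the temperature for.
    """
    return 22.0


tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")
messages = [{"role": "user", "content": "What's the temperature in Paris?"}]

# `tools` accepts a list of callables (converted to JSON schema from their signatures and
# docstrings) or pre-built JSON-schema dicts; the docstring fix above aligns the description
# with that type hint.
prompt = tokenizer.apply_chat_template(
    messages, tools=[get_current_temperature], add_generation_prompt=True, tokenize=False
)
print(prompt)
```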
Junlin Zhou
31791b16a1
chore(qwen2): display warning log only when sliding window attention is enabled ( #36316 )
...
* chore(qwen2): display warning log only when sliding window attention is enabled
* Align modeling_qwen2.py and modular_qwen2.py
---------
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2025-05-12 16:31:44 +01:00
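A hedged sketch of the gating this commit describes; the names and exact condition are illustrative, not the actual Qwen2 modeling code:

```python
import logging

logger = logging.getLogger(__name__)


def maybe_warn_sliding_window(config, attn_implementation: str) -> None:
    # Only warn when sliding-window attention is actually turned on in the config,
    # rather than on every Qwen2 instantiation.
    sliding_window_enabled = (
        getattr(config, "use_sliding_window", False)
        and getattr(config, "sliding_window", None) is not None
    )
    if sliding_window_enabled and attn_implementation != "flash_attention_2":
        logger.warning(
            "Sliding window attention is enabled but `%s` does not implement it; "
            "results may differ from flash_attention_2.",
            attn_implementation,
        )
```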
ivarflakstad
8ea72d12a2
Fix mt5 test on AMD devices ( #38081 )
2025-05-12 16:59:00 +02:00
谭九鼎
5c85018072
docs: fix md style ( #38057 )
2025-05-12 15:56:31 +01:00
ivarflakstad
7eaa90b87b
Add AMD expectation to test_gpt2_sample ( #38079 )
2025-05-12 16:51:21 +02:00
Pavel Iakubovskii
4220039b29
Fix OneFormer integration test ( #38016 )
...
* Fix integration tests
* format
2025-05-12 16:02:41 +02:00
Joao Gante
8efe3a9d77
[chat] generate parameterization powered by GenerationConfig and UX-related changes ( #38047 )
...
* accept arbitrary kwargs
* move user commands to a separate fn
* work with generation config files
* rm cmmt
* docs
* base generate flag doc section
* nits
* nits
* nits
* no <br>
* better basic args description
2025-05-12 14:04:41 +01:00
Raushan Turganbay
a5c6172c81
[VLM] fix loading issues ( #38051 )
...
* fix qwen2-vl loading
* fix a few more models
* delete print
* fix copies
2025-05-12 10:14:04 +00:00
Raushan Turganbay
a31fa218ad
🔴 Video processors as a separate class ( #35206 )
...
* initial design
* update all video processors
* add tests
* need to add qwen2-vl (not tested yet)
* add qwen2-vl in auto map
* fix copies
* isort
* resolve conflicts kinda
* nit:
* qwen2-vl is happy now
* qwen2-5 happy
* other models are happy
* fix copies
* fix tests
* add docs
* CI green now?
* add more tests
* even more changes + tests
* doc builder fail
* nit
* Update src/transformers/models/auto/processing_auto.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* small update
* imports correctly
* dump, otherwise this is getting unmanageable T-T
* dump
* update
* another update
* update
* tests
* move
* modular
* docs
* test
* another update
* init
* remove flakiness in tests
* fixup
* clean up and remove commented lines
* docs
* skip this one!
* last fix after rebasing
* run fixup
* delete slow files
* remove unnecessary tests + clean up a bit
* small fixes
* fix tests
* more updates
* docs
* fix tests
* update
* style
* fix qwen2-5-vl
* fixup
* fixup
* unflatten batch when preparing
* dump, come back soon
* add docs and fix some tests
* how to guard this with new dummies?
* chat templates in qwen
* address some comments
* remove `Fast` suffix
* fixup
* oops should be imported from transforms
* typo in requires dummies
* new model added with video support
* fixup once more
* last fixup I hope
* revert image processor name + comments
* oh, this is why fetch test is failing
* fix tests
* fix more tests
* fixup
* add new models: internvl, smolvlm
* update docs
* import once
* fix failing tests
* do we need to guard it here again, why?
* new model was added, update it
* remove testcase from tester
* fix tests
* make style
* not related CI fail, let's just fix here
* mark flaky for now, fails 15 out of 100
* style
* maybe we can do this way?
* don't download images in setup class
---------
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-05-12 11:55:51 +02:00
Arjuna Sky Kok
716819b830
fix(conversion): Fix size mismatch error during TF->PT model loading ( #38014 )
2025-05-10 11:11:07 +00:00
Yao Matrix
8f08318769
enable generation fsdp/utils cases on XPU ( #38009 )
...
* enable generation fsdp/utils test cases on XPU
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
* fix style
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
* xx
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
* use backend_xx APIs
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
* fix style
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
---------
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
2025-05-09 20:52:41 +00:00
Pavel Iakubovskii
87e971e14d
Fix linalg.norm for ConvNextV2 ( #38015 )
...
Fix norm
2025-05-09 17:44:28 +01:00
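For reference, the norm in question lives in ConvNeXt V2's Global Response Normalization (GRN) layer; a minimal sketch of that computation (channels-last, per the reference implementation — the exact norm call fixed by the PR may be written differently):

```python
import torch


def grn(x: torch.Tensor, gamma: torch.Tensor, beta: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # x: (N, H, W, C). Per-channel global L2 norm over the spatial dims, then divisive
    # normalization across channels, scaled/shifted and added back as a residual.
    gx = torch.norm(x, p=2, dim=(1, 2), keepdim=True)
    nx = gx / (gx.mean(dim=-1, keepdim=True) + eps)
    return gamma * (x * nx) + beta + x


x = torch.randn(2, 8, 8, 64)
print(grn(x, gamma=torch.ones(64), beta=torch.zeros(64)).shape)  # torch.Size([2, 8, 8, 64])
```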
Cyril Vallez
aaed2f5577
Fix cache update! ( #38046 )
...
* fix slicing
* better fix
2025-05-09 17:54:48 +02:00
Mikhail Moskovchenko
7f1a97bae3
Fix reduce-labels in BEIT Fast Image Processor ( #38042 )
...
* Fixed reduce-labels
* Little doc fix
* Change docstring
2025-05-09 11:51:46 -04:00
Yih-Dar
9f9020fed3
Re-Enable Trigger CircleCI via GitHub Actions when "ready for review" ( #37885 ) ( #38041 )
...
* check actions
* trigger CI
* check actions
* finally
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-09 16:57:54 +02:00
Lysandre Debut
23d79cea75
Support for version spec in requires & arbitrary mismatching depths across folders ( #37854 )
...
* Support for version spec in requires & arbitrary mismatching depths
* Quality
* Testing
2025-05-09 15:26:27 +02:00
François REMY
774dc274ac
Do not erase a cache_position passed explicitly to generate(), if there is one ( #37986 )
...
Do not erase a cache_position initialization passed explicitly to generate(), if there is one.
But: Let initialization replace cache_position if it's set to None. I assume that if the value is explicitly passed but None, we should initialize anyway.
2025-05-09 10:56:21 +00:00
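In other words, the guard described above looks roughly like this (a simplified sketch, not the actual `generate()` internals):

```python
import torch


def init_cache_position(model_kwargs: dict, input_ids: torch.LongTensor, past_length: int = 0) -> dict:
    # Simplified sketch: keep a cache_position the caller passed explicitly, but (re)build it
    # from the current sequence length when it is missing or explicitly set to None.
    if model_kwargs.get("cache_position") is None:
        model_kwargs["cache_position"] = torch.arange(
            past_length, input_ids.shape[1], dtype=torch.long, device=input_ids.device
        )
    return model_kwargs


out = init_cache_position({"cache_position": None}, torch.ones(1, 5, dtype=torch.long))
print(out["cache_position"])  # tensor([0, 1, 2, 3, 4]) — rebuilt because the passed value was None
```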
Yih-Dar
0010b41524
Disable Trigger CircleCI via GitHub Actions when `ready for review` ( #38038 )
...
disable
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-09 12:27:53 +02:00
Yih-Dar
d498528800
Trigger CircleCI via GitHub Actions when ready for review ( #37885 )
...
* update
* update
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-09 11:45:03 +02:00
Yih-Dar
66e696ee15
[Temporary] Log some information in some pytest/pluggy internal places ( #37996 )
...
log pytest info
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-09 11:06:37 +02:00
Yao Matrix
a72cb31434
enable utils test cases on XPU ( #38005 )
...
* enable utils test cases on XPU
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
* fix style
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
* Update tests/utils/test_skip_decorators.py
Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
* fix comment
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
---------
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
2025-05-09 08:45:01 +02:00
Yao Matrix
1dfad4beb2
make mistral3 pass on xpu ( #37882 )
...
* enabled mistral3 test cases on XPU
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
* calibrate A100 expectation
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
* update
* update
* update
* update
* update
* update
---------
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-09 06:41:11 +00:00
Wing Lian
121f7037c7
fix document masking for chunked attention ( #37429 )
...
* fix document masking for chunked attention
* remove accidental debugging sum
2025-05-09 08:22:00 +02:00
Arthur
5f5ccfdc54
[AutoDocstring] Based on inspect parsing of the signature ( #33771 )
...
* delete common docstring
* nit
* updates
* push
* fixup
* move stuff around fixup
* no need for dataclass
* damn nice modular
* add auto class docstring
* style
* modular update
* import autodocstring
* fixup
* maybe add original doc!
* more cleanup
* remove class doc as well
* update
* nits
* more cleanup
* fix
* wups
* small check
* updates
* some fixes
* fix doc
* update
* nits
* try?
* nit
* some updates
* a little bit better
* wherever we did not have help we are not really adding it!
* revert llama config
* small fixes and small tests
* test
* fixup
* more fix-copies
* updates
* updates
* fix doc building
* style
* small fixes
* nits
* fix-copies
* fix merge issues faster
* fix merge conf
* nits jamba
* ?
* working autodoc for model class and forward except returns and example
* support return section and unpack kwargs description
* nits and cleanup
* fix-copies
* fix-copies
* nits
* Add support for llava-like models
* fixup
* add class args subset support
* add examples inferred from automodel/pipelines
* update ruff
* autodocstring for Aria, Albert + fixups
* Fix empty return blocks
* fix copies
* fix copies
* add autodoc for all fast image processors + align, altclip
* fix copies
* add auto_doc for audio_spectrogram, auto_former, bark, bamba
* Drastically improve speed + add bart beit bert
* add autodoc to all bert-like models
* Fix broken doc
* fix copies
* fix auto_docstring after merge
* add autodoc to models
* add models
* add models
* add models and improve support for optional, and custom shape in args docstring
* update fast image processors
* refactor auto_method_docstring in args_doc
* add models and fix docstring parsing
* add models
* add models
* remove debugging
* add models
* add fix_auto_docstrings and improve args_docs
* add support for additional_info in args docstring
* refactor (almost) all models
* fix check docstring
* fix-copies
* fill in all missing docstrings
* fix copies
* fix qwen3 moe docstring
* add documentation
* add back labels
* update docs and fix can_return_tuple in modular files
* fix LongformerForMaskedLM docstring
* add auto_docstring to _toctree
* remove auto_docstring tests temporarily
* fix copyrights new files
* fix can_return_tuple granite hybrid
* fix fast beit
* Fix empty config doc
* add support for COMMON_CUSTOM_ARGS in check_docstrings and add missing models
* fix code block not closed flava
* fix can_return_tuple sam hq
* Fix Flaubert dataclass
---------
Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-05-08 17:46:07 -04:00
jiqing-feng
d231f5a7d4
update bnb tests ( #38011 )
...
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-05-08 20:35:24 +00:00
Yao Matrix
b3db4ddb22
enable mamba2 integration cases on xpu ( #38006 )
...
* enable mamba2 integration cases on XPU
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
* fix style
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
---------
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
2025-05-08 19:48:09 +00:00
Fanli Lin
c7c2f08994
make test_speculative_decoding_non_distil device-agnostic ( #38010 )
...
* make device-agnostic
* use condition
---------
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-05-08 19:19:47 +00:00
Raushan Turganbay
d23aae2b8c
[VLMs] support attention backends ( #37576 )
...
* update models
* why rename
* return attn weights when sdpa
* fixes
* fix attn implementation composite
* fix moshi
* add message
* add typings
* use explicitly all flags for each attn type
* fix some tests
* import what is needed
* kosmos on main has new attention already, yay
* new models in main, run fixup
* won't fix kosmos yet
* fix-copies
* clean up after rebasing
* fix tests
* style
* dont cast attns to fp32
* did we update ruff? ok, let's just do what it asks
* fix pixtral after rebase
2025-05-08 18:18:54 +02:00
Tomek
e296c63cd4
Fix wording in torchscript.md ( #38004 )
...
Fix wording in torchscript.md
2025-05-08 16:47:45 +01:00
Yufeng Xu
1c65aef923
Fix incorrect installation instructions (for issue #37476 ) ( #37640 )
...
* debugging issue 36758
* debugging issue 36758
* debugging issue 36758
* updated attn_mask type specification in _flash_attention_forward
* removed pdb
* added a blank line
* removed indentation
* update constants
* remove unnecessary files
* created installation script, modified README
* modified requirements and install.sh
* undo irrelevant changes
* removed blank line
* fixing installation guide
* modified README, python requirements, and install script
* removed tests_output
* modified README
* discarded installation script and python<3.13 requirement
2025-05-08 16:32:58 +01:00
Yih-Dar
f2909e024c
Skip test_push_to_hub_with_saves_each_epoch for now ( #38022 )
...
* update
* trigger CI
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-08 16:26:24 +02:00
Joao Gante
f2b59c6173
[caches] Raise exception on offloaded static caches + multi device ( #37974 )
...
* skip tests on >1 gpu
* add todo
2025-05-08 14:37:36 +01:00
Joao Gante
4279057d70
[CI] remove duplicated message on GH comment to run slow tests ( #37970 )
...
duplicated msg
2025-05-08 14:35:54 +01:00
Yih-Dar
3390534f36
Print commit SHA on slack message for new model notification. ( #38019 )
...
add commit info
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-08 15:26:19 +02:00
Pavel Iakubovskii
9f8fffed3c
Fix Optional typing ( #38018 )
...
* Fix
* trigger
2025-05-08 14:51:45 +02:00
Yuanyuan Chen
06c16de3d3
Enable RUF013 to enforce optional typing ( #37266 )
...
* Enable RUF013 for Optional typing
Signed-off-by: cyy <cyyever@outlook.com>
* Add Optional to types
* Format code
Signed-off-by: cyy <cyyever@outlook.com>
---------
Signed-off-by: cyy <cyyever@outlook.com>
2025-05-08 12:39:56 +02:00
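RUF013 flags "implicit Optional": a parameter annotated with a plain type but defaulted to None. A small before/after example of the pattern the rule enforces (function names here are illustrative):

```python
from typing import Optional


# Flagged by RUF013: the default is None, but the annotation claims the argument is always a str.
def tokenize_implicit(text: str, pad_token: str = None) -> list:
    return text.split() + ([pad_token] if pad_token else [])


# Compliant: the annotation states explicitly that None is allowed.
def tokenize_explicit(text: str, pad_token: Optional[str] = None) -> list:
    return text.split() + ([pad_token] if pad_token else [])


print(tokenize_explicit("hello world"))  # ['hello', 'world']
```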
Aurélien Lac
f6664ee713
Add ALL_ATTENTION_FUNCTIONS compatibility for Pixtral model ( #37960 )
...
* Add ALL_ATTENTION_FUNCTIONS compatibility for Pixtral model
* Fix invalid operand type
* Allow image_sizes to be optional in forward pass to fit tests
Disallow using sdpa and output_attentions
* Disallow using sdpa with output_attentions
* Delete useless comments, use eager attention from smolvlm, use pattern from mistral
* add _supports_attention_backend
* use kwargs instead of position_ids
---------
Co-authored-by: aurelien.lac <aurelien.lac@lighton.ai>
2025-05-08 12:13:13 +02:00
Sebastiaan Vermeulen
015b6dfbf8
Fix pad image transform for batched inputs ( #37544 )
...
* fix
* add batch dimension to expected output
2025-05-08 10:51:15 +01:00
Eon Kim
5c47d08b0d
Add Swin2SR ImageProcessorFast ( #37169 )
...
* Add fast image processor support for Swin2SR
* Add Swin2SR tests of fast image processing
* Update docs and remove unnecessary test func
* Fix docstring formatting
* Skip fast vs slow processing test
---------
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-05-07 12:20:16 -04:00
Raushan Turganbay
17742bd9c8
🔴 [VLM] Add base model without head ( #37033 )
...
* i guess reverted all CdGen classes
* style
* llava onevision
* fix copies
* fix some tests
* some more tests
* dump
* skip these
* nevermind, i am dumb
* revert fix not needed
* fixup
* fixup
* another fixup
* more fixup to make ci finally happy
* fixup after rebasing
* fix qwen tests
* add internVL + typos here and there
* image token index -> id
* style
* fix init weights
* revert blip-2 not supported
* address comments
* fix copies
* revert blip2 test file as well
* as discussed internally, revert back CdGen models
* fix some tests
* fix more tests for compile
* CI red
* fix copies
* enumerate explicitly allowed models
* address comments
* fix tests
* fixup
* style again
* add tests for new model class
* another fixup ( x _ x )
* [fixup] unused attributes can be removed post-deprecation
2025-05-07 17:47:51 +02:00
eustlb
3fa8d9c20e
[CSM] tiny fix on generation ( #38001 )
...
nit
2025-05-07 11:45:23 -04:00
eustlb
798f948e88
Add CSM model ( #36719 )
...
* draft structure
* depth decoder with forward pre hook
* full model forward draft
* draft update
* depth decoder update
* ConversationalSpeechModelForCausalLM updates
* add generate
* max length criteria small fix
* update
* updates
* generation update
* update in loss compute
* conversion script
* update for correct input embeddings
* handle interleaved rope
* update
* update
* update
* support compile
* update training
* add doc
* update doc
* correct inits
* ConversationalSpeechModel -> Csm
* conf update
* name update
* tests CsmForCausalLMTest
* convert use cached_file
* conf + modeling updates
* generate utils handle third dim shape
* integration test
* modeling + conf updates
* common test handle more than 2 dims
* add nested audio list utils
* processing handle nested audio list
* csm processing draft
* mimi util
* init updates
* modular update
* convert modular
* processing update
* csm tests update
* generate tests handle third dim
* generate utils handle third dim
* propagate _get_initial_cache_position update
* tied_weight_keys update + convert correctly
* fix inputs_embeds
* revert audio nested list
* batch inference update + return audio
* audio_utils update
* processor update
* some more integration tests
* remove old test
* processing output labels
* improve
* fix
* update rope values with equivalent ones
* conversion update
* update tests
* handle depth decoder generation config
* remove default eos_token_id
* make style
* revert modeling_mimi
* add default generation_config
* remove sdpa since handled by default
* make
* fix conflict
* fix conflicts
* correct naming
* correct imports
* make
* causal -> conditional naming
* causal -> conditional naming
* auto update
* make
* make
* add doc
* test update
* fix weight init
* audio tokens offsets as buffer
* 4d mask in conditional class
* make
* doc update
* fix causal mask
* fix causal mask
* doc update
* doc update
* add processor doc
* update doc
* fix 4d causal mask
* update make_list_of_audio
* do not default to mutable
* remove duplicates
* remove useless reset_parameters
* use GradientCheckpointingLayer
* use can_return_tuple
* formatting
* prepend placeholder in _sample
* torch compile fix
* some more fixies
* convert modular
* fix
* default max_length in convert
* handle depth decoder generation config correctly
* clearer formulation
* handle output_loading_info
* handle softmax warning
* add doc
* propagate _get_initial_cache_position changes
* generation in its own module
* add processor tests
* fix compile with cuda graphs
* fix compile with cuda graphs
* add csm.md
* include CSM loss
* doc nit
* doc nit
* doc nit
* Update docs/source/en/model_doc/csm.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* add save_audio to processor
* Update src/transformers/models/csm/modular_csm.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* doc update
* simplify audio_codes_mask computation
* doc update
* simplify loss computation
* fix static cache test
* fix
* remove comment
* simplify encoded length computation
* use hf-internal-testing
* doc update
* cast to float before numpy
* nit
* mem efficient codebook head
* nit
* cat input values with cutoffs
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-05-07 10:20:13 -04:00
Fiona Waters
c8607a17cb
Add a check to import_utils.py to allow for use of faiss_gpu installation ( #37997 )
...
Adding check to import_utils.py for faiss_gpu
2025-05-07 14:27:41 +01:00
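A hedged sketch of the kind of availability check described (not necessarily the exact code in `import_utils.py`): every faiss distribution installs the same importable `faiss` module, but the installed distribution may be named `faiss`, `faiss-cpu`, or `faiss-gpu`, so the metadata lookup has to try each name:

```python
import importlib.metadata
import importlib.util


def is_faiss_available() -> bool:
    # The module must be importable at all.
    if importlib.util.find_spec("faiss") is None:
        return False
    # The installed distribution can be any of these; accept whichever is present.
    for dist_name in ("faiss", "faiss-cpu", "faiss-gpu"):
        try:
            importlib.metadata.version(dist_name)
            return True
        except importlib.metadata.PackageNotFoundError:
            continue
    return False


print(is_faiss_available())
```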
kaixuanliu
fb1e3a4daa
remove duplicate code ( #37991 )
...
Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
2025-05-07 13:46:45 +01:00