Lysandre Debut
548794b886
[serve] Model name or path should be required ( #39178 )
...
* Model name or path should be required
* Fix + add tests
* Change print to log so it doesn't display in transformers chat
2025-07-02 22:06:47 +02:00
Joao Gante
2d561713f8
[generate] document non-canonical beam search default behavior ( #39000 )
2025-07-02 18:29:16 +01:00
Vasqu
ca7c9304f1
Merge branch 'main' into vas-bert-attn-refactors
2025-07-02 19:00:17 +02:00
Vasqu
95210dd5c0
RemBert: remove copy, maybe doing it later
2025-07-02 18:09:19 +02:00
Vasqu
cfcb267892
roberta prelayernorm
2025-07-02 17:03:32 +02:00
Steven Liu
df12d87d18
[docs] ViTPose ( #38630 )
...
* vitpose
* fix?
* fix?
* feedback
* fix
* feedback
* feedback
* update sample image
2025-07-02 07:56:29 -07:00
Vasqu
f199dec418
xlm_roberta + some embedding fixes
2025-07-02 16:14:31 +02:00
Cyril Vallez
2b4a12b5bf
Reduce Glm4v model test size significantly ( #39173 )
...
* fix test size
* Update test_modeling_glm4v.py
2025-07-02 15:55:05 +02:00
BUI Van Tuan
e355c0a11c
Fix missing initializations for models created in 2024 ( #38987 )
...
* fix GroundingDino
* fix SuperGlue
* fix GroundingDino
* fix MambaModel
* fix OmDetTurbo
* fix SegGpt
* fix Qwen2Audio
* fix Mamba2
* fix DabDetr
* fix Dac
* fix FalconMamba
* skip timm initialization
* fix Encodec and MusicgenMelody
* fix Musicgen
* skip timm initialization test
* fix OmDetTurbo
* clean the code
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
* add reviewed changes
* add back timm
* style
* better check for parametrizations
---------
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-07-02 15:03:57 +02:00
Rémi Ouazan
1125513a8d
Blip2 fixes ( #39080 )
...
* Fixed some devices errors
* Fixed other device issues and more expectations
* Reverted support flags
* style
* More granular support
* Fixed some rebase stuff
* add a not None check before .to
2025-07-02 14:39:39 +02:00
Isotr0py
28df7f854a
Fix multimodal processor get duplicate arguments when receive kwargs for initialization ( #39125 )
...
* fix processor tokenizer override
Signed-off-by: Isotr0py <2037008807@qq.com>
* code format
Signed-off-by: Isotr0py <2037008807@qq.com>
* add regression test
Signed-off-by: Isotr0py <2037008807@qq.com>
* fix
Signed-off-by: Isotr0py <2037008807@qq.com>
* check image processor same
Signed-off-by: Isotr0py <2037008807@qq.com>
---------
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-07-02 19:57:15 +08:00
Vasqu
32cd8d2c0d
remove wrong copy
2025-07-02 13:49:17 +02:00
Vasqu
1865eb330a
electra + markuplm, small fixes
2025-07-02 13:47:32 +02:00
Yaswanth Gali
b61023a1b7
🚨 🚨 🚨 [eomt] make EoMT compatible with pipeline ( #39122 )
...
* Make EoMT compatible with pipeline
* Implicit patch offsets
* remove patch offsets from arg
* Modify tests
* Update example
* fix proc testcase
* Add few more args
* add pipeline test suite
* fix
* docstring fixes
* add pipeline test
* changes w.r.t review
* 🙈 MB
* should fix device mismatch
* debug
* Fixes device mismatch
* use decorator
* we can split mlp
* expected values update
---------
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
2025-07-02 12:25:26 +01:00
Vasqu
786230b463
whoops
2025-07-02 13:01:34 +02:00
Vasqu
8fa32ca900
xmod + cache position fixes
2025-07-02 12:57:15 +02:00
Raushan Turganbay
4d5822e65d
[smolvlm] fix video inference ( #39147 )
...
* fix smolvlm
* better do as before, set sampling params in overwritten `apply_chat_template`
* style
* update with `setdefault`
2025-07-02 12:05:10 +02:00
वेदांत
9b2f5b66d8
fix default value of config to match checkpoints in LLaVa-OV models ( #39163 )
2025-07-02 09:45:50 +00:00
Vasqu
baaa3ecccc
tmp disable
2025-07-02 10:54:48 +02:00
Vasqu
52d2052b4e
modular data2vec text
2025-07-02 10:53:12 +02:00
Chong You
e8e0c76162
Add activation sparsity reference in gemma3n doc ( #39160 )
...
Add activation sparsity reference in the description of gemma3n
2025-07-02 04:11:03 +02:00
Yih-Dar
8e87adc45f
fix llama tests ( #39161 )
...
* fix
* fix
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-01 23:27:22 +02:00
Yih-Dar
4c1715b610
Update expected values (after switching to A10) ( #39157 )
...
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* empty
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-01 20:54:31 +02:00
Yih-Dar
ab59cc27fe
Suggest jobs to use in run-slow ( #39100 )
...
* pr
* pr
* pr
* pr
* pr
* pr
* pr
* pr
* pr
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-01 20:19:06 +02:00
jiqing-feng
db2f535443
update bnb ground truth ( #39117 )
...
* update bnb results
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* set seed to avoid sampling different results
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix int8 tests
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix typo
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* add comments
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-07-01 20:06:37 +02:00
ybkurt
260846efad
fix: remove undefined variable ( #39146 )
2025-07-01 19:10:29 +02:00
Vasqu
ad3ffe55a9
data2vectext, making it modular tomorrow
2025-07-01 18:41:29 +02:00
rasmi
cdfe49a4d0
Change `@lru_cache()` to `@lru_cache` to match styles from #38883 . ( #39093 )
...
Match styles in #38883
2025-07-01 18:29:16 +02:00
DavidS2106
f46798193e
Fix: Ensure wandb logs config in offline mode ( #38992 )
...
* Fix: Ensure wandb logs config in offline mode
* Apply style fixes
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-07-01 16:17:58 +00:00
Yih-Dar
fe838d6631
Fix missing fsdp & trainer jobs in daily CI ( #39153 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-01 18:10:30 +02:00
Vasqu
dd7aeca424
albert
2025-07-01 17:55:05 +02:00
Vasqu
11de15bda4
modular roberta
2025-07-01 16:47:04 +02:00
Vasqu
5120ca6c8e
fix test
2025-07-01 15:58:43 +02:00
Vasqu
38e8de3104
fix encoder decoder
2025-07-01 14:52:46 +02:00
StevenBucaille
1283877571
[superglue] fix wrong concatenation which made batching results wrong ( #38850 )
2025-07-01 12:14:44 +00:00
Raushan Turganbay
f8b88866f5
[VLMs] support passing embeds along with pixels ( #38467 )
...
* VLMs can work with embeds now
* update more models
* fix tests
* fix copies
* fixup
* fix
* style
* unskip tests
* fix copies
* fix tests
* style
* omni modality models
* qwen models had extra indentation
* fix some other tests
* fix copies
* fix test last time
* unrelated changes revert
* we can't rely only on embeds
* delete file
* de-flake mistral3
* fix qwen models
* fix style
* fix tests
* fix copies
* deflake the test
* modular reverted by fixes, fix again
* flaky test, overwritten
* fix copies
* style
2025-07-01 11:33:20 +00:00
Vasqu
306a5c2a5c
attention split, simplify args and kwargs, better typing
2025-07-01 12:50:45 +02:00
Ayush Singh
20901f1d68
[typing] LlamaAttention return typehint ( #38998 )
...
* helo llama
* helo llama
* helo llama
* apply modular
* fix dia
---------
Co-authored-by: qubvel <qubvel@gmail.com>
2025-07-01 11:29:52 +01:00
Raushan Turganbay
7a25f8dfdb
[qwen2-vl] fix FA2 inference ( #39121 )
...
* fix FA2
* update is causal flag and remove mask for FA2
* update for FA2 with varlen path
* how the tests were passing with different devices?
* add comment and ref to the PR
* move mask preparation to base pretrained model
* seq len is the first dim, not second
* fix copies to fix GLM4V
2025-07-01 10:18:37 +00:00
Mehant Kammakomati
def9663239
feat: support indivisible shards for TP model loading and TPlizing. ( #37220 )
...
* feat: support uneven loading and sharding
resolve merge conflicts
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>
* fix: allow for empty tensor computations
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>
* test: add llama1b test case
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>
* due to q_proj colwise it has to be multi of 2
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>
* refactor: use slice API
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>
* refactor: use slice API
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>
* refactor: use slice API
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>
* refactor: use slice API
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>
---------
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>
2025-07-01 10:03:22 +00:00
jiqing-feng
06c4a4d499
fix caching_allocator_warmup with tie weights ( #39070 )
...
* fix caching_allocator_warmup with tie weights
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix comment
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-07-01 11:32:20 +02:00
Vasqu
b82b47e5d5
Merge branch 'main' into vas-bert-attn-refactors
2025-07-01 11:32:01 +02:00
Raushan Turganbay
e435574721
🚨 Don't use cache in non-generative models ( #38751 )
...
* deprecate for 1 version
* style
* fix some tests
* fix esm
* skip for now, GC requires positional args but we have keyword args
* remove transpose for scores in modified models only
* skip fx trace tests
2025-07-01 09:08:21 +00:00
Cyril Vallez
dbc98328da
Several fixes for Gemma3n ( #39135 )
...
* remove the skips
* fix the epsilon to a small value (does not make sense otherwise)
* safeguard
* overload test_eager_matches_sdpa
* Update test_modeling_common.py
* skip appropriate tests
* correct no_split_layer
* fix all devices issue
* fix backward
* fix
2025-07-01 10:34:53 +02:00
BUI Van Tuan
d53518c5f2
Fix key mapping for VLMs ( #39029 )
...
* fix key mapping for VLMs
* use __mro__ instead
* update key mapping in save_pretrained
2025-07-01 09:47:53 +02:00
eustlb
3457e8e73e
[Whisper] update token timestamps tests ( #39126 )
...
* fixes
* update comment
* update for A10
* all a10
* all a10
* all a10
* all a10
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-30 21:55:36 +02:00
Drew Ross
fe35eca7bd
Update BigBirdPegasus model card ( #39104 )
...
* Update bigbird_pegasus.md
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-06-30 10:42:56 -07:00
Vasqu
d1c76901b4
fixup sdpa remains
2025-06-30 18:00:02 +02:00
Yao Matrix
29a3f5ed8c
switch default xpu tp backend to pytorch built-in XCCL from pytorch 2.8 ( #39024 )
...
* switch default xpu tp backend to pytorch built-in XCCL from pytorch 2.8
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
* Update docs/source/en/perf_infer_gpu_multi.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update perf_infer_gpu_multi.md
* Update perf_infer_gpu_multi.md
* Update perf_infer_gpu_multi.md
---------
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-06-30 08:54:05 -07:00
Vladimir Gutuev
9e0c865b8b
docs: correct two typos in awesome-transformers.md ( #39102 )
...
* docs(awesome-projects): fix typo “Itt leverages” → “It leverages” (#39101 )
closes #39101
* docs(awesome-projects): fix grammar “We provides” → “We provide” (#39101 )
closes #39101
2025-06-30 08:53:43 -07:00