Commit Graph

19383 Commits

Author · SHA1 · Message · Date
AbdelKarim ELJANDOUBI
7ecc5b88c0
Add image classifier donut & update loss calculation for all swins (#37224)
* add classifier head to donut

* add to transformers __init__

* add to auto model

* fix typo

* add loss for image classification

* add checkpoint

* remove unneeded import

* reorder imports

* format

* consistency

* add test of classifier

* add doc

* try ignore

* update loss for all swin models
2025-04-10 15:00:42 +02:00
Mohamed Mekkouri
5ae9b2cac0
Quark Quantization gated repo (#37412)
* fix

* empty commit

* empty

* nit

* fix maybe ?
2025-04-10 14:57:15 +02:00
Yih-Dar
d9e76656ae
Fix new failure reports not including anything other than tests/models/ (#37415)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-10 14:47:23 +02:00
Raushan Turganbay
1ae8d54b04
[chat-template] Unify tests and clean up 🧼 (#37275)
* fix tests and some clean up

* make one general test for each modality

* remove redundant merging of kwargs

* edge cases

* don't enforce slow when reloading

* fix gemma3 tests

* has to adapt llama 4 after rebase

* also remove from overridden tests

* should be green now
2025-04-10 14:42:32 +02:00
Arthur
10144ff116
use rms_norm_eps for the L2Norm for Llama4 (#37418)
use `rms_norm_eps`
2025-04-10 13:33:50 +02:00
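
A minimal sketch of what this change likely amounts to; the class name and shape conventions are assumptions, only the switch to `rms_norm_eps` comes from the commit:

```python
import torch
from torch import nn


class L2Norm(nn.Module):
    # Sketch: the epsilon now comes from config.rms_norm_eps instead of a
    # hardcoded constant (per "use `rms_norm_eps`" in #37418).
    def __init__(self, eps: float = 1e-6):
        super().__init__()
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps)
```
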
ivarflakstad
aa478567f8
Allow rocm systems to run these tests (#37278)
* Allow rocm systems to run these tests

* Fix skipTest logic

* Use get_device_properties to check system capabilities
2025-04-10 13:33:01 +02:00
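
The last bullet points at capability-based gating; a hedged sketch of that idea using the stock `torch.cuda.get_device_properties` API (the helper name and the `(8, 0)` threshold are illustrative, not from the PR):

```python
import torch


def device_supports_capability(min_major: int = 8, min_minor: int = 0) -> bool:
    # Gate tests on what the device reports rather than on the vendor,
    # so ROCm systems are not skipped blindly (torch.cuda covers ROCm too).
    if not torch.cuda.is_available():
        return False
    props = torch.cuda.get_device_properties(0)
    return (props.major, props.minor) >= (min_major, min_minor)
```
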
Wang, Yi
ae5ce22664
from_pretrained should handle xpu case (#37382)
* from_pretrained should handle xpu case

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* fmt

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

---------

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2025-04-10 13:23:17 +02:00
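
A hedged sketch of the kind of branch such a fix adds (the helper is hypothetical; only the XPU case itself comes from the commit):

```python
import torch


def pick_default_device() -> torch.device:
    # from_pretrained-style device selection that also considers Intel XPU.
    if torch.cuda.is_available():
        return torch.device("cuda")
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        return torch.device("xpu")
    return torch.device("cpu")
```
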
Yih-Dar
4f139f5a50
Send trainer/fsdp/deepspeed CI job reports to a single channel (#37411)
* send trainer/fsdp/deepspeed channel

* update

* change name

* no .

* final

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-10 13:17:31 +02:00
Arthur
a2c2fb0108
update kernels to 0.4.3 (#37419)
* update `kernels`

* oups
2025-04-10 12:14:22 +02:00
Wing Lian
0ddad2d655
mark llama4 as not supported with fa2 (#37416) 2025-04-10 11:48:46 +02:00
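
In transformers, opting a model out of FlashAttention-2 is typically a class-level flag; a sketch of that convention (not the actual diff):

```python
from transformers import PretrainedConfig, PreTrainedModel


class ExamplePreTrainedModel(PreTrainedModel):
    config_class = PretrainedConfig
    _supports_flash_attn_2 = False  # fa2 marked unsupported, as in #37416
    _supports_sdpa = True           # SDPA remains available
```
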
Cyril Vallez
fbb2054ed5
Offloaded hybrid cache for Llama4 (#37401)
* first try (maybe race condition)

* Update cache_utils.py

* cannot avoid the race condition -> use 2 layers

* Update cache_utils.py

* Update cache_utils.py
2025-04-10 11:44:34 +02:00
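
The "use 2 layers" note suggests double buffering to dodge the copy/compute race; a speculative sketch of that pattern (all names assumed):

```python
import torch


class DoubleBufferedOffload:
    # Keep per-layer tensors on CPU and only two device slots: prefetching
    # layer i+1 into one slot never races compute reading the other slot.
    def __init__(self, num_layers: int, shape: tuple, device: str = "cuda"):
        self.cpu = [torch.zeros(shape, pin_memory=True) for _ in range(num_layers)]
        self.dev = [torch.zeros(shape, device=device) for _ in range(2)]

    def prefetch(self, layer_idx: int, stream: torch.cuda.Stream) -> torch.Tensor:
        slot = self.dev[layer_idx % 2]
        with torch.cuda.stream(stream):
            slot.copy_(self.cpu[layer_idx], non_blocking=True)
        return slot
```
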
Cyril Vallez
6d8b0b3378
Fix Llama4 offset (#37414)
* add +1

* Update modeling_llama4.py
2025-04-10 11:40:58 +02:00
Mohamed Mekkouri
f5865d32a2
Restrict & Explain tp_plan for FBgemm (#37404)
* explain tp_plan

* add llama4 check

* add clarification
2025-04-10 11:33:33 +02:00
Serge Panev
e39c732644
Handle torch ver in flexattn (#37400)
* Handle torch ver in flexattn

* update
2025-04-10 11:27:54 +02:00
Manuel de Prada Corral
bc0150bb04
Add warning when failed to acquire other user's lock at model download (#37395) 2025-04-10 11:18:27 +02:00
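
A hedged sketch of the described behavior using the `filelock` package (function name and timeout are illustrative, not the PR's actual code):

```python
import logging
from typing import Optional

from filelock import FileLock, Timeout

logger = logging.getLogger(__name__)


def acquire_download_lock(path: str, timeout: float = 10.0) -> Optional[FileLock]:
    # If another user's lock can't be acquired, warn loudly instead of
    # failing with an opaque error (the gist of #37395, names assumed).
    lock = FileLock(path + ".lock")
    try:
        lock.acquire(timeout=timeout)
        return lock
    except (Timeout, PermissionError) as err:
        logger.warning("Could not acquire lock %s.lock (%s); it may belong to another user.", path, err)
        return None
```
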
Wing Lian
9cda4265d6
handle torch version edge cases (#37399) 2025-04-09 21:49:57 +02:00
Arthur
e032d12e8a
the fix that did not get in (#37370)
* debugging improvements

* add debugging details

* add more debugging details

* debug more

* the fix that did not get in

* First fix flex

* fix query offset

* fix flex first

* fix device mask creation for speed

* small mask creation sdpa

* Update flex_attention.py

* remove chunked prefill from HybridChunkedCache

* never seen such a messed-up merge

* clean up layers + output

* add summary json file

* Efficient general cache

* Update cache_utils.py

* cleanup

* fix?

* fix!

* oups typo

* not everywhere

* more fixes

* revert unrelated changes

* Fix but ugly for now -> should use pad instead

* oups

* re-initialize the cache

* Use pad to simplify

* style

* correct slicing

---------

Co-authored-by: Pablo <pablo.montalvo.leroux@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-04-09 20:15:33 +02:00
Mohamed Mekkouri
f834ca2c19
Attention Quantization with FBGemm & TP (#37384)
* fix

* keep fused

* contiguous

* rm print

* update

* update

* rm print
2025-04-09 18:45:42 +02:00
DerekLiu35
c5c648dd74
Fix some failing AWQ tests (#37383)
* update AwqQuantizer

* fix style

* add an arg to get_modules_to_not_convert so it can also apply get_keys_to_not_convert(model)
2025-04-09 18:24:57 +02:00
Brayden Zhong
71b35387fd
Apply torchfix to replace deprecated functions: _pytree._register_pytree_node and torch.cpu.amp.autocast (#37372)
fix: apply torchfix
2025-04-09 16:11:18 +01:00
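
For reference, a sketch of the replacements torchfix performs here, deprecated spellings on the left, maintained equivalents on the right:

```python
import torch

# Deprecated                                  ->  replacement
#   torch.utils._pytree._register_pytree_node ->  torch.utils._pytree.register_pytree_node
#   torch.cpu.amp.autocast()                  ->  torch.amp.autocast("cpu")

with torch.amp.autocast("cpu"):
    y = torch.randn(4, 4) @ torch.randn(4, 4)
```
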
Sangyun_LEE (이상윤)
ad340908e4
Fix warning message for PEFT models in text-generation pipeline #36783 (#36887)
* add peft model in constant

* add test

* fix formatting

* make fixup execute

* change code

* check by self.task

* add test

* fixup test code

* fix minor typo

* fix pipeline test

* apply maintainers' requests
2025-04-09 15:36:52 +01:00
DerekLiu35
2527f71a47
Add "selecting a quantization method" doc (#37159)
* initial draft

* make documentation simpler

* Update docs/source/en/quantization/selecting.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/selecting.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/selecting.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/selecting.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/selecting.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/selecting.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/selecting.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/selecting.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/selecting.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/selecting.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* turn pros and cons into tables

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* add links to each quant method page

* separate calibration vs no calibration methods

* add calibration time estimates

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-09 15:51:37 +02:00
Marc Sun
7ae0be722e
update deepspeed docker (#37371)
* update

* create docker image

* 03

* uninstall pytest as it conflicts with transformers

* wrong one

* better

* see which package depends on pytest

* up

* reinstall

* fix

* deepspeedddddddd (same message repeated over eight iteration commits)

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-09 14:54:06 +02:00
Arthur
e3eda6d188
Add glm4 (#37388)
* add changed

* Revert "add changed"

This reverts commit 0a0166a1fe.

* update with NEW MODEL class called GLM4

* update

* Update glm4.md

* Name

* style

* fix copies

* fixup test

---------

Co-authored-by: Yuxuan Zhang <2448370773@qq.com>
2025-04-09 14:02:04 +02:00
Jonas M. Kübler
1e6ff5fd55
fix: llama4 conversion script no_rope_layers (#37359)
fix conversion script no_rope_layers

`no_rope_layers` should be either a list of NoPE layers or None; when None, it is built in the config from the `no_rope_layer_interval`

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2025-04-09 13:02:15 +02:00
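
A sketch of the derivation the message describes (the 1/0 encoding is an assumption; only the interval-based construction comes from the commit):

```python
from typing import List, Optional


def resolve_no_rope_layers(
    no_rope_layers: Optional[List[int]],
    num_hidden_layers: int,
    no_rope_layer_interval: int,
) -> List[int]:
    # Pass an explicit list through; otherwise build it from the interval,
    # marking every `no_rope_layer_interval`-th layer as NoPE.
    if no_rope_layers is not None:
        return no_rope_layers
    return [
        0 if (layer + 1) % no_rope_layer_interval == 0 else 1
        for layer in range(num_hidden_layers)
    ]
```
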
Raushan Turganbay
6f4058aee3
Update composition flag usage (#36263)
* update composition flag usage

* remove print

* fix tests

* actually fix

* oh c'mon

* now should be fixed right?

* fix copies
2025-04-09 11:48:49 +02:00
Jerry Zhang
08e3217baf
Preserve requires_grad in pre quantized model (#37354)
* Preserve requires_grad in pre quantized model

Summary:
discovered this when running lm-eval for some models; the current
code always sets requires_grad to True

Test Plan:
lm_eval --model hf --model_args pretrained=jerryzh168/phi4-torchao-gguf-q4_k --tasks hellaswag --device cuda:0 --batch_size 8

* ruff format

---------

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-04-08 18:41:30 +02:00
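
A hedged sketch of the fix's idea — capture the flag before the layer swap instead of re-enabling gradients unconditionally (helper names assumed):

```python
from typing import Callable

from torch import nn


def replace_with_quantized(module: nn.Linear, quantize: Callable[[nn.Linear], nn.Module]) -> nn.Module:
    # Remember the original requires_grad instead of forcing it to True.
    requires_grad = module.weight.requires_grad
    quantized = quantize(module)
    for param in quantized.parameters():
        param.requires_grad_(requires_grad)
    return quantized
```
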
Matt
4d0de5f73a
🚨 🚨 setUp -> setUpClass conversion (#37282)
* More limited setup -> setupclass conversion

* make fixup

* Trigger tests

* Fixup UDOP

* Missed a spot

* tearDown -> tearDownClass where appropriate

* Couple more class fixes

* Fixups for UDOP and VisionTextDualEncoder

* Ignore errors when removing the tmpdir, in case it already got cleaned up somewhere

* CLIP fixes

* More correct classmethods

* Wav2Vec2Bert fixes

* More methods become static

* More class methods

* More class methods

* Revert changes for integration tests / modeling files

* Use a different tempdir for tests that actually write to it

* Remove addClassCleanup and just use tearDownClass

* Remove changes in modeling files

* Cleanup get_processor_dict() for got_ocr2

* Fix regression on Wav2Vec2BERT test that was masked by this before

* Rework tests that modify the tmpdir

* make fix-copies

* revert clvp modeling test changes

* Fix CLIP processor test

* make fix-copies
2025-04-08 17:15:37 +01:00
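
The shape of the conversion, in plain unittest terms (a sketch, not any specific test from the PR):

```python
import shutil
import tempfile
import unittest


class ExampleProcessorTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Expensive read-only fixtures are built once per class
        # instead of once per test method.
        cls.tmpdirname = tempfile.mkdtemp()

    @classmethod
    def tearDownClass(cls):
        # ignore_errors, in case a test already cleaned the dir up.
        shutil.rmtree(cls.tmpdirname, ignore_errors=True)

    def test_shared_fixture_exists(self):
        self.assertTrue(self.tmpdirname)
```

Per the PR, tests that actually write to the directory use a separate tempdir so the shared fixture stays read-only.
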
KimmiShi
c15a7adb28
fix(qwen): fix shape error when using tp (#36947)
* fix(qwen): fix shape error when using tp

* Update modeling_qwen2_vl.py

---------

Co-authored-by: shidongxing <shidongxing@pjlab.org.cn>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-04-08 17:47:30 +02:00
Jonathan Mamou
121f91d36c
prune LM Head for USD (#36695)
* initial commit

* fix

* fix style

* set default to prune

* add tests

* comment

* remove prune flag from generate

* address Joao's comments

* deprecate_kwarg

* add doc

* fix target_vocab_size

* Update src/transformers/generation/candidate_generator.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/generation/candidate_generator.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/generation/candidate_generator.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/generation/candidate_generator.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* fix deprecated argument assistant_model_device

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-04-08 16:44:10 +01:00
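
A hedged sketch of the pruning idea behind #36695 (USD assistants only need logits for tokens shared with the target tokenizer; the helper is illustrative, not the PR's code):

```python
import torch
from torch import nn


def prune_lm_head(lm_head: nn.Linear, kept_token_ids: torch.Tensor) -> nn.Linear:
    # Keep only the LM-head rows for the kept vocabulary, shrinking the
    # projection from the full vocab down to the shared one.
    pruned = nn.Linear(lm_head.in_features, len(kept_token_ids), bias=lm_head.bias is not None)
    pruned.weight.data = lm_head.weight.data[kept_token_ids]
    if lm_head.bias is not None:
        pruned.bias.data = lm_head.bias.data[kept_token_ids]
    return pruned
```
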
Joao Gante
4321b0648c
[core] remove GenerationMixin inheritance by default in PreTrainedModel (#37173) 2025-04-08 16:42:05 +01:00
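
What this means for downstream code, sketched: a custom model that should expose `.generate()` now inherits `GenerationMixin` explicitly.

```python
from transformers import PreTrainedModel
from transformers.generation import GenerationMixin


class MyModelForCausalLM(PreTrainedModel, GenerationMixin):
    # Before #37173, PreTrainedModel carried GenerationMixin implicitly;
    # generation support is now opt-in through explicit inheritance.
    ...
```
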
Kerry
aab0878327
Skip non-selected experts for mixtral and qwen2_moe (#32429)
* Skip non-selected experts for mixtral and qwen2_moe

* Fix: tensor tolist()

* WIP: tokenization test

* fix modular source of truth

* nits

---------

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-04-08 17:41:28 +02:00
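
A sketch of the optimization (shapes and names assumed; the real models use an expert mask, but the effect is the same — experts with no routed tokens are skipped):

```python
import torch
from torch import nn


def moe_forward(hidden: torch.Tensor, experts: nn.ModuleList,
                router_logits: torch.Tensor, top_k: int = 2) -> torch.Tensor:
    # Iterate only over experts the router selected for at least one token,
    # instead of looping over every expert unconditionally.
    top = torch.topk(router_logits, top_k, dim=-1)         # (num_tokens, top_k)
    weights = torch.softmax(top.values, dim=-1)
    out = torch.zeros_like(hidden)
    for expert_idx in torch.unique(top.indices).tolist():  # selected experts only
        token_idx, slot = torch.where(top.indices == expert_idx)
        out[token_idx] += weights[token_idx, slot, None] * experts[expert_idx](hidden[token_idx])
    return out
```
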
Joao Gante
35f0f5b5da
[llama 4] dynamic rope decorator (#37365)
l4 + dynamic rope decorator
2025-04-08 15:56:31 +01:00
Ryan Mullins
530322ccb6
Set vision config to None for Gemma 1B conversion (#37366)
* Set vision config to None for Gemma 1B conversion

* Trigger tests

---------

Co-authored-by: Matt <rocketknight1@gmail.com>
2025-04-08 14:22:32 +01:00
Yih-Dar
8064cd9b4f
fix deepspeed job (#37284)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-08 15:19:33 +02:00
Cyril Vallez
cdfb018d03
A bit of cleaning 🧹🧹 (#37215)
* cleaning

* CIs
2025-04-08 14:33:58 +02:00
cyyever
1e6b546ea6
Use Python 3.9 syntax in tests (#37343)
Signed-off-by: cyy <cyyever@outlook.com>
2025-04-08 14:12:08 +02:00
Minho Ryu
0fc683d1cd
convert float for yarn related arguments in rope_scaling (#37139)
* convert float for yarn related arguments in rope_scaling

* sort keys alphabetically

---------

Co-authored-by: ryan.agile <ryan.agile@kakaobrain.com>
2025-04-08 13:58:22 +02:00
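
A hedged sketch of both fixes in the PR — casting YaRN numbers to float and emitting keys in sorted order (the exact key list is an assumption):

```python
YARN_FLOAT_KEYS = ("factor", "attention_factor", "beta_fast", "beta_slow")


def normalize_rope_scaling(rope_scaling: dict) -> dict:
    # YaRN parameters loaded from JSON may arrive as ints (e.g. 32 instead
    # of 32.0); cast them, and sort keys alphabetically per the commit.
    return {
        key: float(value) if key in YARN_FLOAT_KEYS and value is not None else value
        for key, value in sorted(rope_scaling.items())
    }
```
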
Alex Brooks
2515a5a290
Expose blip2qformer (#37254)
* Expose blip2qformer

* Add missing args to blip2 config
2025-04-08 12:04:33 +02:00
Arthur
2da82e432d
Multiple llama4 fixes (#37353)
* update for fixes

* more fixes

* fix dynamic cache?

* style

* fix both training and generating. Eager seems alright

* dynamic does not work

* fix most cases, use_cache or not, eager or not, no default cache (ex: not training but you want to get cache states)

* should be final fixes

* fix more stuff no cat

* style

* fix

* style

* final style

* quality

* fix

* revert
2025-04-08 11:14:49 +02:00
salman
794fde7b1c
Fixing flex attention for torch=2.6.0 (#37285)
* adding compile kwarg for torch 2.6

* fixing dynamic

* addressing comment

* typo

* Update src/transformers/integrations/flex_attention.py

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-04-07 23:04:46 +02:00
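
A sketch of version-gated compilation in the spirit of "adding compile kwarg for torch 2.6" and "fixing dynamic"; the specific kwarg is an assumption:

```python
import torch
from packaging import version

_TORCH_GE_2_6 = version.parse(torch.__version__) >= version.parse("2.6.0")


def compile_flex_attention(flex_attention_fn):
    # torch 2.6 needs an extra compile kwarg that older versions do not
    # accept; `dynamic=False` stands in for it here (assumption).
    kwargs = {"dynamic": False} if _TORCH_GE_2_6 else {}
    return torch.compile(flex_attention_fn, **kwargs)
```
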
Wing Lian
b54c2f4689
more fixes for post-training llama4 (#37329)
* more fixes for post-training llama4

* use target_length instead of guarded past_key_values
2025-04-07 21:20:23 +02:00
Tugsbayasgalan Manlaibaatar
754a370bca
Remove unnecessary attr assignment (#36837)
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-04-07 20:19:54 +01:00
logesh R
31a62c2eb8
Updated Model-card for donut (#37290)
* Updated documentation for Donut model

* Update docs/source/en/model_doc/donut.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/donut.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/donut.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/donut.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Updated code suggestions

* Update docs/source/en/model_doc/donut.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Updated code suggestion to align with the AutoModel example

* Update docs/source/en/model_doc/donut.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Updated notes section and included code examples

* close hfoption block and indent

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-07 11:54:47 -07:00
Mohamed Mekkouri
f830105183
Add bnb to the list of supported quantization methods for LLama4 (#37348)
* add bnb

* style

* update

* add pre_quantized check
2025-04-07 20:34:06 +02:00
Parag Ekbote
e2b0224d94
Update Model Card for Jamba (#37152)
* Update model card for jamba

* Apply the suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Apply suggestions from code review-2

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update model page.

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update as per code review.

* Update docs/source/en/model_doc/jamba.md as per code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/jamba.md as per code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update as per code review.

* fixes

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-07 11:02:59 -07:00
Devesh Rahatekar
6cc109c354
Improvements in Gemma2 model card (#37076)
* Improved Model card for Gemma2

* Made changes in gemma2 as suggested

* Made more changes in the doc (adding image, notes, closing hfoptions)

* minor fixes

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-07 10:51:26 -07:00
Mohamed Mekkouri
8bbcdf5409
Clean up the compressed-tensors integration (#37349)
clean up
2025-04-07 19:26:45 +02:00
Ashvanth.S
3a826a45ca
Update Model card for GPT2 (#37101)
* Update Model card for gpt2

* Update link for gpt2 space

* fixes docs based on suggestions

* Add transformers-cli and quantization example for GPT-2

* Remove resources and flash attention docs and fix typos
2025-04-07 10:15:28 -07:00
Ricardo Alanis
5e855095a2
Update falcon mamba card (#37253)
* feat: edit falcon mamba card

* fix: edit statement on falconmamba arch

* Update docs/source/en/model_doc/falcon_mamba.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/falcon_mamba.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/falcon_mamba.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix: add right indent for tags

* fix: remove notes

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-07 10:12:44 -07:00