Commit Graph

18769 Commits

Author SHA1 Message Date
Poedator
7c62e69326
GPT2Model StaticCache support (#35761)
* initial GPT2 changes

* causal_mask support

* return_legacy_cache

* cleanup

* fix1

* outputs shape fixes

* gpt2 return fix

* pkv, attn fixes

* fix dual_head

* is_causal arg fix

* decision transformer updated

* style fix

* batch_size from inputs_embeds

* DecisionTransformerModel fixes

* cross-attn support + cache warning

* x-attn @decision

* EDCache proper init

* simplified logic in `if use_cache:` for GPT2Model

* @deprecate_kwarg for DecisionTr attn fwd

* @deprecate_kwarg in gpt2

* deprecation version updated to 4.51

* kwargs in gradient_checkpointing_fn

* rename next_cache to past_key_values

* attention_mask prep

* +cache_position in GPT2DoubleHeadsModel

* undo kwargs in gradient checkpointing

* moved up `if self.gradient_checkpointing`

* consistency in decision_transformer

* pastkv, cache_pos in grad_checkpt args

* rm _reorder_cache

* output_attentions streamlined

* decision_transformer consistency

* return_legacy_cache improved

* ClvpForCausalLM used for legacy cache test now

* is_causal fixed

* attn_output cleanup

* consistency @ decision_transformer

* Updated deprecation notice version to 4.52

* upd deprecation

* consistent legacy cache code in decision transformers

* next_cache -> past_kv in decision_tr

* cache support flags in decision_transf

* rm legacy cache warning

* consistency in cache init for decision transf

* no Static Cache for Decision Transformer

---------

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
2025-04-24 14:46:35 +02:00
Joao Gante
9f927c8250
[cache] fix HybridCache init when device is passed (#37718)
fix device init
2025-04-24 13:36:52 +01:00
amd-xiaoyu12
4fee320926
Expand quantized data type support for tensor parallelism (#37719)
Update tensor_parallel.py

Co-authored-by: Xiao YU <Xiao.YU@xilinx.com>
2025-04-24 14:34:32 +02:00
Yih-Dar
0f7940bb3f
Update MllamaForConditionalGenerationIntegrationTest (#37750)
* fix 1

* fix 2

* fix 3

* fix 4

* fix 5

* fix 6

* trigger CI

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-24 14:29:46 +02:00
Yih-Dar
7e6f36cd38
Skip all AriaForConditionalGenerationIntegrationTest on T4 (#37746)
* skip

* ruff

* trigger CI

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-24 14:11:56 +02:00
Zhen
0327d0f7f2
[performance_optim] define flash attention mask on NPU device directly (#37698)
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-04-24 14:06:47 +02:00
Cyril Vallez
14e28bd721
Correctly raise errors when downloading tokenizer files (#37740)
* first try

* Update tokenization_utils_base.py

* Update tokenization_utils_base.py

* standardize
2025-04-24 12:53:07 +02:00
BakerBunker
0ec0495967
Fix embeds_to_talker device in Qwen2.5-Omni (#37739)
Fix `embeds_to_talker` device

Co-authored-by: lvyuanjun.lyj <lvyuanjun.lyj@alibaba-inc.com>
2025-04-24 12:49:57 +02:00
NanoCode012
72e4844059
fix: learning_rate logged as tensor causing save issue with deepspeed (#37704)
* fix: learning_rate logged as tensor causing save issue with deepspeed

* chore: lint

---------

Co-authored-by: NanoCode012 <chanvichet@Chanvichets-MacBook-Pro.local>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-04-24 12:20:47 +02:00
Raushan Turganbay
1cfcbfcab8
[VLMs] fix flash-attention tests (#37603)
* fix one test

* fa2 ln test

* remove keys from config recursively

* fix

* fixup
2025-04-24 11:48:11 +02:00
Mohamed Mekkouri
02baa61fab
Make sure torch_is_available before using torch.distributed (#37693)
fix
2025-04-24 11:31:35 +02:00
Fanli Lin
864e9636ff
[tests] fix test_nemotron_8b_generation_sdpa (#37665)
add max_new_tokens
2025-04-24 11:28:35 +02:00
Mohamed Mekkouri
9b3bf4a206
Fix torchao doc examples (#37697)
fix

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-04-24 11:10:27 +02:00
BakerBunker
3ed56bea0f
Fix inference bugs in Qwen2.5 Omni (#37701)
* Init `SinusoidsPositionEmbedding` with float to avoid precision problem

* fix hidden_state for talker

* Update modular_qwen2_5_omni.py

* Move hidden processing out from thinker

* fixup

---------

Co-authored-by: lvyuanjun.lyj <lvyuanjun.lyj@alibaba-inc.com>
2025-04-24 10:51:44 +02:00
jiqing-feng
b7f7aa78a0
Fix Aria tests (#37444)
* update aria tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* add cuda tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* check outputs for cpu and cuda and xpu

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* check outputs for cpu and cuda and xpu

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* check outputs for cpu and cuda and xpu

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* check output for each device

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix style

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix style

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix xpu output

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* add comments and use assert list equal

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* rm pad token assign

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-04-24 10:51:29 +02:00
Daksh Maheshwari
b6d65e40b2
Add Fast Image Processor for MobileNetV1 (#37111)
* fast image processor template for MobileNetV1 via transformers-cli

* Add fast image processors and unify tests for slow/fast image processor classes

* added loop over image_processor_list for all tests and removed boilerplate comments.

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-04-23 15:55:41 -04:00
Vinh H. Pham
dea1919be4
Add Fast Image Processor for PoolFormer (#37182)
* support poolformer fast image processor

* support test for crop_pct=None

* run make style

* Apply suggestions from code review

* rename test

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-04-23 15:55:33 -04:00
Parteek
b491f128d6
Add Fast PVT Processor (#37204)
* Add Fast PVT Processor

* Update image_processing_pvt_fast.py

* Update image_processing_pvt_fast.py

* remove kwargs

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-04-23 15:55:20 -04:00
Yao Matrix
19e9079dc1
enable 4 test_trainer cases on XPU (#37645)
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
2025-04-23 21:29:42 +02:00
Yoni Gozlan
5cd6b64059
Process inputs directly in apply_chat_template in image-text-to-text pipeline (#35616)
* tokenize inputs directly in apply_chat_template

* refactor processing

* revert changes processing llava

* Update docs

* fix issue with str being iterable

* add test chat text only

* change function name
2025-04-23 13:31:33 -04:00
Joao Gante
80ea2c05c2
[tests, qwen2_5_omni] fix flaky tests (#37721)
2025-04-23 17:54:12 +01:00
Pedro Cuenca
63c6331387
Qwen 2.5 Omni: apply video defaults (#37660)
* Apply video defaults for min_pixels and max_pixels

* fps kwarg should not be a list

* Update test to account for new resizing
2025-04-23 17:08:11 +02:00
Raushan Turganbay
1e9087368c
[internvl] fix chat template (#37656)
* fix chat template

* update

* update conversion

* rename `fake_image_token` in tests
2025-04-23 16:56:36 +02:00
Matt
9ec8be56dd
TransfoXL is deprecated, don't keep it in tested examples! (#37707)
* TransfoXL is deprecated, so we should remove it from examples that get tested

* Remove the tokenizer too

* Trigger tests
2025-04-23 14:59:38 +01:00
Joao Gante
be9b0e8521
[CI] add back sacrebleu (and document why) (#37700)
* example test

* add back dep

* dev-ci

* dev-ci
2025-04-23 14:45:00 +01:00
Matt
1d7d7a942e
Add maintainers for ROCm/Intel XPU/Ascend NPU (#37678)
* Add maintainers for ROCm/Intel XPU/Ascend NPU

* Correct capitalization for usernames

* Update .github/ISSUE_TEMPLATE/bug-report.yml

Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>

* Update .github/ISSUE_TEMPLATE/bug-report.yml

Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>

* Trigger tests

---------

Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
2025-04-23 14:28:32 +01:00
Joao Gante
cc9a245e6d
[cleanup] remove /model_cards 🧹 🧹 (#37685)
rm model_cards
2025-04-23 12:45:27 +01:00
Yih-Dar
ca790303f7
Pin torch == 2.6 on PR CI docker images for now (#37695)
pin 2.6 on CircleCi images

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-23 11:47:23 +02:00
Yao Matrix
12f65ee752
enable cpu offloading for Bark on xpu (#37599)
* enable cpu offloading of bark modeling on XPU

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* remove debug print

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix review comments

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* enhance test

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* update

* add deprecate message

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* update

* update

* trigger CI

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-23 11:37:15 +02:00
Shahruk Hossain
4f9893cbbc
fix: remove classmethod from Qwen2_5OmniConfig.get_text_config (#37690)
- Since the `get_text_config` references an instance variable within
    the class (`self.thinker_config`), the `get_text_config` method
    should not be a classmethod.

  - Before this fix, users were getting the following error:

    '''
    AttributeError: type object 'Qwen2_5OmniConfig' has no attribute 'thinker_config'
    '''
2025-04-23 09:30:57 +02:00
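The failure described in the commit above can be reproduced in isolation. The sketch below is a minimal illustration, not the actual `Qwen2_5OmniConfig` code: `ThinkerConfig`, `BrokenConfig`, and `FixedConfig` are hypothetical stand-ins, and only the method name `get_text_config` and the attribute name `thinker_config` are taken from the commit message.

```python
class ThinkerConfig:
    """Hypothetical stand-in for the nested thinker sub-config."""

    def __init__(self, text_config):
        self.text_config = text_config


class BrokenConfig:
    """Mimics the pre-fix shape: a classmethod that reads instance state."""

    def __init__(self, thinker_config):
        # thinker_config is set per instance, so it lives on the object, not the class
        self.thinker_config = thinker_config

    @classmethod
    def get_text_config(cls):
        # cls is the class itself, which never had thinker_config assigned to it
        return cls.thinker_config.text_config


class FixedConfig:
    """Mimics the post-fix shape: a plain instance method."""

    def __init__(self, thinker_config):
        self.thinker_config = thinker_config

    def get_text_config(self):
        # self is the instance, so the attribute lookup succeeds
        return self.thinker_config.text_config


thinker = ThinkerConfig(text_config={"hidden_size": 8})

try:
    BrokenConfig(thinker).get_text_config()
    broken_error = None
except AttributeError as exc:
    broken_error = str(exc)

fixed_result = FixedConfig(thinker).get_text_config()

print(broken_error)
print(fixed_result)
```

Calling the classmethod on an instance still binds `cls` to the class, which is why the error message refers to a `type object` rather than to the instance.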
Vishesh-Mistry
1d9743edc2
Updated model card for mbart and mbart50 (#37619)
* new card for mbart and mbart50

* removed comment BADGES

* Update mBart overview

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix typo (MBart to mBart)

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* maybe fix typo

* update typo and combine notes

* changed notes

* changed the example sentence

* fixed grammatical error and removed some lines from notes example

* missed one word

* removed documentation resources and added some lines of example code back in notes.

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-22 12:26:47 -07:00
Jinyong Lee
fbfa1dd4db
🌐 [i18n-KO] Translated siglip.md to Korean (#37145)
* docs: ko: siglip.md

* feat: nmt draft

* fix: manual edits

* chore: Correct document title to kebab-case format

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Apply suggestions from code review

Convert unnatural language to natural Korean

Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>
2025-04-22 12:23:19 -07:00
Yao Matrix
ece79b0688
enable blip2 and emu3 cases on XPU (#37662)
* enable blip2 and emu3 modeling cases on XPU

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* remove extra new line

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* update

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-22 18:37:09 +02:00
Ken J
ca4c114dc4
Add counters for dataset classes (#37636)
* add counters for dataset classes

* fix failed code style
2025-04-22 17:30:43 +01:00
NielsRogge
d47cdae27e
[Docs] Move models to appropriate section (#37338)
* Move models

* update

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-22 18:23:14 +02:00
Deepak Sahu
dbfccd3c92
typo update in the parameter name (#37655)
See L118 and L143 for the class attribute `hidden_dim`
2025-04-22 18:14:20 +02:00
Joao Gante
de8916dde6
[docs] only build en docs in push CI (#37677)
2025-04-22 17:05:11 +01:00
Joao Gante
0f8c34b0a0
[cleanup] remove old scripts in /scripts 🧹 🧹 (#37676)
* rm old files

* not this one
2025-04-22 16:59:03 +01:00
Yao Matrix
6673081b21
enable 6 granite cases on xpu (#37569)
* enable 6 granite cases on XPU

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* make them all pass on A100

Signed-off-by: N <matrix.yao@intel.com>

* fix style

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* update

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Signed-off-by: N <matrix.yao@intel.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-22 17:55:02 +02:00
Yao Matrix
9167461a7d
enable mllama cases on xpu (#37644)
* enable mllama testing on xpu

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* more mllama cases enabling

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* make cases pass on A100

Signed-off-by: N <matrix.yao@intel.com>

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Signed-off-by: N <matrix.yao@intel.com>
2025-04-22 17:39:10 +02:00
Mohamed Mekkouri
de182ba269
Refactor bitsandbytes doc (#37668)
* doc

* torch ops

* fix

* nits

* Update docs/source/en/quantization/bitsandbytes.md

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-04-22 16:13:25 +02:00
Antonin Stefanutti
dde9b03e3b
Fix no_split_modules for Llama4 pretrained models (#37673)
2025-04-22 16:05:12 +02:00
Marc Sun
9481e9e9f1
Fix autoround docs (#37675)
* fix

* empty
2025-04-22 15:33:13 +02:00
Mohamed Mekkouri
38c406844e
Fixing quantization tests (#37650)
* fix

* style

* add capability check
2025-04-22 13:59:57 +02:00
Wenhua Cheng
b3492ff9f7
Add AutoRound quantization support (#37393)
* add auto-round support

* Update src/transformers/quantizers/auto.py

Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>

* fix style issue

Signed-off-by: wenhuach <wenhuach87@gmail.com>

* tiny change

* tiny change

* refine ut and doc

* revert unnecessary change

* tiny change

* try to fix style issue

* try to fix style issue

* try to fix style issue

* try to fix style issue

* try to fix style issue

* try to fix style issue

* try to fix style issue

* fix doc issue

* Update tests/quantization/autoround/test_auto_round.py

* fix comments

* Update tests/quantization/autoround/test_auto_round.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update tests/quantization/autoround/test_auto_round.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* update doc

* Update src/transformers/quantizers/quantizer_auto_round.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* update

* update

* fix

* try to fix style issue

* Update src/transformers/quantizers/auto.py

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

* Update docs/source/en/quantization/auto_round.md

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

* Update docs/source/en/quantization/auto_round.md

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

* Update docs/source/en/quantization/auto_round.md

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

* update

* fix style issue

* update doc

* update doc

* Refine the doc

* refine doc

* revert one change

* set sym to True by default

* Enhance the unit test's robustness.

* update

* add torch dtype

* tiny change

* add awq convert test

* fix typo

* update

* fix packing format issue

* use one gpu

---------

Signed-off-by: wenhuach <wenhuach87@gmail.com>
Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Shen, Haihao <haihao.shen@intel.com>
2025-04-22 13:56:54 +02:00
Cyril Vallez
9608908639
Correct warm-up with fp8 (#37670)
* start clean warmup for quantizers

* style

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-04-22 13:12:49 +02:00
Cyril Vallez
6614209b96
Fix duplicated weights in fp8 quantization (#37667)
* fix fp8

* Update quantizer_finegrained_fp8.py

* fix circular import

* Update quantizer_finegrained_fp8.py
2025-04-22 13:12:27 +02:00
Raushan Turganbay
dcf6df5b0d
[qwen-omni] fix training (#37517)
* fix

* add text config

* fixup

* fix docs
2025-04-22 12:36:07 +02:00
Pavel Iakubovskii
9167fadab9
Introduce GradientCheckpointingLayer (#37223)
* GradientCheckpointingLayer

* trigger

* Move GC layer to a separate file

* Update import

* Expose and document GC layer

* Fix dummy

* Apply to llama-based models

* Update modulars

* Update a few more models for consistency

* Update glm4

* Update Janus
2025-04-22 11:33:31 +01:00
Manuel de Prada Corral
413f9bbf80
Fixes #37219 : RecurrentGemma crashes for inputs longer than sliding window length (#37613)
* fix: RecurrentGemma crashes during inference for inputs longer than sliding window width

* fix recurrentgemma tests; add long test bigger than context window
2025-04-22 12:21:16 +02:00