Commit Graph

19219 Commits

Author SHA1 Message Date
Yih-Dar
de4cf5a38e
Fix blip2 tests (#38510)
Some checks are pending
Self-hosted runner (benchmark) / Benchmark (aws-g5-4xlarge-cache) (push) Waiting to run
Build documentation / build (push) Waiting to run
New model PR merged notification / Notify new model (push) Waiting to run
Slow tests on important models (on Push - A10) / Get all modified files (push) Waiting to run
Slow tests on important models (on Push - A10) / Slow & FA2 tests (push) Blocked by required conditions
Self-hosted runner (push-caller) / Check if setup was changed (push) Waiting to run
Self-hosted runner (push-caller) / build-docker-containers (push) Blocked by required conditions
Self-hosted runner (push-caller) / Trigger Push CI (push) Blocked by required conditions
Secret Leaks / trufflehog (push) Waiting to run
Update Transformers metadata / build_and_package (push) Waiting to run
* fix 1: not sure

* fix 2: _supports_flex_attn = False

* fix 3: embedding_output = self.layernorm(query_embeds.to(self.layernorm.weight.dtype))

* fix 4: query_embeds = query_embeds.to(self.layernorm.weight.dtype)

* fix 5: text_embeds = text_embeds.to(dtype=torch.float16)

* fix 5: question_embeds.to(dtype=torch.float16)

* fix 6: text_embeds = text_embeds.to(dtype=self.itm_head.weight.dtype)

* fix 7: image_embeds and question_embeds

* fix 8: fix other 2 fp16 tests

* fix 9: fix T5 OOM

* fix 10: fix T5 OOM

* fix 11: fix T5

* fix 11: fix T5 beam

* fix 12: _supports_sdpa=False

* fix 12: style and expect

* revert

* revert

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-02 22:46:35 +02:00
Yih-Dar
ccc859620a
Fix Gemma2IntegrationTest (#38492)
* fix

* fix

* skip-ci

* skip-ci

* skip-ci

* skip-ci

* skip-ci

* skip-ci

* skip-ci

* skip-ci

* skip-ci

* skip-ci

* skip-ci

* update

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-02 22:45:09 +02:00
Yaswanth Gali
1094dd34f7
Remove type annotation in Siglip Attention Module (#38503)
* Remove type annotation

* remove print statement
2025-06-02 17:51:07 +02:00
Lysandre Debut
afb35a10ed
Num parameters in model.safetensors.index.json (#38531)
Some checks are pending
Self-hosted runner (benchmark) / Benchmark (aws-g5-4xlarge-cache) (push) Waiting to run
Build documentation / build (push) Waiting to run
New model PR merged notification / Notify new model (push) Waiting to run
Slow tests on important models (on Push - A10) / Get all modified files (push) Waiting to run
Slow tests on important models (on Push - A10) / Slow & FA2 tests (push) Blocked by required conditions
Self-hosted runner (push-caller) / Check if setup was changed (push) Waiting to run
Self-hosted runner (push-caller) / build-docker-containers (push) Blocked by required conditions
Self-hosted runner (push-caller) / Trigger Push CI (push) Blocked by required conditions
Secret Leaks / trufflehog (push) Waiting to run
Update Transformers metadata / build_and_package (push) Waiting to run
Num parameters in index.json
2025-06-02 17:16:31 +02:00
Yiding Jia
cceab972ba
[flax/mistral] support sliding_window: null in config (#37402)
flax/mistral: Allow sliding_window to be set to none
2025-06-02 16:45:02 +02:00
Marc Sun
1a25fd2f6d
Fix amp deprecation issue (#38100)
apex amp is deprecated
2025-06-02 16:15:41 +02:00
Ita Zaporozhets
05ad826002
remove unhandled parameter (#38145) 2025-06-02 15:57:32 +02:00
Tony Wu
c72ba69441
Add ColQwen2 to 🤗 transformers (#35778)
* feat: add colqwen2 (wip)

* tests: fix test_attention_outputs

* tests: reduce hidden size to accelerate tests

* tests: fix `test_attention_outputs` 🥳

* fix: fix wrong parent class for `ColQwen2ForRetrievalOutput`

* fix: minor typing and style changes

* chore: run `make style`

* feat: remove redundant `max_num_visual_tokens` attribute in `ColQwen2Processor`

* tests: tweak comments

* style: apply ruff formatter

* feat: move default values for `visual_prompt_prefix` and `query_prefix`

* docs: update ColQwen2 model card

* docs: tweak model cards

* docs: add required example config checkpoint

* tests: update expected scores in integration test

* docs: tweak quickstart snippets

* fix: address PR comments

* tests: fix colqwen2 tests + tweak comment in colpali test

* tests: unskip useful tests

* fix: fix bug when `visual_prompt_prefix` or `query_prefix` is an empty string

* fix: fix ColPali outputs when `return_dict == False`

* fix: fix issue with PaliGemma output not being a dict

* docs: set default dtype to bfloat16 in quickstart snippets

* fix: fix error when `return_dict=False` in ColPali and ColQwen2

* tests: fix special tokens not being replaced in input_ids

* style: fix lint

* fix: `ColQwen2Processor`'s `padding_side` is now set from `processor_config.json`

* fix: remove unused `padding_side` in ColQwen2 model

* docs: update ColQwen2's model doc

* fix: fix harcoded vlm backbone class in ColQwen2Config

* fix: remove `padding_side` from ColQwen2Processor as should fed from kwargs

* docs: fix typo in model docstring

* docs: add illuin mention in model docs

* fix: let `padding_size` be handled by `tokenizer_config.json`

* docs: add colpali reference url in colqwen2's model doc

* docs: add Hf mention in model docs

* docs: add late interaction mention in model docs

* docs: tweak colqwen2 model doc

* docs: update reference checkpoint for ColPali to v1.3

* docs: simplify quickstart snippets

* docs: remove redundant `.eval()`

* refactor:  use `can_return_tuple` decorator for ColPali and ColQwen2

* docs: fix copyright date

* docs: add missing copyright in tests

* fix: raise error when `initializer_range` is not in config

* docs: remove redundant `.eval()` in colpali doc

* fix: fix `get_text_config` now that Qwen2VL has a proper `text_config` attribute

See https://github.com/huggingface/transformers/pull/37268 for details about changes in Qwen2VL's config.

* fix: add missing `initializer_range` attribute in `ColQwen2Config`

* fix: use `get_text_config` in `resize_token_embeddings`

* update colwen2 with auto_docstring

* docs: fix wrong copyright year

* chore: remove `raise` as `initializer_range` has a default value in `ColQwen2Config`

* refactor: merge `inner_forward` into `forward`

* Refactor colqwen2 after refactoring of qwen2VL, use modular for modeling code

* protect torch import in modular to protect in processing

* protect torch import in modular to protect in processing

* tests: fix hf model path in ColQwen2 integration test

* docs: clarify `attn_implementation` and add comments

* docs: add fallback snippet for using offline PIL dummy images

* docs: temporarily revert attn_implementation to `None` while sdpa is not fixed

* docs: tweaks in colpali/colqwen2 quick start snippets

* fix: add missing flags to enable SDPA/Flex Attention in ColQwen2 model

* fix: add missing changes in modular file

* fix modeling tests

---------

Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
2025-06-02 12:58:01 +00:00
Joao Gante
beaed8ce01
[generate] move SinkCache to a custom_generate repo (#38399)
remove sink cache
2025-06-02 12:13:30 +02:00
Joao Gante
fe5bfaa4b5
[generate] add soft deprecations on custom generation methods (#38406)
soft deprecations
2025-06-02 12:11:46 +02:00
mohammed benyamna
a75b9ffb5c
Update Loss Functions to Accept Tensor num_items_in_batch (#38029)
* Update Loss Functions to Accept Tensor num_items_in_batch

* Fix device mismatch by moving num_items_in_batch to loss device in fixed_cross_entropy

* fix the ruff check

* delete the unused if stat

* fix the type problem
2025-06-02 11:31:44 +02:00
Rémi Ouazan
493cf1554b
[seamless_m4t] Skip some tests when speech is not available (#38430)
* Added the require_speech decorator

* Added require_speecj to some seamless_m4t tests

* Changed skip message
2025-06-02 09:17:28 +00:00
ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟
64d14ef28d
Fix setting FLASH_ATTENTION_DETERMINISTIC after importing (#37185)
transformers.enable_full_determinism enables deterministic
flash attention using `FLASH_ATTENTION_DETERMINISTIC`
800510c67b/src/transformers/trainer_utils.py (L79)

However, current checks use a global variable `deterministic_g`,
which will do the environment variable check as soon as importing,
this will cause issues as users can call
`transformers.enable_full_determinism` after
`transformers.modeling_flash_attention_utils` is imported. This
behavior is introduced in
https://github.com/huggingface/transformers/pull/33932/files#r1806668579
to fix the graph break.

As a result, this PR implement fixes by delaying the environment variable
check to the first time when `_flash_attention_forward` is executed, so
that we can fix this issue and we won't introduce a graph break.

Signed-off-by: Hollow Man <hollowman@opensuse.org>
2025-06-02 11:08:20 +02:00
Yuanyuan Chen
fde1120b6c
Remove deprecated use_flash_attention_2 parameter (#37131)
Signed-off-by: cyy <cyyever@outlook.com>
2025-06-02 11:06:25 +02:00
Fanli Lin
51d732709e
[docs] add xpu environment variable for gpu selection (#38194)
Some checks failed
Self-hosted runner (benchmark) / Benchmark (aws-g5-4xlarge-cache) (push) Has been cancelled
Build documentation / build (push) Has been cancelled
Slow tests on important models (on Push - A10) / Get all modified files (push) Has been cancelled
Self-hosted runner (push-caller) / Check if setup was changed (push) Has been cancelled
Secret Leaks / trufflehog (push) Has been cancelled
Update Transformers metadata / build_and_package (push) Has been cancelled
Slow tests on important models (on Push - A10) / Slow & FA2 tests (push) Has been cancelled
Self-hosted runner (push-caller) / build-docker-containers (push) Has been cancelled
Self-hosted runner (push-caller) / Trigger Push CI (push) Has been cancelled
* squash commits

* rename gpu

* rename accelerator

* change _toctree.yml

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: sdp <sdp@a4bf01943ff7.jf.intel.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-05-30 16:05:07 +00:00
Marc Sun
c7f2b79dd8
protect dtensor import (#38496)
protect
2025-05-30 17:36:00 +02:00
Marc Sun
051a8acc9a
Align TP check (#38328)
align tp check
2025-05-30 17:15:39 +02:00
M Saqlain
e0545ef0b8
[Tests] Reduced model size for albert-test model (#38480)
* Reduced model size for albert-test model

* Run checks

* Removed test_save_load

* Removed test skipping functions
2025-05-30 14:22:32 +00:00
dependabot[bot]
f962c862ff
Bump torch from 2.2.0 to 2.6.0 in /examples/flax/vision (#37618)
Some checks failed
Self-hosted runner (benchmark) / Benchmark (aws-g5-4xlarge-cache) (push) Waiting to run
Build documentation / build (push) Waiting to run
Slow tests on important models (on Push - A10) / Get all modified files (push) Waiting to run
Slow tests on important models (on Push - A10) / Slow & FA2 tests (push) Blocked by required conditions
Self-hosted runner (push-caller) / Check if setup was changed (push) Waiting to run
Self-hosted runner (push-caller) / build-docker-containers (push) Blocked by required conditions
Self-hosted runner (push-caller) / Trigger Push CI (push) Blocked by required conditions
Secret Leaks / trufflehog (push) Waiting to run
Update Transformers metadata / build_and_package (push) Waiting to run
New model PR merged notification / Notify new model (push) Has been cancelled
Bumps [torch](https://github.com/pytorch/pytorch) from 2.2.0 to 2.6.0.
- [Release notes](https://github.com/pytorch/pytorch/releases)
- [Changelog](https://github.com/pytorch/pytorch/blob/main/RELEASE.md)
- [Commits](https://github.com/pytorch/pytorch/compare/v2.2.0...v2.6.0)

---
updated-dependencies:
- dependency-name: torch
  dependency-version: 2.6.0
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-05-30 14:04:52 +01:00
islemyakoubi
98568d1e25
Fix incorrect bbox_embed initialization when decoder_bbox_embed_share=False in GroundingDINO (#38238)
* A shallow copy in groundingdino
Fixes #37333

* Supprimer une ligne vide dans la classe GroundingDinoForObjectDetection

* Translate comments in the GroundingDinoForObjectDetection class from French to English
2025-05-30 15:02:18 +02:00
Winston Castorp
d0fccbf7ef
Fix convert_internvl_weights_to_hf.py to support local paths (#38264)
fix(internvl): add local path support to convert_internvl_weights_to_hf.py
2025-05-30 14:56:32 +02:00
Arthur
858ce6879a
make it go brrrr (#38409)
* make it go brrrr

* date time

* update

* fix

* up

* uppp

* up

* no number i

* udpate

* fix

* [paligemma] fix processor with suffix (#38365)

fix pg processor

* [video utils] group and reorder by number of frames (#38374)

fix

* Fix convert to original state dict for VLMs (#38385)

* fix convert to original state dict

* fix

* lint

* Update modeling_utils.py

* update

* warn

* no verbose

* fginal

* ouft

* style

---------

Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
2025-05-30 11:19:42 +02:00
Luc Georges
ab5067e7fd
fix: handle no scheduler passed by user (#38407) 2025-05-30 11:00:44 +02:00
XING, Zhenghao
42ef218b58
[Qwen2.5-Omni] Fix dtype of cos,sin when used with flash attention (#38453)
Some checks are pending
Self-hosted runner (benchmark) / Benchmark (aws-g5-4xlarge-cache) (push) Waiting to run
Build documentation / build (push) Waiting to run
New model PR merged notification / Notify new model (push) Waiting to run
Slow tests on important models (on Push - A10) / Get all modified files (push) Waiting to run
Slow tests on important models (on Push - A10) / Slow & FA2 tests (push) Blocked by required conditions
Self-hosted runner (push-caller) / Check if setup was changed (push) Waiting to run
Self-hosted runner (push-caller) / build-docker-containers (push) Blocked by required conditions
Self-hosted runner (push-caller) / Trigger Push CI (push) Blocked by required conditions
Secret Leaks / trufflehog (push) Waiting to run
Update Transformers metadata / build_and_package (push) Waiting to run
* Fix dtype of cos,sin when used with flash attention

* Fix dtype of cos,sin when used with flash attention
2025-05-29 18:24:40 +00:00
Yih-Dar
81cff7ad34
Fix Gemma3IntegrationTest (#38471)
* check

* check

* check

* check

* check

* check

* check

* test style bot

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-29 16:51:12 +02:00
Lukas Geiger
e508965df7
Cleanup BatchFeature and BatchEncoding (#38459)
* Use dict comprehension to create dict

* Fix type annotation

Union[Any] doesn't really make any sense

* Remove methods that are already implemented in the `UserDict` parent
class
2025-05-29 14:13:43 +00:00
Rahul
8e5cefcb1e
Fix TypeError in save_pretrained error handling (fixes #38422) (#38449) 2025-05-29 13:58:16 +00:00
Raushan Turganbay
ad9dd3d17b
🔴 [VLM] modeling updates (#38317)
Some checks are pending
Self-hosted runner (benchmark) / Benchmark (aws-g5-4xlarge-cache) (push) Waiting to run
Build documentation / build (push) Waiting to run
New model PR merged notification / Notify new model (push) Waiting to run
Slow tests on important models (on Push - A10) / Get all modified files (push) Waiting to run
Slow tests on important models (on Push - A10) / Slow & FA2 tests (push) Blocked by required conditions
Self-hosted runner (push-caller) / Check if setup was changed (push) Waiting to run
Self-hosted runner (push-caller) / build-docker-containers (push) Blocked by required conditions
Self-hosted runner (push-caller) / Trigger Push CI (push) Blocked by required conditions
Secret Leaks / trufflehog (push) Waiting to run
Update Transformers metadata / build_and_package (push) Waiting to run
* updates

* fixup

* fix tests

* fix test

* fix

* let it be here for now, till monday

* two more fixes

* persimmon

* fixup

* fix

* fixup

* make sure fuyu runs now that LM has new attn API

* fixup + tests

* qwen vl uses new mask interface as well

* qwen image features format

* update

* remove image_sizes

* address comments

* i am dumb...
2025-05-29 11:08:23 +00:00
Yaswanth Gali
a6f7acb603
[Tests] Clean up test cases for few models (#38315)
* Update tests

* revert aria change

* too slow hence revert
2025-05-29 08:21:28 +00:00
Luc Georges
8010f3cf61
feat: add cache retention for requests (#38446)
Some checks are pending
Self-hosted runner (benchmark) / Benchmark (aws-g5-4xlarge-cache) (push) Waiting to run
Build documentation / build (push) Waiting to run
New model PR merged notification / Notify new model (push) Waiting to run
Slow tests on important models (on Push - A10) / Get all modified files (push) Waiting to run
Slow tests on important models (on Push - A10) / Slow & FA2 tests (push) Blocked by required conditions
Self-hosted runner (push-caller) / Check if setup was changed (push) Waiting to run
Self-hosted runner (push-caller) / build-docker-containers (push) Blocked by required conditions
Self-hosted runner (push-caller) / Trigger Push CI (push) Blocked by required conditions
Secret Leaks / trufflehog (push) Waiting to run
Update Transformers metadata / build_and_package (push) Waiting to run
* feat: add cache retention for requests

* fix: propagate `manual_eviction` param & refactor `finish_request`

`finish_request` now only takes `request_id: str` as an input rather
than the full `RequestState`, which was not needed and simplifies
calling from `ContinuousBatchingManager::evict_request_from_cache`

* refactor: pop req from `active_requests`

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-05-28 18:15:10 +00:00
Yih-Dar
66da700145
Fix GLM4 checkpoints (#38412)
* fix

* fix

* fix

* fix

* fix

* fix

* test style bot

* Apply style fixes

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-05-28 16:40:08 +00:00
Avasam
2872e8bac5
Merge type hints from microsoft/python-type-stubs (post dropping support for Python 3.8) (#38335)
* Merge type hints from microsoft/python-type-stubs (post Python 3.8)

* Remove mention of pylance

* Resolved conflict

* Merge type hints from microsoft/python-type-stubs (post Python 3.8)

* Remove mention of pylance

* Resolved conflict

* Update src/transformers/models/auto/configuration_auto.py

Co-authored-by: Avasam <samuel.06@hotmail.com>

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2025-05-28 16:21:40 +00:00
Yuanzhou Cai
942c60956f
Model card for mobilenet v1 and v2 (#37948)
* doc: #36979

* doc: update hfoptions

* add model checkpoints links

* add model checkpoints links

* update example output

* update style #36979

* add pipeline tags

* improve comments

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* apply suggested changes

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-05-28 09:20:19 -07:00
Jiwook Han
9a8510572b
Updated the model card for ViTMAE (#38302)
* Update vit_mae.md

* badge float:right

* Update docs/source/en/model_doc/vit_mae.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/vit_mae.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/vit_mae.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/vit_mae.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/vit_mae.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/vit_mae.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/vit_mae.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/vit_mae.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/vit_mae.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update model_doc/vit_mae.md

* fix

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-05-28 09:19:43 -07:00
Vanshu
c9fcbd5bf9
Updated the Model docs - for the ALIGN model (#38072)
* Updated the Model docs - for the ALIGN model

* Update docs/source/en/model_doc/align.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/align.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Updated align.md

* Update docs/source/en/model_doc/align.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/align.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update align.md

* fix

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-05-28 09:19:09 -07:00
Yoni Gozlan
cba94e9272
Fix handling of slow/fast image processors in image_processing_auto.py (#38161)
Fix wrong error when torchvision is not installed
2025-05-28 16:00:23 +00:00
Yoni Gozlan
21b10d9aa4
Fix from_args_and_dict ProcessorMixin (#38296)
* fix-from-args-and-dict-processormixin

* change used_kwargs to valid_kwargs

* remove manual valid_kwargs

* fix copies

* fix modular aria
2025-05-28 11:46:33 -04:00
Matt
f844733568
Fix MoE gradient test (#38438) 2025-05-28 16:44:20 +01:00
Matt
0ed6f7e6b4
Remove redundant test_sdpa_equivalence test (#38436)
* Remove redundant test

* make fixup
2025-05-28 17:22:25 +02:00
Yih-Dar
51e0fac29f
Trigger doc-builder job after style bot (#38398)
* update

* update

* update

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-28 17:15:34 +02:00
Yoni Gozlan
c24d18bbae
Fix convert weights for InternVL (#38233)
Fix internvl convert weights
2025-05-28 11:14:56 -04:00
Matthew Ngan
8850427242
Fix typo in tokenization_utils_base.py docstring (#38418)
Fix typo in tokenization_utils_base.py
2025-05-28 14:52:10 +00:00
Peter St. John
bab40c6838
[core] support tensor-valued _extra_state values in from_pretrained (#38155)
Support tensor-valued _extra_state values

TransformerEngine uses the pytorch get/set_extra_state API to store FP8
layer config information as bytes Tensor in the _extra_state entry in
the state dict. With recent changes to from_pretrained, this
functionality has broken and loading a model that uses this API doesn't
appear to work. This PR fixes the save/load pretrained functions for
extra state entries that use a pytorch tensor, and adds a (currently
x-failing) test for a dictionary extra state.

Signed-off-by: Peter St. John <pstjohn@nvidia.com>
2025-05-28 15:38:42 +02:00
Anton Vlasjuk
badc71b9f6
🔴[Attention] Attention refactor for Whisper-based models (#38235)
Some checks failed
Self-hosted runner (benchmark) / Benchmark (aws-g5-4xlarge-cache) (push) Waiting to run
Build documentation / build (push) Waiting to run
New model PR merged notification / Notify new model (push) Waiting to run
Slow tests on important models (on Push - A10) / Get all modified files (push) Waiting to run
Slow tests on important models (on Push - A10) / Slow & FA2 tests (push) Blocked by required conditions
Self-hosted runner (push-caller) / Check if setup was changed (push) Waiting to run
Self-hosted runner (push-caller) / build-docker-containers (push) Blocked by required conditions
Self-hosted runner (push-caller) / Trigger Push CI (push) Blocked by required conditions
Update Transformers metadata / build_and_package (push) Waiting to run
Self-hosted runner (AMD mi250 scheduled CI caller) / Model CI (push) Has been cancelled
Self-hosted runner (AMD mi250 scheduled CI caller) / Torch pipeline CI (push) Has been cancelled
Self-hosted runner (AMD mi250 scheduled CI caller) / Example CI (push) Has been cancelled
Self-hosted runner (AMD mi250 scheduled CI caller) / DeepSpeed CI (push) Has been cancelled
Self-hosted runner scale set (AMD mi300 scheduled CI caller) / Model CI (push) Has been cancelled
Self-hosted runner scale set (AMD mi300 scheduled CI caller) / Torch pipeline CI (push) Has been cancelled
Self-hosted runner scale set (AMD mi300 scheduled CI caller) / Example CI (push) Has been cancelled
Self-hosted runner scale set (AMD mi300 scheduled CI caller) / DeepSpeed CI (push) Has been cancelled
Secret Leaks / trufflehog (push) Has been cancelled
* start refactoring whisper

* revert for now

* first step

* carry over attn fixes

* check if this works

* whisper has an off by one somewhere - cutting mask in any interface

* make it based on interface

* remove some tests that were skipped but now work

* some fixes for whisper tests

* interface changes

* change the order of fix

* some attention adjustments for eager + TP

* fix scaling

* mask changes

* why does whisper contain those extra seq lens?

* fix from config for fa2 as input_ids is invalid

* fix another test

* another fix

* disable flex attn due to compile issues

* copies and refactor for qwen audio since it somewhat relies on whisper

* fix scaling and smaller things

* retrigger

* new new interface version + more fixups

* adjust qwen

* add comment

* forgot this one

* change copies as whisper cuts on the mask

* add guard

* add flex attention

* switch to new mask function + add skips for torchscript

* remove old api with cache position

* last changes?

* trigger ci
2025-05-28 13:32:38 +02:00
JJJYmmm
565a0052ed
make Llama4TextMoe forward more readable (#37529)
* update forward of Llama4TextMoe

* remove redudant transpose

* fix formatting

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-05-28 11:54:45 +02:00
Yih-Dar
defeb04299
Fix CircleCI not triggered when PR is opened from a branch of huggingface/transformers (#38413)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-28 11:25:43 +02:00
Cyril Vallez
593276fe1e
Update error when using additional and/or masks (#38429)
update error
2025-05-28 11:08:49 +02:00
ivarflakstad
3aab6e95cb
Disable mi210 scheduled CI (#38411) 2025-05-28 10:35:41 +02:00
Yao Matrix
fb82a98717
enable large_gpu and torchao cases on XPU (#38355)
* cohere2 done

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* enable torchao cases on XPU

Signed-off-by: Matrix YAO <matrix.yao@intel.com>

* fix

Signed-off-by: Matrix YAO <matrix.yao@intel.com>

* fix

Signed-off-by: Matrix YAO <matrix.yao@intel.com>

* fix

Signed-off-by: Matrix YAO <matrix.yao@intel.com>

* rename

Signed-off-by: Matrix YAO <matrix.yao@intel.com>

* fix

Signed-off-by: Matrix YAO <matrix.yao@intel.com>

* fix comments

Signed-off-by: Matrix YAO <matrix.yao@intel.com>

---------

Signed-off-by: Matrix Yao <matrix.yao@intel.com>
Signed-off-by: Matrix YAO <matrix.yao@intel.com>
2025-05-28 10:30:16 +02:00
Yih-Dar
cea254c909
Update CsmForConditionalGenerationIntegrationTest (#38424)
* require_read_token

* ruff

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-28 10:20:43 +02:00