Commit Graph

1679 Commits

Author SHA1 Message Date
Cyril Vallez
ab1afd56f5
Fix some tests (#35682)
* cohere tests

* glm tests

* cohere2 model name

* create decorator

* update

* fix cohere2 completions

* style

* style

* style

* add cuda in comments
2025-01-17 12:10:43 +00:00
Joao Gante
94af1c0aa2
[generate] return Cache object even if passed in a legacy format (#35673)
* generate returns a Cache object by default

* fix tests

* fix test for encoder-decoder models
2025-01-16 17:06:24 +00:00
Raushan Turganbay
09d5f76274
Clean-up composite configs (#34603)
* remove manual assignment tie-word-embeddings

* remove another unused attribute

* fix tests

* fix tests

* remove unnecessary overwrites

* fix

* decoder=True

* clean pix2struct

* run-all

* forgot `_tied_weights_keys` when adding Emu3

* also Aria + fix-copies

* and clean aria
2025-01-15 10:04:07 +01:00
Arthur
c23a1c1932
Add-helium (#35669)
* Add the helium model.

* Add a missing helium.

* And add another missing helium.

* Use float for the rmsnorm mul.

* Add the Helium tokenizer converter.

* Add the pad token as suggested by Arthur.

* Update the RMSNorm + some other tweaks.

* Fix more rebase issues.

* fix copies and style

* fixes and add helium.md

* add missing tests

* update the backlink

* oups

* style

* update init, and expected results

* small fixes

* match test outputs

* style fixup, fix doc builder

* add dummies and we should be good to go!

* update sdpa and fa2 documentation

---------

Co-authored-by: laurent <laurent.mazare@gmail.com>
2025-01-13 18:41:15 +01:00
Fanli Lin
2fa876d2d8
[tests] make cuda-only tests device-agnostic (#35607)
* initial commit

* remove unrelated files

* further remove

* Update test_trainer.py

* fix style
2025-01-13 14:48:39 +01:00
Arthur
e6f9b03464
[Compile] Only test compiling model forward pass (#35658)
* rename test to only compile forward!

* style emu
2025-01-13 13:43:29 +01:00
Raushan Turganbay
84a6789145
Enable different torch dtype in sub models (#34873)
* fix

* fix test

* add tests

* add more tests

* fix tests

* supposed to be a torch.dtype test

* handle BC and make fp32 default
2025-01-13 13:42:08 +01:00
Yih-Dar
1e3c6c1f7d
Skip MobileNetV1ModelTest::test_batching_equivalence for now (#35614)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-10 18:32:36 +01:00
Raushan Turganbay
52e1f87c7d
[WIP] Emu3: add model (#33770)
* model can convert to HF and be loaded back

* nit

* works in single batch generation but hallucinates

* use the image tokens

* add image generation

* now it works

* add tests

* update

* add modular but it doesn't work for porting docstring :(

* skip some tests

* add slow tests

* modular removed the import?

* guess this works

* update

* update

* fix copies

* fix test

* fix copies

* update

* docs

* fix tests

* last fix tests?

* pls

* repo consistency

* more style

* style

* remove file

* address comments

* tiny bits

* update after the new modular

* fix tests

* add one more cond in check attributes

* decompose down/up/mid blocks

* allow static cache generation in VLMs

* nit

* fix copies

* Update docs/source/en/model_doc/emu3.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/emu3.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/emu3.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/emu3.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/emu3.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/emu3.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/emu3.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/emu3.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix VAE upsampling

* Update src/transformers/models/emu3/modular_emu3.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* address comments

* state overwritten stuff explicitly

* fix copies

* add the flag for flex attn

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-01-10 12:23:00 +01:00
Raushan Turganbay
e0646f3dce
Chat template: return vectorized output in processors (#34275)
* update chat template

* style

* fix tests

* Update src/transformers/image_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* typehints + docs

* fix tests

* remove unnecessary warnings

* forgot code style :(

* allow users to pass backend and num frames

* Update docs/source/en/chat_templating.md

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/image_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/image_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/image_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/image_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/image_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/image_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/processing_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* typo fix

* style

* address comments

* align with "pipeline" template

* update docs

* update docs

* unpack for all kwargs?

* wrong conflict resolution while rebasing

* tmp

* update docs

* Update docs/source/en/chat_templating.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_templating.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_templating.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_templating.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-01-10 11:05:29 +01:00
eustlb
5f087d1335
Add Moonshine (#34784)
* config draft

* full encoder forward

* full decoder forward

* fix sdpa and FA2

* fix sdpa and FA2

* moonshine model

* moonshine model forward

* fix attention with past_key_values

* add MoonshineForConditionalGeneration

* fix cache handling and causality for cross attention

* no causal attention mask for the encoder

* model addition (imports etc)

* small nit

* nits

* Update src/transformers/models/moonshine/convert_usefulsensors_to_hf.py

Co-authored-by: Joshua Lochner <admin@xenova.com>

* add rope_theta

* nits

* model doc

* Update src/transformers/models/auto/configuration_auto.py

Co-authored-by: Joshua Lochner <admin@xenova.com>

* imports

* add MODEL_FOR_SPEECH_SEQ_2_SEQ_MAPPING_NAMES

* updates modular

* make

* make fix-copies

* ruff check examples fix

* fix check_modular_conversion

* nit

* nits

* nits

* copied from -> imports

* imports fix

* integrate attention refactor

* modular edge case

* remove encoder

* convolutions params in config

* run modular_model_converter

* make

* Update docs/source/en/model_doc/moonshine.md

Co-authored-by: Joshua Lochner <admin@xenova.com>

* MoonshineModelTest

* correct typo

* make style

* integration tests

* make

* modular convert

* name conversion update (up_proj -> fc1 etc)

* update config

* update MLP

* update attention

* update encoder layer

* update decoder layer

* update convolutions parameters

* update encoder

* remove INPUTS_DOCSTRING

* update decoder

* update conditional generation

* update pretrained model

* imports

* modular converted

* update doc

* fix

* typo

* update doc

* update license

* update init

* split config in file

* two classes for MLP

* attention from GLM

* from GlmRotaryEmbedding

* split MLP

* apply arthur's review suggestions

* apply arthur's review suggestions

* apply arthur's review suggestions

* auto feature extractor

* convert modular

* fix + make

* convert modular

* make

* unsplit config

* use correct checkpoint

* wrap generate

* update tests

* typos

* make

* typo

* update doc

---------

Co-authored-by: Joshua Lochner <admin@xenova.com>
2025-01-10 11:00:54 +01:00
Tom Aarsen
6b73ee8905
ModernBert: reuse GemmaRotaryEmbedding via modular + Integration tests (#35459)
* Introduce 5 integration tests for the 4 model classes + torch export

* ModernBert: reuse GemmaRotaryEmbedding via modular

* Revert #35589, keep rope_kwargs; rely on them in modular_modernbert

* Revert "Revert #35589, keep rope_kwargs; rely on them in modular_modernbert"

This reverts commit 11b44b9ee8.

* Don't set rope_kwargs; override 'self.rope_init_fn' call instead
2025-01-10 10:25:10 +01:00
Cyril Vallez
3a4ae6eace
Refactor/fix Cohere2 (#35594)
* refactor/fix cohere2

* add kwargs

* tests

* remove func and import it
2025-01-09 17:54:57 +01:00
Yih-Dar
82dd6c14bb
Fix flaky SwitchTransformersModelTest::test_training_gradient (#35587)
* fix

* Update tests/models/switch_transformers/test_modeling_switch_transformers.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-01-09 15:36:22 +01:00
Jack Morris
832c6191ed
Add inputs_embeds param to ModernBertModel (#35373)
* update modular_modernbert -- add inputs_embeds param to ModernBertModel

* Fix implementation issues; extend to other classes; docstring

First of all, `inputs_embeds` shouldn't fully replace `self.embeddings(input_ids)`, because this call also does layer normalization and dropout. So both input_ids and inputs_embeds are now passed to ModernBertEmbeddings, much like how BertEmbeddings is implemented.

I also added `inputs_embeds` to the docstring, and propagated the changes to the other model classes.

I also introduced an error if input_ids and inputs_embeds are both provided or both missing.

Lastly, I fixed an issue with device being based solely on input_ids with attention_mask.

* Propagate inputs_embeds to ModernBertForMaskedLM correctly

Also reintroduce inputs_embeds test

---------

Co-authored-by: Tom Aarsen <Cubiegamedev@gmail.com>
2025-01-09 14:17:26 +01:00
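The input check described in the commit above (#35373), which raises an error when `input_ids` and `inputs_embeds` are both given or both missing, can be sketched as follows. This is a minimal illustration and not the code from the pull request: the helper name `resolve_model_inputs`, its return value, and the error message are assumptions made for the example, and plain Python sequences stand in for tensors.

```python
from typing import Optional, Sequence


def resolve_model_inputs(
    input_ids: Optional[Sequence[int]] = None,
    inputs_embeds: Optional[Sequence[Sequence[float]]] = None,
) -> str:
    """Return the name of the input that was supplied.

    Hypothetical helper: the name and return value are illustrative only.
    """
    # The two "is None" results are equal exactly when both inputs are
    # given or both are missing, which are the two invalid cases.
    if (input_ids is None) == (inputs_embeds is None):
        raise ValueError("You must specify exactly one of input_ids or inputs_embeds")
    return "input_ids" if input_ids is not None else "inputs_embeds"


if __name__ == "__main__":
    print(resolve_model_inputs(input_ids=[101, 2023, 102]))
    print(resolve_model_inputs(inputs_embeds=[[0.1, 0.2], [0.3, 0.4]]))
    try:
        resolve_model_inputs()
    except ValueError as err:
        print(f"rejected: {err}")
```

Comparing the two `is None` results keeps the check to a single condition that covers both invalid cases.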
Yih-Dar
1b2f942af7
Fix flaky test_batching_equivalence (#35564)
* yes!

* oh no!!!

* oh no!!!

* style

* oh no!!!

* oh no!!!

* oh no!!!

* oh no!!!

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-09 14:00:08 +01:00
Cyril Vallez
965a2fb320
More model refactoring! (#35359)
* cohere

* style

* phi3

* style

* small fix

* small fix

* phi3 longrope

* oups

* Update rope (only for phi3 still)

* Update test_modeling_rope_utils.py

* Update modeling_phi3.py

* fix

* fix copies

* style

* Fix copied from bad renaming
2025-01-09 11:09:09 +01:00
Arthur
3f483beab9
[PixtralLarge] Update Pixtral conversion script to support large format! (#34801)
* update conversion script

* update for bias again

* remove pdv

* use my dir

* Update how we initialize the tokenizer

* Convert in bfloat16

* Undo that one again

* fix config dump

* .to() was broken for BatchMixFeature

* quick debug breakpoint

* put the breakpoint in the right place

* Add a config flag for the multimodal projector bias

* Add a config flag for the multimodal projector bias

* Conversion script can load chat templates

* Indent config for comparison

* Stop clobbering the config

* Re-enable the config clobber

* Get rid of the config manual save - it has no effect!

* Handle adapter bias correctly

* Default vision transformer activation to silu

* Remove legacy processing path

* One commit with all the debug breakpoints before I delete them all, in case I need to revert

* Update conversion

* Remove vLLM debugging instrumentation

* Drop xformers

* Remove debug enumerates

* make fixup

* make fixup

* Break copied from in pixtral

* Propagate multimodal_projector_bias change

* Propagate multimodal_projector_bias change

* Remove debug device .to()

* Restore attention weights output

* Fix Pixtral test

* Drop image_seq_length

* Drop image_seq_length

* Put the legacy processing code back

* Add the bias option to the llava_next_video config

* Add the bias option to the llava_next_video config

* Make certain args required in converter

* Make certain args required in converter

* typo

* make fixup

* Reverting some dtype changes since it seems to work without them

---------

Co-authored-by: arthur@huggingface.co <arthur@ip-26-0-166-244.ec2.internal>
Co-authored-by: Matt <rocketknight1@gmail.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2025-01-08 17:39:47 +01:00
NielsRogge
8490d3159c
Add ViTPose (#30530)
* First draft

* Make fixup

* Make forward pass work

* Improve code

* More improvements

* More improvements

* Make predictions match

* More improvements

* Improve image processor

* Fix model tests

* Add classic decoder

* Convert classic decoder

* Verify image processor

* Fix classic decoder logits

* Clean up

* Add post_process_pose_estimation

* Improve post_process_pose_estimation

* Use AutoBackbone

* Add support for MoE models

* Fix tests, improve num_experts%

* Improve variable names

* Make fixup

* More improvements

* Improve post_process_pose_estimation

* Compute centers and scales

* Improve postprocessing

* More improvements

* Fix ViTPoseBackbone tests

* Add docstrings, fix image processor tests

* Update index

* Use is_cv2_available

* Add model to toctree

* Add cv2 to doc tests

* Remove script

* Improve conversion script

* Add coco_to_pascal_voc

* Add box_to_center_and_scale to image_transforms

* Update tests

* Add integration test

* Fix merge

* Address comments

* Replace numpy by pytorch, improve docstrings

* Remove get_input_embeddings

* Address comments

* Move coco_to_pascal_voc

* Address comment

* Fix style

* Address comments

* Fix test

* Address comment

* Remove udp

* Remove comment

* [WIP] need to check if the numpy function is same as cv

* add scipy affine_transform

* Update src/transformers/models/vitpose/image_processing_vitpose.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* refactor convert

* add output_shape

* add atol 5e-2

* Use hf_hub_download in conversion script

* make box_to_center more applicable

* skip test_get_set_embedding

* fix to accept array and fix CI

* add co-contributor

* make it to tensor type output

* add torch

* change to torch tensor

* add more test

* minor change

* CI test change

* import torch should be above ImageProcessor

* make style

* try not use torch in def

* Update src/transformers/models/vitpose/image_processing_vitpose.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/vitpose_backbone/configuration_vitpose_backbone.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/vitpose_backbone/modeling_vitpose_backbone.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/vitpose/modeling_vitpose.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fix

* fix

* add caution

* make more detail about dataset_index

* Update src/transformers/models/vitpose/modeling_vitpose.py

Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>

* Update src/transformers/models/vitpose/image_processing_vitpose.py

Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>

* add docs

* Update docs/source/en/model_doc/vitpose.md

* Update src/transformers/models/vitpose/configuration_vitpose.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/__init__.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Revert "Update src/transformers/__init__.py"

This reverts commit 7ffa504450.

* change name

* Update src/transformers/models/vitpose/image_processing_vitpose.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/vitpose/test_modeling_vitpose.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update docs/source/en/model_doc/vitpose.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/vitpose/modeling_vitpose.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/vitpose_backbone/modeling_vitpose_backbone.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/vitpose/image_processing_vitpose.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* move vitpose only function to image_processor

* raise valueerror when using timm backbone

* use out_indices

* Update src/transformers/models/vitpose/image_processing_vitpose.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* remove camel-case of def flip_back

* rename vitposeEstimatorOutput

* Update src/transformers/models/vitpose_backbone/modeling_vitpose_backbone.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* fix confused camelcase of MLP

* remove in-place logic

* clear scale description

* make consistent batch format

* docs update

* formatting docstring

* add batch tests

* test docs change

* Update src/transformers/models/vitpose/image_processing_vitpose.py

* Update src/transformers/models/vitpose/configuration_vitpose.py

* change ViT to Vit

* change to enable MoE

* make fix-copies

* Update docs/source/en/model_doc/vitpose.md

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* extract udp

* add more described docs

* simple fix

* change to accept target_size

* make style

* Update src/transformers/models/vitpose/image_processing_vitpose.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/vitpose/configuration_vitpose.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* change to `verify_backbone_config_arguments`

* Update docs/source/en/model_doc/vitpose.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* remove unnecessary copy

* make config immutable

* enable gradient checkpointing

* update inappropriate docstring

* linting docs

* split function for visibility

* make style

* check isinstances

* change to acceptable use_pretrained_backbone

* make style

* remove copy in docs

* Update src/transformers/models/vitpose_backbone/modeling_vitpose_backbone.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update docs/source/en/model_doc/vitpose.md

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/vitpose/modeling_vitpose.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* simple fix + make style

* change input config of activation function to string

* Update docs/source/en/model_doc/vitpose.md

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* tmp docs

* delete index.md

* make fix-copies

* simple fix

* change conversion to sam2/mllama style

* Update src/transformers/models/vitpose/image_processing_vitpose.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/vitpose/image_processing_vitpose.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* refactor convert

* add supervision

* Update src/transformers/models/vitpose_backbone/modeling_vitpose_backbone.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* remove redundant def

* separate code block for visualization

* add validation for num_moe

* final commit

* add labels

* [run-slow] vitpose, vitpose_backbone

* Update src/transformers/models/vitpose/convert_vitpose_to_hf.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* enable all conversion

* final commit

* [run-slow] vitpose, vitpose_backbone

* ruff check --fix

* [run-slow] vitpose, vitpose_backbone

* rename split module

* [run-slow] vitpose, vitpose_backbone

* fix pos_embed

* Simplify init

* Revert "fix pos_embed"

This reverts commit 2c56a4806e.

* refactor single loop

* allow flag to enable custom model

* efficiency of MoE to not use unused experts

* make style

* Fix range -> arange to avoid warning

* Revert MOE router, a new one does not work

* Fix postprocessing a bit (labels)

* Fix type hint

* Fix docs snippets

* Fix links to checkpoints

* Fix checkpoints in tests

* Fix test

* Add image to docs

---------

Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: sangbumchoi <danielsejong55@gmail.com>
Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-01-08 16:02:14 +00:00
Pavel Iakubovskii
657bb14f98
Enable auto task for timm models in pipeline (#35531)
* Enable auto task for timm models

* Add pipeline test
2025-01-08 15:14:17 +00:00
Pavel Iakubovskii
59e5b3f01b
Timm wrapper label names (#35553)
* Add timm wrapper label names mapping

* Add index to classification pipeline

* Revert adding index for pipelines

* Add custom model check for loading timm labels

* Add tests for labels

* [run-slow] timm_wrapper

* Add note regarding label2id mapping
2025-01-08 14:09:46 +00:00
Jacky Lee
3c1895aa65
Fix Qwen2VL processor to handle odd number of frames (#35431)
* fix: processing odd number of frames

* feat: add test case

* update: test one frame

* feat: support custom patch size

* fix: test with videos

* revert: change on patch repeat

* fix: much wow

* update: fixups

* fixup pls

* ruff fixup

* fix typo at least
2025-01-08 13:49:00 +01:00
Raushan Turganbay
d1681ec2b6
VLMs: major clean up 🧼 (#34502)
only llava models are modified
2025-01-08 10:35:23 +01:00
Jade Choghari
7176e06b52
Add TextNet (#34979)
* WIP

* Add config and modeling for Fast model

* Refactor modeling and add tests

* More changes

* WIP

* Add tests

* Add conversion script

* Add conversion scripts, integration tests, image processor

* Fix style and copies

* Add fast model to init

* Add fast model in docs and other places

* Fix import of cv2

* Rename image processing method

* Fix build

* Fix Build

* fix style and fix copies

* Fix build

* Fix build

* Fix Build

* Clean up docstrings

* Fix Build

* Fix Build

* Fix Build

* Fix build

* Add test for image_processing_fast and add documentation tests

* some refactorings

* Fix failing tests

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Introduce TextNet

* Fix failures

* Refactor textnet model

* Fix failures

* Add cv2 to setup

* Fix failures

* Fix failures

* Add CV2 dependency

* Fix bugs

* Fix build issue

* Fix failures

* Remove textnet from modeling fast

* Fix build and other things

* Fix build

* some cleanups

* some cleanups

* Some more cleanups

* Fix build

* Incorporate PR feedbacks

* More cleanup

* More cleanup

* More cleanup

* Fix build

* Remove all the references of fast model

* More cleanup

* Fix build

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Fix Build

* Fix build

* Fix build

* Fix build

* Fix build

* Fix build

* Incorporate PR feedbacks

* Fix style

* Fix build

* Incorporate PR feedbacks

* Fix image processing mean and std

* Incorporate PR feedbacks

* fix build failure

* Add assertion to image processor

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* fix style failures

* fix build

* Fix Imageclassification's linear layer, also introduce TextNetImageProcessor

* Fix build

* Fix build

* Fix build

* Fix build

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Fix build

* Incorporate PR feedbacks

* Remove some script

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Fix image processing in textnet

* Incorporate PR Feedbacks

* Fix CI failures

* Fix failing test

* Fix failing test

* Fix failing test

* Fix failing test

* Fix failing test

* Fix failing test

* Add textnet to readme

* Improve readability

* Incorporate PR feedbacks

* fix code style

* fix key error and convert working

* tvlt shouldn't be here

* fix test modeling test

* Fix tests, make fixup

* Make fixup

* Make fixup

* Remove TEXTNET_PRETRAINED_MODEL_ARCHIVE_LIST

* improve type annotation

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update tests/models/textnet/test_image_processing_textnet.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* improve type annotation

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* space typo

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* improve type annotation

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/textnet/configuration_textnet.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* make conv layer kernel sizes and strides default to None

* Update src/transformers/models/textnet/modeling_textnet.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/textnet/modeling_textnet.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* fix keyword bug

* add batch init and make fixup

* Make fixup

* Update integration test

* Add figure

* Update textnet.md

* add testing and fix errors (classification, imgprocess)

* fix error check

* make fixup

* make fixup

* revert to original docstring

* add make style

* remove conflict for now

* Update modeling_auto.py

got a confusion in `timm_wrapper` - was giving some conflicts

* Update tests/models/textnet/test_modeling_textnet.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/textnet/modeling_textnet.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update tests/models/textnet/test_modeling_textnet.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/textnet/modeling_textnet.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* add changes

* Update textnet.md

* add doc

* add authors hf ckpt + rename

* add feedback: classifier/docs

---------

Co-authored-by: raghavanone <opensourcemaniacfreak@gmail.com>
Co-authored-by: jadechoghari <jadechoghari@users.noreply.huggingface.co>
Co-authored-by: Niels <niels.rogge1@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-01-08 09:52:51 +01:00
eustlb
7f7677307c
[Qwen2Audio] handle input ids expansion during processing (#35534)
* add audio_token attribute to proc

* expand input_ids

* and legacy and expanded input_ids

* test update

* split lines

* add possibility not to provide eos and bos audio tokens

* raise errors

* test incorrect number of audio tokens

* add example

* fmt

* typo
2025-01-07 16:47:27 +01:00
Francesco Cariaggi
f408d55448
Fix bug when requesting input normalization with EnCodec (#34756)
* EnCodec: unsqueeze padding mask

* add test for normalization
2025-01-07 11:50:02 +01:00
松本和真
96bf3d6cc5
Add diffllama (#34083)
* first adding diffllama

* add Diff Attention and other but still with errors

* complete making attention Diff-Attention

* fix some bugs which may be caused by transformer-cli while adding model

* fix a bug caused by forgetting KV cache...

* Update src/transformers/models/diffllama/modeling_diffllama.py

You don't need to divide by 2 if we use the same number of attention heads as llama. Instead, you can just split in forward.

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* Update src/transformers/models/diffllama/modeling_diffllama.py

fit to changing "num_heads // 2" place

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* Update src/transformers/models/diffllama/modeling_diffllama.py

new codes are more meaningful than before

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* Update src/transformers/models/diffllama/modeling_diffllama.py

new codes are more meaningful than before

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* Update src/transformers/models/diffllama/modeling_diffllama.py

fit to changing "num_heads // 2" place

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* Update src/transformers/models/diffllama/modeling_diffllama.py

fix dividing by sqrt(self.head_dim) twice

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* Update src/transformers/models/diffllama/modeling_diffllama.py

fix dividing by sqrt(self.head_dim) twice

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* Update src/transformers/models/diffllama/modeling_diffllama.py

fit to changing "num_heads // 2" place.
and more visible

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* I found Attention was still not implemented as in the paper as of e072544a3b.

* re-implemented

* adding groupnorm

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* align with transformers code style

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* fix typo

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* adding groupnorm

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* change SdpaAttention to DiffSdpaAttention

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* fix bug

* Update src/transformers/models/diffllama/modeling_diffllama.py

resolve "not same outputs" problem

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* fix bugs of places of "GroupNorm with scale" and etc

* Revert "fix bugs of places of "GroupNorm with scale" and etc"

This reverts commit 26307d92f6.

* simplify multiple of attention (matmul) operations into one by repeating value_states

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* simplify multiple of attention (matmul) operations into one by repeating value_states

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* simplify multiple of attention (matmul) operations into one by repeating value_states

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* remove missed type

* add diffllama model_doc

* apply make style/quality

* apply review comment about model

* apply review comment about test

* place diffllama alphabetically on the src/transformers/__init__.py

* fix forgot code

* Supports parameters that are not initialized with standard deviation 0 in the conventional method

* add DiffLlamaConfig to CONFIG_CLASSES_TO_IGNORE_FOR_DOCSTRING_CHECKPOINT_CHECK on utils/check_config_docstrings.py

* remove unused property of config

* add to supported model list

* add to sdpa supported model list

* fix copyright, remove pretraining_tensor_parallel, and modify for initialization test

* remove unused import and etc.

* empty commit

* empty commit

* empty commit

* apply modular transformers but with bugs

* revert prev commit

* create src/transformers/model/diffllama/modular_diffllama.py

* run utils/modular_model_converter.py

* empty commit

* leaner modular diffllama

* remove more and more in modular_diffllama.py

* remove more and more in modular_diffllama.py

* resolve missing docstring entries

* force reset

* convert modular

---------

Co-authored-by: Minho Ryu <ryumin93@gmail.com>
2025-01-07 11:34:56 +01:00
pglorio
bd442c6d3a
Zamba new attention standard (#35375)
* updated zamba to new attention standard

* make fixup fixes
2025-01-07 10:08:45 +01:00
NielsRogge
6e0515e99c
Add DINOv2 with registers (#35348)
* added changes from 32905

* fixed mistakes caused by select all paste

* rename diff_dinov2...

* ran tests

* Fix modular

* Fix tests

* Use new init

* Simplify drop path

* Convert all checkpoints

* Add figure and summary

* Update paths

* Update docs

* Update docs

* Update toctree

* Update docs

---------

Co-authored-by: BernardZach <bernardzach00@gmail.com>
Co-authored-by: Zach Bernard <132859071+BernardZach@users.noreply.github.com>
2024-12-24 13:21:59 +01:00
Yoni Gozlan
93aafdc620
Add compile test for fast image processor (#35184)
* add compile test for fast image processor

* override pixtral test
2024-12-23 13:12:45 -05:00
Miquel Farré
a1780b7ba5
bugfix Idefics3 processor - handle gracefully cases with text and no images (#35363)
* bugfix processing empty images

* fix

* fix

* Update src/transformers/models/idefics3/processing_idefics3.py

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* adding tests

* fix

* fix

* fix

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2024-12-23 16:59:01 +01:00
Tibor Reiss
e10be82b71
uniformize kwargs for SAM (#34578)
* Make kwargs uniform for SAM

* Remove unused attribute

* Make point_pad_value part of image_kwargs

* Update annotations

* Code review - use existing methods

* Use ProcessorTesterMixin

* Do not add ProcessorTesterMixin everywhere
2024-12-23 13:54:57 +01:00
bastrob
8f38f58f3d
owlvit/2 dynamic input resolution (#34764)
* owlvit/2 dynamic input resolution.

* adapt box grid to patch_dim_h patch_dim_w

* fix ci

* clarify variable naming

* clarify variable naming..

* compute box_bias dynamically inside box_predictor

* change style part of code

* [run-slow] owlvit, owlv2
2024-12-21 08:51:09 +00:00
Yih-Dar
504c4d3692
Make test_generate_with_static_cache even less flaky (#34995)
* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-20 16:03:26 +01:00
Yih-Dar
05de764e9c
Aurevoir PyTorch 1 (#35358)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-20 14:36:31 +01:00
Anton Vlasjuk
5a2aedca1e
[Mamba2] Fix caching, slow path, and multi-gpu (#35154)
* fixup mamba2 - caching and several other small fixes

* fixup cached forward

* correct fix this time

* fixup cache - we do not need to extend the attn mask it's handled by generate (gives total ids + mask at each step)

* remove unnecessary (un)squeeze

* fixup cache position

* simplify a few things

* [run-slow] mamba2

* multi gpu attempt two

* [run-slow] mamba2

* [run-slow] mamba2

* [run-slow] mamba2

* [run-slow] mamba2

* add newer slow path fix

* [run-slow] mamba2
2024-12-20 09:27:47 +01:00
Arthur
1fa807fa63
Fix some fa2 tests (#35340)
* remove fa2 test

* remove other failing tests

* style
2024-12-19 17:05:25 +01:00
Benjamin Warner
667ed5635e
Add ModernBERT to Transformers (#35158)
* initial cut of modernbert for transformers

* small bug fixes

* fixes

* Update import

* Use compiled mlp->mlp_norm to match research implementation

* Propagate changes in modular to modeling

* Replace duplicate attn_out_dropout in favor of attention_dropout

cc @warner-benjamin let me know if the two should remain separate!

* Update BOS to CLS and EOS to SEP

Please confirm @warner-benjamin

* Set default classifier bias to False, matching research repo

* Update tie_word_embeddings description

* Fix _init_weights for ForMaskedLM

* Match base_model_prefix

* Add compiled_head to match research repo outputs

* Fix imports for ModernBertForMaskedLM

* Just use "gelu" default outright for classifier

* Fix config name typo: initalizer -> initializer

* Remove some unused parameters in docstring. Still lots to edit there!

* Compile the embeddings forward

Not having this resulted in very slight differences - so small it wasn't even noticed for the base model, only for the large model.

But the tiny difference for large propagated at the embedding layer through the rest of the model, leading to notable differences of ~0.0084 average per value, up to 0.2343 for the worst case.

* Add drafts for ForSequenceClassification/ForTokenClassification

* Add initial SDPA support (not exactly equivalent to FA2 yet!)

During testing, FA2 and SDPA still differ by about 0.0098 per value in the token embeddings. It still predicts the correct mask fills, but I'd like to get it fully 1-1 if possible.

* Only use attention dropout if training

* Add initial eager attention support (also not equivalent to FA2 yet!)

Frustratingly, I also can't get eager to be equivalent to FA2 (or sdpa), but it does get really close, i.e. avg ~0.010 difference per value.

Especially if I use fp32 for both FA2&eager, avg ~0.0029 difference per value

The fill-mask results are good with eager.

* Add initial tests, output_attentions, output_hidden_states, prune_heads

Tests are based on BERT, not all tests pass yet: 23 failed, 79 passed, 100 skipped

* Remove kwargs from ModernBertForMaskedLM

Disable sparse_prediction by default to match the normal HF, can be enabled via config

* Remove/adjust/skip improper tests; warn if padding but no attn mask

* Run formatting etc.

* Run python utils/custom_init_isort.py

* FlexAttention with unpadded sequences (matches FA2 within bf16 numerics)

* Reformat init_weights based on review

* self -> module in attention forwards

* Remove if config.tie_word_embeddings

* Reformat output projection on a different line

* Remove pruning

* Remove assert

* Call contiguous() to simplify paths

* Remove prune_qkv_linear_layer

* Format code

* Keep as kwargs, only use if needed

* Remove unused codepaths & related config options

* Remove 3d attn_mask test; fix token classification tuple output

* Reorder: attention_mask above position_ids, fixes gradient checkpointing

* Fix usage if no FA2 or torch v2.5+

* Make torch.compile/triton optional

Should we rename 'compile'? It's a bit vague

* Separate pooling options into separate functions (cls, mean) - cls as default

* Simplify _pad_modernbert_output, remove unused labels path

* Update tied weights to remove decoder.weight, simplify decoder loading

* Adaptively set config.compile based on hf_device_map/device/resize, etc.

* Update ModernBertConfig docstring

* Satisfy some consistency checks, add unfinished docs

* Only set compile to False if there's more than 1 device

* Add docstrings for public ModernBert classes

* Don't replace docstring returns - ends up being duplicate

* Fix mistake in toctree

* Reformat toctree

* Patched FlexAttention, SDPA, Eager with Local Attention

* Implement FA2 -> SDPA -> Eager attn_impl defaulting, crucial

both to match the original performance, and to get the highest inference speed without requiring users to manually pick FA2

* Patch test edge case with Idefics3 not working with 'attn_implementation="sdpa"'

* Repad all_hidden_states as well

* rename config.compile to reference_compile

* disable flex_attention since it crashes

* Update modernbert.md

* Using dtype min to mask in eager

* Fully remove flex attention for now

It's only compatible with the nightly torch 2.6, so we'll leave it be for now. It's also slower than eager/sdpa.

Also, update compile -> reference_compile in one more case

* Call contiguous to allow for .view()

* Copyright 2020 -> 2024

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update/simplify __init__ structure

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Remove "... if dropout_prob > 0 else identity"

As dropout with 0.0 should be efficient like identity

* re-use existing pad/unpad functions instead of creating new ones

* remove flexattention method

* Compute attention_mask and local_attention_mask once in modeling

* Simplify sequence classification prediction heads, only CLS now

Users can make custom heads if they feel like it

Also removes the unnecessary pool parameter

* Simplify module.training in eager attn

* Also export ModernBertPreTrainedModel

* Update the documentation with links to finetuning scripts

* Explain local_attention_mask parameter in docstring

* Simplify _autoset_attn_implementation, rely on super()

* Keep "in" to initialize Prediction head

Doublechecked with Benjamin that it's correct/what we used for pretraining

* add back mean pooling

* Use the pooling head in TokenClassification

* update copyright

* Reset config._attn_implementation_internal on failure

* Allow optional attention_mask in ForMaskedLM head

* fix failing run_slow tests

* Add links to the paper

* Remove unpad_no_grad, always pad/unpad without gradients

* local_attention_mask -> sliding_window_mask

* Revert "Use the pooling head in TokenClassification"

This reverts commit 99c38badd1.

There was no real motivation, no info on whether having this bigger head does anything useful.

* Simplify pooling, 2 options via if-else

---------

Co-authored-by: Tom Aarsen <37621491+tomaarsen@users.noreply.github.com>
Co-authored-by: Tom Aarsen <Cubiegamedev@gmail.com>
Co-authored-by: Said Taghadouini <taghadouinisaid@gmail.com>
Co-authored-by: Benjamin Clavié <ben@clavie.eu>
Co-authored-by: Antoine Chaffin <ant54600@hotmail.fr>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-12-19 14:03:35 +01:00
Yu Chin Fabian Lim
9613933b02
Add the Bamba Model (#34982)
* initial commit for PR

Co-authored-by: Gabe Goodhart <gabe.l.hart@gmail.com>

* rename dynamic cache

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* add more unit tests

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* add integration test

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* add integration test

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* Add modular bamba file

* Remove trainer changes from unrelated PR

* Modify modular and config to get model running

* Fix some CI errors and beam search

* Fix a plethora of bugs from CI/docs/etc

* Add bamba to models with special caches

* Update to newer mamba PR for mamba sublayer

* fix test_left_padding_compatibility

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* fix style

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* fix remaining tests

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* missed this test

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* ran make style

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* move slow tag to integration obj

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* make style

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* address comments

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* fix modular

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* left out one part of modular

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* change model

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* Make Rotary modular as well

* Update bamba.md

Added overview, update Model inference card and added config

* Update bamba.md

* Update bamba.md

* Update bamba.md

Minor fixes

* Add docs for config and model back

Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>

* Add warning when using fast kernels

* replaced generate example

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* Address comments from PR

Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>

* Propagate attention fixes

Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>

* Fix attention interfaces to the new API

Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>

* Fix API for decoder layer

Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>

* Remove extra weights

Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>

---------

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>
Co-authored-by: Gabe Goodhart <gabe.l.hart@gmail.com>
Co-authored-by: Antoni Viros i Martin <aviros@ibm.com>
Co-authored-by: divya-kumari32 <72085811+divya-kumari32@users.noreply.github.com>
Co-authored-by: Antoni Viros <ani300@gmail.com>
2024-12-18 20:18:17 +01:00
Arthur
2c47618c1a
🚨All attention refactor🚨 (#35235)
* refactor LlamaAttention

* minimal changes

* fix llama

* update

* modular gemmas

* modular nits

* modular updates

* nits

* simplify

* gpt2

* more modular and fixes

* granite

* modular modular modular

* nits

* update

* qwen2 + starcoder2

* mostly gemma2

* Update image_processing_auto.py

* fix

* Update modular_starcoder2.py

* fix

* remove all copied from attentions

* remove gcv

* make fix-copies

* oups

* oups2.0

* fix some modulars + all copied from

* should be good now

* revert unwanted changes

* Update modeling_decision_transformer.py

* finish cleanup

* Update modeling_olmo.py

* consistency

* re-add gradient checkpointing attribute

* fix

* style

* make config necessary

* bis

* bis

* Update modeling_my_new_model2.py

* is_causal attr

* fix

* remove past kv return from decoder layer

* fix

* default rope config

* correctly fix rope config

* fix bias

* fix gpt2 attention output

* fix test

* fix inits

* fix default sdpa

* fix default sdpa implementation

* harmonize classes

* fix mistral

* fix sliding window models

* mixtral

* be more explicit

* style

* fix

* several fixes

* Update modeling_dbrx.py

* fix test

* olmo + phi

* rotary

* style

* phi

* phi again

* again

* kwargs

* Update test_modeling_common.py

* skip fx tracing tests

* Update modeling_utils.py

* gemma 2

* again

* Update modeling_recurrent_gemma.py

* gemma2

* granite

* style

* starcoder

* Update sdpa_attention.py

* switch args

* Update modeling_mllama.py

* fix

* cache type tests

* gpt2

* Update test_modeling_common.py

* fix

* consistency

* fix shape with encoder

* should be the last one

* tests non model

* most comments

* small oupsi

* be more explicit in modulars

* more explicit modulars

* CIs! it works locally

* add kwargs to _flash_attention_forward

---------

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2024-12-18 16:53:39 +01:00
eustlb
da334bcfa8
[Whisper] 🚨 Fix whisper decoding 🚨 (#34135)
* do not remove decoder_input_ids for the first segment

* do not remove eos token in generate_with_fallback

* when removing padding tokens, do not remove eos token

* remove eos token in generate (and not in generate_with_fallback!)

* reconcile short-form/long-form behavior

* correct avg_logprobs calculation

* handle eos token in segments

* handle decoder_input_ids and eos token in _prepare_decoder_input_ids

* fix incorrect time precision

* always remove eos token

* always remove decoder_input_ids

* no need to handle decoder_inputs_ids and eos token

* no need to remove decoder_input_ids

* no need to handle eos token

* fix num_beams in _retrieve_logit_processors

* remove todo inconsistency

* no need to add eos token

* last_timestamp_pos should indeed be timestamp token pos

* patch generate to enable compatibility with GenerationTesterMixin tests

* adapt test_generate_continue_from_past_key_values

* adapt test_prompt_lookup_decoding_matches_greedy_search

* adapt generic GenerationMixin tests to whisper's generate

* fix speculative decoding

* fix

* [run-slow] whisper

* change HF_HUB_TOKEN for require_read_token

* [run-slow] whisper

* prioritize kwargs over generation_config

* remove unnecessary args

* [run-slow] whisper

* update tests

* [run-slow] whisper

* add comment

* update test

* [run-slow] whisper

* update test + revert require_read_token

* docstring updates

* revert tokenizer decode args change

* do not use a patch + docstring updates

* [run-slow] whisper

* make

* [run-slow] whisper

* add a flag to force unique call to generate

* test update

* [run-slow] whisper

* add force_unique_generate_call arg

* do not use a patch

* correct the timestamps for the pad tokens

* docstring update

* docstring update

* docstring update

* update TF tests

* add require_read_token

* [run-slow] whisper

* test reset dynamo

* [run-slow] whisper

* fix

* [run-slow] whisper

* avoid iterating twice on current_segments

* [run-slow] whisper

* [run-slow] whisper

---------

Co-authored-by: Eustache Le Bihan <eustlb@users.noreply.huggingface.co>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-18 14:13:21 +01:00
Fanli Lin
c7e48053aa
[tests] make cuda-only tests device-agnostic (#35222)
fix cuda-only tests
2024-12-18 10:14:22 +01:00
Magnus
6eb00dd2f0
Support for SDPA for SAM models (#34110)
* feat: add support for sdpa and gradient checkpointing

* fix: ruff format

* fix: config sdpa

* fix: sdpa layer naming convention

* fix: update test_eager_matches_sdpa_inference to handle vision_hidden_states

* test: skip incompatible tests and fix loading issue with sdpa

- Updated tests to skip the flash and dynamic compile cases.
- Minor adjustment to ensure correct loading of model with sdpa for dispatch test.

* style: apply Ruff formatting

* ruff fix again after rebase

* [run-slow] sam

* [run-slow] sam

* refactor: Address review comments and improve sub-config handling in SAM model tests

- Added attributes for sub_configs as per PR #34410.
- Enabled tests for configs, ensuring the composite model (SAM) has several sub-configs in the main config.
- Added class attribute _is_composite=True to the tester class
- test_sdpa_can_dispatch_composite_models added

* [run-slow] sam

* style: ruff

* [run-slow] sam

* style: ruff again ...

* [run-slow] sam
2024-12-17 14:46:05 +01:00
Omar Salman
747f361da1
Add sdpa for Beit (#34941)
* Add sdpa for Beit

* Updates

* [run-slow] beit

* Update inference benchmarks

* Update

* Fix - add missed to super().forward()

* Updates

* Fix missing import
2024-12-17 14:44:47 +01:00
Tony Wu
f33a0cebb3
Add ColPali to 🤗 transformers (#33736)
* feat: run `add-new-model-like`

* feat: add paligemma code with "copied from"

* feat: add ColPaliProcessor

* feat: add ColPaliModel

* feat: add ColPaliConfig

* feat: rename `ColPaliForConditionalGeneration` to `ColPaliModel`

* fixup modeling colpali

* fix: fix root import shortcuts

* fix: fix `modeling_auto` dict

* feat: comment out ColPali test file

* fix: fix typos from `add-new-model-like`

* feat: explicit the forward input args

* feat: move everything to `modular_colpali.py`

* fix: put back ColPaliProcessor

* feat: add auto-generated files

* fix: run `fix-copies`

* fix: remove DOCSTRING constants to make modular converter work

* fix: fix typo + modular converter

* fix: add missing imports

* feat: no more errors when loading ColPaliModel

* fix: remove unused args in forward + tweak doc

* feat: rename `ColPaliModel` to `ColPaliForRetrieval`

* fix: apply `fix-copies`

* feat: add ColPaliProcessor to `modular_colpali`

* fix: run make quality + make style

* fix: remove duplicate line in configuration_auto

* feat: make ColPaliModel inherit from PaliGemmaForConditionalGeneration

* fix: tweak and use ColPaliConfig

* feat: rename `score` to `post_process_retrieval`

* build: run modular formatter + make style

* feat: convert colpali weights + fixes

* feat: remove old weight converter file

* feat: add and validate tests

* feat: replace hardcoded path to "vidore/colpali-v1.2-hf" in tests

* fix: add bfloat16 conversion in weight converter

* feat: replace pytest with unittest in modeling colpali test

* feat: add sanity check for weight conversion (doesn't work yet)

* feat: add shape sanity check in weight converter

* feat: make ColPaliProcessor args explicit

* doc: add doc for ColPali

* fix: trying to fix output mismatch

* feat: tweaks

* fix: ColPaliModelOutput inherits from ModelOutput instead of PaliGemmaCausalLMOutputWithPast

* fix: address comments on PR

* fix: adapt tests to the Hf norm

* wip: try things

* feat: add `__call__` method to `ColPaliProcessor`

* feat: remove need for dummy image in `process_queries`

* build: run new modular converter

* fix: fix incorrect method override

* Fix tests, processing, modular, convert

* fix tokenization auto

* hotfix: manually fix processor -> fixme once convert modular is fixed

* fix: convert weights working

* feat: rename and improve convert weight script

* feat: tweaks

* feat: remove `device` input for `post_process_retrieval`

* refactor: remove unused `get_torch_device`

* Fix all tests

* docs: update ColPali model doc

* wip: fix convert weights to hf

* fix logging modular

* docs: add acknowledgements in model doc

* docs: add missing docstring to ColPaliProcessor

* docs: tweak

* docs: add doc for `ColPaliForRetrievalOutput.forward`

* feat: add modifications from colpali-engine v0.3.2 in ColPaliProcessor

* fix: fix and upload colpali hf weights

* refactor: rename `post_process_retrieval` to `score_retrieval`

* fix: fix wrong typing for `score_retrieval`

* test: add integration test for ColPali

* chore: rerun convert modular

* build: fix root imports

* Update docs/source/en/index.md

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* fix: address PR comments

* wip: reduce the prediction gap in weight conversion

* docs: add comment in weight conversion script

* docs: add example for `ColPaliForRetrieval.forward`

* tests: change dataset path to the new one in hf-internal

* fix: colpali weight conversion works

* test: add fine-grained check for ColPali integration test

* fix: fix typos in convert weight script

* docs: move input docstring in a variable

* fix: remove hardcoded torch device in test

* fix: run the new modular refactor

* docs: fix python example for ColPali

* feat: add option to choose `score_retrieval`'s output dtype and device

* docs: update doc for `score_retrieval`

* feat: add `patch_size` property in ColPali model

* chore: run `make fix-copies`

* docs: update description for ColPali cookbooks

* fix: remove `ignore_index` methods

* feat: remove non-transformers specific methods

* feat: update `__init__.py` to new hf format

* fix: fix root imports in transformers

* feat: remove ColPali's inheritance from PaliGemma

* Fix CI issues

* nit remove prints

* feat: remove ColPali config and model from `modular_colpali.py`

* feat: add `ColPaliPreTrainedModel` and update modeling and configuration code

* fix: fix auto-removed imports in root `__init__.py`

* fix: various fixes

* fix: fix `_init_weight`

* temp: comment `AutoModel.from_config` for experiments

* fix: add missing `output_attentions` arg in ColPali's forward

* fix: fix `resize_token_embeddings`

* fix: make `input_ids` optional in forward

* feat: rename `projection_layer` to `embedding_proj_layer`

* wip: fix convert colpali weight script

* fix tests and convert weights from original repo

* fix unprotected import

* fix unprotected torch import

* fix style

* change vlm_backbone_config to vlm_config

* fix unprotected import in modular this time

* fix: load config from Hub + tweaks in convert weight script

* docs: move example usage from model docstring to model markdown

* docs: fix input docstring for ColPali's forward method

* fix: use `sub_configs` for ColPaliConfig

* fix: remove non-needed sanity checks in weight conversion script + tweaks

* fix: fix issue with `replace_return_docstrings` in ColPali's `forward`

* docs: update docstring for `ColPaliConfig`

* test: change model path in ColPali test

* fix: fix ColPaliConfig

* fix: fix weight conversion script

* test: fix expected weights for ColPali model

* docs: update ColPali markdown

* docs: fix minor typo in ColPaliProcessor

* Fix tests and add _no_split_modules

* add text_config to colpali config

* [run slow] colpali

* move inputs to torch_device in integration test

* skip test_model_parallelism

* docs: clarify quickstart snippet in ColPali's model card

* docs: update ColPali's model card

---------

Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2024-12-17 11:26:43 +01:00
Yoni Gozlan
5615a39369
Fall back to slow image processor in ImageProcessingAuto when no fast processor available (#34785)
* refactor image_processing_auto logic

* fix fast image processor tests

* Fix tests fast vit image processor

* Add safeguard when use_fast True and torchvision not available

* change default use_fast back to None, add warnings

* remove debugging print

* call get_image_processor_class_from_name once
2024-12-15 14:00:36 -05:00
Fanli Lin
bdd4201fdb
[tests] fix "Tester object has no attribute '_testMethodName'" (#34910)
* add more cases

* fix method not found in unittest

Signed-off-by: Lin, Fanli <fanli.lin@intel.com>

* fix more cases

* add more models

* add all

* no unittest.case

* remove for oneformer

* fix style

---------

Signed-off-by: Lin, Fanli <fanli.lin@intel.com>
2024-12-13 14:33:45 +01:00
alexrs-cohere
64478c7631
Add Cohere2 model (#35224)
2024-12-13 09:35:50 +01:00
Yoach Lacombe
6181c6b095
Fix seamless TTS generate (#34968)
* fix seamless tts generate

* apply same fix for v2

* [run-slow] seamless_m4t, seamless_m4t_v2

* remove TODO

* [run-slow] seamless_m4t, seamless_m4t_v2

* [run-slow] seamless_m4t, seamless_m4t_v2

* ignore failing test on multigpus

* [run-slow] seamless_m4t, seamless_m4t_v2

* [run-slow] seamless_m4t, seamless_m4t_v2
2024-12-11 15:38:42 +01:00
Pavel Iakubovskii
5fcf6286bf
Add TimmWrapper (#34564)
* Add files

* Init

* Add TimmWrapperModel

* Fix up

* Some fixes

* Fix up

* Remove old file

* Sort out import orders

* Fix some model loading

* Compatible with pipeline and trainer

* Fix up

* Delete test_timm_model_1/config.json

* Remove accidentally committed files

* Delete src/transformers/models/modeling_timm_wrapper.py

* Remove empty imports; fix transformations applied

* Tidy up

* Add image classification model to special cases

* Create pretrained model; enable device_map='auto'

* Enable most tests; fix init order

* Sort imports

* [run-slow] timm_wrapper

* Pass num_classes into timm.create_model

* Remove train transforms from image processor

* Update timm creation with pretrained=False

* Fix gamma/beta issue for timm models

* Fixing gamma and beta renaming for timm models

* Simplify config and model creation

* Remove attn_implementation diff

* Fixup

* Docstrings

* Fix warning msg text according to test case

* Fix device_map auto

* Set dtype and device for pixel_values in forward

* Enable output hidden states

* Enable tests for hidden_states and model parallel

* Remove default scriptable arg

* Refactor inner model

* Update timm version

* Fix _find_mismatched_keys function

* Change inheritance for Classification model (fix weights loading with device_map)

* Minor bugfix

* Disable save pretrained for image processor

* Rename hook method for loaded keys correction

* Rename state dict keys on save, remove `timm_model` prefix, make checkpoint compatible with `timm`

* Managing num_labels <-> num_classes attributes

* Enable loading checkpoints in Trainer to resume training

* Update error message for output_hidden_states

* Add output hidden states test

* Decouple base and classification models

* Add more test cases

* Add save-load-to-timm test

* Fix test name

* Fixup

* Add do_pooling

* Add test for do_pooling

* Fix doc

* Add tests for TimmWrapperModel

* Add validation for `num_classes=0` in timm config + test for DINO checkpoint

* Adjust atol for test

* Fix docs

* dev-ci

* dev-ci

* Add tests for image processor

* Update docs

* Update init to new format

* Update docs in configuration

* Fix some docs in image processor

* Improve docs for modeling

* fix for is_timm_checkpoint

* Update code examples

* Fix header

* Fix typehint

* Increase tolerance a bit

* Fix Path

* Fixing model parallel tests

* Disable "parallel" tests

* Add comment for metadata

* Refactor AutoImageProcessor for timm wrapper loading

* Remove custom test_model_outputs_equivalence

* Add require_timm decorator

* Fix comment

* Make image processor work with older timm versions and tensor input

* Save config instead of whole model in image processor tests

* Add docstring for `image_processor_filename`

* Sanitize kwargs for timm image processor

* Fix doc style

* Update check for tensor input

* Update normalize

* Remove _load_timm_model function

---------

Co-authored-by: Amy Roberts <22614925+amyeroberts@users.noreply.github.com>
2024-12-11 12:40:30 +00:00
Cyril Vallez
d363e71d0e
🧹 Remove deprecated RotaryEmbedding parts in the Attention layers (#34858)
* update

* style

* fix missing args

* remove last trace of old rope classes

* remove deprecated copied from

* fix copies

* trigger CIs

* post rebase clean-up

* reverse mistral

* cleanup after dropping commits

* Add comment
2024-12-11 11:16:52 +01:00
Gallil Maimon
6acb4e43a7
Support BatchNorm in Hubert pos_conv_emb as in fairseq (#34389)
* Support BatchNorm in Hubert pos_conv_emb as in fairseq

* Correct the new defaults (#34377)

* Correct the new defaults

* CIs

* add check

* Update utils.py

* Update utils.py

* Add the max_length in generate test checking shape without passing length

* style

* CIs

* fix fx CI issue

* [auto. ping] Avoid sending empty info + add more team members (#34383)

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Fix glm (#34388)

* Fix duplicated

* fix import

* Use non nested images and batched text Idefics2/3 (#34222)

* add support for non nested images and add tests

* add tests error scenario

* fix style

* added single and no image to error tests

* Fix onnx non-exportable inplace aten op (#34376)

* fix onnx non-exportable inplace op

* mistral, qwen2, qwen2_vl, starcoder2

* fixup copies

* Fix right padding in LLaVA models (#34305)

* fix right pad llavas

* device mismatch

* no filter (#34391)

* no filter

* no filter

* no filter

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* SynthID: better example (#34372)

* better example

* Update src/transformers/generation/configuration_utils.py

* Update src/transformers/generation/logits_process.py

* nits

* Tests: upgrade `test_eager_matches_sdpa_generate` (#34386)

* Fix bnb training test failure (#34414)

* Fix bnb training test: compatibility with OPTSdpaAttention

* Avoid check expected exception when it is on CUDA (#34408)

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Fix typos in agents_advanced.md (#34405)

* [docs] Cache implementations (#34325)

cache

* [run-slow] hubert

* Support BatchNorm in Hubert pos_conv_emb as in fairseq
Add conversion integration test, and make batchnorm explicit variable

* Support BatchNorm in Hubert pos_conv_emb as in fairseq
fix make fixup styling changes

* [run-slow] hubert

* Support BatchNorm in Hubert pos_conv_emb as in fairseq

* [run-slow] hubert

* Support BatchNorm in Hubert pos_conv_emb as in fairseq
Add conversion integration test, and make batchnorm explicit variable

* Support BatchNorm in Hubert pos_conv_emb as in fairseq
fix make fixup styling changes

* [run-slow] hubert

* [run-slow] hubert

---------

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>
Co-authored-by: Rudy Delouya <rudy.delouya@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
2024-12-10 14:18:23 +01:00
Pavel Iakubovskii
c8c8dffbe4
Update I-JEPA checkpoints path (#35120)
Update checkpoints path
2024-12-06 13:42:51 +00:00
Aymeric Roucher
9ad4c93536
Add Aria (#34157)
* Add Aria
---------

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-12-06 12:17:34 +01:00
Pablo Montalvo
a5bb528471
Fix signatures for processing kwargs (#35105)
* add conversion script

* remove pg2 refs

* fixup style

* small update

* get correct scaling

* add back missing bos

* fix missing config keys

* might revert this pos_embeddings

* fixup 9b config

* fix 9b

* fixup 9b conversion for good + add back num_hidden_layers

* add correct query scaling for 2b, 9b, 27b

* fixup 27b conversion

* Additional variant: 27b-896

* Use CPU for conversion to reduce GPU RAM requirements

* fix causal mask generation + formatting

* fix in-training causal mask generation edge case

* trigger CI

* update config

* update config

* update config

* update config

* update config

* update config

* update config

* update config

* update config

* move conversion file to main model dir

* handle multi-images + bos token

* address comments for input ids

* revert ci fixes

* [run-slow] paligemma

* fix

* [run-slow] paligemma

* skip end 2 end

* [run-slow] paligemma

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-05 18:15:48 +01:00
Yih-Dar
b0a51e5cff
Fix flaky Hub CI (test_trainer.py) (#35062)
* fix

* Update src/transformers/testing_utils.py

Co-authored-by: Lucain <lucainp@gmail.com>

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* check

* check

* check

* check

* check

* check

* Update src/transformers/testing_utils.py

Co-authored-by: Lucain <lucainp@gmail.com>

* Update src/transformers/testing_utils.py

Co-authored-by: Lucain <lucainp@gmail.com>

* check

* check

* check

* Final space

* Final adjustment

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Lucain <lucainp@gmail.com>
2024-12-05 17:02:27 +01:00
João Marcelo
50189e36a6
Add I-JEPA (#33125)
* first draft

* add IJepaEmbeddings class

* fix copy-from for IJepa model

* add weight conversion script

* update attention class names in IJepa model

* style changes

* Add push_to_hub option to convert_ijepa_checkpoint function

* add initial tests for I-JEPA

* minor style changes to conversion script

* make fixup related

* rename conversion script

* Add I-JEPA to sdpa docs

* minor fixes

* adjust conversion script

* update conversion script

* adjust sdpa docs

* [run_slow] ijepa

* [run-slow] ijepa

* [run-slow] ijepa

* [run-slow] ijepa

* [run-slow] ijepa

* [run-slow] ijepa

* formatting issues

* adjust modeling to modular code

* add IJepaModel to objects to ignore in docstring checks

* [run-slow] ijepa

* fix formatting issues

* add usage instruction snippet to docs

* change pos encoding, add checkpoint for doc

* add verify logits for all models

* [run-slow] ijepa

* update docs to include image feature extraction instructions

* remove pooling layer from IJepaModel in image classification class

* [run-slow] ijepa

* remove pooling layer from IJepaModel constructor

* update docs

* [run-slow] ijepa

* [run-slow] ijepa

* small changes

* [run-slow] ijepa

* style adjustments

* update copyright in init file

* adjust modular ijepa

* [run-slow] ijepa
2024-12-05 16:14:46 +01:00
eustlb
54aae121eb
[Whisper] Fix whisper tokenizer (#34537)
* handle single timestamp ending

* include last timestamp token

* handle single timestamp ending

* avoid floating-point arithmetic limitations

* ensure float64 operations

* new test

* make fixup

* make copies

* handle edge case double tokens ending with different tokens

* handle single timestamp ending

* make fixup

* handle conditioning on prev segments

* fix

* Update src/transformers/models/whisper/generation_whisper.py

Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>

* [run-slow] whisper

* don't call item() to avoid unnecessary sync

* fix

---------

Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
Co-authored-by: Eustache Le Bihan <eustlb@users.noreply.huggingface.co>
2024-12-05 13:46:29 +01:00
Anton Vlasjuk
46df859975
[GPTNeoX] Flex Attention + Refactor (#34896)
* gpt neox flex attention + refactor

* some formatting

* small fix on dropout

* add assertion on flex attn test

* flaky ci :(

* add head mask support

* style

* handle dtype, replace torch where

* fixup flex with output attns

* code review and several other fixes

* Update src/transformers/modeling_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* style

* remove unnecessary comment

* remove incorrect comment

* make flex attn check more agnostic to versions and centralized

* change peft input dtype check to value since q and k could be affected by other stuff like RoPE

* i forgor

* flaky

* code review and small fixes

* Update src/transformers/models/gpt_neox/modeling_gpt_neox.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-12-04 14:48:28 +01:00
Wang, Yi
125de41643
fix speecht5 failure issue in test_peft_gradient_checkpointing_enable… (#34454)
* fix speecht5 failure issue in test_peft_gradient_checkpointing_enable_disable

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>

* [run-slow] speecht5

---------

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
Co-authored-by: Matt <rocketknight1@gmail.com>
2024-12-03 13:58:54 +00:00
Dmitry Rogozhkin
31830474bf
Fix test_eager_matches_sdpa_inference for XPU backend (#34889)
* Use torch.nn.attention.sdpa_kernel instead of deprecated torch.backends.cuda.sdp_kernel

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

* Fix test_eager_matches_sdpa_inference for XPU backend

As of PyTorch 2.5 XPU backend supports only torch.nn.attention.SDPBackend.MATH
which is implemented on PyTorch level using aten operators and is device
agnostic with respect to implementation of each aten operator. Thus, we can
reuse CUDA (or CPU) MATH weights for XPU.

Fixes: #34888
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

* Use torch.amp.autocast instead of deprecated torch.cuda.amp.autocast in nemotron

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

---------

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
2024-12-02 16:21:04 +01:00
Tibor Reiss
89d7bf584f
🚨🚨🚨 Uniformize kwargs for TrOCR Processor (#34587)
* Make kwargs uniform for TrOCR

* Add tests

* Put back current_processor

* Remove args

* Add todo comment

* Code review - breaking change
2024-11-29 11:58:11 +00:00
Michael Goin
9d6f0ddcec
Add optimized PixtralImageProcessorFast (#34836)
* Add optimized PixtralImageProcessorFast

* make style

* Add dummy_vision_object

* Review comments

* Format

* Fix dummy

* Format

* np.ceil for math.ceil
2024-11-28 16:04:05 +01:00
Raushan Turganbay
5e8c1d713d
Offloaded cache: fix generate (#34921)
* fix cache impl

* require_torch_gpu

* fix mamba

* fix copies
2024-11-28 15:05:56 +01:00
Arthur
4c1388f48e
[FlexAttention] Update gemma2 (#34942)
* update tests

* now maybe this fixes the previous failing tests!

* nit default

* Update src/transformers/models/gemma2/modular_gemma2.py

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* fix-copies

---------

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
2024-11-27 11:50:48 +01:00
eustlb
4d1d0f29a4
[Whisper] Fix whisper integration tests (#34111)
* fix test_tiny_timestamp_generation

* fix test_large_timestamp_generation

* fix test_whisper_shortform_single_batch_prev_cond

* fix test_whisper_shortform_multi_batch_hard_prev_cond

* return_timestamps necessary with long form

* fix test_default_multilingual_transcription_long_form

* fix test_tiny_token_timestamp_generation_longform

* fix test_whisper_longform_multi_batch_hard

* Update tests/models/whisper/test_modeling_whisper.py

Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>

* fix typo

* do not expect special tokens

* fix test_whisper_longform_single_batch_beam

* fix test_whisper_longform_multi_batch_hard_prev_cond

* update test_whisper_longform_multi_batch_hard_prev_cond

* update test_whisper_longform_multi_batch_hard_prev_cond

* these tests do not make sense anymore

* this test does not make sense anymore

* make fixup

* suggested nits

* add test with forced_decoder_ids

* this test does not make sense anymore

* change assert for unittest test cases

* make fixup

* test with prompt_ids and task and language

* fix unittest test case call

* fix test_tiny_generation

* fix test_tiny_en_generation

* fix test_tiny_en_batched_generation

* fix test_tiny_longform_timestamps_generation

* fix test_tiny_timestamp_generation

* fix test_large_generation

* fix test_large_batched_generation

* fix test_large_generation_multilingual

* fix test_large_timestamp_generation

* fix test_large_timestamp_generation

* fix test_tiny_token_timestamp_generation_longform

* fix test_tiny_en_batched_generation

* make fixup

* [run-slow] whisper

---------

Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
2024-11-26 12:23:08 +01:00
Yih-Dar
a830df2909
Fix test_auto_backbone_timm_model_from_pretrained (#34877)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-25 17:20:41 +01:00
Shane A
9121ab8fe8
Rename OLMo November to OLMo2 (#34864)
* Rename/move OLMo Nov files to OLMo2

* Rename Olmo1124 and its variants to Olmo2
2024-11-25 16:31:22 +01:00
Jacky Lee
f4c04ba32b
Fix Qwen2 failing tests (#34819)
* fix: qwen2 model ids

* fix: line

* fix: more format

* update: reformat
2024-11-25 15:53:04 +01:00
Arthur
857d46ca0c
[Deberta/Deberta-v2] Refactor code base to support compile, export, and fix LLM (#22105)
* some modification for roadmap

* revert some changes

* yups

* weird

* make it work

* styling

* fix-copies

* fixup

* renaming

* more fix-copies

* move stuff around

* remove torch script warnings

* ignore copies

* revert bad changes

* woops

* just styling

* nit

* revert

* style fixup

* nits configuration style

* fixup

* nits

* will this fix the tf pt issue?

* style

* ???????

* update

* eval?

* update error message

* updates

* style

* grumble grumble

* update

* style

* nit

* skip torch fx tests that were failing

* style

* skip the failing tests

* skip another test and make style
2024-11-25 10:43:16 +01:00
Raushan Turganbay
098962dac2
BLIP: fix generation after hub update (#34876)
* fix blip generation

* dont remove it yet

* Update src/transformers/models/blip_2/modeling_blip_2.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* address comments

* modular

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-11-25 10:41:55 +01:00
Raushan Turganbay
c1a8520419
Cache: init empty cache when use_cache (#34274)
* fix

* fix tests

* fix copies

* add docs

* Revert "add docs"

This reverts commit 32d35634f1.

* qwen move deltas

* mllama can potentially fullgraph compile

* enable mllama compile and fix tests

* remove mllama fixes
2024-11-25 10:11:33 +01:00
Raushan Turganbay
28fb02fc05
VLMs: enable generation tests - last batch (#34484)
* add tests for 3 more vlms

* fix fuyu back

* skip test
2024-11-21 11:00:22 +01:00
Phillip Kuznetsov
8cadf76e1c
fix(DPT,Depth-Anything) torch.export (#34103)
* Fix torch.export issue in dpt based models

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* Simplify the if statements

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* Move activation definitions of zoe_depth to init()

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* Add test_export for dpt and zoedepth

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* add depth anything

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* Remove zoedepth non-automated zoedepth changes and zoedepth test

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* [run_slow] dpt, depth_anything, zoedepth

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

---------

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>
2024-11-20 11:31:21 +01:00
Raushan Turganbay
9470d65324
Fix low memory beam search (#34746)
* fix

* higher max positions in tests
2024-11-20 07:46:35 +01:00
Yoni Gozlan
eedc113914
Add Image Processor Fast Deformable DETR (#34353)
* add deformable detr image processor fast

* add fast processor to doc

* fix copies

* nit docstring

* Add tests gpu/cpu and fix docstrings

* fix docstring

* import changes from detr

* fix imports

* rebase and fix

* fix input data format change in detr and rtdetr fast
2024-11-19 11:18:58 -05:00
Phillip Kuznetsov
5fa4f64605
🚨🚨🚨 fix(Mask2Former): torch export 🚨🚨🚨 (#34393)
* fix(Mask2Former): torch export

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* revert level_start_index and create a level_start_index_list

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* Add a comment to explain the level_start_index_list

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* Address comment

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* add torch.export.export test

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* rename arg

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* remove spatial_shapes

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* Use the version check from pytorch_utils

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* [run_slow] mask2former

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

---------

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>
2024-11-19 16:44:53 +01:00
Arthur
4bff54f921
Gemma capping (#34282)
* softcapping

* soft cap before the mask

* style

* ...

* super nit

* update

* fixes

* update

* small issue with modular

* fix modular imports

* update

* fixup

* simplify a hell lot

* simplify cleaning imports

* finish fixing

* update our design

* nits

* use a deprecation cycle

* updates

* Fix modular (recursive deps need to always be computed after merges!)

* push

* fix

* update

* fix modular order

* make fix-copies

* updates

* update

* ?

* don't compile for now

* ?

* fix some stuff

* donc!

* fix copies

* update

* fixup

* ?

* fix two tests

* fix?

* for now, don't use head info

* eager when output attention and sdpa or flash as it's the simplest behaviour (for our tests as well :))

* fix-copies

* revert sdpa check

* Apply suggestions from code review

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>

* rebase, fix-copies and push

* add a slow integration test

* update the test

* fix left padding issue

* fix test

* remove duplicate scaling

* quality

* add a small test and make sure it works

* 2b

---------

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
2024-11-19 13:52:38 +01:00
Jiahao Li
0db91c3c8d
Support gradient checkpointing in Qwen2VL ViT (#34724)
* Support gradient checkpointing in Qwen2VL ViT

* Enable gradient checkpoint tests for Qwen2VL

* [run-slow] qwen2_vl
2024-11-19 12:30:44 +01:00
Raushan Turganbay
1646ffb4d1
VLMs: patch_size -> num_image_tokens in processing (#33424)
* use num additional tokens

* fix copies + docs

* another fix copies :)

* add docs

* move order for BC
2024-11-18 13:21:07 +01:00
Shane A
3ee24e2208
Add OLMo November 2024 (#34551)
* Add model skeleton with transformers-cli add-new-model-like

* Convert config to modular, add rms_norm_eps, delete clip_qkv

* Convert model to modular, add RMSNorm

* Add flash attention with qk norm and no qkv clipping

* Add decoder layer with RMSNorm after attention/feedforward layers

* Add base and causal model

* Add converter improvements from OLMo repo

* Update weight loading in OLMo to HF converter

* Set correct default for rms_norm_eps

* Set correct pipeline_model_mapping in test

* Run make fixup

* Fix model type

* Re-run modular conversion

* Manually set config docs to fix build errors

* Convert olmo-1124 to olmo_1124 to fix flash attention docs errors

* Start updating tests

* Update tests

* Copy upstream test_eager_matches_sdpa_inference_1_bfloat16 changes to olmo_1124

* Rename input_layernorm and post_attention_layernorm to reflect their ops better

* Use correct tokenizer

* Remove test unsupported by GPT2 tokenizer

* Create GenerationConfig outside of from_pretrained call

* Use simpler init file structure

* Add explicit __all__ to support simplified init

* Make safetensor serialization the default

* Update OLMo November 2024 docs
2024-11-18 10:43:10 +01:00
Yih-Dar
f2d5dfbab2
Remove @slow for test_eager_matches_sdpa_inference (#34558)
* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-05 16:10:42 +01:00
Yoni Gottesman
082e57e0d4
Fix #34494 assistant tokens when truncated (#34531)
* Fix assistant tokens when truncated

* fix test

* fix test

* step
2024-11-05 15:10:15 +00:00
Guang Yang
663c851239
DistilBERT is ExecuTorch compatible (#34475)
* DistilBERT is ExecuTorch compatible

* [run_slow] distilbert

* [run_slow] distilbert

---------

Co-authored-by: Guang Yang <guangyang@fb.com>
2024-11-05 13:41:48 +01:00
Raushan Turganbay
893ad04fad
Load sub-configs from composite configs (#34410)
* save/load sub-configs

* nit forgot these

* fix copies

* move test to common

* use dict for sub-configs

* add load-save-load test

* clean up modeling check

* oops these are correct keys

* fix some tests, missed some composite configs

* this model was missed
2024-11-05 11:34:01 +01:00
Raushan Turganbay
187439c3fa
VLM: special multimodal Tokenizer (#34461)
* kinda works

* update

* add tests

* update

* use special tokens in processors

* typo

* fix copies

* fix

* fix moshi after rebase

* update

* fix tests

* update

* Update docs/source/en/main_classes/tokenizer.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* update docs

* test for load time adding tokens

* fix some more tests which are now fetched better

* one more fix

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-11-04 16:37:51 +01:00
Raushan Turganbay
4cc0813e28
BLIP: enable generation tests (#34174)
* blip2 tests

* instructblips

* copies

* fix slow tests

* fix

* uncomment this

* clean up after rebase

* should be model main input

* fix overwritten tests

* oops len should be multiple of frame number

* style

* fix some tests
2024-11-01 08:54:48 +01:00
Raushan Turganbay
6beb3f1691
Blip: get/set input embeddings correctly (#34152)
* set-get embeds

* add tests

* fix tests

* remove

* return dict True

* fix tests

* why did i remove this

* enable torchscript tests
2024-11-01 08:39:39 +01:00
NielsRogge
df8640cedb
[CLIPSeg] Make interpolate_pos_encoding default to True (#34419)
* Remove interpolate_pos_encoding

* Make fixup

* Make interpolate_pos_encoding default to True

* Reuse existing interpolation

* Add integration test
2024-10-31 22:15:04 +01:00
Yoni Gozlan
203e27059b
Add image text to text pipeline (#34170)
* Standardize image-text-to-text-models-output

add post_process_image_text_to_text to chameleon and cleanup

Fix legacy kwarg behavior and deprecation warning

add post_process_image_text_to_text to qwen2_vl and llava_onevision

Add post_process_image_text_to_text to idefics3, mllama, pixtral processor

* nit var name post_process_image_text_to_text udop

* nit fix deprecation warnings

* Add image-text-to-text pipeline

* add support for image url in chat template for pipeline

* Reformat to be fully compatible with chat templates

* Add tests chat template

* Fix imports and tests

* Add pipeline tag

* change logic handling of single prompt and multiple images

* add pipeline mapping to models

* fix batched inference

* fix tests

* Add manual batching for preprocessing

* Fix outputs with nested images

* Add support for all common processing kwargs

* Add default padding when multiple text inputs (batch size>1)

* nit change version deprecation warning

* Add support for text only inference

* add chat_template warnings

* Add pipeline tests and add copied from post process function

* Fix batched pipeline tests

* nit

* Fix pipeline tests blip2

* remove unnecessary max_new_tokens

* revert processing kosmos2 and remove unnecessary max_new_tokens

* fix pipeline tests idefics

* Force try loading processor if pipeline supports it

* revert load_processor change

* hardcode loading only processor

* remove unnecessary try except

* skip imagetexttotext tests for kosmos2 as tiny model causes problems

* Make code clearer

* Address review comments

* remove preprocessing logic from pipeline

* fix fuyu

* add BC resize fuyu

* Move post_process_image_text_to_text to ProcessorMixin

* add guard in post_process

* fix zero shot object detection pipeline

* add support for generator input in pipeline

* nit

* change default image-text-to-text model to llava onevision

* fix owlv2 size dict

* Change legacy deprecation warning to only show when True
2024-10-31 15:48:11 -04:00
Yih-Dar
114dd812dd
make test_eager_matches_sdpa_inference less flaky (#34512)
* try

* try

* try

* try

* try

* try

* update

* update

* update

* update

* update

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-31 18:34:00 +01:00
Phillip Kuznetsov
b5919e12f7
fix(DPT,Depth-Anything) Address expected_slice errors inside inference tests (#34518)
* fix(DPT,Depth-Anything) Address expected_slice errors inside inference tests

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* [run_slow] dpt, depth_anything

---------

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>
2024-10-31 16:47:58 +01:00
Yih-Dar
ab98f0b0a1
avoid calling gc.collect and cuda.empty_cache (#34514)
* update

* update

* update

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-31 16:36:13 +01:00
Yoni Gozlan
48872fd6ae
Add Image Processor Fast RT-DETR (#34354)
* add fast image processor rtdetr

* add gpu/cpu test and fix docstring

* remove prints

* add to doc

* nit docstring

* avoid iterating over images/annotations several times

* change torch typing

* Add image processor fast documentation
2024-10-30 13:49:47 -04:00
Pablo Montalvo
241d79026f
fix pixtral processor (#34486)
* fix pixtral processor

* test out full length batches + remove undue ValueError

* fix up processing

* fix tests

* fix

* last fixup

* style

* [run-slow] pixtral

* [run-slow] pixtral

* fix config key

* skip torchscript tests

* [run-slow] pixtral

* add missing key

* [run-slow] pixtral

* fix docs

* [run-slow] pixtral

* fix wrong url for integration test

* [run-slow] pixtral

* pixtralVisionModel does not have a lm head

* [run-slow] pixtral
2024-10-30 14:17:20 +01:00
Joao Gante
8a734ea2c3
Tests: move generate tests to the right mixin and delete redundant tests (#34464)
* tmp commit

* tmp commit

* cull overwrites of deleted tests

* typo

* more specific docstring

* make fixup

* parameterize at the top?

* correction

* more deletions :D

* tmp commit

* for VLMs too

* fix _check_outputs

* test nit

* make fixup

* fix another flaky

* test_generate_from_inputs_embeds -- handle missing attention mask
2024-10-30 10:59:08 +00:00
Raushan Turganbay
913330ca9f
VLMs: fix number of image tokens (#34332)
* fix

* fix tests

* add tests

* style

* style

* fix qwen after rebase

* fix video llava
2024-10-30 10:21:37 +01:00
Guang Yang
cd277618d4
Roberta is ExecuTorch compatible (#34425)
* Roberta is ExecuTorch compatible

* [run_slow] roberta

---------

Co-authored-by: Guang Yang <guangyang@fb.com>
2024-10-30 08:36:45 +00:00
Guang Yang
f339042b0b
Albert is ExecuTorch compatible (#34476)
Co-authored-by: Guang Yang <guangyang@fb.com>
2024-10-29 16:22:13 +01:00
Guang Yang
34620e8f0a
MobileBERT is ExecuTorch compatible (#34473)
Co-authored-by: Guang Yang <guangyang@fb.com>
2024-10-29 16:14:31 +01:00
Guang Yang
5392f12e16
Bert is ExecuTorch compatible (#34424)
Co-authored-by: Guang Yang <guangyang@fb.com>
2024-10-29 14:30:02 +01:00
Yih-Dar
439334c8fb
Simplify running tests in a subprocess (#34213)
* check

* check

* check

* check

* add docstring

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-29 10:48:57 +01:00
StevenBucaille
a1835195d1
🚨🚨🚨 [SuperPoint] Fix keypoint coordinate output and add post processing (#33200)
* feat: Added int conversion and unwrapping

* test: added tests for post_process_keypoint_detection of SuperPointImageProcessor

* docs: changed docs to include post_process_keypoint_detection method and switched from opencv to matplotlib

* test: changed test to not depend on SuperPointModel forward

* test: added missing require_torch decorator

* docs: changed pyplot parameters for the keypoints to be more visible in the example

* tests: changed import torch location to make test_flax and test_tf

* Revert "tests: changed import torch location to make test_flax and test_tf"

This reverts commit 39b32a2f69.

* tests: fixed import

* chore: applied suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* tests: fixed import

* tests: fixed import (bis)

* tests: fixed import (ter)

* feat: added choice of type for target_size and changed tests accordingly

* docs: updated code snippet to reflect the addition of target size type choice in post process method

* tests: fixed imports (...)

* tests: fixed imports (...)

* style: formatting file

* docs: fixed typo from image[0] to image.size[0]

* docs: added output image and fixed some tests

* Update docs/source/en/model_doc/superpoint.md

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* fix: included SuperPointKeypointDescriptionOutput in TYPE_CHECKING if statement and changed tests results to reflect changes to SuperPoint from absolute keypoints coordinates to relative

* docs: changed SuperPoint's docs to print output instead of just accessing

* style: applied make style

* docs: added missing output type and precision in docstring of post_process_keypoint_detection

* perf: deleted loop to perform keypoint conversion in one statement

* fix: moved keypoint conversion at the end of model forward

* docs: changed SuperPointInterestPointDecoder to SuperPointKeypointDecoder class name and added relative (x, y) coordinates information to its method

* fix: changed type hint

* refactor: removed unnecessary brackets

* revert: SuperPointKeypointDecoder to SuperPointInterestPointDecoder

* Update docs/source/en/model_doc/superpoint.md

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

---------

Co-authored-by: Steven Bucaille <steven.bucaille@buawei.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2024-10-29 09:36:03 +00:00
Raushan Turganbay
808d6c50f8
Generation: fix test (#34369)
* fix test

* fix copies
2024-10-29 07:57:10 +01:00
Alexandros Benetatos
a769ed45e1
Add post_process_depth_estimation for GLPN (#34413)
* add depth postprocessing for GLPN

* remove previous temp fix for glpn tests

* Style changes for GLPN's `post_process_depth_estimation`

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* additional style fix

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-10-28 19:44:20 +01:00
Ilyas Moutawwakil
fddbd3c13c
Fix pix2struct (#34374)
* fix

* fix and test use_cache test

* style

* remove atol
2024-10-28 11:24:56 +01:00
Joao Gante
186b8dc190
Tests: upgrade test_eager_matches_sdpa_generate (#34386)
2024-10-25 11:55:07 +01:00
Yoni Gozlan
940a6bd343
Use non nested images and batched text Idefics2/3 (#34222)
* add support for non nested images and add tests

* add tests error scenario

* fix style

* added single and no image to error tests
2024-10-24 20:00:13 -04:00
Cyril Vallez
4c6e0c9252
Correct the new defaults (#34377)
* Correct the new defaults

* CIs

* add check

* Update utils.py

* Update utils.py

* Add the max_length in generate test checking shape without passing length

* style

* CIs

* fix fx CI issue
2024-10-24 18:42:03 +02:00
Michael Benayoun
1c5918d910
Fix torch.fx issue related to the new loss_kwargs keyword argument (#34380)
* Fix FX

* Unskip tests
2024-10-24 18:34:28 +02:00
Raushan Turganbay
b29c24ff1e
CI: fix failures (#34371)
fix
2024-10-24 13:44:53 +02:00
Yih-Dar
c42b3223db
skip test_pipeline_depth_estimation temporarily (#34316)
skip

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-23 17:27:51 +02:00
Zach Mueller
d9f733625c
Enable Gradient Accumulation fix across all models + trainer fully in forward() (#34283)
* Enable grad accum fix across all models + trainer fully in forward()

* handle peft case

* Account for DDP: need to run scale tests

* Use accelerator state

* Quality

* Guard

* Experiment w/ only fairseq fix

* Fairseq only

* Revert multiply_grads fix

* Mult by grad accum to fully bring back solution

* Style

* Good to go now

* Skip fx tests for now

* Bookmark

* Working now
2024-10-23 11:24:57 -04:00
Yoni Gozlan
e7c3fa7f57
Fix continue_final_message for image-text-to-text chat templates (#34236)
* fix continue_final_message for vlms

* Add one test for vlms continue_final_message chat template
2024-10-22 11:57:44 -04:00
Guang Yang
c14ccbcd64
Olmo is ExecuTorch Compatible (#34181)
Co-authored-by: Guang Yang <guangyang@fb.com>
2024-10-22 15:53:01 +02:00
Guang Yang
7a08a772cc
Qwen2.5 is ExecuTorch Compatible (#34102)
Qwen2 is ExecuTorch Compatible

Co-authored-by: Guang Yang <guangyang@fb.com>
2024-10-22 15:52:23 +02:00
Alexandros Benetatos
c31a6ff474
Add post_process_depth_estimation to image processors and support ZoeDepth's inference intricacies (#32550)
* add colorize_depth and matplotlib availability check

* add post_process_depth_estimation for zoedepth + tests

* add post_process_depth_estimation for DPT + tests

* add post_process_depth_estimation in DepthEstimationPipeline & special case for zoedepth

* run `make fixup`

* fix import related error on tests

* fix more import related errors on test

* forgot some `torch` calls in declarations

* remove `torch` call in zoedepth tests that caused error

* updated docs for depth estimation

* small fix for `colorize` input/output types

* remove `colorize_depth`, fix various names, remove matplotlib dependency

* fix formatting

* run fixup

* different images for test

* update examples in `forward` functions

* fixed broken links

* fix output types for docs

* possible format fix inside `<Tip>`

* Readability related updates

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Readability related update

* cleanup after merge

* refactor `post_process_depth_estimation` to return dict; simplify ZoeDepth's `post_process_depth_estimation`

* rewrite dict merging to support python 3.8

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2024-10-22 15:50:54 +02:00
Raushan Turganbay
73d65e637b
T5 compile compatibility (#34089)
* this worked in normal generation, needs more tests

* fix almost all tests in t5

* nit

* longt5, umt5, mt5

* style

* udop, pix2struct

* more models

* fix some tests

* fix onnx tests

* tracing tests fixed

* compile enabled and tested for t5 models

* fix small bug in slow tests

* [run-slow] t5

* uncomment

* style

* update with new generation refactoring

* nit

* fix copies

* this is the fix, had to change t5 to fix copies

* update

* [run-slow] t5

* [run-slow] t5

* update

* add test for encoder only T5

* clean up after rebase

* fix pop2piano

* add comment

* style

* fix copies after rebase

* fix copies  missed this one
2024-10-22 08:23:53 +02:00
Raushan Turganbay
21d5025826
Attn implementation for composite models (#32238)
* first try

* codestyle

* idefics2 is happy

* [run-slow] llava, llava_next, video_llava, vipllava, llava_next_video, idefics, idefics2, kosmos2, fuyu, blip, blip_2, instructblip, instructblipvideo, paligemma

* fix-copies

* [run-slow] llava, llava_next, video_llava, vipllava, llava_next_video, idefics, idefics2, kosmos2, fuyu, blip, blip_2, instructblip, instructblipvideo

* blip-2 needs to init vision from config

* when was this removed O_o

* minor fix

* tests

* this way?

* tests

* model-agnostic code

* codestyle

* add tests for idefics

* modify general test for VLMs

* no generation test for vlm yet!

* no generation test here also

* warn in VIT-SDPA if output attn

* add more tests

* user can pass dict as attn impl

* repo consistency

* update

* musicgen

* no prints

* forgot speech enc-dec and clip

* how many composite models we have?

* musicgen melody is same as musicgen

* +siglip

* fix tests + add some more

* remove idefics custom overridden code

* make idefics2 automappable

* nits

* skip tests

* doctests

* Update src/transformers/models/idefics2/configuration_idefics2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/clip/test_modeling_clip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/idefics2/test_modeling_idefics2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/idefics2/test_modeling_idefics2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/configuration_utils.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* major update, no need for automap

* clean up

* add FA2 test

* more tests

* style

* skip tests

* why did these start failing now?

* no attributes for FA2 needed

* one tiny test

* address comment about FA2 false warning

* style

* add new models and resolve conflicts

* fix copies

* let it be this way for now, come back tomorrow to review

* some more fixes

* update

* more updates

* update

* fix copies

* style and tests

* another big update

* fix tests

* fix tests

* update

* another update

* fix tests

* fix copies

* fix tests

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-10-22 06:54:44 +02:00
Yoni Gozlan
a4122813d1
Add DetrImageProcessorFast (#34063)
* add fully functioning image_processing_detr_fast

* Create tensors on the correct device

* fix copies

* fix doc

* add tests equivalence cpu gpu

* fix doc en

* add relative imports and copied from

* Fix copies and nit
2024-10-21 09:05:05 -04:00
Raushan Turganbay
ca541bd4f4
Generation tests: don't rely on main input name (#34228)
* don't rely on main input name

* update
2024-10-21 10:00:14 +02:00
Cyril Vallez
6604764007
add Glm (#33823)
* Create modular_glm.py

* Update modular_glm.py

* Finalize architecture without all attentions

* Add all attentions modules

* Finalize modular

* Update given last version

* Last update

* Finalize model

* Finalize converter

* Update convert_glm_weights_to_hf.py

* style

* style

* Create __init__.py

* Add all inits

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Correct the rotary embeddings

* Remove apply_residual_connection_post_layernorm (always false)

* remove use_rms_norm (always true)

* remove past_layer_norm (always true)

* Update __init__.py

* Update config and license

* start adding tests and doc

* Add doc + style

* Update test_modeling_glm.py

* Add dummies

* Apply correct modeling

* Refactor attention to follow llama

* Update __init__.py

* Update convert_glm_weights_to_hf.py

* Correct bias

* remove linear_bias and pdrop (never used)

* apply modular

* Simplify converter

* remove dummies + style

* add model_input_names

* Add pretraining_tp to config for when eager attention is used

* Update modular to remove all pretraining_tp

* Update test_modeling_glm.py

* Update the __all__

* Update __all__

* Update __init__.py

* Update test_modeling_glm.py

* add revisions

* Add the correct repos and revisions

* style

* Update __init__.py

* update exports

* remove import of modular files

* style

* Apply Llama changes + refine converter

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* style

* Use new modular converter

* add pretrainedmodel to init

* style

* Update test_modeling_glm.py

* Move config outside modular to please CI about docstrings

* Add dummies to please CI

* Update glm.md

* Update glm.md
2024-10-18 17:41:12 +02:00
Arthur
b54109c746
Fix-red-ci (#34230)
* fix copies, skip fx for llama

* style

* re-fix copies

* last?

* style
2024-10-17 23:38:35 +02:00
Guang Yang
9470c00042
Llama3 and Llama2 are ExecuTorch compatible (#34101)
Llama3_1b and Llama2_7b are ExecuTorch compatible

Co-authored-by: Guang Yang <guangyang@fb.com>
2024-10-17 17:33:19 +02:00
Yoach Lacombe
9ba021ea75
Moshi integration (#33624)
* clean mimi commit

* some nits suggestions from Arthur

* make fixup

* first moshi WIP

* converting weights working + configuration + generation configuration

* finalize converting script - still missing tokenizer and FE and processor

* fix saving model w/o default config

* working generation

* use GenerationMixin instead of inheriting

* add delay pattern mask

* fix right order: moshi codes then user codes

* unconditional inputs + generation config

* get rid of MoshiGenerationConfig

* blank user inputs

* update convert script: fix conversion, add tokenizer, feature extractor and bf16

* add and correct Auto classes

* update modeling code, configuration and tests

* make fixup

* fix some copies

* WIP: add integration tests

* add dummy objects

* propose better readability and code organisation

* update tokenization tests

* update docstrings, eval and modeling

* add .md

* make fixup

* add MoshiForConditionalGeneration to ignore Auto

* revert mimi changes

* re

* further fix

* Update moshi.md

* correct md formatting

* move prepare causal mask to class

* fix copies

* fix depth decoder causal

* fix and correct some tests

* make style and update .md

* correct config checkpoint

* Update tests/models/moshi/test_tokenization_moshi.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update tests/models/moshi/test_tokenization_moshi.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* make style

* Update src/transformers/models/moshi/__init__.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fixup

* change firm in copyrights

* update config with nested dict

* replace einsum

* make style

* change split to True

* add back split=False

* remove tests in convert

* Update tests/models/moshi/test_modeling_moshi.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* add default config repo + add model to FA2 docstrings

* remove logits float

* fix some tokenization tests and ignore some others

* make style tokenization tests

* update modeling with sliding window + update modeling tests

* [run-slow] moshi

* remove prepare for generation from CausalLM

* isort

* remove copied from

* ignore offload tests

* update causal mask and prepare 4D mask aligned with recent changes

* further test refine + add back prepare_inputs_for_generation for depth decoder

* correct conditional use of prepare mask

* update slow integration tests

* fix multi-device forward

* remove previous solution to device_map

* save_load is flaky

* fix generate multi-devices

* fix device

* move tensor to int

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Marc Sun <marc@huggingface.co>
2024-10-16 11:21:49 +02:00
Raushan Turganbay
d087165db0
IDEFICS: support inputs embeds (#34043)
* support embeds

* use cache from config

* style...

* fix tests after rebase
2024-10-16 09:25:26 +02:00
laurentd-lunit
0f49deacbf
[feat] LlavaNext add feature size check to avoid CUDA Runtime Error (#33608)
* [feat] add feature size check to avoid CUDA Runtime Error

* [minor] add error handling to all llava models

* [minor] avoid nested if else

* [minor] add error message to Qwen2-vl and chameleon

* [fix] token dimension for check

* [minor] add feature dim check for videos too

* [fix] dimension check

* [fix] test reference values

---------

Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
2024-10-15 16:19:18 +02:00
Prakarsh Kaushik
293e6271c6
Add sdpa for Vivit (#33757)
* chore:add sdpa to vivit

* fix:failing slow test_inference_interpolate_pos_encoding(failing on main branch too)

* chore:fix nits

* ci:fix repo consistency failure

* chore:add info and benchmark to model doc

* [run_slow] vivit

* chore:revert interpolation test fix for new issue

* [run_slow] vivit

* [run_slow] vivit

* [run_slow] vivit

* chore:add fallback for output_attentions being True

* [run_slow] vivit

* style:make fixup

* [run_slow] vivit
2024-10-15 11:27:54 +02:00
Raushan Turganbay
23874f5948
Idefics: enable generation tests (#34062)
* add idefics

* conflicts after merging main

* enable tests but need to fix some

* fix tests

* no print

* fix/skip some slow tests

* continue not skip

* rebasing broke something, this is the fix
2024-10-15 11:17:14 +02:00
Anton Vlasjuk
7434c0ed21
Mistral-related models for QnA (#34045)
* mistral qna start

* mixtral qna

* oops

* qwen2 qna

* qwen2moe qna

* add missing input embed methods

* add copied to all methods, can't directly from llama due to the prefix

* make top level copied from
2024-10-14 08:53:32 +02:00
Yih-Dar
7b06473b8f
avoid many failures for ImageGPT (#34071)
* skip

* [run-slow] imagegpt

* skip

* [run-slow] imagegpt

* [run-slow] imagegpt,video_llava

* skip

* [run-slow] imagegpt,video_llava

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-11 15:24:01 +02:00
Yoach Lacombe
9dca0c9116
Fix DAC slow tests (#34088)
* Fix DAC slow tests and fix decode

* [run-slow] dac
2024-10-11 14:43:03 +02:00
Joao Gante
e878eaa9fc
Tests: upcast logits to float() (#34042)
upcast
2024-10-11 11:51:49 +01:00
Guang Yang
7d97cca8dd
Generate using exported model and enable gemma2-2b in ExecuTorch (#33707)
* Generate using exported model and enable gemma2-2b in ExecuTorch

* [run_slow] gemma, gemma2

* truncate expected output message

* Bump required torch version to support gemma2 export

* [run_slow] gemma, gemma2

---------

Co-authored-by: Guang Yang <guangyang@fb.com>
2024-10-11 10:16:31 +02:00
Pavel Iakubovskii
8363fd8346
Update Blip2 is_pipeline_test_to_skip method signature (#34067)
Update method signature
2024-10-10 16:32:08 +01:00
Mohamed Abu El-Nasr
4a3f1a686f
check if eigenvalues of covariance matrix are complex. (#34037)
check if eigenvalues of covariance complex for psd checking
2024-10-10 14:44:05 +02:00
Raushan Turganbay
adea67541a
Phi3: fix attn for sliding window (#33586)
* fix phi3 attn for sliding window

* fix tests

* address most comment

* style

* update after rebase

* add more models

* fix tests
2024-10-10 11:50:39 +02:00
Avishai Elmakies
a265600c60
add sdpa to OPT (#33298)
* add sdpa to OPT

* chore: remove redundant whitespace in OPTDecoder class

* fixup

* bug fix

* add sdpa and attention generate test

* fixup

* Refactor OPTAttention forward method for improved readability and maintainability

* undo refactor for _shape and key,val states

* add OPT to doc, fixup didn't find it for some reason

* change order

* change default attn_implementation in testing to eager

* [run-slow] opt

* change test_eager_matches_sdpa_generate to the one llama

* Update default attention implementation in testing common

* [run-slow] opt

* remove unneeded print

* [run-slow] opt

* refactor model testers to have attn_implementation="eager"

* [run-slow] opt

* convert test_eager_matches_sdpa_generate to opt-350M

* bug fix when creating mask for opt

* [run-slow] opt

* if layer head mask default to eager

* if head mask is not none fall to eager

* [run-slow] opt

* Update src/transformers/models/opt/modeling_opt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Clean up Unpack imports (#33631)

clean up Unpack imports

* Fix DPT /Dinov2 sdpa regression on main (#33660)

* fallback to eager if output attentions.

* fix copies

* handle dependency errors in check_imports (#33622)

* handle dependency errors in check_imports

* change log level to warning

* add back self.max_position_embeddings = config.max_position_embeddings (#33550)

* add back self.max_position_embeddings = config.max_position_embeddings

* fix-copies

* Fix Llava conversion for LlavaQwen2ForCausalLM with Clip vision tower (#33613)

fix llavaqwen2 model conversion

* Uniformize kwargs for Udop processor and update docs (#33628)

* Add optional kwargs and uniformize udop

* cleanup Unpack

* nit Udop

* Generation: deprecate `PreTrainedModel` inheriting from `GenerationMixin`  (#33203)

* Enable BNB multi-backend support (#31098)

* enable cpu bnb path

* fix style

* fix code style

* fix 4 bit path

* Update src/transformers/utils/import_utils.py

Co-authored-by: Aarni Koskela <akx@iki.fi>

* add multi backend refactor tests

* fix style

* tweak 4bit quantizer + fix corresponding tests

* tweak 8bit quantizer + *try* fixing corresponding tests

* fix dequant bnb 8bit

* account for Intel CPU in variability of expected outputs

* enable cpu and xpu device map

* further tweaks to account for Intel CPU

* fix autocast to work with both cpu + cuda

* fix comments

* fix comments

* switch to testing_utils.torch_device

* allow for xpu in multi-gpu tests

* fix tests 4bit for CPU NF4

* fix bug with is_torch_xpu_available needing to be called as func

* avoid issue where test reports attr err due to other failure

* fix formatting

* fix typo from resolving of merge conflict

* polish based on last PR review

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* fix CI

* Update src/transformers/integrations/integration_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/integrations/integration_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix error log

* fix error msg

* add \n in error log

* make quality

* rm bnb cuda restriction in doc

* cpu model don't need dispatch

* fix doc

* fix style

* check cuda available in testing

* fix tests

* Update docs/source/en/model_doc/chameleon.md

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update docs/source/en/model_doc/llava_next.md

Co-authored-by: Aarni Koskela <akx@iki.fi>

* Update tests/quantization/bnb/test_4bit.py

Co-authored-by: Aarni Koskela <akx@iki.fi>

* Update tests/quantization/bnb/test_4bit.py

Co-authored-by: Aarni Koskela <akx@iki.fi>

* fix doc

* fix check multibackends

* fix import sort

* remove check torch in bnb

* docs: update bitsandbytes references with multi-backend info

* docs: fix small mistakes in bnb paragraph

* run formatting

* revert bnb check

* move bnb multi-backend check to import_utils

* Update src/transformers/utils/import_utils.py

Co-authored-by: Aarni Koskela <akx@iki.fi>

* fix bnb check

* minor fix for bnb

* check lib first

* fix code style

* Revert "run formatting"

This reverts commit ac108c6d6b.

* fix format

* give warning when bnb version is low and no cuda found

* fix device assignment check to be multi-device capable

* address akx feedback on get_avlbl_dev fn

* revert partially, as we don't want the function that public, as docs would be too much (enforced)

---------

Co-authored-by: Aarni Koskela <akx@iki.fi>
Co-authored-by: Titus von Koeller <9048635+Titus-von-Koeller@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Fix error string after refactoring into get_chat_template (#33652)

* Fix error string after refactoring into get_chat_template

* Take suggestion from CR

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* uniformize git processor (#33668)

* uniformize git processor

* update docstring

* Modular `transformers`: modularity and inheritance for new model additions (#33248)

* update example

* update

* push the converted diff files for testing and ci

* correct one example

* fix class attributes and docstring

* nits

* oups

* fixed config!

* update

* nitd

* class attributes are not matched against the other, this is missing

* fixed overwriting self.xxx now onto the attributes I think

* partial fix, now order with docstring

* fix docstring order?

* more fixes

* update

* fix missing docstrings!

* examples don't all work yet

* fixup

* nit

* updated

* hick

* update

* delete

* update

* update

* update

* fix

* all default

* no local import

* fix more diff

* some fix related to "safe imports"

* push fixed

* add helper!

* style

* add a check

* all by default

* add the

* update

* FINALLY!

* nit

* fix config dependencies

* man that is it

* fix fix

* update diffs

* fix the last issue

* re-default to all

* all the fixes

* nice

* fix properties vs setter

* fixup

* updates

* update dependencies

* make sure to install what needs to be installed

* fixup

* quick fix for now

* fix!

* fixup

* update

* update

* updates

* whitespaces

* nit

* fix

* simplify everything, and make it file agnostic (should work for image processors)

* style

* finish fixing all import issues

* fixup

* empty modeling should not be written!

* Add logic to find who depends on what

* update

* cleanup

* update

* update gemma to support positions

* some small nits

* this is the correct docstring for gemma2

* fix merging of docstrings

* update

* fixup

* update

* take doc into account

* styling

* update

* fix hidden activation

* more fixes

* final fixes!

* fixup

* fixup instruct  blip video

* update

* fix bugs

* align gemma2 with the rest as well

* updates

* revert

* update

* more reversion

* grind

* more

* arf

* update

* order will matter

* finish del stuff

* update

* rename to modular

* fixup

* nits

* update makefile

* fixup

* update order of the checks!

* fix

* fix docstring that has a call inside

* fix conversion check

* style

* add some initial documentation

* update

* update doc

* some fixup

* updates

* yups

* Mostly todo gimme a minute

* update

* fixup

* revert some stuff

* Review docs for the modular transformers (#33472)

Docs

* good update

* fixup

* mmm current updates lead to this code

* okay, this fixes it

* cool

* fixes

* update

* nit

* updates

* nits

* fix doc

* update

* revert bad changes

* update

* updates

* proper update

* update

* update?

* up

* update

* cool

* nits

* nits

* bon bon

* fix

* ?

* minimise changes

* update

* update

* update

* updates?

* fixed gemma2

* kind of a hack

* nits

* update

* remove `diffs` in favor of `modular`

* fix make fix copies

---------

Co-authored-by: Lysandre Debut <hi@lysand.re>

* Fix CIs post merging modular transformers (#33681)

update

* Fixed docstring for cohere model regarding unavailability of prune_he… (#33253)

* Fixed docstring for cohere model regarding unavailability of prune_head() methods

The docstring mentions that cohere model supports prune_heads() methods. I have fixed the docstring by explicitly mentioning that it doesn't support that functionality.

* Update src/transformers/models/cohere/modeling_cohere.py

---------

Co-authored-by: Lysandre Debut <hi@lysand.re>

* Generation tests: update imagegpt input name, remove unused functions (#33663)

* Improve Error Messaging for Flash Attention 2 on CPU (#33655)

Update flash-attn error message on CPU

Rebased to latest branch

* Gemma2: fix config initialization (`cache_implementation`) (#33684)

* Fix ByteLevel alphabet missing when Sequence pretokenizer is used (#33556)

* Fix ByteLevel alphabet missing when Sequence pretokenizer is used

* Fixed formatting with `ruff`.

* Uniformize kwargs for image-text-to-text processors (#32544)

* uniformize FUYU processor kwargs

* Uniformize instructblip processor kwargs

* Fix processor kwargs and tests Fuyu, InstructBlip, Kosmos2

* Uniformize llava_next processor

* Fix save_load test for processor with chat_template only as extra init args

* Fix import Unpack

* Fix Fuyu Processor import

* Fix FuyuProcessor import

* Fix FuyuProcessor

* Add defaults for specific kwargs kosmos2

* Fix Udop to return BatchFeature instead of BatchEncoding and uniformize kwargs

* Add tests processor Udop

* remove Copied from in processing Udop as change of input orders caused by BatchEncoding -> BatchFeature

* Fix overwrite tests kwargs processors

* Add warnings and BC for changes in processor inputs order, change docs, add BC for text_pair as arg for Udop

* Fix processing test fuyu

* remove unnecessary pad_token check in instructblip ProcessorTest

* Fix BC tests and cleanup

* Fix imports fuyu

* Uniformize Pix2Struct

* Fix wrong name for FuyuProcessorKwargs

* Fix slow tests reversed inputs align fuyu llava-next, change udop warning

* Fix wrong logging import udop

* Add check images text input order

* Fix copies

* change text pair handling when positional arg

* rebase on main, fix imports in test_processing_common

* remove optional args and udop uniformization from this PR

* fix failing tests

* remove unnecessary test, fix processing utils and test processing common

* cleanup Unpack

* cleanup

* fix conflict grounding dino

* 🚨🚨 Setting default behavior of assisted decoding (#33657)

* tests: fix pytorch tensor placement errors (#33485)

This commit fixes the following errors:
* Fix "expected all tensors to be on the same device" error
* Fix "can't convert device type tensor to numpy"

According to the PyTorch documentation, torch.Tensor.numpy(force=False)
performs the conversion only if the tensor is on the CPU (plus a few other
restrictions), which is not the case here. We need force=True since we just
need the data and don't care about tensor coherency.

Fixes: #33517
See: https://pytorch.org/docs/2.4/generated/torch.Tensor.numpy.html

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

* bump tokenizers, fix added tokens fast (#32535)

* update based on tokenizers release

* update

* nits

* update

* revert re addition

* don't break that yet

* fmt

* revert unwanted

* update tokenizers version

* update dep table

* update

* update in conversion script as well

* some fix

* revert

* fully revert

* fix training

* remove set trace

* fixup

* update

* update

* [Pixtral] Improve docs, rename model (#33491)

* Improve docs, rename model

* Fix style

* Update repo id

* fix code quality after merge

* HFQuantizer implementation for compressed-tensors library (#31704)

* Add compressed-tensors HFQuantizer implementation

* flag serializable as False

* run

* revive lines deleted by ruff

* fixes to load+save from sparseml, edit config to quantization_config, and load back

* address satrat comment

* compressed_tensors to compressed-tensors and revert back is_serializable

* rename quant_method from sparseml to compressed-tensors

* tests

* edit tests

* clean up tests

* make style

* cleanup

* cleanup

* add test skip for when compressed tensors is not installed

* remove pydantic import + style

* delay torch import in test

* initial docs

* update main init for compressed tensors config

* make fix-copies

* docstring

* remove fill_docstring

* Apply suggestions from code review

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* review comments

* review comments

* comments - suppress warnings on state dict load, tests, fixes

* bug-fix - remove unnecessary call to apply quant lifecycle

* run_compressed compatibility

* revert changes not needed for compression

* no longer need unexpected keys fn

* unexpected keys not needed either

* Apply suggestions from code review

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* add to_diff_dict

* update docs and expand testing

* Update _toctree.yml with compressed-tensors

* Update src/transformers/utils/quantization_config.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* update doc

* add note about saving a loaded model

---------

Co-authored-by: George Ohashi <george@neuralmagic.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Sara Adkins <sara@neuralmagic.com>
Co-authored-by: Sara Adkins <sara.adkins65@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Dipika Sikka <ds3822@columbia.edu>
Co-authored-by: Dipika <dipikasikka1@gmail.com>

* update model card for opt

* add batch size to inference table

* [slow-run] opt

* [run-slow] opt

---------

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Co-authored-by: Avishai Elmakies <avishai.elma@cs.huji.ac.il>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
Co-authored-by: chengchengpei <5881383+chengchengpei@users.noreply.github.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Aarni Koskela <akx@iki.fi>
Co-authored-by: Titus von Koeller <9048635+Titus-von-Koeller@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Tibor Reiss <75096465+tibor-reiss@users.noreply.github.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
Co-authored-by: Lysandre Debut <hi@lysand.re>
Co-authored-by: Muhammad Naufil <m.naufil1@gmail.com>
Co-authored-by: sizhky <yyeshr@gmail.com>
Co-authored-by: Umar Butler <umar@umar.au>
Co-authored-by: Jonathan Mamou <jonathan.mamou@intel.com>
Co-authored-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: Benjamin Fineran <bfineran@users.noreply.github.com>
Co-authored-by: George Ohashi <george@neuralmagic.com>
Co-authored-by: Sara Adkins <sara@neuralmagic.com>
Co-authored-by: Sara Adkins <sara.adkins65@gmail.com>
Co-authored-by: Dipika Sikka <ds3822@columbia.edu>
Co-authored-by: Dipika <dipikasikka1@gmail.com>
2024-10-10 11:49:34 +02:00
Pavel Iakubovskii
48461c0fe2
Make pipeline able to load processor (#32514)
* Refactor get_test_pipeline

* Fixup

* Fixing tests

* Add processor loading in tests

* Restructure processors loading

* Add processor to the pipeline

* Move model loading on top of the test

* Update `get_test_pipeline`

* Fixup

* Add class-based flags for loading processors

* Change `is_pipeline_test_to_skip` signature

* Skip t5 failing test for slow tokenizer

* Fixup

* Fix copies for T5

* Fix typo

* Add try/except for tokenizer loading (kosmos-2 case)

* Fixup

* Llama not fails for long generation

* Revert processor pass in text-generation test

* Fix docs

* Switch back to json file for image processors and feature extractors

* Add processor type check

* Remove except for tokenizers

* Fix docstring

* Fix empty lists for tests

* Fixup

* Fix load check

* Ensure we have non-empty test cases

* Update src/transformers/pipelines/__init__.py

Co-authored-by: Lysandre Debut <hi@lysand.re>

* Update src/transformers/pipelines/base.py

Co-authored-by: Lysandre Debut <hi@lysand.re>

* Rework comment

* Better docs, add note about pipeline components

* Change warning to error raise

* Fixup

* Refine pipeline docs

---------

Co-authored-by: Lysandre Debut <hi@lysand.re>
2024-10-09 16:46:11 +01:00
Raushan Turganbay
5ee52ae0bc
Mllama: fix tests (#34000)
* fix tests

* don't need this

* style
2024-10-09 14:02:56 +02:00
Joao Gante
295a90cb40
Generate: remove most decoder-only LLMs prepare_inputs_for_generation (#33870)
2024-10-09 12:15:48 +01:00
Mohamed Abu El-Nasr
cdee5285ca
Fix Failed tests with mobile bert resize tokens embedding (#33950)
* Fix Failed tests with mobile bert

* Cast to the correct dtype

* Code fixup

* Fix padding_idx larger than embedding_size

* Reduce covariance more. use 1e-7 instead of 1e-5

* Comment fix

* Reduce covariance more. use 1e-9 instead of 1e-7

* Copy new config

* all but MRA fixed

* fix mra

* very flaky

* skip instead

* make fixup

---------

Co-authored-by: Joao Gante <joao@huggingface.co>
2024-10-09 11:23:50 +01:00
Yoni Gozlan
e2001c3413
Add auto model for image-text-to-text (#32472)
* Add Auto model for image-text-to-text

* Remove donut from processing auto, add chameleon to image text to text models

* add qwen2_vl and llava_onevision

* add pixtral to auto model for image-text-to-text

* add mllama and idefics3

* remove models in IGNORE_NON_AUTO_CONFIGURED

* add AutoModelForImageTextToText to tests and doc
2024-10-08 14:26:43 +02:00
Arthur
736c7cde51
[pytest collection] Fix flax test collection (#34004)
bit weird but to filter I had to use this
2024-10-07 18:11:13 +02:00
Arthur
9b4b0c07db
[Red CIs] Fix hub failures (#34001)
maybe setup should work?
2024-10-07 10:56:24 +02:00
TomLim
1bd604d11c
[WIP] Add Tokenizer for MyT5 Model (#31286)
* Initial commit for MyT5 model

* custom implementation of MyT5 tokenizer, unused files deleted

* unittest for myt5 tokenizer

* update of import structure and style

* removed remnants of MyT5Config

* fixed docstrings

* Updates after review: filled documentation file, new docstrings and tests added

* Fixed code style issues

* fixed copied from to refer to function

* updated loading myt5 tokenizer in tests, added sample byte map file to fixtures

* changes after review

* removed redundant copied from

* removed redundant copied from

* optimization and loading model from hf

* [run_slow] myt5

* [run-slow] myt5

* Updated en documentation for myt5

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-10-06 10:33:16 +02:00
Yehoshua Cohen
56be9f1925
add test for Jamba with new model jamba-tiny-dev (#33863)
* add test for jamba with new model

* ruff fix

---------

Co-authored-by: Yehoshua Cohen <yehoshuaco@ai21.com>
2024-10-05 16:03:12 +02:00
Raushan Turganbay
612065efeb
Paligemma: fix static cache test (#33941)
* fix

* not flaky anymore + style
2024-10-05 09:47:37 +02:00
Joao Gante
38f9f10dd9
Cache: revert DynamicCache init for BC (#33861)
* tmp commit

* tmp commit

* make fixup

* missing removal

* fix condition

* fix end-to-end compilation

* if -> elif

* BC

* BC

* use @deprecate_kwarg("num_hidden_layers", version="4.47.0")

* wups the import

* 🥴

---------

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
2024-10-04 22:47:08 +02:00
Arthur
f92d354823
fix red check-copies (#33964)
2024-10-04 22:45:37 +02:00