Commit Graph

1416 Commits

Author SHA1 Message Date
Adibvafa Fallahpour
c269c5c74d
Fix Mamba slow path bug with dtype mismatch. (#32691)
* Fix Mamba slow path bug with dtype mismatch.

* Update test_modeling_mamba.py

* Improve style.

* Fix issue with cache position of dtype mismatch test.

* Change test for slow path.

* Revert changes.

* Switch to buggy code and add test to catch it.

* Fix the dtype mismatch bug and add test code to verify it.

* Fix minor bug with test.

* Fix incorrect dtype of model output.

* Fix incorrect dtype of cache.

* Fix incorrect dtype of ssm cache.

* Fix incorrect dtype of conv state.

* Remove assertion for ssm state.

* Add assertion for conv state dtype.

* Fix all issues with dtype mismatch test.
2024-10-01 09:28:40 +02:00
Joshua Lochner
18c5b216f1
Fix ViT-MAE decoder interpolate (#33330)
* Fix ViT-MAE decoder interpolate

* Add unit test for `interpolate_pos_encoding` w/ custom sizes

* [run_slow] vit_mae
2024-09-30 18:47:13 +02:00
Raushan Turganbay
3e039d3827
Paligemma support for multi-image (#33447)
* upadte

* Update src/transformers/models/paligemma/processing_paligemma.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* update docs

* better example in tests

* support image tokens

* read token

* Update tests/models/paligemma/test_processing_paligemma.py

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>

* nit: naming

* Update docs/source/en/model_doc/paligemma.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* conflicts after rebasing

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
2024-09-27 11:23:14 +02:00
Ita Zaporozhets
6730485b02
clean_up_tokenization_spaces=False if unset (#31938)
* clean_up_tokenization_spaces=False if unset

* deprecate warning

* updating param for old models

* update models

* make fix-copies

* fix-copies and update bert models

* warning msg

* update prophet and clvp

* updating test since space before is arbitrarily removed

* remove warning for 4.45
2024-09-26 19:38:20 +02:00
Arthur
46841d3eb2
[MllamaProcessor] Update errors and API with multiple image (#33715)
* update error

* update and add a test

* update

* update
2024-09-26 16:33:25 +02:00
Franz Louis Cesista
0a21381ba3
Uniformize kwargs for chameleon processor (#32181)
* uniformize kwargs of Chameleon

* fix linter nit

* rm stride default

* add tests for chameleon processor

* fix tests

* add comment on get_component

* rm Chameleon's slow tokenizer

* add check order images text + nit

* update docs and tests

* Fix LlamaTokenizer tests

* fix gated repo access

* fix wrong import

---------

Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
2024-09-26 10:18:07 -04:00
Andrés Marafioti
f2c388e3f9
Add Idefics 3! (#32473)
* Add Idefics 3!

* fixes to make both pipelines identical

* fix for quantized models

* First pass at the review

* remove vocab size from the main config (it's still in the text_config)

* hot fix for merve

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* re-add model_type for text_config

* remove support for old_cache

* remove hidden_size from main config

* rename idefics3 HF repo

* few changes suggested in the PR

* fix to input_data_format computation

* remove overwrite of _autoset_attn_implementation following @zucchini-nlp suggestion

* improve example

* few improvements from amy's review

* big change to enable processing input images as numpy arrays

* Changes to the code to uniformize processor kwargs

* image processing tests

* image processing tests fixes and some bugs they discovered

* addressed review comments from Yoni

* fix modeling tests

* remove special tokens that are not special

* fixes tests

* skip failing tests - they also fail for idefics2

* added paper and readded the tests with multi gpu, who knows

* Update docs/source/en/model_doc/idefics3.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* review amy until image_processing_idefics3

* last comments from Amy

* review amy

* Update src/transformers/models/idefics3/image_processing_idefics3.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/idefics3/modeling_idefics3.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update docs/source/en/model_doc/idefics3.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* doc improvement - amy review

* fix runtime error during fine-tuning

* amy's review

* Update src/transformers/models/idefics3/image_processing_idefics3.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/idefics3/image_processing_idefics3.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/idefics3/modeling_idefics3.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* ruff

* amy's comment on the order

* ruff ruff

* fix copies

* square images when they are not splitted

* ruff :(

* Update src/transformers/models/idefics3/image_processing_idefics3.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/idefics3/test_processing_idefics3.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* fix small bug introduced in refactor

* amy's image processing changes

* fixes peft tests and ruff

* modify to_pil_image from transformers. and review from emanuele.

* add modified to_pil_image

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-09-25 21:28:49 +02:00
Manuel
a55adee890
adding positional encoder changes and tests (#32600)
* adding positional encoder changes and tests

* adding ruff suggestions

* changes added by python utils/check_copies.py --fix_and_overwrite

* removing pos_encoding added by script

* adding interpolation to clipseg

* formatting

* adding further testing to altclip and better documentation to kosmos2

* skipping test_inputs_embeds_matches_input_ids_with_generate in git model

* fixing clipseg comment suggestions

* [run_slow] altclip, bridgetower, chinese_clip, clip, clipseg, git, kosmos2, x_clip

* fixing bridgetower test

* fixing altclip tensor output POS test

* adding ruff formatting

* fixing several tests

* formatting with ruff

* adding positional encoder changes and tests

* adding ruff suggestions

* changes added by python utils/check_copies.py --fix_and_overwrite

* removing pos_encoding added by script

* adding interpolation to clipseg

* formatting

* adding further testing to altclip and better documentation to kosmos2

* skipping test_inputs_embeds_matches_input_ids_with_generate in git model

* fixing clipseg comment suggestions

* fixing bridgetower test

* fixing altclip tensor output POS test

* adding ruff formatting

* fixing several tests

* formatting with ruff

* adding right pretrained model

* [run_slow] altclip, bridgetower, chinese_clip, clip, clipseg, git, kosmos2, x_clip

* fixing test_inference_image_segmentation

* [run_slow] altclip, bridgetower, chinese_clip, clip, clipseg, git, kosmos2, x_clip

* fixing test_inference_interpolate_pos_encoding for the git model as there is no vision_model_output

* [run_slow] altclip, bridgetower, chinese_clip, clip, clipseg, git, kosmos2, x_clip

* adding ruff formatting

* [run_slow] altclip, bridgetower, chinese_clip, clip, clipseg, git, kosmos2, x_clip

* adding new interpolate_pos_encoding function

* [run_slow] altclip, bridgetower, chinese_clip, clip, clipseg, git, kosmos2, x_clip

* fixing interpolate_POS funciton

* adapting output tensor in teests

* [run_slow] altclip, bridgetower, chinese_clip, clip, clipseg, git, kosmos2, x_clip

* modifying output tensor

* [run_slow] altclip, bridgetower, chinese_clip, clip, clipseg, git, kosmos2, x_clip

* adding the correct tensor

* [run_slow]  clipseg

* fixing spaces

* [run_slow]  clipseg

* [run_slow]  clipseg

---------

Co-authored-by: Manuel Sanchez Hernandez <manuel.sanchez.hernandez@schibsted.com>
2024-09-25 19:05:01 +01:00
Arthur
19d58d31f1
Add MLLama (#33703)
* current changes

* nit

* Add cross_attenttion_mask to processor

* multi-image fixed

* Add cross_attenttion_mask to processor

* cross attn works in all cases

* WIP refactoring function for image processor

* WIP refactoring image processor functions

* Refactor preprocess to use global loops instead of list nested list comps

* Docstrings

* Add channels unification

* fix dtype issues

* Update docsrings and format

* Consistent max_image_tiles

* current script

* updates

* Add convert to rgb

* Add image processor tests

* updates!

* update

* god damn it I am dumb sometimes

* Precompute aspect ratios

* now this works, full match

* fix 😉

* nits

* style

* fix model and conversion

* nit

* nit

* kinda works

* hack for sdpa non-contiguous bias

* nits here and there

* latest c hanges

* merge?

* run forward

* Add aspect_ratio_mask

* vision attention mask

* update script and config variable names

* nit

* nits

* be able to load

* style

* nits

* there

* nits

* make forward run

* small update

* enable generation multi-turn

* nit

* nit

* Clean up a bit for errors and typos

* A bit more constant fixes

* 90B keys and shapes match

* Fix for 11B model

* Fixup, remove debug part

* Docs

* Make max_aspect_ratio_id to be minimal

* Update image processing code to match new implementation

* Adjust conversion for final checkpoint state

* Change dim in repeat_interleave (accordig to meta code)

* tmp fix for num_tiles

* Fix for conversion (gate<->up, q/k_proj rope permute)

* nits

* codestyle

* Vision encoder fixes

* pass cross attn mask further

* Refactor aspect ratio mask

* Disable text-only generation

* Fix cross attention layers order, remove q/k norm rotation for cross atention layers

* Refactor gated position embeddings

* fix bugs but needs test with new weights

* rope scaling should be llama3

* Fix rope scaling name

* Remove debug for linear layer

* fix copies

* Make mask prepare private func

* Remove linear patch embed

* Make precomputed embeddings as nn.Embedding module

* MllamaPrecomputedAspectRatioEmbedding with config init

* Remove unused self.output_dim

* nit, intermediate layers

* Rename ln and pos_embed

* vision_chunk_size -> image_size

* return_intermediate -> intermediate_layers_indices

* vision_input_dim -> hidden_size

* Fix copied from statements

* fix most tests

* Fix more copied from

* layer_id->layer_idx

* Comment

* Fix tests for processor

* Copied from for _prepare_4d_causal_attention_mask_with_cache_position

* Style fix

* Add MllamaForCausalLM

* WIP fixing tests

* Remove duplicated layers

* Remove dummy file

* Fix style

* Fix consistency

* Fix some TODOs

* fix language_model instantiation, add docstring

* Move docstring, remove todos for precomputed embeds (we cannot init them properly)

* Add initial docstrings

* Fix

* fix some tests

* lets skip these

* nits, remove print, style

* Add one more copied from

* Improve test message

* Make validate func private

* Fix dummy objects

* Refactor `data_format` a bit + add comment

* typos/nits

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>

* fix dummy objects and imports

* Add chat template config json

* remove num_kv_heads from vision attention

* fix

* move some commits and add more tests

* fix test

* Remove `update_key_name` from modeling utils

* remove num-kv-heads again

* some prelimiary docs

* Update chat template + tests

* nit, conversion script max_num_tiles from params

* Fix warning for text-only generation

* Update conversion script for instruct models

* Update chat template in converstion + test

* add tests for CausalLM model

* model_max_length, avoid null chat_template

* Refactor conversion script

* Fix forward

* Fix integration tests

* Refactor vision config + docs

* Fix default

* Refactor text config

* Doc fixes

* Remove unused args, fix docs example

* Squashed commit of the following:

commit b51ce5a2efffbecdefbf6fc92ee87372ec9d8830
Author: qubvel <qubvel@gmail.com>
Date:   Wed Sep 18 13:39:15 2024 +0000

    Move model + add output hidden states and output attentions

* Fix num_channels

* Add mllama text and mllama vision models

* Fixing repo consistency

* Style fix

* Fixing repo consistency

* Fixing unused config params

* Fix failed tests after refactoring

* hidden_activation -> hidden_act  for text mlp

* Remove from_pretrained from sub-configs

* Apply suggestions from code review

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/mllama/convert_mllama_weights_to_hf.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Reuse lambda in conversion script

* Remove run.py

* Update docs/source/en/model_doc/mllama.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/mllama/processing_mllama.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Remove unused LlamaTokenizerFast

* Fix logging

* Refactor gating

* Remove cycle for collecting intermediate states

* Refactor text-only check, add integration test for text-only

* Revert from pretrained to configs

* Fix example

* Add auto `bos_token` adding in processor

* Fix tips

* Update src/transformers/models/auto/tokenization_auto.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Enable supports_gradient_checkpointing model flag

* add eager/sdpa options

* don't skip attn tests and bring back GC skips (did i really remove those?)

* Fix signature, but get error with None gradient

* Fix output attention tests

* Disable GC back

* Change no split modules

* Fix dropout

* Style

* Add Mllama to sdpa list

* Add post init for vision model

* Refine config for MllamaForCausalLMModelTest and skipped tests for CausalLM model

* if skipped, say it, don't pass

* Clean vision tester config

* Doc for args

* Update tests/models/mllama/test_modeling_mllama.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Add cross_attention_mask to test

* typehint

* Remove todo

* Enable gradient checkpointing

* Docstring

* Style

* Fixing and skipping some tests for new cache

* Mark flaky test

* Skip `test_sdpa_can_compile_dynamic` test

* Fixing some offload tests

* Add direct GenerationMixin inheritance

* Remove unused code

* Add initializer_range to vision config

* update the test to make sure we show if split

* fix gc?

* Fix repo consistency

* Undo modeling utils debug changes

* Fix link

* mllama -> Mllama

* [mllama] -> [Mllama]

* Enable compile test for CausalLM model (text-only)

* Fix TextModel prefix

* Update doc

* Docs for forward, type hints, and vision model prefix

* make sure to reset

* fix init

* small script refactor and styling

* nit

* updates!

* some nits

* Interpolate embeddings for 560 size and update integration tests

* nit

* does not suppor static cache!

* update

* fix

* nit2

* this?

* Fix conversion

* Style

* 4x memory improvement with image cache AFAIK

* Token decorator for tests

* Skip failing tests

* update processor errors

* fix split issues

* style

* weird

* style

* fix failing tests

* update

* nit fixing the whisper tests

* fix path

* update

---------

Co-authored-by: raushan <raushan@huggingface.co>
Co-authored-by: pavel <ubuntu@ip-10-90-0-11.ec2.internal>
Co-authored-by: qubvel <qubvel@gmail.com>
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2024-09-25 19:56:25 +02:00
Yoni Gozlan
94f18cf23c
Add OmDet-Turbo (#31843)
* Add template with add-new-model-like

* Add rough OmDetTurboEncoder and OmDetTurboDecoder

* Add working OmDetTurbo convert to hf

* Change OmDetTurbo encoder to RT-DETR encoder

* Add swin timm backbone as default, add always partition fix for swin timm

* Add labels and tasks caching

* Fix make fix-copies

* Format omdet_turbo

* fix Tokenizer tests

* Fix style and quality

* Reformat omdet_turbo

* Fix quality, style, copies

* Standardize processor kwargs

* Fix style

* Add output_hidden_states and ouput_attentions

* Add personalize multi-head attention, improve docstrings

* Add integrated test and fix copy, style, quality

* Fix unprotected import

* Cleanup comments and fix unprotected imports

* Add fix different prompts in batch (key_padding_mask)

* Add key_padding_mask to custom multi-head attention module

* Replace attention_mask by key_padding_mask

* Remove OmDetTurboModel and refactor

* Refactor processing of classes and abstract use of timm backbone

* Add testing, fix output attentions and hidden states, add cache for anchors generation

* Fix copies, style, quality

* Add documentation, conver key_padding_mask to attention_mask

* revert changes to backbone_utils

* Fic docstrings rst

* Fix unused argument in config

* Fix image link documentation

* Reorder config and cleanup

* Add tokenizer_init_kwargs in merge_kwargs of the processor

* Change AutoTokenizer to CLIPTokenizer in convert

* Fix init_weights

* Add ProcessorMixin tests, Fix convert while waiting on uniform kwargs

* change processor kwargs and make task input optional

* Fix omdet docs

* Remove unnecessary tests for processor kwargs

* Replace nested BatchEncoding output of the processor by a flattened BatchFeature

* Make modifications from Pavel review

* Add changes Amy review

* Remove unused param

* Remove normalize_before param, Modify processor call docstring

* Remove redundant decoder class, add gradient checkpointing for decoder

* Remove commented out code

* Fix inference in fp16 and add fp16 integrated test

* update omdet md doc

* Add OmdetTurboModel

* fix caching and nit

* add OmDetTurboModel to tests

* nit change repeated key test

* Improve inference speed in eager mode

* fix copies

* Fix nit

* remove OmdetTurboModel

* [run-slow] omdet_turbo

* [run-slow] omdet_turbo

* skip dataparallel test

* [run-slow] omdet_turbo

* update weights to new path

* remove unnecessary config in class

---------

Co-authored-by: Ubuntu <ubuntu@ip-172-31-91-248.ec2.internal>
2024-09-25 13:26:28 -04:00
NielsRogge
06e27e3dc0
[Pixtral] Improve docs, rename model (#33491)
* Improve docs, rename model

* Fix style

* Update repo id
2024-09-25 13:53:12 +02:00
Dmitry Rogozhkin
5e2916bc14
tests: fix pytorch tensor placement errors (#33485)
This commit fixes the following errors:
* Fix "expected all tensors to be on the same device" error
* Fix "can't convert device type tensor to numpy"

According to pytorch documentation torch.Tensor.numpy(force=False)
performs conversion only if tensor is on CPU (plus few other restrictions)
which is not the case. For our case we need force=True since we just
need a data and don't care about tensors coherency.

Fixes: #33517
See: https://pytorch.org/docs/2.4/generated/torch.Tensor.numpy.html

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
2024-09-25 12:21:53 +01:00
Yoni Gozlan
5f0c181f4e
Uniformize kwargs for image-text-to-text processors (#32544)
* uniformize FUYU processor kwargs

* Uniformize instructblip processor kwargs

* Fix processor kwargs and tests Fuyu, InstructBlip, Kosmos2

* Uniformize llava_next processor

* Fix save_load test for processor with chat_template only as extra init args

* Fix import Unpack

* Fix Fuyu Processor import

* Fix FuyuProcessor import

* Fix FuyuProcessor

* Add defaults for specific kwargs kosmos2

* Fix Udop to return BatchFeature instead of BatchEncoding and uniformize kwargs

* Add tests processor Udop

* remove Copied from in processing Udop as change of input orders caused by BatchEncoding -> BatchFeature

* Fix overwrite tests kwargs processors

* Add warnings and BC for changes in processor inputs order, change docs, add BC for text_pair as arg for Udop

* Fix processing test fuyu

* remove unnecessary pad_token check in instructblip ProcessorTest

* Fix BC tests and cleanup

* FIx imports fuyu

* Uniformize Pix2Struct

* Fix wrong name for FuyuProcessorKwargs

* Fix slow tests reversed inputs align fuyu llava-next, change udop warning

* Fix wrong logging import udop

* Add check images text input order

* Fix copies

* change text pair handling when positional arg

* rebase on main, fix imports in test_processing_common

* remove optional args and udop uniformization from this PR

* fix failing tests

* remove unnecessary test, fix processing utils and test processing common

* cleanup Unpack

* cleanup

* fix conflict grounding dino
2024-09-24 21:28:19 -04:00
Joao Gante
a7734238ff
Generation tests: update imagegpt input name, remove unused functions (#33663) 2024-09-24 16:40:48 +01:00
Joao Gante
e15687fffe
Generation: deprecate PreTrainedModel inheriting from GenerationMixin (#33203) 2024-09-23 18:28:36 +01:00
Yoni Gozlan
1456120929
Uniformize kwargs for Udop processor and update docs (#33628)
* Add optional kwargs and uniformize udop

* cleanup Unpack

* nit Udop
2024-09-23 12:47:32 -04:00
Avishai Elmakies
78b2929c05
Sdpa dino v2 (#33403)
* add sdpa to dinov2

* fixup

* add dinov2 to sdpa doc

* update doc order

* [run-slow] dinov2

* common to eager

* [run-slow] dinov2

* update attn implementation in common

* update test_modeling_dinov2 to have mask_ration, num_masks and mask_length similar to vit

* [run-slow] dinov2

---------

Co-authored-by: Avishai Elmakies <avishai.elma@cs.huji.ac.il>
2024-09-21 01:58:00 +01:00
Mayank Mishra
e472e077c2
Granitemoe (#33207)
* first commit

* drop tokenizer

* drop tokenizer

* drop tokenizer

* drop convert

* granite

* drop tokenization test

* mup

* fix

* reformat

* reformat

* reformat

* fix docs

* stop checking for checkpoint

* update support

* attention multiplier

* update model

* tiny drop

* saibo drop

* skip test

* fix test

* fix test

* drop

* drop useless imports

* update docs

* drop flash function

* copied from

* drop pretraining tp

* drop pretraining tp

* drop pretraining tp

* drop unused import

* drop code path

* change name

* softmax scale

* head dim

* drop legacy cache

* rename params

* cleanup

* fix copies

* comments

* add back legacy cache

* multipliers

* multipliers

* multipliers

* text fix

* fix copies

* merge

* multipliers

* attention multiplier

* drop unused imports

* add granitemoe

* add decoration

* remove moe from sequenceclassification

* fix test

* fix

* fix

* fix

* move rope?

* merge

* drop bias

* drop bias

* Update src/transformers/models/granite/configuration_granite.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix

* Update src/transformers/models/granite/modeling_granite.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix

* fix

* fix

* fix

* drop

* drop

* fix

* fix

* cleanup

* cleanup

* fix

* fix granite tests

* fp32 test

* fix

* drop jitter

* fix

* rename

* rename

* fix config

* add gen test

---------

Co-authored-by: Yikang Shen <yikang.shn@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-09-21 01:43:50 +02:00
Yoni Gozlan
c0c6815dc9
Add support for args to ProcessorMixin for backward compatibility (#33479)
* add check and prepare args for BC to ProcessorMixin, improve ProcessorTesterMixin

* change size and crop_size in processor kwargs tests to do_rescale and rescale_factor

* remove unnecessary llava processor kwargs test overwrite

* nit

* change data_arg_name to input_name

* Remove unnecessary test override

* Remove unnecessary tests Paligemma

* Move test_prepare_and_validate_optional_call_args to TesterMixin, add docstring
2024-09-20 11:40:59 -04:00
Joao Gante
2fdb5e74cc
VLM generate: tests can't generate image/video tokens (#33623) 2024-09-20 15:43:27 +01:00
amyeroberts
f9b4409726
Remove unnecessary CPM model tests (#33621)
Remove model tests
2024-09-20 14:20:57 +01:00
Lake Lee
ec1424c6a3
Update modeling_mamba2.py, fix pad size (#32599)
* Update modeling_mamba2.py

Fix pad_size calculation to ensure it's less than self.chunk_size

* [run_slow] mamba2

* [run-slow] mamba2

* [run-slow] Add @require_read_token decorator to failing tests for token propagation

* [run_slow] mamba2
2024-09-20 11:40:57 +01:00
Fanli Lin
8bd1f2f338
[tests] make more tests device-agnostic (#33580)
* enable

* fix

* add xpu skip

* add marker

* skip for xpu

* add more

* enable on accelerator

* add more cases

* add more tests

* add more
2024-09-20 10:16:43 +01:00
Fanli Lin
4d8908df27
[tests] enable GemmaIntegrationTest on XPU (#33555)
enable GemmaIntegrationTest
2024-09-19 19:39:19 +01:00
Fanli Lin
b87755aa6d
[tests] skip tests for xpu (#33553)
* enable

* fix

* add xpu skip

* add marker

* skip for xpu

* add more

* add one more
2024-09-19 19:28:04 +01:00
Yoni Gozlan
f111d5b783
Uniformize kwargs for Paligemma processor and update docs (#33571)
* Uniformize paligemma processor

* nit
2024-09-19 14:14:06 -04:00
Joao Gante
52920b5dd5
Cache: don't throw warnings on gemma2 when instantiating a new cache (#33595) 2024-09-19 17:42:47 +01:00
Anton Vlasjuk
b50ff5993a
[Mamba2] Move dt calculations to kernel (#33520)
* use kernel for dt calculations

* add small test

* [run-slow] mamba2
2024-09-19 17:41:17 +01:00
Pablo Montalvo
413008c580
add uniform processors for altclip + chinese_clip (#31198)
* add initial design for uniform processors + align model

* add uniform processors for altclip + chinese_clip

* fix mutable default 👀

* add configuration test

* handle structured kwargs w defaults + add test

* protect torch-specific test

* fix style

* fix

* rebase

* update processor to generic kwargs + test

* fix style

* add sensible kwargs merge

* update test

* fix assertEqual

* move kwargs merging to processing common

* rework kwargs for type hinting

* just get Unpack from extensions

* run-slow[align]

* handle kwargs passed as nested dict

* add from_pretrained test for nested kwargs handling

* [run-slow]align

* update documentation + imports

* update audio inputs

* protect audio types, silly

* try removing imports

* make things simpler

* simplerer

* move out kwargs test to common mixin

* [run-slow]align

* skip tests for old processors

* [run-slow]align, clip

* !$#@!! protect imports, darn it

* [run-slow]align, clip

* [run-slow]align, clip

* update common processor testing

* add altclip

* add chinese_clip

* add pad_size

* [run-slow]align, clip, chinese_clip, altclip

* remove duplicated tests

* fix

* update doc

* improve documentation for default values

* add model_max_length testing

This parameter depends on tokenizers received.

* Raise if kwargs are specified in two places

* fix

* match defaults

* force padding

* fix tokenizer test

* clean defaults

* move tests to common

* remove try/catch block

* deprecate kwarg

* format

* add copyright + remove unused method

* [run-slow]altclip, chinese_clip

* clean imports

* fix version

* clean up deprecation

* fix style

* add corner case test on kwarg overlap

* resume processing - add Unpack as importable

* add tmpdirname

* fix altclip

* fix up

* add back crop_size to specific tests

* generalize tests to possible video_processor

* add back crop_size arg

* fixup overlapping kwargs test for qformer_tokenizer

* remove copied from

* fixup chinese_clip tests values

* fixup tests - qformer tokenizers

* [run-slow] altclip, chinese_clip

* remove prepare_image_inputs
2024-09-19 17:21:54 +02:00
Pablo Montalvo
4f0246e535
fix tests with main revision and read token (#33560)
* fix tests with main revision and read token

* [run-slow]mamba2

* test previously skipped tests

* [run-slow]mamba2

* skip some tests

* [run-slow]mamba2

* finalize tests

* [run-slow]mamba2
2024-09-19 17:10:22 +02:00
Joao Gante
f3b3810fe6
rag: fix CI (#33578) 2024-09-19 11:55:26 +01:00
Raushan Turganbay
d7975a5874
VLMs: enable generation tests (#33533)
* add tests

* fix whisper

* update

* nit

* add qwen2-vl

* more updates!

* better this way

* fix this one

* fix more tests

* fix final tests, hope so

* fix led

* Update tests/generation/test_utils.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* pr comments

* not pass pixels and extra for low-mem tests, very flaky because of visio tower

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2024-09-19 12:04:24 +02:00
Raushan Turganbay
e40bb4845e
Load and save video-processor from separate folder (#33562)
* load and save from video-processor folder

* Update src/transformers/models/llava_onevision/processing_llava_onevision.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-09-19 09:56:52 +02:00
Yoach Lacombe
5af7d41e49
Codec integration (#33565)
* clean mimi commit

* some nits suggestions from Arthur

* make fixup

* rename repo id + change readme

* Update docs/source/en/model_doc/mimi.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* add flaky flag to batching equivalence due to audio_codes failing sometimes

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-09-18 19:23:44 +02:00
Raushan Turganbay
db72894b48
Chat template: save and load correctly for processors (#33462)
* fix

* add tests

* fix tests

* Update tests/models/llava/test_processor_llava.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* fix

* fix tests

* update tests

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-09-18 13:00:44 +02:00
Wang, Yi
454a0f2efd
fix patch_attention_mask incorrect setting which leads to the differe… (#33499)
* fix patch_attention_mask incorrect setting which leads to the difference in the generated text if batch > 1

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>

* fix format

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>

* [run_slow] idefics2

---------

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
2024-09-17 22:24:42 +01:00
Yoni Gozlan
d8500cd229
Uniformize kwargs for Pixtral processor (#33521)
* add uniformized pixtral and kwargs

* update doc

* fix _validate_images_text_input_order

* nit
2024-09-17 14:44:27 -04:00
Nikita Krasnytskyi
c29a8694b0
Fix missing sequences_scores in the Whisper beam search output (#32970)
* added sequences_scores to the output

* added beam_indices to output

* added test to check for beam_indices, sequences_scores and their shape

* removed redundant whitespaces

* make fixup
2024-09-17 19:36:11 +01:00
ErezSC42
46c27577b3
fix to jamba config, asserting attention and expert offset (#33316)
* fix to jamba config, asserting attention and expert offset

* fix foramtting

* fix foramtting

* fix foramtting

* changed to error raise instead of assertion, added unittests

* fix

* changed t_ to property_

* changed t_ to property_

* quickfix

* ran code styler
2024-09-17 19:29:27 +01:00
Wang, Yi
74026b473e
idefics2 enable_input_require_grads not aligned with disable_input_re… (#33194)
* idefics2 enable_input_require_grads not aligned with disable_input_require_grads
make peft+idefics2 checkpoints disable fail

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>

* split test case

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>

* fix ci failure

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>

* refine test

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>

---------

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
2024-09-17 10:39:34 +01:00
Insu Jang
bcf8946f0a
Fix number of patch check for different vision feature select strategy (#32494)
* Fix number of patch check for different vision feature select strategy

* add test

---------

Co-authored-by: raushan <raushan@huggingface.co>
2024-09-17 09:33:07 +02:00
Yoach Lacombe
18e1a9c719
Fix parametrization-based weight norm (#33275)
* refactor weight_norm + propose uniformed solution to reconcile meta load_state_dict with classic loading

* make style

* fix sew

* fix sew and sew_d tests
2024-09-17 08:05:21 +02:00
Yoach Lacombe
98adf24883
[Whisper test] Fix some failing tests (#33450)
* Fix failing tensor placement in Whisper

* fix long form generation tests

* more return_timestamps=True

* make fixup

* [run_slow] whisper

* [run_slow] whisper
2024-09-16 19:05:17 +02:00
Yoni Gozlan
2f62146f0e
Uniformize kwargs for LLaVa processor and update docs (#32858)
* Uniformize kwargs for LlaVa and update docs

* Change order of processor inputs in docstring

* Improve BC support for reversed images and text inputs

* cleanup llava processor call docstring

* Add encoded inputs as valid text inputs in reverse input check, add deprecation version in warning

* Put function check reversed images text outside base processor class

* Refactor _validate_images_text_input_order

* Add ProcessingUtilTester

* fix processing and test_processing
2024-09-16 11:26:26 -04:00
Arthur
8bd2b1e8c2
Add support for Pixtral (#33449)
* initial commit

* gloups

* updates

* work

* weights match

* nits

* nits

* updates to support the tokenizer :)

* updates

* Pixtral processor (#33454)

* rough outline

* Add in image break and end tokens

* Fix

* Udo some formatting changes

* Set patch_size default

* Fix

* Fix token expansion

* nit in conversion script

* Fix image token list creation

* done

* add expected results

* Process list of list of images (#33465)

* updates

* working image and processor

* this is the expected format

* some fixes

* push current updated

* working mult images!

* add a small integration test

* Uodate configuration docstring

* Formatting

* Config docstring fix

* simplify model test

* fixup modeling and etests

* Return BatchMixFeature in image processor

* fix some copies

* update

* nits

* Update model docstring

* Apply suggestions from code review

* Fix up

* updates

* revert modeling changes

* update

* update

* fix load safe

* addd liscence

* update

* use pixel_values as required by the model

* skip some tests and refactor

* Add pixtral image processing tests (#33476)

* Image processing tests

* Add processing tests

* woops

* defaults reflect pixtral image processor

* fixup post merge

* images -> pixel values

* oups sorry Mr docbuilder

* isort

* fix

* fix processor tests

* small fixes

* nit

* update

* last nits

* oups this was really breaking!

* nits

* is composition needs to be true

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-09-14 12:28:39 +02:00
Amit Garg
dfd31158ee
[Phi-3] Bug on stale kv cache (#33129)
* fix long seq bug

* fixed format

* fixed fn copy inconsistency

* fix long seq bug

* fixed format

* fixed fn copy inconsistency

* Addressed comments

* added a unit test

* fixed cache position

* Added a warning msg to the forward fn

* fixed test case
2024-09-13 14:07:19 +02:00
Raushan Turganbay
4b0418df11
Enable padding_side as call time kwargs (#33385)
* fix

* add padding-side kwarg

* add padding side in all models & fix tests

* fix copies

* fix tests
2024-09-13 11:58:38 +01:00
Raushan Turganbay
9c4639b622
Return image hidden states (#33426)
* fix

* return image hidden states

* fix copies

* fix test
2024-09-13 10:20:03 +02:00
benniekiss
5c6257d1fc
[whisper] Clarify error message when setting max_new_tokens (#33324)
* clarify error message when setting max_new_tokens

* sync error message in test_generate_with_prompt_ids_max_length

* there is no self
2024-09-12 18:48:36 +02:00
Raushan Turganbay
2f611d30d9
Qwen2-VL: clean-up and add more tests (#33354)
* clean-up on qwen2-vl and add generation tests

* add video tests

* Update tests/models/qwen2_vl/test_processing_qwen2_vl.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* fix and add better tests

* Update src/transformers/models/qwen2_vl/image_processing_qwen2_vl.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* update docs and address comments

* Update docs/source/en/model_doc/qwen2_vl.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2_vl.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* update

* remove size at all

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-09-12 18:24:04 +02:00