Commit Graph

37 Commits

Author SHA1 Message Date
Cyril Vallez
07aab1af1e
Remove dead protected imports (#38980)
* remove them

* more
2025-06-23 13:44:50 +02:00
Yih-Dar
c87058beb8
Fix llava_onevision tests (#38791)
* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-12 15:06:49 +02:00
youngrok cha
101b3fa4ea
fix multi-image case for llava-onevision (#38084)
* _get_padding_size module

* do not patchify images when processing multi image

* modify llava onevision image processor fast

* tensor to list of tensors

* backward compat

* reuse pad_to_square in llave & some clarification

* add to doc

* fix: consider no image cases (text only or video)

* add integration test

* style & repo_consistency
2025-05-21 11:50:46 +02:00
youngrok cha
a5cc7a67d7
[bug] fix llava processor to calculate unpadding size correctly (#37988)
* fix llava processor to calculate unpad size correctly

* repo consistency

* Revert "repo consistency" & "setUp in llava family"

This reverts commit 26a50af8db.

* add edge case test for padding & unpadding

* compute unpadding size from original size

* make test config explicit

* Revert "compute unpadding size from original size"

This reverts commit 752cd27ad9.

* Revert "add edge case test for padding & unpadding"

This reverts commit ccbd094d69.

* revert unpad logic

* remove irrelevant tests

* model test

* remove processor from model test

---------

Co-authored-by: jaycha <jaycha@ncsoft.com>
2025-05-13 13:49:09 +00:00
Raushan Turganbay
a31fa218ad
🔴 Video processors as a separate class (#35206)
* initial design

* update all video processors

* add tests

* need to add qwen2-vl (not tested yet)

* add qwen2-vl in auto map

* fix copies

* isort

* resolve confilicts kinda

* nit:

* qwen2-vl is happy now

* qwen2-5 happy

* other models are happy

* fix copies

* fix tests

* add docs

* CI green now?

* add more tests

* even more changes + tests

* doc builder fail

* nit

* Update src/transformers/models/auto/processing_auto.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* small update

* imports correctly

* dump, otherwise this is getting unmanagebale T-T

* dump

* update

* another update

* update

* tests

* move

* modular

* docs

* test

* another update

* init

* remove flakiness in tests

* fixup

* clean up and remove commented lines

* docs

* skip this one!

* last fix after rebasing

* run fixup

* delete slow files

* remove unnecessary tests + clean up a bit

* small fixes

* fix tests

* more updates

* docs

* fix tests

* update

* style

* fix qwen2-5-vl

* fixup

* fixup

* unflatten batch when preparing

* dump, come back soon

* add docs and fix some tests

* how to guard this with new dummies?

* chat templates in qwen

* address some comments

* remove `Fast` suffix

* fixup

* oops should be imported from transforms

* typo in requires dummies

* new model added with video support

* fixup once more

* last fixup I hope

* revert image processor name + comments

* oh, this is why fetch test is failing

* fix tests

* fix more tests

* fixup

* add new models: internvl, smolvlm

* update docs

* imprt once

* fix failing tests

* do we need to guard it here again, why?

* new model was added, update it

* remove testcase from tester

* fix tests

* make style

* not related CI fail, lets' just fix here

* mark flaky for now, filas 15 out of 100

* style

* maybe we can do this way?

* don't download images in setup class

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-05-12 11:55:51 +02:00
Raushan Turganbay
17742bd9c8
🔴 [VLM] Add base model without head (#37033)
* i guessreverted all CdGen classes

* style

* llava onevision

* fix copies

* fix some tests

* some more tests

* dump

* skip these

* nevermind, i am dumb

* revert fix not needed

* fixup

* fixup

* another fixup

* more fixup to make ci finally happy

* fixup after rebasing

* fix qwen tests

* add internVL + typos here and there

* image token index -> id

* style

* fix init weights

* revert blip-2 not supported

* address comments

* fix copies

* revert blip2 test file as well

* as discussed internally, revert back CdGen models

* fix some tests

* fix more tests for compile

* CI red

* fix copies

* enumerate explicitly allowed models

* address comments

* fix tests

* fixup

* style again

* add tests for new model class

* another fixup ( x _ x )

* [fixup] unused attributes can be removed post-deprecation
2025-05-07 17:47:51 +02:00
youngrok cha
acded47fe7
[llava] one pixel is missing from padding when length is odd (#37819)
* [fix] one pixel should be added when length is odd

* [fix] add vision_aspect_ratio args & typo

* [fix] style

* [fix] do not fix fast file directly

* [fix] convert using modular

* remove duplicate codes

* match unpad logic with pad logic

* test odd-sized images for llava & aria

* test unpad odd-sized padding for llava family

* fix style

* add kwarg to onvision modular

* move vision_aspect_ratio from image_processor to processor
(llava_onevision)
2025-05-06 13:11:26 +02:00
Raushan Turganbay
1cfcbfcab8
[VLMs] fix flash-attention tests (#37603)
* fix one test

* fa2 ln test

* remove keys from config recursively

* fix

* fixup
2025-04-24 11:48:11 +02:00
Joao Gante
85665a4263
[tests] Stricter generate + compilation test -- no recompilations allowed (#37629)
* tmp commit

* stricter compilation test

* trigger tests

* rm todo
2025-04-22 11:12:18 +01:00
Raushan Turganbay
32eca7197a
[vlm] adjust max length for special tokens (#37342)
* update

* apply suggestion

* fix tests for main branch

* remove unused logger

* add special tokens in tests

* nit

* fix more tests

* fix test

* pg also
2025-04-16 20:49:20 +02:00
Raushan Turganbay
a563999a02
[processor] clean up mulitmodal tests (#37362)
* clkea up mulitmodal processor tests

* fixup

* fix tests

* fix one last test

* forgot
2025-04-11 13:32:19 +02:00
Raushan Turganbay
1ae8d54b04
[chat-template] Unify tests and clean up 🧼 (#37275)
* fix tests and some clean up

* make one general test for each modality

* remove redundant merging of kwargs

* edge cases

* dont enforce slow when reloading

* fix gemma3 tests

* has to adapt llama 4 after rebase

* remove also from overriden tests

* should be green now
2025-04-10 14:42:32 +02:00
Matt
4d0de5f73a
🚨 🚨 Setup -> setupclass conversion (#37282)
* More limited setup -> setupclass conversion

* make fixup

* Trigger tests

* Fixup UDOP

* Missed a spot

* tearDown -> tearDownClass where appropriate

* Couple more class fixes

* Fixups for UDOP and VisionTextDualEncoder

* Ignore errors when removing the tmpdir, in case it already got cleaned up somewhere

* CLIP fixes

* More correct classmethods

* Wav2Vec2Bert fixes

* More methods become static

* More class methods

* More class methods

* Revert changes for integration tests / modeling files

* Use a different tempdir for tests that actually write to it

* Remove addClassCleanup and just use teardownclass

* Remove changes in modeling files

* Cleanup get_processor_dict() for got_ocr2

* Fix regression on Wav2Vec2BERT test that was masked by this before

* Rework tests that modify the tmpdir

* make fix-copies

* revert clvp modeling test changes

* Fix CLIP processor test

* make fix-copies
2025-04-08 17:15:37 +01:00
cyyever
1e6b546ea6
Use Python 3.9 syntax in tests (#37343)
Signed-off-by: cyy <cyyever@outlook.com>
2025-04-08 14:12:08 +02:00
Matt
8ebc435267
Fix llava_onevision tests (#37280)
* Fix llava_onevision tests

* Trigger tests
2025-04-04 15:03:38 +01:00
Matt
2d46a08b63
Purge unused ModelTester code (#37085)
* Purge correctly this time

* Remove more methods from recent PRs

* make fixup
2025-04-03 17:48:35 +01:00
co63oc
996f512d52
Fix typos in tests (#36547)
Signed-off-by: co63oc <co63oc@users.noreply.github.com>
2025-03-05 15:04:06 -08:00
Raushan Turganbay
0c78ef6cd3
🔴 VLM: compile compatibility (#35724)
* llavas

* add mroe models

* fix `compile_forward` test for all models

* fix copies

* make style

* also doesn't support cache class

* fix some tests

* not copied from

* ci green?

* fix tests

* fix copies

* fix tests

* check with `numel` and remove `item`

* fix copies

* fix copies

* Update src/transformers/models/cohere2/modeling_cohere2.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* opt remove cross attn

* gemma2

* fixup

* fixup

* fix newly added test

* maybe fixed?

* green please?

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-02-14 15:23:49 +01:00
Joao Gante
62c7ea0201
CI: avoid human error, automatically infer generative models (#33212)
* tmp commit

* move tests to the right class

* remove ALL all_generative_model_classes = ...

* skip tf roberta

* skip InstructBlipForConditionalGenerationDecoderOnlyTest

* videollava

* reduce diff

* reduce diff

* remove  on vlms

* fix a few more

* manual rebase bits

* more manual rebase

* remove all manual generative model class test entries

* fix up to ernie

* a few more removals

* handle remaining cases

* recurrent gemma

* it's better here

* make fixup

* tf idefics is broken

* tf bert + generate is broken

* don't touch tf :()

* don't touch tf :(

* make fixup

* better comments for test skips

* revert tf changes

* remove empty line removal

* one more

* missing one
2025-02-13 16:27:11 +01:00
Raushan Turganbay
eebd2c972c
Chat template: update for processor (#35953)
* update

* we need batched nested input to always process correctly

* update a bit

* fix copies
2025-02-10 09:52:19 +01:00
Yoni Gozlan
fa56dcc2ab
Refactoring of ImageProcessorFast (#35069)
* add init and base image processing functions

* add add_fast_image_processor to transformers-cli

* add working fast image processor clip

* add fast image processor to doc, working tests

* remove "to be implemented" SigLip

* fix unprotected import

* fix unprotected vision import

* update ViTImageProcessorFast

* increase threshold slow fast ewuivalence

* add fast img blip

* add fast class in tests with cli

* improve cli

* add fast image processor convnext

* add LlavaPatchingMixin and fast image processor for llava_next and llava_onevision

* add device kwarg to ImagesKwargs for fast processing on cuda

* cleanup

* fix unprotected import

* group images by sizes and add batch processing

* Add batch equivalence tests, skip when center_crop is used

* cleanup

* update init and cli

* fix-copies

* refactor convnext, cleanup base

* fix

* remove patching mixins, add piped torchvision transforms for ViT

* fix unbatched processing

* fix f strings

* protect imports

* change llava onevision to class transforms (test)

* fix convnext

* improve formatting (following Pavel review)

* fix handling device arg

* improve cli

* fix

* fix inits

* Add distinction between preprocess and _preprocess, and support for arbitrary kwargs through valid_extra_kwargs

* uniformize qwen2_vl fast

* fix docstrings

* add add fast image processor llava

* remove min_pixels max_pixels from accepted size

* nit

* nit

* refactor fast image processors docstrings

* cleanup and remove fast class transforms

* update add fast image processor transformers cli

* cleanup docstring

* uniformize pixtral fast and  make _process_image explicit

* fix prepare image structure llava next/onevision

* Use typed kwargs instead of explicit args

* nit fix import Unpack

* clearly separate pops and gets in base preprocess. Use explicit typed kwargs

* make qwen2_vl preprocess arguments hashable
2025-02-04 17:52:31 -05:00
Arthur
b912f5ee43
use torch.testing.assertclose instead to get more details about error in cis (#35659)
* use torch.testing.assertclose instead to get more details about error in cis

* fix

* style

* test_all

* revert for I bert

* fixes and updates

* more image processing fixes

* more image processors

* fix mamba and co

* style

* less strick

* ok I won't be strict

* skip and be done

* up
2025-01-24 16:55:28 +01:00
Alex Brooks
71cc8161b2
Granite Vision Support (#35579)
* Add multimodal granite support

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

Support multiple image feature layres

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

* Remove failing validation for visual encoders with no cls

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

* Update llava based models / configs to support list of feature layers

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

* Add tests for multiple feature layers

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

* Use conditional instead of except for misaligned feature shapes

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

* crop cls from each hidden state

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

* Fix formatting

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

* Support single vision feature int in vipllava

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

* Fix typo in vision feature selection strategy validation

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

* Add tentative integration test for granite vision models

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

* Add granite vision docs

Replace multimodal granite refs with granite vision

Add granite vision / llava next alias

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

* Use image url in granitevision example

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

---------

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>
2025-01-23 17:15:52 +01:00
Raushan Turganbay
8571bb145a
Fix CI for VLMs (#35690)
* fix some easy test

* more tests

* remove logit check here also

* add require_torch_large_gpu in Emu3
2025-01-20 11:15:39 +01:00
Raushan Turganbay
e0646f3dce
Chat template: return vectorized output in processors (#34275)
* update chat template

* style

* fix tests

* Update src/transformers/image_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* typehints + docs

* fix tests

* remove unnecessary warnings

* forgot code style :(

* allow users to pass backend and num frames

* Update docs/source/en/chat_templating.md

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/image_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/image_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/image_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/image_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/image_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/image_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/processing_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* typo fix

* style

* address comments

* align with "pipeline" template

* update docs

* update docs

* unpack for all kwargs?

* wrong conflict resolution while rebasing

* tmp

* update docs

* Update docs/source/en/chat_templating.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_templating.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_templating.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_templating.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-01-10 11:05:29 +01:00
Yih-Dar
05de764e9c
Aurevoir PyTorch 1 (#35358)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-20 14:36:31 +01:00
Fanli Lin
bdd4201fdb
[tests] fix "Tester object has no attribute '_testMethodName'" (#34910)
* add more cases

* fix method not found in unittest

Signed-off-by: Lin, Fanli <fanli.lin@intel.com>

* fix more cases

* add more models

* add all

* no unittest.case

* remove for oneformer

* fix style

---------

Signed-off-by: Lin, Fanli <fanli.lin@intel.com>
2024-12-13 14:33:45 +01:00
Raushan Turganbay
893ad04fad
Load sub-configs from composite configs (#34410)
* save/load sub-configs

* nit forgot these

* fix copies

* move test to common

* use dict for sub-configs

* add load-save-laod test

* clean up modeling check

* oops this are correct keys

* fix some tests, missed some composite configs

* this model was missed
2024-11-05 11:34:01 +01:00
Yoni Gozlan
203e27059b
Add image text to text pipeline (#34170)
* Standardize image-text-to-text-models-output

add post_process_image_text_to_text to chameleon and cleanup

Fix legacy kwarg behavior and deprecation warning

add post_process_image_text_to_text to qwen2_vl and llava_onevision

Add post_process_image_text_to_text to idefics3, mllama, pixtral processor

* nit var name post_process_image_text_to_text udop

* nit fix deprecation warnings

* Add image-text-to-text pipeline

* add support for image url in chat template for pipeline

* Reformat to be fully compatible with chat templates

* Add tests chat template

* Fix imports and tests

* Add pipeline tag

* change logic handling of single prompt ans multiple images

* add pipeline mapping to models

* fix batched inference

* fix tests

* Add manual batching for preprocessing

* Fix outputs with nested images

* Add support for all common processing kwargs

* Add default padding when multiple text inputs (batch size>1)

* nit change version deprecation warning

* Add support for text only inference

* add chat_template warnings

* Add pipeline tests and add copied from post process function

* Fix batched pipeline tests

* nit

* Fix pipeline tests blip2

* remove unnecessary max_new_tokens

* revert processing kosmos2 and remove unnecessary max_new_tokens

* fix pipeline tests idefics

* Force try loading processor if pipeline supports it

* revert load_processor change

* hardcode loading only processor

* remove unnecessary try except

* skip imagetexttotext tests for kosmos2 as tiny model causes problems

* Make code clearer

* Address review comments

* remove preprocessing logic from pipeline

* fix fuyu

* add BC resize fuyu

* Move post_process_image_text_to_text to ProcessorMixin

* add guard in post_process

* fix zero shot object detection pipeline

* add support for generator input in pipeline

* nit

* change default image-text-to-text model to llava onevision

* fix owlv2 size dict

* Change legacy deprecation warning to only show when True
2024-10-31 15:48:11 -04:00
Yih-Dar
ab98f0b0a1
avoid calling gc.collect and cuda.empty_cache (#34514)
* update

* update

* update

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-31 16:36:13 +01:00
Raushan Turganbay
21d5025826
Attn implementation for composite models (#32238)
* first try

* codestyle

* idefics2 is happy

* [run-slow] llava, llava_next, video_llava, vipllava, llava_next_video, idefics, idefics2, kosmos2, fuyu, blip, blip_2, instructblip, instructblipvideo, paligemma

* fix-copies

* [run-slow] llava, llava_next, video_llava, vipllava, llava_next_video, idefics, idefics2, kosmos2, fuyu, blip, blip_2, instructblip, instructblipvideo

* blip-2 needs to init vision from config

* when was this removed O_o

* minor fix

* tests

* this way?

* tests

* model-agnostic code

* codestyle

* add tests for idefics

* modify general test for VLMs

* no generation test for vlm yet!

* no generation test here also

* wanr in VIT-SDPA if output attn

* add more tests

* user can pass dict as attn impl

* repo consistency

* update

* muicgen

* no prints

* forgot speech enc-dec and clip

* how many composite models we have?

* musicgen meelody is same as mudicgen

* +siglip

* fix tests + add some more

* remove idefics custom overriden code

* make idefics2 automappable

* nits

* skip tests

* doctests

* Update src/transformers/models/idefics2/configuration_idefics2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/clip/test_modeling_clip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/idefics2/test_modeling_idefics2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/idefics2/test_modeling_idefics2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/configuration_utils.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* major update, no need for automap

* clean up

* add FA2 test

* more tests

* style

* skip tests

* why did these started failing now?

* no attributes for FA2 needed

* one tiny test

* address comment about FA2 false warning

* style

* add new models and resolve conflicts

* fix copies

* let it be this way for now, come back tomorrow to review

* some more fixes

* update

* more updates

* update

* fix copies

* style and tests

* another big update

* fix tests

* fix tests

* update

* another update

* fix tests

* fix copies

* fix tests

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-10-22 06:54:44 +02:00
Yoni Gozlan
62e8c759c3
rename all test_processing_*.py to test_processor_*.py (#33878)
* rename all test_processing_*.py to test_processor_*.py ans fix duplicate test processor paligemma

* fix copies

* fix broken tests

* fix-copies

* fix test processor bridgetower
2024-10-02 16:43:43 +02:00
Yoni Gozlan
c0c6815dc9
Add support for args to ProcessorMixin for backward compatibility (#33479)
* add check and prepare args for BC to ProcessorMixin, improve ProcessorTesterMixin

* change size and crop_size in processor kwargs tests to do_rescale and rescale_factor

* remove unnecessary llava processor kwargs test overwrite

* nit

* change data_arg_name to input_name

* Remove unnecessary test override

* Remove unnecessary tests Paligemma

* Move test_prepare_and_validate_optional_call_args to TesterMixin, add docstring
2024-09-20 11:40:59 -04:00
Raushan Turganbay
d7975a5874
VLMs: enable generation tests (#33533)
* add tests

* fix whisper

* update

* nit

* add qwen2-vl

* more updates!

* better this way

* fix this one

* fix more tests

* fix final tests, hope so

* fix led

* Update tests/generation/test_utils.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* pr comments

* not pass pixels and extra for low-mem tests, very flaky because of visio tower

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2024-09-19 12:04:24 +02:00
Raushan Turganbay
e40bb4845e
Load and save video-processor from separate folder (#33562)
* load and save from video-processor folder

* Update src/transformers/models/llava_onevision/processing_llava_onevision.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-09-19 09:56:52 +02:00
Raushan Turganbay
db72894b48
Chat template: save and load correctly for processors (#33462)
* fix

* add tests

* fix tests

* Update tests/models/llava/test_processor_llava.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* fix

* fix tests

* update tests

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-09-18 13:00:44 +02:00
Raushan Turganbay
43df47d8e7
Llava Onevision: add model (#32673)
* working version

* fix copies

* update

* tests

* update docs

* codestyle

* add more tests

* add returns for docs

* clean up

* Update src/transformers/models/llava_onevision/processing_llava_onevision.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* updates

* codestyle

* style

* shouldn't be reversed

* [run-slow] llava_onevision

* [run-slow] llava_onevision

* add pooling in videos

* [run-slow] llava_onevision

* num-logits-to-keep

* [run-slow] llava_onevision

* [run-slow] llava_onevision

* Update tests/test_modeling_common.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* video matched orig impl

* fix tests

* chat template was modified

* Update docs/source/en/model_doc/llava_onevision.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* add morer info in the doc page

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-09-05 14:43:20 +05:00