Raushan Turganbay
a29eabd0eb
Expand inputs in processors for VLMs ( #30962 )
...
* let it be
* draft
* should not have changed
* add warnings
* fix & add tests
* fix tests
* ipnuts embeds cannot be passed with pixels
* more updates
* paligemma ready!
* minor typos
* update blip-2
* fix tests & raise error
* docstring
* add blip2 test
* tmp
* add image seq length to config
* update docstring
* delete
* fix tests
* fix blip
* fix paligemma
* out-of-place scatter
* add llava-next-video
* Update src/transformers/models/blip_2/modeling_blip_2.py
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
* remove tmp
* codestyle
* nits
* more nits
* remove overriding in tests
* comprehension when merging video
* fix-copies
* revert changes for embeds test
* fix tests after making comprehension
* Update src/transformers/models/blip_2/processing_blip_2.py
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
* Update src/transformers/models/blip_2/processing_blip_2.py
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
* more updates
* fix tests
---------
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
2024-08-13 10:14:39 +05:00
Raushan Turganbay
e71f2863d7
Add LLaVa NeXT Video ( #31252 )
...
* squash into single commit
* run diff once more
* docstring
* tests
* minor chnages and ready to go
* Update src/transformers/models/llava_next_video/processing_llava_next_video.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/vipllava/test_modeling_vipllava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* [run-slow] llava-next-video
* [run-slow] llava-next-video
* [run-slow] llava_next_video
* fix two tests
* fix slow tests
* remove logit checks due to numeric errors
* run test once more
* [run-slow] llava_next_video
* final try to pass the test
* [run-slow] llava_next_video
* [run-slow] llava_next_video
* [run-slow] llava_next_video
* style
* fix
* style
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-06-26 21:52:28 +05:00
Raushan Turganbay
d64e4da713
Video-LLaVa: handle any number of frames ( #31221 )
...
video-llava can handle more frames
2024-06-04 14:20:03 +05:00
Arthur
673440d073
update ruff version ( #30932 )
...
* update ruff version
* fix research projects
* Empty
* Fix errors
---------
Co-authored-by: Lysandre <lysandre@huggingface.co>
2024-05-22 06:40:15 +02:00
Raushan Turganbay
bd9f4d7951
Add Video Llava ( #29733 )
...
* add model draft
* update docstring
* add tests
* support image and video as input
* update for better handling of mixed input and clean-up a bit
* bug when mixed inputs & add tests
* Update README.md
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Merge remote-tracking branch 'upstream/main' into video_llava
* link to abstract of paper in README
* fix test
* fix-copies
* make tests happy
* skip docstest for now
* do not run doctest for now
* Update src/transformers/models/video_llava/processing_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/video_llava/image_processing_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/video_llava/image_processing_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/video_llava/image_processing_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/video_llava/image_processing_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/video_llava/test_modeling_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/video_llava/image_processing_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* address review comments
* failing tests
* Fix vocab_size in common tests for VLMs
* codestyle
* Update src/transformers/models/video_llava/configuration_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/video_llava/configuration_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/video_llava/modeling_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/video_llava/modeling_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/video_llava.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/video_llava.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/video_llava/image_processing_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/video_llava.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/video_llava/processing_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/video_llava/test_modeling_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/video_llava/test_modeling_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/video_llava/test_modeling_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* PR suggestions
* fix-copies
* Update src/transformers/models/video_llava/configuration_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/video_llava/configuration_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* add full example in docs
* clean-up with new model-id
* [run-slow] video_llava
* update docstring
* [run-slow] video_llava
* remove all achive maps
* fix some tests
* test was supposed to be skipped for llava :)
---------
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-15 16:42:29 +05:00