laurentd-lunit
0f49deacbf
[feat] LlavaNext add feature size check to avoid CUDA Runtime Error ( #33608 )
...
* [feat] add feature size check to avoid CUDA Runtime Error
* [minor] add error handling to all llava models
* [minor] avoid nested if else
* [minor] add error message to Qwen2-vl and chameleon
* [fix] token dimension for check
* [minor] add feature dim check for videos too
* [fix] dimension check
* [fix] test reference values
---------
Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
2024-10-15 16:19:18 +02:00
Raushan Turganbay
d7975a5874
VLMs: enable generation tests ( #33533 )
...
* add tests
* fix whisper
* update
* nit
* add qwen2-vl
* more updates!
* better this way
* fix this one
* fix more tests
* fix final tests, hope so
* fix led
* Update tests/generation/test_utils.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* pr comments
* not pass pixels and extra for low-mem tests, very flaky because of vision tower
---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2024-09-19 12:04:24 +02:00
Raushan Turganbay
7d2d6ce9cb
VLM: fixes after refactor ( #32907 )
...
* leave only half of the changes
* fix tests
* [run-slow] llava, llava_next, llava_next_video, vipllava, video_llava
* fix tests, first try
* [run-slow] llava, llava_next, llava_next_video, vipllava, video_llava
* fix, second try
* [run-slow] llava, llava_next, llava_next_video, vipllava, video_llava
* fix
* [run-slow] llava, llava_next, llava_next_video, vipllava, video_llava
2024-09-10 12:02:37 +02:00
Raushan Turganbay
a29eabd0eb
Expand inputs in processors for VLMs ( #30962 )
...
* let it be
* draft
* should not have changed
* add warnings
* fix & add tests
* fix tests
* inputs embeds cannot be passed with pixels
* more updates
* paligemma ready!
* minor typos
* update blip-2
* fix tests & raise error
* docstring
* add blip2 test
* tmp
* add image seq length to config
* update docstring
* delete
* fix tests
* fix blip
* fix paligemma
* out-of-place scatter
* add llava-next-video
* Update src/transformers/models/blip_2/modeling_blip_2.py
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
* remove tmp
* codestyle
* nits
* more nits
* remove overriding in tests
* comprehension when merging video
* fix-copies
* revert changes for embeds test
* fix tests after making comprehension
* Update src/transformers/models/blip_2/processing_blip_2.py
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
* Update src/transformers/models/blip_2/processing_blip_2.py
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
* more updates
* fix tests
---------
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
2024-08-13 10:14:39 +05:00
Raushan Turganbay
e71f2863d7
Add LLaVa NeXT Video ( #31252 )
...
* squash into single commit
* run diff once more
* docstring
* tests
* minor changes and ready to go
* Update src/transformers/models/llava_next_video/processing_llava_next_video.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/vipllava/test_modeling_vipllava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* [run-slow] llava-next-video
* [run-slow] llava-next-video
* [run-slow] llava_next_video
* fix two tests
* fix slow tests
* remove logit checks due to numeric errors
* run test once more
* [run-slow] llava_next_video
* final try to pass the test
* [run-slow] llava_next_video
* [run-slow] llava_next_video
* [run-slow] llava_next_video
* style
* fix
* style
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-06-26 21:52:28 +05:00
Arthur
673440d073
update ruff version ( #30932 )
...
* update ruff version
* fix research projects
* Empty
* Fix errors
---------
Co-authored-by: Lysandre <lysandre@huggingface.co>
2024-05-22 06:40:15 +02:00
Raushan Turganbay
9d31b32e9d
Use text config's vocab size in testing models ( #30568 )
...
use text config's vocab size
2024-05-01 12:32:45 +05:00
NielsRogge
d91fd7f92c
Add LLaVa-1.6, bis ( #29586 )
...
* First draft
* Fix tests, add docs
* Improve docstrings
* Fix test
* Address comments
* Address comments
* Remove vocab_size attribute
* Remove batch_size
* Address comment
* Add image processor tests
* Support fx
* Update docstring
* Add support for 34b
* Convert 34b model
* Add integration tests
* Update checkpoints
* Convert vicuna-13b, remove doc tests
* Remove script
* Remove file
* Address comments
* Improve docstrings
* Deprecate vocab_size
* Remove aspect_ratio_setting
* Address comments
* Update READMEs
* Add tips about chat templates
* Fix tests
* Deprecate vocab_size safely
* Update tests
---------
Co-authored-by: Amy Roberts <22614925+amyeroberts@users.noreply.github.com>
2024-03-20 15:51:12 +00:00
Victor SANH
0f2f0c634f
Fix `_merge_input_ids_with_image_features` for llava model ( #28333 )
...
* fix `_merge_input_ids_with_image_features` for llava model
* Update src/transformers/models/llava/modeling_llava.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* address comments
* style and tests
* ooops
* test the backward too
* Apply suggestions from code review
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Update tests/models/vipllava/test_modeling_vipllava.py
* style and quality
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2024-01-10 08:33:33 +01:00
Younes Belkada
c7f076a00e
Adds VIP-llava to transformers ( #27932 )
...
* v1
* add-new-model-like
* revert
* fix forward and conversion script
* revert
* fix copies
* fixup
* fix
* Update docs/source/en/index.md
* Apply suggestions from code review
* push
* fix
* fixes here and there
* up
* fixup and fix tests
* Apply suggestions from code review
* add docs
* fixup
* fixes
* docstring
* add docstring
* fixup
* docstring
* fixup
* nit
* docs
* more copies
* fix copies
* nit
* update test
2023-12-13 10:42:24 +01:00