Raushan Turganbay | 4d5822e65d | 2025-07-02 12:05:10 +02:00
[smolvlm] fix video inference (#39147)
* fix smolvlm
* better to do as before: set sampling params in the overridden `apply_chat_template`
* style
* update with `setdefault`

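The bullets above describe a pattern worth noting: filling in default sampling parameters inside an overridden `apply_chat_template` with `dict.setdefault`, so caller-supplied values are never clobbered. A minimal, hypothetical sketch of that pattern (the class names, the `DEFAULT_VIDEO_KWARGS` values, and the base-method stand-in are all assumptions for illustration, not the actual SmolVLM code):

```python
# Hypothetical sketch: an overridden apply_chat_template that injects
# default video-sampling params via dict.setdefault. setdefault only
# inserts a key when the caller did not already pass it.

DEFAULT_VIDEO_KWARGS = {"num_frames": 64, "fps": 1}  # assumed defaults


class VideoProcessorMixin:
    def base_apply_chat_template(self, messages, **kwargs):
        # Stand-in for the parent-class implementation.
        return {"messages": messages, **kwargs}


class SmolVLMLikeProcessor(VideoProcessorMixin):
    def apply_chat_template(self, messages, **kwargs):
        # Fill in defaults without overwriting explicit caller values.
        for key, value in DEFAULT_VIDEO_KWARGS.items():
            kwargs.setdefault(key, value)
        return self.base_apply_chat_template(messages, **kwargs)


proc = SmolVLMLikeProcessor()
out = proc.apply_chat_template([{"role": "user", "content": "hi"}], fps=2)
print(out["fps"], out["num_frames"])  # caller's fps=2 wins; num_frames falls back to 64
```

The design choice is that `setdefault` keeps precedence with the caller, which is why it reads as "better to do as before" in the log: defaults live in one place and explicit arguments always override them.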
Raushan Turganbay | f8b88866f5 | 2025-07-01 11:33:20 +00:00
[VLMs] support passing embeds along with pixels (#38467)
* VLMs can work with embeds now
* update more models
* fix tests
* fix copies
* fixup
* fix
* style
* unskip tests
* fix copies
* fix tests
* style
* omni-modality models
* qwen models had extra indentation
* fix some other tests
* fix copies
* fix test one last time
* revert unrelated changes
* we can't rely only on embeds
* delete file
* de-flake mistral3
* fix qwen models
* fix style
* fix tests
* fix copies
* de-flake the test
* modular reverted by fixes; fix again
* flaky test, overwritten
* fix copies
* style

Raushan Turganbay | 1cfcbfcab8 | 2025-04-24 11:48:11 +02:00
[VLMs] fix flash-attention tests (#37603)
* fix one test
* fa2 ln test
* remove keys from config recursively
* fix
* fixup

cyyever | 1e6b546ea6 | 2025-04-08 14:12:08 +02:00
Use Python 3.9 syntax in tests (#37343)
Signed-off-by: cyy <cyyever@outlook.com>

Yih-Dar | d13c390d01 | 2025-03-27 10:59:47 +01:00
Mark 2 tests as flaky for now (#37038)
* fix
* fix
* fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

Joao Gante | fc8764c9a6 | 2025-03-15 12:40:09 +00:00
[Generation, Gemma 3] When passing a custom `generation_config`, overwrite default values with the model's base `generation_config` (#36684)

co63oc | 996f512d52 | 2025-03-05 15:04:06 -08:00
Fix typos in tests (#36547)
Signed-off-by: co63oc <co63oc@users.noreply.github.com>

Orr Zohar | 4397dfcb71 | 2025-02-20 15:00:26 +01:00
SmolVLM2 (#36126)
* smolvlm init
* updates
* fixing bugs
* minimal run, no checks (×2)
* passing first check + adding url support
* updating video dataloading logic
* fixing image logic
* trying modular, but fails
* modular is working; changing processor to match PR comments and general transformers logic
* fixing kwargs
* offloading video loading logic to image_util
* fixing circleci code formatting errors (×14)
* update
* add idefics3-based tests
* add keyword to all
* add PreTrainedModel
* updating video loading logic
* working inference
* updates for PR comments (×2)
* moving SmolVLMPretrainedModel higher to fix import error
* CI test pass (×2)
* removing lambda
* CI test pass (×6)
* processor tests
* add example in docs
* typo
* fix copies
* skip compile tests - sdpa for VisionTransformer
* fix init
* raise import error for num2words
* update doc for FA2
* more doc fixes
* CI
* updates for PR comments
* Update docs/source/en/model_doc/smolvlm.md
  Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update docs/source/en/model_doc/smolvlm.md
  Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update docs/source/en/model_doc/smolvlm.md
  Co-authored-by: Joshua Lochner <admin@xenova.com>
* Update docs/source/en/model_doc/smolvlm.md
  Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update docs/source/en/model_doc/smolvlm.md
  Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* fixing processor: tokenizer was not defined properly (gpt2 tokenizer) and lacked attributes such as the fake image token
* adding smolvlm to VQA models
* removing vqa auto class
* Update src/transformers/models/smolvlm/processing_smolvlm.py
  Co-authored-by: Joshua Lochner <admin@xenova.com>
* removing smolvlmvisiontransformer from index.md
* my bad, video processing had typos
* fixing docs
* renaming params in SmolVLMModel.inputs_merger
* removing un-needed dtype/device in model forward
* ruff for CI
* update docs
* Update docs/source/en/model_doc/smolvlm.md
  Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* return cache position (×2)
* return cache also in modular
* needed to run modular again
* fix training tests
* push vectorized inputs merger
* format (×2)
* reduce number of mappings
* addressing PR comments
* happy CI, happy me :)
* skip non-nested images
* adjust integration test for smaller GPUs
* format
* fix kwargs in chat template apply
* skip this for now
Co-authored-by: raushan <raushan@huggingface.co>
Co-authored-by: Pablo <pablo.montalvo.leroux@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Joshua Lochner <admin@xenova.com>