transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-05 13:50:13 +06:00

Author	SHA1	Message	Date
Yih-Dar	e39172ecab	Fix `llava_next` tests (#38813 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-13 15:19:41 +02:00
youngrok cha	a5cc7a67d7	[bug] fix llava processor to calculate unpadding size correctly (#37988 ) * fix llava processor to calculate unpad size correctly * repo consistency * Revert "repo consistency" & "setUp in llava family" This reverts commit `26a50af8db`. * add edge case test for padding & unpadding * compute unpadding size from original size * make test config explicit * Revert "compute unpadding size from original size" This reverts commit `752cd27ad9`. * Revert "add edge case test for padding & unpadding" This reverts commit `ccbd094d69`. * revert unpad logic * remove irrelevant tests * model test * remove processor from model test --------- Co-authored-by: jaycha <jaycha@ncsoft.com>	2025-05-13 13:49:09 +00:00
Raushan Turganbay	17742bd9c8	🔴 [VLM] Add base model without head (#37033 ) * i guessreverted all CdGen classes * style * llava onevision * fix copies * fix some tests * some more tests * dump * skip these * nevermind, i am dumb * revert fix not needed * fixup * fixup * another fixup * more fixup to make ci finally happy * fixup after rebasing * fix qwen tests * add internVL + typos here and there * image token index -> id * style * fix init weights * revert blip-2 not supported * address comments * fix copies * revert blip2 test file as well * as discussed internally, revert back CdGen models * fix some tests * fix more tests for compile * CI red * fix copies * enumerate explicitly allowed models * address comments * fix tests * fixup * style again * add tests for new model class * another fixup ( x _ x ) * [fixup] unused attributes can be removed post-deprecation	2025-05-07 17:47:51 +02:00
youngrok cha	acded47fe7	[llava] one pixel is missing from padding when length is odd (#37819 ) * [fix] one pixel should be added when length is odd * [fix] add vision_aspect_ratio args & typo * [fix] style * [fix] do not fix fast file directly * [fix] convert using modular * remove duplicate codes * match unpad logic with pad logic * test odd-sized images for llava & aria * test unpad odd-sized padding for llava family * fix style * add kwarg to onvision modular * move vision_aspect_ratio from image_processor to processor (llava_onevision)	2025-05-06 13:11:26 +02:00
Cyril Vallez	0cfbf9c95b	Force torch>=2.6 with torch.load to avoid vulnerability issue (#37785 ) * fix all main files * fix test files * oups forgot modular * add link * update message	2025-04-25 16:57:09 +02:00
Raushan Turganbay	1cfcbfcab8	[VLMs] fix flash-attention tests (#37603 ) * fix one test * fa2 ln test * remove keys from config recursively * fix * fixup	2025-04-24 11:48:11 +02:00
Joao Gante	85665a4263	[tests] Stricter generate + compilation test -- no recompilations allowed (#37629 ) * tmp commit * stricter compilation test * trigger tests * rm todo	2025-04-22 11:12:18 +01:00
Raushan Turganbay	32eca7197a	[vlm] adjust max length for special tokens (#37342 ) * update * apply suggestion * fix tests for main branch * remove unused logger * add special tokens in tests * nit * fix more tests * fix test * pg also	2025-04-16 20:49:20 +02:00
Raushan Turganbay	a563999a02	[processor] clean up mulitmodal tests (#37362 ) * clkea up mulitmodal processor tests * fixup * fix tests * fix one last test * forgot	2025-04-11 13:32:19 +02:00
Raushan Turganbay	1ae8d54b04	[chat-template] Unify tests and clean up 🧼 (#37275 ) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now	2025-04-10 14:42:32 +02:00
Matt	4d0de5f73a	🚨 🚨 Setup -> setupclass conversion (#37282 ) * More limited setup -> setupclass conversion * make fixup * Trigger tests * Fixup UDOP * Missed a spot * tearDown -> tearDownClass where appropriate * Couple more class fixes * Fixups for UDOP and VisionTextDualEncoder * Ignore errors when removing the tmpdir, in case it already got cleaned up somewhere * CLIP fixes * More correct classmethods * Wav2Vec2Bert fixes * More methods become static * More class methods * More class methods * Revert changes for integration tests / modeling files * Use a different tempdir for tests that actually write to it * Remove addClassCleanup and just use teardownclass * Remove changes in modeling files * Cleanup get_processor_dict() for got_ocr2 * Fix regression on Wav2Vec2BERT test that was masked by this before * Rework tests that modify the tmpdir * make fix-copies * revert clvp modeling test changes * Fix CLIP processor test * make fix-copies	2025-04-08 17:15:37 +01:00
cyyever	1e6b546ea6	Use Python 3.9 syntax in tests (#37343 ) Signed-off-by: cyy <cyyever@outlook.com>	2025-04-08 14:12:08 +02:00
Matt	2d46a08b63	Purge unused ModelTester code (#37085 ) * Purge correctly this time * Remove more methods from recent PRs * make fixup	2025-04-03 17:48:35 +01:00
cyyever	41a0e58e5b	Set weights_only in torch.load (#36991 )	2025-03-27 14:55:50 +00:00
Afanti	26c83490d2	chore: fix typos in the tests directory (#36813 ) * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * fix: format codes * chore: fix copy mismatch issue * fix: format codes * chore: fix copy mismatch issue * chore: fix copy mismatch issue * chore: fix copy mismatch issue * chore: restore previous words * chore: revert unexpected changes	2025-03-21 10:20:05 +01:00
co63oc	996f512d52	Fix typos in tests (#36547 ) Signed-off-by: co63oc <co63oc@users.noreply.github.com>	2025-03-05 15:04:06 -08:00
Raushan Turganbay	0c78ef6cd3	🔴 VLM: compile compatibility (#35724 ) * llavas * add mroe models * fix `compile_forward` test for all models * fix copies * make style * also doesn't support cache class * fix some tests * not copied from * ci green? * fix tests * fix copies * fix tests * check with `numel` and remove `item` * fix copies * fix copies * Update src/transformers/models/cohere2/modeling_cohere2.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * opt remove cross attn * gemma2 * fixup * fixup * fix newly added test * maybe fixed? * green please? --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2025-02-14 15:23:49 +01:00
Joao Gante	62c7ea0201	CI: avoid human error, automatically infer generative models (#33212 ) * tmp commit * move tests to the right class * remove ALL all_generative_model_classes = ... * skip tf roberta * skip InstructBlipForConditionalGenerationDecoderOnlyTest * videollava * reduce diff * reduce diff * remove on vlms * fix a few more * manual rebase bits * more manual rebase * remove all manual generative model class test entries * fix up to ernie * a few more removals * handle remaining cases * recurrent gemma * it's better here * make fixup * tf idefics is broken * tf bert + generate is broken * don't touch tf :() * don't touch tf :( * make fixup * better comments for test skips * revert tf changes * remove empty line removal * one more * missing one	2025-02-13 16:27:11 +01:00
Raushan Turganbay	eebd2c972c	Chat template: update for processor (#35953 ) * update * we need batched nested input to always process correctly * update a bit * fix copies	2025-02-10 09:52:19 +01:00
Yoni Gozlan	fa56dcc2ab	Refactoring of ImageProcessorFast (#35069 ) * add init and base image processing functions * add add_fast_image_processor to transformers-cli * add working fast image processor clip * add fast image processor to doc, working tests * remove "to be implemented" SigLip * fix unprotected import * fix unprotected vision import * update ViTImageProcessorFast * increase threshold slow fast ewuivalence * add fast img blip * add fast class in tests with cli * improve cli * add fast image processor convnext * add LlavaPatchingMixin and fast image processor for llava_next and llava_onevision * add device kwarg to ImagesKwargs for fast processing on cuda * cleanup * fix unprotected import * group images by sizes and add batch processing * Add batch equivalence tests, skip when center_crop is used * cleanup * update init and cli * fix-copies * refactor convnext, cleanup base * fix * remove patching mixins, add piped torchvision transforms for ViT * fix unbatched processing * fix f strings * protect imports * change llava onevision to class transforms (test) * fix convnext * improve formatting (following Pavel review) * fix handling device arg * improve cli * fix * fix inits * Add distinction between preprocess and _preprocess, and support for arbitrary kwargs through valid_extra_kwargs * uniformize qwen2_vl fast * fix docstrings * add add fast image processor llava * remove min_pixels max_pixels from accepted size * nit * nit * refactor fast image processors docstrings * cleanup and remove fast class transforms * update add fast image processor transformers cli * cleanup docstring * uniformize pixtral fast and make _process_image explicit * fix prepare image structure llava next/onevision * Use typed kwargs instead of explicit args * nit fix import Unpack * clearly separate pops and gets in base preprocess. Use explicit typed kwargs * make qwen2_vl preprocess arguments hashable	2025-02-04 17:52:31 -05:00
Alex Brooks	e284c7e954	Update Granite Vision Model Path / Tests (#35998 ) * Update granite vision model path Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com> * Enable granite vision test Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com> --------- Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>	2025-02-03 20:06:03 +01:00
Arthur	b912f5ee43	use torch.testing.assertclose instead to get more details about error in cis (#35659 ) * use torch.testing.assertclose instead to get more details about error in cis * fix * style * test_all * revert for I bert * fixes and updates * more image processing fixes * more image processors * fix mamba and co * style * less strick * ok I won't be strict * skip and be done * up	2025-01-24 16:55:28 +01:00
Alex Brooks	71cc8161b2	Granite Vision Support (#35579 ) * Add multimodal granite support Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> Support multiple image feature layres Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> * Remove failing validation for visual encoders with no cls Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> * Update llava based models / configs to support list of feature layers Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> * Add tests for multiple feature layers Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> * Use conditional instead of except for misaligned feature shapes Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com> * crop cls from each hidden state Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com> * Fix formatting Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> * Support single vision feature int in vipllava Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> * Fix typo in vision feature selection strategy validation Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com> * Add tentative integration test for granite vision models Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com> * Add granite vision docs Replace multimodal granite refs with granite vision Add granite vision / llava next alias Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com> * Use image url in granitevision example Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com> --------- Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>	2025-01-23 17:15:52 +01:00
Raushan Turganbay	8571bb145a	Fix CI for VLMs (#35690 ) * fix some easy test * more tests * remove logit check here also * add require_torch_large_gpu in Emu3	2025-01-20 11:15:39 +01:00
Raushan Turganbay	d1681ec2b6	VLMs: major clean up 🧼 (#34502 ) only lllava models are modified	2025-01-08 10:35:23 +01:00
Yih-Dar	05de764e9c	Aurevoir PyTorch 1 (#35358 ) * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-12-20 14:36:31 +01:00
Fanli Lin	bdd4201fdb	[tests] fix "Tester object has no attribute '_testMethodName'" (#34910 ) * add more cases * fix method not found in unittest Signed-off-by: Lin, Fanli <fanli.lin@intel.com> * fix more cases * add more models * add all * no unittest.case * remove for oneformer * fix style --------- Signed-off-by: Lin, Fanli <fanli.lin@intel.com>	2024-12-13 14:33:45 +01:00
Raushan Turganbay	1646ffb4d1	VLMs: `patch_size` -> `num_image_tokens` in processing (#33424 ) * use num additional tokens * fix copies + docs * another fix copies :) * add docs * move order for BC	2024-11-18 13:21:07 +01:00
Yih-Dar	f2d5dfbab2	Remove `@slow` for `test_eager_matches_sdpa_inference` (#34558 ) * update * update * update * update * update * update * update * update * update * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-11-05 16:10:42 +01:00
Raushan Turganbay	893ad04fad	Load sub-configs from composite configs (#34410 ) * save/load sub-configs * nit forgot these * fix copies * move test to common * use dict for sub-configs * add load-save-laod test * clean up modeling check * oops this are correct keys * fix some tests, missed some composite configs * this model was missed	2024-11-05 11:34:01 +01:00
Yoni Gozlan	203e27059b	Add image text to text pipeline (#34170 ) * Standardize image-text-to-text-models-output add post_process_image_text_to_text to chameleon and cleanup Fix legacy kwarg behavior and deprecation warning add post_process_image_text_to_text to qwen2_vl and llava_onevision Add post_process_image_text_to_text to idefics3, mllama, pixtral processor * nit var name post_process_image_text_to_text udop * nit fix deprecation warnings * Add image-text-to-text pipeline * add support for image url in chat template for pipeline * Reformat to be fully compatible with chat templates * Add tests chat template * Fix imports and tests * Add pipeline tag * change logic handling of single prompt ans multiple images * add pipeline mapping to models * fix batched inference * fix tests * Add manual batching for preprocessing * Fix outputs with nested images * Add support for all common processing kwargs * Add default padding when multiple text inputs (batch size>1) * nit change version deprecation warning * Add support for text only inference * add chat_template warnings * Add pipeline tests and add copied from post process function * Fix batched pipeline tests * nit * Fix pipeline tests blip2 * remove unnecessary max_new_tokens * revert processing kosmos2 and remove unnecessary max_new_tokens * fix pipeline tests idefics * Force try loading processor if pipeline supports it * revert load_processor change * hardcode loading only processor * remove unnecessary try except * skip imagetexttotext tests for kosmos2 as tiny model causes problems * Make code clearer * Address review comments * remove preprocessing logic from pipeline * fix fuyu * add BC resize fuyu * Move post_process_image_text_to_text to ProcessorMixin * add guard in post_process * fix zero shot object detection pipeline * add support for generator input in pipeline * nit * change default image-text-to-text model to llava onevision * fix owlv2 size dict * Change legacy deprecation warning to only show when True	2024-10-31 15:48:11 -04:00
Yih-Dar	ab98f0b0a1	avoid calling `gc.collect` and `cuda.empty_cache` (#34514 ) * update * update * update * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-10-31 16:36:13 +01:00
Raushan Turganbay	913330ca9f	VLMs: fix number of image tokens (#34332 ) * fix * fix tests * add tests * style * style * fix qwen after rebase * fix video llava	2024-10-30 10:21:37 +01:00
Raushan Turganbay	21d5025826	Attn implementation for composite models (#32238 ) * first try * codestyle * idefics2 is happy * [run-slow] llava, llava_next, video_llava, vipllava, llava_next_video, idefics, idefics2, kosmos2, fuyu, blip, blip_2, instructblip, instructblipvideo, paligemma * fix-copies * [run-slow] llava, llava_next, video_llava, vipllava, llava_next_video, idefics, idefics2, kosmos2, fuyu, blip, blip_2, instructblip, instructblipvideo * blip-2 needs to init vision from config * when was this removed O_o * minor fix * tests * this way? * tests * model-agnostic code * codestyle * add tests for idefics * modify general test for VLMs * no generation test for vlm yet! * no generation test here also * wanr in VIT-SDPA if output attn * add more tests * user can pass dict as attn impl * repo consistency * update * muicgen * no prints * forgot speech enc-dec and clip * how many composite models we have? * musicgen meelody is same as mudicgen * +siglip * fix tests + add some more * remove idefics custom overriden code * make idefics2 automappable * nits * skip tests * doctests * Update src/transformers/models/idefics2/configuration_idefics2.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/clip/test_modeling_clip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/idefics2/test_modeling_idefics2.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/idefics2/test_modeling_idefics2.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/configuration_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * major update, no need for automap * clean up * add FA2 test * more tests * style * skip tests * why did these started failing now? * no attributes for FA2 needed * one tiny test * address comment about FA2 false warning * style * add new models and resolve conflicts * fix copies * let it be this way for now, come back tomorrow to review * some more fixes * update * more updates * update * fix copies * style and tests * another big update * fix tests * fix tests * update * another update * fix tests * fix copies * fix tests --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-10-22 06:54:44 +02:00
Yoni Gozlan	5f0c181f4e	Uniformize kwargs for image-text-to-text processors (#32544 ) * uniformize FUYU processor kwargs * Uniformize instructblip processor kwargs * Fix processor kwargs and tests Fuyu, InstructBlip, Kosmos2 * Uniformize llava_next processor * Fix save_load test for processor with chat_template only as extra init args * Fix import Unpack * Fix Fuyu Processor import * Fix FuyuProcessor import * Fix FuyuProcessor * Add defaults for specific kwargs kosmos2 * Fix Udop to return BatchFeature instead of BatchEncoding and uniformize kwargs * Add tests processor Udop * remove Copied from in processing Udop as change of input orders caused by BatchEncoding -> BatchFeature * Fix overwrite tests kwargs processors * Add warnings and BC for changes in processor inputs order, change docs, add BC for text_pair as arg for Udop * Fix processing test fuyu * remove unnecessary pad_token check in instructblip ProcessorTest * Fix BC tests and cleanup * FIx imports fuyu * Uniformize Pix2Struct * Fix wrong name for FuyuProcessorKwargs * Fix slow tests reversed inputs align fuyu llava-next, change udop warning * Fix wrong logging import udop * Add check images text input order * Fix copies * change text pair handling when positional arg * rebase on main, fix imports in test_processing_common * remove optional args and udop uniformization from this PR * fix failing tests * remove unnecessary test, fix processing utils and test processing common * cleanup Unpack * cleanup * fix conflict grounding dino	2024-09-24 21:28:19 -04:00
Raushan Turganbay	d7975a5874	VLMs: enable generation tests (#33533 ) * add tests * fix whisper * update * nit * add qwen2-vl * more updates! * better this way * fix this one * fix more tests * fix final tests, hope so * fix led * Update tests/generation/test_utils.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * pr comments * not pass pixels and extra for low-mem tests, very flaky because of visio tower --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2024-09-19 12:04:24 +02:00
Raushan Turganbay	db72894b48	Chat template: save and load correctly for processors (#33462 ) * fix * add tests * fix tests * Update tests/models/llava/test_processor_llava.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix * fix tests * update tests --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-09-18 13:00:44 +02:00
Insu Jang	bcf8946f0a	Fix number of patch check for different vision feature select strategy (#32494 ) * Fix number of patch check for different vision feature select strategy * add test --------- Co-authored-by: raushan <raushan@huggingface.co>	2024-09-17 09:33:07 +02:00
Raushan Turganbay	7d2d6ce9cb	VLM: fixes after refactor (#32907 ) * leave only half of the changes * fix tests * [run-slow] llava, llava_next, llava_next_video, vipllava, video_llava * fix tests, first try * [run-slow] llava, llava_next, llava_next_video, vipllava, video_llava * fix, second try * [run-slow] llava, llava_next, llava_next_video, vipllava, video_llava * fix * [run-slow] llava, llava_next, llava_next_video, vipllava, video_llava	2024-09-10 12:02:37 +02:00
laurentd-lunit	d703477265	[fix] LlavaNextProcessor '_get_unpadded_features' method (#33263 ) * [fix] LlavaNextProcessor '_get_unpadded_features' method * [tests] add test_image_token_filling * [chore] style + comment * [minor] improve readability * [chore] run make fix-copies	2024-09-04 17:41:51 +05:00
jp	e840127370	reopen: llava-next fails to consider padding_side during Training (#32679 ) restore #32386	2024-08-15 11:44:19 +01:00
Raushan Turganbay	a29eabd0eb	Expand inputs in processors for VLMs (#30962 ) * let it be * draft * should not have changed * add warnings * fix & add tests * fix tests * ipnuts embeds cannot be passed with pixels * more updates * paligemma ready! * minor typos * update blip-2 * fix tests & raise error * docstring * add blip2 test * tmp * add image seq length to config * update docstring * delete * fix tests * fix blip * fix paligemma * out-of-place scatter * add llava-next-video * Update src/transformers/models/blip_2/modeling_blip_2.py Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> * remove tmp * codestyle * nits * more nits * remove overriding in tests * comprehension when merging video * fix-copies * revert changes for embeds test * fix tests after making comprehension * Update src/transformers/models/blip_2/processing_blip_2.py Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> * Update src/transformers/models/blip_2/processing_blip_2.py Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> * more updates * fix tests --------- Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>	2024-08-13 10:14:39 +05:00
Sai-Suraj-27	85a1269e19	fix: Replaced deprecated `unittest method` with the correct one (#32198 ) Replaced deprecated unittest method with the correct one.	2024-07-24 18:00:21 +01:00
Raushan Turganbay	3aefb4ec7f	LLaVaNeXT: pad on right if training (#32134 ) * pad on right if training * docs * add tests	2024-07-23 10:23:55 +05:00
Raushan Turganbay	b873234cb6	Llava: add default chat templates (#31691 ) * add default chat templates * Update src/transformers/models/llava/processing_llava.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/llava_next/processing_llava_next.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * more clear docstring and docs * Update docs/source/en/model_doc/llava.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/llava_next.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/vipllava.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * add tests * remove default templates (see #31733) * load chat template from another file * Update docs/source/en/model_doc/llava_next.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * revert some changes in docs * forgot vipllava * chat template file is not temporary hack * warn if loading from processor * not that file * similarly modify `save_pretrained` * Update tests/models/llava_next/test_processor_llava_next.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/vipllava/test_processor_vipllava.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/vipllava.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/processing_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/processing_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/vipllava.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/llava.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/llava.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/llava_next.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/llava_next.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/processing_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/llava_next.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2024-07-19 10:08:56 +05:00
amyeroberts	1de7dc7403	Skip tests properly (#31308 ) * Skip tests properly * [test_all] * Add 'reason' as kwarg for skipTest * [test_all] Fix up * [test_all]	2024-06-26 21:59:08 +01:00
Raushan Turganbay	e71f2863d7	Add LLaVa NeXT Video (#31252 ) * squash into single commit * run diff once more * docstring * tests * minor chnages and ready to go * Update src/transformers/models/llava_next_video/processing_llava_next_video.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/vipllava/test_modeling_vipllava.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * [run-slow] llava-next-video * [run-slow] llava-next-video * [run-slow] llava_next_video * fix two tests * fix slow tests * remove logit checks due to numeric errors * run test once more * [run-slow] llava_next_video * final try to pass the test * [run-slow] llava_next_video * [run-slow] llava_next_video * [run-slow] llava_next_video * style * fix * style --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-06-26 21:52:28 +05:00
amyeroberts	02c525d226	Rename misnamed image processor test files (#31430 )	2024-06-17 10:21:28 +01:00
amyeroberts	f53fe35b29	Fast image processor (#28847 ) * Draft fast image processors * Draft working fast version * py3.8 compatible cache * Enable loading fast image processors through auto * Tidy up; rescale behaviour based on input type * Enable tests for fast image processors * Smarter rescaling * Don't default to Fast * Safer imports * Add necessary Pillow requirement * Woops * Add AutoImageProcessor test * Fix up * Fix test for imagegpt * Fix test * Review comments * Add warning for TF and JAX input types * Rearrange * Return transforms * NumpyToTensor transformation * Rebase - include changes from upstream in ImageProcessingMixin * Safe typing * Fix up * convert mean/std to tesnor to rescale * Don't store transforms in state * Fix up * Update src/transformers/image_processing_utils_fast.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/auto/image_processing_auto.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/auto/image_processing_auto.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/auto/image_processing_auto.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Warn if fast image processor available * Update src/transformers/models/vit/image_processing_vit_fast.py * Transpose incoming numpy images to be in CHW format * Update mapping names based on packages, auto set fast to None * Fix up * Fix * Add AutoImageProcessor.from_pretrained(checkpoint, use_fast=True) test * Update src/transformers/models/vit/image_processing_vit_fast.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Add equivalence and speed tests * Fix up --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2024-06-11 15:47:38 +01:00
Raushan Turganbay	5d0bf59b4d	LLaVa-Next: Update docs with batched inference (#30857 ) * update docs with batch ex * Update docs/source/en/model_doc/llava_next.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * accept nested list of img --------- Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2024-05-20 13:45:56 +05:00

56 Commits