transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-03 21:00:08 +06:00

Author	SHA1	Message	Date
Raushan Turganbay	f8b88866f5	[VLMs] support passing embeds along with pixels (#38467 ) * VLMs can work with embeds now * update more models * fix tests * fix copies * fixup * fix * style * unskip tests * fix copies * fix tests * style * omni modality models * qwen models had extra indentation * fix some other tests * fix copies * fix test last time * unrelated changes revert * we can't rely only on embeds * delete file * de-flake mistral3 * fix qwen models * fix style * fix tests * fix copies * deflake the test * modular reverted by fixes, fix again * flaky test, overwritten * fix copies * style	2025-07-01 11:33:20 +00:00
Yao Matrix	2100ee6545	fix UT failures on XPU w/ stock PyTorch 2.7 & 2.8 (#39116 ) * fix UT failures on XPU w/ stock PyTorch 2.7 & 2.8 Signed-off-by: YAO Matrix <matrix.yao@intel.com> * zamba2 Signed-off-by: YAO Matrix <matrix.yao@intel.com> * xx Signed-off-by: YAO Matrix <matrix.yao@intel.com> * internvl Signed-off-by: YAO Matrix <matrix.yao@intel.com> * tp cases Signed-off-by: YAO Matrix <matrix.yao@intel.com> --------- Signed-off-by: YAO Matrix <matrix.yao@intel.com>	2025-06-30 11:49:03 +02:00
Rémi Ouazan	25c44d4b68	Internvl fix (#38946 ) * Image processor compile fix (#38540) * Added a compile-friendly versiom of resize to BaseImgProcessorFast * Changed qwen2 processor to use its parent class .resize * Style * underlined issue only happens on AMD w/ comment and bool check * Fixed some utils functions * Fixed the same issue for bridgetower * Fixed the same issue for llava_next * Repo consistency for llava onevision * Update src/transformers/image_processing_utils_fast.py Co-authored-by: Mohit Sharma <mohit21sharma.ms@gmail.com> --------- Co-authored-by: Mohit Sharma <mohit21sharma.ms@gmail.com> * Added an Expectation to an internvl test * Made qwen2_vl use the resize method of its parent clas * Changed to torch.where --------- Co-authored-by: Mohit Sharma <mohit21sharma.ms@gmail.com>	2025-06-26 13:44:59 +02:00
Rémi Ouazan	9ff246db00	Expectation fixes and added AMD expectations (#38729 )	2025-06-13 16:14:58 +02:00
Yih-Dar	04cdf83244	Update some tests for torch 2.7.1 (#38701 ) * fix 1 * fix 2 * fix 3 * fix 4 * fp16 * break * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-10 11:46:52 +02:00
Yih-Dar	ebeec13609	Fix `InternVL` integration test (#38612 ) Some checks failed Self-hosted runner (benchmark) / Benchmark (aws-g5-4xlarge-cache) (push) Has been cancelled Details Build documentation / build (push) Has been cancelled Details Slow tests on important models (on Push - A10) / Get all modified files (push) Has been cancelled Details Self-hosted runner (push-caller) / Check if setup was changed (push) Has been cancelled Details Secret Leaks / trufflehog (push) Has been cancelled Details Update Transformers metadata / build_and_package (push) Has been cancelled Details Slow tests on important models (on Push - A10) / Slow & FA2 tests (push) Has been cancelled Details Self-hosted runner (push-caller) / build-docker-containers (push) Has been cancelled Details Self-hosted runner (push-caller) / Trigger Push CI (push) Has been cancelled Details * fix * fix * fix OOM --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-07 08:30:47 +02:00
Raushan Turganbay	17742bd9c8	🔴 [VLM] Add base model without head (#37033 ) * i guessreverted all CdGen classes * style * llava onevision * fix copies * fix some tests * some more tests * dump * skip these * nevermind, i am dumb * revert fix not needed * fixup * fixup * another fixup * more fixup to make ci finally happy * fixup after rebasing * fix qwen tests * add internVL + typos here and there * image token index -> id * style * fix init weights * revert blip-2 not supported * address comments * fix copies * revert blip2 test file as well * as discussed internally, revert back CdGen models * fix some tests * fix more tests for compile * CI red * fix copies * enumerate explicitly allowed models * address comments * fix tests * fixup * style again * add tests for new model class * another fixup ( x _ x ) * [fixup] unused attributes can be removed post-deprecation	2025-05-07 17:47:51 +02:00
Yao Matrix	34f26e2c3e	enable internvl UTs on XPU (#37779 ) * enable internvl UTs on XPU Signed-off-by: YAO Matrix <matrix.yao@intel.com> * fix style Signed-off-by: YAO Matrix <matrix.yao@intel.com> * fix style per comments Signed-off-by: Yao Matrix <matrix.yao@intel.com> --------- Signed-off-by: YAO Matrix <matrix.yao@intel.com> Signed-off-by: Yao Matrix <matrix.yao@intel.com>	2025-04-30 10:29:40 +02:00
Raushan Turganbay	1e9087368c	[internvl] fix chat template (#37656 ) * fix chat template * update * update conversion * rename `fake_image_token` in tests	2025-04-23 16:56:36 +02:00
Yoni Gozlan	a245011252	Add InternVL (2.5 MPO) (#35968 ) * initial commit * add convert internvl * add first end-to-end working internvl * nit prompt and image proc * add working chat template * add conversion llama-based models * add tests * pass all tests * fix isort * fix modular after main merge * add video processing for internvl * add support for interlaced images and videos * Remove processing and config from modular, add more tests * add llama model tests * Modify processor for compatibility with refactored got ocr image processor * add comments in processor * Add docs and nits * change video processing to use custom sample_indices_fn * rebase and fix tests * add processor tests * Add changes Raushan review * Use the new attention interface for the vision model * nits * add support for custom video_load_backend * remove mention to InternVLTokenizer * refactor vision model to simplify logic * refactor processor for better readibility * fix copies * fix require av processor test * refactor internVL vision * Update processor and fix processing tests * fix docstring * update convert_weights for internvl3 * change image processor to fast by default * remove do_center_crop=True in convert_weights * force use_cache to True * push_to_hub before reloading * fix internVLVision for larger models * update convert weight for qk norm * fix convert_weights * fix eos_token_id in convert * update docs and integration tests * make modifs after review * fix wrong k_norm and reduce modular * change image_token_index to image_token_id * change checkpoint to OpenGVLab org * last nits * explicitely del self.num_key_value_groups * add extra special tokens	2025-04-18 18:57:33 +02:00

10 Commits