transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-08 23:30:08 +06:00

Author	SHA1	Message	Date
youngrok cha	a5cc7a67d7	[bug] fix llava processor to calculate unpadding size correctly (#37988 ) * fix llava processor to calculate unpad size correctly * repo consistency * Revert "repo consistency" & "setUp in llava family" This reverts commit `26a50af8db`. * add edge case test for padding & unpadding * compute unpadding size from original size * make test config explicit * Revert "compute unpadding size from original size" This reverts commit `752cd27ad9`. * Revert "add edge case test for padding & unpadding" This reverts commit `ccbd094d69`. * revert unpad logic * remove irrelevant tests * model test * remove processor from model test --------- Co-authored-by: jaycha <jaycha@ncsoft.com>	2025-05-13 13:49:09 +00:00
Chris	67b3d45eb6	Fix `past_key_values` type hint in model output types (#37953 ) * F: Fix type hint. * F: Use Cache type. * F: Sort import. * U: Format. * U: Address reviews.	2025-05-13 13:36:49 +00:00
Eva Koroleva	07feaad8fb	Fix bug in prefill_chunk_size that ignores disable_compile flag (#38067 ) Fix bug in prefill_chunk_size implementation that ignores disable_compile flag	2025-05-13 13:23:23 +00:00
Raushan Turganbay	e40f301f1f	[smolvlm] skip the test (#38099 ) skip the test	2025-05-13 12:50:43 +00:00
ivarflakstad	e27d230ddd	Disable report callbacks for certain training tests (#38088 ) * Disable report callbacks for certain training tests * Disable report callbacks for test_auto_batch_size_finder	2025-05-13 14:49:55 +02:00
Bongseok Lee	ab65ba47ad	fix: Propagate `lr_scheduler_kwargs` options to create LR Scheduler when LayerWiseDummyOptimizer is used (#34559 ) fix: fix get_scheduler	2025-05-13 13:56:45 +02:00
Fanli Lin	8fb60bf6be	add timeout for downloading the `librispeech_asr` dataset (#38073 ) * add timeout * change 10 to 60	2025-05-13 11:50:12 +01:00
Yih-Dar	3ad35d0bca	update `require_read_token` (#38093 ) * update require_read_token * new repo * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-05-13 12:07:07 +02:00
Yoni Gozlan	e3b70b0d1c	Refactor image processor phi4 (#36976 ) * refactor image processor phi4 * nits fast image proc * add image tests phi4 * Fix image processing tests * update integration tests * remove revision and add comment in integration tests	2025-05-12 15:13:40 -04:00
Yih-Dar	4143f94d51	uninstall `kernels` from docker images (#38083 ) uninstall kernels Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-05-12 18:03:47 +02:00
Shiyu	a63cb7578e	update seed_worker to set seed based on worker_id and rank (#37980 ) * update seed_worker to set seed based on worker_id and rank * test case * set output_dir as remove tmp dir	2025-05-12 15:59:16 +00:00
efsotr	e387821a96	Fix tot update in trainer (#37923 ) * fix total updates in epoch * add test; fix max_steps * replace with multi-gpu decorator	2025-05-12 17:45:24 +02:00
Weipeng Jiang	f0e975c6cf	fix the inconsist docstring in apply_chat_template (#38069 ) The commit (`5cf11e5ab9`) fixed the type hints for the parameter `tools` in apply_chat_template, but the docstring was not changed.	2025-05-12 16:32:01 +01:00
Junlin Zhou	31791b16a1	chore(qwen2): display warning log only when sliding window attention … (#36316 ) * chore(qwen2): display warning log only when sliding window attention is enabled * Align modeling_qwen2.py and modular_qwen2.py --------- Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>	2025-05-12 16:31:44 +01:00
ivarflakstad	8ea72d12a2	Fix mt5 test on AMD devices (#38081 )	2025-05-12 16:59:00 +02:00
谭九鼎	5c85018072	docs: fix md style (#38057 )	2025-05-12 15:56:31 +01:00
ivarflakstad	7eaa90b87b	Add AMD expectation to test_gpt2_sample (#38079 )	2025-05-12 16:51:21 +02:00
Pavel Iakubovskii	4220039b29	Fix OneFormer integration test (#38016 ) * Fix integration tests * format	2025-05-12 16:02:41 +02:00
Joao Gante	8efe3a9d77	[`chat`] generate parameterization powered by `GenerationConfig` and UX-related changes (#38047 ) * accept arbitrary kwargs * move user commands to a separate fn * work with generation config files * rm cmmt * docs * base generate flag doc section * nits * nits * nits * no <br> * better basic args description	2025-05-12 14:04:41 +01:00
Raushan Turganbay	a5c6172c81	[VLM] fix loading issues (#38051 ) * fix qwen2-vl loading * fix a few nore models * delete print * fix copies	2025-05-12 10:14:04 +00:00
Raushan Turganbay	a31fa218ad	🔴 Video processors as a separate class (#35206 ) * initial design * update all video processors * add tests * need to add qwen2-vl (not tested yet) * add qwen2-vl in auto map * fix copies * isort * resolve confilicts kinda * nit: * qwen2-vl is happy now * qwen2-5 happy * other models are happy * fix copies * fix tests * add docs * CI green now? * add more tests * even more changes + tests * doc builder fail * nit * Update src/transformers/models/auto/processing_auto.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * small update * imports correctly * dump, otherwise this is getting unmanagebale T-T * dump * update * another update * update * tests * move * modular * docs * test * another update * init * remove flakiness in tests * fixup * clean up and remove commented lines * docs * skip this one! * last fix after rebasing * run fixup * delete slow files * remove unnecessary tests + clean up a bit * small fixes * fix tests * more updates * docs * fix tests * update * style * fix qwen2-5-vl * fixup * fixup * unflatten batch when preparing * dump, come back soon * add docs and fix some tests * how to guard this with new dummies? * chat templates in qwen * address some comments * remove `Fast` suffix * fixup * oops should be imported from transforms * typo in requires dummies * new model added with video support * fixup once more * last fixup I hope * revert image processor name + comments * oh, this is why fetch test is failing * fix tests * fix more tests * fixup * add new models: internvl, smolvlm * update docs * imprt once * fix failing tests * do we need to guard it here again, why? * new model was added, update it * remove testcase from tester * fix tests * make style * not related CI fail, lets' just fix here * mark flaky for now, filas 15 out of 100 * style * maybe we can do this way? * don't download images in setup class --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2025-05-12 11:55:51 +02:00
Arjuna Sky Kok	716819b830	fix(conversion): Fix size mismatch error during TF->PT model loading (#38014 )	2025-05-10 11:11:07 +00:00
Yao Matrix	8f08318769	enable generation fsdp/utils cases on XPU (#38009 ) * enable generation fsdp/utils test cases on XPU Signed-off-by: Yao Matrix <matrix.yao@intel.com> * fix style Signed-off-by: Yao Matrix <matrix.yao@intel.com> * xx Signed-off-by: Yao Matrix <matrix.yao@intel.com> * use backend_xx APIs Signed-off-by: Yao Matrix <matrix.yao@intel.com> * fix style Signed-off-by: Yao Matrix <matrix.yao@intel.com> --------- Signed-off-by: Yao Matrix <matrix.yao@intel.com>	2025-05-09 20:52:41 +00:00
Pavel Iakubovskii	87e971e14d	Fix linalg.norm for CovnNextV2 (#38015 ) Fix norm	2025-05-09 17:44:28 +01:00
Cyril Vallez	aaed2f5577	Fix cache update! (#38046 ) * fix slicing * better fix	2025-05-09 17:54:48 +02:00
Mikhail Moskovchenko	7f1a97bae3	Fix reduce-labels in BEIT Fast Image Processor (#38042 ) * Fixed reduce-labels * Little doc fix * Change docstring	2025-05-09 11:51:46 -04:00
Yih-Dar	9f9020fed3	Re-Enable `Trigger CircleCI via GitHub Actions when "ready for review" (#37885)` (#38041 ) * check actions * trigger CI * check actions * finally --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-05-09 16:57:54 +02:00
Lysandre Debut	23d79cea75	Support for version spec in requires & arbitrary mismatching depths across folders (#37854 ) * Support for version spec in requires & arbitrary mismatching depths * Quality * Testing	2025-05-09 15:26:27 +02:00
François REMY	774dc274ac	Do not erase a cache_position passed explicitly to generate(), if there is one (#37986 ) Do not erase a cache_position initialization passed explicitly to generate(), if there is one. But: Let initialization replace cache_position if it's set to None. I assume that if the value is explicitly passed but None, we should initialize anyway.	2025-05-09 10:56:21 +00:00
Yih-Dar	0010b41524	Disable `Trigger CircleCI via GitHub Actions when` ready for review` (#38038 ) disable Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-05-09 12:27:53 +02:00
Yih-Dar	d498528800	Trigger CircleCI via GitHub Actions when `ready for review` (#37885 ) * update * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-05-09 11:45:03 +02:00
Yih-Dar	66e696ee15	[Temporary] Log some information in some pytest/pluggy internal places (#37996 ) log pytest info Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-05-09 11:06:37 +02:00
Yao Matrix	a72cb31434	enable utils test cases on XPU (#38005 ) * enable utils test cases on XPU Signed-off-by: Yao Matrix <matrix.yao@intel.com> * fix style Signed-off-by: Yao Matrix <matrix.yao@intel.com> * Update tests/utils/test_skip_decorators.py Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com> * fix comment Signed-off-by: Yao Matrix <matrix.yao@intel.com> --------- Signed-off-by: Yao Matrix <matrix.yao@intel.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>	2025-05-09 08:45:01 +02:00
Yao Matrix	1dfad4beb2	make mistral3 pass on xpu (#37882 ) * enabled mistral3 test cases on XPU Signed-off-by: Yao Matrix <matrix.yao@intel.com> * calibrate A100 expectation Signed-off-by: YAO Matrix <matrix.yao@intel.com> * update * update * update * update * update * update --------- Signed-off-by: Yao Matrix <matrix.yao@intel.com> Signed-off-by: YAO Matrix <matrix.yao@intel.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-05-09 06:41:11 +00:00
Wing Lian	121f7037c7	fix document masking for chunked attention (#37429 ) * fix document masking for chunked attention * remove accidental debugging sum	2025-05-09 08:22:00 +02:00
Arthur	5f5ccfdc54	[`AutoDocstring`] Based on inspect parsing of the signature (#33771 ) * delete common docstring * nit * updates * push * fixup * move stuff around fixup * no need for dataclas * damn nice modular * add auto class docstring * style * modular update * import autodocstring * fixup * maybe add original doc! * more cleanup * remove class do cas well * update * nits * more celanup * fix * wups * small check * updatez * some fixes * fix doc * update * nits * try? * nit * some updates * a little bit better * where ever we did not have help we are not really adding it! * revert llama config * small fixes and small tests * test * fixup * more fix-copies * updates * updates * fix doc building * style * small fixes * nits * fix-copies * fix merge issues faster * fix merge conf * nits jamba * ? * working autodoc for model class and forward except returns and example * support return section and unpack kwargs description * nits and cleanup * fix-copies * fix-copies * nits * Add support for llava-like models * fixup * add class args subset support * add examples inferred from automodel/pipelines * update ruff * autodocstring for Aria, Albert + fixups * Fix empty return blocks * fix copies * fix copies * add autodoc for all fast image processors + align, altclip * fix copies * add auto_doc for audio_spectrogram, auto_former, bark, bamba * Drastically improve speed + add bart beit bert * add autodoc to all bert-like models * Fix broken doc * fix copies * fix auto_docstring after merge * add autodoc to models * add models * add models * add models and improve support for optional, and custom shape in args docstring * update fast image processors * refactor auto_method_docstring in args_doc * add models and fix docstring parsing * add models * add models * remove debugging * add models * add fix_auto_docstrings and improve args_docs * add support for additional_info in args docstring * refactor (almost) all models * fix check docstring * fix -copies * fill in all missing docstrings * fix copies * fix qwen3 moe docstring * add documentation * add back labels * update docs and fix can_return_tuple in modular files * fix LongformerForMaskedLM docstring * add auto_docstring to _toctree * remove auto_docstring tests temporarily * fix copyrights new files * fix can_return_tuple granite hybrid * fix fast beit * Fix empty config doc * add support for COMMON_CUSTOM_ARGS in check_docstrings and add missing models * fix code block not closed flava * fix can_return_tuple sam hq * Fix Flaubert dataclass --------- Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co> Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>	2025-05-08 17:46:07 -04:00
jiqing-feng	d231f5a7d4	update bnb tests (#38011 ) Signed-off-by: jiqing-feng <jiqing.feng@intel.com>	2025-05-08 20:35:24 +00:00
Yao Matrix	b3db4ddb22	enable mamba2 integration cases on xpu (#38006 ) * enable mamba2 integration cases on XPU Signed-off-by: Yao Matrix <matrix.yao@intel.com> * fix style Signed-off-by: Yao Matrix <matrix.yao@intel.com> --------- Signed-off-by: Yao Matrix <matrix.yao@intel.com>	2025-05-08 19:48:09 +00:00
Fanli Lin	c7c2f08994	make `test_speculative_decoding_non_distil` device-agnostic (#38010 ) * make device-agnostic * use condition --------- Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-05-08 19:19:47 +00:00
Raushan Turganbay	d23aae2b8c	[VLMs] support attention backends (#37576 ) * update models * why rename * return attn weights when sdpa * fixes * fix attn implementation composite * fix moshi * add message * add typings * use explicitly all flags for each attn type * fix some tests * import what is needed * kosmos on main has ew attention already, yay * new models in main, run fixup * won't fix kosmos yet * fix-copies * clean up after rebasing * fix tests * style * dont cast attns to fp32 * did we update ruff? oke, let's just do what it asks * fix pixtral after rebase	2025-05-08 18:18:54 +02:00
Tomek	e296c63cd4	Fix wording in `torchscript.md` (#38004 ) Fix wording in torchscript.md	2025-05-08 16:47:45 +01:00
Yufeng Xu	1c65aef923	Fix incorrect installation instructions (for issue #37476 ) (#37640 ) * debugging issue 36758 * debugging issue 36758 * debugging issue 36758 * updated attn_mask type specification in _flash_attention_forward * removed pdb * added a blank line * removed indentation * update constants * remove unnecessary files * created installation script, modified README * modified requirements and install.sh * undo irrelevant changes * removed blank line * fixing installation guide * modified README, python requirements, and install script * removed tests_otuput * modified README * discarded installation script and python<3.13 requirement	2025-05-08 16:32:58 +01:00
Yih-Dar	f2909e024c	Skip `test_push_to_hub_with_saves_each_epoch` for now (#38022 ) * update * trigger CI --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-05-08 16:26:24 +02:00
Joao Gante	f2b59c6173	[caches] Raise exception on offloaded static caches + multi device (#37974 ) * skip tests on >1 gpu * add todo	2025-05-08 14:37:36 +01:00
Joao Gante	4279057d70	[CI] remove duplicated message on GH comment to run slow tests (#37970 ) duplicated msg	2025-05-08 14:35:54 +01:00
Yih-Dar	3390534f36	Print commit SHA on slack message for new model notification. (#38019 ) add commit info Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-05-08 15:26:19 +02:00
Pavel Iakubovskii	9f8fffed3c	Fix `Optional` typing (#38018 ) * Fix * trigger	2025-05-08 14:51:45 +02:00
Yuanyuan Chen	06c16de3d3	Enable RUF013 to enforce optional typing (#37266 ) * Enable RUF013 for Optional typing Signed-off-by: cyy <cyyever@outlook.com> * Add Optional to types * Format code Signed-off-by: cyy <cyyever@outlook.com> --------- Signed-off-by: cyy <cyyever@outlook.com>	2025-05-08 12:39:56 +02:00
Aurélien Lac	f6664ee713	Add ALL_ATTENTION_FUNCTIONS compatibility for Pixtral model (#37960 ) * Add ALL_ATTENTION_FUNCTIONS compatibility for Pixtral model * Fix invalid operand type * Allow image_sizes to be optional in forward pass to fit tests Disallow using sdpa and output_attentions * Disallow using sdpa with output_attentions * Delete useless comments, use eager attention from smolvlm, use pattern from mistral * add _supports_attention_backend * use kwargs instead of position_ids --------- Co-authored-by: aurelien.lac <aurelien.lac@lighton.ai>	2025-05-08 12:13:13 +02:00
Sebastiaan Vermeulen	015b6dfbf8	Fix `pad` image transform for batched inputs (#37544 ) * fix * add batch dimension to expected output	2025-05-08 10:51:15 +01:00

19383 Commits