Fanli Lin
8fb60bf6be
add timeout for downloading the librispeech_asr dataset ( #38073 )
...
* add timeout
* change 10 to 60
2025-05-13 11:50:12 +01:00
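A minimal sketch of how such a download timeout can be raised in a test, assuming the relevant knob is huggingface_hub's `HF_HUB_DOWNLOAD_TIMEOUT` environment variable (default 10 seconds) and the usual `hf-internal-testing/librispeech_asr_dummy` test dataset; the PR itself may wire the timeout in differently:

```python
import os

# Assumption: the flaky step is the Hub download behind load_dataset. huggingface_hub reads
# HF_HUB_DOWNLOAD_TIMEOUT (default 10s) when fetching files, so raising it to 60s mirrors the
# "change 10 to 60" note above. Set it before the Hub client is imported.
os.environ["HF_HUB_DOWNLOAD_TIMEOUT"] = "60"

from datasets import load_dataset

ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
print(len(ds))
```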
Yih-Dar
3ad35d0bca
update require_read_token
( #38093 )
...
* update require_read_token
* new repo
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-13 12:07:07 +02:00
Yoni Gozlan
e3b70b0d1c
Refactor image processor phi4 ( #36976 )
...
* refactor image processor phi4
* nits fast image proc
* add image tests phi4
* Fix image processing tests
* update integration tests
* remove revision and add comment in integration tests
2025-05-12 15:13:40 -04:00
Yih-Dar
4143f94d51
uninstall kernels from docker images ( #38083 )
...
uninstall kernels
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-12 18:03:47 +02:00
Shiyu
a63cb7578e
update seed_worker to set seed based on worker_id and rank ( #37980 )
...
* update seed_worker to set seed based on worker_id and rank
* test case
* set output_dir as remove tmp dir
2025-05-12 15:59:16 +00:00
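A rough sketch of the idea behind this change, assuming a helper of this shape (the exact signature and formula in the PR may differ): each dataloader worker derives its seed from both its `worker_id` and the process `rank`, so workers with the same id on different ranks do not replay identical augmentation streams.

```python
from functools import partial

import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import set_seed


def seed_worker(worker_id: int, num_workers: int, rank: int) -> None:
    # Hypothetical sketch: mix the base seed with both rank and worker_id so every
    # (rank, worker) pair gets a distinct, reproducible seed.
    init_seed = torch.initial_seed() % 2**32
    worker_seed = (init_seed + num_workers * rank + worker_id) % 2**32
    set_seed(worker_seed)


rank = 0  # would come from the distributed environment, e.g. torch.distributed.get_rank()
dataset = TensorDataset(torch.arange(8))
loader = DataLoader(
    dataset,
    num_workers=2,
    # DataLoader only passes worker_id to worker_init_fn, so the other arguments are bound here.
    worker_init_fn=partial(seed_worker, num_workers=2, rank=rank),
)
```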
efsotr
e387821a96
Fix total updates in trainer ( #37923 )
...
* fix total updates in epoch
* add test; fix max_steps
* replace with multi-gpu decorator
2025-05-12 17:45:24 +02:00
Weipeng Jiang
f0e975c6cf
fix the inconsistent docstring in apply_chat_template ( #38069 )
...
The commit (5cf11e5ab9) fixed the type hints for the parameter `tools` in apply_chat_template, but the docstring was not changed.
2025-05-12 16:32:01 +01:00
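For context on the `tools` parameter whose annotation and docstring are being brought back in line, this is the documented usage pattern; the checkpoint below is just one example of a model with a tool-aware chat template:

```python
from transformers import AutoTokenizer


def get_current_temperature(location: str) -> float:
    """Get the current temperature at a location.

    Args:
        location: The location to get the temperature for.
    """
    return 22.0


tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")
messages = [{"role": "user", "content": "What's the temperature in Paris?"}]

# `tools` accepts a list of callables (converted to JSON schema from their signatures and
# docstrings) or pre-built JSON-schema dicts; the docstring fix above aligns the description
# with that type hint.
prompt = tokenizer.apply_chat_template(
    messages, tools=[get_current_temperature], add_generation_prompt=True, tokenize=False
)
print(prompt)
```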
Junlin Zhou
31791b16a1
chore(qwen2): display warning log only when sliding window attention is enabled ( #36316 )
...
* chore(qwen2): display warning log only when sliding window attention is enabled
* Align modeling_qwen2.py and modular_qwen2.py
---------
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2025-05-12 16:31:44 +01:00
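A hedged sketch of the gating this commit describes; the names and exact condition are illustrative, not the actual Qwen2 modeling code:

```python
import logging

logger = logging.getLogger(__name__)


def maybe_warn_sliding_window(config, attn_implementation: str) -> None:
    # Only warn when sliding-window attention is actually turned on in the config,
    # rather than on every Qwen2 instantiation.
    sliding_window_enabled = (
        getattr(config, "use_sliding_window", False)
        and getattr(config, "sliding_window", None) is not None
    )
    if sliding_window_enabled and attn_implementation != "flash_attention_2":
        logger.warning(
            "Sliding window attention is enabled but `%s` does not implement it; "
            "results may differ from flash_attention_2.",
            attn_implementation,
        )
```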
ivarflakstad
8ea72d12a2
Fix mt5 test on AMD devices ( #38081 )
2025-05-12 16:59:00 +02:00
谭九鼎
5c85018072
docs: fix md style ( #38057 )
2025-05-12 15:56:31 +01:00
ivarflakstad
7eaa90b87b
Add AMD expectation to test_gpt2_sample ( #38079 )
2025-05-12 16:51:21 +02:00
Pavel Iakubovskii
4220039b29
Fix OneFormer integration test ( #38016 )
...
* Fix integration tests
* format
2025-05-12 16:02:41 +02:00
Joao Gante
8efe3a9d77
[chat] generate parameterization powered by GenerationConfig and UX-related changes ( #38047 )
...
* accept arbitrary kwargs
* move user commands to a separate fn
* work with generation config files
* rm cmmt
* docs
* base generate flag doc section
* nits
* nits
* nits
* no <br>
* better basic args description
2025-05-12 14:04:41 +01:00
Raushan Turganbay
a5c6172c81
[VLM] fix loading issues ( #38051 )
...
* fix qwen2-vl loading
* fix a few more models
* delete print
* fix copies
2025-05-12 10:14:04 +00:00
Raushan Turganbay
a31fa218ad
🔴 Video processors as a separate class ( #35206 )
...
* initial design
* update all video processors
* add tests
* need to add qwen2-vl (not tested yet)
* add qwen2-vl in auto map
* fix copies
* isort
* resolve conflicts kinda
* nit:
* qwen2-vl is happy now
* qwen2-5 happy
* other models are happy
* fix copies
* fix tests
* add docs
* CI green now?
* add more tests
* even more changes + tests
* doc builder fail
* nit
* Update src/transformers/models/auto/processing_auto.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* small update
* imports correctly
* dump, otherwise this is getting unmanageable T-T
* dump
* update
* another update
* update
* tests
* move
* modular
* docs
* test
* another update
* init
* remove flakiness in tests
* fixup
* clean up and remove commented lines
* docs
* skip this one!
* last fix after rebasing
* run fixup
* delete slow files
* remove unnecessary tests + clean up a bit
* small fixes
* fix tests
* more updates
* docs
* fix tests
* update
* style
* fix qwen2-5-vl
* fixup
* fixup
* unflatten batch when preparing
* dump, come back soon
* add docs and fix some tests
* how to guard this with new dummies?
* chat templates in qwen
* address some comments
* remove `Fast` suffix
* fixup
* oops should be imported from transforms
* typo in requires dummies
* new model added with video support
* fixup once more
* last fixup I hope
* revert image processor name + comments
* oh, this is why fetch test is failing
* fix tests
* fix more tests
* fixup
* add new models: internvl, smolvlm
* update docs
* import once
* fix failing tests
* do we need to guard it here again, why?
* new model was added, update it
* remove testcase from tester
* fix tests
* make style
* not related CI fail, let's just fix here
* mark flaky for now, fails 15 out of 100
* style
* maybe we can do this way?
* don't download images in setup class
---------
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-05-12 11:55:51 +02:00
Arjuna Sky Kok
716819b830
fix(conversion): Fix size mismatch error during TF->PT model loading ( #38014 )
2025-05-10 11:11:07 +00:00
Yao Matrix
8f08318769
enable generation fsdp/utils cases on XPU ( #38009 )
...
* enable generation fsdp/utils test cases on XPU
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
* fix style
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
* xx
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
* use backend_xx APIs
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
* fix style
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
---------
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
2025-05-09 20:52:41 +00:00
Pavel Iakubovskii
87e971e14d
Fix linalg.norm for ConvNextV2 ( #38015 )
...
Fix norm
2025-05-09 17:44:28 +01:00
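For reference, the norm in question lives in ConvNeXt V2's Global Response Normalization (GRN) layer; a minimal sketch of that computation (channels-last, per the reference implementation — the exact norm call fixed by the PR may be written differently):

```python
import torch


def grn(x: torch.Tensor, gamma: torch.Tensor, beta: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # x: (N, H, W, C). Per-channel global L2 norm over the spatial dims, then divisive
    # normalization across channels, scaled/shifted and added back as a residual.
    gx = torch.norm(x, p=2, dim=(1, 2), keepdim=True)
    nx = gx / (gx.mean(dim=-1, keepdim=True) + eps)
    return gamma * (x * nx) + beta + x


x = torch.randn(2, 8, 8, 64)
print(grn(x, gamma=torch.ones(64), beta=torch.zeros(64)).shape)  # torch.Size([2, 8, 8, 64])
```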
Cyril Vallez
aaed2f5577
Fix cache update! ( #38046 )
...
* fix slicing
* better fix
2025-05-09 17:54:48 +02:00
Mikhail Moskovchenko
7f1a97bae3
Fix reduce-labels in BEIT Fast Image Processor ( #38042 )
...
* Fixed reduce-labels
* Little doc fix
* Change docstring
2025-05-09 11:51:46 -04:00
Yih-Dar
9f9020fed3
Re-Enable Trigger CircleCI via GitHub Actions when "ready for review" ( #37885 ) ( #38041 )
...
* check actions
* trigger CI
* check actions
* finally
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-09 16:57:54 +02:00
Lysandre Debut
23d79cea75
Support for version spec in requires & arbitrary mismatching depths across folders ( #37854 )
...
* Support for version spec in requires & arbitrary mismatching depths
* Quality
* Testing
2025-05-09 15:26:27 +02:00
François REMY
774dc274ac
Do not erase a cache_position passed explicitly to generate(), if there is one ( #37986 )
...
Do not erase a cache_position initialization passed explicitly to generate(), if there is one.
But: Let initialization replace cache_position if it's set to None. I assume that if the value is explicitly passed but None, we should initialize anyway.
2025-05-09 10:56:21 +00:00
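In other words, the guard described above looks roughly like this (a simplified sketch, not the actual `generate()` internals):

```python
import torch


def init_cache_position(model_kwargs: dict, input_ids: torch.LongTensor, past_length: int = 0) -> dict:
    # Simplified sketch: keep a cache_position the caller passed explicitly, but (re)build it
    # from the current sequence length when it is missing or explicitly set to None.
    if model_kwargs.get("cache_position") is None:
        model_kwargs["cache_position"] = torch.arange(
            past_length, input_ids.shape[1], dtype=torch.long, device=input_ids.device
        )
    return model_kwargs


out = init_cache_position({"cache_position": None}, torch.ones(1, 5, dtype=torch.long))
print(out["cache_position"])  # tensor([0, 1, 2, 3, 4]) — rebuilt because the passed value was None
```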
Yih-Dar
0010b41524
Disable Trigger CircleCI via GitHub Actions when `ready for review` ( #38038 )
...
disable
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-09 12:27:53 +02:00
Yih-Dar
d498528800
Trigger CircleCI via GitHub Actions when ready for review ( #37885 )
...
* update
* update
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-09 11:45:03 +02:00
Yih-Dar
66e696ee15
[Temporary] Log some information in some pytest/pluggy internal places ( #37996 )
...
log pytest info
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-09 11:06:37 +02:00
Yao Matrix
a72cb31434
enable utils test cases on XPU ( #38005 )
...
* enable utils test cases on XPU
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
* fix style
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
* Update tests/utils/test_skip_decorators.py
Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
* fix comment
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
---------
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
2025-05-09 08:45:01 +02:00
Yao Matrix
1dfad4beb2
make mistral3 pass on xpu ( #37882 )
...
* enabled mistral3 test cases on XPU
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
* calibrate A100 expectation
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
* update
* update
* update
* update
* update
* update
---------
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-09 06:41:11 +00:00
Wing Lian
121f7037c7
fix document masking for chunked attention ( #37429 )
...
* fix document masking for chunked attention
* remove accidental debugging sum
2025-05-09 08:22:00 +02:00
Arthur
5f5ccfdc54
[AutoDocstring] Based on inspect parsing of the signature ( #33771 )
...
* delete common docstring
* nit
* updates
* push
* fixup
* move stuff around fixup
* no need for dataclass
* damn nice modular
* add auto class docstring
* style
* modular update
* import autodocstring
* fixup
* maybe add original doc!
* more cleanup
* remove class doc as well
* update
* nits
* more cleanup
* fix
* wups
* small check
* updates
* some fixes
* fix doc
* update
* nits
* try?
* nit
* some updates
* a little bit better
* wherever we did not have help we are not really adding it!
* revert llama config
* small fixes and small tests
* test
* fixup
* more fix-copies
* updates
* updates
* fix doc building
* style
* small fixes
* nits
* fix-copies
* fix merge issues faster
* fix merge conf
* nits jamba
* ?
* working autodoc for model class and forward except returns and example
* support return section and unpack kwargs description
* nits and cleanup
* fix-copies
* fix-copies
* nits
* Add support for llava-like models
* fixup
* add class args subset support
* add examples inferred from automodel/pipelines
* update ruff
* autodocstring for Aria, Albert + fixups
* Fix empty return blocks
* fix copies
* fix copies
* add autodoc for all fast image processors + align, altclip
* fix copies
* add auto_doc for audio_spectrogram, auto_former, bark, bamba
* Drastically improve speed + add bart beit bert
* add autodoc to all bert-like models
* Fix broken doc
* fix copies
* fix auto_docstring after merge
* add autodoc to models
* add models
* add models
* add models and improve support for optional, and custom shape in args docstring
* update fast image processors
* refactor auto_method_docstring in args_doc
* add models and fix docstring parsing
* add models
* add models
* remove debugging
* add models
* add fix_auto_docstrings and improve args_docs
* add support for additional_info in args docstring
* refactor (almost) all models
* fix check docstring
* fix-copies
* fill in all missing docstrings
* fix copies
* fix qwen3 moe docstring
* add documentation
* add back labels
* update docs and fix can_return_tuple in modular files
* fix LongformerForMaskedLM docstring
* add auto_docstring to _toctree
* remove auto_docstring tests temporarily
* fix copyrights new files
* fix can_return_tuple granite hybrid
* fix fast beit
* Fix empty config doc
* add support for COMMON_CUSTOM_ARGS in check_docstrings and add missing models
* fix code block not closed flava
* fix can_return_tuple sam hq
* Fix Flaubert dataclass
---------
Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-05-08 17:46:07 -04:00
jiqing-feng
d231f5a7d4
update bnb tests ( #38011 )
...
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-05-08 20:35:24 +00:00
Yao Matrix
b3db4ddb22
enable mamba2 integration cases on xpu ( #38006 )
...
* enable mamba2 integration cases on XPU
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
* fix style
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
---------
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
2025-05-08 19:48:09 +00:00
Fanli Lin
c7c2f08994
make test_speculative_decoding_non_distil device-agnostic ( #38010 )
...
* make device-agnostic
* use condition
---------
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-05-08 19:19:47 +00:00
Raushan Turganbay
d23aae2b8c
[VLMs] support attention backends ( #37576 )
...
* update models
* why rename
* return attn weights when sdpa
* fixes
* fix attn implementation composite
* fix moshi
* add message
* add typings
* use explicitly all flags for each attn type
* fix some tests
* import what is needed
* kosmos on main has new attention already, yay
* new models in main, run fixup
* won't fix kosmos yet
* fix-copies
* clean up after rebasing
* fix tests
* style
* dont cast attns to fp32
* did we update ruff? ok, let's just do what it asks
* fix pixtral after rebase
2025-05-08 18:18:54 +02:00
Tomek
e296c63cd4
Fix wording in torchscript.md ( #38004 )
...
Fix wording in torchscript.md
2025-05-08 16:47:45 +01:00
Yufeng Xu
1c65aef923
Fix incorrect installation instructions (for issue #37476 ) ( #37640 )
...
* debugging issue 36758
* debugging issue 36758
* debugging issue 36758
* updated attn_mask type specification in _flash_attention_forward
* removed pdb
* added a blank line
* removed indentation
* update constants
* remove unnecessary files
* created installation script, modified README
* modified requirements and install.sh
* undo irrelevant changes
* removed blank line
* fixing installation guide
* modified README, python requirements, and install script
* removed tests_output
* modified README
* discarded installation script and python<3.13 requirement
2025-05-08 16:32:58 +01:00
Yih-Dar
f2909e024c
Skip test_push_to_hub_with_saves_each_epoch for now ( #38022 )
...
* update
* trigger CI
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-08 16:26:24 +02:00
Joao Gante
f2b59c6173
[caches] Raise exception on offloaded static caches + multi device ( #37974 )
...
* skip tests on >1 gpu
* add todo
2025-05-08 14:37:36 +01:00
Joao Gante
4279057d70
[CI] remove duplicated message on GH comment to run slow tests ( #37970 )
...
duplicated msg
2025-05-08 14:35:54 +01:00
Yih-Dar
3390534f36
Print commit SHA on slack message for new model notification. ( #38019 )
...
add commit info
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-08 15:26:19 +02:00
Pavel Iakubovskii
9f8fffed3c
Fix Optional typing ( #38018 )
...
* Fix
* trigger
2025-05-08 14:51:45 +02:00
Yuanyuan Chen
06c16de3d3
Enable RUF013 to enforce optional typing ( #37266 )
...
* Enable RUF013 for Optional typing
Signed-off-by: cyy <cyyever@outlook.com>
* Add Optional to types
* Format code
Signed-off-by: cyy <cyyever@outlook.com>
---------
Signed-off-by: cyy <cyyever@outlook.com>
2025-05-08 12:39:56 +02:00
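RUF013 flags "implicit Optional": a parameter annotated with a plain type but defaulted to None. A small before/after example of the pattern the rule enforces (function names here are illustrative):

```python
from typing import Optional


# Flagged by RUF013: the default is None, but the annotation claims the argument is always a str.
def tokenize_implicit(text: str, pad_token: str = None) -> list:
    return text.split() + ([pad_token] if pad_token else [])


# Compliant: the annotation states explicitly that None is allowed.
def tokenize_explicit(text: str, pad_token: Optional[str] = None) -> list:
    return text.split() + ([pad_token] if pad_token else [])


print(tokenize_explicit("hello world"))  # ['hello', 'world']
```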
Aurélien Lac
f6664ee713
Add ALL_ATTENTION_FUNCTIONS compatibility for Pixtral model ( #37960 )
...
* Add ALL_ATTENTION_FUNCTIONS compatibility for Pixtral model
* Fix invalid operand type
* Allow image_sizes to be optional in forward pass to fit tests
Disallow using sdpa and output_attentions
* Disallow using sdpa with output_attentions
* Delete useless comments, use eager attention from smolvlm, use pattern from mistral
* add _supports_attention_backend
* use kwargs instead of position_ids
---------
Co-authored-by: aurelien.lac <aurelien.lac@lighton.ai>
2025-05-08 12:13:13 +02:00
Sebastiaan Vermeulen
015b6dfbf8
Fix pad image transform for batched inputs ( #37544 )
...
* fix
* add batch dimension to expected output
2025-05-08 10:51:15 +01:00
Eon Kim
5c47d08b0d
Add Swin2SR ImageProcessorFast ( #37169 )
...
* Add fast image processor support for Swin2SR
* Add Swin2SR tests of fast image processing
* Update docs and remove unnecessary test func
* Fix docstring formatting
* Skip fast vs slow processing test
---------
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-05-07 12:20:16 -04:00
Raushan Turganbay
17742bd9c8
🔴 [VLM] Add base model without head ( #37033 )
...
* i guess reverted all CdGen classes
* style
* llava onevision
* fix copies
* fix some tests
* some more tests
* dump
* skip these
* nevermind, i am dumb
* revert fix not needed
* fixup
* fixup
* another fixup
* more fixup to make ci finally happy
* fixup after rebasing
* fix qwen tests
* add internVL + typos here and there
* image token index -> id
* style
* fix init weights
* revert blip-2 not supported
* address comments
* fix copies
* revert blip2 test file as well
* as discussed internally, revert back CdGen models
* fix some tests
* fix more tests for compile
* CI red
* fix copies
* enumerate explicitly allowed models
* address comments
* fix tests
* fixup
* style again
* add tests for new model class
* another fixup ( x _ x )
* [fixup] unused attributes can be removed post-deprecation
2025-05-07 17:47:51 +02:00
eustlb
3fa8d9c20e
[CSM] tiny fix on generation ( #38001 )
...
nit
2025-05-07 11:45:23 -04:00
eustlb
798f948e88
Add CSM model ( #36719 )
...
* draft structure
* depth decoder with forward pre hook
* full model forward draft
* draft update
* depth decoder update
* ConversationalSpeechModelForCausalLM updates
* add generate
* max length criteria small fix
* update
* updates
* generation update
* update in loss compute
* conversion script
* update for correct input embeddings
* handle interleaved rope
* update
* update
* update
* support compile
* update training
* add doc
* update doc
* correct inits
* ConversationalSpeechModel -> Csm
* conf update
* name update
* tests CsmForCausalLMTest
* convert use cached_file
* conf + modeling updates
* generate utils handle third dim shape
* integration test
* modeling + conf updates
* common test handle more than 2 dims
* add nested audio list utils
* processing handle nested audio list
* csm processing draft
* mimi util
* init updates
* modular update
* convert modular
* processing update
* csm tests update
* generate tests handle third dim
* generate utils handle third dim
* propagate _get_initial_cache_position update
* tied_weight_keys update + convert correctly
* fix inputs_embeds
* revert audio nested list
* batch inference update + return audio
* audio_utils update
* processor update
* some more integration tests
* remove old test
* processing output labels
* improve
* fix
* update rope values with equivalent ones
* conversion update
* update tests
* handle depth decoder generation config
* remove default eos_token_id
* make style
* revert modeling_mimi
* add default generation_config
* remove sdpa since handled by default
* make
* fix conflict
* fix conflicts
* correct naming
* correct imports
* make
* causal -> conditional naming
* causal -> conditional naming
* auto update
* make
* make
* add doc
* test update
* fix weight init
* audio tokens offsets as buffer
* 4d mask in conditional class
* make
* doc update
* fix causal mask
* fix causal mask
* doc update
* doc update
* add processor doc
* update doc
* fix 4d causal mask
* update make_list_of_audio
* do not default to mutable
* remove duplicates
* remove useless reset_parameters
* use GradientCheckpointingLayer
* use can_return_tuple
* formatting
* prepend placeholder in _sample
* torch compile fix
* some more fixies
* convert modular
* fix
* default max_length in convert
* handle depth decoder generation config correctly
* clearer formulation
* handle output_loading_info
* handle softmax warning
* add doc
* propagate _get_initial_cache_position changes
* generation in its own module
* add processor tests
* fix compile with cuda graphs
* fix compile with cuda graphs
* add csm.md
* include CSM loss
* doc nit
* doc nit
* doc nit
* Update docs/source/en/model_doc/csm.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* add save_audio to processor
* Update src/transformers/models/csm/modular_csm.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* doc update
* simplify audio_codes_mask computation
* doc update
* simplify loss computation
* fix static cache test
* fix
* remove comment
* simplify encoded length computation
* use hf-internal-testing
* doc update
* cast to float before numpy
* nit
* mem efficient codebook head
* nit
* cat input values with cutoffs
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-05-07 10:20:13 -04:00
Fiona Waters
c8607a17cb
Add a check to import_utils.py to allow for use of faiss_gpu installation ( #37997 )
...
Adding check to import_utils.py for faiss_gpu
2025-05-07 14:27:41 +01:00
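A hedged sketch of the kind of availability check described (not necessarily the exact code in `import_utils.py`): every faiss distribution installs the same importable `faiss` module, but the installed distribution may be named `faiss`, `faiss-cpu`, or `faiss-gpu`, so the metadata lookup has to try each name:

```python
import importlib.metadata
import importlib.util


def is_faiss_available() -> bool:
    # The module must be importable at all.
    if importlib.util.find_spec("faiss") is None:
        return False
    # The installed distribution can be any of these; accept whichever is present.
    for dist_name in ("faiss", "faiss-cpu", "faiss-gpu"):
        try:
            importlib.metadata.version(dist_name)
            return True
        except importlib.metadata.PackageNotFoundError:
            continue
    return False


print(is_faiss_available())
```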
kaixuanliu
fb1e3a4daa
remove duplicate code ( #37991 )
...
Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
2025-05-07 13:46:45 +01:00