transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-03 12:50:06 +06:00

Author	SHA1	Message	Date
Codys12	1e921a3a9c	Add optional RMSNorm support to BitNet quantization (config + layers) (#38087 ) * enable optional RMS in BitLinear * Fix naming * Import RMS from Llama using config.* * make fix-copies * ran CI loop * remove default BitNetQuantConfig values * Fix BitNetQuantConfig to be Optional * Fix config docstrings to match Optoinal * Edit docstrings to match standards --------- Co-authored-by: steinmetzc <codysteinmetz7@gmail.com> Co-authored-by: codys12 <steinmetzc@dh-mgmt4.hpc.msoe.edu> Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>	2025-05-16 12:38:06 +02:00
BakerBunker	57a79f51b2	Fix Qwen2.5 Omni `SinusoidsPositionEmbedding` precision (#38151 ) * Fix Qwen2.5 Omni `SinusoidsPositionEmbedding` precision fixes https://github.com/QwenLM/Qwen2.5-Omni/issues/271 * Update modular_qwen2_5_omni.py	2025-05-16 12:24:50 +02:00
Jerry Zhang	44fa04ae8d	Include output embedding as well with `include_embedding` flag (#37935 ) * Include output embedding as well with `include_embedding` flag Summary: att Test Plan: python tests/quantization/torchao_integration/test_torchao.py -k test_include_embedding Reviewers: Subscribers: Tasks: Tags: * format * rename include_embedding to include_input_output_embeddings --------- Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>	2025-05-16 12:06:11 +02:00
Yao Matrix	34c1e29cdd	enable autoround cases on XPU (#38167 ) * enable autoround cases on XPU Signed-off-by: Matrix Yao <matrix.yao@intel.com> * fix style Signed-off-by: Matrix Yao <matrix.yao@intel.com> --------- Signed-off-by: Matrix Yao <matrix.yao@intel.com>	2025-05-16 09:08:35 +00:00
Pavel Gein	0f77ca72ca	[FIX] Save speed metrics to logs (#38136 ) Previously, we calculated speed metrics and did not do anything with the result.	2025-05-15 16:58:50 +02:00
Simon Levine	27ef46e846	Omit creation of positional IDs within ESM if applicable (#38089 ) * omit pos emb creation * rft --------- Co-authored-by: sgottreich <sgottreich@absci.com>	2025-05-15 14:09:21 +00:00
Wing Lian	fe9426f12d	disable deepspeed when setting up fake trainer (#38101 ) * disable deepspeed when setting up fake trainer * Apply style fixes --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-05-15 15:34:04 +02:00
Yao Matrix	7caa57e85e	enable trainer test cases on xpu (#38138 ) * enable trainer test cases on xpu Signed-off-by: Matrix Yao <matrix.yao@intel.com> * fix style Signed-off-by: Matrix Yao <matrix.yao@intel.com> --------- Signed-off-by: Matrix Yao <matrix.yao@intel.com>	2025-05-15 12:17:44 +00:00
Aurélien Lac	b11b28cc4e	Hotfix: Flash Attention 2 support in Pixtral (#38146 ) setting attention_mask to None when flash_attention_2 is selected Co-authored-by: aurelien.lac <aurelien.lac@lighton.ai>	2025-05-15 11:45:35 +02:00
Joao Gante	0e0e5c1044	[generate] Run custom generation code from the Hub (#36405 ) * mvp * remove trust_remote_code * generate_from_hub * handle requirements; docs * english * doc PR suggestions * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * changed remote code path to generate/generate.py * model repo has custom generate -> override base generate * check for proper inheritance * some doc updates (missing: tag-related docs) * update docs to model repo * nit * nit * nits * Update src/transformers/dynamic_module_utils.py * Apply suggestions from code review * Update docs/source/en/generation_strategies.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * trust remote code is required * use new import utils for requirements version parsing * use org examples * add tests * Apply suggestions from code review Co-authored-by: Manuel de Prada Corral <6536835+manueldeprada@users.noreply.github.com> * ascii file structure; tag instructions on readme.md --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Pedro Cuenca <pedro@huggingface.co> Co-authored-by: Manuel de Prada Corral <6536835+manueldeprada@users.noreply.github.com>	2025-05-15 10:35:54 +01:00
Raushan Turganbay	955e61b0da	Remove head mask in generative models (#35786 ) * just squash into one commit * delete print	2025-05-15 10:44:19 +02:00
Yao Matrix	0173a99e73	enable csm integration cases on xpu, all passed (#38140 ) * enable csm test cases on XPU, all passed Signed-off-by: Matrix Yao <matrix.yao@intel.com> * fix style Signed-off-by: Matrix Yao <matrix.yao@intel.com> --------- Signed-off-by: Matrix Yao <matrix.yao@intel.com>	2025-05-15 09:46:29 +02:00
Huang, Guangtai	e5a48785d9	[Qwen3] Qwen3 MoE add tp plan for expert mlps (#38135 ) fix tp plan	2025-05-15 09:12:39 +02:00
Olivier Schipper	4005e30c80	Fix incorrect attention mask truncate in WhisperFlashAttention2 (#36477 ) * Fix incorrect attention mask truncate in whisper flash attention * also fix incorrect attention mask truncate in qwen2 audio * Nit attention mask truncate modeling_qwen2_audio.py * Nit attention mask truncate modeling_whisper.py Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com> --------- Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com> Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>	2025-05-14 20:08:31 +00:00
Sangbum Daniel Choi	aa27fa75cd	enable d_fine finetuning properly (#37962 ) add pre_output in the front Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2025-05-14 16:53:04 +01:00
Manuel de Prada Corral	e021bf6bf8	Add `manueldeprada` to `run_slow` whitelist (#38126 ) Add manueldeprada to run_slow allowed users	2025-05-14 15:16:58 +02:00
Arjuna Sky Kok	ef27b2bc22	[docs] add uv installation instructions for source builds (#37968 )	2025-05-14 13:09:41 +00:00
guspuffygit	4a2decd192	Update trainer.md (#38113 ) Fix typo in torch.compile method parameters	2025-05-14 12:40:00 +00:00
Kirire	935bbbc711	Add config validation and style tweaks (#37589 ) * Add config validation and style tweaks * Fix style issues * Fix style issues * style * Small fixes for copy/paste errors --------- Co-authored-by: Cyrile <cyrile.delestre@arkea.com>	2025-05-14 12:22:10 +00:00
ivarflakstad	1b00966395	Fix auto batch size finder test (#38125 ) Ensure --auto_find_batch_size is the last test arg so indexing is correct	2025-05-14 12:12:04 +00:00
Ritwick Chaudhry	fe918d13b9	Fix temporal padding in Qwen2VLImageProcessor when the number of frames is not divisible by temporal_patch_size (#38076 ) Qwen2VL: Fix temporal padding in Qwen2VLImageProcessor when frames are not divisible by temporal_patch_size	2025-05-14 12:28:21 +02:00
Raushan Turganbay	aaf224d570	[video processor] fix tests (#38104 ) * fix tests * delete * fix one more test * fix qwen + some tests are failing irrespective of `VideoProcessor` * delete file	2025-05-14 10:24:07 +00:00
Yao Matrix	9b5ce556aa	enable finegrained_fp8 and granite_speech cases on XPU (#38036 ) * enable finegrained_fp8 cases on XPU Signed-off-by: Yao Matrix <matrix.yao@intel.com> * fix style Signed-off-by: Yao Matrix <matrix.yao@intel.com> * change back to auto Signed-off-by: Yao Matrix <matrix.yao@intel.com> * rename per comments Signed-off-by: Matrix Yao <matrix.yao@intel.com> --------- Signed-off-by: Yao Matrix <matrix.yao@intel.com> Signed-off-by: Matrix Yao <matrix.yao@intel.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-05-14 08:58:40 +00:00
bilibili12433014	b311a3f506	Fix description and formatting errors in code docs (#38074 ) * Update stopping_criteria.py Fix description and formatting errors. * Update stopping_criteria.py Align formatting with existing files for consistency.	2025-05-13 17:17:15 +00:00
Marc Sun	b499a14b17	Add style bot (#38102 ) add style bot	2025-05-13 19:07:17 +02:00
eustlb	e0f225cb10	[CSM] update test for t4 runners (#38110 ) update test for t4 runners	2025-05-13 11:59:26 -04:00
Jinyong Lee	342961f669	Add Fast Image Processor for vilt (#37304 ) * init vilt image processor fast * Refactor image processor tests to use loop for all processors * Add ViltImageProcessorFast with PyTorch-based optimized image processing * Change made automatically by make fixup command * Change made automatically by make fix-copies command * Fix type hints in ViltImageProcessorFast for Python compatibility * Define constants for image resizing based on COCO dataset aspect ratio * Add missing property initializations to ViltImageProcessorFast * Extract resize logic into dedicated method in ViltImageProcessorFast * Extract padding logic into dedicated method * Implement shape-based image grouping for optimized processing in Vilt * Update test suite to verify ViltImageProcessorFast attributes * Move variable declarations to _preprocess method parameters * Remove unused parameters * Rename _resize method to resize to override existing function * Remove whitespace * Remove unnecessary type check and conversion for stacked_images * Remove redundant loop and apply padding directly to stacked images * Refactor pad function to return images and mask as tuple instead of dict * Add tests comparing padding masks in slow and fast implementations * Update ViltImageProcessor tests to ensure compatibility between slow and fast implementations * Replace add_start_docstrings with auto_docstring in ViltImageProcessorFast * Move docstrings of custom args to ViltFastImageProcessorKwargs * Use reorder_images function for both masks and images --------- Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>	2025-05-13 15:40:53 +00:00
Yoni Gozlan	8771766a70	Fix InternVL interpolate_pos_encoding and add to video_processing_auto (#38092 ) * fix InternVL interpolate_pos_encoding * fix modular and auto_video_processor for internvl	2025-05-13 11:18:40 -04:00
Yih-Dar	582d5e0e11	fix `check_bad commit.py` gives wrong results (#38107 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-05-13 16:58:22 +02:00
youngrok cha	a5cc7a67d7	[bug] fix llava processor to calculate unpadding size correctly (#37988 ) * fix llava processor to calculate unpad size correctly * repo consistency * Revert "repo consistency" & "setUp in llava family" This reverts commit `26a50af8db`. * add edge case test for padding & unpadding * compute unpadding size from original size * make test config explicit * Revert "compute unpadding size from original size" This reverts commit `752cd27ad9`. * Revert "add edge case test for padding & unpadding" This reverts commit `ccbd094d69`. * revert unpad logic * remove irrelevant tests * model test * remove processor from model test --------- Co-authored-by: jaycha <jaycha@ncsoft.com>	2025-05-13 13:49:09 +00:00
Chris	67b3d45eb6	Fix `past_key_values` type hint in model output types (#37953 ) * F: Fix type hint. * F: Use Cache type. * F: Sort import. * U: Format. * U: Address reviews.	2025-05-13 13:36:49 +00:00
Eva Koroleva	07feaad8fb	Fix bug in prefill_chunk_size that ignores disable_compile flag (#38067 ) Fix bug in prefill_chunk_size implementation that ignores disable_compile flag	2025-05-13 13:23:23 +00:00
Raushan Turganbay	e40f301f1f	[smolvlm] skip the test (#38099 ) skip the test	2025-05-13 12:50:43 +00:00
ivarflakstad	e27d230ddd	Disable report callbacks for certain training tests (#38088 ) * Disable report callbacks for certain training tests * Disable report callbacks for test_auto_batch_size_finder	2025-05-13 14:49:55 +02:00
Bongseok Lee	ab65ba47ad	fix: Propagate `lr_scheduler_kwargs` options to create LR Scheduler when LayerWiseDummyOptimizer is used (#34559 ) fix: fix get_scheduler	2025-05-13 13:56:45 +02:00
Fanli Lin	8fb60bf6be	add timeout for downloading the `librispeech_asr` dataset (#38073 ) * add timeout * change 10 to 60	2025-05-13 11:50:12 +01:00
Yih-Dar	3ad35d0bca	update `require_read_token` (#38093 ) * update require_read_token * new repo * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-05-13 12:07:07 +02:00
Yoni Gozlan	e3b70b0d1c	Refactor image processor phi4 (#36976 ) * refactor image processor phi4 * nits fast image proc * add image tests phi4 * Fix image processing tests * update integration tests * remove revision and add comment in integration tests	2025-05-12 15:13:40 -04:00
Yih-Dar	4143f94d51	uninstall `kernels` from docker images (#38083 ) uninstall kernels Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-05-12 18:03:47 +02:00
Shiyu	a63cb7578e	update seed_worker to set seed based on worker_id and rank (#37980 ) * update seed_worker to set seed based on worker_id and rank * test case * set output_dir as remove tmp dir	2025-05-12 15:59:16 +00:00
efsotr	e387821a96	Fix tot update in trainer (#37923 ) * fix total updates in epoch * add test; fix max_steps * replace with multi-gpu decorator	2025-05-12 17:45:24 +02:00
Weipeng Jiang	f0e975c6cf	fix the inconsist docstring in apply_chat_template (#38069 ) The commit (`5cf11e5ab9`) fixed the type hints for the parameter `tools` in apply_chat_template, but the docstring was not changed.	2025-05-12 16:32:01 +01:00
Junlin Zhou	31791b16a1	chore(qwen2): display warning log only when sliding window attention … (#36316 ) * chore(qwen2): display warning log only when sliding window attention is enabled * Align modeling_qwen2.py and modular_qwen2.py --------- Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>	2025-05-12 16:31:44 +01:00
ivarflakstad	8ea72d12a2	Fix mt5 test on AMD devices (#38081 )	2025-05-12 16:59:00 +02:00
谭九鼎	5c85018072	docs: fix md style (#38057 )	2025-05-12 15:56:31 +01:00
ivarflakstad	7eaa90b87b	Add AMD expectation to test_gpt2_sample (#38079 )	2025-05-12 16:51:21 +02:00
Pavel Iakubovskii	4220039b29	Fix OneFormer integration test (#38016 ) * Fix integration tests * format	2025-05-12 16:02:41 +02:00
Joao Gante	8efe3a9d77	[`chat`] generate parameterization powered by `GenerationConfig` and UX-related changes (#38047 ) * accept arbitrary kwargs * move user commands to a separate fn * work with generation config files * rm cmmt * docs * base generate flag doc section * nits * nits * nits * no <br> * better basic args description	2025-05-12 14:04:41 +01:00
Raushan Turganbay	a5c6172c81	[VLM] fix loading issues (#38051 ) * fix qwen2-vl loading * fix a few nore models * delete print * fix copies	2025-05-12 10:14:04 +00:00
Raushan Turganbay	a31fa218ad	🔴 Video processors as a separate class (#35206 ) * initial design * update all video processors * add tests * need to add qwen2-vl (not tested yet) * add qwen2-vl in auto map * fix copies * isort * resolve confilicts kinda * nit: * qwen2-vl is happy now * qwen2-5 happy * other models are happy * fix copies * fix tests * add docs * CI green now? * add more tests * even more changes + tests * doc builder fail * nit * Update src/transformers/models/auto/processing_auto.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * small update * imports correctly * dump, otherwise this is getting unmanagebale T-T * dump * update * another update * update * tests * move * modular * docs * test * another update * init * remove flakiness in tests * fixup * clean up and remove commented lines * docs * skip this one! * last fix after rebasing * run fixup * delete slow files * remove unnecessary tests + clean up a bit * small fixes * fix tests * more updates * docs * fix tests * update * style * fix qwen2-5-vl * fixup * fixup * unflatten batch when preparing * dump, come back soon * add docs and fix some tests * how to guard this with new dummies? * chat templates in qwen * address some comments * remove `Fast` suffix * fixup * oops should be imported from transforms * typo in requires dummies * new model added with video support * fixup once more * last fixup I hope * revert image processor name + comments * oh, this is why fetch test is failing * fix tests * fix more tests * fixup * add new models: internvl, smolvlm * update docs * imprt once * fix failing tests * do we need to guard it here again, why? * new model was added, update it * remove testcase from tester * fix tests * make style * not related CI fail, lets' just fix here * mark flaky for now, filas 15 out of 100 * style * maybe we can do this way? * don't download images in setup class --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2025-05-12 11:55:51 +02:00

18962 Commits