transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Ryan Mullins	487dab1b2b	Shieldgemma2 (#36678 ) * single commit * correct config * fixup * dummy pt * Use ShieldGemma2Config in conversion script * Update src/transformers/models/shieldgemma2/configuration_shieldgemma2.py * Adding shieldgemma2 to models.__init__.py * Adding ShieldGemma2 to main __init__.py * Update shieldgemma2.md * Update shieldgemma2.md * Adding tests. Addressing review feedback. * Minor docs update * Fixing code quality feedback from CI * Fixing empty messages bug reported by ghunkins --------- Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com> Co-authored-by: Ren Pang <ain-soph@live.com>	2025-03-20 15:14:38 +01:00
HuangBugWei	a63e92e2f0	Fix: remove the redundant snippet of _whole_word_mask (#36759 ) remove the redundant snippet of _whole_word_mask	2025-03-20 14:10:43 +00:00
Ryan Mullins	8124a234ca	Gemma 3: Adding explicit GenerationConfig and refactoring conversion … (#36833 ) Gemma 3: Adding explicit GenerationConfig and refactoring conversion script	2025-03-20 15:03:32 +01:00
Pavel Iakubovskii	cf8091c017	Fix import for torch 2.0, 2.1 - guard typehint for "device_mesh" (#36768 ) * Fix device_mesh * Remove rebase leftover	2025-03-20 11:55:47 +00:00
Marc Sun	388e6659bf	Update min safetensors bis (#36823 ) * update setup.py * style	2025-03-20 12:50:07 +01:00
Joao Gante	b47d9b2f8a	[generate] clarify docstrings: when to inherit `GenerationMixin` (#36605 )	2025-03-20 10:58:54 +00:00
Joao Gante	8e97b44087	[modular] Sort modular skips (#36304 )	2025-03-20 10:55:12 +00:00
Artem Kudisov	63380b77d4	Pass state dict (#35234 ) * Pass state_dict argument to get_peft_model_state_dict * Style fix * Change arguments order	2025-03-20 11:54:59 +01:00
Joao Gante	957b05b413	[qwen2 audio] remove redundant code and update docs (#36282 )	2025-03-20 10:54:51 +00:00
rasmi	f0d5b2ff04	Update deprecated Jax calls (#35919 ) * Remove deprecated arguments for jax.numpy.clip. * Remove deprecated arguments for jax.numpy.clip. * Update jax version to 0.4.27 to 0.4.38. * Avoid use of deprecated xla_bridge.get_backend().platform Co-authored-by: Jake Vanderplas <jakevdp@google.com> --------- Co-authored-by: Jake Vanderplas <jakevdp@google.com>	2025-03-20 11:51:51 +01:00
Pavel Iakubovskii	1ddb64937c	Fix fp16 ONNX export for RT-DETR and RT-DETRv2 (#36460 ) * Fix FP16 ONNX export * Fix typo * Sync omdet-turbo * Refactor encoder for better readability * Fix _no_split_modules * Fix int -> torch_int * Fix rt_detr * Apply to rt-detr-v2 * Fixup * Fix copies	2025-03-20 10:43:51 +00:00
AbdelKarim ELJANDOUBI	e7337ee7be	Pass num_items_in_batch directly to loss computation (#36753 ) * Pass num_items_in_batch directly to loss computation * use self loss instead * fix loss kwrgs * fix vocab size	2025-03-20 10:35:35 +00:00
yutong_liu	8b479e39bb	Saving `Trainer.collator.tokenizer` in when `Trainer.processing_class` is `None` (#36552 ) * feat: Saving tokenizer in collator when processing_class is None * chore: Style issue * chore: Typo * dbg: Check why test failed * dbg: Remove logics and another test failed which successed before, so should be the stablibility issue * test: Init unit-test * chore: Style * chore: Add err log * fix: Case * Update tests/trainer/test_trainer.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * chore: Try to use get_regression_trainer * fix: Impl and style * fix: Style * fix: Case * fix: Import err * fix: Missed import * fix: Import block un-sorted problem * fix: Try another tokenizer * fix: Test logic * chore: Light updates * chore: Reformat --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-03-20 11:27:47 +01:00
Ita Zaporozhets	3f03c379d2	fix tiktoken convert to pass AddedToken to Tokenizer (#36566 ) * pass AddedToken to Tokenizer * ruff * handle dict for special tokens * option: test tokenizer from tiktoken same as fast * ruff * ruff	2025-03-20 11:26:49 +01:00
Stas Bekman	8f64b177f6	[ForCausalLMLoss] allow users to pass shifted labels (#36607 ) * [ForCausalLMLoss] allow users to pass shifted labels Signed-off-by: Stas Bekman <stas@stason.org> * style Signed-off-by: Stas Bekman <stas@stason.org> --------- Signed-off-by: Stas Bekman <stas@stason.org>	2025-03-20 11:25:22 +01:00
HDCharles	94555437e2	Disable inductor config setter by default (#36608 ) * Disable inductor config setter by default This is hard to debug and should be off by default * remove default settings in autoquant too * Add info to torchao.md about recommended settings * satisfying Ruff format --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-03-20 11:23:14 +01:00
Ze-Yi LIN	8733297b41	Fix swanlab global step (#36728 ) * fix * global step	2025-03-20 11:13:37 +01:00
Quentin Gallouédec	b815fae359	Move the warning to the documentation for DataCollatorWithFlattening (#36707 ) Remove init warning	2025-03-20 11:09:57 +01:00
Matt	9be4728af8	Just import torch AdamW instead (#36177 ) * Just import torch AdamW instead * Update docs too * Make AdamW undocumented * make fixup * Add a basic wrapper class * Add it back to the docs * Just remove AdamW entirely * Remove some AdamW references * Drop AdamW from the public init * make fix-copies * Cleanup some references * make fixup * Delete lots of transformers.AdamW references * Remove extra references to adamw_hf	2025-03-19 18:29:40 +00:00
Michael Feil	51bd0ceb9e	Update configuration_qwen2.py (#36735 ) * Update configuration_qwen2_moe.py * Update modeling_qwen2_moe.py * ruff fmt * docstring add qkv_bias	2025-03-19 18:15:54 +00:00
JJJYmmm	107fedc1e2	quick fix fast_image_processor register error (#36716 ) * fix fast_image_processor register error * update error message * remove redundant import * fix format	2025-03-19 18:05:45 +00:00
Mohamed Mekkouri	258dd9cc69	Add Space to Bitsandbytes doc (#36834 ) * add space * address review	2025-03-19 18:56:07 +01:00
Tugsbayasgalan Manlaibaatar	f39f4960f3	Support tracable dynamicKVcache (#36311 ) * Support tracable dynamicKVcache * Fix lint * More fine grained test * Lint * Update * Update * Fix up * Apply suggestions from code review * Update src/transformers/cache_utils.py * Update tests/utils/test_cache_utils.py * Apply suggestions from code review * Update * Change error message * Rename * Apply suggestions from code review * Apply suggestions from code review * Apply suggestions from code review --------- Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2025-03-19 16:52:30 +00:00
Matt	63c3116530	One more fix for reviewer assignment (#36829 ) * one more fix * one more fix * Trigger tests	2025-03-19 16:25:24 +00:00
Joao Gante	7c233980f4	[gemma 3] multimodal checkpoints + AutoModelForCausalLM (#36741 )	2025-03-19 15:04:19 +00:00
Yao Matrix	b11050d6a2	enable OffloadedCache on XPU from PyTorch 2.7 (#36654 ) * fix "Cannot copy out of meta tensor; no data!" issue for BartForConditionalGeneration model * follow Marc's suggestion to use _tie_weights to fix Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * enable OffloadedCache on XPU since PyTorch 2.7 Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * fix style Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * don't change bart Signed-off-by: root <root@a4bf01945cfe.jf.intel.com> * make code more concise per review comments Signed-off-by: N <matrix.yao@intel.com> * fix review comments Signed-off-by: root <root@a4bf01945cfe.jf.intel.com> * Revert "fix review comments" This reverts commit `acf1484b86`. * fix review comments Signed-off-by: root <root@a4bf01945cfe.jf.intel.com> * fix style Signed-off-by: root <root@a4bf01945cfe.jf.intel.com> --------- Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Signed-off-by: root <root@a4bf01945cfe.jf.intel.com> Signed-off-by: N <matrix.yao@intel.com> Co-authored-by: root <root@a4bf01945cfe.jf.intel.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-03-19 15:15:52 +01:00
Driss Guessous	e8d960329e	Add option for ao base configs (#36526 )	2025-03-19 14:59:47 +01:00
Arthur	fef8b7f8e9	Add attention visualization tool (#36630 ) * add utils fiel * style * nits * nits * update * updaets * update * fix init issues * big updates * nits * nits? * small updates * nites * there were still some models left * style * fixes * updates * nits _ fixes * push changes * update * update * update * Apply suggestions from code review Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> * style * styling and return a string for testing * small updates * always biderectional for now * update --------- Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>	2025-03-19 13:58:46 +01:00
Joao Gante	0fe0bae0a8	[Generation] remove leftover code from end-to-end compilation (#36685 )	2025-03-19 11:28:33 +00:00
Mohamed Mekkouri	a861db01e5	Fix Device map for bitsandbytes tests (#36800 ) fix	2025-03-19 11:57:13 +01:00
Yih-Dar	b9374a0763	Remove `dist": "loadfile"` for `pytest` in CircleCI jobs (#36811 ) * fasterrrrr * avoid crash in example jobs * avoid crash in TF example jobs --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-03-19 11:15:09 +01:00
Yao Matrix	4fa91b1be5	fix "Cannot copy out of meta tensor; no data!" issue for BartForConditionalGeneration model (#36572 ) * fix "Cannot copy out of meta tensor; no data!" issue for BartForConditionalGeneration model * follow Marc's suggestion to use _tie_weights to fix Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * fix review comments. Signed-off-by: N <matrix.yao@intel.com> * fix quality Signed-off-by: N <matrix.yao@intel.com> --------- Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Signed-off-by: N <matrix.yao@intel.com>	2025-03-19 10:48:47 +01:00
ivarflakstad	706703bba6	Expectations test utils (#36569 ) * Add expectation classes + tests * Use typing Union instead of \| * Use bits to track score in properties cmp method * Add exceptions and tests + comments * Remove compute cap minor as it is not needed currently * Simplify. Remove Properties class * Add example Exceptions usage * Expectations as dict subclass * Update example Exceptions usage * Refactor. Improve type name. Document score fn. * Rename to DeviceProperties.	2025-03-18 23:39:50 +01:00
Joao Gante	179d02ffb8	[generate] ✨ vectorized beam search ✨ (#35802 )	2025-03-18 18:39:36 +00:00
Yoni Gozlan	12f2ebef63	Support custom dosctrings in modular (#36726 ) * Override docstrings in modular if not none * Update doc	2025-03-18 14:00:54 -04:00
Gar	00915d3041	Fix chameleon's TypeError because inputs_embeds may None (#36673 ) * fix chameleon TypeError when inputs_embeds is None * reformat * hotfix	2025-03-18 18:59:30 +01:00
Marc Sun	14b597f518	Fix casting dtype for qunatization (#36799 ) * fix * remove print	2025-03-18 18:46:03 +01:00
Yoni Gozlan	30580f035b	Fix Mistral3 tests (#36797 ) * fix processor tests * fix modeling tests * fix test processor chat template * revert modeling test changes	2025-03-18 13:08:12 -04:00
Cyril Vallez	db1d4c5a0b	Loading optimizations (#36742 ) * improvements * Update modeling_utils.py * add some doc about loading * Update modeling_utils.py	2025-03-18 16:38:44 +01:00
Yih-Dar	7baf00089a	Update SHA for `tj-actions/changed-files` (#36795 ) * trigger * trigger --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-03-18 16:19:39 +01:00
Marc Sun	3017536ebf	fix hqq due to recent modeling changes (#36771 ) * fix-hqq * style * test	2025-03-18 12:20:27 +01:00
Cyril Vallez	e959530b8f	Add Mistral3 (#36790 ) * initial start * style and dummies * Create convert_mistral3_weights_to_hf.py * update * typo * typo * Update convert_mistral3_weights_to_hf.py * Update convert_mistral3_weights_to_hf.py * Update convert_mistral3_weights_to_hf.py * Update convert_mistral3_weights_to_hf.py * up * Update convert_mistral3_weights_to_hf.py * Update convert_mistral3_weights_to_hf.py * update * update * Update image_processing_mistral3.py * Update convert_mistral3_weights_to_hf.py * fix patch merger * Update convert_mistral3_weights_to_hf.py * Update convert_mistral3_weights_to_hf.py * up * update modular to fit * style * Update convert_mistral3_weights_to_hf.py * typo * Update modular_mistral3.py * simplify a lot all shape shenanigans * simplify * add working test processor * Add partially working common modeling tests * All tests working and remove mistral3 image processors * add docs and fixup * fix inference with image size >1540 * 🚨fix test image proc pixtral * Remove vision_feature_select_strategy * Update convert_mistral3_weights_to_hf.py * Update convert_mistral3_weights_to_hf.py * Update convert_mistral3_weights_to_hf.py * Update convert_mistral3_weights_to_hf.py * clean * fix test checkpoints * Update test_modeling_mistral3.py * Update test_modeling_mistral3.py * style * Use Pixtral processor * up * finish cleaning processor to use pixtral directly * Update __init__.py * Update processing_pixtral.py * doc * Update __init__.py * Update mistral3.md * Update _toctree.yml --------- Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co> Co-authored-by: yonigozlan <yoni.gozlan10@gmail.com>	2025-03-18 12:04:42 +01:00
Lysandre Debut	bd92073692	Fix gemma3_text tokenizer in mapping (#36793 )	2025-03-18 11:50:22 +01:00
Zebin	7426d02ea8	Fixing typo in gemma3 image_processor_fast and adding a small test (#36776 ) Co-authored-by: zebz13 <zeb@fedora> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-03-18 11:35:06 +01:00
Afanti	19b9d8ae13	chore: fix typos in tests directory (#36785 ) * chore: fix typos in tests directory * chore: fix typos in tests directory * chore: fix typos in tests directory * chore: fix typos in tests directory * chore: fix typos in tests directory * chore: fix typos in tests directory * chore: fix typos in tests directory	2025-03-18 10:31:13 +01:00
Afanti	7f5077e536	fix typos in the tests directory (#36717 )	2025-03-17 17:45:57 +00:00
Daniel Kleine	cbfb8d7b27	doc: Clarify `is_decoder` usage in PretrainedConfig documentation (#36724 ) * fix: clarify decoder usage in PretrainedConfig documentation * Apply suggestions from code review updated doc Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-03-17 09:40:25 -07:00
Steven Liu	ac1a1b66b9	[docs] Update README (#36265 ) * update * feedback * feedback * update versions	2025-03-17 09:37:19 -07:00
Joao Gante	cff4caa0c1	[CI] remove redundant checks in `test_eager_matches_sdpa_inference` (#36740 )	2025-03-17 16:29:18 +00:00
Christopher Akiki	e3af4fec91	[MINOR:TYPO] Update hubert.md (#36733 ) * [MINOR:TYPO] Update hubert.md - typo fix (wave2vec instead of hubert) - make code snippet copiable and runnable * Run tests	2025-03-17 09:07:51 -07:00


18306 Commits