transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-13 17:48:22 +06:00

Author	SHA1	Message	Date
Raushan Turganbay	523f6e743c	Fix: dtype cannot be str (#36262 ) * fix * this wan't supposed to be here, revert * refine tests a bit more	2025-03-21 13:27:47 +01:00
Pablo Montalvo	3f9ff19b4e	Minor Gemma 3 fixes (#36884 ) fix attention mask dtype + outputs type	2025-03-21 13:15:22 +01:00
Daniël de Kok	f94b0c59f2	Use `deformable_detr` kernel from the Hub (#36853 ) * Use `deformable_detr` kernel from the Hub Remove the `deformable_detr` kernel from `kernels/` and use the pre-built kernel from the Hub instead. * Add license header * Add `kernels` as an extra `hub-kernels` Also add it to `testing`, so that the kernel replacement gets tested when using CUDA in CI.	2025-03-21 13:08:47 +01:00
Pablo Montalvo	2638d54e78	Gemma 3 tests expect greedy decoding (#36882 ) tests expect greedy decoding	2025-03-21 12:36:39 +01:00
Pablo Montalvo	b8aadc31d5	🔴 🔴 🔴 supersede paligemma forward to shift pos id indexing (#36859 ) * supersede paligemma forward to shift pos id indexing * fix prepare_inputs_ as well * fix modular error --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2025-03-21 12:36:27 +01:00
Arthur Zucker	6321876b5b	add eustlb as an actor	2025-03-21 12:32:12 +01:00
Joao Gante	94f487626a	[generate] model defaults being inherited only happens for newer models (#36881 )	2025-03-21 11:01:09 +00:00
Arthur	f19d018bff	Revert "Update deprecated Jax calls (#35919 )" (#36880 ) * Revert "Update deprecated Jax calls (#35919)" This reverts commit `f0d5b2ff04`. * Revert "Update deprecated Jax calls (#35919)" This reverts commit `f0d5b2ff04`. * udpate	2025-03-21 11:01:44 +01:00
sebbaur	62116c967f	Make ViTPooler configurable (#36517 ) * Make ViT Pooler configurable, so that it is possible to pick the activation function and the number of channels in the output * Add documentation and allow functions as activations (instead of just string) * formatting change * Use ACT2FN * Formatting change * Formatting changes * force pooler_act to be string * force pooler_act to be string * Add configs to OBJECTS_TO_IGNORE to make check_docstrings happy * Making the same change in ijepa to make check_modular_conversion happy * Add IJepaConfig to make CI happy * rename pooler_size to pooler_output_size as defined in the config * typo * revert change to ignore variable * Ran utils/check_docstrings.py --fix_and_overwrite * revert unrelated change * remove redundant defaults * rename self.act -> self.activation * tanh activation function in mapping	2025-03-21 11:01:07 +01:00
Afanti	26c83490d2	chore: fix typos in the tests directory (#36813 ) * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * fix: format codes * chore: fix copy mismatch issue * fix: format codes * chore: fix copy mismatch issue * chore: fix copy mismatch issue * chore: fix copy mismatch issue * chore: restore previous words * chore: revert unexpected changes	2025-03-21 10:20:05 +01:00
regisss	0adbc873d0	Remove call to `.item` in `get_batch_samples` (#36861 )	2025-03-21 10:14:26 +01:00
Benjamin Bossan	6bb8565f0c	FIX FSDP plugin update for QLoRA (#36720 ) The _fsdp_qlora_plugin_updates checks for LoraConfig but other PEFT methods can also support quantized models, e.g. VeRA. Therefore, the isinstance check is now looking for PeftConfig in general. Moreover, the fsdp_plugin variable may be undefined in the 2nd if condition, leading to an `UnboundLocalError` error. This is fixed by not assigning the variable at all. I checked for tests that may need updating but only found test_fsdp_config_transformers_auto_wrap associated with this change. AFAICT, this test does not cover the changed code, since the test does not start the training loop. Therefore, I haven't updated any tests. LMK if/how this fix should be tested. Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-03-21 10:11:47 +01:00
Joao Gante	949cca4061	[CI] doc builder without custom image (#36862 ) * no image * test * revert jax version updates * make fixup * update autodoc path for model_addition_debugger * shieldgemma2 * add missing pages to toctree	2025-03-21 09:10:27 +00:00
Raushan Turganbay	97d2f9d8ae	Mllama: raise better error (#35934 ) * fix mllama * update test * fix test	2025-03-21 09:35:37 +01:00
Yoni Gozlan	6a2627918d	Refactor Aya Vision with modular (#36688 ) * refactor aya_vision with modular (incorrect docstring) * Fix docstrings * Fix other modulars * fix docstring * revert changes * add tie_weights and resize_token_embeddings	2025-03-20 15:34:56 -04:00
gautham	9e771bf402	Add support for seed in `DataCollatorForLanguageModeling` (#36497 ) Add support for `seed` in `DataCollatorForLanguageModeling`. Also wrote tests for verifying behaviour.	2025-03-20 18:27:43 +00:00
Joao Gante	ecd60d01c3	[CI] fix update metadata job (#36850 ) fix updata_metadata job	2025-03-20 17:17:36 +00:00
Raushan Turganbay	42c489f2ae	Gemma3: fix test (#36820 ) * fix test * require_read_token and public repo ids * flash-attn test uncomment * fix torchscript	2025-03-20 18:14:53 +01:00
Marc Sun	068b663f90	[torchao] revert to get_apply_tensor_subclass (#36849 ) * revert to old name * empty commit --------- Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>	2025-03-20 18:00:13 +01:00
Pablo Montalvo	1d3f35f30a	Add model visual debugger (#36798 ) * draft of model tracer visualiser * add context manager in addition to decorator * add debug utils to init * move model debugging utils to dedicated file * add documentation * protect some imports * format * move and protect imports * format * doc: improve errors in case of broken dummy imports. * format * use automatic torch backend * update doc * fix backend * (TEMP) move to dummies while backend wait * update documentation * doc	2025-03-20 17:37:29 +01:00
Haotong LIN	6515c25953	Add Prompt Depth Anything Model (#35401 ) * add prompt depth anything model by modular transformer * add prompt depth anything docs and imports * update code style according transformers doc * update code style: import order issue is fixed by custom_init_isort * fix depth shape from B,1,H,W to B,H,W which is as the same as Depth Anything * move prompt depth anything to vision models in _toctree.yml * update backbone test; there is no need for resnet18 backbone test * update init file & pass RUN_SLOW tests * update len(prompt_depth) to prompt_depth.shape[0] Co-authored-by: Joshua Lochner <admin@xenova.com> * fix torch_int/model_doc * fix typo * update PromptDepthAnythingImageProcessor * fix typo * fix typo for prompt depth anything doc * update promptda overview image link of huggingface repo * fix some typos in promptda doc * Update image processing to include pad_image, prompt depth position, and related explanations for better clarity and functionality. * add copy disclaimer for prompt depth anything image processing * fix some format typos in image processing and conversion scripts * fix nn.ReLU(False) to nn.ReLU() * rename residual layer as it's a sequential layer * move size compute to a separate line/variable for easier debug in modular prompt depth anything * fix modular format for prompt depth anything * update modular prompt depth anything * fix scale to meter and some internal funcs warp * fix code style in image_processing_prompt_depth_anything.py * fix issues in image_processing_prompt_depth_anything.py * fix issues in image_processing_prompt_depth_anything.py * fix issues in prompt depth anything * update converting script similar to mllamma * update testing for modeling prompt depth anything * update testing for image_processing_prompt_depth_anything * fix assertion in image_processing_prompt_depth_anything * Update src/transformers/models/prompt_depth_anything/modular_prompt_depth_anything.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update src/transformers/models/prompt_depth_anything/modular_prompt_depth_anything.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update docs/source/en/model_doc/prompt_depth_anything.md Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update docs/source/en/model_doc/prompt_depth_anything.md Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * update some testing * fix testing * fix * add return doc for forward of prompt depth anything * Update src/transformers/models/prompt_depth_anything/modular_prompt_depth_anything.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update tests/models/prompt_depth_anything/test_modeling_prompt_depth_anything.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * fix prompt depth order * fix format for testing prompt depth anything * fix minor issues in prompt depth anything doc * fix format for modular prompt depth anything * revert format for modular prompt depth anything * revert format for modular prompt depth anything * update format for modular prompt depth anything * fix parallel testing errors * fix doc for prompt depth anything * Add header * Fix imports * Licence header --------- Co-authored-by: Joshua Lochner <admin@xenova.com> Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2025-03-20 16:12:44 +00:00
Pavel Iakubovskii	66291778dd	Refactor Attention implementation for ViT-based models (#36545 ) * Refactor vit attention * Refactor ViT-based models * 🚨🚨🚨 Fix prefix for DPT * Update params order * trigger tests * Fix Dinov2 attention * Fix DPT attention impl propagation for backbone config * Common test fix: config is modif. inplace - avoid it * view->reshape * Fixup * Fixup * Enable IJepa FA2 * Add FA2 in corresponding model docs	2025-03-20 15:15:01 +00:00
inkcherry	730d2a52e7	DeepSpeed tensor parallel+ZeRO (#36825 ) add ds tp change	2025-03-20 16:12:01 +01:00
fxmarty-amd	1a374799ce	Support loading Quark quantized models in Transformers (#36372 ) * add quark quantizer * add quark doc * clean up doc * fix tests * make style * more style fixes * cleanup imports * cleaning * precise install * Update docs/source/en/quantization/quark.md Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update tests/quantization/quark_integration/test_quark.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/utils/quantization_config.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * remove import guard as suggested * update copyright headers * add quark to transformers-quantization-latest-gpu Dockerfile * make tests pass on transformers main + quark==0.7 * add missing F8_E4M3 and F8_E5M2 keys from str_to_torch_dtype --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: Bowen Bao <bowenbao@amd.com> Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>	2025-03-20 15:40:51 +01:00
cyyever	ce091b1bda	Use pyupgrade --py39-plus to improve code (#36843 )	2025-03-20 14:39:44 +00:00
mobicham	3e8f0fbf44	Fix hqq skipped modules and dynamic quant (#36821 ) * Fix hqq skip_modules and dynamic_quant * fix skipped modules loading * add dynamic/skip HqqConfig test	2025-03-20 15:31:49 +01:00
Ella Charlaix	055afdb6bb	Fix ONNX export for sequence classification head (#36332 ) * set dtype to int32 * fix style	2025-03-20 14:22:48 +00:00
Ryan Mullins	487dab1b2b	Shieldgemma2 (#36678 ) * single commit * correct config * fixup * dummy pt * Use ShieldGemma2Config in conversion script * Update src/transformers/models/shieldgemma2/configuration_shieldgemma2.py * Adding shieldgemma2 to models.__init__.py * Adding ShieldGemma2 to main __init__.py * Update shieldgemma2.md * Update shieldgemma2.md * Adding tests. Addressing review feedback. * Minor docs update * Fixing code quality feedback from CI * Fixing empty messages bug reported by ghunkins --------- Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com> Co-authored-by: Ren Pang <ain-soph@live.com>	2025-03-20 15:14:38 +01:00
HuangBugWei	a63e92e2f0	Fix: remove the redundant snippet of _whole_word_mask (#36759 ) remove the redundant snippet of _whole_word_mask	2025-03-20 14:10:43 +00:00
Ryan Mullins	8124a234ca	Gemma 3: Adding explicit GenerationConfig and refactoring conversion … (#36833 ) Gemma 3: Adding explicit GenerationConfig and refactoring conversion script	2025-03-20 15:03:32 +01:00
Pavel Iakubovskii	cf8091c017	Fix import for torch 2.0, 2.1 - guard typehint for "device_mesh" (#36768 ) * Fix device_mesh * Remove rebase leftover	2025-03-20 11:55:47 +00:00
Marc Sun	388e6659bf	Update min safetensors bis (#36823 ) * update setup.py * style	2025-03-20 12:50:07 +01:00
Joao Gante	b47d9b2f8a	[generate] clarify docstrings: when to inherit `GenerationMixin` (#36605 )	2025-03-20 10:58:54 +00:00
Joao Gante	8e97b44087	[modular] Sort modular skips (#36304 )	2025-03-20 10:55:12 +00:00
Artem Kudisov	63380b77d4	Pass state dict (#35234 ) * Pass state_dict argument to get_peft_model_state_dict * Style fix * Change arguments order	2025-03-20 11:54:59 +01:00
Joao Gante	957b05b413	[qwen2 audio] remove redundant code and update docs (#36282 )	2025-03-20 10:54:51 +00:00
rasmi	f0d5b2ff04	Update deprecated Jax calls (#35919 ) * Remove deprecated arguments for jax.numpy.clip. * Remove deprecated arguments for jax.numpy.clip. * Update jax version to 0.4.27 to 0.4.38. * Avoid use of deprecated xla_bridge.get_backend().platform Co-authored-by: Jake Vanderplas <jakevdp@google.com> --------- Co-authored-by: Jake Vanderplas <jakevdp@google.com>	2025-03-20 11:51:51 +01:00
Pavel Iakubovskii	1ddb64937c	Fix fp16 ONNX export for RT-DETR and RT-DETRv2 (#36460 ) * Fix FP16 ONNX export * Fix typo * Sync omdet-turbo * Refactor encoder for better readability * Fix _no_split_modules * Fix int -> torch_int * Fix rt_detr * Apply to rt-detr-v2 * Fixup * Fix copies	2025-03-20 10:43:51 +00:00
AbdelKarim ELJANDOUBI	e7337ee7be	Pass num_items_in_batch directly to loss computation (#36753 ) * Pass num_items_in_batch directly to loss computation * use self loss instead * fix loss kwrgs * fix vocab size	2025-03-20 10:35:35 +00:00
yutong_liu	8b479e39bb	Saving `Trainer.collator.tokenizer` in when `Trainer.processing_class` is `None` (#36552 ) * feat: Saving tokenizer in collator when processing_class is None * chore: Style issue * chore: Typo * dbg: Check why test failed * dbg: Remove logics and another test failed which successed before, so should be the stablibility issue * test: Init unit-test * chore: Style * chore: Add err log * fix: Case * Update tests/trainer/test_trainer.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * chore: Try to use get_regression_trainer * fix: Impl and style * fix: Style * fix: Case * fix: Import err * fix: Missed import * fix: Import block un-sorted problem * fix: Try another tokenizer * fix: Test logic * chore: Light updates * chore: Reformat --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-03-20 11:27:47 +01:00
Ita Zaporozhets	3f03c379d2	fix tiktoken convert to pass AddedToken to Tokenizer (#36566 ) * pass AddedToken to Tokenizer * ruff * handle dict for special tokens * option: test tokenizer from tiktoken same as fast * ruff * ruff	2025-03-20 11:26:49 +01:00
Stas Bekman	8f64b177f6	[ForCausalLMLoss] allow users to pass shifted labels (#36607 ) * [ForCausalLMLoss] allow users to pass shifted labels Signed-off-by: Stas Bekman <stas@stason.org> * style Signed-off-by: Stas Bekman <stas@stason.org> --------- Signed-off-by: Stas Bekman <stas@stason.org>	2025-03-20 11:25:22 +01:00
HDCharles	94555437e2	Disable inductor config setter by default (#36608 ) * Disable inductor config setter by default This is hard to debug and should be off by default * remove default settings in autoquant too * Add info to torchao.md about recommended settings * satisfying Ruff format Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags: --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-03-20 11:23:14 +01:00
Ze-Yi LIN	8733297b41	Fix swanlab global step (#36728 ) * fix * global step	2025-03-20 11:13:37 +01:00
Quentin Gallouédec	b815fae359	Move the warning to the documentation for DataCollatorWithFlattening (#36707 ) Remove init warning	2025-03-20 11:09:57 +01:00
Matt	9be4728af8	Just import torch AdamW instead (#36177 ) * Just import torch AdamW instead * Update docs too * Make AdamW undocumented * make fixup * Add a basic wrapper class * Add it back to the docs * Just remove AdamW entirely * Remove some AdamW references * Drop AdamW from the public init * make fix-copies * Cleanup some references * make fixup * Delete lots of transformers.AdamW references * Remove extra references to adamw_hf	2025-03-19 18:29:40 +00:00
Michael Feil	51bd0ceb9e	Update configuration_qwen2.py (#36735 ) * Update configuration_qwen2_moe.py * Update modeling_qwen2_moe.py * ruff fmt * docstring add qkv_bias	2025-03-19 18:15:54 +00:00
JJJYmmm	107fedc1e2	quick fix fast_image_processor register error (#36716 ) * fix fast_image_processor register error * update error message * remove redundant import * fix format	2025-03-19 18:05:45 +00:00
Mohamed Mekkouri	258dd9cc69	Add Space to Bitsandbytes doc (#36834 ) * add space * address review	2025-03-19 18:56:07 +01:00
Tugsbayasgalan Manlaibaatar	f39f4960f3	Support tracable dynamicKVcache (#36311 ) * Support tracable dynamicKVcache * Fix lint * More fine grained test * Lint * Update * Update * Fix up * Apply suggestions from code review * Update src/transformers/cache_utils.py * Update tests/utils/test_cache_utils.py * Apply suggestions from code review * Update * Change error message * Rename * Apply suggestions from code review * Apply suggestions from code review * Apply suggestions from code review --------- Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2025-03-19 16:52:30 +00:00

19383 Commits