transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
yaswant19	99f6e5ee0b	Preliminary tests	2025-03-26 18:12:33 +05:30
yaswant19	544c19c9b1	update	2025-03-26 18:12:20 +05:30
yaswant19	16f0b92887	nits	2025-03-26 16:27:36 +05:30
yaswant19	2af7afb7c4	Minor fixes	2025-03-26 16:23:34 +05:30
yaswant19	ce81e5eb25	Added Text Model	2025-03-26 16:23:24 +05:30
yaswant19	c22841e91d	Refactor for lit variant	2025-03-26 16:20:33 +05:30
yaswant19	f59ac3b818	update	2025-03-26 16:20:13 +05:30
yaswant19	21b9231ba8	Added vison model	2025-03-26 16:19:56 +05:30
yaswant19	999be2a198	Added config and refactor	2025-03-23 00:25:33 +05:30
yaswant19	11ed21d6b4	Stupid mistake correction	2025-03-22 23:09:13 +05:30
yaswant19	4a0b442444	More changes	2025-03-22 23:06:17 +05:30
yaswant19	20f43aab77	More changes	2025-03-22 23:06:17 +05:30
yaswant19	daac33803a	Added support for aimv2-native	2025-03-22 23:06:17 +05:30
yaswant19	429e5b6348	changes	2025-03-22 23:06:17 +05:30
yaswant19	58edea8294	temp push	2025-03-22 23:06:17 +05:30
yaswanth	c63b292a0c	changes	2025-03-22 23:06:17 +05:30
yaswanth	8198d49871	Model skelton	2025-03-22 23:06:17 +05:30
Aritra Roy Gosthipaty	c9d1e5238a	Update installation.md (#36826 ) * Update installation.md * Update README.md	2025-03-21 16:32:02 -07:00
Steven Liu	d253de6d58	[docs] Model docs (#36469 ) * initial * fix * fix * update * fix * fixes * quantization * attention mask visualizer * multimodal * small changes * fix code samples	2025-03-21 15:35:22 -07:00
Yoni Gozlan	beb9b5b022	Fix Pan and Scan on batched images Gemma3 (#36864 ) * process flattened images in fast image proc * process flattened images in low proc and add tests * remove print * add unbalanced batch test pas image proc * fix integration tests	2025-03-21 13:56:00 -04:00
Cyril Vallez	dd3933dd65	Simplify keep_in_fp32_modules logic (#36722 ) * better regex everywhere * fix * Update test_modeling_instructblip.py * BC with explanations this time otherwise it makes no sense at all * Update test_modeling_instructblip.py * style * CIs * update _keep_in_fp32_modules in blip2 * Update modeling_utils.py * Update modeling_utils.py * style * CIs * add check * trigger CIs * Update modeling_utils.py * trigger CIs	2025-03-21 16:12:59 +01:00
Sukriti Sharma	90e2df5d55	fix: loss computation after embeddings resize - mllama (#36840 ) * move loss to generation class Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * code cleanup Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * test for resize and loss computation Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * fix tests Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * fix:test for resize and loss Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * fix resize embedding mllama test Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * review changes Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> --------- Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>	2025-03-21 14:47:59 +01:00
Arthur Zucker	4542b8fb27	push v4.51.0.dev0	2025-03-21 13:45:25 +01:00
Raushan Turganbay	523f6e743c	Fix: dtype cannot be str (#36262 ) * fix * this wan't supposed to be here, revert * refine tests a bit more	2025-03-21 13:27:47 +01:00
Pablo Montalvo	3f9ff19b4e	Minor Gemma 3 fixes (#36884 ) fix attention mask dtype + outputs type	2025-03-21 13:15:22 +01:00
Daniël de Kok	f94b0c59f2	Use `deformable_detr` kernel from the Hub (#36853 ) * Use `deformable_detr` kernel from the Hub Remove the `deformable_detr` kernel from `kernels/` and use the pre-built kernel from the Hub instead. * Add license header * Add `kernels` as an extra `hub-kernels` Also add it to `testing`, so that the kernel replacement gets tested when using CUDA in CI.	2025-03-21 13:08:47 +01:00
Pablo Montalvo	2638d54e78	Gemma 3 tests expect greedy decoding (#36882 ) tests expect greedy decoding	2025-03-21 12:36:39 +01:00
Pablo Montalvo	b8aadc31d5	🔴 🔴 🔴 supersede paligemma forward to shift pos id indexing (#36859 ) * supersede paligemma forward to shift pos id indexing * fix prepare_inputs_ as well * fix modular error --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2025-03-21 12:36:27 +01:00
Arthur Zucker	6321876b5b	add eustlb as an actor	2025-03-21 12:32:12 +01:00
Joao Gante	94f487626a	[generate] model defaults being inherited only happens for newer models (#36881 )	2025-03-21 11:01:09 +00:00
Arthur	f19d018bff	Revert "Update deprecated Jax calls (#35919 )" (#36880 ) * Revert "Update deprecated Jax calls (#35919)" This reverts commit `f0d5b2ff04`. * Revert "Update deprecated Jax calls (#35919)" This reverts commit `f0d5b2ff04`. * udpate	2025-03-21 11:01:44 +01:00
sebbaur	62116c967f	Make ViTPooler configurable (#36517 ) * Make ViT Pooler configurable, so that it is possible to pick the activation function and the number of channels in the output * Add documentation and allow functions as activations (instead of just string) * formatting change * Use ACT2FN * Formatting change * Formatting changes * force pooler_act to be string * force pooler_act to be string * Add configs to OBJECTS_TO_IGNORE to make check_docstrings happy * Making the same change in ijepa to make check_modular_conversion happy * Add IJepaConfig to make CI happy * rename pooler_size to pooler_output_size as defined in the config * typo * revert change to ignore variable * Ran utils/check_docstrings.py --fix_and_overwrite * revert unrelated change * remove redundant defaults * rename self.act -> self.activation * tanh activation function in mapping	2025-03-21 11:01:07 +01:00
Afanti	26c83490d2	chore: fix typos in the tests directory (#36813 ) * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * fix: format codes * chore: fix copy mismatch issue * fix: format codes * chore: fix copy mismatch issue * chore: fix copy mismatch issue * chore: fix copy mismatch issue * chore: restore previous words * chore: revert unexpected changes	2025-03-21 10:20:05 +01:00
regisss	0adbc873d0	Remove call to `.item` in `get_batch_samples` (#36861 )	2025-03-21 10:14:26 +01:00
Benjamin Bossan	6bb8565f0c	FIX FSDP plugin update for QLoRA (#36720 ) The _fsdp_qlora_plugin_updates checks for LoraConfig but other PEFT methods can also support quantized models, e.g. VeRA. Therefore, the isinstance check is now looking for PeftConfig in general. Moreover, the fsdp_plugin variable may be undefined in the 2nd if condition, leading to an `UnboundLocalError` error. This is fixed by not assigning the variable at all. I checked for tests that may need updating but only found test_fsdp_config_transformers_auto_wrap associated with this change. AFAICT, this test does not cover the changed code, since the test does not start the training loop. Therefore, I haven't updated any tests. LMK if/how this fix should be tested. Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-03-21 10:11:47 +01:00
Joao Gante	949cca4061	[CI] doc builder without custom image (#36862 ) * no image * test * revert jax version updates * make fixup * update autodoc path for model_addition_debugger * shieldgemma2 * add missing pages to toctree	2025-03-21 09:10:27 +00:00
Raushan Turganbay	97d2f9d8ae	Mllama: raise better error (#35934 ) * fix mllama * update test * fix test	2025-03-21 09:35:37 +01:00
Yoni Gozlan	6a2627918d	Refactor Aya Vision with modular (#36688 ) * refactor aya_vision with modular (incorrect docstring) * Fix docstrings * Fix other modulars * fix docstring * revert changes * add tie_weights and resize_token_embeddings	2025-03-20 15:34:56 -04:00
gautham	9e771bf402	Add support for seed in `DataCollatorForLanguageModeling` (#36497 ) Add support for `seed` in `DataCollatorForLanguageModeling`. Also wrote tests for verifying behaviour.	2025-03-20 18:27:43 +00:00
Joao Gante	ecd60d01c3	[CI] fix update metadata job (#36850 ) fix updata_metadata job	2025-03-20 17:17:36 +00:00
Raushan Turganbay	42c489f2ae	Gemma3: fix test (#36820 ) * fix test * require_read_token and public repo ids * flash-attn test uncomment * fix torchscript	2025-03-20 18:14:53 +01:00
Marc Sun	068b663f90	[torchao] revert to get_apply_tensor_subclass (#36849 ) * revert to old name * empty commit --------- Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>	2025-03-20 18:00:13 +01:00
Pablo Montalvo	1d3f35f30a	Add model visual debugger (#36798 ) * draft of model tracer visualiser * add context manager in addition to decorator * add debug utils to init * move model debugging utils to dedicated file * add documentation * protect some imports * format * move and protect imports * format * doc: improve errors in case of broken dummy imports. * format * use automatic torch backend * update doc * fix backend * (TEMP) move to dummies while backend wait * update documentation * doc	2025-03-20 17:37:29 +01:00
Haotong LIN	6515c25953	Add Prompt Depth Anything Model (#35401 ) * add prompt depth anything model by modular transformer * add prompt depth anything docs and imports * update code style according transformers doc * update code style: import order issue is fixed by custom_init_isort * fix depth shape from B,1,H,W to B,H,W which is as the same as Depth Anything * move prompt depth anything to vision models in _toctree.yml * update backbone test; there is no need for resnet18 backbone test * update init file & pass RUN_SLOW tests * update len(prompt_depth) to prompt_depth.shape[0] Co-authored-by: Joshua Lochner <admin@xenova.com> * fix torch_int/model_doc * fix typo * update PromptDepthAnythingImageProcessor * fix typo * fix typo for prompt depth anything doc * update promptda overview image link of huggingface repo * fix some typos in promptda doc * Update image processing to include pad_image, prompt depth position, and related explanations for better clarity and functionality. * add copy disclaimer for prompt depth anything image processing * fix some format typos in image processing and conversion scripts * fix nn.ReLU(False) to nn.ReLU() * rename residual layer as it's a sequential layer * move size compute to a separate line/variable for easier debug in modular prompt depth anything * fix modular format for prompt depth anything * update modular prompt depth anything * fix scale to meter and some internal funcs warp * fix code style in image_processing_prompt_depth_anything.py * fix issues in image_processing_prompt_depth_anything.py * fix issues in image_processing_prompt_depth_anything.py * fix issues in prompt depth anything * update converting script similar to mllamma * update testing for modeling prompt depth anything * update testing for image_processing_prompt_depth_anything * fix assertion in image_processing_prompt_depth_anything * Update src/transformers/models/prompt_depth_anything/modular_prompt_depth_anything.py Co-authored-by: Pavel Iakubovskii 
<qubvel@gmail.com> * Update src/transformers/models/prompt_depth_anything/modular_prompt_depth_anything.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update docs/source/en/model_doc/prompt_depth_anything.md Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update docs/source/en/model_doc/prompt_depth_anything.md Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * update some testing * fix testing * fix * add return doc for forward of prompt depth anything * Update src/transformers/models/prompt_depth_anything/modular_prompt_depth_anything.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update tests/models/prompt_depth_anything/test_modeling_prompt_depth_anything.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * fix prompt depth order * fix format for testing prompt depth anything * fix minor issues in prompt depth anything doc * fix format for modular prompt depth anything * revert format for modular prompt depth anything * revert format for modular prompt depth anything * update format for modular prompt depth anything * fix parallel testing errors * fix doc for prompt depth anything * Add header * Fix imports * Licence header --------- Co-authored-by: Joshua Lochner <admin@xenova.com> Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2025-03-20 16:12:44 +00:00
Pavel Iakubovskii	66291778dd	Refactor Attention implementation for ViT-based models (#36545 ) * Refactor vit attention * Refactor ViT-based models * 🚨🚨🚨 Fix prefix for DPT * Update params order * trigger tests * Fix Dinov2 attention * Fix DPT attention impl propagation for backbone config * Common test fix: config is modif. inplace - avoid it * view->reshape * Fixup * Fixup * Enable IJepa FA2 * Add FA2 in corresponding model docs	2025-03-20 15:15:01 +00:00
inkcherry	730d2a52e7	DeepSpeed tensor parallel+ZeRO (#36825 ) add ds tp change	2025-03-20 16:12:01 +01:00
fxmarty-amd	1a374799ce	Support loading Quark quantized models in Transformers (#36372 ) * add quark quantizer * add quark doc * clean up doc * fix tests * make style * more style fixes * cleanup imports * cleaning * precise install * Update docs/source/en/quantization/quark.md Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update tests/quantization/quark_integration/test_quark.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/utils/quantization_config.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * remove import guard as suggested * update copyright headers * add quark to transformers-quantization-latest-gpu Dockerfile * make tests pass on transformers main + quark==0.7 * add missing F8_E4M3 and F8_E5M2 keys from str_to_torch_dtype --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: Bowen Bao <bowenbao@amd.com> Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>	2025-03-20 15:40:51 +01:00
cyyever	ce091b1bda	Use pyupgrade --py39-plus to improve code (#36843 )	2025-03-20 14:39:44 +00:00
mobicham	3e8f0fbf44	Fix hqq skipped modules and dynamic quant (#36821 ) * Fix hqq skip_modules and dynamic_quant * fix skipped modules loading * add dynamic/skip HqqConfig test	2025-03-20 15:31:49 +01:00
Ella Charlaix	055afdb6bb	Fix ONNX export for sequence classification head (#36332 ) * set dtype to int32 * fix style	2025-03-20 14:22:48 +00:00

18356 Commits