transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-08-03 03:31:05 +06:00

Author	SHA1	Message	Date
cyyever	ba531278ca	Add ruff target-version (#36971 )	2025-03-25 19:41:25 +01:00
Steven Liu	a844297088	[docs] Fix image link (#36869 ) * fix image link * fix * update * fix	2025-03-25 11:34:21 -07:00
cyyever	d68a91aebf	Remove extra tensor clone in PyTorch code (#36748 ) * Use detach().clone() * Eliminate continuous() * Merge clone and other calls with to * Merge clone and other calls with to	2025-03-25 17:42:15 +00:00
Yih-Dar	121830ab47	update examples after ruff being updated (#36972 ) * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-03-25 18:15:47 +01:00
Sai-Suraj-27	a41677a68b	Updated docker files to use `uv` for installing packages (#36957 ) * Updated docker files to use uv pip install as uv is blazingly fast. * Removed -y flag for uv pip uninstall. * Passed --no-build-isolation flag --------- Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-03-25 18:12:51 +01:00
NargiT	3dce98a437	typo fixed in README_fr.md (#36951 )	2025-03-25 09:29:36 -07:00
湛露先生	ebd2029483	Change GPUS to GPUs (#36945 ) Signed-off-by: zhanluxianshen <zhanluxianshen@163.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-03-25 17:25:39 +01:00
Yih-Dar	69632aadb7	Update after #36962 (#36965 ) update Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-03-25 16:16:06 +01:00
Yih-Dar	c6814b4ee8	Update ruff to `0.11.2` (#36962 ) * update * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-03-25 16:00:11 +01:00
Joao Gante	bc1c90a755	[Utils] torch version checks optionally accept dev versions (#36847 )	2025-03-25 10:58:58 +00:00
Marc Sun	80b4c5dcc9	Fix cuda index issue in cache allocator (#36937 ) fix	2025-03-25 11:51:41 +01:00
Raushan Turganbay	0f733110a6	Support `return_tensors` in audio chat templates (#34601 ) * add audio chat templates * update * update * nit * green ci * we dont care about the order anymore * clean up after rebase * overriden tests rename * rename shieldgemma also * one more rename * require_read_token * removde images/videos * retrigger CI flaky	2025-03-25 11:08:47 +01:00
Afanti	19085c28da	fix typos in the tests directory (#36932 ) * chore: fix typos in test codes * chore: fix typos in test codes * chore: fix typos in test codes * chore: fix typos in test codes * chore: fix typos in test codes * chore: fix typos in test codes * chore: fix typos in test codes * chore: fix typos in test codes * chore: format codes	2025-03-25 10:49:24 +01:00
Guang Yang	69bcb86c58	Export for Phi4-mini (#36780 ) * Export for Phi4-mini * Update tests/models/phi3/test_modeling_phi3.py --------- Co-authored-by: Guang Yang <guangyang@fb.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-03-25 10:46:38 +01:00
Mohamed Mekkouri	be2c0e7bff	Fixing _pre_quantization_dtype when torch_dtype is None (#36930 ) fix	2025-03-25 10:43:27 +01:00
Cyril Vallez	4303d88c09	Add Phi4 multimodal (#36939 ) * raw start * update * update * add to imports * update * up * simplify configs * clean configs * style * typos * Update convert_phi4_multimodal_weights_to_hf.py * Update convert_phi4_multimodal_weights_to_hf.py * fix * up * up * up * Update convert_phi4_multimodal_weights_to_hf.py * Update convert_phi4_multimodal_weights_to_hf.py * up * up * up * Update feature_extraction_phi4_multimodal.py * up * up * up * up * up * simplify configs * typo * cut code * typo * typo * typo * re * typo * up * up * up * add tests * fix * fix * Update test_modeling_phi4_multimodal.py * up * Update test_modeling_phi4_multimodal.py * doc * fix * up * up * up * up * up * up * simplify * up * simplify * config docstrings * cleanup * clean * typo * typo * fix * Update phi4_multimodal.md * fix * fix * Update test_modeling_phi4_multimodal.py * update * simplify reshapes and permutes * up * simplify special tokens * simplify processor a lot * Update processing_phi4_multimodal.py * Update processing_phi4_multimodal.py * switch to fast processor * image processor * Update image_processing_phi4_multimodal_fast.py * add lora extraction to converter * Update convert_phi4_multimodal_weights_to_hf.py * Update __init__.py * add AudioInput type in audio_utils * rewrite feature_extraction: support torch batched FFT * input_audio_embeds -> audio_input_features, input_image_embeds -> image_pixel_values * test update * not mono channel warning update * remove auto maps from processor * kargs dispatch in processor * simplify kwargs dispatch * simplify merging * remove default sampling rate * style * Update test_modeling_phi4_multimodal.py * update doc * doc * torch only feature extractor * make fake tokens adjustable * Update feature_extraction_phi4_multimodal.py * fix * Update processing_phi4_multimodal.py * simplify mask * last touch * fix copies * style * Update audio_utils.py * style * Update feature_extraction_phi4_multimodal.py * Update __init__.py * docstrings * copies * fix all checks * back to fix-copies * trigger CIs * Update feature_extraction_phi4_multimodal.py * improve tests with multimodal inputs * trigger CIs --------- Co-authored-by: Eustache Le Bihan <eulebihan@gmail.com>	2025-03-25 09:55:21 +01:00
Raushan Turganbay	47e5432805	Deprecate #36741 and map Causal to Conditional (#36917 ) * deprecate the prev fix * reword warning and update docs * reword warning * tests * dont bloat `get_text_config()`	2025-03-25 09:13:56 +01:00
Mohamed Mekkouri	2b8a15cc3f	Disallow Offload to disk for gguf files (#36933 ) update Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-03-24 19:30:01 +01:00
Yoni Gozlan	91455c1825	Fix processor kwargs qwen2 vl (#36890 ) * Fix qwen2_vl and qwen2_5_vl processors cutom images kwargs * change version warning	2025-03-24 13:19:26 -04:00
gautham	48385aa4f4	Added support for seed in `DataCollatorForWholeWordMask` (#36903 ) * Added support for seed in `DataCollatorForWholeWordMask`, and also wrote tests. Also fixed bugs where the code hardcoded values for mask replacement probability and random replacement probability, instead of using the values passed by the user. * formatting issues * Used better way to generate seed in TF. Made tests more consistent.	2025-03-24 16:57:17 +00:00
Yih-Dar	5932606d8e	More precise comment (#36935 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-03-24 17:03:09 +01:00
Pavel Iakubovskii	2be2984462	Fix pytorch defomr attn path (#36923 ) * Fix pytorch path for DeformableAttention * Apply for GroundingDino	2025-03-24 15:58:51 +00:00
cyyever	00d077267a	[2/N] Use pyupgrade --py39-plus to improve code (#36857 ) Use pyupgrade --py39-plus to improve code	2025-03-24 15:42:25 +00:00
Ethan Knights	a6ecb54159	Update `trainer_pt_utils.py` docstrings for consistency (#36912 ) * Update trainer_pt_utils.py * update docstrings trainer_pt_utils.py for consistency * Update src/transformers/trainer_pt_utils.py --------- Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>	2025-03-24 14:46:41 +00:00
omahs	cbf924b76c	Fix typos (#36910 ) * fix typos * fix typos * fix typos * fix typos	2025-03-24 14:08:29 +00:00
Yih-Dar	340500b1a9	Use another repo. for Mistral3 processor testing (#36925 ) * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-03-24 14:36:05 +01:00
Mohamed Mekkouri	9e125d9a2e	Fix Compressed tensors to_dict_diff (#36922 ) fix	2025-03-24 13:06:33 +01:00
Raushan Turganbay	57f551c78d	[chameleon] fix num image token check (#36918 ) * [chameleon] fix num image token check * embed after merging image token * skip this also * mistral require_read_token	2025-03-24 12:36:08 +01:00
Dmitry Rogozhkin	a41e08aa19	tests: fix asyncio.wait() usage for python>=3.11 (#36898 ) tests: fix asyncio.wait() usage for python>=3.7 Passing coroutines directly to `asyncio.wait()` is deprecated since python 3.8 and removed starting from python 3.11. Instead, it's required to explicitly wrap coroutine in the task with `asyncio.create_task()` which first appeared in python 3.7. We step into this issue running the following Transformers tests on a system with python 3.11 or later (for example, Ubuntu 24.04 has python 3.12): * `tests/trainer/test_trainer_distributed.py` * `tests/extended/test_trainer_ext.py` The error will be: ``` src/transformers/testing_utils.py:2380: in execute_subprocess_async result = loop.run_until_complete( /usr/lib/python3.12/asyncio/base_events.py:687: in run_until_complete return future.result() src/transformers/testing_utils.py:2368: in _stream_subprocess await asyncio.wait( ... E TypeError: Passing coroutines is forbidden, use tasks explicitly. ``` See: https://docs.python.org/3.10/library/asyncio-task.html#asyncio.wait See: https://docs.python.org/3.7/library/asyncio-task.html#asyncio.create_task Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-03-24 11:53:59 +01:00
XinyuanTong	e28be7a692	[Fix] Add `original_max_position_embeddings` to YARN rope_scaling optional keys (#36877 ) [fix] Update optional keys in _validate_yarn_parameters to include original_max_position_embeddings	2025-03-24 11:05:19 +01:00
Raushan Turganbay	48da44be24	Fix torch version guard at import (#36907 ) fix	2025-03-24 10:33:33 +01:00
AbdelKarim ELJANDOUBI	fe4ca2f4a7	fix Gemma3 Config (#36893 ) * fix Gemma3 Config * fix config in modular gemm3	2025-03-24 10:05:44 +01:00
Aritra Roy Gosthipaty	c9d1e5238a	Update installation.md (#36826 ) * Update installation.md * Update README.md	2025-03-21 16:32:02 -07:00
Steven Liu	d253de6d58	[docs] Model docs (#36469 ) * initial * fix * fix * update * fix * fixes * quantization * attention mask visualizer * multimodal * small changes * fix code samples	2025-03-21 15:35:22 -07:00
Yoni Gozlan	beb9b5b022	Fix Pan and Scan on batched images Gemma3 (#36864 ) * process flattened images in fast image proc * process flattened images in low proc and add tests * remove print * add unbalanced batch test pas image proc * fix integration tests	2025-03-21 13:56:00 -04:00
Cyril Vallez	dd3933dd65	Simplify keep_in_fp32_modules logic (#36722 ) * better regex everywhere * fix * Update test_modeling_instructblip.py * BC with explanations this time otherwise it makes no sense at all * Update test_modeling_instructblip.py * style * CIs * update _keep_in_fp32_modules in blip2 * Update modeling_utils.py * Update modeling_utils.py * style * CIs * add check * trigger CIs * Update modeling_utils.py * trigger CIs	2025-03-21 16:12:59 +01:00
Sukriti Sharma	90e2df5d55	fix: loss computation after embeddings resize - mllama (#36840 ) * move loss to generation class Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * code cleanup Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * test for resize and loss computation Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * fix tests Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * fix:test for resize and loss Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * fix resize embedding mllama test Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * review changes Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> --------- Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>	2025-03-21 14:47:59 +01:00
Arthur Zucker	4542b8fb27	push v4.51.0.dev0	2025-03-21 13:45:25 +01:00
Raushan Turganbay	523f6e743c	Fix: dtype cannot be str (#36262 ) * fix * this wan't supposed to be here, revert * refine tests a bit more	2025-03-21 13:27:47 +01:00
Pablo Montalvo	3f9ff19b4e	Minor Gemma 3 fixes (#36884 ) fix attention mask dtype + outputs type	2025-03-21 13:15:22 +01:00
Daniël de Kok	f94b0c59f2	Use `deformable_detr` kernel from the Hub (#36853 ) * Use `deformable_detr` kernel from the Hub Remove the `deformable_detr` kernel from `kernels/` and use the pre-built kernel from the Hub instead. * Add license header * Add `kernels` as an extra `hub-kernels` Also add it to `testing`, so that the kernel replacement gets tested when using CUDA in CI.	2025-03-21 13:08:47 +01:00
Pablo Montalvo	2638d54e78	Gemma 3 tests expect greedy decoding (#36882 ) tests expect greedy decoding	2025-03-21 12:36:39 +01:00
Pablo Montalvo	b8aadc31d5	🔴 🔴 🔴 supersede paligemma forward to shift pos id indexing (#36859 ) * supersede paligemma forward to shift pos id indexing * fix prepare_inputs_ as well * fix modular error --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2025-03-21 12:36:27 +01:00
Arthur Zucker	6321876b5b	add eustlb as an actor	2025-03-21 12:32:12 +01:00
Joao Gante	94f487626a	[generate] model defaults being inherited only happens for newer models (#36881 )	2025-03-21 11:01:09 +00:00
Arthur	f19d018bff	Revert "Update deprecated Jax calls (#35919 )" (#36880 ) * Revert "Update deprecated Jax calls (#35919)" This reverts commit `f0d5b2ff04`. * Revert "Update deprecated Jax calls (#35919)" This reverts commit `f0d5b2ff04`. * udpate	2025-03-21 11:01:44 +01:00
sebbaur	62116c967f	Make ViTPooler configurable (#36517 ) * Make ViT Pooler configurable, so that it is possible to pick the activation function and the number of channels in the output * Add documentation and allow functions as activations (instead of just string) * formatting change * Use ACT2FN * Formatting change * Formatting changes * force pooler_act to be string * force pooler_act to be string * Add configs to OBJECTS_TO_IGNORE to make check_docstrings happy * Making the same change in ijepa to make check_modular_conversion happy * Add IJepaConfig to make CI happy * rename pooler_size to pooler_output_size as defined in the config * typo * revert change to ignore variable * Ran utils/check_docstrings.py --fix_and_overwrite * revert unrelated change * remove redundant defaults * rename self.act -> self.activation * tanh activation function in mapping	2025-03-21 11:01:07 +01:00
Afanti	26c83490d2	chore: fix typos in the tests directory (#36813 ) * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * fix: format codes * chore: fix copy mismatch issue * fix: format codes * chore: fix copy mismatch issue * chore: fix copy mismatch issue * chore: fix copy mismatch issue * chore: restore previous words * chore: revert unexpected changes	2025-03-21 10:20:05 +01:00
regisss	0adbc873d0	Remove call to `.item` in `get_batch_samples` (#36861 )	2025-03-21 10:14:26 +01:00
Benjamin Bossan	6bb8565f0c	FIX FSDP plugin update for QLoRA (#36720 ) The _fsdp_qlora_plugin_updates checks for LoraConfig but other PEFT methods can also support quantized models, e.g. VeRA. Therefore, the isinstance check is now looking for PeftConfig in general. Moreover, the fsdp_plugin variable may be undefined in the 2nd if condition, leading to an `UnboundLocalError` error. This is fixed by not assigning the variable at all. I checked for tests that may need updating but only found test_fsdp_config_transformers_auto_wrap associated with this change. AFAICT, this test does not cover the changed code, since the test does not start the training loop. Therefore, I haven't updated any tests. LMK if/how this fix should be tested. Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-03-21 10:11:47 +01:00
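
Note on a41e08aa19 (#36898): the message for that commit says coroutines can no longer be passed straight to `asyncio.wait()` on python 3.11+ and have to be wrapped with `asyncio.create_task()` first. A minimal sketch of that pattern follows. It is not the code from `src/transformers/testing_utils.py`; `read_stream` and its arguments are hypothetical stand-ins for whatever coroutines the real helper awaits.

```python
import asyncio


async def read_stream(name, delay):
    # Hypothetical stand-in for a coroutine that drains a subprocess stream.
    await asyncio.sleep(delay)
    return f"{name} done"


async def main():
    # Wrap each coroutine in a Task explicitly; passing the bare coroutine
    # objects to asyncio.wait() raises TypeError on python 3.11 and later.
    tasks = [
        asyncio.create_task(read_stream("stdout", 0.01)),
        asyncio.create_task(read_stream("stderr", 0.02)),
    ]
    # With the default return_when=ALL_COMPLETED, wait() returns once every
    # task has finished, so the pending set is empty.
    done, pending = await asyncio.wait(tasks)
    return sorted(task.result() for task in done), len(pending)


results, n_pending = asyncio.run(main())
print(results, n_pending)
```

The same wrapping also works on older interpreters, since `asyncio.create_task()` has been available since python 3.7.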

18421 Commits