transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-13 17:48:22 +06:00

Author	SHA1	Message	Date
Marc Sun	79d6f9fd70	Log the correct learning rate (#36973 ) * fix learning rate log * fix lr log * add lr	2025-03-26 16:52:00 +01:00
Mohamed Mekkouri	13d36e89fe	Fix device_map check for ggml files (#37003 ) fix	2025-03-26 16:24:57 +01:00
Josh Marshall	021006e1b0	Fix removing "cpu" from frozenset in bitsandbytes.py to allow better ROCm support. (#36975 ) * Fix removing "cpu" from frozenset in bitsandbytes.py to allow better ROCm support. Related to https://github.com/bitsandbytes-foundation/bitsandbytes/issues/1573 and https://github.com/huggingface/transformers/issues/36949 , this resolves a bug in allowing ROCm/HIP support in bitsandbytes. * Related to bitsandbytes-foundation/bitsandbytes#1573 and huggingface#36949 , this resolves a bug in the bitsandbytes integration, allowing ROCm/HIP support in bitsandbytes. --------- Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>	2025-03-26 16:18:08 +01:00
Cyril Vallez	788e1092e9	Allow easy registration of custom attention functions (#36889 ) * Update modeling_utils.py * style * Update modeling_utils.py * Update modeling_utils.py * Update modeling_utils.py * Update modeling_utils.py * Update modeling_utils.py * Update modeling_utils.py * add to init * Update modeling_utils.py * style * update * Update modeling_utils.py * Update modeling_utils.py * style * Add some doc * Update _toctree.yml * readd it for tgi/vllm compat * CIs * CIs	2025-03-26 16:15:06 +01:00
ivarflakstad	ad5d40de9c	Fix get_device_properties (#36997 ) Fix remove remnant self from get_device_properties Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-03-26 15:46:34 +01:00
cyyever	8084b26294	Fix Optional type annotation (#36841 ) * Fix annotation * Update src/transformers/generation/candidate_generator.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/generation/utils.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/generation/utils.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2025-03-26 13:53:44 +00:00
Yih-Dar	b56d8f07e4	Install `networkx==3.2.1` manually in some CircleCI jobs after #36957 (#37000 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-03-26 14:49:09 +01:00
cyyever	78afa1c537	Use torch.expm1 (#36995 )	2025-03-26 13:06:33 +00:00
Yih-Dar	181d453069	byebye CircleCI TF jobs (#36998 ) * byebye tf jobs * byebye tf jobs --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-03-26 12:49:50 +01:00
cyyever	e7139d06f5	Fix tensor dtype mismatch (#36985 ) * Fix tensor dtype mismatch * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-03-26 10:37:46 +01:00
Yoni Gozlan	be37d34f44	🚨Deprecate legacy argument for image-text-to-text models and adopt new behavior by default (#36307 ) * deprecate legacy argument and adopt new behavior by default * revert back modification git	2025-03-25 17:32:17 -04:00
Yih-Dar	ab4656f6b7	update bot comment again (#36974 ) update Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-03-25 19:42:09 +01:00
cyyever	ba531278ca	Add ruff target-version (#36971 )	2025-03-25 19:41:25 +01:00
Steven Liu	a844297088	[docs] Fix image link (#36869 ) * fix image link * fix * update * fix	2025-03-25 11:34:21 -07:00
cyyever	d68a91aebf	Remove extra tensor clone in PyTorch code (#36748 ) * Use detach().clone() * Eliminate continuous() * Merge clone and other calls with to * Merge clone and other calls with to	2025-03-25 17:42:15 +00:00
Yih-Dar	121830ab47	update examples after ruff being updated (#36972 ) * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-03-25 18:15:47 +01:00
Sai-Suraj-27	a41677a68b	Updated docker files to use `uv` for installing packages (#36957 ) * Updated docker files to use uv pip install as uv is blazingly fast. * Removed -y flag for uv pip uninstall. * Passed --no-build-isolation flag --------- Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-03-25 18:12:51 +01:00
NargiT	3dce98a437	typo fixed in README_fr.md (#36951 )	2025-03-25 09:29:36 -07:00
湛露先生	ebd2029483	Change GPUS to GPUs (#36945 ) Signed-off-by: zhanluxianshen <zhanluxianshen@163.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-03-25 17:25:39 +01:00
Yih-Dar	69632aadb7	Update after #36962 (#36965 ) update Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-03-25 16:16:06 +01:00
Yih-Dar	c6814b4ee8	Update ruff to `0.11.2` (#36962 ) * update * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-03-25 16:00:11 +01:00
Joao Gante	bc1c90a755	[Utils] torch version checks optionally accept dev versions (#36847 )	2025-03-25 10:58:58 +00:00
Marc Sun	80b4c5dcc9	Fix cuda index issue in cache allocator (#36937 ) fix	2025-03-25 11:51:41 +01:00
Raushan Turganbay	0f733110a6	Support `return_tensors` in audio chat templates (#34601 ) * add audio chat templates * update * update * nit * green ci * we dont care about the order anymore * clean up after rebase * overridden tests rename * rename shieldgemma also * one more rename * require_read_token * removed images/videos * retrigger CI flaky	2025-03-25 11:08:47 +01:00
Afanti	19085c28da	fix typos in the tests directory (#36932 ) * chore: fix typos in test codes * chore: fix typos in test codes * chore: fix typos in test codes * chore: fix typos in test codes * chore: fix typos in test codes * chore: fix typos in test codes * chore: fix typos in test codes * chore: fix typos in test codes * chore: format codes	2025-03-25 10:49:24 +01:00
Guang Yang	69bcb86c58	Export for Phi4-mini (#36780 ) * Export for Phi4-mini * Update tests/models/phi3/test_modeling_phi3.py --------- Co-authored-by: Guang Yang <guangyang@fb.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-03-25 10:46:38 +01:00
Mohamed Mekkouri	be2c0e7bff	Fixing _pre_quantization_dtype when torch_dtype is None (#36930 ) fix	2025-03-25 10:43:27 +01:00
Cyril Vallez	4303d88c09	Add Phi4 multimodal (#36939 ) * raw start * update * update * add to imports * update * up * simplify configs * clean configs * style * typos * Update convert_phi4_multimodal_weights_to_hf.py * Update convert_phi4_multimodal_weights_to_hf.py * fix * up * up * up * Update convert_phi4_multimodal_weights_to_hf.py * Update convert_phi4_multimodal_weights_to_hf.py * up * up * up * Update feature_extraction_phi4_multimodal.py * up * up * up * up * up * simplify configs * typo * cut code * typo * typo * typo * re * typo * up * up * up * add tests * fix * fix * Update test_modeling_phi4_multimodal.py * up * Update test_modeling_phi4_multimodal.py * doc * fix * up * up * up * up * up * up * simplify * up * simplify * config docstrings * cleanup * clean * typo * typo * fix * Update phi4_multimodal.md * fix * fix * Update test_modeling_phi4_multimodal.py * update * simplify reshapes and permutes * up * simplify special tokens * simplify processor a lot * Update processing_phi4_multimodal.py * Update processing_phi4_multimodal.py * switch to fast processor * image processor * Update image_processing_phi4_multimodal_fast.py * add lora extraction to converter * Update convert_phi4_multimodal_weights_to_hf.py * Update __init__.py * add AudioInput type in audio_utils * rewrite feature_extraction: support torch batched FFT * input_audio_embeds -> audio_input_features, input_image_embeds -> image_pixel_values * test update * not mono channel warning update * remove auto maps from processor * kargs dispatch in processor * simplify kwargs dispatch * simplify merging * remove default sampling rate * style * Update test_modeling_phi4_multimodal.py * update doc * doc * torch only feature extractor * make fake tokens adjustable * Update feature_extraction_phi4_multimodal.py * fix * Update processing_phi4_multimodal.py * simplify mask * last touch * fix copies * style * Update audio_utils.py * style * Update feature_extraction_phi4_multimodal.py * Update __init__.py * docstrings * copies * fix all checks * back to fix-copies * trigger CIs * Update feature_extraction_phi4_multimodal.py * improve tests with multimodal inputs * trigger CIs --------- Co-authored-by: Eustache Le Bihan <eulebihan@gmail.com>	2025-03-25 09:55:21 +01:00
Raushan Turganbay	47e5432805	Deprecate #36741 and map Causal to Conditional (#36917 ) * deprecate the prev fix * reword warning and update docs * reword warning * tests * dont bloat `get_text_config()`	2025-03-25 09:13:56 +01:00
Mohamed Mekkouri	2b8a15cc3f	Disallow Offload to disk for gguf files (#36933 ) update Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-03-24 19:30:01 +01:00
Yoni Gozlan	91455c1825	Fix processor kwargs qwen2 vl (#36890 ) * Fix qwen2_vl and qwen2_5_vl processors custom images kwargs * change version warning	2025-03-24 13:19:26 -04:00
gautham	48385aa4f4	Added support for seed in `DataCollatorForWholeWordMask` (#36903 ) * Added support for seed in `DataCollatorForWholeWordMask`, and also wrote tests. Also fixed bugs where the code hardcoded values for mask replacement probability and random replacement probability, instead of using the values passed by the user. * formatting issues * Used better way to generate seed in TF. Made tests more consistent.	2025-03-24 16:57:17 +00:00
Yih-Dar	5932606d8e	More precise comment (#36935 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-03-24 17:03:09 +01:00
Pavel Iakubovskii	2be2984462	Fix pytorch deform attn path (#36923 ) * Fix pytorch path for DeformableAttention * Apply for GroundingDino	2025-03-24 15:58:51 +00:00
cyyever	00d077267a	[2/N] Use pyupgrade --py39-plus to improve code (#36857 ) Use pyupgrade --py39-plus to improve code	2025-03-24 15:42:25 +00:00
Ethan Knights	a6ecb54159	Update `trainer_pt_utils.py` docstrings for consistency (#36912 ) * Update trainer_pt_utils.py * update docstrings trainer_pt_utils.py for consistency * Update src/transformers/trainer_pt_utils.py --------- Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>	2025-03-24 14:46:41 +00:00
omahs	cbf924b76c	Fix typos (#36910 ) * fix typos * fix typos * fix typos * fix typos	2025-03-24 14:08:29 +00:00
Yih-Dar	340500b1a9	Use another repo. for Mistral3 processor testing (#36925 ) * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-03-24 14:36:05 +01:00
Mohamed Mekkouri	9e125d9a2e	Fix Compressed tensors to_dict_diff (#36922 ) fix	2025-03-24 13:06:33 +01:00
Raushan Turganbay	57f551c78d	[chameleon] fix num image token check (#36918 ) * [chameleon] fix num image token check * embed after merging image token * skip this also * mistral require_read_token	2025-03-24 12:36:08 +01:00
Dmitry Rogozhkin	a41e08aa19	tests: fix asyncio.wait() usage for python>=3.11 (#36898 ) tests: fix asyncio.wait() usage for python>=3.7 Passing coroutines directly to `asyncio.wait()` is deprecated since python 3.8 and removed starting from python 3.11. Instead, it's required to explicitly wrap coroutine in the task with `asyncio.create_task()` which first appeared in python 3.7. We step into this issue running the following Transformers tests on a system with python 3.11 or later (for example, Ubuntu 24.04 has python 3.12): * `tests/trainer/test_trainer_distributed.py` * `tests/extended/test_trainer_ext.py` The error will be: ``` src/transformers/testing_utils.py:2380: in execute_subprocess_async result = loop.run_until_complete( /usr/lib/python3.12/asyncio/base_events.py:687: in run_until_complete return future.result() src/transformers/testing_utils.py:2368: in _stream_subprocess await asyncio.wait( ... E TypeError: Passing coroutines is forbidden, use tasks explicitly. ``` See: https://docs.python.org/3.10/library/asyncio-task.html#asyncio.wait See: https://docs.python.org/3.7/library/asyncio-task.html#asyncio.create_task Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-03-24 11:53:59 +01:00
XinyuanTong	e28be7a692	[Fix] Add `original_max_position_embeddings` to YARN rope_scaling optional keys (#36877 ) [fix] Update optional keys in _validate_yarn_parameters to include original_max_position_embeddings	2025-03-24 11:05:19 +01:00
Raushan Turganbay	48da44be24	Fix torch version guard at import (#36907 ) fix	2025-03-24 10:33:33 +01:00
AbdelKarim ELJANDOUBI	fe4ca2f4a7	fix Gemma3 Config (#36893 ) * fix Gemma3 Config * fix config in modular gemm3	2025-03-24 10:05:44 +01:00
Aritra Roy Gosthipaty	c9d1e5238a	Update installation.md (#36826 ) * Update installation.md * Update README.md	2025-03-21 16:32:02 -07:00
Steven Liu	d253de6d58	[docs] Model docs (#36469 ) * initial * fix * fix * update * fix * fixes * quantization * attention mask visualizer * multimodal * small changes * fix code samples	2025-03-21 15:35:22 -07:00
Yoni Gozlan	beb9b5b022	Fix Pan and Scan on batched images Gemma3 (#36864 ) * process flattened images in fast image proc * process flattened images in low proc and add tests * remove print * add unbalanced batch test pas image proc * fix integration tests	2025-03-21 13:56:00 -04:00
Cyril Vallez	dd3933dd65	Simplify keep_in_fp32_modules logic (#36722 ) * better regex everywhere * fix * Update test_modeling_instructblip.py * BC with explanations this time otherwise it makes no sense at all * Update test_modeling_instructblip.py * style * CIs * update _keep_in_fp32_modules in blip2 * Update modeling_utils.py * Update modeling_utils.py * style * CIs * add check * trigger CIs * Update modeling_utils.py * trigger CIs	2025-03-21 16:12:59 +01:00
Sukriti Sharma	90e2df5d55	fix: loss computation after embeddings resize - mllama (#36840 ) * move loss to generation class Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * code cleanup Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * test for resize and loss computation Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * fix tests Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * fix:test for resize and loss Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * fix resize embedding mllama test Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * review changes Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> --------- Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>	2025-03-21 14:47:59 +01:00
Arthur Zucker	4542b8fb27	push v4.51.0.dev0	2025-03-21 13:45:25 +01:00
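
Note on a41e08aa19 (#36898): the commit message describes wrapping coroutines in tasks before calling `asyncio.wait()`. The pattern can be sketched roughly as follows. This is a minimal illustration, not the code from `testing_utils.py`; `read_stream` and `main` are hypothetical names introduced here, and the sleeps stand in for reading subprocess output.

```python
import asyncio


async def read_stream(name: str, delay: float) -> str:
    """Hypothetical stand-in for draining one subprocess stream."""
    await asyncio.sleep(delay)
    return f"{name} done"


async def main() -> list[str]:
    # Wrap each coroutine in a Task before handing it to asyncio.wait().
    # Passing bare coroutines raises TypeError on Python 3.11 and later.
    tasks = [
        asyncio.create_task(read_stream("stdout", 0.01)),
        asyncio.create_task(read_stream("stderr", 0.02)),
    ]
    done, pending = await asyncio.wait(tasks)
    assert not pending  # no timeout was given, so every task has finished
    return sorted(task.result() for task in done)


results = asyncio.run(main())
print(results)
```

`asyncio.create_task()` needs a running event loop, which is why the tasks are created inside `main()` rather than at module level.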

19383 Commits