transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Matt	bf46e44878	🚨 🚨 Allow saving and loading multiple "raw" chat template files (#36588 ) * Add saving in the new format (but no loading yet!) * Add saving in the new format (but no loading yet!) * A new approach to template files! * make fixup * make fixup, set correct dir * Some progress but need to rework for cached_file * Rework loading handling again * Small fixes * Looks like it's working now! * make fixup * Working! * make fixup * make fixup * Add TODO so I don't miss it * Cleaner control flow with one less indent * Copy the new logic to processing_utils as well * Proper support for dicts of templates * make fixup * define the file/dir names in a single place * Update the processor chat template reload test as well * Add processor loading of multiple templates * Flatten correctly to match tokenizers * Better support when files are empty sometimes * Stop creating those empty templates * Revert changes now we don't have empty templates * Revert changes now we don't have empty templates * Don't support separate template files on the legacy path * Rework/simplify loading code * Make sure it's always a chat_template key in chat_template.json * Update processor handling of multiple templates * Add a full save-loading test to the tokenizer tests as well * Correct un-flattening * New test was incorrect * Correct error/offline handling * Better exception handling * More error handling cleanup * Add skips for test failing on main * Reorder to fix errors * make fixup * clarify legacy processor file docs and location * Update src/transformers/processing_utils.py Co-authored-by: Lucain <lucainp@gmail.com> * Update src/transformers/processing_utils.py Co-authored-by: Lucain <lucainp@gmail.com> * Update src/transformers/processing_utils.py Co-authored-by: Lucain <lucainp@gmail.com> * Update src/transformers/processing_utils.py Co-authored-by: Lucain <lucainp@gmail.com> * Rename to _jinja and _legacy * Stop saving multiple templates in the legacy format * Cleanup the processing code * Cleanup the processing code more * make fixup * make fixup * correct reformatting * Use correct dir name * Fix import location * Use save_jinja_files instead of save_raw_chat_template_files * Correct the test for saving multiple processor templates * Fix type hint * Update src/transformers/utils/hub.py Co-authored-by: Julien Chaumond <julien@huggingface.co> * Patch llava_onevision test * Update src/transformers/processing_utils.py Co-authored-by: Julien Chaumond <julien@huggingface.co> * Update src/transformers/tokenization_utils_base.py Co-authored-by: Julien Chaumond <julien@huggingface.co> * Refactor chat template saving out into a separate function * Update tests for the new default * Don't do chat template saving logic when chat template isn't there * Ensure save_jinja_files is propagated to tokenizer correctly * Trigger tests * Update more tests to new default * Trigger tests --------- Co-authored-by: Lucain <lucainp@gmail.com> Co-authored-by: Julien Chaumond <julien@huggingface.co>	2025-04-11 16:37:23 +01:00
Mohamed Mekkouri	897874748b	Disable kernels for quantization (#37446 ) fix	2025-04-11 16:35:38 +02:00
Wing Lian	6a75528cbc	prevent creating a view/leaf param for low rank optimizers w FSDP (#37379 ) prevent creating a view/leaf param for low rank optimizers:	2025-04-11 14:36:29 +02:00
Bowen Bao	6cef03ba66	[Regression] Fix Quark quantized model loading after refactorization (#37407 )	2025-04-11 13:43:36 +02:00
Raushan Turganbay	a563999a02	[processor] clean up mulitmodal tests (#37362 ) * clkea up mulitmodal processor tests * fixup * fix tests * fix one last test * forgot	2025-04-11 13:32:19 +02:00
Mohamed Mekkouri	3c39c07939	Remove triton mlp kernel, not compiling for some models (#37449 ) * remove mlp for now * disable on docker	2025-04-11 12:47:13 +02:00
Lysandre Debut	f797e3d98a	Fix the test fetcher (#37452 ) Test fetcher	2025-04-11 12:19:27 +02:00
Arthur	442d356aa5	Add moe kernels (#37376 ) * the fix that did not get in * add kernels * full graph does not work * simpler is better * Update src/transformers/integrations/hub_kernels.py Co-authored-by: Daniël de Kok <me@danieldk.eu> * Update src/transformers/integrations/fbgemm_fp8.py Co-authored-by: Daniël de Kok <me@danieldk.eu> * Update src/transformers/integrations/hub_kernels.py Co-authored-by: Daniël de Kok <me@danieldk.eu> * fixup --------- Co-authored-by: Daniël de Kok <me@danieldk.eu>	2025-04-11 11:56:22 +02:00
Arthur	7e9b57ce62	Update-kernel-pin (#37448 ) * update `kernels` * oups * new pinned version	2025-04-11 11:19:21 +02:00
Lysandre Debut	54a123f068	Simplify soft dependencies and update the dummy-creation process (#36827 ) * Reverse dependency map shouldn't be created when test_all is set * [test_all] Remove dummies * Modular fixes * Update utils/check_repo.py Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> * [test_all] Better docs * [test_all] Update src/transformers/commands/chat.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * [test_all] Remove deprecated AdaptiveEmbeddings from the tests * [test_all] Doc builder * [test_all] is_dummy * [test_all] Import utils * [test_all] Doc building should not require all deps --------- Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2025-04-11 11:08:36 +02:00
Donggeun Yu	931126b929	Fixes: Corrects file path for CUDA kernels (#37438 ) Corrects the file path used to locate the CUDA kernels for the Deformable Attention module. This ensures that the kernels are loaded correctly, resolving potential errors during module initialization and usage.	2025-04-11 09:41:46 +01:00
Yao Matrix	c7064cdba1	enhance require_deterministic_for_xpu (#37437 ) * enhance require_deterministic_for_xpu Signed-off-by: YAO Matrix <matrix.yao@intel.com> * fix style Signed-off-by: YAO Matrix <matrix.yao@intel.com> * fix style Signed-off-by: YAO Matrix <matrix.yao@intel.com> --------- Signed-off-by: YAO Matrix <matrix.yao@intel.com>	2025-04-11 08:06:08 +02:00
cyyever	371c44d0ef	Remove old code for PyTorch, Accelerator and tokenizers (#37234 ) * Remove unneeded library version checks Signed-off-by: cyy <cyyever@outlook.com> * Remove PyTorch condition Signed-off-by: cyy <cyyever@outlook.com> * Remove PyTorch condition Signed-off-by: cyy <cyyever@outlook.com> * Fix ROCm get_device_capability Signed-off-by: cyy <cyyever@outlook.com> * Revert "Fix ROCm get_device_capability" This reverts commit `0e756434bd`. * Remove unnecessary check Signed-off-by: cyy <cyyever@outlook.com> * Revert changes Signed-off-by: cyy <cyyever@outlook.com> --------- Signed-off-by: cyy <cyyever@outlook.com>	2025-04-10 20:54:21 +02:00
duanjunwen	7ff896c0f2	[Feat] Support npu in modeling models (#37369 )	2025-04-10 19:00:58 +02:00
Mohamed Mekkouri	10907e2846	Adding to self_comment_ci.yml (#37426 ) add myself	2025-04-10 17:46:56 +02:00
Mehant Kammakomati	7d76876498	(Part 2) feat: allow for tp_size attr for tplizing the model (#37054 ) * feat: custom tp_size, new transformers tp interface Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com> * fix: review cmt - error when tp_plan not set for tp_size Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com> * fix: nit in docs Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com> --------- Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: Matej Sirovatka <54212263+S1ro1@users.noreply.github.com>	2025-04-10 17:44:09 +02:00
Terrasse	dac443414e	fix: use mtime by default in Trainer._rotate_checkpoints with automatic fallback (#37260 ) Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-04-10 17:42:06 +02:00
Isotr0py	6daec12d0b	Add GGUF support to Gemma3 Text backbone (#37424 ) * add gemma3 gguf support Signed-off-by: Isotr0py <2037008807@qq.com> * fix typo and add gguf limit Signed-off-by: Isotr0py <2037008807@qq.com> * fix a typo Signed-off-by: Isotr0py <2037008807@qq.com> * add vision conversion test Signed-off-by: Isotr0py <2037008807@qq.com> * fix typos Signed-off-by: Isotr0py <2037008807@qq.com> --------- Signed-off-by: Isotr0py <2037008807@qq.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-04-10 17:15:43 +02:00
Mohamed Mekkouri	0ea1151222	Llama Kernel integration (#37092 ) * initial commit * style * update * change approach attention * clean up * fix import * update * update * fix style * change method * attention * add mlp back * change name * update name * fix copies * fix config * fix	2025-04-10 17:13:25 +02:00
Mohamed Mekkouri	9c0c323e12	Fix require_read_token (#37422 ) * nit * fix * fix	2025-04-10 17:01:40 +02:00
Mario Michael Krell	bde41d69b4	Correctly drop tokens in SwitchTransformer (#37123 ) Previously, the identity function was used for dropped tokens with a weight from the expert that was not applied to the hidden states. This was misleading, because dropping means, the expert weight is zero. Instead of trying to fix the weight, we take an easier approach by initializing with zeros. Fixes issue https://github.com/huggingface/transformers/issues/37017	2025-04-10 16:58:57 +02:00
AbdelKarim ELJANDOUBI	7ecc5b88c0	Add image classifier donut & update loss calculation for all swins (#37224 ) * add classifier head to donut * add to transformers __init__ * add to auto model * fix typo * add loss for image classification * add checkpoint * remove no needed import * reoder import * format * consistency * add test of classifier * add doc * try ignore * update loss for all swin models	2025-04-10 15:00:42 +02:00
Mohamed Mekkouri	5ae9b2cac0	Quark Quantization gated repo (#37412 ) * fix * empty commit * empty * nit * fix maybe ?	2025-04-10 14:57:15 +02:00
Yih-Dar	d9e76656ae	Fix new failure reports not including anything other than `tests/models/` (#37415 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-04-10 14:47:23 +02:00
Raushan Turganbay	1ae8d54b04	[chat-template] Unify tests and clean up 🧼 (#37275 ) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now	2025-04-10 14:42:32 +02:00
Arthur	10144ff116	use `rms_norm_eps` for the L2Norm for Llama4 (#37418 ) use `rms_norm_eps`	2025-04-10 13:33:50 +02:00
ivarflakstad	aa478567f8	Allow rocm systems to run these tests (#37278 ) * Allow rocm systems to run these tests * Fix skipTest logic * Use get_device_properties to check system capabilities	2025-04-10 13:33:01 +02:00
Wang, Yi	ae5ce22664	from_pretrained should handle xpu case (#37382 ) * from_pretrained should handle xpu case Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * fmt Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> --------- Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>	2025-04-10 13:23:17 +02:00
Yih-Dar	4f139f5a50	Send trainer/fsdp/deepspeed CI job reports to a single channel (#37411 ) * send trainer/fsdd/deepspeed channel * update * change name * no . * final --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-04-10 13:17:31 +02:00
Arthur	a2c2fb0108	update `kernels` to 0.4.3 (#37419 ) * update `kernels` * oups	2025-04-10 12:14:22 +02:00
Wing Lian	0ddad2d655	mark llama4 as not supported with fa2 (#37416 )	2025-04-10 11:48:46 +02:00
Cyril Vallez	fbb2054ed5	Offloaded hybrid cache for Llama4 (#37401 ) * first try (maybe race condition) * Update cache_utils.py * cannot avoid the race condition -> use 2 layers * Update cache_utils.py * Update cache_utils.py	2025-04-10 11:44:34 +02:00
Cyril Vallez	6d8b0b3378	Fix Llama4 offset (#37414 ) * add +1 * Update modeling_llama4.py	2025-04-10 11:40:58 +02:00
Mohamed Mekkouri	f5865d32a2	Restrict & Explain tp_plan for FBgemm (#37404 ) * explain tp_plan * add llama4 check * add clarification	2025-04-10 11:33:33 +02:00
Serge Panev	e39c732644	Handle torch ver in flexattn (#37400 ) * Handle torch ver in flexattn * update	2025-04-10 11:27:54 +02:00
Manuel de Prada Corral	bc0150bb04	Add warning when failed to acquire other user's lock at model download (#37395 )	2025-04-10 11:18:27 +02:00
Wing Lian	9cda4265d6	handle torch version edge cases (#37399 )	2025-04-09 21:49:57 +02:00
Arthur	e032d12e8a	the fix that did not get in (#37370 ) * debugging improvements * add debugging details * add more debugging details * debug more * the fix that did not get in * First fix flex * fix query offset * fix flex first * fix device mask creation for speed * small mask creation sdpa * Update flex_attention.py * remove chunked prefill from HybridChunkedCache * never seen such a fucked up merged * clean up layers + output * add summary json file * Efficient general cache * Update cache_utils.py * cleanup * fix? * fix! * oups typo * not everywhere * more fixes * revert unrelated changes * Fix but ugly for now -> should use pad instead * oups * re-initialize the cache * Use pad to simplify * style * correct slicing --------- Co-authored-by: Pablo <pablo.montalvo.leroux@gmail.com> Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>	2025-04-09 20:15:33 +02:00
Mohamed Mekkouri	f834ca2c19	Attention Quantization with FBGemm & TP (#37384 ) * fix * keep fused * contiguous * rm print * update * update * rm print	2025-04-09 18:45:42 +02:00
DerekLiu35	c5c648dd74	Fix some failing AWQ tests (#37383 ) * update AwqQuantizer * fix style * add an arg to get_modules_to_not_convert to add get_keys_to_not_convert(model)	2025-04-09 18:24:57 +02:00
Brayden Zhong	71b35387fd	Apply torchfix to replace deprecated functions: `_pytree._register_pytree_node` and `torch.cpu.amp.autocast` (#37372 ) fix: apply torchfix	2025-04-09 16:11:18 +01:00
Sangyun_LEE (이상윤)	ad340908e4	Fix warning message for PEFT models in text-generation pipeline #36783 (#36887 ) * add peft model in constant * add test * fix formating * make fixup execute * change code * check by self.task * add test * fixup test code * fix minor typo * fix pipeline test * apply maintainers reqests	2025-04-09 15:36:52 +01:00
DerekLiu35	2527f71a47	Add "selecting a quantization method" doc (#37159 ) * initial draft * make documentation simpler * Update docs/source/en/quantization/selecting.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/quantization/selecting.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/quantization/selecting.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/quantization/selecting.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/quantization/selecting.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/quantization/selecting.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/quantization/selecting.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/quantization/selecting.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/quantization/selecting.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/quantization/selecting.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * turn pros and cons into tables * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * add links to each quant method page * separate calibration vs no calibration methods * add calibration time estimates --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-04-09 15:51:37 +02:00
Marc Sun	7ae0be722e	update deepspeed docker (#37371 ) * update * create docker image * 03 * uninstall pytest as it conflits with transformers * wrong one * better * see which package depends on pytest * up * resintall * fix * deepspeedddddddd * deepspeedddddddd * deepspeedddddddd * deepspeedddddddd * deepspeedddddddd * deepspeedddddddd * deepspeedddddddd * deepspeedddddddd --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-04-09 14:54:06 +02:00
Arthur	e3eda6d188	Add glm4 (#37388 ) * add changed * Revert "add changed" This reverts commit `0a0166a1fe`. * update with NEW MODEL class called GLM4 * update * Update glm4.md * Name * style * fix copies * fixup test --------- Co-authored-by: Yuxuan Zhang <2448370773@qq.com>	2025-04-09 14:02:04 +02:00
Jonas M. Kübler	1e6ff5fd55	fix: llama4 conversion script no_rope_layers (#37359 ) fix conversion script no_rope_layers `no_rope_layers` should either be a list of NoPE layers or None, such that it is created in the config from the `no_rope_layer_interval` Co-authored-by: Pedro Cuenca <pedro@huggingface.co>	2025-04-09 13:02:15 +02:00
Raushan Turganbay	6f4058aee3	Update composition flag usage (#36263 ) * update composition flag usage * remove print * fix tests * actually fix * oh c'mon * now should be fixed right? * fix copies	2025-04-09 11:48:49 +02:00
Jerry Zhang	08e3217baf	Preserve requires_grad in pre quantized model (#37354 ) * Preserve requires_grad in pre quantized model Summary: discovered this when running lm-eval for some models, current code will set requires_grad to True always Test Plan: lm_eval --model hf --model_args pretrained=jerryzh168/phi4-torchao-gguf-q4_k --tasks hellaswag --device cuda:0 --batch_size 8 Reviewers: Subscribers: Tasks: Tags: * ruff format --------- Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>	2025-04-08 18:41:30 +02:00
Matt	4d0de5f73a	🚨 🚨 Setup -> setupclass conversion (#37282 ) * More limited setup -> setupclass conversion * make fixup * Trigger tests * Fixup UDOP * Missed a spot * tearDown -> tearDownClass where appropriate * Couple more class fixes * Fixups for UDOP and VisionTextDualEncoder * Ignore errors when removing the tmpdir, in case it already got cleaned up somewhere * CLIP fixes * More correct classmethods * Wav2Vec2Bert fixes * More methods become static * More class methods * More class methods * Revert changes for integration tests / modeling files * Use a different tempdir for tests that actually write to it * Remove addClassCleanup and just use teardownclass * Remove changes in modeling files * Cleanup get_processor_dict() for got_ocr2 * Fix regression on Wav2Vec2BERT test that was masked by this before * Rework tests that modify the tmpdir * make fix-copies * revert clvp modeling test changes * Fix CLIP processor test * make fix-copies	2025-04-08 17:15:37 +01:00
KimmiShi	c15a7adb28	fix(qwen): fix shape error when using tp (#36947 ) * fix(qwen): fix shape error when using tp * Update modeling_qwen2_vl.py --------- Co-authored-by: shidongxing <shidongxing@pjlab.org.cn> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2025-04-08 17:47:30 +02:00

18604 Commits