transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Afanti	19b9d8ae13	chore: fix typos in tests directory (#36785 ) * chore: fix typos in tests directory * chore: fix typos in tests directory * chore: fix typos in tests directory * chore: fix typos in tests directory * chore: fix typos in tests directory * chore: fix typos in tests directory * chore: fix typos in tests directory	2025-03-18 10:31:13 +01:00
Afanti	7f5077e536	fix typos in the tests directory (#36717 )	2025-03-17 17:45:57 +00:00
Daniel Kleine	cbfb8d7b27	doc: Clarify `is_decoder` usage in PretrainedConfig documentation (#36724 ) * fix: clarify decoder usage in PretrainedConfig documentation * Apply suggestions from code review updated doc Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-03-17 09:40:25 -07:00
Steven Liu	ac1a1b66b9	[docs] Update README (#36265 ) * update * feedback * feedback * update versions	2025-03-17 09:37:19 -07:00
Joao Gante	cff4caa0c1	[CI] remove redundant checks in `test_eager_matches_sdpa_inference` (#36740 )	2025-03-17 16:29:18 +00:00
Christopher Akiki	e3af4fec91	[MINOR:TYPO] Update hubert.md (#36733 ) * [MINOR:TYPO] Update hubert.md - typo fix (wave2vec instead of hubert) - make code snippet copiable and runnable * Run tests	2025-03-17 09:07:51 -07:00
Petr Kuderov	c8a2b25f91	Fix `TrainingArguments.torch_empty_cache_steps` post_init check (#36734 ) Mistaken use of De Morgan's law. Fixed "not (X or Y)" to correct "not (X and Y)" check to raise a ValueError. Added corresponding test to check "positive int or None" condition. Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-03-17 16:09:46 +01:00
Sambhav Dixit	8e67230860	Fix test isolation for clear_import_cache utility (#36345 ) * test fixup * test fixup * fixing tests for unused imports * style fixes * fix * style fixes * styke fix * remove isolated module cache * rm custom subprocess defination * run using exsiting fn * style fixup * make fixup * remove redundant comments * rm redundat skipif + style changes	2025-03-17 16:09:09 +01:00
jiqing-feng	27361bd218	fix xpu tests (#36656 ) * fix awq xpu tests Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * update Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix llava next video bnb tests Signed-off-by: jiqing-feng <jiqing.feng@intel.com> --------- Signed-off-by: jiqing-feng <jiqing.feng@intel.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-03-17 15:57:49 +01:00
Fredrik Norén	da7d64f4ff	Allow ray datasets to be used with trainer (#36699 ) Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-03-17 15:44:47 +01:00
jiqing-feng	2256875a77	fix can_generate (#36570 ) * fix can_generate Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix can generate for speecht5 and blip Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix speecht5 tests Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix Signed-off-by: jiqing-feng <jiqing.feng@intel.com> --------- Signed-off-by: jiqing-feng <jiqing.feng@intel.com> Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>	2025-03-17 14:56:18 +01:00
Marc Sun	9e94801146	enable/disable compile for quants methods (#36519 ) * disable compile for most quants methods * fix * Update src/transformers/generation/configuration_utils.py Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com> * Update tests/quantization/bnb/test_mixed_int8.py Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com> * Update src/transformers/generation/configuration_utils.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * changes from joao suggestions --------- Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2025-03-17 11:38:21 +01:00
Armaghan Shakir	c53d53da89	🚨🚨🚨 Fix sdpa in SAM and refactor relative position embeddings (#36422 ) * fall back to eager if output_attentions * improve relative position embeddings * run modular on got_ocr2 * run-slow: sam * fix run-length encoding * fix tf processor errors * update tf_sam * fix compile error * re-run tests	2025-03-17 09:39:52 +00:00
Joao Gante	fc8764c9a6	[Generation, Gemma 3] When passing a custom `generation_config`, overwrite default values with the model's base `generation_config` (#36684 )	2025-03-15 12:40:09 +00:00
Guillaume LEGENDRE	f263e88dcf	Update self-push-caller.yml	2025-03-15 11:32:04 +01:00
Ilyas Moutawwakil	6f3e0b68e0	Fix grad accum arbitrary value (#36691 )	2025-03-14 22:03:01 +01:00
Cyril Vallez	2c2495cc7b	Fix post_init() code duplication (#36727 ) * Update modeling_utils.py * CIs	2025-03-14 17:36:02 +01:00
MaCAT	25992b493c	🌐 [i18n-KO] Translated codegen.md to Korean (#36698 ) * Initial translation * Add _toctree.yml	2025-03-14 09:31:18 -07:00
Joao Gante	42ebb6c23e	[tests] Parameterized `test_eager_matches_sdpa_inference` (#36650 )	2025-03-14 14:41:27 +00:00
Matt	9215cc62d4	Try working around the processor registration bugs (#36184 ) * Try working around the processor registration bugs * oops * Update error message * Clarify error * Docstring docstring docstring * The extra content is indexed by config class, so let's grab some values out of there * Commit my confusion as a TODO * Resolve my confusion * Cleanup and mostly revert to the original * Better autoclass fallback * Don't nest f-strings you lunatic * Clearer error message * Less getattr() * Revert a lot of changes to try a different approach! * Try the global registry * Check the dynamic list as well as the transformers root * Move the dynamic list somewhere safer * Move the dynamic list somewhere even safer * More import cleanup * Simplify all the register_for_auto_class methods * Set _auto_class in the register() methods * Stop setting the cls attribute in register() * Restore specifying the model class for Model derivatives only * Fix accidentally taking the .__class__ of a class * Revert register_for_auto_class changes * Fix get_possibly_dynamic_module * No more ALL_CUSTOM_CLASSES * Fix up get_possibly_dynamic_module as well * Revert unnecessary formatting changes * Trigger tests	2025-03-14 13:56:21 +00:00
Sean (Seok-Won) Yi	691d1b52c3	Fix/best model checkpoint fix (#35885 ) * Set best_model_checkpoint only when ckpt exists. Rather than set it explicitly without checking if the checkpoint directory even exists as before, now we moved the setting logic inside of _save_checkpoint and are only setting it if it exists. * Added best_global_step to TrainerState. * Added tests for best_model_checkpoint. * Fixed hard-coded values in test to prevent fail. * Added helper func and removed hard-coded best_step. * Added side effect patch generator for _eval. * Added evaluate side effect func. * Removed erroneous patching. * Fixed minor bug. * Applied Ruff. * Fixed Ruff problem in make style. * Used Trainer.set_initial_training_values.	2025-03-14 14:24:53 +01:00
Joao Gante	3bd1a0ddf1	[model loading] don't `gc.collect()` if only 1 shard is used (#36721 ) * don't gc collect if 1 shard is used * delete state dict anyways	2025-03-14 12:56:56 +00:00
Matt	8cb522b419	Cleanup the regex used for doc preprocessing (#36648 ) * Cleanup the regex used for doc preprocessing * Run tests	2025-03-14 12:18:49 +00:00
Matt	72861e11eb	Make the flaky list a little more general (#36704 ) * Make the flaky list a little more general * Trigger tests * Make the flaky list a little more general	2025-03-14 12:15:32 +00:00
Kingsley	53742b11f5	Gemma3 processor typo (#36710 ) * fix typo when is on * tiny * add test and remove 'text_crops' * lint	2025-03-14 13:07:55 +01:00
Yoni Gozlan	69bc848480	Add support for fast image processors in add-new-model-like CLI (#36313 ) * add support for fast image processors in add-new-model-like * fix header not found add-fast-image-processor-cli * Encourage adding fast image processor * nit * start improve doc * update docs * make requested modifs	2025-03-13 14:16:37 -04:00
Matt	48ef468c74	Final CI cleanup (#36703 ) * make fixup * make fixup * Correct skip decorator * Add TODOs * add is_flaky() parentheses	2025-03-13 17:26:09 +00:00
Isotr0py	b070025aa6	Add GGUF support to T5-Encoder (#36700 ) * add gguf support to t5encoder Signed-off-by: Isotr0py <2037008807@qq.com> * fix Signed-off-by: Isotr0py <2037008807@qq.com> * remove gguf from model_kwargs Signed-off-by: Isotr0py <2037008807@qq.com> --------- Signed-off-by: Isotr0py <2037008807@qq.com>	2025-03-13 17:57:33 +01:00
Mohamed Mekkouri	4a60bae8e2	Handling an exception related to HQQ quantization in modeling (#36702 ) * adding exception * style * add types	2025-03-13 17:53:36 +01:00
Mehant Kammakomati	09a309d273	fix: fsdp sharded state dict wont work for save_only_model knob (#36627 ) Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-03-13 17:17:35 +01:00
Cyril Vallez	2a004f9ff1	Add loading speed test (#36671 ) * Update test_modeling_utils.py * Update test_modeling_utils.py * Update test_modeling_utils.py * Update test_modeling_utils.py * Update test_modeling_utils.py * Update test_modeling_utils.py * trigger CIs * Update test_modeling_utils.py * Update test_modeling_utils.py * Update test_modeling_utils.py * better error messages * Update test_modeling_utils.py * Update test_modeling_utils.py	2025-03-13 17:07:30 +01:00
Joao Gante	a3201cea14	[CI] Automatic rerun of certain test failures (#36694 )	2025-03-13 15:40:23 +00:00
Afanti	d84569387f	chore: fix typos in utils module (#36668 ) * chore: fix typos in utils module * chore: fix typos in utils module * chore: fix typos in utils module * chore: fix typos in utils module * chore: fix typos in utils module * chore: fix typos in utils module	2025-03-13 15:12:44 +00:00
Cyril Vallez	32c95bd847	Fix dtype for params without tp_plan (#36681 ) * Update tensor_parallel.py * CIs	2025-03-13 15:28:14 +01:00
wineandchord	bb965d8e87	fix type annotation for ALL_ATTENTION_FUNCTIONS (#36690 ) Corrects the type annotation to match actual usage. The variable was typed as Dict[str, Dict[str, Callable]] but is actually used as Dict[str, Callable] where keys are attention mechanism names and values are the corresponding attention functions directly. This change makes the type annotation consistent with how the dictionary is used in the codebase.	2025-03-13 14:27:50 +00:00
Yoni Gozlan	1c287aecfc	Change Qwen2_VL image processors to have init and call accept the same kwargs (#36207 ) Change qwen2VL image processors to have init and call accept the same kwargs	2025-03-13 10:15:17 -04:00
Mohamed Mekkouri	65b8e38aac	Upgrading torch version and cuda version in quantization docker (#36264 ) * update * small update * no spqr quant * testing * testing * test nightly * gptqmodel * flute * fix hadamard * running tests * new docker * fix docker * run tests * testing new docker * new docker * run tests * new docker * run tests * final test * update * update * run tests * new docker * launch tests * test_docker * running tests * add comments * fixing yml * revert	2025-03-13 12:39:16 +01:00
bd793fcb	87b30c3589	fix wandb hp search unable to resume from sweep_id (#35883 ) * fix wandb hp search unable to resume from sweep_id * format styles --------- Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-03-13 12:32:26 +01:00
Mohamed Mekkouri	47cc4da351	Changing the test model in Quanto kv cache (#36670 ) changing model	2025-03-13 12:23:34 +01:00
Marc Sun	bc3d5781e7	Fix slicing for 0-dim param (#36580 ) * fix * switch to ellipsis instead * Add co-author Co-authored-by: fxmarty-amd <fxmarty-amd@users.noreply.github.com> * Add co-author second try Co-authored-by: fxmarty-amd <felmarty@amd.com>	2025-03-13 12:16:13 +01:00
Marc Sun	fbb18ce68b	Update config.torch_dtype correctly (#36679 ) * fix * style * new test	2025-03-13 12:08:02 +01:00
Joao Gante	c4161238bd	[Cache] Don't initialize the cache on `meta` device (#36543 )	2025-03-13 10:13:29 +00:00
Yoni Gozlan	79254c9b61	Fix rescale normalize inconsistencies in fast image processors (#36388 ) * fix fused rescale normalize inconsistencies * fix siglip2 fast image processor * refactor kwargs validation and fused nirmalize rescale * cleanup kwargs handling in preprocess * update new procs after refactor	2025-03-12 23:18:34 -04:00
Yoni Gozlan	48292a9848	Refactor siglip2 fast image processor (#36406 ) * refactor siglip2 fast image processor, add unused_kwargs in base fast image processor * nits * change unused_kwargs default to None * update siglip2 fast image proc	2025-03-12 20:28:27 -04:00
Yoni Gozlan	ea219ed164	Remove differences between init and preprocess kwargs for fast image processors (#36186 ) * Remove differences between init and preprocess kwargs in fast image processors * make modifs got_ocr2 * update gemma3	2025-03-12 19:44:05 -04:00
Marc Sun	cc3a361b46	[quants] refactor logic for modules_to_not_convert (#36672 )	2025-03-12 23:43:30 +01:00
Yoni Gozlan	bc3253f076	Remove hardcoded slow image processor class in processors supporting fast ones (#36266 ) * Add fast image processor class to processors supporting them * fix test kosmos2	2025-03-12 18:39:25 -04:00
Mohamed Mekkouri	0013ba61e5	Fix Failing GPTQ tests (#36666 ) fix tests	2025-03-12 20:03:02 +01:00
Matt	c7eb95581a	Don't accidentally mutate the base_model_tp_plan (#36677 ) * Don't accidentally mutate the base_model_tp_plan * Co-authored by: Joao Gante <joaofranciscocardosogante@gmail.com> * Trigger tests * Marking grad accum test as slow * Add a flaky decorator * Add a flaky decorator * Use cyril's codeblock * Don't copy() when it's None * Use cyril's new codeblock * make fixup	2025-03-12 18:59:13 +00:00
Cyril Vallez	071a161d3e	[core] Large/full refactor of `from_pretrained` (#36033 ) * squash everything together start to simplify inner logic Update modeling_utils.py Update modeling_utils.py Update modeling_utils.py Update modeling_utils.py continue refactor fix small fixes add type hints/docstring Update modeling_utils.py remove _fast_init keep improving Update modeling_utils.py Update modeling_utils.py new first tp loading version style fix weird in-place op trigger CIs Update modeling_utils.py much clearer renaming of keys fix update Update test_modeling_common.py trigger CIs update update style Update modeling_utils.py Update modeling_utils.py Update modeling_utils.py fix fast download first prototype remove old function remove old functions Remove unused function and move back _get_tp_registry fix tp plan registry simplify CIs Update hub.py Update modeling_utils.py simplify simplify renaming logic remove unused check add sanity check back (a test depends on it) Update modeling_utils.py finalize sound renaming logic style add forgotten check Update modeling_utils.py add key_mapping keyword style Update modeling_utils.py add comment minor updates minor change for clarity fix small prefix issue and simplify style trigger CIs typo fix Post rebase fix post rebase cleanup simplify tp typo oupsi typo correctly escape improvements based on Marc's review finalize Marc's review comments squash everything * improve * Update modeling_utils.py * Update modeling_utils.py * fix * Update modeling_utils.py * Update modeling_utils.py * style * Update modeling_utils.py * simplify * style * Update modeling_utils.py * Update modeling_utils.py * Update modeling_utils.py * Update modeling_utils.py * Update modeling_utils.py * Update modeling_utils.py * fix dtype issue * Update modeling_utils.py * style * remove test that does not make sense * style * small fixes * style * fix * cleanup after rebase * style * typo * escape * tp for task specific top modules * Update modeling_utils.py * Update modeling_utils.py * fix allocation * CIs * CIs * CIs * improve docstring * CIs * Update modeling_utils.py * fix	2025-03-12 13:39:25 +01:00
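
Note on c8a2b25f91 (#36734): the commit message describes a De Morgan's law mistake, where a "positive int or None" check was written as "not (X or Y)" instead of "not (X and Y)". The snippet below is a minimal sketch of that kind of bug, not the code from the commit. The function names `buggy_check` and `fixed_check` and the error text are hypothetical; X is assumed to be "value is an int" and Y "value is greater than zero".

```python
from typing import Optional


def buggy_check(value: Optional[int]) -> Optional[int]:
    """Hypothetical buggy validator: raises only when BOTH conditions fail."""
    if value is not None:
        # not (X or Y) == (not X) and (not Y): a negative int passes because X is True.
        if not (isinstance(value, int) or value > 0):
            raise ValueError(f"expected a positive int or None, got {value!r}")
    return value


def fixed_check(value: Optional[int]) -> Optional[int]:
    """Hypothetical fixed validator: raises when EITHER condition fails."""
    if value is not None:
        # not (X and Y) == (not X) or (not Y): any non-int or non-positive value is rejected.
        if not (isinstance(value, int) and value > 0):
            raise ValueError(f"expected a positive int or None, got {value!r}")
    return value


if __name__ == "__main__":
    for candidate in (None, 5, -1, 0, 0.5):
        try:
            fixed_check(candidate)
            print(f"{candidate!r}: accepted")
        except ValueError as err:
            print(f"{candidate!r}: rejected ({err})")
```

With the `and` form, the `isinstance` test also short-circuits before the comparison, so a non-numeric value is rejected with a ValueError rather than failing inside `value > 0`.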


18262 Commits