transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-08-02 11:11:05 +06:00

Author	SHA1	Message	Date
gebbissimo	1a0cd69435	feat: allow to use hf-hub models for timm backbone (#34729 ) Currently a backbone name like 'hf-hub:bioptimus/H-optimus-0' throws an error, even though it could work. Co-authored-by: Christian Gebbe <>	2024-11-19 10:26:35 +00:00
Guillem García Subies	d8a5d31d9c	Trainer hyperparameter search kwargs docs update (#34459 ) * doc: Trainer.hyperparameter_search docstring discrepancy solved * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-11-19 11:23:03 +01:00
Arthur	dadb286f06	protect tensor parallel usage (#34800 ) protect	2024-11-19 09:54:11 +01:00
Yih-Dar	eed11f34ab	Fix Whisper CI (#34617 ) * Revert "Revert "Fix Whisper CI" (#34605)" This reverts commit `74d3824cc0`. * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-11-18 21:37:50 +01:00
Aymeric Roucher	759a378ee5	Allow handling files as args for a tool created with Tool.from_space (#34687 ) * Allow handling files as args for a tool created with `Tool.from_space`	2024-11-18 20:15:35 +01:00
Ke Wen	20142ab542	Simplify Tensor Parallel implementation with PyTorch TP (#34184 ) * Simplify Tensor Parallel implementation with PyTorch TP * Move tp_plan to config * Lint * Format and warning * Disable copy-from check * Conditionally get attr from config * make fix-copies * Move base_model_tp_plan to PretrainedConfig * Move TP into from_pretrained * Add device context for load * Do not serialize * Move _tp_plan setting to post_init * Add has_tp_plan * Add test_tp * Add 'Multi-gpu inference' doc * Add backward support for device type identification * Auto-detect accelerator * supports_tp_plan * copyright year * Fix copy	2024-11-18 19:51:49 +01:00
ecyht2	7df93d6ffb	fix: Wrong task mentioned in docs (#34757 )	2024-11-18 18:42:28 +00:00
Hun-soo Jung	7693b62268	Fix callback key name (#34762 ) Fixes typo.	2024-11-18 18:41:12 +00:00
Eon Kim	1ef6c5f1c5	fix: Update pixel_values parameter in hf_model input (#34782 )	2024-11-18 18:40:01 +00:00
Fanli Lin	e80a65ba4f	[tests] add XPU part to testing (#34778 ) add XPU part to testing Signed-off-by: Lin, Fanli <fanli.lin@intel.com>	2024-11-18 09:59:11 -08:00
Fanli Lin	9568a9dfc5	[docs] add XPU besides CUDA, MPS etc. (#34777 ) add XPU	2024-11-18 09:58:50 -08:00
Fanli Lin	8568bf1bcf	[docs] make `empty_cache` device-agnostic (#34774 ) make device-agnostic	2024-11-18 09:58:26 -08:00
Wing Lian	36759f3312	make sure to disable gradients for integer tensor (#32943 )	2024-11-18 16:49:37 +01:00
Dmitry Rogozhkin	1c471fc307	Fix skip of test_training_gradient_checkpointing (#34723 ) `19d58d31f` has introduced a context manager to manage subtests of test_training_gradient_checkpointing. However, test body was not moved under "with" statement. Thus, while tests are correctly marked as skipped, test bodies were still executed. In some cases, as with llama this caused attribute errors. Fixes: #34722 Fixes: `19d58d31f` ("Add MLLama (#33703)") Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>	2024-11-18 15:45:40 +01:00
ZuoChen_BUPT	c772d4d91e	fix a typo bug where 'id2label' was incorrectly written as 'i2label' when reading config (#34637 ) fix a bug where 'id2label' was incorrectly written as 'i2label' when reading the config from pretrained config	2024-11-18 14:41:48 +01:00
Ofek Lev	eb0ab3ed4b	Fix broken link (#34618 )	2024-11-18 14:13:26 +01:00
Raushan Turganbay	1646ffb4d1	VLMs: `patch_size` -> `num_image_tokens` in processing (#33424 ) * use num additional tokens * fix copies + docs * another fix copies :) * add docs * move order for BC	2024-11-18 13:21:07 +01:00
Shane A	3ee24e2208	Add OLMo November 2024 (#34551 ) * Add model skeletion with transformers-cli add-new-model-like * Convert config to modular, add rms_norm_eps, delete clip_qkv * Convert model to modular, add RMSNorm * Add flash attention with qk norm and no qkv clipping * Add decoder layer with RMSNorm after attention/feedforward layers * Add base and causal model * Add converter improvements from OLMo repo * Update weight loading in OLMo to HF converter * Set correct default for rms_norm_eps * Set correct pipeline_model_mapping in test * Run make fixup * Fix model type * Re-run modular conversion * Manually set config docs to fix build errors * Convert olmo-1124 to olmo_1124 to fix flash attention docs errors * Start updating tests * Update tests * Copy upstream test_eager_matches_sdpa_inference_1_bfloat16 changes to olmo_1124 * Rename input_layernorm and post_attention_layernorm to reflect their ops better * Use correct tokenizer * Remove test unsupported by GPT2 tokenizer * Create GenerationConfig outside of from_pretrained call * Use simpler init file structure * Add explicit __all__ to support simplified init * Make safetensor serialization the default * Update OLMo November 2024 docs	2024-11-18 10:43:10 +01:00
Joao Gante	13493215ab	🧼 remove v4.44 deprecations (#34245 ) * remove v4.44 deprecations * PR comments * deprecations scheduled for v4.50 * hub version update * make fiuxp --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-11-15 23:07:24 +01:00
AbdelKarim ELJANDOUBI	8d50fda644	Remove FSDP wrapping from sub-models. (#34452 ) * Remove FSDP wrapping from sub-models. * solve conflict trainer.py * make fixup * add unit test for fsdp_auto_wrap_policy when using auto_find_batch_size * put back extract_model_from_parallel * use transformers unwrap_model	2024-11-15 23:00:03 +01:00
Wing Lian	b0c0ba7b4d	FSDP grad accum fix (#34645 ) * add gradient accumulation steps tests for fsdp * invert no_sync context to fix training for fsdp	2024-11-15 22:28:06 +01:00
jiqing-feng	52ea4aa589	add xpu path for awq (#34712 ) * add xpu path for awq * update readme	2024-11-15 15:45:24 +01:00
CezaPasc	7b3d615bc2	fix(wandb): pass fake dataset to avoid exception in trainer (see #34455 ) (#34720 )	2024-11-15 15:44:02 +01:00
Lysandre Debut	f5dbfab7f3	Update llava.md (#34749 ) LLava -> Llava	2024-11-15 15:39:57 +01:00
lewtun	8ba3e1505e	Retain newlines in chat template when `continue_final_message=True` (#34253 ) * Retain newlines in chat template when * Add try/except * Add regression test * Simplify test * Apply suggestions from code review Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> --------- Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>	2024-11-15 14:27:04 +00:00
Fanli Lin	a3d69a8994	[docs] add xpu device check (#34684 ) * add XPU path * use accelerate API * Update docs/source/en/tasks/semantic_segmentation.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * update more places with accelerate API --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-11-13 14:16:59 -08:00
Xiao Yuan	68f8186a89	Fix example in EsmConfig docstring (#34653 )	2024-11-13 13:55:58 -08:00
Pedro Cuenca	e7c36a9d57	[docs] Broken link in generation_strategies (#34717 ) [docs] Broken link	2024-11-13 13:44:42 -08:00
MaCAT	be8748a53c	🌐 [i18n-KO] Translated marian.md to Korean (#34698 ) * initial translation * removed english * Fixed Trivial Typos, updated _toctree.yml	2024-11-13 13:14:23 -08:00
Aymeric Roucher	33eef99250	Agents: Small fixes in streaming to gradio + add tests (#34549 ) * Better support transformers.agents in gradio: small fixes and additional tests	2024-11-11 20:52:09 +01:00
Ahmed Almaghz	6de2a4d1f1	[i18n-ar] Translated file : `docs/source/ar/torchscript.md` into Arabic (#33079 ) * Add docs/source/ar/torchscript.md to Add_docs_source_ar_torchscript.md * Update docs/source/ar/torchscript.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Merge troubleshooting.md with this Branch * Update _toctree.yml * Update torchscript.md * Update troubleshooting.md --------- Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>	2024-11-11 10:41:01 -08:00
Fanli Lin	25f510a9c6	[docs] update not-working model revision (#34682 ) update revision	2024-11-11 07:09:31 -08:00
Aymeric Roucher	3ea3ab62d8	Agents: turn any Space into a Tool with `Tool.from_space()` (#34561 ) * Agents: you can now load a Space as a tool	2024-11-10 12:22:40 +01:00
Louis Brulé Naudet	134ba90da9	Update llm_engine.py (#33332 ) * Update llm_engine.py - Added support for optional token and max_tokens parameters in the constructor. - Provided usage examples and detailed documentation for each method.	2024-11-10 12:19:20 +01:00
Ahmed Almaghz	768f3c016e	[i18n-ar] Translated file : `docs/source/ar/trainer.md` into Arabic (#33080 ) * Add docs/source/ar/trainer.md to Add_docs_source_ar_trainer.md * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update trainer.md * Create _toctree.yml * Delete docs/source/ar/_toctree.yml * Update _toctree.yml - add trainer * Update _toctree.yml * merge serialization.md into this branch * merge sagemaker.md into this PR * Update _toctree.yml * Update docs/source/ar/trainer.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-11-09 11:26:28 -08:00
MaCAT	a06a0d1263	🌐 [i18n-KO] Translated bert.md to Korean (#34627 ) * Translated bert.md, Need additional check * Translation 2nd ver, changed _toctree.yml * Fixed Typo * Update bert.md Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com> * Update bert.md Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com> * Update bert.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update bert.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-11-07 18:56:09 -08:00
Jiwook Han	1cf17077bf	🌐 [i18n-KO] Translated `timesformer.md` to Korean (#33972 ) * docs: ko: model_doc/timesformer.md * feat: nmt draft * fix: manual edits * fix_toctree * fix toctree on Video Models	2024-11-07 11:04:27 -08:00
Ivan Shcheklein	6938524a28	fix(dvclive): pass fake dataset to avoid exception in trainer init (#34455 ) fix(dvclive): pass fake dataset to avoid exception in trainer	2024-11-07 15:57:34 +01:00
Ahnjj_DEV	7bbc624743	🌐 [i18n-KO] Translated `convbert.md` to Korean (#34599 ) * docs: ko: convbert.md * Update _toctree.yml * feat: nmt draft	2024-11-05 09:32:17 -08:00
Isotr0py	e83aaaa86b	Fix `use_parallel_residual` and `qkv_bias` for StableLM GGUF config extraction (#34450 ) * fix stablelm qkv_bias * fix stablelm qkv_bias and use_parallel_residual * remove original_model.config for stablelm gguf test	2024-11-05 18:26:20 +01:00
Yoni Gozlan	9f28d0c5d0	Fix torchvision interpolation CI (#34539 ) fix-torch-interpolation-ci	2024-11-05 11:02:14 -05:00
Mohamed Mekkouri	d2bae7ee9d	Changing __repr__ in torchao to show quantized Linear (#34202 ) * Changing __repr__ in torchao * small update * make style * small update * add LinearActivationQuantizedTensor * remove some cases * update imports & handle return None * update	2024-11-05 16:11:02 +01:00
Yih-Dar	f2d5dfbab2	Remove `@slow` for `test_eager_matches_sdpa_inference` (#34558 ) * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-11-05 16:10:42 +01:00
Yoni Gottesman	082e57e0d4	Fix #34494 assistant tokens when truncated (#34531 ) * Fix assistant tokens when truncated * fix test * fix test * step	2024-11-05 15:10:15 +00:00
Yih-Dar	74d3824cc0	Revert "Fix Whisper CI" (#34605 ) Revert "Fix Whisper CI (#34541)" This reverts commit `eb811449a2`.	2024-11-05 15:12:47 +01:00
Eon Kim	45b0c7680c	Remove unused test_dataset (#34516 )	2024-11-05 14:01:25 +00:00
Guang Yang	663c851239	DistilBERT is ExecuTorch compatible (#34475 ) * DistillBERT is ExecuTorch compatible * [run_slow] distilbert * [run_slow] distilbert --------- Co-authored-by: Guang Yang <guangyang@fb.com>	2024-11-05 13:41:48 +01:00
Raushan Turganbay	893ad04fad	Load sub-configs from composite configs (#34410 ) * save/load sub-configs * nit forgot these * fix copies * move test to common * use dict for sub-configs * add load-save-laod test * clean up modeling check * oops this are correct keys * fix some tests, missed some composite configs * this model was missed	2024-11-05 11:34:01 +01:00
Benjamin Bossan	5e1fd4e204	FIX: Broken repr of TorchAoConfig (#34560 ) FIX Broken repr of TorchAoConfig The __repr__ method references a non-existent self.kwargs. This is now fixed. There does not appear to be a uniform way of defining __repr__ for quantization configs. I copied the method as implemented for HQQ: `e2ac16b28a/src/transformers/utils/quantization_config.py (L285-L287)`	2024-11-05 10:26:13 +01:00
AbdelKarim ELJANDOUBI	d0b1d8d888	Skip DeepSpeed ZeRO Stage 3 model initialization when bnb (#34395 ) * Skip DeepSpeed ZeRO Stage 3 model initialization when it is intended to be quantized. * Propagate the quantization state using a context manager * make fixup	2024-11-05 10:06:07 +01:00


17378 Commits