transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
wwwbai	3033509327	Translate attention.md into Chinese (#34716 ) * try * tryagain * tryagggain * translated * translated2 * Update docs/source/zh/attention.md Co-authored-by: Huazhong Ji <hzji210@gmail.com> --------- Co-authored-by: Huazhong Ji <hzji210@gmail.com>	2024-11-19 10:03:12 -08:00
Merve Noyan	befbbf2f98	Added image-text-to-text pipeline to task guide (#34783 ) * Added image-text-to-text pipeline to task guide * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Merge codeblocks --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-11-19 09:49:10 -08:00
Yih-Dar	469eddbe2d	Fix `check_training_gradient_checkpointing` (#34806 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-11-19 17:48:34 +01:00
Yih-Dar	05ebe8b9b0	Run `test_medium_seamless_m4t_pt` in `subprocess` to avoid many failures (#34812 ) * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-11-19 17:32:10 +01:00
Yoni Gozlan	eedc113914	Add Image Processor Fast Deformable DETR (#34353 ) * add deformable detr image processor fast * add fast processor to doc * fix copies * nit docstring * Add tests gpu/cpu and fix docstrings * fix docstring * import changes from detr * fix imports * rebase and fix * fix input data format change in detr and rtdetr fast	2024-11-19 11:18:58 -05:00
Yoni Gozlan	b99ca4d28b	Add support for OpenAI api "image_url" input in chat for image-text-to-text pipeline (#34562 ) * add support for openai api image_url input * change continue to elif * Explicitely add support for OpenAI/TGI chat format * rewrite content to transformers chat format and add tests * Add support for typing of image type in chat templates * add base64 to possible image types * refactor nesting	2024-11-19 11:08:37 -05:00
dependabot[bot]	15dd625a0f	Bump aiohttp from 3.10.2 to 3.10.11 in /examples/research_projects/decision_transformer (#34792 ) Bump aiohttp in /examples/research_projects/decision_transformer Bumps [aiohttp](https://github.com/aio-libs/aiohttp) from 3.10.2 to 3.10.11. - [Release notes](https://github.com/aio-libs/aiohttp/releases) - [Changelog](https://github.com/aio-libs/aiohttp/blob/master/CHANGES.rst) - [Commits](https://github.com/aio-libs/aiohttp/compare/v3.10.2...v3.10.11) --- updated-dependencies: - dependency-name: aiohttp dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-11-19 16:08:07 +00:00
Wang, Yi	dc42330388	fix crash in tiiuae/falcon-11B-vlm image-to-text generation (#34728 ) Signed-off-by: Wang, Yi <yi.a.wang@intel.com>	2024-11-19 16:51:32 +01:00
David Zhang	427b62ed1a	Fix post process function called in the instance segmentation example of mask2former (#34588 ) * Fix post process function called in the instance segmentation example of mask2former * fix description and additional notes for post_process_instance_segmentation of maskformers * remove white space in maskformers post_process_instance_segmentation doc * change image.size[::-1] to height and width for clarity in segmentation examples	2024-11-19 16:49:25 +01:00
jp	fdb9230485	Add do_convert_rgb to vit (#34523 ) * Add: do_convert_rgb * Add: doc string * Update src/transformers/models/vit/image_processing_vit.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update src/transformers/models/vit/image_processing_vit.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update src/transformers/models/vit/image_processing_vit.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Add: do_convert_rgb to fast * Add: convert_to_rgb --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2024-11-19 16:48:05 +01:00
Tibor Reiss	7b9e51c1a0	Feature: print tokens per second during training (#34507 ) * Log tokens per second during training * Nitpicks * Move logic into _maybe_log_save_evaluate * Use speed_metrics	2024-11-19 16:46:04 +01:00
Phillip Kuznetsov	5fa4f64605	🚨🚨🚨 fix(Mask2Former): torch export 🚨🚨🚨 (#34393 ) * fix(Mask2Former): torch export Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai> * revert level_start_index and create a level_start_index_list Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai> * Add a comment to explain the level_start_index_list Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai> * Address comment Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai> * add torch.export.export test Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai> * rename arg Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai> * remove spatial_shapes Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai> * Use the version check from pytorch_utils Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai> * [run_slow] mask2former Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai> --------- Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>	2024-11-19 16:44:53 +01:00
huismiling	581524389a	MLU devices : Checks if mlu is available via an cndev-based check which won't trigger the drivers and leave mlu (#34326 ) * add Cambricon MLUs support * fix mlu device rng state * up for quality check * up mlu to support fp16 * fix mlu device dependency error * fix mlu device dependency error * enable mlu device for bf16 * fix mlu device memory tracker * Cambricon support SDPA and flash_attn * MLU devices : Checks if `mlu` is available via an `cndev-based` check which won't trigger the drivers and leave mlu	2024-11-19 16:37:39 +01:00
Cyril Vallez	e3a5889ef0	Modular fix (#34802 ) * Modular fix * style * remove logger warning * Update modular_model_converter.py	2024-11-19 16:08:57 +01:00
Marc Sun	ce1d328e3b	Fix cache_utils for optimum.quanto kvcache quantization (#34750 ) * add co-author Co-authored-by: w3rew <w3rew@users.noreply.github.com> * fix docs * fix cache * remove print --------- Co-authored-by: w3rew <w3rew@users.noreply.github.com>	2024-11-19 14:16:34 +01:00
Arthur	4bff54f921	Gemma capping (#34282 ) * softcapping * soft cap before the mask * style * ... * super nit * update * fixes * update * small issue with modular * fix modular imports * update * fixup * simplify a hell lot * simplify cleaning imports * finish fixing * update our design * nits * use a deprecation cycle * updates * Fix modular (recursive deps need to always be computed after merges!) * push * fix * update * fix modular order * make fix-copies * updates * update * ? * don't compile for now * ? * fix some stuff * donc! * fix copies * update * fixup * ? * fix two tests * fix? * for now, don't use head info * eager when output attentoin and sdpa or flash as it's the simplest behaviour (for our tests as well :)) * fix-copies * revert sdpa check * Apply suggestions from code review Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co> * rebase, fix-copies and push * add a slow integration test * update the test * fix left padding issue * fix test * remove duplicate scaling * quality * add a small test and make sure it works * 2b --------- Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com> Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>	2024-11-19 13:52:38 +01:00
Arthur	54739a320e	Self-speculation (Layer-Skip Llama) (#34240 ) * 😅 * early exit (#34244) * mvp * docs and tests * a few fixes * no shared cache * Apply suggestions from code review Co-authored-by: Mostafa Elhoushi <m.elhoushi@ieee.org> * docs * make fix-copies * cohere fix * [test all] * [test all] consistent model code copies * [test all] make fix-copies :D * Apply suggestions from code review Co-authored-by: Pedro Cuenca <pedro@huggingface.co> Co-authored-by: Mostafa Elhoushi <m.elhoushi@ieee.org> * Update src/transformers/generation/candidate_generator.py * Update src/transformers/generation/configuration_utils.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * [test all] don't use a stand-alone attribute; fix test --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Joao Gante <joao@huggingface.co> Co-authored-by: Mostafa Elhoushi <m.elhoushi@ieee.org> Co-authored-by: Pedro Cuenca <pedro@huggingface.co>	2024-11-19 12:20:07 +00:00
jiqing-feng	5de58d5955	fix cpu bnb path (#34647 ) * fix cpu bnb path * Update src/transformers/generation/utils.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * fix awq quantizer env check * fix awq quantizer device check Signed-off-by: jiqing-feng <jiqing.feng@intel.com> --------- Signed-off-by: jiqing-feng <jiqing.feng@intel.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2024-11-19 12:44:44 +01:00
jp	3cd78be34e	Fix: siglip image processor rgb_convert is not being applied correctly. (#34301 ) Fix: do_convert_rgb	2024-11-19 12:40:36 +01:00
Jiahao Li	0db91c3c8d	Support gradient checkpointing in Qwen2VL ViT (#34724 ) * Support gradient checkpointing in Qwen2VL ViT * Enable gradient checkpoint tests for Qwen2VL * [run-slow] qwen2_vl	2024-11-19 12:30:44 +01:00
gebbissimo	1a0cd69435	feat: allow to use hf-hub models for timm backbone (#34729 ) Currently a backbone name like 'hf-hub:bioptimus/H-optimus-0' throws an error, even though it could work. Co-authored-by: Christian Gebbe <>	2024-11-19 10:26:35 +00:00
Guillem García Subies	d8a5d31d9c	Trainer hyperparameter search kwargs docs update (#34459 ) * doc: Trainer.hyperparameter_search docstring discrepancy solved * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-11-19 11:23:03 +01:00
Arthur	dadb286f06	protect tensor parallel usage (#34800 ) protect	2024-11-19 09:54:11 +01:00
Yih-Dar	eed11f34ab	Fix Whisper CI (#34617 ) * Revert "Revert "Fix Whisper CI" (#34605)" This reverts commit `74d3824cc0`. * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-11-18 21:37:50 +01:00
Aymeric Roucher	759a378ee5	Allow handling files as args for a tool created with Tool.from_space (#34687 ) * Allow handling files as args for a tool created with `Tool.from_space`	2024-11-18 20:15:35 +01:00
Ke Wen	20142ab542	Simplify Tensor Parallel implementation with PyTorch TP (#34184 ) * Simplify Tensor Parallel implementation with PyTorch TP * Move tp_plan to config * Lint * Format and warning * Disable copy-from check * Conditionally get attr from config * make fix-copies * Move base_model_tp_plan to PretrainedConfig * Move TP into from_pretrained * Add device context for load * Do not serialize * Move _tp_plan setting to post_init * Add has_tp_plan * Add test_tp * Add 'Multi-gpu inference' doc * Add backward support for device type identification * Auto-detect accelerator * supports_tp_plan * copyright year * Fix copy	2024-11-18 19:51:49 +01:00
ecyht2	7df93d6ffb	fix: Wrong task mentioned in docs (#34757 )	2024-11-18 18:42:28 +00:00
Hun-soo Jung	7693b62268	Fix callback key name (#34762 ) Fixes typo.	2024-11-18 18:41:12 +00:00
Eon Kim	1ef6c5f1c5	fix: Update pixel_values parameter in hf_model input (#34782 )	2024-11-18 18:40:01 +00:00
Fanli Lin	e80a65ba4f	[tests] add XPU part to testing (#34778 ) add XPU part to testing Signed-off-by: Lin, Fanli <fanli.lin@intel.com>	2024-11-18 09:59:11 -08:00
Fanli Lin	9568a9dfc5	[docs] add XPU besides CUDA, MPS etc. (#34777 ) add XPU	2024-11-18 09:58:50 -08:00
Fanli Lin	8568bf1bcf	[docs] make `empty_cache` device-agnostic (#34774 ) make device-agnostic	2024-11-18 09:58:26 -08:00
Wing Lian	36759f3312	make sure to disable gradients for integer tensor (#32943 )	2024-11-18 16:49:37 +01:00
Dmitry Rogozhkin	1c471fc307	Fix skip of test_training_gradient_checkpointing (#34723 ) `19d58d31f` has introduced a context manager to manage subtests of test_training_gradient_checkpointing. However, test body was not moved under "with" statement. Thus, while tests are correctly marked as skipped, test bodies were still executed. In some cases, as with llama this caused attribute errors. Fixes: #34722 Fixes: `19d58d31f` ("Add MLLama (#33703)") Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>	2024-11-18 15:45:40 +01:00
ZuoChen_BUPT	c772d4d91e	fix a typo bug where 'id2label' was incorrectly written as 'i2label' when reading config (#34637 ) fix a bug where 'id2label' was incorrectly written as 'i2label' when reading the config from pretrained config	2024-11-18 14:41:48 +01:00
Ofek Lev	eb0ab3ed4b	Fix broken link (#34618 )	2024-11-18 14:13:26 +01:00
Raushan Turganbay	1646ffb4d1	VLMs: `patch_size` -> `num_image_tokens` in processing (#33424 ) * use num additional tokens * fix copies + docs * another fix copies :) * add docs * move order for BC	2024-11-18 13:21:07 +01:00
Shane A	3ee24e2208	Add OLMo November 2024 (#34551 ) * Add model skeletion with transformers-cli add-new-model-like * Convert config to modular, add rms_norm_eps, delete clip_qkv * Convert model to modular, add RMSNorm * Add flash attention with qk norm and no qkv clipping * Add decoder layer with RMSNorm after attention/feedforward layers * Add base and causal model * Add converter improvements from OLMo repo * Update weight loading in OLMo to HF converter * Set correct default for rms_norm_eps * Set correct pipeline_model_mapping in test * Run make fixup * Fix model type * Re-run modular conversion * Manually set config docs to fix build errors * Convert olmo-1124 to olmo_1124 to fix flash attention docs errors * Start updating tests * Update tests * Copy upstream test_eager_matches_sdpa_inference_1_bfloat16 changes to olmo_1124 * Rename input_layernorm and post_attention_layernorm to reflect their ops better * Use correct tokenizer * Remove test unsupported by GPT2 tokenizer * Create GenerationConfig outside of from_pretrained call * Use simpler init file structure * Add explicit __all__ to support simplified init * Make safetensor serialization the default * Update OLMo November 2024 docs	2024-11-18 10:43:10 +01:00
Joao Gante	13493215ab	🧼 remove v4.44 deprecations (#34245 ) * remove v4.44 deprecations * PR comments * deprecations scheduled for v4.50 * hub version update * make fiuxp --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-11-15 23:07:24 +01:00
AbdelKarim ELJANDOUBI	8d50fda644	Remove FSDP wrapping from sub-models. (#34452 ) * Remove FSDP wrapping from sub-models. * solve conflict trainer.py * make fixup * add unit test for fsdp_auto_wrap_policy when using auto_find_batch_size * put back extract_model_from_parallel * use transformers unwrap_model	2024-11-15 23:00:03 +01:00
Wing Lian	b0c0ba7b4d	FSDP grad accum fix (#34645 ) * add gradient accumulation steps tests for fsdp * invert no_sync context to fix training for fsdp	2024-11-15 22:28:06 +01:00
jiqing-feng	52ea4aa589	add xpu path for awq (#34712 ) * add xpu path for awq * update readme	2024-11-15 15:45:24 +01:00
CezaPasc	7b3d615bc2	fix(wandb): pass fake dataset to avoid exception in trainer (see #34455 ) (#34720 )	2024-11-15 15:44:02 +01:00
Lysandre Debut	f5dbfab7f3	Update llava.md (#34749 ) LLava -> Llava	2024-11-15 15:39:57 +01:00
lewtun	8ba3e1505e	Retain newlines in chat template when `continue_final_message=True` (#34253 ) * Retain newlines in chat template when * Add try/except * Add regression test * Simplify test * Apply suggestions from code review Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> --------- Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>	2024-11-15 14:27:04 +00:00
Fanli Lin	a3d69a8994	[docs] add xpu device check (#34684 ) * add XPU path * use accelerate API * Update docs/source/en/tasks/semantic_segmentation.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * update more places with accelerate API --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-11-13 14:16:59 -08:00
Xiao Yuan	68f8186a89	Fix example in EsmConfig docstring (#34653 )	2024-11-13 13:55:58 -08:00
Pedro Cuenca	e7c36a9d57	[docs] Broken link in generation_strategies (#34717 ) [docs] Broken link	2024-11-13 13:44:42 -08:00
MaCAT	be8748a53c	🌐 [i18n-KO] Translated marian.md to Korean (#34698 ) * initial translation * removed english * Fixed Trivial Typos, updated _toctree.yml	2024-11-13 13:14:23 -08:00
Aymeric Roucher	33eef99250	Agents: Small fixes in streaming to gradio + add tests (#34549 ) * Better support transformers.agents in gradio: small fixes and additional tests	2024-11-11 20:52:09 +01:00

17398 Commits