* Add implementation for DataCollatorForMultipleChoice based on docs (usage sketched below).
* Add DataCollatorForMultipleChoice to import structure.
* Remove custom DataCollatorForMultipleChoice implementations from example scripts.
* Remove custom implementations of DataCollatorForMultipleChoice from docs in English, Spanish, Japanese and Korean.
* Refactor torch version of DataCollatorForMultipleChoice to be more easily understandable.
* Apply suggested changes and run make fixup.
* fix copies, style and fixup
* add missing documentation
* nits
* fix docstring
* style
* nits
* isort
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
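The built-in collator replaces the per-script copies removed above. A minimal usage sketch, assuming the class keeps the tokenizer-plus-padding signature of the docs' old custom implementation; the toy token ids are illustrative:
```
from transformers import AutoTokenizer, DataCollatorForMultipleChoice

tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")
collator = DataCollatorForMultipleChoice(tokenizer=tokenizer)

# Each feature carries one list of token ids per answer choice. The collator
# flattens the choices, pads them to a common length, reshapes back to
# (batch_size, num_choices, seq_len), and collects labels into a tensor.
features = [
    {"input_ids": [[101, 2023, 102], [101, 2008, 2003, 102]], "label": 0},
    {"input_ids": [[101, 2619, 102], [101, 2842, 102]], "label": 1},
]
batch = collator(features)
print(batch["input_ids"].shape)  # torch.Size([2, 2, 4])
print(batch["labels"])           # tensor([0, 1])
```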
* update env command to log deepspeed version
* suppress deepspeed import logging
* Add reminder to include configs to repro description in bug report.
* make fixup
* [WIP] update import utils for deepspeed
* Change to using is_deepspeed_available() from integrations (pattern sketched below).
* make fixup
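A sketch of the pattern these commits converge on: gate on `is_deepspeed_available()` from the integrations module rather than a bare import, and report the installed version in the env command. The exact report formatting here is illustrative:
```
import importlib.metadata

from transformers.integrations import is_deepspeed_available

if is_deepspeed_available():
    deepspeed_version = importlib.metadata.version("deepspeed")
else:
    deepspeed_version = "not installed"
print(f"- DeepSpeed version: {deepspeed_version}")
```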
* change order of unmasking of tokens (masking idea sketched below)
* library import
* class setup
* test function
* refactor
* add commit message
* test modified
* explicit initialisation of weights + made model smaller
* removed separate testing file
* fixup
* fixup core
* test attention mask with token types
* tests fixup
* removed PaliGemmaAttentionMaskTest class
---------
Co-authored-by: sambhavnoobcoder <indosambahv@gmail.com>
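A minimal sketch (not the model's actual code) of the masking idea the attention-mask tests above exercise: start from a causal mask, then unmask the columns whose token_type_ids flag them as prefix tokens, so every query attends to the full prefix bidirectionally while suffix tokens stay causal:
```
import torch

seq_len = 6
token_type_ids = torch.tensor([[1, 1, 1, 0, 0, 0]])  # 1 = prefix, 0 = suffix

causal = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))
# Column k is unmasked for every query when token k belongs to the prefix.
prefix_cols = token_type_ids[0].bool().unsqueeze(0).expand(seq_len, seq_len)
attention_mask = causal | prefix_cols
```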
* Adding option to save/reload scaler (round-trip sketched below)
* Removing duplicate variable
* Adding save/reload test
* Small fixes on deterministic algorithm call
* Moving LLM test to another file to isolate its environment
* Moving back to old file and using subprocess to run test isolated
* Reverting accidental change
* Reverting accidental change
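The save/reload behaviour under test follows the standard torch GradScaler state_dict round-trip; the file name is illustrative:
```
import torch

scaler = torch.cuda.amp.GradScaler()
# ... run steps that call scaler.scale(loss).backward(), scaler.step(), scaler.update() ...
torch.save(scaler.state_dict(), "scaler.pt")  # persisted alongside the checkpoint

resumed = torch.cuda.amp.GradScaler()
resumed.load_state_dict(torch.load("scaler.pt"))  # restores scale and growth tracker
```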
* multi-gpu: fix inputs_embeds + position_embeds (fix sketched below)
Fixing the following error in a few models:
```
> hidden_states = inputs_embeds + pos_embeds
E RuntimeError: Expected all tensors to be on the same device, but found at least two devices, xpu:2 and xpu:3!
```
Fixes: #35762
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
* multi-gpu: fix tensor device placements for various models
Fixes: #35762
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
* Apply make fix-copies
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
---------
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
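The shape of the fix for the RuntimeError quoted above, reduced to a sketch: when a model is sharded across devices, the position embeddings must be moved to the device of `inputs_embeds` before the addition:
```
import torch

def add_position_embeddings(inputs_embeds: torch.Tensor, pos_embeds: torch.Tensor) -> torch.Tensor:
    # With device_map sharding the two tensors can live on different devices
    # (e.g. xpu:2 vs xpu:3); align them before the add that used to raise.
    return inputs_embeds + pos_embeds.to(inputs_embeds.device)
```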
* feat: added warning to Trainer when label_names is not specified for PeftModel (warning sketched below)
* Update trainer.py
* feat: detect peft with `_is_peft_model`
* Update src/transformers/trainer.py
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
* Applied formatting in trainer.py
---------
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
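An illustrative sketch of the warning being added, not the exact Trainer code; `_is_peft_model` is the internal helper named in the commits:
```
import logging

logger = logging.getLogger(__name__)

def maybe_warn_on_missing_label_names(model, label_names):
    from transformers.trainer import _is_peft_model  # internal helper

    if _is_peft_model(model) and not label_names:
        # Trainer cannot reliably infer label columns through the PEFT
        # wrapper, so ask the user to set label_names explicitly.
        logger.warning(
            "No label_names provided for a PeftModel; set label_names in "
            "TrainingArguments so the Trainer can find your labels."
        )
```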
* add RAdamScheduleFree optimizer (usage sketched below)
* revert schedulefree version to the minimum requirement
* refine is_schedulefree_available so that it can take min_version
* refine documents
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
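A hedged sketch of driving the new optimizer through the schedulefree package directly; the class name follows the commit, and availability depends on the installed schedulefree version (hence the minimum-version check mentioned above):
```
import torch
from schedulefree import RAdamScheduleFree

model = torch.nn.Linear(10, 2)
optimizer = RAdamScheduleFree(model.parameters(), lr=1e-3)

optimizer.train()  # schedule-free optimizers must be switched into train mode
for _ in range(3):
    loss = model(torch.randn(4, 10)).sum()
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
optimizer.eval()  # and back to eval mode before evaluation or checkpointing
```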
* Add `base_model_pp_plan` to `PretrainedConfig` (final schema sketched below)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Add `_pp_plan` to `PreTrainedModel`
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Add both to Llama for testing
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Fix type error
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Update to suggested schema
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* `_pp_plan` keys are not patterns
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Simplify schema
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Fix typing error
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Update input name for Llama
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Add pp plan to Aria
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Add pp plan to Bamba
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Add pp plan to Cohere 1 & 2
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Add pp plan to diffllama and emu3
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Add pp plan to Gemma 1 & 2
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Add pp plan to GLM and GPT NeoX
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Add pp plan to Granite and Helium
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Add pp plan to Mistral and Mixtral
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Add pp plan to OLMo 1 & 2
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Add pp plan to Phi and Phi 3
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Add pp plan for Qwen 2, 2 MoE, 2 VL and 2.5 VL
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Add pp plan for Starcoder 2
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Add enum for accessing inputs and outputs
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Update type hints to use tuples
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Change outer list to tuple
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
---------
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
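A sketch of what the schema looks like after the "Simplify schema" and "`_pp_plan` keys are not patterns" commits: plain module names map to an (input names, output names) pair. The concrete entries below are illustrative, loosely modeled on the Llama plan:
```
# On the config: how the base model's stages consume and produce tensors.
base_model_pp_plan = {
    "embed_tokens": (["input_ids"], ["inputs_embeds"]),
    "layers": (["hidden_states", "attention_mask"], ["hidden_states"]),
    "norm": (["hidden_states"], ["hidden_states"]),
}

# On the model: head-specific stages, e.g. for a causal LM.
_pp_plan = {
    "lm_head": (["hidden_states"], ["logits"]),
}
```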
* update awq doc
* Update docs/source/en/quantization/awq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/awq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/awq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/awq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* add note for inference (sketched below)
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
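The inference note boils down to this pattern: AWQ checkpoints are already quantized, so they load like any other model. A sketch with an example Hub checkpoint:
```
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/zephyr-7B-alpha-AWQ"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
```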
* make output_dir optional (default behaviour sketched below)
* initiated a basic testing module to validate and verify the changes
* Test that output_dir defaults to 'tmp_trainer' when unspecified.
* test existing functionality of output_dir.
* test that output_dir is only created when needed
* final check
* added docstring and changed tmp_trainer to trainer_output
* make style fixes to test file.
* another round of fixup
---------
Co-authored-by: sambhavnoobcoder <indosambahv@gmail.com>
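A sketch of the behaviour the tests above pin down, assuming the default lands as described in the commits:
```
from transformers import TrainingArguments

args = TrainingArguments()  # output_dir may now be omitted
print(args.output_dir)      # expected to fall back to "tmp_trainer" per the tests above
```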
* Remove unused `max_size` variable in processor which was always `None` and triggered an unnecessary deprecation warning
* Remove deprecated warnings and eliminate `max_size` usage
* Test use `int` as argument for `size`
Add a test to ensure it passes successfully and preserves backward compatibility
* The test pipelines still use `max_size`
Remove `max_size` from the test pipelines and replace it with `size`, a `Dict` with `'shortest_edge'` and `'longest_edge'` as keys (this form is sketched below)
* Reformatting
* Reformatting
* Revert "Reformatting"
This reverts commit c3040acee7.
* Revert "Reformatting"
This reverts commit ac4522e5c9.
* Revert "The test pipelines still use `max_size`"
This reverts commit eaed96f041.
* Revert "Test use `int` as argument for `size`"
This reverts commit 1925ee38c7.
* Revert "Remove deprecated warnings and eliminate `max_size` usage"
This reverts commit d8e7e6ff90.
* Change version `4.26` to "a future version"
* Reformatting
* Revert "Change version `4.26` to "a future version""
This reverts commit 2b53f9e4.
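For reference, the `size`-dict form that the (since-reverted) test-pipeline commit exercised; the checkpoint and dummy image here are illustrative:
```
import numpy as np
from PIL import Image
from transformers import AutoImageProcessor

processor = AutoImageProcessor.from_pretrained("facebook/detr-resnet-50")
image = Image.fromarray(np.zeros((480, 640, 3), dtype=np.uint8))
inputs = processor(
    images=image,
    size={"shortest_edge": 800, "longest_edge": 1333},  # replaces max_size
    return_tensors="pt",
)
```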
* Add is_torch_greater_or_equal test decorator
* Add common test for torch.export
* Fix bit
* Fix focalnet
* Fix imagegpt
* Fix seggpt
* Fix swin2sr
* Enable torch.export test for vision models
* Enable test for video models
* Remove json
* Enable for hiera
* Enable for ijepa
* Fix detr
* Fix conditional_detr
* Fix maskformer
* Enable test maskformer
* Fix test for deformable detr
* Fix custom kernels for export in rt-detr and deformable-detr
* Enable test for all DPT
* Remove custom test for deformable detr
* Simplify test to use only kwargs for export (pattern sketched at the end of this list)
* Add comment
* Move compile_compatible_method_lru_cache to utils
* Fix beit export
* Fix deformable detr
* Fix copies data2vec<->beit
* Fix typos, update test to work with dict
* Add seed to the test
* Enable test for vit_mae
* Fix beit tests
* [run-slow] beit, bit, conditional_detr, data2vec, deformable_detr, detr, focalnet, imagegpt, maskformer, rt_detr, seggpt, swin2sr
* Add vitpose test
* Add textnet test
* Add dinov2 with registers
* Update tests/test_modeling_common.py
* Switch to torch.testing.assert_close
* Fix maskformer
* Remove save-load from test
* Add dab_detr
* Add depth_pro
* Fix and test RT-DETRv2
* Fix dab_detr
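The kwargs-only export pattern the common test settled on, sketched against an illustrative vision checkpoint:
```
import torch
from transformers import AutoModelForImageClassification

model = AutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224").eval()
# Pass all model inputs as kwargs, with empty positional args.
exported = torch.export.export(
    model,
    args=(),
    kwargs={"pixel_values": torch.randn(1, 3, 224, 224)},
)
```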