transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-18 20:18:24 +06:00

Author	SHA1	Message	Date
dependabot[bot]	62ab94dea8	Bump tornado from 6.4.1 to 6.4.2 in /examples/research_projects/visual_bert (#34887 ) Bump tornado in /examples/research_projects/visual_bert Bumps [tornado](https://github.com/tornadoweb/tornado) from 6.4.1 to 6.4.2. - [Changelog](https://github.com/tornadoweb/tornado/blob/v6.4.2/docs/releases.rst) - [Commits](https://github.com/tornadoweb/tornado/compare/v6.4.1...v6.4.2) --- updated-dependencies: - dependency-name: tornado dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-11-25 12:54:55 +00:00
Meliksah Turker	c50b5675d6	prepare_fa2_from_position_ids function bugfix (#33269 ) contiguous() is called before view() for key and value within prepare_fa2_from_position_ids function	2024-11-25 13:51:26 +01:00
VictorAtIfInsurance	a0f4f3174f	allow unused input parameters passthrough when chunking in asr pipelines (#33889 ) * allow unused parameter passthrough when chunking in asr pipelines * format code * format * run fixup * update tests * update parameters to pipeline in test * updates parameters in tests * change spelling in gitignore * revert .gitignore to main * add git ignore of devcontainer folder * assert asr output follows expected inference output type * run fixup * Remove .devcontainer from .gitignore * remove compliance check	2024-11-25 11:36:44 +01:00
kang sheng	4dc1a69349	Sum gathered input tokens (#34554 ) * sum gathered input tokens * ruff line-length is 119, format the code --------- Co-authored-by: kangsheng <kangsheng@meituan.com>	2024-11-25 11:27:13 +01:00
Raushan Turganbay	1e492afd61	🔴 Mllama: fix base prefix (#34874 ) fix base prefix	2024-11-25 11:20:20 +01:00
Arthur	857d46ca0c	[`Deberta/Deberta-v2`] Refactor code base to support compile, export, and fix LLM (#22105 ) * some modification for roadmap * revert some changes * yups * weird * make it work * sttling * fix-copies * fixup * renaming * more fix-copies * move stuff around * remove torch script warnings * ignore copies * revert bad changes * woops * just styling * nit * revert * style fixup * nits configuration style * fixup * nits * will this fix the tf pt issue? * style * ??????? * update * eval? * update error message * updates * style * grumble grumble * update * style * nit * skip torch fx tests that were failing * style * skip the failing tests * skip another test and make style	2024-11-25 10:43:16 +01:00
Raushan Turganbay	098962dac2	BLIP: fix generation after hub update (#34876 ) * fix blip generation * dont remove it yet * Update src/transformers/models/blip_2/modeling_blip_2.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * address comments * modular --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-11-25 10:41:55 +01:00
Raushan Turganbay	c1a8520419	Cache: init empty cache when `use_cache` (#34274 ) * fix * fix tests * fix copies * add docs * Revert "add docs" This reverts commit `32d35634f1`. * qwen move deltas * mllama can potentially fullgraph compile * enable mllama compile and fix tests * remove mllama fixes	2024-11-25 10:11:33 +01:00
Dmitry Rogozhkin	1339a14dca	Add safe_globals to resume training on PyTorch 2.6 (#34632 ) Starting from version 2.4 PyTorch introduces a stricter check for the objects which can be loaded with torch.load(). Starting from version 2.6 loading with weights_only=True requires allowlisting of such objects. This commit adds allowlist of some numpy objects used to load model checkpoints. Usage is restricted by context manager. User can still additionally call torch.serialization.add_safe_globals() to add other objects into the safe globals list. Accelerate library also stepped into same problem and addressed it with PR-3036. Fixes: #34631 See: https://github.com/pytorch/pytorch/pull/137602 See: https://pytorch.org/docs/stable/notes/serialization.html#torch.serialization.add_safe_globals See: https://github.com/huggingface/accelerate/pull/3036 Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>	2024-11-25 10:03:43 +01:00
jeongin601	318fe25f22	Fix: Enable prefill phase key value caching of nemotron/minitron models (#34742 ) * modeling nemotron kv caching bugfix Signed-off-by: jeongin601 <0200angela@gmail.com> * test file deleted Signed-off-by: jeongin601 <0200angela@gmail.com> * code refinement Signed-off-by: jeongin601 <0200angela@gmail.com> * remove unused variables Signed-off-by: jeongin601 <0200angela@gmail.com> * import block sorted * removed deprecation warning Signed-off-by: jeongin601 <0200angela@gmail.com> * removed support for tuple shape past_key_values Signed-off-by: jeongin601 <0200angela@gmail.com> * Update conditional statement for cache initialization Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Signed-off-by: jeongin601 <0200angela@gmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-11-25 09:45:35 +01:00
Yoni Gozlan	3a8eb74668	Fix support for image processors modifications in modular (#34866 ) * add fix and examples * fix camel case naming	2024-11-22 18:14:24 -05:00
Mohamed Mekkouri	54be2d7ae8	Bitnet test fix to avoid using gated model (#34863 ) small test fix	2024-11-22 17:18:49 +01:00
Benjamin Bossan	286ffaaf0a	[CI] Skip EETQ tests while package is broken with latest transformers (#34854 ) * CI Skip EETQ tests while package is broken EETQ tries to import the shard_checkpoint function from transformers but the function has been removed. Therefore, trying to use EETQ currently results in an import error. This fix results in EETQ tests being skipped if there is an import error. The issue has been reported to EETQ: https://github.com/NetEase-FuXi/EETQ/issues/34 * Raise helpful error when trying to use eetq * Forget to raise the error in else clause	2024-11-22 17:13:30 +01:00
Andrés Marafioti	861758e235	smol improvements to support more flexible usage (#34857 ) * smol improvements to support more flexible usage * ruff	2024-11-22 16:34:38 +01:00
Nadav Timor	42b36d7395	Speculative decoding: Test the target distribution (to prevent issues like #32867 ) (#34553 ) * Update test_utils.py * formatting * Update test_utils.py * formatting * formatting * Update test_utils.py * formatting * Update test_utils.py * formatting * format * comments at standard positions	2024-11-22 16:02:37 +01:00
Arthur	597efd21d2	Auto compile when static cache (#34247 ) * generate with compile * nits * simple * generate with compile * nits * simple * safe * style * Update src/transformers/generation/utils.py Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co> * remove TOKENIZER forked warning --------- Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>	2024-11-22 15:33:35 +01:00
Konrad Kalita	d9e6f307e7	Remove quantization related config from dequantized model (#34856 ) * Remove quantization related config from dequantized model * Fix whitespace	2024-11-22 10:06:29 +01:00
Logan Adams	1867be666d	Update checks for torch.distributed.tensor to require torch >= 2.5 (#34816 ) * Update checks for torch.distributed.tensor * Update PR with feedback * Formatting fix for import order * Remove unused function	2024-11-22 10:05:26 +01:00
Raushan Turganbay	6a912ff2c5	Watermarking: fix order (#34849 ) fix watermarking order	2024-11-22 08:25:14 +01:00
Cyril Vallez	4e90b99ed9	Refactor StarCoder2 using modular (#34015 ) * Create modular_starcoder2.py * Update modular_starcoder2.py * update * finalize modular * revert # no-unravel * Add support * style * Update modular_model_converter.py * update docstring	2024-11-21 14:52:39 +01:00
Jonathan Mamou	18871599c9	Fix heuristic scheduling for UAG (#34805 ) * fix heuristic schedule * fix style * fix format	2024-11-21 14:46:35 +01:00
AbdelKarim ELJANDOUBI	d6a5c23f71	Fix ds nvme (#34444 ) * skip nested deepspeed.zero.Init call * make fixup * solve conflict * solve conflict * put back local * use context managers instead of local thread * Skip recursive calls to deepspeed.zero.Init * Skip recursive calls to deepspeed.zero.Init * back to old notebooks * make style	2024-11-21 13:52:22 +01:00
Vladislav Bronzov	ae5cbf804b	Improve gguf tensor processing (#34515 ) * add tensor processing system to separate logic for models * format refactoring * small fix * make some methods private * move custom methods to processors * refactor tensor processing * format fix	2024-11-21 13:40:49 +01:00
farrosalferro	c57eafdaa1	Add Nemotron GGUF Loading Support (#34725 ) * Add Nemotron GGUF Loading Support * fix the Nemotron architecture assignation --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2024-11-21 11:37:34 +01:00
Quentin Gallouédec	d4e1acbb7c	Change logging level from warning to info for `max_steps` overriding `num_train_epochs` (#34810 ) Update trainer.py	2024-11-21 11:37:02 +01:00
Raushan Turganbay	28fb02fc05	VLMs: enable generation tests - last batch (#34484 ) * add tests for 3 more vlms * fix fuyu back * skip test	2024-11-21 11:00:22 +01:00
Yih-Dar	40821a2478	Fix CI slack reporting issue (#34833 ) * fix * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-11-20 21:36:13 +01:00
Marc Sun	3cb8676a91	Fix CI by tweaking torchao tests (#34832 )	2024-11-20 20:28:51 +01:00
Corentin Royer	bf42c3bd4b	Fix hyperparameter search when optuna+deepspeed (#34642 ) * Fix hyperparameter search when optuna+deepspeed * Adding free_memory to the search setup --------- Co-authored-by: Corentin-Royer <corentin.royer@ibm.com>	2024-11-20 18:02:58 +01:00
Marc Sun	67890de3b8	Torchao weights only + prequantized compatibility (#34355 ) * weights only compatibility * better tests from code review * ping torch version * add weights_only check	2024-11-20 17:24:45 +01:00
Tibor Reiss	f297af55df	Fix: take into account meta device (#34134 ) * Do not load for meta device * Make some minor improvements * Add test * Update tests/utils/test_modeling_utils.py Update test parameters Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Make the test simpler --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2024-11-20 11:32:07 +01:00
Phillip Kuznetsov	8cadf76e1c	fix(DPT,Depth-Anything) `torch.export` (#34103 ) * Fix torch.export issue in dpt based models Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai> * Simplify the if statements Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai> * Move activation definitions of zoe_depth to init() Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai> * Add test_export for dpt and zoedepth Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai> * add depth anything Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai> * Remove zoedepth non-automated zoedepth changes and zoedepth test Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai> * [run_slow] dpt, depth_anything, zoedepth Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai> --------- Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>	2024-11-20 11:31:21 +01:00
kjohew	9d16441e4f	Fix the memory usage issue of logits in generate() (#34813 )	2024-11-20 11:25:37 +01:00
Raushan Turganbay	9470d65324	Fix low memory beam search (#34746 ) * fix * higher max positions in tests	2024-11-20 07:46:35 +01:00
Raushan Turganbay	145fbd46cb	LLaVA OV: fix unpadding precision (#34779 ) * fix * propagate * type check	2024-11-20 07:46:13 +01:00
wwwbai	3033509327	Translate attention.md into Chinese (#34716 ) * try * tryagain * tryagggain * translated * translated2 * Update docs/source/zh/attention.md Co-authored-by: Huazhong Ji <hzji210@gmail.com> --------- Co-authored-by: Huazhong Ji <hzji210@gmail.com>	2024-11-19 10:03:12 -08:00
Merve Noyan	befbbf2f98	Added image-text-to-text pipeline to task guide (#34783 ) * Added image-text-to-text pipeline to task guide * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Merge codeblocks --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-11-19 09:49:10 -08:00
Yih-Dar	469eddbe2d	Fix `check_training_gradient_checkpointing` (#34806 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-11-19 17:48:34 +01:00
Yih-Dar	05ebe8b9b0	Run `test_medium_seamless_m4t_pt` in `subprocess` to avoid many failures (#34812 ) * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-11-19 17:32:10 +01:00
Yoni Gozlan	eedc113914	Add Image Processor Fast Deformable DETR (#34353 ) * add deformable detr image processor fast * add fast processor to doc * fix copies * nit docstring * Add tests gpu/cpu and fix docstrings * fix docstring * import changes from detr * fix imports * rebase and fix * fix input data format change in detr and rtdetr fast	2024-11-19 11:18:58 -05:00
Yoni Gozlan	b99ca4d28b	Add support for OpenAI api "image_url" input in chat for image-text-to-text pipeline (#34562 ) * add support for openai api image_url input * change continue to elif * Explicitly add support for OpenAI/TGI chat format * rewrite content to transformers chat format and add tests * Add support for typing of image type in chat templates * add base64 to possible image types * refactor nesting	2024-11-19 11:08:37 -05:00
dependabot[bot]	15dd625a0f	Bump aiohttp from 3.10.2 to 3.10.11 in /examples/research_projects/decision_transformer (#34792 ) Bump aiohttp in /examples/research_projects/decision_transformer Bumps [aiohttp](https://github.com/aio-libs/aiohttp) from 3.10.2 to 3.10.11. - [Release notes](https://github.com/aio-libs/aiohttp/releases) - [Changelog](https://github.com/aio-libs/aiohttp/blob/master/CHANGES.rst) - [Commits](https://github.com/aio-libs/aiohttp/compare/v3.10.2...v3.10.11) --- updated-dependencies: - dependency-name: aiohttp dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-11-19 16:08:07 +00:00
Wang, Yi	dc42330388	fix crash in tiiuae/falcon-11B-vlm image-to-text generation (#34728 ) Signed-off-by: Wang, Yi <yi.a.wang@intel.com>	2024-11-19 16:51:32 +01:00
David Zhang	427b62ed1a	Fix post process function called in the instance segmentation example of mask2former (#34588 ) * Fix post process function called in the instance segmentation example of mask2former * fix description and additional notes for post_process_instance_segmentation of maskformers * remove white space in maskformers post_process_instance_segmentation doc * change image.size[::-1] to height and width for clarity in segmentation examples	2024-11-19 16:49:25 +01:00
jp	fdb9230485	Add do_convert_rgb to vit (#34523 ) * Add: do_convert_rgb * Add: doc string * Update src/transformers/models/vit/image_processing_vit.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update src/transformers/models/vit/image_processing_vit.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update src/transformers/models/vit/image_processing_vit.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Add: do_convert_rgb to fast * Add: convert_to_rgb --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2024-11-19 16:48:05 +01:00
Tibor Reiss	7b9e51c1a0	Feature: print tokens per second during training (#34507 ) * Log tokens per second during training * Nitpicks * Move logic into _maybe_log_save_evaluate * Use speed_metrics	2024-11-19 16:46:04 +01:00
Phillip Kuznetsov	5fa4f64605	🚨🚨🚨 fix(Mask2Former): torch export 🚨🚨🚨 (#34393 ) * fix(Mask2Former): torch export Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai> * revert level_start_index and create a level_start_index_list Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai> * Add a comment to explain the level_start_index_list Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai> * Address comment Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai> * add torch.export.export test Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai> * rename arg Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai> * remove spatial_shapes Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai> * Use the version check from pytorch_utils Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai> * [run_slow] mask2former Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai> --------- Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>	2024-11-19 16:44:53 +01:00
huismiling	581524389a	MLU devices : Checks if mlu is available via an cndev-based check which won't trigger the drivers and leave mlu (#34326 ) * add Cambricon MLUs support * fix mlu device rng state * up for quality check * up mlu to support fp16 * fix mlu device dependency error * fix mlu device dependency error * enable mlu device for bf16 * fix mlu device memory tracker * Cambricon support SDPA and flash_attn * MLU devices : Checks if `mlu` is available via an `cndev-based` check which won't trigger the drivers and leave mlu	2024-11-19 16:37:39 +01:00
Cyril Vallez	e3a5889ef0	Modular fix (#34802 ) * Modular fix * style * remove logger warning * Update modular_model_converter.py	2024-11-19 16:08:57 +01:00
Marc Sun	ce1d328e3b	Fix cache_utils for optimum.quanto kvcache quantization (#34750 ) * add co-author Co-authored-by: w3rew <w3rew@users.noreply.github.com> * fix docs * fix cache * remove print --------- Co-authored-by: w3rew <w3rew@users.noreply.github.com>	2024-11-19 14:16:34 +01:00
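
Commit 286ffaaf0a in the log above describes skipping the EETQ tests when importing the package raises an import error. A minimal sketch of that general pattern, using only the Python standard library, might look like the following. The helper `is_package_importable`, the package name `hypothetical_quant_backend`, and the test class are illustrative assumptions, not code from the transformers repository.

```python
import importlib
import unittest

# Hypothetical package name, used only for illustration.
HYPOTHETICAL_PACKAGE = "hypothetical_quant_backend"


def is_package_importable(name: str) -> bool:
    """Return True only if importing `name` succeeds without an ImportError."""
    try:
        importlib.import_module(name)
    except ImportError:
        # Covers a missing package as well as a package whose own imports fail.
        return False
    return True


@unittest.skipUnless(
    is_package_importable(HYPOTHETICAL_PACKAGE),
    f"{HYPOTHETICAL_PACKAGE} is missing or fails to import",
)
class HypotheticalBackendTest(unittest.TestCase):
    def test_backend_loads(self):
        # Runs only when the import check above succeeded.
        module = importlib.import_module(HYPOTHETICAL_PACKAGE)
        self.assertIsNotNone(module)
```

Checking the import itself, rather than only whether the package is installed, means an installed but broken package leads to skipped tests instead of a collection-time crash.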

19383 Commits