transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Zach Mueller	8ec028aded	Reduce the error log when using core models that need their weights renamed, and provide a step forward (#32656 ) * Fin * Modify msg * Finish up nits	2024-08-16 13:05:57 -04:00
Marc Sun	1c36db697a	fix multi-gpu with static cache (#32543 )	2024-08-16 19:02:37 +02:00
Zach Mueller	0b066bed14	Revert PR 32299, flag users when Zero-3 was missed (#32851 ) Revert PR 32299	2024-08-16 12:35:41 -04:00
Zhan Rongrui	f20d0e81ea	improve _get_is_as_tensor_fns (#32596 ) * improve _get_is_as_tensor_fns * format	2024-08-16 15:59:44 +01:00
Yangshen⚡Deng	a27182b7fc	Fix AutoConfig and AutoModel support for Llava-Next-Video (#32844 ) * Fix: fix all model_type of Llava-Next-Video to llava_next_video * Fix doc for llava_next_video * * Fix formatting issues * Change llava-next-video.md file name into llava_next_video.md to make it compatible with implementation * Fix docs TOC for llava-next-video	2024-08-16 12:41:05 +01:00
Joao Gante	cf32ee1753	Cache: use `batch_size` instead of `max_batch_size` (#32657 ) * more precise name * better docstrings * Update src/transformers/cache_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-08-16 11:48:45 +01:00
Fanli Lin	8f9fa3b081	[tests] make test_sdpa_equivalence device-agnostic (#32520 ) * fix on xpu * [run_all]	2024-08-16 11:34:13 +01:00
Joao Gante	70d5df6107	Generate: unify `LogitsWarper` and `LogitsProcessor` (#32626 )	2024-08-16 11:20:41 +01:00
Ao Tang	5fd7ca7bc9	Use head_dim if in config for RoPE (#32495 ) * use head_dim if in config for RoPE * typo * simplify with getattr	2024-08-16 11:37:43 +02:00
Arthur	c215523528	add back the position ids (#32554 ) * add back the position ids * fix failing test	2024-08-16 11:00:05 +02:00
Raushan Turganbay	f3c8b18053	VLMs: small clean-up for cache class (#32417 ) * fix beam search in video llava * [run-slow] video_llava	2024-08-16 09:07:05 +05:00
muddlebee	d6751d91c8	fix: update doc link for runhouse in README.md (#32664 )	2024-08-15 20:00:55 +01:00
Sai-Suraj-27	ab7e893d09	fix: Corrected `falcon-mamba-7b` model checkpoint name (#32837 ) Corrected the model checkpoint.	2024-08-15 18:03:18 +01:00
jp	e840127370	reopen: llava-next fails to consider padding_side during Training (#32679 ) restore #32386	2024-08-15 11:44:19 +01:00
Sai-Suraj-27	8820fe8b8c	Updated workflows to the latest versions (#32405 ) Updated few workflows to the latest versions.	2024-08-14 20:18:14 +02:00
Zach Mueller	0cea2081a3	Unpin deepspeed in Docker image/tests (#32572 ) Unpin deepspeed	2024-08-14 18:30:25 +01:00
Sai-Suraj-27	95a77819db	fix: Fixed unknown pytest config option `doctest_glob` (#32475 ) Fixed unknown config option doctest_glob.	2024-08-14 18:30:01 +01:00
Dina Suehiro Jones	6577c77d93	Update the distributed CPU training on Kubernetes documentation (#32669 ) * Update the Kubernetes CPU training example * Add namespace arg Signed-off-by: Dina Suehiro Jones <dina.s.jones@intel.com> --------- Signed-off-by: Dina Suehiro Jones <dina.s.jones@intel.com>	2024-08-14 09:36:43 -07:00
Yih-Dar	20a04497a8	Fix `JetMoeIntegrationTest` (#32332 ) JetMoeIntegrationTest Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-08-14 16:22:06 +02:00
Jerry Zhang	78d78cdf8a	Add TorchAOHfQuantizer (#32306 ) * Add TorchAOHfQuantizer Summary: Enable loading torchao quantized model in huggingface. Test Plan: local test Reviewers: Subscribers: Tasks: Tags: * Fix a few issues * style * Added tests and addressed some comments about dtype conversion * fix torch_dtype warning message * fix tests * style * TorchAOConfig -> TorchAoConfig * enable offload + fix memory with multi-gpu * update torchao version requirement to 0.4.0 * better comments * add torch.compile to torchao README, add perf number link --------- Co-authored-by: Marc Sun <marc@huggingface.co>	2024-08-14 16:14:24 +02:00
Steven Liu	9485289f37	Update translation docs review (#32662 ) update list of people to tag	2024-08-14 13:57:07 +02:00
Sai-Suraj-27	df323476a3	fix: Fixed failing tests in `tests/utils/test_add_new_model_like.py` (#32678 ) * Fixed failing tests in tests/utils/test_add_new_model_like.py * Fixed formatting using ruff. * Small nit.	2024-08-14 12:06:17 +01:00
fmo-mt	a22ff36e0e	Support MUSA (Moore Threads GPU) backend in transformers (#31913 ) Add accelerate version check, needs accelerate>=0.33.0	2024-08-13 21:10:25 -04:00
Pablo Montalvo	c1357834e8	Fix tests recurrent (#32651 ) * add fix for recurrentgemma * [no-filter] * trigger-ci * [no-filter] * [no-filter] * attempt to fix mysterious zip error * [no-filter] * fix lookup error * [no-filter] * remove summarization hack * [no-filter]	2024-08-13 23:40:50 +02:00
Seungwoo Lee	9d2ab8824c	TF_Deberta supporting mixed precision (#32618 ) * Update modeling_tf_deberta.py Corrected some codes which do not support mixed precision * Update modeling_tf_deberta_v2.py Corrected some codes which do not support mixed precision * Update modeling_tf_deberta_v2.py * Update modeling_tf_deberta.py * Add files via upload * Add files via upload	2024-08-13 18:15:24 +01:00
Yoni Gozlan	5bcbdff159	Modify ProcessorTesterMixin for better generalization (#32637 ) * Add padding="max_length" to tokenizer kwargs and change crop_size to size for image_processor kwargs * remove crop_size argument in align processor tests to be coherent with base tests * Add pad_token when loading tokenizer if needed, change test override tokenizer kwargs, remove unnecessary test overwrites in grounding dino	2024-08-13 11:48:53 -04:00
Sai-Suraj-27	c3cd9d807e	Fix: Fixed directory path for utils folder in `test_tokenization_utils.py` (#32601 ) * Removed un-necessary expressions. * Fixed directory path for utils folder in test_tokenization_utils.py	2024-08-13 16:48:15 +01:00
Bertrand Thia	cc25757a44	Add Depth Anything V2 Metric models (#32126 ) * add checkpoint and repo names * adapt head to support metric depth estimation * add max_depth output scaling * add expected logits * improve docs * fix docstring * add checkpoint and repo names * adapt head to support metric depth estimation * add max_depth output scaling * add expected logits * improve docs * fix docstring * rename depth_estimation to depth_estimation_type * add integration test * Refactored tests to include metric depth model inference test * Integration test pass when the timm backbone lines are commented (L220-L227) * address feedback * replace model path to use organization path * formatting * delete deprecated TODO * address feedback * [run_slow] depth_anything	2024-08-13 16:16:30 +02:00
Eric Hartford	481e15604a	Add support for GrokAdamW optimizer (#32521 ) * add grokadamw * reformat * code review feedback, unit test * reformat * reformat	2024-08-13 13:20:28 +01:00
Fanli Lin	b5016d5de7	fix tensors on different devices in `WhisperGenerationMixin` (#32316 ) * fix * enable on xpu * no manual remove * move to device * remove to * add move to	2024-08-13 11:29:57 +01:00
Pablo Montalvo	a5a8291ad1	Fix tests (#32649 ) * skip failing tests * [no-filter] * [no-filter] * fix wording catch in FA2 test * [no-filter] * trigger normal CI without filtering	2024-08-13 09:46:21 +01:00
Lysandre Debut	29c3a0fa01	Automatically add `transformers` tag to the modelcard (#32623 ) * Automatically add `transformers` tag to the modelcard * Specify library_name and test	2024-08-13 07:59:01 +02:00
Raushan Turganbay	a29eabd0eb	Expand inputs in processors for VLMs (#30962 ) * let it be * draft * should not have changed * add warnings * fix & add tests * fix tests * inputs embeds cannot be passed with pixels * more updates * paligemma ready! * minor typos * update blip-2 * fix tests & raise error * docstring * add blip2 test * tmp * add image seq length to config * update docstring * delete * fix tests * fix blip * fix paligemma * out-of-place scatter * add llava-next-video * Update src/transformers/models/blip_2/modeling_blip_2.py Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> * remove tmp * codestyle * nits * more nits * remove overriding in tests * comprehension when merging video * fix-copies * revert changes for embeds test * fix tests after making comprehension * Update src/transformers/models/blip_2/processing_blip_2.py Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> * Update src/transformers/models/blip_2/processing_blip_2.py Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> * more updates * fix tests --------- Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>	2024-08-13 10:14:39 +05:00
Sai-Suraj-27	2a5a6ad18a	fix: Updated the `is_torch_mps_available()` function to include `min_version` argument (#32545 ) * Fixed wrong argument in is_torch_mps_available() function call. * Fixed wrong argument in is_torch_mps_available() function call. * sorted the import. * Fixed wrong argument in is_torch_mps_available() function call. * Fixed wrong argument in is_torch_mps_available() function call. * Update src/transformers/utils/import_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * removed extra space. * Added type hint for the min_version parameter. * Added missing import. --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-08-12 20:42:57 +01:00
Quentin Gallouédec	f1c8542ff7	"to be not" -> "not to be" (#32636 ) * "to be not" -> "not to be" * Update sam.md * Update trainer.py * Update modeling_utils.py * Update test_modeling_utils.py * Update test_modeling_utils.py	2024-08-12 20:20:17 +01:00
dependabot[bot]	126cbdb365	Bump tensorflow from 2.11.1 to 2.12.1 in /examples/research_projects/decision_transformer (#32341 ) Bump tensorflow in /examples/research_projects/decision_transformer Bumps [tensorflow](https://github.com/tensorflow/tensorflow) from 2.11.1 to 2.12.1. - [Release notes](https://github.com/tensorflow/tensorflow/releases) - [Changelog](https://github.com/tensorflow/tensorflow/blob/master/RELEASE.md) - [Commits](https://github.com/tensorflow/tensorflow/compare/v2.11.1...v2.12.1) --- updated-dependencies: - dependency-name: tensorflow dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-08-12 19:57:07 +01:00
Sai-Suraj-27	ce4b28830a	fix: Fixed failing `test_find_base_model_checkpoint` (#32638 ) Fixed failing test_find_base_model_checkpoint.	2024-08-12 19:51:30 +01:00
Ahnjj_DEV	7f777ab7d9	🌐 [i18n-KO] Translated `awq.md` to Korean (#32324 ) * fix: manual edits * Apply suggestions from code review Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com> Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr> * fix:manual edits - moved the translation file because it was created in the wrong path * Delete docs/source/ko/tasks/awq.md * Update docs/source/ko/_toctree.yml Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com> Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-08-12 10:12:48 -07:00
YONGSANG	4996990d61	🌐 [i18n-KO] Translated `deepspeed.md` to Korean (#32431 ) * Update _toctree.yml * docs: ko: deepspeed.md * Apply suggestions from code review Co-authored-by: wony617 <49024958+Jwaminju@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: wony617 <49024958+Jwaminju@users.noreply.github.com> * Update docs/source/ko/_toctree.yml Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ko/deepspeed.md * Update docs/source/ko/deepspeed.md Co-authored-by: SeungAhSon <gongsoonyee@gmail.com> * Apply suggestions from code review Co-authored-by: wony617 <49024958+Jwaminju@users.noreply.github.com> * Update docs/source/ko/_toctree.yml --------- Co-authored-by: wony617 <49024958+Jwaminju@users.noreply.github.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>	2024-08-12 10:07:31 -07:00
Matt	b7ea171403	Cleanup tool calling documentation and rename doc (#32337 ) * Rename "Templates for Chat Models" doc to "Chat Templates" * Small formatting fix * Small formatting fix * Small formatting fix * Cleanup tool calling docs as well * Remove unneeded 'revision' * Move tip to below main code example * Little bonus section on template editing	2024-08-12 16:20:14 +01:00
dependabot[bot]	8a3c55eb21	Bump torch from 1.13.1 to 2.2.0 in /examples/research_projects/visual_bert (#32220 ) Bump torch in /examples/research_projects/visual_bert Bumps [torch](https://github.com/pytorch/pytorch) from 1.13.1 to 2.2.0. - [Release notes](https://github.com/pytorch/pytorch/releases) - [Changelog](https://github.com/pytorch/pytorch/blob/main/RELEASE.md) - [Commits](https://github.com/pytorch/pytorch/compare/v1.13.1...v2.2.0) --- updated-dependencies: - dependency-name: torch dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-08-12 16:02:52 +01:00
dependabot[bot]	50837f2060	Bump aiohttp from 3.9.4 to 3.10.2 in /examples/research_projects/decision_transformer (#32569 ) Bump aiohttp in /examples/research_projects/decision_transformer Bumps [aiohttp](https://github.com/aio-libs/aiohttp) from 3.9.4 to 3.10.2. - [Release notes](https://github.com/aio-libs/aiohttp/releases) - [Changelog](https://github.com/aio-libs/aiohttp/blob/master/CHANGES.rst) - [Commits](https://github.com/aio-libs/aiohttp/compare/v3.9.4...v3.10.2) --- updated-dependencies: - dependency-name: aiohttp dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-08-12 15:49:59 +01:00
Lucain	e31a7a2638	Fix `.push_to_hub(..., create_pr=True, revision="my-branch")` when creating PR on not-owned repo (#32094 ) Fix create_pr against existing revision	2024-08-12 15:35:32 +01:00
Sai-Suraj-27	bd251e4955	fix: Fixed conditional check for `encodec` model names (#32581 ) * Fixed conditional check for encodec model names. * Reformatted conditional check.	2024-08-12 12:07:46 +01:00
Chaehong Jeong	342e3f9f20	Fix sliding window attention used in Gemma2FlashAttention2 (#32522 ) * fix sliding window attention (flash2) in gemma2 model * [run-slow] gemma * fix slicing attention_mask for flash_attn2 * fix slicing attention_mask when flash_attn is used * add missing comment * slice the last seq_len tokens in the key, value states * revert code of slicing key, value states	2024-08-12 11:18:15 +02:00
Raushan Turganbay	8f2b6d5e3d	Fix: FA2 with packed training (#32487 ) * fix check * add tests * [run-slow] llama, gemma2 * oops, whisper actually runs but needed some special treatment	2024-08-12 13:40:07 +05:00
Younes Belkada	7c11491208	Add new model (#32615 ) * v1 - working version * fix * fix * fix * fix * rename to correct name * fix title * fixup * rename files * fix * add copied from on tests * rename to `FalconMamba` everywhere and fix bugs * fix quantization + accelerate * fix copies * add `torch.compile` support * fix tests * fix tests and add slow tests * copies on config * merge the latest changes * fix tests * add few lines about instruct * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix * fix tests --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-08-12 08:22:47 +02:00
wony617	48101cf8d1	🌐 [i18n-KO] Translated `agent.md` to Korean (#32351 ) * docs: ko: main_classes/agent * feat: chatgpt draft * fix: manual edits * fix: resolve suggestions Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com> Co-authored-by: thsamaji <60818655+thsamajiki@users.noreply.github.com> Co-authored-by: SeungAhSon <gongsoonyee@gmail.com> * fix: resolve suggestions * fix: resolve code line number --------- Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com> Co-authored-by: thsamaji <60818655+thsamajiki@users.noreply.github.com> Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>	2024-08-09 09:58:52 -07:00
zhanweidu	e7f4ace092	fix non contiguous tensor value error in save_pretrained (#32422 ) Signed-off-by: duzhanwei <duzhanwei@bytedance.com> Co-authored-by: duzhanwei <duzhanwei@bytedance.com>	2024-08-09 12:59:43 +01:00
Arthur	e4522fe399	fix slow integration gemma2 test (#32534 ) no empty revision	2024-08-09 11:28:22 +02:00


16618 Commits