Zach Mueller
8ec028aded
Reduce the error log when using core models that need their weights renamed, and provide a step forward ( #32656 )
...
* Fin
* Modify msg
* Finish up nits
2024-08-16 13:05:57 -04:00
Marc Sun
1c36db697a
fix multi-gpu with static cache ( #32543 )
2024-08-16 19:02:37 +02:00
Zach Mueller
0b066bed14
Revert PR 32299, flag users when Zero-3 was missed ( #32851 )
...
Revert PR 32299
2024-08-16 12:35:41 -04:00
Zhan Rongrui
f20d0e81ea
improve _get_is_as_tensor_fns ( #32596 )
...
* improve _get_is_as_tensor_fns
* format
2024-08-16 15:59:44 +01:00
Yangshen⚡Deng
a27182b7fc
Fix AutoConfig and AutoModel support for Llava-Next-Video ( #32844 )
...
* Fix: fix all model_type of Llava-Next-Video to llava_next_video
* Fix doc for llava_next_video
* Fix formatting issues
* Change llava-next-video.md file name into llava_next_video.md to make it compatible with implementation
* Fix docs TOC for llava-next-video
2024-08-16 12:41:05 +01:00
Joao Gante
cf32ee1753
Cache: use `batch_size` instead of `max_batch_size` ( #32657 )
...
* more precise name
* better docstrings
* Update src/transformers/cache_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-08-16 11:48:45 +01:00
Fanli Lin
8f9fa3b081
[tests] make test_sdpa_equivalence device-agnostic ( #32520 )
...
* fix on xpu
* [run_all]
2024-08-16 11:34:13 +01:00
Joao Gante
70d5df6107
Generate: unify `LogitsWarper` and `LogitsProcessor` ( #32626 )
2024-08-16 11:20:41 +01:00
Ao Tang
5fd7ca7bc9
Use head_dim if in config for RoPE ( #32495 )
...
* use head_dim if in config for RoPE
* typo
* simplify with getattr
2024-08-16 11:37:43 +02:00
Arthur
c215523528
add back the position ids ( #32554 )
...
* add back the position ids
* fix failing test
2024-08-16 11:00:05 +02:00
Raushan Turganbay
f3c8b18053
VLMs: small clean-up for cache class ( #32417 )
...
* fix beam search in video llava
* [run-slow] video_llava
2024-08-16 09:07:05 +05:00
muddlebee
d6751d91c8
fix: update doc link for runhouse in README.md ( #32664 )
2024-08-15 20:00:55 +01:00
Sai-Suraj-27
ab7e893d09
fix: Corrected `falcon-mamba-7b` model checkpoint name ( #32837 )
...
Corrected the model checkpoint.
2024-08-15 18:03:18 +01:00
jp
e840127370
reopen: llava-next fails to consider padding_side during Training ( #32679 )
...
restore #32386
2024-08-15 11:44:19 +01:00
Sai-Suraj-27
8820fe8b8c
Updated workflows to the latest versions ( #32405 )
...
Updated a few workflows to the latest versions.
2024-08-14 20:18:14 +02:00
Zach Mueller
0cea2081a3
Unpin deepspeed in Docker image/tests ( #32572 )
...
Unpin deepspeed
2024-08-14 18:30:25 +01:00
Sai-Suraj-27
95a77819db
fix: Fixed unknown pytest config option `doctest_glob` ( #32475 )
...
Fixed unknown config option doctest_glob.
2024-08-14 18:30:01 +01:00
Dina Suehiro Jones
6577c77d93
Update the distributed CPU training on Kubernetes documentation ( #32669 )
...
* Update the Kubernetes CPU training example
* Add namespace arg
Signed-off-by: Dina Suehiro Jones <dina.s.jones@intel.com>
---------
Signed-off-by: Dina Suehiro Jones <dina.s.jones@intel.com>
2024-08-14 09:36:43 -07:00
Yih-Dar
20a04497a8
Fix `JetMoeIntegrationTest` ( #32332 )
...
JetMoeIntegrationTest
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-08-14 16:22:06 +02:00
Jerry Zhang
78d78cdf8a
Add TorchAOHfQuantizer ( #32306 )
...
* Add TorchAOHfQuantizer
Summary:
Enable loading torchao quantized model in huggingface.
Test Plan:
local test
Reviewers:
Subscribers:
Tasks:
Tags:
* Fix a few issues
* style
* Added tests and addressed some comments about dtype conversion
* fix torch_dtype warning message
* fix tests
* style
* TorchAOConfig -> TorchAoConfig
* enable offload + fix memory with multi-gpu
* update torchao version requirement to 0.4.0
* better comments
* add torch.compile to torchao README, add perf number link
---------
Co-authored-by: Marc Sun <marc@huggingface.co>
2024-08-14 16:14:24 +02:00
Steven Liu
9485289f37
Update translation docs review ( #32662 )
...
update list of people to tag
2024-08-14 13:57:07 +02:00
Sai-Suraj-27
df323476a3
fix: Fixed failing tests in `tests/utils/test_add_new_model_like.py` ( #32678 )
...
* Fixed failing tests in tests/utils/test_add_new_model_like.py
* Fixed formatting using ruff.
* Small nit.
2024-08-14 12:06:17 +01:00
fmo-mt
a22ff36e0e
Support MUSA (Moore Threads GPU) backend in transformers ( #31913 )
...
Add accelerate version check, needs accelerate>=0.33.0
2024-08-13 21:10:25 -04:00
Pablo Montalvo
c1357834e8
Fix tests recurrent ( #32651 )
...
* add fix for recurrentgemma
* [no-filter]
* trigger-ci
* [no-filter]
* [no-filter]
* attempt to fix mysterious zip error
* [no-filter]
* fix lookup error
* [no-filter]
* remove summarization hack
* [no-filter]
2024-08-13 23:40:50 +02:00
Seungwoo Lee
9d2ab8824c
TF_Deberta supporting mixed precision ( #32618 )
...
* Update modeling_tf_deberta.py
Corrected code that did not support mixed precision
* Update modeling_tf_deberta_v2.py
Corrected code that did not support mixed precision
* Update modeling_tf_deberta_v2.py
* Update modeling_tf_deberta.py
* Add files via upload
* Add files via upload
2024-08-13 18:15:24 +01:00
Yoni Gozlan
5bcbdff159
Modify ProcessorTesterMixin for better generalization ( #32637 )
...
* Add padding="max_length" to tokenizer kwargs and change crop_size to size for image_processor kwargs
* remove crop_size argument in align processor tests to be coherent with base tests
* Add pad_token when loading tokenizer if needed, change test override tokenizer kwargs, remove unnecessary test overwrites in grounding dino
2024-08-13 11:48:53 -04:00
Sai-Suraj-27
c3cd9d807e
Fix: Fixed directory path for utils folder in `test_tokenization_utils.py` ( #32601 )
...
* Removed un-necessary expressions.
* Fixed directory path for utils folder in test_tokenization_utils.py
2024-08-13 16:48:15 +01:00
Bertrand Thia
cc25757a44
Add Depth Anything V2 Metric models ( #32126 )
...
* add checkpoint and repo names
* adapt head to support metric depth estimation
* add max_depth output scaling
* add expected logits
* improve docs
* fix docstring
* add checkpoint and repo names
* adapt head to support metric depth estimation
* add max_depth output scaling
* add expected logits
* improve docs
* fix docstring
* rename depth_estimation to depth_estimation_type
* add integration test
* Refactored tests to include metric depth model inference test
* Integration test pass when the timm backbone lines are commented (L220-L227)
* address feedback
* replace model path to use organization path
* formatting
* delete deprecated TODO
* address feedback
* [run_slow] depth_anything
2024-08-13 16:16:30 +02:00
Eric Hartford
481e15604a
Add support for GrokAdamW optimizer ( #32521 )
...
* add grokadamw
* reformat
* code review feedback, unit test
* reformat
* reformat
2024-08-13 13:20:28 +01:00
Fanli Lin
b5016d5de7
fix tensors on different devices in `WhisperGenerationMixin` ( #32316 )
...
* fix
* enable on xpu
* no manual remove
* move to device
* remove to
* add move to
2024-08-13 11:29:57 +01:00
Pablo Montalvo
a5a8291ad1
Fix tests ( #32649 )
...
* skip failing tests
* [no-filter]
* [no-filter]
* fix wording catch in FA2 test
* [no-filter]
* trigger normal CI without filtering
2024-08-13 09:46:21 +01:00
Lysandre Debut
29c3a0fa01
Automatically add `transformers` tag to the modelcard ( #32623 )
...
* Automatically add `transformers` tag to the modelcard
* Specify library_name and test
2024-08-13 07:59:01 +02:00
Raushan Turganbay
a29eabd0eb
Expand inputs in processors for VLMs ( #30962 )
...
* let it be
* draft
* should not have changed
* add warnings
* fix & add tests
* fix tests
* inputs embeds cannot be passed with pixels
* more updates
* paligemma ready!
* minor typos
* update blip-2
* fix tests & raise error
* docstring
* add blip2 test
* tmp
* add image seq length to config
* update docstring
* delete
* fix tests
* fix blip
* fix paligemma
* out-of-place scatter
* add llava-next-video
* Update src/transformers/models/blip_2/modeling_blip_2.py
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
* remove tmp
* codestyle
* nits
* more nits
* remove overriding in tests
* comprehension when merging video
* fix-copies
* revert changes for embeds test
* fix tests after making comprehension
* Update src/transformers/models/blip_2/processing_blip_2.py
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
* Update src/transformers/models/blip_2/processing_blip_2.py
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
* more updates
* fix tests
---------
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
2024-08-13 10:14:39 +05:00
Sai-Suraj-27
2a5a6ad18a
fix: Updated the `is_torch_mps_available()` function to include `min_version` argument ( #32545 )
...
* Fixed wrong argument in is_torch_mps_available() function call.
* Fixed wrong argument in is_torch_mps_available() function call.
* sorted the import.
* Fixed wrong argument in is_torch_mps_available() function call.
* Fixed wrong argument in is_torch_mps_available() function call.
* Update src/transformers/utils/import_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* removed extra space.
* Added type hint for the min_version parameter.
* Added missing import.
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-08-12 20:42:57 +01:00
Quentin Gallouédec
f1c8542ff7
"to be not" -> "not to be" ( #32636 )
...
* "to be not" -> "not to be"
* Update sam.md
* Update trainer.py
* Update modeling_utils.py
* Update test_modeling_utils.py
* Update test_modeling_utils.py
2024-08-12 20:20:17 +01:00
dependabot[bot]
126cbdb365
Bump tensorflow from 2.11.1 to 2.12.1 in /examples/research_projects/decision_transformer ( #32341 )
...
Bump tensorflow in /examples/research_projects/decision_transformer
Bumps [tensorflow](https://github.com/tensorflow/tensorflow ) from 2.11.1 to 2.12.1.
- [Release notes](https://github.com/tensorflow/tensorflow/releases )
- [Changelog](https://github.com/tensorflow/tensorflow/blob/master/RELEASE.md )
- [Commits](https://github.com/tensorflow/tensorflow/compare/v2.11.1...v2.12.1 )
---
updated-dependencies:
- dependency-name: tensorflow
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-12 19:57:07 +01:00
Sai-Suraj-27
ce4b28830a
fix: Fixed failing `test_find_base_model_checkpoint` ( #32638 )
...
Fixed failing test_find_base_model_checkpoint.
2024-08-12 19:51:30 +01:00
Ahnjj_DEV
7f777ab7d9
🌐 [i18n-KO] Translated `awq.md` to Korean ( #32324 )
...
* fix: manual edits
* Apply suggestions from code review
Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>
* fix:manual edits
- Created the translation file in the wrong path, so moved it
* Delete docs/source/ko/tasks/awq.md
* Update docs/source/ko/_toctree.yml
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-08-12 10:12:48 -07:00
YONGSANG
4996990d61
🌐 [i18n-KO] Translated `deepspeed.md` to Korean ( #32431 )
...
* Update _toctree.yml
* docs: ko: deepspeed.md
* Apply suggestions from code review
Co-authored-by: wony617 <49024958+Jwaminju@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: wony617 <49024958+Jwaminju@users.noreply.github.com>
* Update docs/source/ko/_toctree.yml
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ko/deepspeed.md
* Update docs/source/ko/deepspeed.md
Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>
* Apply suggestions from code review
Co-authored-by: wony617 <49024958+Jwaminju@users.noreply.github.com>
* Update docs/source/ko/_toctree.yml
---------
Co-authored-by: wony617 <49024958+Jwaminju@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>
2024-08-12 10:07:31 -07:00
Matt
b7ea171403
Cleanup tool calling documentation and rename doc ( #32337 )
...
* Rename "Templates for Chat Models" doc to "Chat Templates"
* Small formatting fix
* Small formatting fix
* Small formatting fix
* Cleanup tool calling docs as well
* Remove unneeded 'revision'
* Move tip to below main code example
* Little bonus section on template editing
2024-08-12 16:20:14 +01:00
dependabot[bot]
8a3c55eb21
Bump torch from 1.13.1 to 2.2.0 in /examples/research_projects/visual_bert ( #32220 )
...
Bump torch in /examples/research_projects/visual_bert
Bumps [torch](https://github.com/pytorch/pytorch ) from 1.13.1 to 2.2.0.
- [Release notes](https://github.com/pytorch/pytorch/releases )
- [Changelog](https://github.com/pytorch/pytorch/blob/main/RELEASE.md )
- [Commits](https://github.com/pytorch/pytorch/compare/v1.13.1...v2.2.0 )
---
updated-dependencies:
- dependency-name: torch
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-12 16:02:52 +01:00
dependabot[bot]
50837f2060
Bump aiohttp from 3.9.4 to 3.10.2 in /examples/research_projects/decision_transformer ( #32569 )
...
Bump aiohttp in /examples/research_projects/decision_transformer
Bumps [aiohttp](https://github.com/aio-libs/aiohttp ) from 3.9.4 to 3.10.2.
- [Release notes](https://github.com/aio-libs/aiohttp/releases )
- [Changelog](https://github.com/aio-libs/aiohttp/blob/master/CHANGES.rst )
- [Commits](https://github.com/aio-libs/aiohttp/compare/v3.9.4...v3.10.2 )
---
updated-dependencies:
- dependency-name: aiohttp
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-12 15:49:59 +01:00
Lucain
e31a7a2638
Fix `.push_to_hub(..., create_pr=True, revision="my-branch")` when creating PR on not-owned repo ( #32094 )
...
Fix create_pr against existing revision
2024-08-12 15:35:32 +01:00
Sai-Suraj-27
bd251e4955
fix: Fixed conditional check for `encodec` model names ( #32581 )
...
* Fixed conditional check for encodec model names.
* Reformatted conditional check.
2024-08-12 12:07:46 +01:00
Chaehong Jeong
342e3f9f20
Fix sliding window attention used in Gemma2FlashAttention2 ( #32522 )
...
* fix sliding window attention (flash2) in gemma2 model
* [run-slow] gemma
* fix slicing attention_mask for flash_attn2
* fix slicing attention_mask when flash_attn is used
* add missing comment
* slice the last seq_len tokens in the key, value states
* revert code of slicing key, value states
2024-08-12 11:18:15 +02:00
Raushan Turganbay
8f2b6d5e3d
Fix: FA2 with packed training ( #32487 )
...
* fix check
* add tests
* [run-slow] llama, gemma2
* oops, whisper actually runs but needed some special treatment
2024-08-12 13:40:07 +05:00
Younes Belkada
7c11491208
Add new model ( #32615 )
...
* v1 - working version
* fix
* fix
* fix
* fix
* rename to correct name
* fix title
* fixup
* rename files
* fix
* add copied from on tests
* rename to `FalconMamba` everywhere and fix bugs
* fix quantization + accelerate
* fix copies
* add `torch.compile` support
* fix tests
* fix tests and add slow tests
* copies on config
* merge the latest changes
* fix tests
* add few lines about instruct
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix
* fix tests
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-08-12 08:22:47 +02:00
wony617
48101cf8d1
🌐 [i18n-KO] Translated `agent.md` to Korean ( #32351 )
...
* docs: ko: main_classes/agent
* feat: chatgpt draft
* fix: manual edits
* fix: resolve suggestions
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: thsamaji <60818655+thsamajiki@users.noreply.github.com>
Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>
* fix: resolve suggestions
* fix: resolve code line number
---------
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: thsamaji <60818655+thsamajiki@users.noreply.github.com>
Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>
2024-08-09 09:58:52 -07:00
zhanweidu
e7f4ace092
fix non contiguous tensor value error in save_pretrained ( #32422 )
...
Signed-off-by: duzhanwei <duzhanwei@bytedance.com>
Co-authored-by: duzhanwei <duzhanwei@bytedance.com>
2024-08-09 12:59:43 +01:00
Arthur
e4522fe399
fix slow integration gemma2 test ( #32534 )
...
no empty revision
2024-08-09 11:28:22 +02:00