Commit Graph

19383 Commits

Author SHA1 Message Date
Arthur
13e645bb40
Allow-head-dim (#32857)
* support head dim

* fix the doc

* fixup

* add oproj

Co-authored-by: Suhara
<suhara@users.noreply.github.com>>

* update

Co-authored-by: bzantium <bzantium@users.noreply.github.com>

* Co-authored-by: suhara <suhara@users.noreply.github.com>

* Update

Co-authored-by: Yoshi Suhara <suhara@users.noreply.github.com>

---------

Co-authored-by: bzantium <bzantium@users.noreply.github.com>
Co-authored-by: Yoshi Suhara <suhara@users.noreply.github.com>
2024-08-20 10:24:48 +02:00
Matt
85345bb439
Add tip to clarify tool calling (#32883) 2024-08-19 18:37:35 +01:00
Sai-Suraj-27
37204848f1
Docs: Fixed whisper-large-v2 model link in docs (#32871)
Fixed whisper-large-v2 model link in docs.
2024-08-19 09:50:35 -07:00
Anton Vlasjuk
61d89c19d8
Fix: Mamba2 generation mismatch between input_ids and inputs_embeds (#32694)
* fix cache when using input embeddings

* simplify check, we can always add input ids seq len since its 0 in first pass
2024-08-19 16:06:07 +02:00
Younes Belkada
93e538ae2e
Mamba / FalconMamba: Fix mamba left padding (#32677)
* fix mamba left padding

* Apply suggestions from code review

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>

* fix copies

* test with `inputs_embeds`

* Update src/transformers/models/falcon_mamba/modeling_falcon_mamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* copies

* clairfy

* fix last comments

* remove

---------

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-08-19 16:01:35 +02:00
Isotr0py
59e8f1919c
Fix incorrect vocab size retrieval in GGUF config (#32551)
* fix gguf config vocab size

* minor fix

* link issue
2024-08-19 15:53:54 +02:00
Alan-Blanchet
5f6c080b62
RT-DETR parameterized batchnorm freezing (#32631)
* fix: Parameterized norm freezing

For the R18 model, the authors don't freeze norms in the backbone.

* Update src/transformers/models/rt_detr/configuration_rt_detr.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2024-08-19 14:50:57 +01:00
Yitong Huang
8a4857c0db
Support save/load ckpt for XLA FSDP (#32311)
* Support save/load ckpt for XLA FSDP

* Fix bug for save

* Fix style

* reserve sharded ckpt and better file naming

* minor fix

Co-authored-by: Zach Mueller <muellerzr@gmail.com>

* add is_fsdp_xla_v1_enabled

---------

Co-authored-by: Zach Mueller <muellerzr@gmail.com>
2024-08-19 15:44:21 +02:00
Aaron Chung
f1b720ed62
Add __repr__ for Conv1D (#32425)
* Add representation for Conv1D, for better output info.

* code format for Conv1D

* We add a __repr__ func for Conv1D, this allows the print (or output) of the model's info has a better description for Conv1D.
2024-08-19 15:26:19 +02:00
Fanli Lin
e55b33ceb4
[tests] make test_sdpa_can_compile_dynamic device-agnostic (#32519)
* enable

* fix
2024-08-19 12:46:59 +01:00
Ita Zaporozhets
54b7703682
support torch-speech (#32537) 2024-08-19 11:26:35 +02:00
Kamil Akesbi
8260cb311e
Add Descript-Audio-Codec model (#31494)
* dac model

* original dac works

* add dac model

* dac can be instatiated

* add forward pass

* load weights

* all weights are used

* convert checkpoint script ready

* test

* add feature extractor

* up

* make style

* apply cookicutter

* fix tests

* iterate on FeatureExtractor

* nit

* update dac doc

* replace nn.Sequential with nn.ModuleList

* nit

* apply review suggestions 1/2

* Update src/transformers/models/dac/modeling_dac.py

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* up

* apply review suggestions 2/2

* update padding in FeatureExtractor

* apply review suggestions

* iterate on design and tests

* add integration tests

* feature extractor tests

* make style

* all tests pass

* make style

* fixup

* apply review suggestions

* fix-copies

* apply review suggestions

* apply review suggestions

* Update docs/source/en/model_doc/dac.md

Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>

* Update docs/source/en/model_doc/dac.md

Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>

* anticipate transfer weights to descript

* up

* make style

* apply review suggestions

* update slow test values

* update slow tests

* update test values

* update with CI values

* update with vorace values

* update test with slice

* make style

---------

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
2024-08-19 10:21:51 +01:00
MAHIR DAIYAN
843e5e20ca
Add Flax Dinov2 (#31960)
* tfmsenv restored in main

* installed flax

* forward pass done and all tests passed

* make fix-copies and cleaning the scripts

* fixup attempt 1

* fixup attempt 2

* fixup third attempt

* fixup attempt 4

* fixup attempt 5

* dinov2 doc fixed

* FlaxDinov2Model + ForImageClassification added to OBJECTS_TO_IGNORE

* external pos_encoding layer removed

* fixup attempt 6

* fixed integration test values

* fixup attempt 7

* Update src/transformers/models/dinov2/modeling_flax_dinov2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/dinov2/modeling_flax_dinov2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/dinov2/modeling_flax_dinov2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/dinov2/modeling_flax_dinov2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/dinov2/modeling_flax_dinov2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/dinov2/modeling_flax_dinov2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/dinov2/modeling_flax_dinov2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/dinov2/modeling_flax_dinov2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/dinov2/modeling_flax_dinov2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/dinov2/modeling_flax_dinov2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/dinov2/modeling_flax_dinov2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/dinov2/modeling_flax_dinov2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/dinov2/modeling_flax_dinov2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/dinov2/modeling_flax_dinov2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/dinov2/modeling_flax_dinov2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/dinov2/modeling_flax_dinov2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* comments removed

* comment removed from the test

* fixup

* Update src/transformers/models/dinov2/modeling_flax_dinov2.py

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* new fixes 1

* interpolate_pos_encoding function removed

* droppath rng fixed, pretrained beit copied-from still not working

* modeling_flax_dinov2.py reformatted

* Update tests/models/dinov2/test_modeling_flax_dinov2.py

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* added Copied from, to the tests

* copied from statements removed from tests

* fixed copied from statements in the tests

* [run_slow] dinov2

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
2024-08-19 09:28:13 +01:00
Joao Gante
52cb4034ad
generate: missing to in DoLa body, causing exceptions in multi-gpu generation (#32856) 2024-08-17 16:37:00 +01:00
Alex Calderwood
6806d33567
Make beam_constraints.Constraint.advance() docstring more accurate (#32674)
* Fix beam_constraints.Constraint.advance() docstring

* Update src/transformers/generation/beam_constraints.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-08-16 19:36:55 +01:00
Zach Mueller
8ec028aded
Reduce the error log when using core models that need their weights renamed, and provide a step forward (#32656)
* Fin

* Modify msg

* Finish up nits
2024-08-16 13:05:57 -04:00
Marc Sun
1c36db697a
fix multi-gpu with static cache (#32543) 2024-08-16 19:02:37 +02:00
Zach Mueller
0b066bed14
Revert PR 32299, flag users when Zero-3 was missed (#32851)
Revert PR 32299
2024-08-16 12:35:41 -04:00
Zhan Rongrui
f20d0e81ea
improve _get_is_as_tensor_fns (#32596)
* improve _get_is_as_tensor_fns

* format
2024-08-16 15:59:44 +01:00
Yangshen⚡Deng
a27182b7fc
Fix AutoConfig and AutoModel support for Llava-Next-Video (#32844)
* Fix: fix all model_type of Llava-Next-Video to llava_next_video

* Fix doc for llava_next_video

* * Fix formatting issues
* Change llava-next-video.md file name into llava_next_video.md to make it compatible with implementation

* Fix docs TOC for llava-next-video
2024-08-16 12:41:05 +01:00
Joao Gante
cf32ee1753
Cache: use batch_size instead of max_batch_size (#32657)
* more precise name

* better docstrings

* Update src/transformers/cache_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-08-16 11:48:45 +01:00
Fanli Lin
8f9fa3b081
[tests] make test_sdpa_equivalence device-agnostic (#32520)
* fix on xpu

* [run_all]
2024-08-16 11:34:13 +01:00
Joao Gante
70d5df6107
Generate: unify LogitsWarper and LogitsProcessor (#32626) 2024-08-16 11:20:41 +01:00
Ao Tang
5fd7ca7bc9
Use head_dim if in config for RoPE (#32495)
* use head_dim if in config for RoPE

* typo

* simplify with getattr
2024-08-16 11:37:43 +02:00
Arthur
c215523528
add back the position ids (#32554)
* add back the position ids

* fix failing test
2024-08-16 11:00:05 +02:00
Raushan Turganbay
f3c8b18053
VLMs: small clean-up for cache class (#32417)
* fix beam search in video llava

* [run-slow] video_llava
2024-08-16 09:07:05 +05:00
muddlebee
d6751d91c8
fix: update doc link for runhouse in README.md (#32664) 2024-08-15 20:00:55 +01:00
Sai-Suraj-27
ab7e893d09
fix: Corrected falcon-mamba-7b model checkpoint name (#32837)
Corrected the model checkpoint.
2024-08-15 18:03:18 +01:00
jp
e840127370
reopen: llava-next fails to consider padding_side during Training (#32679)
restore #32386
2024-08-15 11:44:19 +01:00
Sai-Suraj-27
8820fe8b8c
Updated workflows to the latest versions (#32405)
Updated few workflows to the latest versions.
2024-08-14 20:18:14 +02:00
Zach Mueller
0cea2081a3
Unpin deepspeed in Docker image/tests (#32572)
Unpin deepspeed
2024-08-14 18:30:25 +01:00
Sai-Suraj-27
95a77819db
fix: Fixed unknown pytest config option doctest_glob (#32475)
Fixed unknown config option doctest_glob.
2024-08-14 18:30:01 +01:00
Dina Suehiro Jones
6577c77d93
Update the distributed CPU training on Kubernetes documentation (#32669)
* Update the Kubernetes CPU training example

* Add namespace arg

Signed-off-by: Dina Suehiro Jones <dina.s.jones@intel.com>

---------

Signed-off-by: Dina Suehiro Jones <dina.s.jones@intel.com>
2024-08-14 09:36:43 -07:00
Yih-Dar
20a04497a8
Fix JetMoeIntegrationTest (#32332)
JetMoeIntegrationTest

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-08-14 16:22:06 +02:00
Jerry Zhang
78d78cdf8a
Add TorchAOHfQuantizer (#32306)
* Add TorchAOHfQuantizer

Summary:
Enable loading torchao quantized model in huggingface.

Test Plan:
local test

Reviewers:

Subscribers:

Tasks:

Tags:

* Fix a few issues

* style

* Added tests and addressed some comments about dtype conversion

* fix torch_dtype warning message

* fix tests

* style

* TorchAOConfig -> TorchAoConfig

* enable offload + fix memory with multi-gpu

* update torchao version requirement to 0.4.0

* better comments

* add torch.compile to torchao README, add perf number link

---------

Co-authored-by: Marc Sun <marc@huggingface.co>
2024-08-14 16:14:24 +02:00
Steven Liu
9485289f37
Update translation docs review (#32662)
update list of people to tag
2024-08-14 13:57:07 +02:00
Sai-Suraj-27
df323476a3
fix: Fixed failing tests in tests/utils/test_add_new_model_like.py (#32678)
* Fixed failing tests in tests/utils/test_add_new_model_like.py

* Fixed formatting using ruff.

* Small nit.
2024-08-14 12:06:17 +01:00
fmo-mt
a22ff36e0e
Support MUSA (Moore Threads GPU) backend in transformers (#31913)
Add accelerate version check, needs accelerate>=0.33.0
2024-08-13 21:10:25 -04:00
Pablo Montalvo
c1357834e8
Fix tests recurrent (#32651)
* add fix for recurrentgemma

* [no-filter]

* trigger-ci

* [no-filter]

* [no-filter]

* attempt to fix mysterious zip error

* [no-filter]

* fix lookup error

* [no-filter]

* remove summarization hack

* [no-filter]
2024-08-13 23:40:50 +02:00
Seungwoo Lee
9d2ab8824c
TF_Deberta supporting mixed precision (#32618)
* Update modeling_tf_deberta.py

Corrected some codes which do not support mixed precision

* Update modeling_tf_deberta_v2.py

Corrected some codes which do not support mixed precision

* Update modeling_tf_deberta_v2.py

* Update modeling_tf_deberta.py

* Add files via upload

* Add files via upload
2024-08-13 18:15:24 +01:00
Yoni Gozlan
5bcbdff159
Modify ProcessorTesterMixin for better generalization (#32637)
* Add padding="max_length" to tokenizer kwargs and change crop_size to size for image_processor kwargs

* remove crop_size argument in align processor tests to be coherent with base tests

* Add pad_token when loading tokenizer if needed, change test override tokenizer kwargs, remove unnecessary test overwrites in grounding dino
2024-08-13 11:48:53 -04:00
Sai-Suraj-27
c3cd9d807e
Fix: Fixed directory path for utils folder in test_tokenization_utils.py (#32601)
* Removed un-necessary expressions.

* Fixed directory path for utils folder in test_tokenization_utils.py
2024-08-13 16:48:15 +01:00
Bertrand Thia
cc25757a44
Add Depth Anything V2 Metric models (#32126)
* add checkpoint and repo names

* adapt head to support metric depth estimation

* add max_depth output scaling

* add expected logits

* improve docs

* fix docstring

* add checkpoint and repo names

* adapt head to support metric depth estimation

* add max_depth output scaling

* add expected logits

* improve docs

* fix docstring

* rename depth_estimation to depth_estimation_type

* add integration test

* Refactored tests to include metric depth model inference test
* Integration test pass when the timm backbone lines are commented (L220-L227)

* address feedback

* replace model path to use organization path

* formatting

* delete deprecated TODO

* address feedback

* [run_slow] depth_anything
2024-08-13 16:16:30 +02:00
Eric Hartford
481e15604a
Add support for GrokAdamW optimizer (#32521)
* add grokadamw

* reformat

* code review feedback, unit test

* reformat

* reformat
2024-08-13 13:20:28 +01:00
Fanli Lin
b5016d5de7
fix tensors on different devices in WhisperGenerationMixin (#32316)
* fix

* enable on xpu

* no manual remove

* move to device

* remove to

* add move to
2024-08-13 11:29:57 +01:00
Pablo Montalvo
a5a8291ad1
Fix tests (#32649)
* skip failing tests

* [no-filter]

* [no-filter]

* fix wording catch in FA2 test

* [no-filter]

* trigger normal CI without filtering
2024-08-13 09:46:21 +01:00
Lysandre Debut
29c3a0fa01
Automatically add transformers tag to the modelcard (#32623)
* Automatically add `transformers` tag to the modelcard

* Specify library_name and test
2024-08-13 07:59:01 +02:00
Raushan Turganbay
a29eabd0eb
Expand inputs in processors for VLMs (#30962)
* let it be

* draft

* should not have changed

* add warnings

* fix & add tests

* fix tests

* ipnuts embeds cannot be passed with pixels

* more updates

* paligemma ready!

* minor typos

* update blip-2

* fix tests & raise error

* docstring

* add blip2 test

* tmp

* add image seq length to config

* update docstring

* delete

* fix tests

* fix blip

* fix paligemma

* out-of-place scatter

* add llava-next-video

* Update src/transformers/models/blip_2/modeling_blip_2.py

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>

* remove tmp

* codestyle

* nits

* more nits

* remove overriding in tests

* comprehension when merging video

* fix-copies

* revert changes for embeds test

* fix tests after making comprehension

* Update src/transformers/models/blip_2/processing_blip_2.py

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>

* Update src/transformers/models/blip_2/processing_blip_2.py

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>

* more updates

* fix tests

---------

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
2024-08-13 10:14:39 +05:00
Sai-Suraj-27
2a5a6ad18a
fix: Updated the is_torch_mps_available() function to include min_version argument (#32545)
* Fixed wrong argument in is_torch_mps_available() function call.

* Fixed wrong argument in is_torch_mps_available() function call.

* sorted the import.

* Fixed wrong argument in is_torch_mps_available() function call.

* Fixed wrong argument in is_torch_mps_available() function call.

* Update src/transformers/utils/import_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* removed extra space.

* Added type hint for the min_version parameter.

* Added missing import.

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-08-12 20:42:57 +01:00
Quentin Gallouédec
f1c8542ff7
"to be not" -> "not to be" (#32636)
* "to be not" -> "not to be"

* Update sam.md

* Update trainer.py

* Update modeling_utils.py

* Update test_modeling_utils.py

* Update test_modeling_utils.py
2024-08-12 20:20:17 +01:00