Commit Graph

19383 Commits

Author SHA1 Message Date
amyeroberts
aafa7ce72b
[DETR] Remove timm hardcoded logic in modeling files (#29038)
* Enable instantiating model with pretrained backbone weights

* Clarify pretrained import

* Use load_backbone instead

* Add backbone_kwargs to config

* Fix up

* Add tests

* Tidy up

* Enable instantiating model with pretrained backbone weights

* Update tests so backbone checkpoint isn't passed in

* Clarify pretrained import

* Update configs - docs and validation check

* Update src/transformers/utils/backbone_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Clarify exception message

* Update config init in tests

* Add test for when use_timm_backbone=True

* Use load_backbone instead

* Add use_timm_backbone to the model configs

* Add backbone_kwargs to config

* Pass kwargs to constructors

* Draft

* Fix tests

* Add back timm - weight naming

* More tidying up

* Whoops

* Tidy up

* Handle when kwargs are none

* Update tests

* Revert test changes

* Deformable detr test - don't use default

* Don't mutate; correct model attributes

* Add some clarifying comments

* nit - grammar is hard

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-04-26 16:55:24 +01:00
Zach Mueller
77ff304d29
Remove skipping logic now that set_epoch exists (#30501)
* Remove skipping logic now that set_epoch exists

* Working version, clean
2024-04-26 11:52:09 -04:00
JB (Don)
dfa7b580e9
[BERT] Add support for sdpa (#28802)
* Adding SDPA support for BERT

* Using the proper input name for testing model input in inference()

* Adding documentation for SDPA in BERT model page

* Use the stable link for the documentation

* Adding a gate to only call .contiguous() for torch < 2.2.0

* Additions and fixes to the documentation

* Minor updates to documentation

* Adding extra requirements needed for the contiguous() bug

* Adding "Adapted from" in plcae of the "Copied from"

* Add benchmark speedup tables to the documentation

* Minor fixes to the documentation

* Use ClapText as a replacemenet for Bert in the Copied-From

* Some more fixes for the fix-copies references

* Overriding the test_eager_matches_sdpa_generate in bert tests to not load with low_cpu_mem_usage

[test all]

* Undo changes to separate test

* Refactored SDPA self attention code for KV projections

* Change use_sdpa to attn_implementation

* Fix test_sdpa_can_dispatch_on_flash by preparing input (required for MultipleChoice models)
2024-04-26 16:23:44 +01:00
Matt
2de5cb12be
Use the Keras set_random_seed in tests (#30504)
Use the Keras set_random_seed to ensure reproducible weight initialization
2024-04-26 16:14:53 +01:00
Michael Goin
20081c743e
Update dtype_byte_size to handle torch.float8_e4m3fn/float8_e5m2 types (#30488)
* Update modeling_utils/dtype_byte_size to handle float8 types

* Add a test for dtype_byte_size

* Format

* Fix bool
2024-04-26 11:26:43 +01:00
kyo
59e715f71c
Fix the bitsandbytes error formatting ("Some modules are dispatched on ...") (#30494)
Fix the `bitsandbytes` error when some modules are not properly offloaded.
2024-04-26 10:13:52 +01:00
Younes Belkada
19cfdf0fac
FEAT: PEFT support for EETQ (#30449)
Update quantizer_eetq.py
2024-04-26 10:20:35 +02:00
Aaron Jimenez
a98c41798c
[docs] Spanish translation of pipeline_tutorial.md (#30252)
* add pipeline_webserver to es/

* add pipeline_webserver to es/, translate first section

* add comment for checking link

* translate pipeline_webserver

* edit pipeline_webserver

* fix typo
2024-04-25 12:18:06 -07:00
Younes Belkada
26ddc58047
Quantization: HfQuantizer quant method update (#30484)
ensure popular quant methods are supported
2024-04-25 21:09:28 +02:00
Matt
f39627125b
Add sidebar tutorial for chat models (#30401)
* Draft tutorial for talking to chat models

* Reformat lists and text snippets

* Cleanups and clarifications

* Finish up remaining TODOs

* Correct section link

* Small fix

* Add proper quantization examples

* Add proper quantization examples

* Add proper quantization examples

* Update docs/source/en/conversations.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/conversations.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/conversations.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/conversations.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/conversations.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/conversations.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/conversations.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/conversations.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/conversations.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/conversations.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/_toctree.yml

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/conversations.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Fix Text Generation Pipeline link and add a ref to the LLM inference guide

* intelligent -> capable

* Small intro cleanup

* Small text cleanup

* Small text cleanup

* Clarification about system message

* Clarification about system message

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-04-25 19:38:48 +01:00
Xuehai Pan
bc274a28a9
Do not use deprecated SourceFileLoader.load_module() in dynamic module loading (#30370) 2024-04-25 18:23:39 +02:00
Raushan Turganbay
e60491adc9
Fix Llava for 0-embeddings (#30473) 2024-04-25 20:28:51 +05:00
Zach Mueller
ad697f1801
Introduce Stateful Callbacks (#29666)
* Introduce saveable callbacks

* Add note

* Test for non-present and flag

* Support early stopping and refusing to train further

* Update docstring

* More saving

* Import oopsie

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Make it go through TrainerArguments

* Document

* Fix test

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Rework to allow for duplicates

* CLean

* Fix failing tests

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-04-25 11:00:09 -04:00
Zach Mueller
86f2569738
Make accelerate install non-torch dependent (#30463)
* Pin accelerate w/o eager

* Eager

* Update .circleci/create_circleci_config.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Expound

* Expound squared

* PyTorch -> dependency

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-04-25 09:37:55 -04:00
manju rangam
928331381e
Fix Issue #29817 Video Classification Task Guide Using Undeclared Variables (#30457)
* Fix issue #29817

Video Classification Task Guide Using Undeclared Variables

* Update docs/source/en/tasks/video_classification.md

updated with review comments

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Fix issue #29817

Add line space following PR comments

---------

Co-authored-by: manju-rangam <Manju1@Git>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-04-25 13:49:30 +01:00
Alexander Visheratin
7b1170b0fa
Add WSD scheduler (#30231)
* Added WSD scheduler.

* Added tests.

* Fixed errors.

* Fix formatting.

* CI fixes.
2024-04-25 12:07:21 +01:00
Yoach Lacombe
90cb55bf77
🚨 Add training compatibility for Musicgen-like models (#29802)
* first modeling code

* make repository

* still WIP

* update model

* add tests

* add latest change

* clean docstrings and copied from

* update docstrings md and readme

* correct chroma function

* correct copied from and remove unreleated test

* add doc to toctree

* correct imports

* add convert script to notdoctested

* Add suggestion from Sanchit

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* correct get_uncoditional_inputs docstrings

* modify README according to SANCHIT feedback

* add chroma to audio utils

* clean librosa and torchaudio hard dependencies

* fix FE

* refactor audio decoder -> audio encoder for consistency with previous musicgen

* refactor conditional -> encoder

* modify sampling rate logics

* modify license at the beginning

* refactor all_self_attns->all_attentions

* remove ignore copy from causallm generate

* add copied from for from_sub_models

* fix make copies

* add warning if audio is truncated

* add copied from where relevant

* remove artefact

* fix convert script

* fix torchaudio and FE

* modify chroma method according to feedback-> better naming

* refactor input_values->input_features

* refactor input_values->input_features and fix import fe

* add input_features to docstrigs

* correct inputs_embeds logics

* remove dtype conversion

* refactor _prepare_conditional_hidden_states_kwargs_for_generation ->_prepare_encoder_hidden_states_kwargs_for_generation

* change warning for chroma length

* Update src/transformers/models/musicgen_melody/convert_musicgen_melody_transformers.py

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* change way to save wav, using soundfile

* correct docs and change to soundfile

* fix import

* fix init proj layers

* add draft training

* fix cross entropy

* clean loss computation

* fix labels

* remove line breaks from md

* fix issue with docstrings

* add FE suggestions

* improve is in logics and remove useless imports

* remove custom from_pretrained

* simplify docstring code

* add suggestions for modeling tests

* make style

* update converting script with sanity check

* remove encoder attention mask from conditional generation

* replace musicgen melody checkpoints with official orga

* rename ylacombe->facebook in checkpoints

* fix copies

* remove unecessary warning

* add shape in code docstrings

* add files to slow doc tests

* fix md bug and add md to not_tested

* make fix-copies

* fix hidden states test and batching

* update training code

* add training tests for melody

* add training for o.g musicgen

* fix copied from

* remove final todos

* make style

* fix style

* add suggestions from review

* add ref to the original loss computation code

* rename method + fix labels in tests

* make style

---------

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
2024-04-25 12:51:19 +02:00
Tom Aarsen
ce5ae5a434
Prevent crash with WandbCallback with third parties (#30477)
* Use EAFP principle to prevent crash with third parties

* Remove leftover debugging code

* Add info-level logger message
2024-04-25 12:49:06 +02:00
amyeroberts
aca4a1037f
Don't run fp16 MusicGen tests on CPU (#30466) 2024-04-25 11:14:07 +01:00
amyeroberts
4fed29e3a4
Fix SigLip classification doctest (#30475)
* Fix SigLip classification doctest

* Remove extra line

* Update src/transformers/models/siglip/modeling_siglip.py
2024-04-25 11:13:53 +01:00
amyeroberts
30ee508c6c
Script for finding candidate models for deprecation (#29686)
* Add utility for finding candidate models for deprecation

* Better model filtering

* Update

* Add warning tip

* Fix up

* Review comments

* Filter requests based on tags

* Add copyright header
2024-04-25 10:10:01 +01:00
Arthur
c60749d6a6
[fix codellama conversion] (#30472)
* fix codellama conversion

* nit
2024-04-25 10:56:48 +02:00
Younes Belkada
e9b1635478
FIX / Workflow: Fix SSH workflow bug (#30474)
Update ssh-runner.yml
2024-04-25 10:36:54 +02:00
Younes Belkada
cd0cd12add
FIX / Workflow: Change tailscale trigger condition (#30471)
Update push-important-models.yml
2024-04-25 10:33:12 +02:00
Younes Belkada
cebb07262f
Workflow / ENH: Add SSH into our runners workflow (#30425)
* add SSH into our runners workflow

* fix

* fix

* fix

* use our previous approaches

* forward contrib credits from discussions

---------

Co-authored-by: Yih-Dar <ydshieh@users.noreply.github.com>
2024-04-25 10:23:40 +02:00
Yih-Dar
fbb41cd420
consistent job / pytest report / artifact name correspondence (#30392)
* better names

* run better names

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-04-24 22:32:42 +02:00
Zach Mueller
6ad9c8f743
Non blocking support to torch DL's (#30465)
* Non blocking support

* Check for optimization

* Doc
2024-04-24 16:24:23 -04:00
Zach Mueller
5c57463bde
Enable fp16 on CPU (#30459)
* Check removing flag for torch

* LLM oops

* Getting there...

* More discoveries

* Change

* Clean up and prettify

* Logic check

* Not
2024-04-24 15:38:52 -04:00
jeffhataws
d1d94d798f
Neuron: When save_safetensor=False, no need to move model to CPU (#29703)
save_safetensor=True is default as of release 4.35.0, which then
required TPU hotfix https://github.com/huggingface/transformers/pull/27799
(issue https://github.com/huggingface/transformers/issues/27578).
However, when the flag save_safetensor is set to False (compatibility mode),
moving the model to CPU causes generation of too many graphs
during checkpoint https://github.com/huggingface/transformers/issues/28438.
This PR disable moving of model to CPU when save_safetensor=False.
2024-04-24 18:22:08 +01:00
Arthur
661190b44d
[research_project] Most of the security issues come from this requirement.txt (#29977)
update most of decision transformers research project
2024-04-24 17:56:45 +02:00
Yih-Dar
d0d430f14a
Fix wrong indent in utils/check_if_new_model_added.py (#30456)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-04-24 17:44:12 +02:00
Gustavo de Rosa
c9693db2fc
Phi-3 (#30423)
* chore(root): Initial commit of Phi-3 files.

* fix(root): Fixes Phi-3 missing on readme.

* fix(root): Ensures files are consistent.

* fix(phi3): Fixes unit tests.

* fix(tests): Fixes style of phi-3 test file.

* chore(tests): Adds integration tests for Phi-3.

* fix(phi3): Removes additional flash-attention usage, .e.g, swiglu and rmsnorm.

* fix(phi3): Fixes incorrect docstrings.

* fix(phi3): Fixes docstring typos.

* fix(phi3): Adds support for Su and Yarn embeddings.

* fix(phi3): Improves according first batch of reviews.

* fix(phi3): Uses up_states instead of y in Phi3MLP.

* fix(phi3): Uses gemma rotary embedding to support torch.compile.

* fix(phi3): Improves how rotary embedding classes are defined.

* fix(phi3): Fixes inv_freq not being re-computed for extended RoPE.

* fix(phi3): Adds last suggestions to modeling file.

* fix(phi3): Splits inv_freq calculation in two lines.
2024-04-24 17:32:09 +02:00
Yih-Dar
42fed15c81
Add paths filter to avoid the chance of being triggered (#30453)
* trigger

* remove the last job

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-04-24 16:58:54 +02:00
Eduardo Pacheco
d26c14139c
[SegGPT] Fix loss calculation (#30421)
* Fixed main train issues

* Added loss test

* Update src/transformers/models/seggpt/modeling_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Added missing labels arg in SegGptModel forward

* Fixed typo

* Added slow test to test loss calculation

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-04-24 15:24:34 +01:00
Marc Sun
37fa1f654f
fix jamba slow foward for multi-gpu (#30418)
* fix jamba slow foward for multi-gpu

* remove comm

* oups

* style
2024-04-24 14:19:08 +02:00
Anton Vlasjuk
5d64ae9d75
fix uncaught init of linear layer in clip's/siglip's for image classification models (#30435)
* fix clip's/siglip's _init_weights to reflect linear layers in "for image classification"

* trigger slow tests
2024-04-24 13:03:30 +01:00
Fanli Lin
16c8e176f9
[tests] make test device-agnostic (#30444)
* make device-agnostic

* clean code
2024-04-24 11:21:27 +01:00
Arthur
9a4a119c10
[Llava] + CIs fix red cis and llava integration tests (#30440)
* nit

* nit and fmt skip

* fixup

* Update src/transformers/convert_slow_tokenizer.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* set to true

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-04-24 10:51:35 +02:00
Pavel Iakubovskii
767e351840
Fix YOLOS image processor resizing (#30436)
* Add test for square image that fails

* Fix for square images

* Extend test cases

* Fix resizing in tests

* Style fixup
2024-04-24 09:50:17 +01:00
Arthur
89c510d842
Add llama3 (#30334)
* nuke

* add co-author

* add co-author

* update card

* fixup and fix copies to please our ci

* nit fixup

* super small nits

* remove tokenizer_path from call to `write_model`

* always safe serialize by default

---------

Co-authored-by: pcuenca <pcuenca@users.noreply.github.com>
Co-authored-by: xenova <xenova@users.noreply.github.com>
2024-04-24 10:11:19 +02:00
Yih-Dar
fc34f842cc
New model PR needs green (slow tests) CI (#30341)
* You should not pass

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-04-24 09:52:55 +02:00
Lysandre Debut
c6bba94040
Remove mentions of models in the READMEs and link to the documentation page in which they are featured. (#30420)
* REAMDEs

* REAMDEs v2
2024-04-24 09:38:31 +02:00
Lysandre Debut
d4e92f1a21
Remove add-new-model in favor of add-new-model-like (#30424)
* Remove add-new-model in favor of add-new-model-like

* nits
2024-04-24 09:38:18 +02:00
Lysandre Debut
0eb8fbcdac
Remove task guides auto-update in favor of links towards task pages (#30429) 2024-04-24 09:38:10 +02:00
Arthur
e34da3ee3c
[LlamaTokenizerFast] Refactor default llama (#28881)
* push legacy to fast as well

* super strange

* Update src/transformers/convert_slow_tokenizer.py

* make sure we are BC

* fix Llama test

* nit

* revert

* more test

* style

* update

* small update w.r.t tokenizers

* nit

* don't split

* lol

* add a test for `add_prefix_space=False`

* fix gemma tokenizer as well

* update

* fix gemma

* nicer failures

* fixup

* update

* fix the example for legacy = False

* use `huggyllama/llama-7b` for the PR doctest

* nit

* use from_slow

* fix llama
2024-04-23 23:12:59 +02:00
Jiewen Tan
12c39e5693
Fix use_cache for xla fsdp (#30353)
* Fix use_cache for xla fsdp

* Fix linters
2024-04-23 18:01:35 +01:00
Steven Basart
b8b1e442e3
Rename torch.run to torchrun (#30405)
torch.run does not exist anywhere as far as I can tell.
2024-04-23 09:04:17 -07:00
Matt
696ededd2b
Remove old TF port docs (#30426)
* Remove old TF port guide

* repo-consistency

* Remove some translations as well for consistency

* Remove some translations as well for consistency
2024-04-23 16:06:20 +01:00
Yih-Dar
416fdbad7a
Fix LayoutLMv2 init issue and doctest (#30278)
* fix

* try suggestion

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-04-23 15:33:17 +02:00
Younes Belkada
d179b9dc78
FIX: re-add bnb on docker image (#30427)
Update Dockerfile
2024-04-23 15:32:54 +02:00