Commit Graph

1474 Commits

Author SHA1 Message Date
Ilyas Moutawwakil
fddbd3c13c
Fix pix2struct (#34374)
* fix

* fix and test use_cache test

* style

* remove atol
2024-10-28 11:24:56 +01:00
Joao Gante
186b8dc190
Tests: upgrade test_eager_matches_sdpa_generate (#34386)
2024-10-25 11:55:07 +01:00
Yoni Gozlan
940a6bd343
Use non nested images and batched text Idefics2/3 (#34222)
* add support for non nested images and add tests

* add tests error scenario

* fix style

* added single and no image to error tests
2024-10-24 20:00:13 -04:00
Cyril Vallez
4c6e0c9252
Correct the new defaults (#34377)
* Correct the new defaults

* CIs

* add check

* Update utils.py

* Update utils.py

* Add the max_length in generate test checking shape without passing length

* style

* CIs

* fix fx CI issue
2024-10-24 18:42:03 +02:00
Michael Benayoun
1c5918d910
Fix torch.fx issue related to the new loss_kwargs keyword argument (#34380)
* Fix FX

* Unskip tests
2024-10-24 18:34:28 +02:00
Raushan Turganbay
b29c24ff1e
CI: fix failures (#34371)
fix
2024-10-24 13:44:53 +02:00
Yih-Dar
c42b3223db
skip test_pipeline_depth_estimation temporarily (#34316)
skip

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-23 17:27:51 +02:00
Zach Mueller
d9f733625c
Enable Gradient Accumulation fix across all models + trainer fully in forward() (#34283)
* Enable grad accum fix across all models + trainer fully in forward()

* handle peft case

* Account for DDP: need to run scale tests

* Use accelerator state

* Quality

* Guard

* Experiment w/ only fairseq fix

* Fairseq only

* Revert multiply_grads fix

* Mult by grad accum to fully bring back solution

* Style

* Good to go now

* Skip fx tests for now

* Bookmark

* Working now
2024-10-23 11:24:57 -04:00
Yoni Gozlan
e7c3fa7f57
Fix continue_final_message for image-text-to-text chat templates (#34236)
* fix continue_final_message for vlms

* Add one test for vlms continue_final_message chat template
2024-10-22 11:57:44 -04:00
Guang Yang
c14ccbcd64
Olmo is ExecuTorch Compatible (#34181)
Co-authored-by: Guang Yang <guangyang@fb.com>
2024-10-22 15:53:01 +02:00
Guang Yang
7a08a772cc
Qwen2.5 is ExecuTorch Compatible (#34102)
Qwen2 is ExecuTorch Compatible

Co-authored-by: Guang Yang <guangyang@fb.com>
2024-10-22 15:52:23 +02:00
Alexandros Benetatos
c31a6ff474
Add post_process_depth_estimation to image processors and support ZoeDepth's inference intricacies (#32550)
* add colorize_depth and matplotlib availability check

* add post_process_depth_estimation for zoedepth + tests

* add post_process_depth_estimation for DPT + tests

* add post_process_depth_estimation in DepthEstimationPipeline & special case for zoedepth

* run `make fixup`

* fix import related error on tests

* fix more import related errors on test

* forgot some `torch` calls in declarations

* remove `torch` call in zoedepth tests that caused error

* updated docs for depth estimation

* small fix for `colorize` input/output types

* remove `colorize_depth`, fix various names, remove matplotlib dependency

* fix formatting

* run fixup

* different images for test

* update examples in `forward` functions

* fixed broken links

* fix output types for docs

* possible format fix inside `<Tip>`

* Readability related updates

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Readability related update

* cleanup after merge

* refactor `post_process_depth_estimation` to return dict; simplify ZoeDepth's `post_process_depth_estimation`

* rewrite dict merging to support python 3.8

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2024-10-22 15:50:54 +02:00
Raushan Turganbay
73d65e637b
T5 compile compatibility (#34089)
* this worked in normal generation, needs more tests

* fix almost all tests in t5

* nit

* longt5, umt5, mt5

* style

* udop, pix2struct

* more models

* fix some tests

* fix onnx tests

* tracing tests fixed

* compile enabled and tested for t5 models

* fix small bug in slow tests

* [run-slow] t5

* uncomment

* style

* update with new generation refactoring

* nit

* fix copies

* this is the fix, had to change t5 to fix copies

* update

* [run-slow] t5

* [run-slow] t5

* update

* add test for encoder only T5

* clean up after rebase

* fix pop2piano

* add comment

* style

* fix copies after rebase

* fix copies  missed this one
2024-10-22 08:23:53 +02:00
Raushan Turganbay
21d5025826
Attn implementation for composite models (#32238)
* first try

* codestyle

* idefics2 is happy

* [run-slow] llava, llava_next, video_llava, vipllava, llava_next_video, idefics, idefics2, kosmos2, fuyu, blip, blip_2, instructblip, instructblipvideo, paligemma

* fix-copies

* [run-slow] llava, llava_next, video_llava, vipllava, llava_next_video, idefics, idefics2, kosmos2, fuyu, blip, blip_2, instructblip, instructblipvideo

* blip-2 needs to init vision from config

* when was this removed O_o

* minor fix

* tests

* this way?

* tests

* model-agnostic code

* codestyle

* add tests for idefics

* modify general test for VLMs

* no generation test for vlm yet!

* no generation test here also

* warn in VIT-SDPA if output attn

* add more tests

* user can pass dict as attn impl

* repo consistency

* update

* musicgen

* no prints

* forgot speech enc-dec and clip

* how many composite models we have?

* musicgen melody is same as musicgen

* +siglip

* fix tests + add some more

* remove idefics custom overridden code

* make idefics2 automappable

* nits

* skip tests

* doctests

* Update src/transformers/models/idefics2/configuration_idefics2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/clip/test_modeling_clip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/idefics2/test_modeling_idefics2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/idefics2/test_modeling_idefics2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/configuration_utils.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* major update, no need for automap

* clean up

* add FA2 test

* more tests

* style

* skip tests

* why did these start failing now?

* no attributes for FA2 needed

* one tiny test

* address comment about FA2 false warning

* style

* add new models and resolve conflicts

* fix copies

* let it be this way for now, come back tomorrow to review

* some more fixes

* update

* more updates

* update

* fix copies

* style and tests

* another big update

* fix tests

* fix tests

* update

* another update

* fix tests

* fix copies

* fix tests

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-10-22 06:54:44 +02:00
Yoni Gozlan
a4122813d1
Add DetrImageProcessorFast (#34063)
* add fully functioning image_processing_detr_fast

* Create tensors on the correct device

* fix copies

* fix doc

* add tests equivalence cpu gpu

* fix doc en

* add relative imports and copied from

* Fix copies and nit
2024-10-21 09:05:05 -04:00
Raushan Turganbay
ca541bd4f4
Generation tests: don't rely on main input name (#34228)
* don't rely on main input name

* update
2024-10-21 10:00:14 +02:00
Cyril Vallez
6604764007
add Glm (#33823)
* Create modular_glm.py

* Update modular_glm.py

* Finalize architecture without all attentions

* Add all attentions modules

* Finalize modular

* Update given last version

* Last update

* Finalize model

* Finalize converter

* Update convert_glm_weights_to_hf.py

* style

* style

* Create __init__.py

* Add all inits

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Correct the rotary embeddings

* Remove apply_residual_connection_post_layernorm (always false)

* remove use_rms_norm (always true)

* remove past_layer_norm (always true)

* Update __init__.py

* Update config and license

* start adding tests and doc

* Add doc + style

* Update test_modeling_glm.py

* Add dummies

* Apply correct modeling

* Refactor attention to follow llama

* Update __init__.py

* Update convert_glm_weights_to_hf.py

* Correct bias

* remove linear_bias and pdrop (never used)

* apply modular

* Simplify converter

* remove dummies + style

* add model_input_names

* Add pretraining_tp to config for when eager attention is used

* Update modular to remove all pretraining_tp

* Update test_modeling_glm.py

* Update the __all__

* Update __all__

* Update __init__.py

* Update test_modeling_glm.py

* add revisions

* Add the correct repos and revisions

* style

* Update __init__.py

* update exports

* remove import of modular files

* style

* Apply Llama changes + refine converter

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* style

* Use new modular converter

* add pretrainedmodel to init

* style

* Update test_modeling_glm.py

* Move config outside modular to please CI about docstrings

* Add dummies to please CI

* Update glm.md

* Update glm.md
2024-10-18 17:41:12 +02:00
Arthur
b54109c746
Fix-red-ci (#34230)
* fix copies, skip fx for llama

* style

* re-fix copies

* last?

* style
2024-10-17 23:38:35 +02:00
Guang Yang
9470c00042
Llama3 and Llama2 are ExecuTorch compatible (#34101)
Llama3_1b and Llama2_7b are ExecuTorch compatible

Co-authored-by: Guang Yang <guangyang@fb.com>
2024-10-17 17:33:19 +02:00
Yoach Lacombe
9ba021ea75
Moshi integration (#33624)
* clean mimi commit

* some nits suggestions from Arthur

* make fixup

* first moshi WIP

* converting weights working + configuration + generation configuration

* finalize converting script - still missing tokenizer and FE and processor

* fix saving model w/o default config

* working generation

* use GenerationMixin instead of inheriting

* add delay pattern mask

* fix right order: moshi codes then user codes

* unconditional inputs + generation config

* get rid of MoshiGenerationConfig

* blank user inputs

* update convert script:fix conversion, add  tokenizer, feature extractor and bf16

* add and correct Auto classes

* update modeling code, configuration and tests

* make fixup

* fix some copies

* WIP: add integration tests

* add dummy objects

* propose better readability and code organisation

* update tokenization tests

* update docstrings, eval and modeling

* add .md

* make fixup

* add MoshiForConditionalGeneration to ignore Auto

* revert mimi changes

* re

* further fix

* Update moshi.md

* correct md formatting

* move prepare causal mask to class

* fix copies

* fix depth decoder causal

* fix and correct some tests

* make style and update .md

* correct config checkpoint

* Update tests/models/moshi/test_tokenization_moshi.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update tests/models/moshi/test_tokenization_moshi.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* make style

* Update src/transformers/models/moshi/__init__.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fixup

* change firm in copyrights

* update config with nested dict

* replace einsum

* make style

* change split to True

* add back split=False

* remove tests in convert

* Update tests/models/moshi/test_modeling_moshi.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* add default config repo + add model to FA2 docstrings

* remove logits float

* fix some tokenization tests and ignore some others

* make style tokenization tests

* update modeling with sliding window + update modeling tests

* [run-slow] moshi

* remove prepare for generation from CausalLM

* isort

* remove copied from

* ignore offload tests

* update causal mask and prepare 4D mask aligned with recent changes

* further test refine + add back prepare_inputs_for_generation for depth decoder

* correct conditional use of prepare mask

* update slow integration tests

* fix multi-device forward

* remove previous solution to device_map

* save_load is flaky

* fix generate multi-devices

* fix device

* move tensor to int

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Marc Sun <marc@huggingface.co>
2024-10-16 11:21:49 +02:00
Raushan Turganbay
d087165db0
IDEFICS: support inputs embeds (#34043)
* support embeds

* use cache from config

* style...

* fix tests after rebase
2024-10-16 09:25:26 +02:00
laurentd-lunit
0f49deacbf
[feat] LlavaNext add feature size check to avoid CUDA Runtime Error (#33608)
* [feat] add feature size check to avoid CUDA Runtime Error

* [minor] add error handling to all llava models

* [minor] avoid nested if else

* [minor] add error message to Qwen2-vl and chameleon

* [fix] token dimension for check

* [minor] add feature dim check for videos too

* [fix] dimension check

* [fix] test reference values

---------

Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
2024-10-15 16:19:18 +02:00
Prakarsh Kaushik
293e6271c6
Add sdpa for Vivit (#33757)
* chore:add sdpa to vivit

* fix:failing slow test_inference_interpolate_pos_encoding(failing on main branch too)

* chore:fix nits

* ci:fix repo consistency failure

* chore:add info and benchmark to model doc

* [run_slow] vivit

* chore:revert interpolation test fix for new issue

* [run_slow] vivit

* [run_slow] vivit

* [run_slow] vivit

* chore:add fallback for output_attentions being True

* [run_slow] vivit

* style:make fixup

* [run_slow] vivit
2024-10-15 11:27:54 +02:00
Raushan Turganbay
23874f5948
Idefics: enable generation tests (#34062)
* add idefics

* conflicts after merging main

* enable tests but need to fix some

* fix tests

* no print

* fix/skip some slow tests

* continue not skip

* rebasing broken smth, this is the fix
2024-10-15 11:17:14 +02:00
Anton Vlasjuk
7434c0ed21
Mistral-related models for QnA (#34045)
* mistral qna start

* mixtral qna

* oops

* qwen2 qna

* qwen2moe qna

* add missing input embed methods

* add copied to all methods, can't directly from llama due to the prefix

* make top level copied from
2024-10-14 08:53:32 +02:00
Yih-Dar
7b06473b8f
avoid many failures for ImageGPT (#34071)
* skip

* [run-slow] imagegpt

* skip

* [run-slow] imagegpt

* [run-slow] imagegpt,video_llava

* skip

* [run-slow] imagegpt,video_llava

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-11 15:24:01 +02:00
Yoach Lacombe
9dca0c9116
Fix DAC slow tests (#34088)
* Fix DAC slow tests and fix decode

* [run-slow] dac
2024-10-11 14:43:03 +02:00
Joao Gante
e878eaa9fc
Tests: upcast logits to float() (#34042)
upcast
2024-10-11 11:51:49 +01:00
Guang Yang
7d97cca8dd
Generate using exported model and enable gemma2-2b in ExecuTorch (#33707)
* Generate using exported model and enable gemma2-2b in ExecuTorch

* [run_slow] gemma, gemma2

* truncate expected output message

* Bump required torch version to support gemma2 export

* [run_slow] gemma, gemma2

---------

Co-authored-by: Guang Yang <guangyang@fb.com>
2024-10-11 10:16:31 +02:00
Pavel Iakubovskii
8363fd8346
Update Blip2 is_pipeline_test_to_skip method signature (#34067)
Update method signature
2024-10-10 16:32:08 +01:00
Mohamed Abu El-Nasr
4a3f1a686f
check if eigenvalues of covariance matrix are complex. (#34037)
check if eigenvalues of covariance complex for psd checking
2024-10-10 14:44:05 +02:00
Raushan Turganbay
adea67541a
Phi3: fix attn for sliding window (#33586)
* fix phi3 attn for sliding window

* fix tests

* address most comment

* style

* update after rebase

* add more models

* fix tests
2024-10-10 11:50:39 +02:00
Avishai Elmakies
a265600c60
add sdpa to OPT (#33298)
* add sdpa to OPT

* chore: remove redundant whitespace in OPTDecoder class

* fixup

* bug fix

* add sdpa and attention generate test

* fixup

* Refactor OPTAttention forward method for improved readability and maintainability

* undo refactor for _shape and key,val states

* add OPT to doc, fixup didn't find it for some reason

* change order

* change default attn_implementation in testing to eager

* [run-slow] opt

* change test_eager_matches_sdpa_generate to the one llama

* Update default attention implementation in testing common

* [run-slow] opt

* remove unneeded print

* [run-slow] opt

* refactor model testers to have attn_implementation="eager"

* [run-slow] opt

* convert test_eager_matches_sdpa_generate to opt-350M

* bug fix when creating mask for opt

* [run-slow] opt

* if layer head mask default to eager

* if head mask is not none fall to eager

* [run-slow] opt

* Update src/transformers/models/opt/modeling_opt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Clean up Unpack imports (#33631)

clean up Unpack imports

* Fix DPT /Dinov2 sdpa regression on main (#33660)

* fallback to eager if output attentions.

* fix copies

* handle dependency errors in check_imports (#33622)

* handle dependency errors in check_imports

* change log level to warning

* add back self.max_position_embeddings = config.max_position_embeddings (#33550)

* add back self.max_position_embeddings = config.max_position_embeddings

* fix-copies

* Fix Llava conversion for LlavaQwen2ForCausalLM with Clip vision tower (#33613)

fix llavaqwen2 model conversion

* Uniformize kwargs for Udop processor and update docs (#33628)

* Add optional kwargs and uniformize udop

* cleanup Unpack

* nit Udop

* Generation: deprecate `PreTrainedModel` inheriting from `GenerationMixin`  (#33203)

* Enable BNB multi-backend support (#31098)

* enable cpu bnb path

* fix style

* fix code style

* fix 4 bit path

* Update src/transformers/utils/import_utils.py

Co-authored-by: Aarni Koskela <akx@iki.fi>

* add multi backend refactor tests

* fix style

* tweak 4bit quantizer + fix corresponding tests

* tweak 8bit quantizer + *try* fixing corresponding tests

* fix dequant bnb 8bit

* account for Intel CPU in variability of expected outputs

* enable cpu and xpu device map

* further tweaks to account for Intel CPU

* fix autocast to work with both cpu + cuda

* fix comments

* fix comments

* switch to testing_utils.torch_device

* allow for xpu in multi-gpu tests

* fix tests 4bit for CPU NF4

* fix bug with is_torch_xpu_available needing to be called as func

* avoid issue where test reports attr err due to other failure

* fix formatting

* fix typo from resolving of merge conflict

* polish based on last PR review

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* fix CI

* Update src/transformers/integrations/integration_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/integrations/integration_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix error log

* fix error msg

* add \n in error log

* make quality

* rm bnb cuda restriction in doc

* cpu model don't need dispatch

* fix doc

* fix style

* check cuda available in testing

* fix tests

* Update docs/source/en/model_doc/chameleon.md

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update docs/source/en/model_doc/llava_next.md

Co-authored-by: Aarni Koskela <akx@iki.fi>

* Update tests/quantization/bnb/test_4bit.py

Co-authored-by: Aarni Koskela <akx@iki.fi>

* Update tests/quantization/bnb/test_4bit.py

Co-authored-by: Aarni Koskela <akx@iki.fi>

* fix doc

* fix check multibackends

* fix import sort

* remove check torch in bnb

* docs: update bitsandbytes references with multi-backend info

* docs: fix small mistakes in bnb paragraph

* run formatting

* revert bnb check

* move bnb multi-backend check to import_utils

* Update src/transformers/utils/import_utils.py

Co-authored-by: Aarni Koskela <akx@iki.fi>

* fix bnb check

* minor fix for bnb

* check lib first

* fix code style

* Revert "run formatting"

This reverts commit ac108c6d6b.

* fix format

* give warning when bnb version is low and no cuda found

* fix device assignment check to be multi-device capable

* address akx feedback on get_avlbl_dev fn

* revert partially, as we don't want the function that public, as docs would be too much (enforced)

---------

Co-authored-by: Aarni Koskela <akx@iki.fi>
Co-authored-by: Titus von Koeller <9048635+Titus-von-Koeller@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Fix error string after refactoring into get_chat_template (#33652)

* Fix error string after refactoring into get_chat_template

* Take suggestion from CR

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* uniformize git processor (#33668)

* uniformize git processor

* update docstring

* Modular `transformers`: modularity and inheritance for new model additions (#33248)

* update example

* update

* push the converted diff files for testing and ci

* correct one example

* fix class attributes and docstring

* nits

* oups

* fixed config!

* update

* nitd

* class attributes are not matched against the other, this is missing

* fixed overwriting self.xxx now onto the attributes I think

* partial fix, now order with docstring

* fix docstring order?

* more fixes

* update

* fix missing docstrings!

* examples don't all work yet

* fixup

* nit

* updated

* hick

* update

* delete

* update

* update

* update

* fix

* all default

* no local import

* fix more diff

* some fix related to "safe imports"

* push fixed

* add helper!

* style

* add a check

* all by default

* add the

* update

* FINALLY!

* nit

* fix config dependencies

* man that is it

* fix fix

* update diffs

* fix the last issue

* re-default to all

* all the fixes

* nice

* fix properties vs setter

* fixup

* updates

* update dependencies

* make sure to install what needs to be installed

* fixup

* quick fix for now

* fix!

* fixup

* update

* update

* updates

* whitespaces

* nit

* fix

* simplify everything, and make it file agnostic (should work for image processors)

* style

* finish fixing all import issues

* fixup

* empty modeling should not be written!

* Add logic to find who depends on what

* update

* cleanup

* update

* update gemma to support positions

* some small nits

* this is the correct docstring for gemma2

* fix merging of docstrings

* update

* fixup

* update

* take doc into account

* styling

* update

* fix hidden activation

* more fixes

* final fixes!

* fixup

* fixup instruct  blip video

* update

* fix bugs

* align gemma2 with the rest as well

* updates

* revert

* update

* more reversion

* grind

* more

* arf

* update

* order will matter

* finish del stuff

* update

* rename to modular

* fixup

* nits

* update makefile

* fixup

* update order of the checks!

* fix

* fix docstring that has a call inside

* fix conversion check

* style

* add some initial documentation

* update

* update doc

* some fixup

* updates

* yups

* Mostly todo gimme a minute

* update

* fixup

* revert some stuff

* Review docs for the modular transformers (#33472)

Docs

* good update

* fixup

* mmm current updates lead to this code

* okay, this fixes it

* cool

* fixes

* update

* nit

* updates

* nits

* fix doc

* update

* revert bad changes

* update

* updates

* proper update

* update

* update?

* up

* update

* cool

* nits

* nits

* bon bon

* fix

* ?

* minimise changes

* update

* update

* update

* updates?

* fixed gemma2

* kind of a hack

* nits

* update

* remove `diffs` in favor of `modular`

* fix make fix copies

---------

Co-authored-by: Lysandre Debut <hi@lysand.re>

* Fix CIs post merging modular transformers (#33681)

update

* Fixed docstring for cohere model regarding unavailability of prune_he… (#33253)

* Fixed docstring for cohere model regarding unavailability of prune_head() methods

The docstring mentions that cohere model supports prune_heads() methods. I have fixed the docstring by explicitly mentioning that it doesn't support that functionality.

* Update src/transformers/models/cohere/modeling_cohere.py

---------

Co-authored-by: Lysandre Debut <hi@lysand.re>

* Generation tests: update imagegpt input name, remove unused functions (#33663)

* Improve Error Messaging for Flash Attention 2 on CPU (#33655)

Update flash-attn error message on CPU

Rebased to latest branch

* Gemma2: fix config initialization (`cache_implementation`) (#33684)

* Fix ByteLevel alphabet missing when Sequence pretokenizer is used (#33556)

* Fix ByteLevel alphabet missing when Sequence pretokenizer is used

* Fixed formatting with `ruff`.

* Uniformize kwargs for image-text-to-text processors (#32544)

* uniformize FUYU processor kwargs

* Uniformize instructblip processor kwargs

* Fix processor kwargs and tests Fuyu, InstructBlip, Kosmos2

* Uniformize llava_next processor

* Fix save_load test for processor with chat_template only as extra init args

* Fix import Unpack

* Fix Fuyu Processor import

* Fix FuyuProcessor import

* Fix FuyuProcessor

* Add defaults for specific kwargs kosmos2

* Fix Udop to return BatchFeature instead of BatchEncoding and uniformize kwargs

* Add tests processor Udop

* remove Copied from in processing Udop as change of input orders caused by BatchEncoding -> BatchFeature

* Fix overwrite tests kwargs processors

* Add warnings and BC for changes in processor inputs order, change docs, add BC for text_pair as arg for Udop

* Fix processing test fuyu

* remove unnecessary pad_token check in instructblip ProcessorTest

* Fix BC tests and cleanup

* Fix imports fuyu

* Uniformize Pix2Struct

* Fix wrong name for FuyuProcessorKwargs

* Fix slow tests reversed inputs align fuyu llava-next, change udop warning

* Fix wrong logging import udop

* Add check images text input order

* Fix copies

* change text pair handling when positional arg

* rebase on main, fix imports in test_processing_common

* remove optional args and udop uniformization from this PR

* fix failing tests

* remove unnecessary test, fix processing utils and test processing common

* cleanup Unpack

* cleanup

* fix conflict grounding dino

* 🚨🚨 Setting default behavior of assisted decoding (#33657)

* tests: fix pytorch tensor placement errors (#33485)

This commit fixes the following errors:
* Fix "expected all tensors to be on the same device" error
* Fix "can't convert device type tensor to numpy"

According to pytorch documentation torch.Tensor.numpy(force=False)
performs conversion only if tensor is on CPU (plus few other restrictions)
which is not the case. For our case we need force=True since we just
need a data and don't care about tensors coherency.

Fixes: #33517
See: https://pytorch.org/docs/2.4/generated/torch.Tensor.numpy.html

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

* bump tokenizers, fix added tokens fast (#32535)

* update based on tokenizers release

* update

* nits

* update

* revert re addition

* don't break that yet

* fmt

* revert unwanted

* update tokenizers version

* update dep table

* update

* update in conversion script as well

* some fix

* revert

* fully revert

* fix training

* remove set trace

* fixup

* update

* update

* [Pixtral] Improve docs, rename model (#33491)

* Improve docs, rename model

* Fix style

* Update repo id

* fix code quality after merge

* HFQuantizer implementation for compressed-tensors library (#31704)

* Add compressed-tensors HFQuantizer implementation

* flag serializable as False

* run

* revive lines deleted by ruff

* fixes to load+save from sparseml, edit config to quantization_config, and load back

* address satrat comment

* compressed_tensors to compressed-tensors and revert back is_serializable

* rename quant_method from sparseml to compressed-tensors

* tests

* edit tests

* clean up tests

* make style

* cleanup

* cleanup

* add test skip for when compressed tensors is not installed

* remove pydantic import + style

* delay torch import in test

* initial docs

* update main init for compressed tensors config

* make fix-copies

* docstring

* remove fill_docstring

* Apply suggestions from code review

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* review comments

* review comments

* comments - suppress warnings on state dict load, tests, fixes

* bug-fix - remove unnecessary call to apply quant lifecycle

* run_compressed compatibility

* revert changes not needed for compression

* no longer need unexpected keys fn

* unexpected keys not needed either

* Apply suggestions from code review

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* add to_diff_dict

* update docs and expand testing

* Update _toctree.yml with compressed-tensors

* Update src/transformers/utils/quantization_config.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* update doc

* add note about saving a loaded model

---------

Co-authored-by: George Ohashi <george@neuralmagic.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Sara Adkins <sara@neuralmagic.com>
Co-authored-by: Sara Adkins <sara.adkins65@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Dipika Sikka <ds3822@columbia.edu>
Co-authored-by: Dipika <dipikasikka1@gmail.com>

* update model card for opt

* add batch size to inference table

* [slow-run] opt

* [run-slow] opt

---------

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Co-authored-by: Avishai Elmakies <avishai.elma@cs.huji.ac.il>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
Co-authored-by: chengchengpei <5881383+chengchengpei@users.noreply.github.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Aarni Koskela <akx@iki.fi>
Co-authored-by: Titus von Koeller <9048635+Titus-von-Koeller@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Tibor Reiss <75096465+tibor-reiss@users.noreply.github.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
Co-authored-by: Lysandre Debut <hi@lysand.re>
Co-authored-by: Muhammad Naufil <m.naufil1@gmail.com>
Co-authored-by: sizhky <yyeshr@gmail.com>
Co-authored-by: Umar Butler <umar@umar.au>
Co-authored-by: Jonathan Mamou <jonathan.mamou@intel.com>
Co-authored-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: Benjamin Fineran <bfineran@users.noreply.github.com>
Co-authored-by: George Ohashi <george@neuralmagic.com>
Co-authored-by: Sara Adkins <sara@neuralmagic.com>
Co-authored-by: Sara Adkins <sara.adkins65@gmail.com>
Co-authored-by: Dipika Sikka <ds3822@columbia.edu>
Co-authored-by: Dipika <dipikasikka1@gmail.com>
2024-10-10 11:49:34 +02:00
Pavel Iakubovskii
48461c0fe2
Make pipeline able to load processor (#32514)
* Refactor get_test_pipeline

* Fixup

* Fixing tests

* Add processor loading in tests

* Restructure processors loading

* Add processor to the pipeline

* Move model loading on top of the test

* Update `get_test_pipeline`

* Fixup

* Add class-based flags for loading processors

* Change `is_pipeline_test_to_skip` signature

* Skip t5 failing test for slow tokenizer

* Fixup

* Fix copies for T5

* Fix typo

* Add try/except for tokenizer loading (kosmos-2 case)

* Fixup

* Llama not fails for long generation

* Revert processor pass in text-generation test

* Fix docs

* Switch back to json file for image processors and feature extractors

* Add processor type check

* Remove except for tokenizers

* Fix docstring

* Fix empty lists for tests

* Fixup

* Fix load check

* Ensure we have non-empty test cases

* Update src/transformers/pipelines/__init__.py

Co-authored-by: Lysandre Debut <hi@lysand.re>

* Update src/transformers/pipelines/base.py

Co-authored-by: Lysandre Debut <hi@lysand.re>

* Rework comment

* Better docs, add note about pipeline components

* Change warning to error raise

* Fixup

* Refine pipeline docs

---------

Co-authored-by: Lysandre Debut <hi@lysand.re>
2024-10-09 16:46:11 +01:00
Raushan Turganbay
5ee52ae0bc
Mllama: fix tests (#34000)
* fix tests

* don't need this

* style
2024-10-09 14:02:56 +02:00
Joao Gante
295a90cb40
Generate: remove most decoder-only LLMs prepare_inputs_for_generation (#33870) 2024-10-09 12:15:48 +01:00
Mohamed Abu El-Nasr
cdee5285ca
Fix Failed tests with mobile bert resize tokens embedding (#33950)
* Fix Failed tests with mobile bert

* Cast to the correct dtype

* Code fixup

* Fix padding_idx larger than embedding_size

* Reduce covariance more. use 1e-7 instead of 1e-5

* Comment fix

* Reduce covariance more. use 1e-9 instead of 1e-7

* Copy new config

* all but MRA fixed

* fix mra

* very flaky

* skip instead

* make fixup

---------

Co-authored-by: Joao Gante <joao@huggingface.co>
2024-10-09 11:23:50 +01:00
Yoni Gozlan
e2001c3413
Add auto model for image-text-to-text (#32472)
* Add Auto model for image-text-to-text

* Remove donut from processing auto, add chameleon to image text to text models

* add qwen2_vl and llava_onevision

* add pixtral to auto model for image-text-to-text

* add mllama and idefics3

* remove models in IGNORE_NON_AUTO_CONFIGURED

* add AutoModelForImageTextToText to tests and doc
2024-10-08 14:26:43 +02:00
Arthur
736c7cde51
[pytest collection] Fix flax test collection (#34004)
bit weird but to filter I had to use this
2024-10-07 18:11:13 +02:00
Arthur
9b4b0c07db
[Red CIs] Fix hub failures (#34001)
maybe setup should work?
2024-10-07 10:56:24 +02:00
TomLim
1bd604d11c
[WIP] Add Tokenizer for MyT5 Model (#31286)
* Initial commit for MyT5 model

* custom implementation of MyT5 tokenizer, unused files deleted

* unittest for myt5 tokenizer

* update of import structure and style

* removed remnants of MyT5Config

* fixed docstrings

* Updates after review: filled documentation file, new docstrings and tests added

* Fixed code style issues

* fixed copied from to refer to function

* updated loading myt5 tokenizer in tests, added sample byte map file to fixtures

* changes after review

* removed redundant copied from

* removed redundant copied from

* optimization and loading model from hf

* [run_slow] myt5

* [run-slow] myt5

* Updated en documentation for myt5

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-10-06 10:33:16 +02:00
Yehoshua Cohen
56be9f1925
add test for Jamba with new model jamba-tiny-dev (#33863)
* add test for jamba with new model

* ruff fix

---------

Co-authored-by: Yehoshua Cohen <yehoshuaco@ai21.com>
2024-10-05 16:03:12 +02:00
Raushan Turganbay
612065efeb
Paligemma: fix static cache test (#33941)
* fix

* not flaky anymore + style
2024-10-05 09:47:37 +02:00
Joao Gante
38f9f10dd9
Cache: revert DynamicCache init for BC (#33861)
* tmp commit

* tmp commit

* make fixup

* missing removal

* fix condition

* fix end-to-end compilation

* if -> elif

* BC

* BC

* use @deprecate_kwarg("num_hidden_layers", version="4.47.0")

* wups the import

* 🥴

---------

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
2024-10-04 22:47:08 +02:00
Arthur
f92d354823
fix red check-copies (#33964) 2024-10-04 22:45:37 +02:00
pglorio
f319ba16fa
Add Zamba (#30950)
* Update index.md

* Rebase

* Rebase

* Updates from make fixup

* Update zamba.md

* Batched inference

* Update

* Fix tests

* Fix tests

* Fix tests

* Fix tests

* Update docs/source/en/model_doc/zamba.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/en/model_doc/zamba.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update configuration_zamba.py

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update modeling_zamba.py

* Update modeling_zamba.py

* Update modeling_zamba.py

* Update configuration_zamba.py

* Update modeling_zamba.py

* Update modeling_zamba.py

* Merge branch 'main' of https://github.com/Zyphra/transformers_zamba

* Update ZambaForCausalLM

* Update ZambaForCausalLM

* Describe diffs with original mamba layer

* Moved mamba init into `_init_weights`

* Update index.md

* Rebase

* Rebase

* Updates from make fixup

* Update zamba.md

* Batched inference

* Update

* Fix tests

* Fix tests

* Fix tests

* Fix tests

* Update docs/source/en/model_doc/zamba.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/en/model_doc/zamba.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update configuration_zamba.py

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update modeling_zamba.py

* Update modeling_zamba.py

* Update modeling_zamba.py

* Update configuration_zamba.py

* Update modeling_zamba.py

* Update modeling_zamba.py

* Merge branch 'main' of https://github.com/Zyphra/transformers_zamba

* Update ZambaForCausalLM

* Moved mamba init into `_init_weights`

* Update ZambaForCausalLM

* Describe diffs with original mamba layer

* make fixup fixes

* quality test fixes

* Fix Zamba model path

* circleci fixes

* circleci fixes

* circleci fixes

* circleci fixes

* circleci fixes

* circleci fixes

* circleci fixes

* circleci fixes

* circleci fixes

* Update

* circleci fixes

* fix zamba test from merge

* fix ValueError for disabling mamba kernels

* add HF copyright

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* shared_transf --> shared_transformer

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Fixes

* Move attention head dim to config

* Fix circle/ci tests

* Update modeling_zamba.py

* apply GenerationMixin inheritance change from upstream

* apply import ordering

* update needed transformers version for zamba

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* add contribution author

* add @slow to avoid CI

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Define attention_hidden_size

* Added doc for attention_head_size

* trigger CI

* Fix doc of attention_hidden_size

* [run-slow] zamba

* Fixed shared layer logic, swapped up<->gate in mlp

* shared_transformer -> shared_transf

* reformat HybridLayer __init__

* fix docstrings in zamba config

* added definition of _get_input_ids_and_config

* fixed formatting of _get_input_ids_and_config

---------

Co-authored-by: root <root@node-4.us-southcentral1-a.compute.internal>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: root <root@node-1.us-southcentral1-a.compute.internal>
Co-authored-by: Quentin Anthony <qganthony@yahoo.com>
2024-10-04 22:28:05 +02:00
Amit Garg
e3775539c8
PhiMoE (#33363)
* onboard phimoe model

* removed debug code

* added unit tests

* updated docs

* formatted

* fixed unit tests

* fixed test case

* fixed format

* refactored code

* fixed expected outputs in the integration tests

* Added a warning msg

* Addressed comments

* Addressed comments

* fixed test cases

* added paper link

* Addressed comments

* Refactored PhimoeForCausalLM forward fn

* Refactored PhimoeRotaryEmbedding class

* fixed test cases

* fixed testcase

* fixed test case

* Addressed comments

* fixed test cases

* fixed testcases

* Used cache position instead to get the seq len
2024-10-04 21:39:45 +02:00
Longjie Zheng
0d1692a49b
Fix attn mask ignore logic in training-time trace (#32613)
* fix attn mask logic for training-time trace

* add test

* fix

* fix

* fix

* fix

* fix

* format

* [run-slow] llama

* avoid accelerate

* [run-slow] llama
2024-10-04 19:00:45 +02:00
Yoach Lacombe
124713c32b
Fix distil whisper segment computation (#33920)
* Fix distil whisper segment computation

* [run-slow] whisper
2024-10-04 11:18:01 +02:00
Yoni Gozlan
074aa3b3fd
Uniformize kwargs for Idefics/2 processors (#32568)
* Add uniformize idefics processor kwargs and tests

* Uniformize idefics2 processor kwargs

* add image_processor tests idefics

* add BC args order change idefics2 processor and update doc

* Add support for multiple images per prompt in image-text-to-text mode idefics

* Fix processor input args in idefics tests

* improve test processing common, remove unnecessary tests, update process uniformization

* fix docstrings idefics

* fix tests processors idefics/2
2024-10-03 18:08:24 +02:00