Commit Graph

4599 Commits

Author SHA1 Message Date
Joao Gante
42ebb6c23e
[tests] Parameterized test_eager_matches_sdpa_inference (#36650)
2025-03-14 14:41:27 +00:00
Matt
9215cc62d4
Try working around the processor registration bugs (#36184)
* Try working around the processor registration bugs

* oops

* Update error message

* Clarify error

* Docstring docstring docstring

* The extra content is indexed by config class, so let's grab some values out of there

* Commit my confusion as a TODO

* Resolve my confusion

* Cleanup and mostly revert to the original

* Better autoclass fallback

* Don't nest f-strings you lunatic

* Clearer error message

* Less getattr()

* Revert a lot of changes to try a different approach!

* Try the global registry

* Check the dynamic list as well as the transformers root

* Move the dynamic list somewhere safer

* Move the dynamic list somewhere even safer

* More import cleanup

* Simplify all the register_for_auto_class methods

* Set _auto_class in the register() methods

* Stop setting the cls attribute in register()

* Restore specifying the model class for Model derivatives only

* Fix accidentally taking the .__class__ of a class

* Revert register_for_auto_class changes

* Fix get_possibly_dynamic_module

* No more ALL_CUSTOM_CLASSES

* Fix up get_possibly_dynamic_module as well

* Revert unnecessary formatting changes

* Trigger tests
2025-03-14 13:56:21 +00:00
Sean (Seok-Won) Yi
691d1b52c3
Fix/best model checkpoint fix (#35885)
* Set best_model_checkpoint only when ckpt exists.

Previously, best_model_checkpoint was set explicitly without checking whether the checkpoint directory exists. The setting logic now lives inside _save_checkpoint, and the value is set only if the directory exists.

* Added best_global_step to TrainerState.

* Added tests for best_model_checkpoint.

* Fixed hard-coded values in test to prevent fail.

* Added helper func and removed hard-coded best_step.

* Added side effect patch generator for _eval.

* Added evaluate side effect func.

* Removed erroneous patching.

* Fixed minor bug.

* Applied Ruff.

* Fixed Ruff problem in make style.

* Used Trainer.set_initial_training_values.
2025-03-14 14:24:53 +01:00
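The guard described in commit 691d1b52c3 above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the Trainer's actual code: `MiniState` and `save_checkpoint` are hypothetical names, the `checkpoint-<step>` directory layout is assumed, and only the `best_model_checkpoint` field name comes from the commit message.

```python
import os
import tempfile
from dataclasses import dataclass
from typing import Optional


@dataclass
class MiniState:
    """Hypothetical stand-in for a trainer state object."""
    best_model_checkpoint: Optional[str] = None


def save_checkpoint(state, output_dir, step, is_best, write=True):
    """Optionally write a checkpoint directory, then record it as best only if it exists."""
    ckpt_dir = os.path.join(output_dir, f"checkpoint-{step}")  # assumed layout
    if write:
        os.makedirs(ckpt_dir, exist_ok=True)
    # The guard: never point at a directory that is not on disk.
    if is_best and os.path.isdir(ckpt_dir):
        state.best_model_checkpoint = ckpt_dir
    return ckpt_dir


demo_dir = tempfile.mkdtemp()
demo_state = MiniState()
# Nothing is written here, so the best checkpoint stays unset.
save_checkpoint(demo_state, demo_dir, step=10, is_best=True, write=False)
# The directory is created here, so it is recorded.
demo_best = save_checkpoint(demo_state, demo_dir, step=20, is_best=True)
```

Keeping the check next to the code that writes the directory means a state object can never hold a path to a checkpoint that was skipped or failed to save.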
Matt
72861e11eb
Make the flaky list a little more general (#36704)
* Make the flaky list a little more general

* Trigger tests

* Make the flaky list a little more general
2025-03-14 12:15:32 +00:00
Kingsley
53742b11f5
Gemma3 processor typo (#36710)
* fix typo when  is on

* tiny

* add test and remove 'text_crops'

* lint
2025-03-14 13:07:55 +01:00
Matt
48ef468c74
Final CI cleanup (#36703)
* make fixup

* make fixup

* Correct skip decorator

* Add TODOs

* add is_flaky() parentheses
2025-03-13 17:26:09 +00:00
Cyril Vallez
2a004f9ff1
Add loading speed test (#36671)
* Update test_modeling_utils.py

* Update test_modeling_utils.py

* Update test_modeling_utils.py

* Update test_modeling_utils.py

* Update test_modeling_utils.py

* Update test_modeling_utils.py

* trigger CIs

* Update test_modeling_utils.py

* Update test_modeling_utils.py

* Update test_modeling_utils.py

* better error messages

* Update test_modeling_utils.py

* Update test_modeling_utils.py
2025-03-13 17:07:30 +01:00
bd793fcb
87b30c3589
fix wandb hp search unable to resume from sweep_id (#35883)
* fix wandb hp search unable to resume from sweep_id

* format styles

---------

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-03-13 12:32:26 +01:00
Mohamed Mekkouri
47cc4da351
Changing the test model in Quanto kv cache (#36670)
changing model
2025-03-13 12:23:34 +01:00
Marc Sun
fbb18ce68b
Update config.torch_dtype correctly (#36679)
* fix

* style

* new test
2025-03-13 12:08:02 +01:00
Joao Gante
c4161238bd
[Cache] Don't initialize the cache on meta device (#36543)
2025-03-13 10:13:29 +00:00
Yoni Gozlan
ea219ed164
Remove differences between init and preprocess kwargs for fast image processors (#36186)
* Remove differences between init and preprocess kwargs in fast image processors

* make modifs got_ocr2

* update gemma3
2025-03-12 19:44:05 -04:00
Yoni Gozlan
bc3253f076
Remove hardcoded slow image processor class in processors supporting fast ones (#36266)
* Add fast image processor class to processors supporting them

* fix test kosmos2
2025-03-12 18:39:25 -04:00
Mohamed Mekkouri
0013ba61e5
Fix Failing GPTQ tests (#36666)
fix tests
2025-03-12 20:03:02 +01:00
Matt
c7eb95581a
Don't accidentally mutate the base_model_tp_plan (#36677)
* Don't accidentally mutate the base_model_tp_plan

* Co-authored by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Trigger tests

* Marking grad accum test as slow

* Add a flaky decorator

* Add a flaky decorator

* Use cyril's codeblock

* Don't copy() when it's None

* Use cyril's new codeblock

* make fixup
2025-03-12 18:59:13 +00:00
Cyril Vallez
071a161d3e
[core] Large/full refactor of from_pretrained (#36033)
* squash everything together
start to simplify inner logic

Update modeling_utils.py

Update modeling_utils.py

Update modeling_utils.py

Update modeling_utils.py

continue refactor

fix

small fixes

add type hints/docstring

Update modeling_utils.py

remove _fast_init

keep improving

Update modeling_utils.py

Update modeling_utils.py

new first tp loading version

style

fix weird in-place op

trigger CIs

Update modeling_utils.py

much clearer renaming of keys

fix

update

Update test_modeling_common.py

trigger CIs

update

update

style

Update modeling_utils.py

Update modeling_utils.py

Update modeling_utils.py

fix

fast download first prototype

remove old function

remove old functions

Remove unused function and move back _get_tp_registry

fix tp plan registry

simplify

CIs

Update hub.py

Update modeling_utils.py

simplify

simplify renaming logic

remove unused check

add sanity check back (a test depends on it)

Update modeling_utils.py

finalize sound renaming logic

style

add forgotten check

Update modeling_utils.py

add key_mapping keyword

style

Update modeling_utils.py

add comment

minor updates

minor change for clarity

fix small prefix issue and simplify

style

trigger CIs

typo fix

Post rebase fix

post rebase cleanup

simplify tp

typo

oupsi

typo

correctly escape

improvements based on Marc's review

finalize Marc's review comments

 squash everything

* improve

* Update modeling_utils.py

* Update modeling_utils.py

* fix

* Update modeling_utils.py

* Update modeling_utils.py

* style

* Update modeling_utils.py

* simplify

* style

* Update modeling_utils.py

* Update modeling_utils.py

* Update modeling_utils.py

* Update modeling_utils.py

* Update modeling_utils.py

* Update modeling_utils.py

* fix dtype issue

* Update modeling_utils.py

* style

* remove test that does not make sense

* style

* small fixes

* style

* fix

* cleanup after rebase

* style

* typo

* escape

* tp for task specific top modules

* Update modeling_utils.py

* Update modeling_utils.py

* fix allocation

* CIs

* CIs

* CIs

* improve docstring

* CIs

* Update modeling_utils.py

* fix
2025-03-12 13:39:25 +01:00
Ilyas Moutawwakil
89f6956015
HPU support (#36424)
* test

* fix

* fix

* skip some and run some first

* test fsdp

* fix

* patches for generate

* test distributed

* copy

* don't test distributed loss for hpu

* require fp16 and run first

* changes from marc's PR fixing zero3

* better alternative

* return True when fp16 support on gaudi without creating bridge

* fix

* fix tested dtype in deepspeed inference test

* test

* fix

* test

* fix

* skip

* require fp16

* run first fsdp

* Apply suggestions from code review

* address comments

* address comments and refactor test

* reduce precision

* avoid doing gaudi1 specific stuff in the generation loop

* document test_gradient_accumulation_loss_alignment_with_model_loss test a bit more
2025-03-12 09:08:12 +01:00
Ryan Mullins
50d3530aa0
Gemma3 (#36658)
* Fix converter

* [Broken] Adds Gemma 3 to Hugging Face Transformers

* Consolidating Config and Processor params across impls

* Sorting out configuration parameters. Adds qk_norm before RoPE. Still not sure if RoPE is right.

* Additional plumbing for CausalLM and ConditionalGeneration variants

* incomplete draft of Orbax conversion script

* More complete checkpoint conversion

* Supporting Gemma 3 1B checkpoints

* Updating RoPE for multiple frequencies

* Adjustments to rotary embedder

* Proof of life for text-only operation

* Updating the conversion script to handle multimodal projection weights

* Fixing text-only conversions

* Cleaner conversion script with multimodal support and a simpler processor

* Additional refactors to the Gemma3Processor

* Simplified Processor to work over text representations

* Updated conversion script to join text and vision embeddings at conversion time

* Logging for debugging

* Update src/transformers/models/gemma2/modeling_gemma2.py

Co-authored-by: Joshua Lochner <admin@xenova.com>

* Removed extraneous Config params

* Switching to fast tokenizer for checkpoint conversions

* isolating siglip for performance testing

* Minor changes for debugging tests against baselines

* Adding average pooling for soft tokens

* Updating processor code to enable simpler embedding interleaving for arbitrary number of images in prompts

* Updating conversion script for ShieldGemma 2 conversion compatibility

* Allow disable_compile to be provided as a kwarg

* Refresh from modular

* Updated conversion script and corrected sliding window

* Fix type mismatch in cache_position (#4)

* Fix dtype (#5)

* Fix type mismatch in cache_position

* Actually fix in the modular file

Co-authored-by: Aritra Roy Gosthipaty <aritra.born2fly@gmail.com>

---------

Co-authored-by: Aritra Roy Gosthipaty <aritra.born2fly@gmail.com>

* fixes for embedding table overflow and missing image_soft_token_mask from Gemma3Processor

* Adding 2D pooling for image embeddings

* Revert "Adding 2D pooling for image embeddings"

This reverts commit 65350cf531.

* Gemma3 average pooling changed from 1D to 2D

* Major refactor to Gemma3MultimodalInputProjection

* Updating Gemma 3 Auto* registrations

* Add option to save Gemma 3 chat template with tokenizer during weights conversion

* Removing unused imports

* Moving out-of-vocab handling from Gemma3Processor to Gemma3ForConditionalGeneration

* Removing duplicate config property

* Removing final logit softcapping and 1-indexing of position ids

* Fixing image processor config and none --> None typo

* Fixing sliding window size for 1B

* Updating image_mean and image_std in Image Processor

* Attention masking changed to lower triangular

* Moving image special tokens to conversion script

* Mirror image processor defaults from conversion script into Gemma3ProcessorKwargs

* Remove special token variables from symbol space

* Moving image soft token mask computation from Gemma3Processor to Gemma3ForConditionalGeneration

* tie lm_head and embedding weights

Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>

* Correct tied weights in Gemma3CausalLM

* iterative bidirectional attention

* resolving merge conflicts

* Reverting to Gemma 2 HybridCache with sliding window support and a sliding_window_pattern of 6

* Correcting RoPE scaling

* clean up first pass, dummy model generation works

* final clean up before fixing tests

* causal lm test works, so fine

* Fix conversion

* Update src/transformers/models/gemma3/processing_gemma3.py

* model tests are happy

* processor tests are happy

* image processing tests added

* fixup

* Fix pre-processing in conversion

* Inputs merging

* Do not normalize vision embeddings

* Apply Ryan's (and team) changes to attention

* token type ids + mask

* template

* move embed scale, add rope scale, fix tests

* Add chat template to tokenizer

* Use prefix for causal model loading

* use existing code for sliding mask from gemma2

* self.embed_tokens already normalizes

* Correcting Gemma3TextConfig parameters in conversion script

* typo, modular overwrites my fixes

* enable device map for text model

* Conversion updates

* ultra nit: no einsums

* update image token

* copy deepcopy config + some docs

* add some test, still WIP

* Refactoring --include_chat_tempalte logic in converter

* Update src/transformers/models/gemma3/modular_gemma3.py

Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>

* Add eos tokens for instruct models

* dump so i can work on dgx

* Removing add_bos by default

* dump

* add fast im proc

* docs for PaS + fixup

* another fixup

* one more fixup

* fix tests

* Inverting prior BOS change

* ultra nit

* Reverting to Tokenizer saved with add_bos_token=True and chat template starting with BOS

* resize embeds, remove sqrt, add slow test outputs

* FA2 but quality is meh

* nit

* skip FA2, no idea what happened

* last bit for green CI

* please, green CI for docs

* T_T

* Fix for Gemma3 logits

* Support both options for system prompt

* Update src/transformers/models/gemma3/image_processing_gemma3_fast.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/model_doc/gemma3.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/model_doc/gemma3.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/model_doc/gemma3.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/model_doc/gemma3.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/model_doc/gemma3.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Docs updates now that assets are live

* Style fixes

---------

Co-authored-by: Joshua Lochner <admin@xenova.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Aritra Roy Gosthipaty <aritra.born2fly@gmail.com>
Co-authored-by: Mayank Chaturvedi <imayank@google.com>
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>
Co-authored-by: raushan <raushan@huggingface.co>
Co-authored-by: Raushan Turganbay <raushan.turganbay@alumni.nu.edu.kz>
Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>
Co-authored-by: Lysandre <hi@lysand.re>
2025-03-12 09:06:17 +01:00
Travis Johnson
d8663cb8c5
Fix bugs in mllama image processing (#36156)
* fix: handle input_channel_dim == channels_last

Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>

* fix: default PIL images to channels_last

Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>

* Apply suggestions from code review

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* fixup from review batch

Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>

* test: add 1x1 PIL image to ambiguous channel test

Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>

* fix(mllama): avoid 0 dimension for image with impractical aspect ratio

Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>

---------

Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-03-11 10:22:48 +01:00
Arthur
1c4b62b219
Refactor some core stuff (#36539)
* some config changes

* update

* current state

* update

* update

* updates and cleanup

* something that works

* fixup

* fixes

* nits

* nit

* nits and fix

* Update src/transformers/integrations/tensor_parallel.py

Co-authored-by: Lysandre Debut <hi@lysand.re>

* Update src/transformers/integrations/tensor_parallel.py

Co-authored-by: Lysandre Debut <hi@lysand.re>

* cleanup

* style

* safe import

* fix

* updates

* rename stuff and clean

* style

* small updates

* ups

* oups

* nit

* protect imports

* update tp

* rodfl

* arf

* turbo nit on init

* fix import error

* frumble gumbgle

* try to fix the import error

* should fix the non model test

* update keep in float32

* update

* fix

* nits

* fix subvconfigs

* test was weird

* nit

* fix failing test

* fix instruct blip

* fixes

* style

* x.com

* fix overwrite

* ok last bit of failing test

---------

Co-authored-by: Lysandre Debut <hi@lysand.re>
2025-03-11 09:26:28 +01:00
Joao Gante
858545047c
[HybridCache] disable automatic compilation (#36620)
2025-03-10 09:24:26 +00:00
gautham
a1cf9f3390
Fixed datatype related issues in DataCollatorForLanguageModeling (#36457)
Fixed 2 issues regarding `tests/trainer/test_data_collator.py::TFDataCollatorIntegrationTest::test_all_mask_replacement`:
1. I got the error `RuntimeError: "bernoulli_tensor_cpu_p_" not implemented for 'Long'`. This is because the test passes `mask_replacement_prob=1`, an integer, and `torch.bernoulli` doesn't accept the resulting type (a `torch.long` dtype). I fixed this by manually casting the probability arguments to float in the `__post_init__` function of `DataCollatorForLanguageModeling`.
2. I also got the error `tensorflow.python.framework.errors_impl.InvalidArgumentError: cannot compute Equal as input #1(zero-based) was expected to be a int64 tensor but is a int32 tensor [Op:Equal]` due to the line `tf.reduce_all((batch["input_ids"] == inputs) | (batch["input_ids"] == tokenizer.mask_token_id))` in `test_data_collator.py`. This occurs because the type of the `inputs` variable is `tf.int32`. Solved this by manually casting it to `tf.int64` in the test, as the expected return type of `batch["input_ids"]` is `tf.int64`.
2025-03-07 14:09:27 +00:00
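The first fix in commit a1cf9f3390 above, casting probability arguments in `__post_init__`, can be illustrated with a small standalone dataclass. This is a hedged sketch and not the library's implementation: `MaskingProbabilities` is a hypothetical class, and `random_replacement_prob` is included only as an assumed second probability field alongside the `mask_replacement_prob` argument named in the commit message.

```python
from dataclasses import dataclass


@dataclass
class MaskingProbabilities:
    """Hypothetical container for masking probabilities (illustration only)."""
    mask_replacement_prob: float = 0.8
    random_replacement_prob: float = 0.1  # assumed field, for illustration

    def __post_init__(self):
        # Type hints are not enforced at runtime, so a caller can pass the
        # integer 1. Casting here guarantees that anything built from these
        # values later (for example a probability tensor) starts from a float.
        self.mask_replacement_prob = float(self.mask_replacement_prob)
        self.random_replacement_prob = float(self.random_replacement_prob)


probs = MaskingProbabilities(mask_replacement_prob=1, random_replacement_prob=0)
print(type(probs.mask_replacement_prob).__name__, probs.mask_replacement_prob)
```

Normalizing the type once at construction keeps every downstream use free of per-call dtype checks.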
Joao Gante
5275ef6f3d
[XGLM] tag tests as slow (#36592)
these tests should be slow
2025-03-06 17:54:41 +00:00
co63oc
996f512d52
Fix typos in tests (#36547)
Signed-off-by: co63oc <co63oc@users.noreply.github.com>
2025-03-05 15:04:06 -08:00
ivarflakstad
c0c5acff07
Fix bamba tests amd (#36535)
2025-03-04 15:24:27 +01:00
Arthur
84f0186e89
Add aya (#36521)
* initial commit

* small fix

* move stuff to image processing file

* remove stuff in validate turn and fix return tensor

* remove liquid stuff

* in the process of addressing comments

* changes to get the right tokenization

* new __init__ works

* fixing default std and mean

* works

* small testing script -- to be deleted before merge

* remove redundant code

* addressing comments

* fix inits, add docs templates

* refactor processor, switch to gotocr image processor

* remove image proc from init

* refactor to working llava-style architecture

* Change AyaVisionModel to AyaVisionForConditionalGeneration

* add tests

* fixups

* update doc

* Adding logits_to_keep explicitly in ayavision forward to enable compatibility with cohere model

* better variable names + remove code paths

* Updates to aya_vision.md

* address comments

* adding copied from

* make style and remove unused projector_hidden_act from config

* sort init

* include usage of fast image proc and proc on cuda in doc

* update checkpoint in test processor

* update checkpoint in test processor 2

* remove test_model and update docstring

* skip failing tests

---------

Co-authored-by: Saurabh Dash <saurabh@cohere.com>
Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
2025-03-04 12:24:33 +01:00
Steven Liu
c0f8d055ce
[docs] Redesign (#31757)
* toctree

* not-doctested.txt

* collapse sections

* feedback

* update

* rewrite get started sections

* fixes

* fix

* loading models

* fix

* customize models

* share

* fix link

* contribute part 1

* contribute pt 2

* fix toctree

* tokenization pt 1

* Add new model (#32615)

* v1 - working version

* fix

* fix

* fix

* fix

* rename to correct name

* fix title

* fixup

* rename files

* fix

* add copied from on tests

* rename to `FalconMamba` everywhere and fix bugs

* fix quantization + accelerate

* fix copies

* add `torch.compile` support

* fix tests

* fix tests and add slow tests

* copies on config

* merge the latest changes

* fix tests

* add few lines about instruct

* Apply suggestions from code review

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix

* fix tests

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* "to be not" -> "not to be" (#32636)

* "to be not" -> "not to be"

* Update sam.md

* Update trainer.py

* Update modeling_utils.py

* Update test_modeling_utils.py

* Update test_modeling_utils.py

* fix hfoption tag

* tokenization pt. 2

* image processor

* fix toctree

* backbones

* feature extractor

* fix file name

* processor

* update not-doctested

* update

* make style

* fix toctree

* revision

* make fixup

* fix toctree

* fix

* make style

* fix hfoption tag

* pipeline

* pipeline gradio

* pipeline web server

* add pipeline

* fix toctree

* not-doctested

* prompting

* llm optims

* fix toctree

* fixes

* cache

* text generation

* fix

* chat pipeline

* chat stuff

* xla

* torch.compile

* cpu inference

* toctree

* gpu inference

* agents and tools

* gguf/tiktoken

* finetune

* toctree

* trainer

* trainer pt 2

* optims

* optimizers

* accelerate

* parallelism

* fsdp

* update

* distributed cpu

* hardware training

* gpu training

* gpu training 2

* peft

* distrib debug

* deepspeed 1

* deepspeed 2

* chat toctree

* quant pt 1

* quant pt 2

* fix toctree

* fix

* fix

* quant pt 3

* quant pt 4

* serialization

* torchscript

* scripts

* tpu

* review

* model addition timeline

* modular

* more reviews

* reviews

* fix toctree

* reviews reviews

* continue reviews

* more reviews

* modular transformers

* more review

* zamba2

* fix

* all frameworks

* pytorch

* supported model frameworks

* flashattention

* rm check_table

* not-doctested.txt

* rm check_support_list.py

* feedback

* updates/feedback

* review

* feedback

* fix

* update

* feedback

* updates

* update

---------

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
2025-03-03 10:33:46 -08:00
Matt
1975be4d97
Fix edge case for continue_final_message (#36404)
* Fix edge case for continue_final_message

* lstrip() correctly

* Add regression test

* Add a clearer error message when the final message is not present

* Add a clearer error message when the final message is not present

* Fix massive bug!
2025-03-03 18:03:03 +00:00
Matt
2aff938992
Fix pipeline+peft interaction (#36480)
* Fix pipeline-peft interaction

* once again you have committed a debug breakpoint

* Remove extra testing line

* Add a test to check adapter loading

* Correct adapter path

* make fixup

* Remove unnecessary check

* Make check a little more stringent
2025-03-03 18:01:43 +00:00
Zach Mueller
4d8259d245
Fix loading zero3 weights (#36455)
* Check if fixes

* Fix zero3 loading

* Quality

* Fix marc nit

* Add fast tests

* Migrate to integrations.deepspeed rather than modeling_utils

* Style
2025-03-03 15:05:58 +01:00
Yoni Gozlan
2c5d038f92
Add Got-OCR 2 Fast image processor and refactor slow one (#36185)
* refactor image processor slow got ocr

* add working image processor fast

* fix fast image processor, update doc

* use one big loop for processing patches
2025-03-01 00:56:00 -05:00
Eduardo Pacheco
222505c7e4
[GroundingDino] Fix grounding dino loss 🚨 (#31828)
* Starting to fix GroundingDinoLoss and GroundingDinoHungarianMatcher

* More updates

* More updates

* fixed: GroundingDinoLoss

* fixed: failing tests

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/grounding_dino/test_modeling_grounding_dino.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Addressed comments

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>

* add: cardinality loss and make box loss as copy from

* change: default for reduction loss is sum

* fix: vectorized generate fake box

* fix copies

* Addressed comments

* addressed comments

* addressed one-hot

* Update tests/models/grounding_dino/test_modeling_grounding_dino.py

Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>

* Addressed comments

* fixed test

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

* Update tests/models/grounding_dino/test_modeling_grounding_dino.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Starting to fix GroundingDinoLoss and GroundingDinoHungarianMatcher

* More updates

* More updates

* fixed: GroundingDinoLoss

* add: cardinality loss and make box loss as copy from

* fix copies

* Revert "Update tests/models/grounding_dino/test_modeling_grounding_dino.py"

This reverts commit aa74c4c57c430e54cc74c414d6269edb65c73e83.

* [run-slow] groundigdino

* remove nestedtensor

* [run-slow] groundig_dino

* [run-slow] grounding_dino

* [run-slow] grounding_dino

* [run-slow] grounding_dino

* check

* check

* add: encoder intermediate outputs to ImageLoss forward

* add: GroundingDinoForObjectDetectionLoss in the loss directory

* make style

* fix the loss function

* remove class_reduction since sum is the default

* remove class_reduction

* Update src/transformers/loss/loss_grounding_dino.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* simple fix

* Update src/transformers/loss/loss_grounding_dino.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* minor fix

* Update src/transformers/loss/loss_for_object_detection.py

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
Co-authored-by: sangbumchoi <danielsejong55@gmail.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-27 19:15:58 +00:00
Yih-Dar
482d17be60
Fix hub_retry (#36449)
* cry

* trigger

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-27 14:38:25 +01:00
Joao Gante
8aed019764
[generate] torch.distributed-compatible DynamicCache (#36373)
* test

* docstring

* prepare distributed cache data

* fix cat dim

* test mvp

* add test checks

* like this?

* working test and solution

* nit

* nit

* add shape info
2025-02-27 11:48:57 +00:00
Mohamed Mekkouri
a7fbab33ae
Fix Expected output for compressed-tensors tests (#36425)
fix
2025-02-26 21:17:24 +01:00
Arthur
1603018e7a
Update from_pretrained to make TP a first class citizen (#36335)
* clean code

* oups

* fix merge

* yups

* fix if

* now you can play

* fix shape issue

* try non blocking

* fix

* updates

* up

* updates

* fix most of the tests

* update

* update

* small updates

* up

* fix the remaining bug?

* update

* rename when you read from the file

* buffer issues

* current status

* cleanup

* properly allocate dumb memory

* update a small bug

* fix colwise rep issue

* fix keep in float 32 that was keeping everything in float 32

* typo

* more fixes with keep_in_fp32_modules as we used to search on it

* fix ROPE dtype for TP

* remove what's breaking the tests

* updates

* update and fixes

* small cleanup after merging

* allocate 2x to be safe

* style, auto

* update

* yup nit

* fix

* remove slow as fuck torch api :(

* work

* fixup

* update

* bring the fix back

* fix and update

* fixes

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* updates because some suggestions were wrong 👀

* update?

* fuck this bloated function

* typo

* fix the dumb prefix thing once and for all

* fixes here and there

* updates

* remove prints

* fix strict cases

* style

* properly fix keys on load!

* update

* fix base model prefix issue

* style

* update

* fix all?

* remove 1 print

* fix the final tests

* fixup

* last nits

* fix the detach issue which caused a 2x slowdown

* fixup

* small fixes

* ultra nit

* fix

* fix

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-26 20:12:38 +01:00
Nadav Timor
d18d9c3205
Universal Speculative Decoding CandidateGenerator (#35029)
* move `TestAssistedCandidateGeneratorDifferentTokenizers` into a new testing file

* refactor

* NOTHING. add space to rerun github actions tests

* remove it...

* `UniversalSpeculativeDecodingGenerator`

* Use `UniversalSpeculativeDecodingGenerator` when `generation_config.do_sample=True`

* assistant tokenizes only the target's new suffix

* formatting

* fix code

* fix code

* formatting

* add `TestGenerateWithDifferentModels`

* `TestGenerateWithDifferentModels` parameterize on `do_sample`

* `AssistantVocabMapping` & `AssistantVocabMappingCache`

* formatting

* `AssistantToTargetTranslator`: `get_target_input_ids` & `get_target_logits`

* improve `_get_assistant_to_target_input_ids` & formatting

* renaming

* WIP: debugging `min_new_tokens`

* fix get_target_ids

* `UniversalSpeculativeDecodingGenerator`

* assistant tokenizes only the target's new suffix

* formatting

* fix code

* fix code

* formatting

* `TestGenerateWithDifferentModels` parameterize on `do_sample`

* `AssistantVocabMapping` & `AssistantVocabMappingCache`

* formatting

* `AssistantToTargetTranslator`: `get_target_input_ids` & `get_target_logits`

* improve `_get_assistant_to_target_input_ids` & formatting

* renaming

* WIP: debugging `min_new_tokens`

* fix get_target_ids

* fix device issue

* fix get_assistant_input_ids

* add `TestAssistedCandidateGeneratorDifferentTokenizers`

* formatting

* `AssistantVocabTranslatorCache` refactor & tests

* revert changes in `src/transformers/generation/logits_process.py`

* refactor `AssistedCandidateGenerator`

* refactor `AssistedCandidateGeneratorDifferentTokenizers`

* formatting

* refactor `UniversalSpeculativeDecodingGenerator`

* fix negative value for max_new_tokens

* fix generation length target + attention_mask vs. assistant + attent

* fix device

* fix negative max_new_tokens bug

* fix UAG

* minor

* formatting

* `AssistedCandidateGeneratorDifferentTokenizers` `lookbehind`s init

* resolve conflict & formatting

* rerun CI tests

* remove space...

* remove old code

* fix candidate_input_ids device

* minor

* formatting

* Fix prepare + apply (#7)

* fix prepare + apply

* move to cpu

* simplify suppress_tokens

* fix bugs and refactoring

* device move

* handle self.config.vocab_size > len(target_tokenizer.get_vocab())

* no need to normalize in candidate_generator

* address Nadav's comments + minor

* optimize device move + SuppressTokensLogitsProcessor

* AssistantToTargetTranslator, SuppressTokensLogitsProcessor and tokenizers mapping improvements

* padding size

* padding improvement

* fix and simplify get_target_logits

* renaming in get_target_logits

* minor

* add filter_value and suppress_tokens_id

* style + rename

* remove TODO

* restore original SelectTokensLogitsProcessor with modification

* fix style

* fix _update_past_and_masks and optimize code

* remove assistant_vocab_size arg

* fix attention_mask

* call _prepare_attention_mask also if not has_past_key_values

* handling attention mask for first generation

* comment

* restore test

* remove SelectTokensLogitsProcessor

* _update_past_and_masks implementation for USD

* Add unittests for Universal Assisted generation

* fix style

* update tests

* Remove unused import and fix `test_speculation_depth` test

* exclude special and reserved tokens from tokenizer for UAG

* mv `test_universal_assisted_generation.py` to `generation/test_candidate_generator.py`

* Remove unused imports and fix style using `make style` (#9)

* formatting

* Swap gated `meta-llama/llama-3.2` with `allenai/llama` (#10)

* Fix space sign disagreement (#12)

* default values for AssistantToTargetTranslator fields

* fix space sign

* minor

* fix test + style

* Default values for some fields of assistant to target translator (#11)

* default values for AssistantToTargetTranslator fields

* fix

* add support to empty logit_processors

* Update candidate_generator.py (#15)

fix typo

* BUG fix in _prepare_assistant_input_ids (#14)

* fix _prepare_assistant_input_ids

* target_to_assistant_input_ids

* Update src/transformers/generation/candidate_generator.py

Co-authored-by: Nadav Timor <nadav.timor@weizmann.ac.il>

---------

Co-authored-by: Nadav Timor <nadav.timor@weizmann.ac.il>

* typo (`target_to_assistant_input_ids`)

* formatting

* merge upstream/main

* Fix minor review comments (#16)

* Fix: `token_ids.to(torch.int64)` (#18)

* tok ids to `torch.int64` (reference: https://huggingface.co/docs/transformers.js/en/api/tokenizers)

* `LongTensor`

* fix dtype

* `assistant_input_ids.to(dtype=torch.long)`

* Remove unused import from test_candidate_generator.py

* Remove unused import from test_candidate_generator.py

* Remove `numpy` import

* resolve pr comments (#19)

* `AssistantToTargetTranslator` docstring

* (per gante's comment) `filter_value` and `suppress_tokens_id` to class constants

* update `AssistantToTargetTranslator` docstring

* (gante's comment) replace `match-case`

* formatting

* Fix Joao's comments (#21)

* remove threading

* fix logits_processor

* fix test device

* fix style (#23)

* Move atm (#24)

* move AssistantToTargetTranslator

* fixup

* fix logit_processor

* add atm_translator test

* refactor test

* remove threading from test

* add require_torch in tests

* move AssistantVocabTranslatorCache + add tests

* ruff fix

---------

Co-authored-by: jmamou <jonathan.mamou@intel.com>
Co-authored-by: Gaurav <gauravj@d-matrix.ai>
Co-authored-by: Gaurav Jain <gaurjain14@gmail.com>
Co-authored-by: gauravjain14 <41287729+gauravjain14@users.noreply.github.com>
2025-02-26 16:14:02 +00:00
Manny Cortes
082834dd79
fix: prevent model access error during Optuna hyperparameter tuning (#36395)
* fix: prevent model access error during Optuna hyperparameter tuning

The `transformers.integrations.integration_utils.run_hp_search_optuna` function releases model memory and sets `trainer.model` to None after each trial. This causes an AttributeError when subsequent `Trainer.train` calls attempt to access the model before it is reinitialized. This is only an issue when the `fp16_full_eval` or `bf16_full_eval` flag is enabled.

* Update src/transformers/trainer.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-26 17:06:48 +01:00
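
The failure mode described in the Optuna fix above (#36395) can be illustrated with a toy sketch. It assumes a hypothetical `ToyTrainer` and `ToyModel` (neither is part of transformers) and shows one possible guard: re-create the model before any early access. The actual patch in `src/transformers/trainer.py` may well take a different approach.

```python
class ToyModel:
    """Hypothetical stand-in for a model; only tracks its device."""

    def __init__(self):
        self.device = "cpu"

    def to(self, device):
        self.device = device
        return self


class ToyTrainer:
    """Hypothetical stand-in for a trainer; not the transformers Trainer."""

    def __init__(self, model_init, full_eval=True):
        self.model_init = model_init
        self.full_eval = full_eval
        self.model = model_init()

    def train(self):
        # Guard: a previous trial may have released the model (set it to None),
        # so re-create it before anything touches it.
        if self.model is None:
            self.model = self.model_init()
        # Early access to the model, analogous to the full-eval code path.
        # Without the guard above, this line would raise AttributeError.
        if self.full_eval:
            self.model.to("accelerator")
        return self.model


def run_trials(trainer, n_trials):
    """Mimic a hyperparameter search that frees the model after each trial."""
    devices = []
    for _ in range(n_trials):
        devices.append(trainer.train().device)
        trainer.model = None  # mimic the memory release after a trial
    return devices
```

With the guard in place, `run_trials(ToyTrainer(ToyModel), 3)` completes all three trials. Removing the `if self.model is None` block makes the second trial fail on `self.model.to(...)`.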
Zach Mueller
41925e4213
Add retry hf hub decorator (#35213)
* Add retry torch decorator

* New approach

* Empty commit

* Empty commit

* Style

* Use logger.error

* Add a test

* Update src/transformers/testing_utils.py

Co-authored-by: Lucain <lucainp@gmail.com>

* Fix err

* Update tests/utils/test_modeling_utils.py

---------

Co-authored-by: Lucain <lucainp@gmail.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-02-25 20:53:11 +01:00
Chulhwa (Evan) Han
9ebfda3263
Fixed VitDet for non-square Images (#35969)
* size tuple

* delete original input_size

* use zip

* process the other case

* Update src/transformers/models/vitdet/modeling_vitdet.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* [VITDET] Test non-square image

* [Fix] Make Quality

* make fix style

* Update src/transformers/models/vitdet/modeling_vitdet.py

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-02-25 19:31:24 +00:00
Pavel Iakubovskii
fb83befb14
Fix pytorch integration tests for SAM (#36397)
Fix device in tests
2025-02-25 14:53:34 +00:00
jiqing-feng
7c8916ddb5
fix audio classification pipeline fp16 test on cuda (#36359)
* fix audio classification pipeline fp16 test on cuda

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* add comments

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* Update tests/pipelines/test_pipelines_audio_classification.py

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-02-25 15:01:25 +01:00
Fanli Lin
c3700b0eee
[tests] enable autoawq tests on XPU (#36327)
add autoawq

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-02-25 13:38:09 +01:00
Dmitry Rogozhkin
b4b9da6d9b
tests: revert change of torch_require_multi_gpu to be device agnostic (#35721)
* tests: revert change of torch_require_multi_gpu to be device agnostic

Commit 11c27dd33 modified `torch_require_multi_gpu()` to be device agnostic
instead of being CUDA specific. This broke some tests which are rightfully
CUDA specific, such as:

* `tests/trainer/test_trainer_distributed.py::TestTrainerDistributed`

In the current Transformers test architecture, `require_torch_multi_accelerator()`
should be used to mark multi-GPU tests that are device agnostic.

This change addresses the issue introduced by 11c27dd33 and reverts
the modification of `torch_require_multi_gpu()`.

Fixes: 11c27dd33 ("Enable BNB multi-backend support (#31098)")
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

* fix bug: modification of frozen set

---------

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Co-authored-by: Titus von Koeller <9048635+Titus-von-Koeller@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-02-25 13:36:10 +01:00
MAHIR DAIYAN
d80d52b007
addressing the issue #34611 to make FlaxDinov2 compatible with any batch size (#35138)
fixed the batch_size error, all tests are passing

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-02-25 10:44:44 +00:00
jiqing-feng
9d6abf9778
enable torchao quantization on CPU (#36146)
* enable torchao quantization on CPU

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix int4

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* enable CPU torchao tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix cuda tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix cpu tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix style

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix cuda tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix torchao available

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix torchao available

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix torchao config cannot convert to json

* fix docs

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* rm to_dict to rebase

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* limited torchao version for CPU

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix skip

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* Update src/transformers/testing_utils.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* fix cpu test

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-25 11:06:52 +01:00
Jerry Zhang
2af272c101
Add autoquant support for torchao quantizer (#35503)
* Add autoquant support for torchao quantizer

Summary:
As the title says; also verified that the autoquantized model can be saved and loaded:

save: https://gist.github.com/jerryzh168/01d367aaf44dbbbfd4068a4a10a00061
load: https://gist.github.com/jerryzh168/d5c6c401b2abdf18e0b6771341f1525c

Test Plan:
tested locally with above script
model uploaded to https://huggingface.co/jerryzh168/llama3-8b-autoquant

* add test

* ruff fix

* ruff reformat

* add docs and min_sqnr support

* format

* format

* fix test

* update doc

* format

* remove disable_compile

* format
2025-02-24 15:54:16 +01:00
Rahul Tuli
884a8ea1f0
Improve model loading for compressed tensor models (#36152)
* Disable warnings for stacked compressors
* Introduce two new hooks in HfQuantizer lifecycle to allow updates to missing and unexpected keys
* Update missing and unexpected keys for stacked compressors
* Add tests
* Fix: run_compressed cases
* Fix: uncompressed cases

* Rename compressed_tensor folder to compressed_tensors
Move RunCompressedTest to the same file
Update tests to unittest
2025-02-24 13:47:21 +01:00
Fanli Lin
4dbf17c17f
[tests] enable bnb tests on xpu (#36233)
* fix failed test

* fix device

* fix more device cases

* add more cases

* fix empty cache

* Update test_4bit.py

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-02-24 11:30:15 +01:00
CalOmnie
547911e727
Uses Collection in transformers.image_transforms.normalize (#36301)
* Uses Collection instead of Sequence in transformers.image_transforms.normalize

* Uses collections.abc.Collection in lieu of deprecated typing one
2025-02-21 18:38:41 +01:00
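
The distinction behind the `Collection` change above (#36301) can be checked with the standard library alone. The sketch below uses a hypothetical helper, `describe`, which is not part of transformers. It shows that `collections.abc.Collection` only requires sizing, iteration, and membership tests, while `Sequence` additionally requires ordered indexing, so some containers qualify as the former but not the latter.

```python
from collections.abc import Collection, Sequence


def describe(values):
    """Return (is_collection, is_sequence) for the given container."""
    return isinstance(values, Collection), isinstance(values, Sequence)


# Tuples and lists are both Collections and Sequences.
print(describe((0.5, 0.5, 0.5)))  # (True, True)
print(describe([0.5, 0.5, 0.5]))  # (True, True)

# Sets and dict key views are Collections but not Sequences.
print(describe({0.5}))                   # (True, False)
print(describe({"r": 0.5}.keys()))       # (True, False)
```

Importing from `collections.abc` rather than `typing` follows the second bullet of the commit: the `typing.Collection` alias has been deprecated since Python 3.9.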