Commit Graph

264 Commits

Author SHA1 Message Date
Matt
9f563ada70
Deprecate TF + JAX (#38758)
* Scatter deprecation warnings around

* Delete the tests

* Make logging work properly!
2025-06-11 17:28:06 +01:00
Peter St. John
bab40c6838
[core] support tensor-valued _extra_state values in from_pretrained (#38155)
Support tensor-valued _extra_state values

TransformerEngine uses the pytorch get/set_extra_state API to store FP8
layer config information as bytes Tensor in the _extra_state entry in
the state dict. With recent changes to from_pretrained, this
functionality has broken and loading a model that uses this API doesn't
appear to work. This PR fixes the save/load pretrained functions for
extra state entries that use a pytorch tensor, and adds a (currently
x-failing) test for a dictionary extra state.

Signed-off-by: Peter St. John <pstjohn@nvidia.com>
2025-05-28 15:38:42 +02:00
Raushan Turganbay
19fdb75cf0
[video utils] group and reorder by number of frames (#38374)
fix
2025-05-27 11:32:33 +02:00
Yao Matrix
a5a0c7b888
switch to device agnostic device calling for test cases (#38247)
* use device agnostic APIs in test cases

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* fix style

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* add one more

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* xpu now supports integer device id, aligning to CUDA behaviors

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* update to use device_properties

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* fix style

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* update comment

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* fix comments

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* fix style

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

---------

Signed-off-by: Matrix Yao <matrix.yao@intel.com>
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-26 10:18:53 +02:00
Aaron V
d5f992f5e6
Enhance Model Loading By Providing Parallelism, Uses Optional Env Flag (#36835)
* Get parallel loader working. Include tests.

* Update the tests for parallel loading

* Rename env variables.

* Add docs for parallel model weight loading.

* Touch up parallel model loading docs.

* Touch up parallel model loading docs again.

* Edit comment in test_modeling_utils_parallel_loading.py

* Make sure HF_PARALLEL_LOADING_WORKERS is spelled correctly in modeling_utils.py

* Correct times for parallelized loading, previous times were for a "hot" filesystem

* Update parallel model loading so the spawn method is encapsulated. DRY up the code by leveraging get_submodule.

* Update docs on model loading parallelism so that details on setting the multiprocessing start method are removed, now that the package handles this step internally.

* Fix style on model loading parallelism changes.

* Merge latest version of master's modeling_utils.

* Removed unused variable.

* Fix argument packing for the parallel loader.

* Fix state dict being undefined in the parallel model loader.

* Rename variables used in parallel model loading for clarity. Use get_module_from_name().

* Switch to the use of threads for parallel model loading.

* Update docs for parallel loading.

* Remove the use of json.loads when evaluating HF_ENABLE_PARALLEL_LOADING. Prefer simple casting.

* Move parallelized shard loading into its own function.

* Remove use of is_true(). Favor checking env var true values for HF_ENABLE_PARALLEL_LOADING.

* Update copyright to 2025 in readme for parallel model loading.

* Remove garbage collection line in load_shard_file, implicit garbage collection already occurs.

* Run formatter on modeling_utils.py

* Apply style fixes

* Delete tests/utils/test_modeling_utils_parallel_loading.py

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
2025-05-23 16:39:47 +00:00
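The commit above describes opt-in parallel shard loading, switched on through the `HF_ENABLE_PARALLEL_LOADING` environment variable, sized through `HF_PARALLEL_LOADING_WORKERS`, and implemented with threads. The following is a minimal sketch of that idea under stated assumptions, not the library's implementation: `load_shard`, `load_all_shards`, the set of accepted "true" strings, and the default of 8 workers are all hypothetical.

```python
import os
from concurrent.futures import ThreadPoolExecutor

# Assumed set of "true" strings; the commit only says true values are checked.
TRUE_VALUES = {"1", "true", "yes", "on"}


def parallel_loading_enabled() -> bool:
    """Return True when HF_ENABLE_PARALLEL_LOADING holds a truthy string."""
    value = os.environ.get("HF_ENABLE_PARALLEL_LOADING", "")
    return value.strip().lower() in TRUE_VALUES


def load_shard(path: str) -> dict:
    """Hypothetical stand-in for reading one checkpoint shard from disk."""
    return {f"{path}.weight": len(path)}


def load_all_shards(paths: list) -> dict:
    """Load shards sequentially, or with a thread pool when the flag is set."""
    if parallel_loading_enabled():
        # The default of 8 workers is an assumption made for this sketch.
        workers = int(os.environ.get("HF_PARALLEL_LOADING_WORKERS", "8"))
        with ThreadPoolExecutor(max_workers=workers) as pool:
            shards = list(pool.map(load_shard, paths))  # map preserves input order
    else:
        shards = [load_shard(p) for p in paths]

    # Merge the per-shard dictionaries into one state dict.
    state_dict = {}
    for shard in shards:
        state_dict.update(shard)
    return state_dict
```

Because `pool.map` returns results in input order, the merged state dict is the same whether or not the flag is set; only the wall-clock time changes.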
Arthur
f5d45d89c4
🚨Early-error🚨 config will error out if output_attentions=True and the attn implementation is wrong (#38288)
* Protect ParallelInterface

* early error out on output attention setting for no warning in modeling

* modular update

* fixup

* update model tests

* update

* oups

* set model's config

* more cases

* ??

* properly fix

* fixup

* update

* last ones

* update

* fix?

* fix wrong merge commit

* fix hub test

* nits

* wow I am tired

* updates

* fix pipeline!

---------

Co-authored-by: Lysandre <hi@lysand.re>
2025-05-23 17:17:38 +02:00
Cyril Vallez
163138a911
🚨🚨[core] Completely rewrite the masking logic for all attentions (#37866)
* start

* start having a clean 4d mask primitive

* Update mask_utils.py

* Update mask_utils.py

* switch name

* Update masking_utils.py

* add a new AttentionMask tensor class

* fix import

* nits

* fixes

* use full and quadrants

* general sdpa mask for all caches

* style

* start some tests

* tests with sliding, chunked

* add styling

* test hybrid

* Update masking_utils.py

* small temp fixes

* Update modeling_gemma2.py

* compile compatible

* Update masking_utils.py

* improve

* start making it more general

* Update masking_utils.py

* generate

* make it work with flex style primitives!

* Update masking_utils.py

* Update masking_utils.py

* Update masking_utils.py

* improve

* Update cache_utils.py

* Update masking_utils.py

* simplify - starting to look good!

* Update masking_utils.py

* name

* Update masking_utils.py

* style

* Update masking_utils.py

* Update masking_utils.py

* Update masking_utils.py

* Update masking_utils.py

* small fix for flex

* flex compile

* FA2

* Update masking_utils.py

* Escape for TGI/vLLM!

* Update masking_utils.py

* Update masking_utils.py

* Update masking_utils.py

* General case without cache

* rename

* full test on llama4

* small fix for FA2 guard with chunk

* Update modeling_gemma2.py

* post rebase cleanup

* FA2 supports static cache!

* Update modeling_flash_attention_utils.py

* Update flex_attention.py

* Update masking_utils.py

* Update masking_utils.py

* Update utils.py

* override for export

* Update executorch.py

* Update executorch.py

* Update executorch.py

* Update executorch.py

* Update masking_utils.py

* Update masking_utils.py

* output attentions

* style

* Update masking_utils.py

* Update executorch.py

* Add docstring

* Add license and put mask visualizer at the end

* Update test_modeling_common.py

* fix broken test

* Update test_modeling_gemma.py

* Update test_modeling_gemma2.py

* Use fullgraph=False with FA2

* Update utils.py

* change name

* Update masking_utils.py

* improve doc

* change name

* Update modeling_attn_mask_utils.py

* more explicit logic based on model's property

* pattern in config

* extend

* fixes

* make it better

* generalize to other test models

* fix

* Update masking_utils.py

* fix

* do not check mask equivalence if layer types are different

* executorch

* Update modeling_gemma2.py

* Update masking_utils.py

* use layer_idx instead

* adjust

* Update masking_utils.py

* test

* fix imports

* Update modeling_gemma2.py

* other test models

* Update modeling_llama4.py

* Update masking_utils.py

* improve

* simplify

* Update masking_utils.py

* typos

* typo

* fix

* Update masking_utils.py

* default DynamicCache

* remove default cache

* simplify

* Update masking_utils.py

* Update masking_utils.py

* Update masking_utils.py

* Update masking_utils.py

* simplify

* Update masking_utils.py

* Update masking_utils.py

* Update masking_utils.py

* export

* Update executorch.py

* Update executorch.py

* Update flex_attention.py

* Update executorch.py

* upstream to modular gemma 1 & 2

* Update modular_mistral.py

* switch names

* use dict

* put it in the Layer directly

* update copy model source for mask functions

* apply so many modular (hopefully 1 shot)

* use explicit dicts to make style happy

* protect import

* check docstring

* better default in hybrid caches

* qwens

* Update modular_qwen2.py

* simplify core logic!

* Update executorch.py

* qwen3 moe

* Update masking_utils.py

* Update masking_utils.py

* simplify a lot sdpa causal skip

* Update masking_utils.py

* post-rebase

* gemma3 finally

* style

* check it before

* gemma3

* More general with newer torch

* align gemma3

* Update utils.py

* Update utils.py

* Update masking_utils.py

* Update test_modeling_common.py

* Update flex_attention.py

* Update flex_attention.py

* Update flex_attention.py

* test

* executorch

* Update test_modeling_common.py

* Update masking_utils.py

* Update masking_utils.py

* Update masking_utils.py

* Update masking_utils.py

* Update executorch.py

* Update test_modeling_common.py

* fix copies

* device

* sdpa can be used without mask -> pass the torchscript tests in this case

* Use enum for check

* revert enum and add check instead

* remove broken test

* cohere2

* some doc & reorganize the Interface

* Update tensor_parallel.py

* Update tensor_parallel.py

* doc and dummy

* Update test_modeling_paligemma2.py

* Update modeling_falcon_h1.py

* Update masking_utils.py

* executorch patch

* style

* CIs

* use register in executorch

* final comments!

---------

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
2025-05-22 11:38:26 +02:00
Yuanyuan Chen
ae3e4e2d97
Improve typing in TrainingArgument (#36944)
* Better error message in TrainingArgument typing checks

* Better typing

* Small fixes

Signed-off-by: cyy <cyyever@outlook.com>

---------

Signed-off-by: cyy <cyyever@outlook.com>
2025-05-21 13:54:38 +00:00
Manuel de Prada Corral
d34e21e7dd
New cache tests and refactored Hybrid Cache (#37972)
2025-05-20 12:46:13 +02:00
Yao Matrix
3bd1c20149
enable misc cases on XPU & use device agnostic APIs for cases in tests (#38192)
* use device agnostic APIs in tests

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* more

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* fix style

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* add reset_peak_memory_stats API

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* update

---------

Signed-off-by: Matrix Yao <matrix.yao@intel.com>
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-20 10:09:01 +02:00
Lysandre Debut
003deb16f1
Support for transformers explicit filename (#38152)
* Support for transformers explicit filename

* Tests

* Rerun tests
2025-05-19 14:33:47 +02:00
Raushan Turganbay
aaf224d570
[video processor] fix tests (#38104)
* fix tests

* delete

* fix one more test

* fix qwen + some tests are failing irrespective of `VideoProcessor`

* delete file
2025-05-14 10:24:07 +00:00
Raushan Turganbay
a31fa218ad
🔴 Video processors as a separate class (#35206)
* initial design

* update all video processors

* add tests

* need to add qwen2-vl (not tested yet)

* add qwen2-vl in auto map

* fix copies

* isort

* resolve conflicts kinda

* nit:

* qwen2-vl is happy now

* qwen2-5 happy

* other models are happy

* fix copies

* fix tests

* add docs

* CI green now?

* add more tests

* even more changes + tests

* doc builder fail

* nit

* Update src/transformers/models/auto/processing_auto.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* small update

* imports correctly

* dump, otherwise this is getting unmanageable T-T

* dump

* update

* another update

* update

* tests

* move

* modular

* docs

* test

* another update

* init

* remove flakiness in tests

* fixup

* clean up and remove commented lines

* docs

* skip this one!

* last fix after rebasing

* run fixup

* delete slow files

* remove unnecessary tests + clean up a bit

* small fixes

* fix tests

* more updates

* docs

* fix tests

* update

* style

* fix qwen2-5-vl

* fixup

* fixup

* unflatten batch when preparing

* dump, come back soon

* add docs and fix some tests

* how to guard this with new dummies?

* chat templates in qwen

* address some comments

* remove `Fast` suffix

* fixup

* oops should be imported from transforms

* typo in requires dummies

* new model added with video support

* fixup once more

* last fixup I hope

* revert image processor name + comments

* oh, this is why fetch test is failing

* fix tests

* fix more tests

* fixup

* add new models: internvl, smolvlm

* update docs

* import once

* fix failing tests

* do we need to guard it here again, why?

* new model was added, update it

* remove testcase from tester

* fix tests

* make style

* not related CI fail, let's just fix here

* mark flaky for now, fails 15 out of 100

* style

* maybe we can do this way?

* don't download images in setup class

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-05-12 11:55:51 +02:00
Arjuna Sky Kok
716819b830
fix(conversion): Fix size mismatch error during TF->PT model loading (#38014)
2025-05-10 11:11:07 +00:00
Lysandre Debut
23d79cea75
Support for version spec in requires & arbitrary mismatching depths across folders (#37854)
* Support for version spec in requires & arbitrary mismatching depths

* Quality

* Testing
2025-05-09 15:26:27 +02:00
Yao Matrix
a72cb31434
enable utils test cases on XPU (#38005)
* enable utils test cases on XPU

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* Update tests/utils/test_skip_decorators.py

Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>

* fix comment

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

---------

Signed-off-by: Yao Matrix <matrix.yao@intel.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
2025-05-09 08:45:01 +02:00
Arthur
5f5ccfdc54
[AutoDocstring] Based on inspect parsing of the signature (#33771)
* delete common docstring

* nit

* updates

* push

* fixup

* move stuff around fixup

* no need for dataclass

* damn nice modular

* add auto class docstring

* style

* modular update

* import autodocstring

* fixup

* maybe add original doc!

* more cleanup

* remove class doc as well

* update

* nits

* more cleanup

* fix

* wups

* small check

* updatez

* some fixes

* fix doc

* update

* nits

* try?

* nit

* some updates

* a little bit better

* where ever we did not have help we are not really adding it!

* revert llama config

* small fixes and small tests

* test

* fixup

* more fix-copies

* updates

* updates

* fix doc building

* style

* small fixes

* nits

* fix-copies

* fix merge issues faster

* fix merge conf

* nits jamba

* ?

* working autodoc for model class and forward except returns and example

* support return section and unpack kwargs description

* nits and cleanup

* fix-copies

* fix-copies

* nits

* Add support for llava-like models

* fixup

* add class args subset support

* add examples inferred from automodel/pipelines

* update ruff

* autodocstring for Aria, Albert + fixups

* Fix empty return blocks

* fix copies

* fix copies

* add autodoc for all fast image processors + align, altclip

* fix copies

* add auto_doc for audio_spectrogram, auto_former, bark, bamba

* Drastically improve speed + add bart beit bert

* add autodoc to all bert-like models

* Fix broken doc

* fix copies

* fix auto_docstring after merge

* add autodoc to models

* add models

* add models

* add models and improve support for optional, and custom shape in args docstring

* update fast image processors

* refactor auto_method_docstring in args_doc

* add models and fix docstring parsing

* add models

* add models

* remove debugging

* add models

* add fix_auto_docstrings and improve args_docs

* add support for additional_info in args docstring

* refactor (almost) all models

* fix check docstring

* fix -copies

* fill in all missing docstrings

* fix copies

* fix qwen3 moe docstring

* add documentation

* add back labels

* update docs and fix can_return_tuple in modular files

* fix LongformerForMaskedLM docstring

* add auto_docstring to _toctree

* remove auto_docstring tests temporarily

* fix copyrights new files

* fix can_return_tuple granite hybrid

* fix fast beit

* Fix empty config doc

* add support for COMMON_CUSTOM_ARGS in check_docstrings and add missing models

* fix code block not closed flava

* fix can_return_tuple sam hq

* Fix Flaubert dataclass

---------

Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-05-08 17:46:07 -04:00
Joao Gante
f2b59c6173
[caches] Raise exception on offloaded static caches + multi device (#37974)
* skip tests on >1 gpu

* add todo
2025-05-08 14:37:36 +01:00
Joao Gante
9981214d32
[tests] Smaller model in slow cache tests (#37922)
2025-05-06 11:15:25 +01:00
Joao Gante
1b222903c3
[tests] Test all cache implementations (#37873)
2025-04-30 15:37:00 +01:00
Lysandre Debut
d538293f62
Transformers cli clean command (#37657)
* transformers-cli -> transformers

* Chat command works with positional argument

* update doc references to transformers-cli

* doc headers

* deepspeed

---------

Co-authored-by: Joao Gante <joao@huggingface.co>
2025-04-30 12:15:43 +01:00
Guang Yang
a57274466f
Allow override inputs to export recipe (#37508)
Add option to specify dynamic shapes during export

Co-authored-by: Guang Yang <guangyang@fb.com>
2025-04-30 10:19:27 +02:00
Joao Gante
755b0fa2fe
[tests] reorganize cache tests and clean memory between tests (#37684)
2025-04-29 12:21:14 +01:00
co63oc
d5fa7d2d19
Fix typos in strings and comments (#37799)
2025-04-28 11:39:11 +01:00
Cyril Vallez
0cfbf9c95b
Force torch>=2.6 with torch.load to avoid vulnerability issue (#37785)
* fix all main files

* fix test files

* oups forgot modular

* add link

* update message
2025-04-25 16:57:09 +02:00
Poedator
7c62e69326
GPT2Model StaticCache support (#35761)
* initial GPT2 changes

* causal_mask support

* return_legacy_cache

* cleanup

* fix1

* outputs shape fixes

* gpt2 return fix

* pkv, attn fixes

* fix dual_head

* is_causal arg fix

* decision transformer updated

* style fix

* batch_size from inputs_embeds

* DecisionTransformerModel fixes

* cross-attn support + cache warning

* x-attn @decision

* EDCache proper init

* simplified logic in `if use_cache:` for GPT2Model

* @deprecate_kwarg for DecisionTr attn fwd

* @deprecate_kwarg in gpt2

* deprecation version updated to 4.51

* kwargs in gradient_checkpointing_fn

* rename next_cache to past_key_values

* attention_mask prep

* +cache_position in GPT2DoubleHeadsModel

* undo kwargs in gradient checkpointing

* moved up `if self.gradient_checkpointing`

* consistency in decision_transformer

* pastkv, cache_pos in grad_checkpt args

* rm _reorder_cache

* output_attentions streamlined

* decision_transformer consistency

* return_legacy_cache improved

* ClvpForCausalLM used for legacy cache test now

* is_causal fixed

* attn_output cleanup

* consistency @ decision_transformer

* Updated deprecation notice version to 4.52

* upd deprecation

* consistent legacy cache code in decision transformers

* next_cache -> past_kv in decision_tr

* cache support flags in decision_transf

* rm legacy cache warning

* consistency in cache init for decision transf

* no Static Cache for Decision Transformer

---------

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
2025-04-24 14:46:35 +02:00
Manuel de Prada Corral
1cd110c6cb
Add test to ensure unknown exceptions reraising in utils/hub.py::cached_files() (#37651)
* add test to ensure unknown exceptions are reraised in utils/hub.py::cached_files()
2025-04-22 11:38:10 +02:00
Pablo Montalvo
4afd3f4820
Model debugger upgrades (#37391)
* debugging improvements

* add debugging details

* add more debugging details

* debug more

* clean up layers + output

* add summary json file

* cleanup

* copies 👀

* remove hooks + add documentation

* draft a small test, why not

* respect the format (respect it)

* fixup imports

* nit

* add tests and configurable pruning of layers
2025-04-18 16:45:54 +02:00
Lysandre Debut
54a123f068
Simplify soft dependencies and update the dummy-creation process (#36827)
* Reverse dependency map shouldn't be created when test_all is set

* [test_all] Remove dummies

* Modular fixes

* Update utils/check_repo.py

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>

* [test_all] Better docs

* [test_all] Update src/transformers/commands/chat.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* [test_all] Remove deprecated AdaptiveEmbeddings from the tests

* [test_all] Doc builder

* [test_all] is_dummy

* [test_all] Import utils

* [test_all] Doc building should not require all deps

---------

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-04-11 11:08:36 +02:00
cyyever
371c44d0ef
Remove old code for PyTorch, Accelerator and tokenizers (#37234)
* Remove unneeded library version checks

Signed-off-by: cyy <cyyever@outlook.com>

* Remove PyTorch condition

Signed-off-by: cyy <cyyever@outlook.com>

* Remove PyTorch condition

Signed-off-by: cyy <cyyever@outlook.com>

* Fix ROCm get_device_capability

Signed-off-by: cyy <cyyever@outlook.com>

* Revert "Fix ROCm get_device_capability"

This reverts commit 0e756434bd.

* Remove unnecessary check

Signed-off-by: cyy <cyyever@outlook.com>

* Revert changes

Signed-off-by: cyy <cyyever@outlook.com>

---------

Signed-off-by: cyy <cyyever@outlook.com>
2025-04-10 20:54:21 +02:00
Joao Gante
4321b0648c
[core] remove GenerationMixin inheritance by default in PreTrainedModel (#37173)
2025-04-08 16:42:05 +01:00
cyyever
1e6b546ea6
Use Python 3.9 syntax in tests (#37343)
Signed-off-by: cyy <cyyever@outlook.com>
2025-04-08 14:12:08 +02:00
Yao Matrix
12bf24d6ae
enable 2 llama UT cases on xpu (#37126)
* enable tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits and tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits_bf16 on xpu

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* switch to use Expectations

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* extract gen bits from architecture and use it

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* add cross reference

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-04-07 16:02:14 +02:00
Matt
cbfa14823b
No more dtype_byte_size() (#37144)
* No more dtype_byte_size()

* Remove function once again

* Fix rebase cruft

* Trigger tests
2025-04-02 14:58:38 +01:00
Yih-Dar
adfc91cd46
Try to avoid/reduce some remaining CI job failures (#37202)
* try

* try

* Update tests/pipelines/test_pipelines_video_classification.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-04-02 14:39:57 +02:00
Qizhi Chen
fac70ff3c0
Convert _VALID_DICT_FIELDS to class attribute for shared dict parsing in subclasses (#36736)
* make _VALID_DICT_FIELDS as a class attribute

* fix test case about TrainingArguments
2025-04-01 12:29:12 +02:00
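The commit above moves `_VALID_DICT_FIELDS` onto the class so that subclasses can reuse the same dict-parsing step. The following is a minimal sketch of that pattern, assuming a dataclass whose string-valued fields are parsed as JSON; `BaseArguments`, `CustomArguments`, `optimizer_kwargs`, and `reward_kwargs` are hypothetical names, and the real parsing in `TrainingArguments` may differ.

```python
import json
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class BaseArguments:
    # Class attribute (un-annotated, so it is not a dataclass field) listing
    # the fields whose string values should be parsed into dicts.
    _VALID_DICT_FIELDS = ["optimizer_kwargs"]

    optimizer_kwargs: Optional[Union[dict, str]] = None

    def __post_init__(self):
        # Looks the list up on the instance's class, so subclasses can extend it.
        for name in self._VALID_DICT_FIELDS:
            value = getattr(self, name)
            if isinstance(value, str) and value.startswith("{"):
                setattr(self, name, json.loads(value))


@dataclass
class CustomArguments(BaseArguments):
    # The subclass extends the list and inherits the parsing logic unchanged.
    _VALID_DICT_FIELDS = BaseArguments._VALID_DICT_FIELDS + ["reward_kwargs"]

    reward_kwargs: Optional[Union[dict, str]] = None
```

With a module-level constant, the subclass would have to override `__post_init__` to get its extra field parsed; with a class attribute, extending the list is enough.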
cyyever
786d9c5ed9
Fix more inefficient PT operations (#37060)
* Fix inefficient operations

* Remove cpu() call

* Reorder detach()

* Reorder detach()

* tolist without detach

* item without detach

* Update src/transformers/models/rag/modeling_rag.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update tests/models/encodec/test_modeling_encodec.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Use detach().cpu().numpy

* Revert some numpy operations

* More fixes

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-03-31 16:31:24 +01:00
Pavel Iakubovskii
a1e389e637
Refactor return_dict logic to remove complicated if/else paths (#36794)
* SAM

* CLIP

* SigLIP

* GOT-OCR2 (depends on SAM)

* SigLIP2 (depends on SigLIP)

* trigger tests

* Fix SAM

* Fix missed indexing, use named attributes

* Llama

* Aria

* Bamba

* Update llama: missed outputs return type

* (fixup) Aria

* DiffLlama

* Emu3

* Gemma

* Gemma2

* Paligemma

* Fix paligemma

* Gemma3

* GLM

* Helium

* JetMoe

* Jamba

* Mistral

* Mistral

* Mixtral

* Nemotron

* Olmo

* Olmo2

* Persimmon

* Phi

* Phi3

* PhiMoe

* Qwen2

* Qwen2_moe

* StableLM

* Starcoder2

* Add return_dict decorator

* SAM

* Update decorator: compile, export, trace - friendly

* Llama (decorator)

* SAM (decorator)

* Add decorator `can_return_tuple`

* Llama

* Update to decorator

* Update CLIP

* Update decorator to store `_is_top_level_module` in self

* Update decorator to correctly handle compile/export

* Remove is_torchdynamo_compiling constraint, all work fine with self attribute assignment

* Typing

* GPT NeoX

* Fixup

* Fix attribute Granite

* Fix return type mixtral

* Update Gemma3

* Fix Cohere and Cohere2

* Fixup

* Fix corner case for Phi4, when activation is shared

* (fix-copies) deepseekv3, phi4

* Fixup

* Apply to qwen3/qwen3_moe

* Fix
2025-03-31 16:23:37 +01:00
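The commit above replaces per-model `return_dict` if/else branches with a `can_return_tuple` decorator. The following is a minimal sketch of how such a decorator can work, not the library's implementation: `Output` and `TinyModel` are hypothetical, and the real decorator also has to deal with config defaults, nested modules, and compile/export, which this sketch ignores.

```python
import functools
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Output:
    """Hypothetical structured output; None fields are dropped in tuple form."""

    logits: list
    hidden_states: Optional[list] = None

    def to_tuple(self) -> tuple:
        values = (getattr(self, f.name) for f in fields(self))
        return tuple(v for v in values if v is not None)


def can_return_tuple(func):
    """Let `func` always build an Output; convert it when return_dict=False."""

    @functools.wraps(func)
    def wrapper(self, *args, return_dict: bool = True, **kwargs):
        output = func(self, *args, **kwargs)
        return output if return_dict else output.to_tuple()

    return wrapper


class TinyModel:
    @can_return_tuple
    def forward(self, x):
        # The model code itself has no branch on return_dict.
        return Output(logits=[v * 2 for v in x])
```

The design choice is that the forward method has a single return path, and the tuple conversion lives in one place instead of being repeated in every model.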
Zhen
e686fed635
[Feature] Support using FlashAttention2 on Ascend NPU (#36696)
* [Feature] Support using flash-attention on Ascend NPU

* Fix qwen3 and qwen3_moe modular conversion mismatch
2025-03-31 16:12:58 +02:00
cyyever
6cc9c8d7d1
Remove deprecated batch_size parameter (#37007)
2025-03-27 15:01:56 +00:00
cyyever
41a0e58e5b
Set weights_only in torch.load (#36991)
2025-03-27 14:55:50 +00:00
eustlb
fb8e6c50e4
[audio utils] fix fft_bin_width computation (#36603)
* fix fft_bin_width computation

* update docstring + enforce correct params

* update test with correct value

* update test

* update feature extractors for concerned models

* update

* make

* update docstring

* update docstring
2025-03-27 15:20:02 +01:00
Sungyoon Jeong
d1eafe8d4e
Optimize to_py_obj for python-native numeric lists and scalars (#36885)
* Optimize to_py_obj for python-native numeric lists and scalars

* Fix bug that tuple is not converted to list

* Try np.array for more robust type checking

* Apply review and add tests for to_py_obj
2025-03-27 14:16:46 +01:00
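The commit above adds fast paths to `to_py_obj` for Python-native numbers and flat numeric lists, and fixes tuples not being converted to lists. The following is a minimal standard-library sketch of that behaviour under stated assumptions: the real function also handles framework tensors and uses NumPy for type checks, while this version only relies on a `tolist()` method being present on array-like objects.

```python
def to_py_obj(obj):
    """Convert nested containers and array-likes to plain Python objects."""
    # Fast path: native scalars need no conversion at all.
    if obj is None or isinstance(obj, (int, float, bool, str)):
        return obj
    if isinstance(obj, dict):
        return {key: to_py_obj(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        # Fast path: a flat sequence of native numbers is copied in one step.
        # Tuples become lists here as well, matching the commit's bug fix.
        if all(isinstance(item, (int, float)) for item in obj):
            return list(obj)
        return [to_py_obj(item) for item in obj]
    # Array-like objects (for example NumPy arrays) expose tolist().
    if hasattr(obj, "tolist"):
        return obj.tolist()
    return obj
```

For example, `to_py_obj({"ids": ((1, 2), [3])})` walks the nested structure and returns lists at every level.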
Joao Gante
bc1c90a755
[Utils] torch version checks optionally accept dev versions (#36847)
2025-03-25 10:58:58 +00:00
omahs
cbf924b76c
Fix typos (#36910)
* fix typos

* fix typos

* fix typos

* fix typos
2025-03-24 14:08:29 +00:00
Raushan Turganbay
523f6e743c
Fix: dtype cannot be str (#36262)
* fix

* this wasn't supposed to be here, revert

* refine tests a bit more
2025-03-21 13:27:47 +01:00
Tugsbayasgalan Manlaibaatar
f39f4960f3
Support traceable dynamicKVcache (#36311)
* Support traceable dynamicKVcache

* Fix lint

* More fine grained test

* Lint

* Update

* Update

* Fix up

* Apply suggestions from code review

* Update src/transformers/cache_utils.py

* Update tests/utils/test_cache_utils.py

* Apply suggestions from code review

* Update

* Change error message

* Rename

* Apply suggestions from code review

* Apply suggestions from code review

* Apply suggestions from code review

---------

Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-03-19 16:52:30 +00:00
Yao Matrix
b11050d6a2
enable OffloadedCache on XPU from PyTorch 2.7 (#36654)
* fix "Cannot copy out of meta tensor; no data!" issue for BartForConditionalGeneration model

* follow Marc's suggestion to use _tie_weights to fix

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

* enable OffloadedCache on XPU since PyTorch 2.7

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

* don't change bart

Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>

* make code more concise per review comments

Signed-off-by: N <matrix.yao@intel.com>

* fix review comments

Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>

* Revert "fix review comments"

This reverts commit acf1484b86.

* fix review comments

Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>

* fix style

Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>

---------

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>
Signed-off-by: N <matrix.yao@intel.com>
Co-authored-by: root <root@a4bf01945cfe.jf.intel.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-03-19 15:15:52 +01:00
ivarflakstad
706703bba6
Expectations test utils (#36569)
* Add expectation classes + tests

* Use typing Union instead of |

* Use bits to track score in properties cmp method

* Add exceptions and tests + comments

* Remove compute cap minor as it is not needed currently

* Simplify. Remove Properties class

* Add example Exceptions usage

* Expectations as dict subclass

* Update example Exceptions usage

* Refactor. Improve type name. Document score fn.

* Rename to DeviceProperties.
2025-03-18 23:39:50 +01:00
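The commit above introduces an `Expectations` dict subclass keyed by `DeviceProperties`, with a bit-based score used to pick the best match. The following is a minimal sketch of that idea, not the test utility itself: the two-element key (device type, major compute capability), the `None` wildcard, the `score` function, and the `get_expectation` method name are assumptions made for illustration.

```python
from typing import Optional, Tuple

# (device type, major compute capability); None acts as a wildcard.
DeviceProperties = Tuple[Optional[str], Optional[int]]


def score(key: DeviceProperties, device: DeviceProperties) -> int:
    """Bit-based match score: higher is more specific, -1 means a mismatch."""
    total = 0
    # The device type gets the high bit, so it outranks the version match.
    for bit, (want, have) in zip((0b10, 0b01), zip(key, device)):
        if want is None:
            continue  # wildcard: compatible, but adds nothing to the score
        if want != have:
            return -1
        total |= bit
    return total


class Expectations(dict):
    """Maps DeviceProperties keys to expected test values."""

    def get_expectation(self, device: DeviceProperties):
        best_key = max(self, key=lambda k: score(k, device))
        if score(best_key, device) < 0:
            raise ValueError(f"No matching expectation for {device}")
        return self[best_key]
```

A test can then declare one default value plus device-specific overrides, and look up the most specific entry for the hardware it is running on.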
Afanti
7f5077e536
fix typos in the tests directory (#36717)
2025-03-17 17:45:57 +00:00