Peter St. John
bab40c6838
[core] support tensor-valued _extra_state values in from_pretrained ( #38155 )
...
Support tensor-valued _extra_state values
TransformerEngine uses the PyTorch get/set_extra_state API to store FP8
layer config information as a bytes tensor in the _extra_state entry of
the state dict. With recent changes to from_pretrained, this
functionality broke, and loading a model that uses this API no longer
appears to work. This PR fixes the save/load pretrained functions for
extra state entries that use a PyTorch tensor, and adds a (currently
x-failing) test for a dictionary extra state.
Signed-off-by: Peter St. John <pstjohn@nvidia.com>
2025-05-28 15:38:42 +02:00
Raushan Turganbay
19fdb75cf0
[video utils] group and reorder by number of frames ( #38374 )
...
fix
2025-05-27 11:32:33 +02:00
Yao Matrix
a5a0c7b888
switch to device agnostic device calling for test cases ( #38247 )
...
* use device agnostic APIs in test cases
Signed-off-by: Matrix Yao <matrix.yao@intel.com>
* fix style
Signed-off-by: Matrix Yao <matrix.yao@intel.com>
* add one more
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
* xpu now supports integer device id, aligning to CUDA behaviors
Signed-off-by: Matrix Yao <matrix.yao@intel.com>
* update to use device_properties
Signed-off-by: Matrix Yao <matrix.yao@intel.com>
* fix style
Signed-off-by: Matrix Yao <matrix.yao@intel.com>
* update comment
Signed-off-by: Matrix Yao <matrix.yao@intel.com>
* fix comments
Signed-off-by: Matrix Yao <matrix.yao@intel.com>
* fix style
Signed-off-by: Matrix Yao <matrix.yao@intel.com>
---------
Signed-off-by: Matrix Yao <matrix.yao@intel.com>
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-26 10:18:53 +02:00
Aaron V
d5f992f5e6
Enhance Model Loading By Providing Parallelism, Uses Optional Env Flag ( #36835 )
...
* Get parallel loader working. Include tests.
* Update the tests for parallel loading
* Rename env variables.
* Add docs for parallel model weight loading.
* Touch up parallel model loading docs.
* Touch up parallel model loading docs again.
* Edit comment in test_modeling_utils_parallel_loading.py
* Make sure HF_PARALLEL_LOADING_WORKERS is spelled correctly in modeling_utils.py
* Correct times for parallelized loading, previous times were for a "hot" filesystem
* Update parallel model loading so the spawn method is encapsulated. DRY up the code by leveraging get_submodule.
* Update docs on model loading parallelism so that details on setting the multiprocessing start method are removed, now that the package handles this step internally.
* Fix style on model loading parallelism changes.
* Merge latest version of master's modeling_utils.
* Removed unused variable.
* Fix argument packing for the parallel loader.
* Fix state dict being undefined in the parallel model loader.
* Rename variables used in parallel model loading for clarity. Use get_module_from_name().
* Switch to the use of threads for parallel model loading.
* Update docs for parallel loading.
* Remove the use of json.loads when evaluating HF_ENABLE_PARALLEL_LOADING. Prefer simple casting.
* Move parallelized shard loading into its own function.
* Remove use of is_true(). Favor checking env var true values for HF_ENABLE_PARALLEL_LOADING.
* Update copyright to 2025 in readme for parallel model loading.
* Remove garbage collection line in load_shard_file, implicit garbage collection already occurs.
* Run formatter on modeling_utils.py
* Apply style fixes
* Delete tests/utils/test_modeling_utils_parallel_loading.py
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
2025-05-23 16:39:47 +00:00
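The parallel-loading commit above ( #36835 ) describes loading checkpoint shards with threads, gated by HF_ENABLE_PARALLEL_LOADING and sized by HF_PARALLEL_LOADING_WORKERS. The following is only a rough sketch of that idea, not the code in modeling_utils.py: `load_shard`, `load_shards`, the set of accepted "true" spellings, and the default of 8 workers are all assumptions made for illustration.

```python
import os
from concurrent.futures import ThreadPoolExecutor

# Assumed spellings of "true"; the values accepted by the real loader may differ.
TRUE_VALUES = {"1", "true", "yes"}


def parallel_loading_enabled(env=None):
    """Check the opt-in flag by comparing against known true values."""
    env = os.environ if env is None else env
    return env.get("HF_ENABLE_PARALLEL_LOADING", "").strip().lower() in TRUE_VALUES


def num_workers(env=None, default=8):
    """Read the worker count with a simple int cast (default is an assumption)."""
    env = os.environ if env is None else env
    return int(env.get("HF_PARALLEL_LOADING_WORKERS", default))


def load_shard(name):
    # Hypothetical stand-in for reading one checkpoint shard from disk.
    return {f"{name}.weight": len(name)}


def load_shards(shard_names, env=None):
    """Load every shard, in threads when the flag is set, and merge the results."""
    if parallel_loading_enabled(env):
        workers = min(num_workers(env), max(len(shard_names), 1))
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # pool.map preserves input order, so merging stays deterministic.
            partials = list(pool.map(load_shard, shard_names))
    else:
        partials = [load_shard(name) for name in shard_names]

    state_dict = {}
    for partial in partials:
        state_dict.update(partial)
    return state_dict
```

Threads fit here because shard loading is dominated by file I/O, which matches the log's switch from spawned processes to threads.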
Arthur
f5d45d89c4
🚨 Early-error 🚨 config will error out if output_attentions=True and the attn implementation is wrong ( #38288 )
...
* Protect ParallelInterface
* early error out on output attention setting for no warning in modeling
* modular update
* fixup
* update model tests
* update
* oups
* set model's config
* more cases
* ??
* properly fix
* fixup
* update
* last ones
* update
* fix?
* fix wrong merge commit
* fix hub test
* nits
* wow I am tired
* updates
* fix pipeline!
---------
Co-authored-by: Lysandre <hi@lysand.re>
2025-05-23 17:17:38 +02:00
Cyril Vallez
163138a911
🚨 🚨 [core] Completely rewrite the masking logic for all attentions ( #37866 )
...
* start
* start having a clean 4d mask primitive
* Update mask_utils.py
* Update mask_utils.py
* switch name
* Update masking_utils.py
* add a new AttentionMask tensor class
* fix import
* nits
* fixes
* use full and quadrants
* general sdpa mask for all caches
* style
* start some tests
* tests with sliding, chunked
* add styling
* test hybrid
* Update masking_utils.py
* small temp fixes
* Update modeling_gemma2.py
* compile compatible
* Update masking_utils.py
* improve
* start making it more general
* Update masking_utils.py
* generate
* make it work with flex style primitives!
* Update masking_utils.py
* Update masking_utils.py
* Update masking_utils.py
* improve
* Update cache_utils.py
* Update masking_utils.py
* simplify - starting to look good!
* Update masking_utils.py
* name
* Update masking_utils.py
* style
* Update masking_utils.py
* Update masking_utils.py
* Update masking_utils.py
* Update masking_utils.py
* small fix for flex
* flex compile
* FA2
* Update masking_utils.py
* Escape for TGI/vLLM!
* Update masking_utils.py
* Update masking_utils.py
* Update masking_utils.py
* General case without cache
* rename
* full test on llama4
* small fix for FA2 guard with chunk
* Update modeling_gemma2.py
* post rebase cleanup
* FA2 supports static cache!
* Update modeling_flash_attention_utils.py
* Update flex_attention.py
* Update masking_utils.py
* Update masking_utils.py
* Update utils.py
* override for export
* Update executorch.py
* Update executorch.py
* Update executorch.py
* Update executorch.py
* Update masking_utils.py
* Update masking_utils.py
* output attentions
* style
* Update masking_utils.py
* Update executorch.py
* Add docstring
* Add license and put mask visualizer at the end
* Update test_modeling_common.py
* fix broken test
* Update test_modeling_gemma.py
* Update test_modeling_gemma2.py
* Use fullgraph=False with FA2
* Update utils.py
* change name
* Update masking_utils.py
* improve doc
* change name
* Update modeling_attn_mask_utils.py
* more explicit logic based on model's property
* pattern in config
* extend
* fixes
* make it better
* generalize to other test models
* fix
* Update masking_utils.py
* fix
* do not check mask equivalence if layer types are different
* executorch
* Update modeling_gemma2.py
* Update masking_utils.py
* use layer_idx instead
* adjust
* Update masking_utils.py
* test
* fix imports
* Update modeling_gemma2.py
* other test models
* Update modeling_llama4.py
* Update masking_utils.py
* improve
* simplify
* Update masking_utils.py
* typos
* typo
* fix
* Update masking_utils.py
* default DynamicCache
* remove default cache
* simplify
* Update masking_utils.py
* Update masking_utils.py
* Update masking_utils.py
* Update masking_utils.py
* simplify
* Update masking_utils.py
* Update masking_utils.py
* Update masking_utils.py
* export
* Update executorch.py
* Update executorch.py
* Update flex_attention.py
* Update executorch.py
* upstream to modular gemma 1 & 2
* Update modular_mistral.py
* switch names
* use dict
* put it in the Layer directly
* update copy model source for mask functions
* apply so many modular (hopefully 1 shot)
* use explicit dicts to make style happy
* protect import
* check docstring
* better default in hybrid caches
* qwens
* Update modular_qwen2.py
* simplify core logic!
* Update executorch.py
* qwen3 moe
* Update masking_utils.py
* Update masking_utils.py
* simplify a lot sdpa causal skip
* Update masking_utils.py
* post-rebase
* gemma3 finally
* style
* check it before
* gemma3
* More general with newer torch
* align gemma3
* Update utils.py
* Update utils.py
* Update masking_utils.py
* Update test_modeling_common.py
* Update flex_attention.py
* Update flex_attention.py
* Update flex_attention.py
* test
* executorch
* Update test_modeling_common.py
* Update masking_utils.py
* Update masking_utils.py
* Update masking_utils.py
* Update masking_utils.py
* Update executorch.py
* Update test_modeling_common.py
* fix copies
* device
* sdpa can be used without mask -> pass the torchscript tests in this case
* Use enum for check
* revert enum and add check instead
* remove broken test
* cohere2
* some doc & reorganize the Interface
* Update tensor_parallel.py
* Update tensor_parallel.py
* doc and dummy
* Update test_modeling_paligemma2.py
* Update modeling_falcon_h1.py
* Update masking_utils.py
* executorch patch
* style
* CIs
* use register in executorch
* final comments!
---------
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
2025-05-22 11:38:26 +02:00
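The masking rewrite above ( #37866 ) mentions "flex style primitives" together with sliding and chunked tests. As a hedged illustration of that style only, the sketch below expresses masks as small functions of (batch_idx, head_idx, q_idx, kv_idx) that are combined with a logical AND. All names here are hypothetical, and plain Python lists stand in for tensors; this is not the content of masking_utils.py.

```python
def causal(batch_idx, head_idx, q_idx, kv_idx):
    """A query may attend to itself and to earlier positions only."""
    return kv_idx <= q_idx


def sliding_window(window):
    """Restrict attention to the last `window` positions (inclusive of q_idx)."""
    def inner(batch_idx, head_idx, q_idx, kv_idx):
        return q_idx - kv_idx < window
    return inner


def chunked(chunk_size):
    """Restrict attention to keys in the same fixed-size chunk as the query."""
    def inner(batch_idx, head_idx, q_idx, kv_idx):
        return q_idx // chunk_size == kv_idx // chunk_size
    return inner


def and_masks(*mask_fns):
    """Compose primitives: a position is visible only if every mask allows it."""
    def inner(batch_idx, head_idx, q_idx, kv_idx):
        return all(fn(batch_idx, head_idx, q_idx, kv_idx) for fn in mask_fns)
    return inner


def materialize(mask_fn, q_len, kv_len):
    """Evaluate a mask function on a dense grid (single batch, single head)."""
    return [[mask_fn(0, 0, q, kv) for kv in range(kv_len)] for q in range(q_len)]


sliding_causal = materialize(and_masks(causal, sliding_window(2)), 4, 4)
chunked_causal = materialize(and_masks(causal, chunked(2)), 4, 4)
```

Keeping each pattern as an index predicate means the same definition can be materialized densely or handed to a backend that evaluates it lazily.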
Yuanyuan Chen
ae3e4e2d97
Improve typing in TrainingArgument ( #36944 )
...
* Better error message in TrainingArgument typing checks
* Better typing
* Small fixes
Signed-off-by: cyy <cyyever@outlook.com>
---------
Signed-off-by: cyy <cyyever@outlook.com>
2025-05-21 13:54:38 +00:00
Manuel de Prada Corral
d34e21e7dd
New cache tests and refactored Hybrid Cache ( #37972 )
2025-05-20 12:46:13 +02:00
Yao Matrix
3bd1c20149
enable misc cases on XPU & use device agnostic APIs for cases in tests ( #38192 )
...
* use device agnostic APIs in tests
Signed-off-by: Matrix Yao <matrix.yao@intel.com>
* more
Signed-off-by: Matrix Yao <matrix.yao@intel.com>
* fix style
Signed-off-by: Matrix Yao <matrix.yao@intel.com>
* add reset_peak_memory_stats API
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
* update
---------
Signed-off-by: Matrix Yao <matrix.yao@intel.com>
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-20 10:09:01 +02:00
Lysandre Debut
003deb16f1
Support for transformers explicit filename ( #38152 )
...
* Support for transformers explicit filename
* Tests
* Rerun tests
2025-05-19 14:33:47 +02:00
Raushan Turganbay
aaf224d570
[video processor] fix tests ( #38104 )
...
* fix tests
* delete
* fix one more test
* fix qwen + some tests are failing irrespective of `VideoProcessor`
* delete file
2025-05-14 10:24:07 +00:00
Raushan Turganbay
a31fa218ad
🔴 Video processors as a separate class ( #35206 )
...
* initial design
* update all video processors
* add tests
* need to add qwen2-vl (not tested yet)
* add qwen2-vl in auto map
* fix copies
* isort
* resolve conflicts kinda
* nit:
* qwen2-vl is happy now
* qwen2-5 happy
* other models are happy
* fix copies
* fix tests
* add docs
* CI green now?
* add more tests
* even more changes + tests
* doc builder fail
* nit
* Update src/transformers/models/auto/processing_auto.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* small update
* imports correctly
* dump, otherwise this is getting unmanageable T-T
* dump
* update
* another update
* update
* tests
* move
* modular
* docs
* test
* another update
* init
* remove flakiness in tests
* fixup
* clean up and remove commented lines
* docs
* skip this one!
* last fix after rebasing
* run fixup
* delete slow files
* remove unnecessary tests + clean up a bit
* small fixes
* fix tests
* more updates
* docs
* fix tests
* update
* style
* fix qwen2-5-vl
* fixup
* fixup
* unflatten batch when preparing
* dump, come back soon
* add docs and fix some tests
* how to guard this with new dummies?
* chat templates in qwen
* address some comments
* remove `Fast` suffix
* fixup
* oops should be imported from transforms
* typo in requires dummies
* new model added with video support
* fixup once more
* last fixup I hope
* revert image processor name + comments
* oh, this is why fetch test is failing
* fix tests
* fix more tests
* fixup
* add new models: internvl, smolvlm
* update docs
* import once
* fix failing tests
* do we need to guard it here again, why?
* new model was added, update it
* remove testcase from tester
* fix tests
* make style
* not related CI fail, let's just fix here
* mark flaky for now, fails 15 out of 100
* style
* maybe we can do this way?
* don't download images in setup class
---------
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-05-12 11:55:51 +02:00
Arjuna Sky Kok
716819b830
fix(conversion): Fix size mismatch error during TF->PT model loading ( #38014 )
2025-05-10 11:11:07 +00:00
Lysandre Debut
23d79cea75
Support for version spec in requires & arbitrary mismatching depths across folders ( #37854 )
...
* Support for version spec in requires & arbitrary mismatching depths
* Quality
* Testing
2025-05-09 15:26:27 +02:00
Yao Matrix
a72cb31434
enable utils test cases on XPU ( #38005 )
...
* enable utils test cases on XPU
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
* fix style
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
* Update tests/utils/test_skip_decorators.py
Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
* fix comment
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
---------
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
2025-05-09 08:45:01 +02:00
Arthur
5f5ccfdc54
[AutoDocstring] Based on inspect parsing of the signature ( #33771 )
...
* delete common docstring
* nit
* updates
* push
* fixup
* move stuff around fixup
* no need for dataclass
* damn nice modular
* add auto class docstring
* style
* modular update
* import autodocstring
* fixup
* maybe add original doc!
* more cleanup
* remove class doc as well
* update
* nits
* more cleanup
* fix
* wups
* small check
* updatez
* some fixes
* fix doc
* update
* nits
* try?
* nit
* some updates
* a little bit better
* where ever we did not have help we are not really adding it!
* revert llama config
* small fixes and small tests
* test
* fixup
* more fix-copies
* updates
* updates
* fix doc building
* style
* small fixes
* nits
* fix-copies
* fix merge issues faster
* fix merge conf
* nits jamba
* ?
* working autodoc for model class and forward except returns and example
* support return section and unpack kwargs description
* nits and cleanup
* fix-copies
* fix-copies
* nits
* Add support for llava-like models
* fixup
* add class args subset support
* add examples inferred from automodel/pipelines
* update ruff
* autodocstring for Aria, Albert + fixups
* Fix empty return blocks
* fix copies
* fix copies
* add autodoc for all fast image processors + align, altclip
* fix copies
* add auto_doc for audio_spectrogram, auto_former, bark, bamba
* Drastically improve speed + add bart beit bert
* add autodoc to all bert-like models
* Fix broken doc
* fix copies
* fix auto_docstring after merge
* add autodoc to models
* add models
* add models
* add models and improve support for optional, and custom shape in args docstring
* update fast image processors
* refactor auto_method_docstring in args_doc
* add models and fix docstring parsing
* add models
* add models
* remove debugging
* add models
* add fix_auto_docstrings and improve args_docs
* add support for additional_info in args docstring
* refactor (almost) all models
* fix check docstring
* fix -copies
* fill in all missing docstrings
* fix copies
* fix qwen3 moe docstring
* add documentation
* add back labels
* update docs and fix can_return_tuple in modular files
* fix LongformerForMaskedLM docstring
* add auto_docstring to _toctree
* remove auto_docstring tests temporarily
* fix copyrights new files
* fix can_return_tuple granite hybrid
* fix fast beit
* Fix empty config doc
* add support for COMMON_CUSTOM_ARGS in check_docstrings and add missing models
* fix code block not closed flava
* fix can_return_tuple sam hq
* Fix Flaubert dataclass
---------
Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-05-08 17:46:07 -04:00
Joao Gante
f2b59c6173
[caches] Raise exception on offloaded static caches + multi device ( #37974 )
...
* skip tests on >1 gpu
* add todo
2025-05-08 14:37:36 +01:00
Joao Gante
9981214d32
[tests] Smaller model in slow cache tests ( #37922 )
2025-05-06 11:15:25 +01:00
Joao Gante
1b222903c3
[tests] Test all cache implementations ( #37873 )
2025-04-30 15:37:00 +01:00
Lysandre Debut
d538293f62
Transformers cli clean command ( #37657 )
...
* transformers-cli -> transformers
* Chat command works with positional argument
* update doc references to transformers-cli
* doc headers
* deepspeed
---------
Co-authored-by: Joao Gante <joao@huggingface.co>
2025-04-30 12:15:43 +01:00
Guang Yang
a57274466f
Allow override inputs to export recipe ( #37508 )
...
Add option to specify dynamic shapes during export
Co-authored-by: Guang Yang <guangyang@fb.com>
2025-04-30 10:19:27 +02:00
Joao Gante
755b0fa2fe
[tests] reorganize cache tests and clean memory between tests ( #37684 )
2025-04-29 12:21:14 +01:00
co63oc
d5fa7d2d19
Fix typos in strings and comments ( #37799 )
2025-04-28 11:39:11 +01:00
Cyril Vallez
0cfbf9c95b
Force torch>=2.6 with torch.load to avoid vulnerability issue ( #37785 )
...
* fix all main files
* fix test files
* oups forgot modular
* add link
* update message
2025-04-25 16:57:09 +02:00
Poedator
7c62e69326
GPT2Model StaticCache support ( #35761 )
...
* initial GPT2 changes
* causal_mask support
* return_legacy_cache
* cleanup
* fix1
* outputs shape fixes
* gpt2 return fix
* pkv, attn fixes
* fix dual_head
* is_causal arg fix
* decision transformer updated
* style fix
* batch_size from inputs_embeds
* DecisionTransformerModel fixes
* cross-attn support + cache warning
* x-attn @decision
* EDCache proper init
* simplified logic in `if use_cache:` for GPT2Model
* @deprecate_kwarg for DecisionTr attn fwd
* @deprecate_kwarg in gpt2
* deprecation version updated to 4.51
* kwargs in gradient_checkpointing_fn
* rename next_cache to past_key_values
* attention_mask prep
* +cache_position in GPT2DoubleHeadsModel
* undo kwargs in gradient checkpointing
* moved up `if self.gradient_checkpointing`
* consistency in decision_transformer
* pastkv, cache_pos in grad_checkpt args
* rm _reorder_cache
* output_attentions streamlined
* decision_transformer consistency
* return_legacy_cache improved
* ClvpForCausalLM used for legacy cache test now
* is_causal fixed
* attn_output cleanup
* consistency @ decision_transformer
* Updated deprecation notice version to 4.52
* upd deprecation
* consistent legacy cache code in decision transformers
* next_cache -> past_kv in decision_tr
* cache support flags in decision_transf
* rm legacy cache warning
* consistency in cache init for decision transf
* no Static Cache for Decision Transformer
---------
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
2025-04-24 14:46:35 +02:00
Manuel de Prada Corral
1cd110c6cb
Add test to ensure unknown exceptions reraising in utils/hub.py::cached_files() ( #37651 )
...
* add test to ensure unknown exceptions are reraised in utils/hub.py::cached_files()
2025-04-22 11:38:10 +02:00
Pablo Montalvo
4afd3f4820
Model debugger upgrades ( #37391 )
...
* debugging improvements
* add debugging details
* add more debugging details
* debug more
* clean up layers + output
* add summary json file
* cleanup
* copies 👀
* remove hooks + add documentation
* draft a small test, why not
* respect the format (respect it)
* fixup imports
* nit
* add tests and configurable pruning of layers
2025-04-18 16:45:54 +02:00
Lysandre Debut
54a123f068
Simplify soft dependencies and update the dummy-creation process ( #36827 )
...
* Reverse dependency map shouldn't be created when test_all is set
* [test_all] Remove dummies
* Modular fixes
* Update utils/check_repo.py
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
* [test_all] Better docs
* [test_all] Update src/transformers/commands/chat.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* [test_all] Remove deprecated AdaptiveEmbeddings from the tests
* [test_all] Doc builder
* [test_all] is_dummy
* [test_all] Import utils
* [test_all] Doc building should not require all deps
---------
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-04-11 11:08:36 +02:00
cyyever
371c44d0ef
Remove old code for PyTorch, Accelerator and tokenizers ( #37234 )
...
* Remove unneeded library version checks
Signed-off-by: cyy <cyyever@outlook.com>
* Remove PyTorch condition
Signed-off-by: cyy <cyyever@outlook.com>
* Remove PyTorch condition
Signed-off-by: cyy <cyyever@outlook.com>
* Fix ROCm get_device_capability
Signed-off-by: cyy <cyyever@outlook.com>
* Revert "Fix ROCm get_device_capability"
This reverts commit 0e756434bd.
* Remove unnecessary check
Signed-off-by: cyy <cyyever@outlook.com>
* Revert changes
Signed-off-by: cyy <cyyever@outlook.com>
---------
Signed-off-by: cyy <cyyever@outlook.com>
2025-04-10 20:54:21 +02:00
Joao Gante
4321b0648c
[core] remove GenerationMixin inheritance by default in PreTrainedModel ( #37173 )
2025-04-08 16:42:05 +01:00
cyyever
1e6b546ea6
Use Python 3.9 syntax in tests ( #37343 )
...
Signed-off-by: cyy <cyyever@outlook.com>
2025-04-08 14:12:08 +02:00
Yao Matrix
12bf24d6ae
enable 2 llama UT cases on xpu ( #37126 )
...
* enable tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits and tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits_bf16 on xpu
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
* switch to use Expectations
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
* fix style
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
* extract gen bits from architecture and use it
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
* add cross reference
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
* fix style
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
---------
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-04-07 16:02:14 +02:00
Matt
cbfa14823b
No more dtype_byte_size() ( #37144 )
...
* No more dtype_byte_size()
* Remove function once again
* Fix rebase cruft
* Trigger tests
2025-04-02 14:58:38 +01:00
Yih-Dar
adfc91cd46
Try to avoid/reduce some remaining CI job failures ( #37202 )
...
* try
* try
* Update tests/pipelines/test_pipelines_video_classification.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-04-02 14:39:57 +02:00
Qizhi Chen
fac70ff3c0
Convert _VALID_DICT_FIELDS to class attribute for shared dict parsing in subclasses ( #36736 )
...
* make _VALID_DICT_FIELDS as a class attribute
* fix test case about TrainingArguments
2025-04-01 12:29:12 +02:00
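The _VALID_DICT_FIELDS commit above ( #36736 ) moves the list of dict-valued fields onto the class so subclasses can share the parsing logic. A minimal sketch of why that helps, assuming a dataclass that parses JSON strings in `__post_init__`; the class names and the field names `optimizer_kwargs` and `reward_kwargs` are hypothetical and are not taken from TrainingArguments.

```python
import json
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class BaseArgs:
    # Unannotated, so this is a plain class attribute and not a dataclass field.
    _VALID_DICT_FIELDS = ["optimizer_kwargs"]

    optimizer_kwargs: Optional[Union[dict, str]] = None

    def __post_init__(self):
        # Look the list up through `self` so a subclass override is honored.
        for name in self._VALID_DICT_FIELDS:
            value = getattr(self, name)
            if isinstance(value, str) and value.startswith("{"):
                setattr(self, name, json.loads(value))


@dataclass
class CustomArgs(BaseArgs):
    # The subclass extends the list; the inherited __post_init__ does the rest.
    _VALID_DICT_FIELDS = BaseArgs._VALID_DICT_FIELDS + ["reward_kwargs"]

    reward_kwargs: Optional[Union[dict, str]] = None


args = CustomArgs(optimizer_kwargs='{"lr": 0.1}', reward_kwargs='{"scale": 2}')
```

With a module-level constant, the subclass would have to reimplement the parsing loop to add its own field; with a class attribute it only overrides the list.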
cyyever
786d9c5ed9
Fix more inefficient PT operations ( #37060 )
...
* Fix inefficient operations
* Remove cpu() call
* Reorder detach()
* Reorder detach()
* tolist without detach
* item without detach
* Update src/transformers/models/rag/modeling_rag.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update tests/models/encodec/test_modeling_encodec.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Use detach().cpu().numpy
* Revert some numpy operations
* More fixes
---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-03-31 16:31:24 +01:00
Pavel Iakubovskii
a1e389e637
Refactor return_dict logic to remove complicated if/else paths ( #36794 )
...
* SAM
* CLIP
* SigLIP
* GOT-OCR2 (depends on SAM)
* SigLIP2 (depends on SigLIP)
* trigger tests
* Fix SAM
* Fix missed indexing, use named attributes
* Llama
* Aria
* Bamba
* Update llama: missed outputs return type
* (fixup) Aria
* DiffLlama
* Emu3
* Gemma
* Gemma2
* Paligemma
* Fix paligemma
* Gemma3
* GLM
* Helium
* JetMoe
* Jamba
* Mistral
* Mistral
* Mixtral
* Nemotron
* Olmo
* Olmo2
* Persimmon
* Phi
* Phi3
* PhiMoe
* Qwen2
* Qwen2_moe
* StableLM
* Starcoder2
* Add return_dict decorator
* SAM
* Update decorator: compile, export, trace - friendly
* Llama (decorator)
* SAM (decorator)
* Add decorator `can_return_tuple`
* Llama
* Update to decorator
* Update CLIP
* Update decorator to store `_is_top_level_module` in self
* Update decorator to correctly handle compile/export
* Remove is_torchdynamo_compiling constraint, all work fine with self attribute assignment
* Typing
* GPT NeoX
* Fixup
* Fix attribute Granite
* Fix return type mixtral
* Update Gemma3
* Fix Cohere and Cohere2
* Fixup
* Fix corner case for Phi4, when activation is shared
* (fix-copies) deepseekv3, phi4
* Fixup
* Apply to qwen3/qwen3_moe
* Fix
2025-03-31 16:23:37 +01:00
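The return_dict refactor above ( #36794 ) introduces a `can_return_tuple` decorator so that forward methods no longer need their own if/else branches. The sketch below shows the general shape such a decorator could take. It is an assumption-laden illustration: `can_return_tuple_sketch`, `ToyOutput`, `ToyModel`, and the `use_return_dict` fallback are hypothetical, and the real decorator also handles compile/export concerns that are omitted here.

```python
import functools
from dataclasses import dataclass


@dataclass
class ToyOutput:
    """Hypothetical structured output with a tuple view."""
    logits: list
    hidden_states: list

    def to_tuple(self):
        return (self.logits, self.hidden_states)


def can_return_tuple_sketch(forward):
    """Let `forward` always build a structured output; convert on the way out."""
    @functools.wraps(forward)
    def wrapper(self, *args, return_dict=None, **kwargs):
        if return_dict is None:
            # Assumed fallback: a per-model default, True if it is missing.
            return_dict = getattr(self, "use_return_dict", True)
        output = forward(self, *args, **kwargs)
        return output if return_dict else output.to_tuple()
    return wrapper


class ToyModel:
    use_return_dict = True

    @can_return_tuple_sketch
    def forward(self, x):
        # The body has a single return path regardless of return_dict.
        return ToyOutput(logits=[v * 2 for v in x], hidden_states=list(x))


model = ToyModel()
structured = model.forward([1, 2])
as_tuple = model.forward([1, 2], return_dict=False)
```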
Zhen
e686fed635
[Feature] Support using FlashAttention2 on Ascend NPU ( #36696 )
...
* [Feature] Support using flash-attention on Ascend NPU
* Fix qwen3 and qwen3_moe modular conversion mismatch
2025-03-31 16:12:58 +02:00
cyyever
6cc9c8d7d1
Remove deprecated batch_size parameter ( #37007 )
2025-03-27 15:01:56 +00:00
cyyever
41a0e58e5b
Set weights_only in torch.load ( #36991 )
2025-03-27 14:55:50 +00:00
eustlb
fb8e6c50e4
[audio utils] fix fft_bin_width computation ( #36603 )
...
* fix fft_bin_width computation
* update docstring + enforce correct params
* update test with correct value
* update test
* update feature extractors for concerned models
* update
* make
* update docstring
* update docstring
2025-03-27 15:20:02 +01:00
Sungyoon Jeong
d1eafe8d4e
Optimize to_py_obj for python-native numeric lists and scalars ( #36885 )
...
* Optimize to_py_obj for python-native numeric lists and scalars
* Fix bug that tuple is not converted to list
* Try np.array for more robust type checking
* Apply review and add tests for to_py_obj
2025-03-27 14:16:46 +01:00
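The to_py_obj commit above ( #36885 ) adds a fast path for Python-native numeric lists and scalars and fixes tuples not being converted to lists. A rough sketch of that behavior follows, under stated assumptions: `to_py_obj_sketch` is a hypothetical name, and anything exposing `tolist()` stands in for a framework tensor (the stdlib `array` type is used below so the example runs without extra dependencies).

```python
from array import array


def to_py_obj_sketch(obj):
    """Convert nested containers and array-likes to plain Python objects."""
    # Scalars are already Python-native, so return them untouched.
    if obj is None or isinstance(obj, (bool, int, float, str)):
        return obj
    if isinstance(obj, dict):
        return {key: to_py_obj_sketch(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        # Fast path: a flat sequence of native numbers needs no recursion.
        if all(type(item) in (int, float) for item in obj):
            return list(obj)  # also turns a tuple into a list
        return [to_py_obj_sketch(item) for item in obj]
    # Assumed tensor-like fallback: anything that can describe itself as a list.
    if hasattr(obj, "tolist"):
        return obj.tolist()
    return obj


flat = to_py_obj_sketch((1, 2.5))
nested = to_py_obj_sketch({"ids": ([1, 2], (3,)), "vec": array("i", [4, 5])})
```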
Joao Gante
bc1c90a755
[Utils] torch version checks optionally accept dev versions ( #36847 )
2025-03-25 10:58:58 +00:00
omahs
cbf924b76c
Fix typos ( #36910 )
...
* fix typos
* fix typos
* fix typos
* fix typos
2025-03-24 14:08:29 +00:00
Raushan Turganbay
523f6e743c
Fix: dtype cannot be str ( #36262 )
...
* fix
* this wasn't supposed to be here, revert
* refine tests a bit more
2025-03-21 13:27:47 +01:00
Tugsbayasgalan Manlaibaatar
f39f4960f3
Support traceable dynamicKVcache ( #36311 )
...
* Support traceable dynamicKVcache
* Fix lint
* More fine grained test
* Lint
* Update
* Update
* Fix up
* Apply suggestions from code review
* Update src/transformers/cache_utils.py
* Update tests/utils/test_cache_utils.py
* Apply suggestions from code review
* Update
* Change error message
* Rename
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
---------
Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-03-19 16:52:30 +00:00
Yao Matrix
b11050d6a2
enable OffloadedCache on XPU from PyTorch 2.7 ( #36654 )
...
* fix "Cannot copy out of meta tensor; no data!" issue for BartForConditionalGeneration model
* follow Marc's suggestion to use _tie_weights to fix
Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
* enable OffloadedCache on XPU since PyTorch 2.7
Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
* fix style
Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
* don't change bart
Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>
* make code more concise per review comments
Signed-off-by: N <matrix.yao@intel.com>
* fix review comments
Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>
* Revert "fix review comments"
This reverts commit acf1484b86.
* fix review comments
Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>
* fix style
Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>
---------
Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>
Signed-off-by: N <matrix.yao@intel.com>
Co-authored-by: root <root@a4bf01945cfe.jf.intel.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-03-19 15:15:52 +01:00
ivarflakstad
706703bba6
Expectations test utils ( #36569 )
...
* Add expectation classes + tests
* Use typing Union instead of |
* Use bits to track score in properties cmp method
* Add exceptions and tests + comments
* Remove compute cap minor as it is not needed currently
* Simplify. Remove Properties class
* Add example Exceptions usage
* Expectations as dict subclass
* Update example Exceptions usage
* Refactor. Improve type name. Document score fn.
* Rename to DeviceProperties.
2025-03-18 23:39:50 +01:00
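The Expectations commit above ( #36569 ) describes a dict subclass keyed by device properties, with a bit-based score used to pick the best match. The sketch below illustrates one way such a lookup could work. It is not the test utility itself: the `(device_type, major_version)` key shape, the `None` wildcard, the rule that a contradicting entry never matches, and the names `ExpectationsSketch`, `score`, and `get_expectation` are assumptions for illustration.

```python
from typing import Optional, Tuple

# Assumed key shape: (device type, major version); None acts as a wildcard.
DeviceProperties = Tuple[Optional[str], Optional[int]]


class ExpectationsSketch(dict):
    """Map device properties to expected values and pick the closest entry."""

    @staticmethod
    def score(current: DeviceProperties, candidate: DeviceProperties) -> int:
        cur_type, cur_major = current
        cand_type, cand_major = candidate
        # An entry that contradicts the current device is never selected.
        if cand_type is not None and cand_type != cur_type:
            return -1
        if cand_major is not None and cand_major != cur_major:
            return -1
        # One bit per matched property; the device type outranks the version.
        bits = 0
        if cand_type is not None:
            bits |= 0b10
        if cand_major is not None:
            bits |= 0b01
        return bits

    def get_expectation(self, current: DeviceProperties):
        best_key = max(self, key=lambda key: self.score(current, key))
        if self.score(current, best_key) < 0:
            raise KeyError(f"no expectation matches {current}")
        return self[best_key]


expected = ExpectationsSketch(
    {
        (None, None): "default",
        ("cuda", None): "cuda-any",
        ("cuda", 8): "cuda-8",
        ("xpu", 3): "xpu-3",
    }
)
```

A test can then call `expected.get_expectation(...)` with the properties of the device it runs on and fall back to the wildcard entry when nothing more specific exists.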
Afanti
7f5077e536
fix typos in the tests directory ( #36717 )
2025-03-17 17:45:57 +00:00
Sambhav Dixit
8e67230860
Fix test isolation for clear_import_cache utility ( #36345 )
...
* test fixup
* test fixup
* fixing tests for unused imports
* style fixes
* fix
* style fixes
* style fix
* remove isolated module cache
* rm custom subprocess definition
* run using existing fn
* style fixup
* make fixup
* remove redundant comments
* rm redundant skipif + style changes
2025-03-17 16:09:09 +01:00