Manuel de Prada Corral
1cd110c6cb
Add test to ensure unknown exceptions reraising in utils/hub.py::cached_files() ( #37651 )
...
* add test to ensure unknown exceptions are reraised in utils/hub.py::cached_files()
2025-04-22 11:38:10 +02:00
Pablo Montalvo
4afd3f4820
Model debugger upgrades ( #37391 )
...
* debugging improvements
* add debugging details
* add more debugging details
* debug more
* clean up layers + output
* add summary json file
* cleanup
* copies 👀
* remove hooks + add documentation
* draft a small test, why not
* respect the format (respect it)
* fixup imports
* nit
* add tests and configurable pruning of layers
2025-04-18 16:45:54 +02:00
Lysandre Debut
54a123f068
Simplify soft dependencies and update the dummy-creation process ( #36827 )
...
* Reverse dependency map shouldn't be created when test_all is set
* [test_all] Remove dummies
* Modular fixes
* Update utils/check_repo.py
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
* [test_all] Better docs
* [test_all] Update src/transformers/commands/chat.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* [test_all] Remove deprecated AdaptiveEmbeddings from the tests
* [test_all] Doc builder
* [test_all] is_dummy
* [test_all] Import utils
* [test_all] Doc building should not require all deps
---------
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-04-11 11:08:36 +02:00
cyyever
371c44d0ef
Remove old code for PyTorch, Accelerator and tokenizers ( #37234 )
...
* Remove unneeded library version checks
Signed-off-by: cyy <cyyever@outlook.com>
* Remove PyTorch condition
Signed-off-by: cyy <cyyever@outlook.com>
* Remove PyTorch condition
Signed-off-by: cyy <cyyever@outlook.com>
* Fix ROCm get_device_capability
Signed-off-by: cyy <cyyever@outlook.com>
* Revert "Fix ROCm get_device_capability"
This reverts commit 0e756434bd.
* Remove unnecessary check
Signed-off-by: cyy <cyyever@outlook.com>
* Revert changes
Signed-off-by: cyy <cyyever@outlook.com>
---------
Signed-off-by: cyy <cyyever@outlook.com>
2025-04-10 20:54:21 +02:00
Joao Gante
4321b0648c
[core] remove GenerationMixin inheritance by default in PreTrainedModel ( #37173 )
2025-04-08 16:42:05 +01:00
cyyever
1e6b546ea6
Use Python 3.9 syntax in tests ( #37343 )
...
Signed-off-by: cyy <cyyever@outlook.com>
2025-04-08 14:12:08 +02:00
Yao Matrix
12bf24d6ae
enable 2 llama UT cases on xpu ( #37126 )
...
* enable tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits and tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits_bf16 on xpu
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
* switch to use Expectations
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
* fix style
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
* extract gen bits from architecture and use it
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
* add cross reference
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
* fix style
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
---------
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-04-07 16:02:14 +02:00
Matt
cbfa14823b
No more dtype_byte_size() ( #37144 )
...
* No more dtype_byte_size()
* Remove function once again
* Fix rebase cruft
* Trigger tests
2025-04-02 14:58:38 +01:00
Yih-Dar
adfc91cd46
Try to avoid/reduce some remaining CI job failures ( #37202 )
...
* try
* try
* Update tests/pipelines/test_pipelines_video_classification.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-04-02 14:39:57 +02:00
Qizhi Chen
fac70ff3c0
Convert _VALID_DICT_FIELDS to class attribute for shared dict parsing in subclasses ( #36736 )
...
* make _VALID_DICT_FIELDS as a class attribute
* fix test case about TrainingArguments
2025-04-01 12:29:12 +02:00
cyyever
786d9c5ed9
Fix more inefficient PT operations ( #37060 )
...
* Fix inefficient operations
* Remove cpu() call
* Reorder detach()
* Reorder detach()
* tolist without detach
* item without detach
* Update src/transformers/models/rag/modeling_rag.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update tests/models/encodec/test_modeling_encodec.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Use detach().cpu().numpy
* Revert some numpy operations
* More fixes
---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-03-31 16:31:24 +01:00
Pavel Iakubovskii
a1e389e637
Refactor return_dict logic to remove complicated if/else paths ( #36794 )
...
* SAM
* CLIP
* SigLIP
* GOT-OCR2 (depends on SAM)
* SigLIP2 (depends on SigLIP)
* trigger tests
* Fix SAM
* Fix missed indexing, use named attributes
* Llama
* Aria
* Bamba
* Update llama: missed outputs return type
* (fixup) Aria
* DiffLlama
* Emu3
* Gemma
* Gemma2
* Paligemma
* Fix paligemma
* Gemma3
* GLM
* Helium
* JetMoe
* Jamba
* Mistral
* Mistral
* Mixtral
* Nemotron
* Olmo
* Olmo2
* Persimmon
* Phi
* Phi3
* PhiMoe
* Qwen2
* Qwen2_moe
* StableLM
* Starcoder2
* Add return_dict decorator
* SAM
* Update decorator: compile, export, trace - friendly
* Llama (decorator)
* SAM (decorator)
* Add decorator `can_return_tuple`
* Llama
* Update to decorator
* Update CLIP
* Update decorator to store `_is_top_level_module` in self
* Update decorator to correctly handle compile/export
* Remove is_torchdynamo_compiling constraint, all work fine with self attribute assignment
* Typing
* GPT NeoX
* Fixup
* Fix attribute Granite
* Fix return type mixtral
* Update Gemma3
* Fix Cohere and Cohere2
* Fixup
* Fix corner case for Phi4, when activation is shared
* (fix-copies) deepseekv3, phi4
* Fixup
* Apply to qwen3/qwen3_moe
* Fix
2025-03-31 16:23:37 +01:00
Zhen
e686fed635
[Feature] Support using FlashAttention2 on Ascend NPU ( #36696 )
...
* [Feature] Support using flash-attention on Ascend NPU
* Fix qwen3 and qwen3_moe modular conversion mismatch
2025-03-31 16:12:58 +02:00
cyyever
6cc9c8d7d1
Remove deprecated batch_size parameter ( #37007 )
2025-03-27 15:01:56 +00:00
cyyever
41a0e58e5b
Set weights_only in torch.load ( #36991 )
2025-03-27 14:55:50 +00:00
eustlb
fb8e6c50e4
[audio utils] fix fft_bin_width computation ( #36603 )
...
* fix fft_bin_width computation
* update docstring + enforce correct params
* update test with correct value
* update test
* update feature extractors for concerned models
* update
* make
* update docstring
* update docstring
2025-03-27 15:20:02 +01:00
Sungyoon Jeong
d1eafe8d4e
Optimize to_py_obj for python-native numeric lists and scalars ( #36885 )
...
* Optimize to_py_obj for python-native numeric lists and scalars
* Fix bug that tuple is not converted to list
* Try np.array for more robust type checking
* Apply review and add tests for to_py_obj
2025-03-27 14:16:46 +01:00
Joao Gante
bc1c90a755
[Utils] torch version checks optionally accept dev versions ( #36847 )
2025-03-25 10:58:58 +00:00
omahs
cbf924b76c
Fix typos ( #36910 )
...
* fix typos
* fix typos
* fix typos
* fix typos
2025-03-24 14:08:29 +00:00
Raushan Turganbay
523f6e743c
Fix: dtype cannot be str ( #36262 )
...
* fix
* this wasn't supposed to be here, revert
* refine tests a bit more
2025-03-21 13:27:47 +01:00
Tugsbayasgalan Manlaibaatar
f39f4960f3
Support tracable dynamicKVcache ( #36311 )
...
* Support tracable dynamicKVcache
* Fix lint
* More fine grained test
* Lint
* Update
* Update
* Fix up
* Apply suggestions from code review
* Update src/transformers/cache_utils.py
* Update tests/utils/test_cache_utils.py
* Apply suggestions from code review
* Update
* Change error message
* Rename
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
---------
Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-03-19 16:52:30 +00:00
Yao Matrix
b11050d6a2
enable OffloadedCache on XPU from PyTorch 2.7 ( #36654 )
...
* fix "Cannot copy out of meta tensor; no data!" issue for BartForConditionalGeneration model
* follow Marc's suggestion to use _tie_weights to fix
Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
* enable OffloadedCache on XPU since PyTorch 2.7
Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
* fix style
Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
* don't change bart
Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>
* make code more concise per review comments
Signed-off-by: N <matrix.yao@intel.com>
* fix review comments
Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>
* Revert "fix review comments"
This reverts commit acf1484b86.
* fix review comments
Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>
* fix style
Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>
---------
Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>
Signed-off-by: N <matrix.yao@intel.com>
Co-authored-by: root <root@a4bf01945cfe.jf.intel.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-03-19 15:15:52 +01:00
ivarflakstad
706703bba6
Expectations test utils ( #36569 )
...
* Add expectation classes + tests
* Use typing Union instead of |
* Use bits to track score in properties cmp method
* Add exceptions and tests + comments
* Remove compute cap minor as it is not needed currently
* Simplify. Remove Properties class
* Add example Exceptions usage
* Expectations as dict subclass
* Update example Exceptions usage
* Refactor. Improve type name. Document score fn.
* Rename to DeviceProperties.
2025-03-18 23:39:50 +01:00
Afanti
7f5077e536
fix typos in the tests directory ( #36717 )
2025-03-17 17:45:57 +00:00
Sambhav Dixit
8e67230860
Fix test isolation for clear_import_cache utility ( #36345 )
...
* test fixup
* test fixup
* fixing tests for unused imports
* style fixes
* fix
* style fixes
* style fix
* remove isolated module cache
* rm custom subprocess definition
* run using existing fn
* style fixup
* make fixup
* remove redundant comments
* rm redundant skipif + style changes
2025-03-17 16:09:09 +01:00
Matt
48ef468c74
Final CI cleanup ( #36703 )
...
* make fixup
* make fixup
* Correct skip decorator
* Add TODOs
* add is_flaky() parentheses
2025-03-13 17:26:09 +00:00
Cyril Vallez
2a004f9ff1
Add loading speed test ( #36671 )
...
* Update test_modeling_utils.py
* Update test_modeling_utils.py
* Update test_modeling_utils.py
* Update test_modeling_utils.py
* Update test_modeling_utils.py
* Update test_modeling_utils.py
* trigger CIs
* Update test_modeling_utils.py
* Update test_modeling_utils.py
* Update test_modeling_utils.py
* better error messages
* Update test_modeling_utils.py
* Update test_modeling_utils.py
2025-03-13 17:07:30 +01:00
Marc Sun
fbb18ce68b
Update config.torch_dtype correctly ( #36679 )
...
* fix
* style
* new test
2025-03-13 12:08:02 +01:00
Joao Gante
c4161238bd
[Cache] Don't initialize the cache on meta device ( #36543 )
2025-03-13 10:13:29 +00:00
Cyril Vallez
071a161d3e
[core] Large/full refactor of from_pretrained ( #36033 )
...
* squash everything together
start to simplify inner logic
Update modeling_utils.py
Update modeling_utils.py
Update modeling_utils.py
Update modeling_utils.py
continue refactor
fix
small fixes
add type hints/docstring
Update modeling_utils.py
remove _fast_init
keep improving
Update modeling_utils.py
Update modeling_utils.py
new first tp loading version
style
fix weird in-place op
trigger CIs
Update modeling_utils.py
much clearer renaming of keys
fix
update
Update test_modeling_common.py
trigger CIs
update
update
style
Update modeling_utils.py
Update modeling_utils.py
Update modeling_utils.py
fix
fast download first prototype
remove old function
remove old functions
Remove unused function and move back _get_tp_registry
fix tp plan registry
simplify
CIs
Update hub.py
Update modeling_utils.py
simplify
simplify renaming logic
remove unused check
add sanity check back (a test depends on it)
Update modeling_utils.py
finalize sound renaming logic
style
add forgotten check
Update modeling_utils.py
add key_mapping keyword
style
Update modeling_utils.py
add comment
minor updates
minor change for clarity
fix small prefix issue and simplify
style
trigger CIs
typo fix
Post rebase fix
post rebase cleanup
simplify tp
typo
oupsi
typo
correctly escape
improvements based on Marc's review
finalize Marc's review comments
squash everything
* improve
* Update modeling_utils.py
* Update modeling_utils.py
* fix
* Update modeling_utils.py
* Update modeling_utils.py
* style
* Update modeling_utils.py
* simplify
* style
* Update modeling_utils.py
* Update modeling_utils.py
* Update modeling_utils.py
* Update modeling_utils.py
* Update modeling_utils.py
* Update modeling_utils.py
* fix dtype issue
* Update modeling_utils.py
* style
* remove test that does not make sense
* style
* small fixes
* style
* fix
* cleanup after rebase
* style
* typo
* escape
* tp for task specific top modules
* Update modeling_utils.py
* Update modeling_utils.py
* fix allocation
* CIs
* CIs
* CIs
* improve docstring
* CIs
* Update modeling_utils.py
* fix
2025-03-12 13:39:25 +01:00
Joao Gante
8aed019764
[generate] torch.distributed-compatible DynamicCache ( #36373 )
...
* test
* docstring
* prepare distributed cache data
* fix cat dim
* test mvp
* add test checks
* like this?
* working test and solution
* nit
* nit
* add shape info
2025-02-27 11:48:57 +00:00
Arthur
1603018e7a
Update form pretrained to make TP a first class citizen ( #36335 )
...
* clean code
* oups
* fix merge
* yups
* fix if
* now you can play
* fix shape issue
* try non blocking
* fix
* updates
* up
* updates
* fix most of the tests
* update
* update
* small updates
* up
* fix the remaining bug?
* update
* rename when you read from the file
* buffer issues
* current status
* cleanup
* properly allocate dumb memory
* update a small bug
* fix colwise rep issue
* fix keep in float 32 that was keeping everything in float 32
* typo
* more fixes with keep_in_fp32_modules as we use to search on it
* fix ROPE dtype for TP
* remove what's breaking the tests
* updates
* update and fixes
* small cleanup after merging
* allocate 2x to be safe
* style, auto
* update
* yup nit
* fix
* remove slow as fuck torch api :(
* work
* fixup
* update
* bring the fix back
* fix and update
* fixes
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* updates because some suggestions were wrong 👀
* update?
* fuck this bloated function
* typo
* fix the dumb prefix thing once and forall
* fixes here and there
* updates
* remove prints
* fix strict cases
* style
* properly fix keys on load!
* update
* fix base model prefix issue
* style
* update
* fix all?
* remove 1 print
* fix the final tests
* fixup
* last nits
* fix the detach issue which cause a 2x slowdown
* fixup
* small fixes
* ultra nit
* fix
* fix
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-26 20:12:38 +01:00
Zach Mueller
41925e4213
Add retry hf hub decorator ( #35213 )
...
* Add retry torch decorator
* New approach
* Empty commit
* Empty commit
* Style
* Use logger.error
* Add a test
* Update src/transformers/testing_utils.py
Co-authored-by: Lucain <lucainp@gmail.com>
* Fix err
* Update tests/utils/test_modeling_utils.py
---------
Co-authored-by: Lucain <lucainp@gmail.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-02-25 20:53:11 +01:00
Joao Gante
678885bbbd
[CI] Check test if the GenerationTesterMixin inheritance is correct 🐛 🔫 ( #36180 )
2025-02-21 10:18:20 +00:00
Ilyas Moutawwakil
5e2183f344
Make cache traceable ( #35873 )
...
simply make cache traceable
2025-02-20 09:59:25 +01:00
Joao Gante
e3d99ec2f5
[tests] make test_from_pretrained_low_cpu_mem_usage_equal less flaky ( #36255 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-19 15:14:02 +00:00
Joao Gante
99adc74462
[tests] remove flax-pt equivalence and cross tests ( #36283 )
2025-02-19 15:13:27 +00:00
Joao Gante
0863eef248
[tests] remove pt_tf equivalence tests ( #36253 )
2025-02-19 11:55:11 +00:00
Yoni Gozlan
e6a7981711
Fix make_batched_videos and add tests ( #36143 )
...
* add support for initial shift in video processing and other fixes
* revert modifications video loading functions
2025-02-13 17:14:30 -05:00
Arthur
b079dd1fa2
Fix red CI ( #36174 )
...
test was weird
2025-02-13 14:27:55 +01:00
Lucain
e60ae0d078
Replace deprecated update_repo_visibility ( #35970 )
2025-02-13 11:27:55 +01:00
Sambhav Dixit
d6897b46bd
Add utility for Reload Transformers imports cache for development workflow #35508 ( #35858 )
...
* Reload transformers fix form cache
* add imports
* add test fn for clearing import cache
* ruff fix to core import logic
* ruff fix to test file
* fixup for imports
* fixup for test
* lru restore
* test check
* fix style changes
* added documentation for usecase
* fixing
---------
Co-authored-by: sambhavnoobcoder <indosambahv@gmail.com>
2025-02-12 12:45:11 +01:00
Zach Mueller
1ce0e2992e
Nail in edge case of torch dtype being overriden permantly in the case of an error ( #35845 )
...
* Nail in edge case of torch dtype
* Rm unused func
* Apply suggestions from code review
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
* Refactor tests to only mock what we need, don't introduce injection functions
* SetUp/TearDown
* Do super
---------
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
2025-02-06 09:05:23 -05:00
Marc Sun
9f486badd5
Display warning for unknown quants config instead of an error ( #35963 )
...
* add supports_quant_method check
* fix
* add test and fix suggestions
* change logic slightly
---------
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-02-04 15:17:01 +01:00
Yoni Gozlan
d7188ba600
Add support for nested images to LLava and VipLLava ( #35558 )
...
* move make_flat_list_of_images and make_batched_videos to image_utils
* remove unnecessary is_vision_available
* move make_nested_list_of_images to image_utils
* fix fast pixtral image processor
* fix import mllama
* fix make_nested_list_of_images
* add tests
* convert 4d arrays/tensors to list
* add test_make_batched_videos
* add support nested batch of videos
* fix image processing qwen2vl
2025-01-30 16:49:20 -05:00
Joao Gante
ece8c42488
Test: generate with torch.compile(model.forward) as a fast test ( #34544 )
2025-01-28 14:10:38 +00:00
Raushan Turganbay
b764c20b09
Fix: loading DBRX back from saved path ( #35728 )
...
* fix dtype as dict for some models + add test
* add comment in tests
2025-01-28 11:38:45 +01:00
Arthur
b912f5ee43
use torch.testing.assertclose instead to get more details about error in cis ( #35659 )
...
* use torch.testing.assertclose instead to get more details about error in cis
* fix
* style
* test_all
* revert for I bert
* fixes and updates
* more image processing fixes
* more image processors
* fix mamba and co
* style
* less strict
* ok I won't be strict
* skip and be done
* up
2025-01-24 16:55:28 +01:00
Cyril Vallez
d3af76df58
[Backend support] Allow num_logits_to_keep as Tensor + add flag ( #35757 )
...
* support
* Update modeling_utils.py
* style
* most models
* Other models
* fix-copies
* tests + generation utils
2025-01-23 09:47:54 +01:00
Raushan Turganbay
373e50e970
Init cache on meta device ( #35164 )
...
* init cache on meta device
* offloaded static + enable tests
* tests weren't running before :(
* update
* fix mamba
* fix copies
* update
* address comments and fix tests
* fix copies
* Update src/transformers/cache_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* update
* mamba fix
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-01-22 09:49:17 +01:00