cyyever
786d9c5ed9
Fix more inefficient PT operations ( #37060 )
...
* Fix inefficient operations
* Remove cpu() call
* Reorder detach()
* Reorder detach()
* tolist without detach
* item without detach
* Update src/transformers/models/rag/modeling_rag.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update tests/models/encodec/test_modeling_encodec.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Use detach().cpu().numpy
* Revert some numpy operations
* More fixes
---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-03-31 16:31:24 +01:00
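A minimal sketch of the pattern the commits above converge on (tensor names are illustrative, not from the PR): `detach()` comes first, and `tolist()`/`item()` are called directly on the tensor since they do not need an explicit `detach()`.

```python
import torch

t = torch.randn(4, requires_grad=True)

# Detach first (a cheap view), then move to CPU and convert to numpy.
arr = t.detach().cpu().numpy()

# item() and tolist() work on tensors that require grad; no detach()/cpu() needed.
scalar = t.sum().item()
values = t.tolist()
```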
Pavel Iakubovskii
a1e389e637
Refactor return_dict logic to remove complicated if/else paths ( #36794 )
...
* SAM
* CLIP
* SigLIP
* GOT-OCR2 (depends on SAM)
* SigLIP2 (depends on SigLIP)
* trigger tests
* Fix SAM
* Fix missed indexing, use named attributes
* Llama
* Aria
* Bamba
* Update llama: missed outputs return type
* (fixup) Aria
* DiffLlama
* Emu3
* Gemma
* Gemma2
* Paligemma
* Fix paligemma
* Gemma3
* GLM
* Helium
* JetMoe
* Jamba
* Mistral
* Mistral
* Mixtral
* Nemotron
* Olmo
* Olmo2
* Persimmon
* Phi
* Phi3
* PhiMoe
* Qwen2
* Qwen2_moe
* StableLM
* Starcoder2
* Add return_dict decorator
* SAM
* Update decorator: compile, export, trace - friendly
* Llama (decorator)
* SAM (decorator)
* Add decorator `can_return_tuple`
* Llama
* Update to decorator
* Update CLIP
* Update decorator to store `_is_top_level_module` in self
* Update decorator to correctly handle compile/export
* Remove is_torchdynamo_compiling constraint, all work fine with self attribute assignment
* Typing
* GPT NeoX
* Fixup
* Fix attribute Granite
* Fix return type mixtral
* Update Gemma3
* Fix Cohere amd Cohere2
* Fixup
* Fix corner case for Phi4, when activation is shared
* (fix-copies) deepseekv3, phi4
* Fixup
* Apply to qwen3/qwen3_moe
* Fix
2025-03-31 16:23:37 +01:00
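A rough sketch of what a `can_return_tuple`-style decorator can look like (the actual implementation in transformers differs, e.g. in how it handles compile/export and nested modules): `forward` always builds a `ModelOutput`, and the wrapper converts it to a plain tuple when `return_dict=False`, removing the per-model if/else paths.

```python
import functools

def can_return_tuple(forward):
    """Sketch only: convert a ModelOutput to a plain tuple when return_dict=False."""

    @functools.wraps(forward)
    def wrapper(self, *args, **kwargs):
        return_dict = kwargs.pop("return_dict", None)
        if return_dict is None:
            return_dict = getattr(self.config, "use_return_dict", True)
        output = forward(self, *args, **kwargs)  # forward always returns a ModelOutput
        return output if return_dict else output.to_tuple()

    return wrapper
```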
Zhen
e686fed635
[Feature] Support using FlashAttention2 on Ascend NPU ( #36696 )
...
* [Feature] Support using flash-attention on Ascend NPU
* Fix qwen3 and qwen3_moe modular conversion mismatch
2025-03-31 16:12:58 +02:00
cyyever
6cc9c8d7d1
Remove deprecated batch_size parameter ( #37007 )
2025-03-27 15:01:56 +00:00
cyyever
41a0e58e5b
Set weights_only in torch.load ( #36991 )
2025-03-27 14:55:50 +00:00
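For reference, the safer loading call this commit switches to (the checkpoint path is a placeholder): `weights_only=True` restricts unpickling to tensors and other allow-listed types.

```python
import torch

# weights_only=True avoids executing arbitrary pickled code from untrusted checkpoints.
state_dict = torch.load("pytorch_model.bin", map_location="cpu", weights_only=True)
```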
eustlb
fb8e6c50e4
[audio utils] fix fft_bin_width computation ( #36603 )
...
* fix fft_bin_width computation
* update docstring + enforce correct params
* update test with correct value
* update test
* update feature extractors for concerned models
* update
* make
* update docstring
* update docstring
2025-03-27 15:20:02 +01:00
Sungyoon Jeong
d1eafe8d4e
Optimize to_py_obj for python-native numeric lists and scalars ( #36885 )
...
* Optimize to_py_obj for python-native numeric lists and scalars
* Fix bug that tuple is not converted to list
* Try np.array for more robust type checking
* Apply review and add tests for to_py_obj
2025-03-27 14:16:46 +01:00
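A simplified sketch of the fast path described above (not the actual `to_py_obj` helper in transformers): python-native scalars and flat lists of them are returned without a round trip through numpy.

```python
import numpy as np

def to_py_obj(obj):
    """Sketch: convert arrays and nested containers to plain Python objects."""
    # Fast paths: native scalars and flat lists/tuples of scalars skip numpy entirely.
    if isinstance(obj, (int, float)):
        return obj
    if isinstance(obj, (list, tuple)) and all(isinstance(x, (int, float)) for x in obj):
        return list(obj)  # tuples are converted to lists as well
    if isinstance(obj, np.ndarray):
        return obj.tolist()
    if isinstance(obj, (list, tuple)):
        return [to_py_obj(o) for o in obj]
    return obj
```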
Joao Gante
bc1c90a755
[Utils] torch version checks optionally accept dev versions ( #36847 )
2025-03-25 10:58:58 +00:00
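The check described above amounts to comparing against the base version when dev builds should pass; a sketch using `packaging` (function name and flag are illustrative, not the exact transformers utility):

```python
from packaging import version

def torch_version_at_least(installed: str, required: str, accept_dev: bool = False) -> bool:
    parsed = version.parse(installed)
    if accept_dev:
        # "2.7.0.dev20250301" parses as lower than "2.7.0"; compare the base version instead.
        parsed = version.parse(parsed.base_version)
    return parsed >= version.parse(required)

assert torch_version_at_least("2.7.0.dev20250301", "2.7.0", accept_dev=True)
assert not torch_version_at_least("2.7.0.dev20250301", "2.7.0")
```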
omahs
cbf924b76c
Fix typos ( #36910 )
...
* fix typos
* fix typos
* fix typos
* fix typos
2025-03-24 14:08:29 +00:00
Raushan Turganbay
523f6e743c
Fix: dtype cannot be str ( #36262 )
...
* fix
* this wasn't supposed to be here, revert
* refine tests a bit more
2025-03-21 13:27:47 +01:00
Tugsbayasgalan Manlaibaatar
f39f4960f3
Support traceable DynamicCache ( #36311 )
...
* Support traceable DynamicCache
* Fix lint
* More fine grained test
* Lint
* Update
* Update
* Fix up
* Apply suggestions from code review
* Update src/transformers/cache_utils.py
* Update tests/utils/test_cache_utils.py
* Apply suggestions from code review
* Update
* Change error message
* Rename
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
---------
Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-03-19 16:52:30 +00:00
Yao Matrix
b11050d6a2
enable OffloadedCache on XPU from PyTorch 2.7 ( #36654 )
...
* fix "Cannot copy out of meta tensor; no data!" issue for BartForConditionalGeneration model
* follow Marc's suggestion to use _tie_weights to fix
Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
* enable OffloadedCache on XPU since PyTorch 2.7
Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
* fix style
Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
* don't change bart
Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>
* make code more concise per review comments
Signed-off-by: N <matrix.yao@intel.com>
* fix review comments
Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>
* Revert "fix review comments"
This reverts commit acf1484b86.
* fix review comments
Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>
* fix style
Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>
---------
Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>
Signed-off-by: N <matrix.yao@intel.com>
Co-authored-by: root <root@a4bf01945cfe.jf.intel.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-03-19 15:15:52 +01:00
ivarflakstad
706703bba6
Expectations test utils ( #36569 )
...
* Add expectation classes + tests
* Use typing Union instead of |
* Use bits to track score in properties cmp method
* Add exceptions and tests + comments
* Remove compute cap minor as it is not needed currently
* Simplify. Remove Properties class
* Add example Exceptions usage
* Expectations as dict subclass
* Update example Exceptions usage
* Refactor. Improve type name. Document score fn.
* Rename to DeviceProperties.
2025-03-18 23:39:50 +01:00
Afanti
7f5077e536
fix typos in the tests directory ( #36717 )
2025-03-17 17:45:57 +00:00
Sambhav Dixit
8e67230860
Fix test isolation for clear_import_cache utility ( #36345 )
...
* test fixup
* test fixup
* fixing tests for unused imports
* style fixes
* fix
* style fixes
* style fix
* remove isolated module cache
* rm custom subprocess definition
* run using existing fn
* style fixup
* make fixup
* remove redundant comments
* rm redundant skipif + style changes
2025-03-17 16:09:09 +01:00
Matt
48ef468c74
Final CI cleanup ( #36703 )
...
* make fixup
* make fixup
* Correct skip decorator
* Add TODOs
* add is_flaky() parentheses
2025-03-13 17:26:09 +00:00
Cyril Vallez
2a004f9ff1
Add loading speed test ( #36671 )
...
* Update test_modeling_utils.py
* Update test_modeling_utils.py
* Update test_modeling_utils.py
* Update test_modeling_utils.py
* Update test_modeling_utils.py
* Update test_modeling_utils.py
* trigger CIs
* Update test_modeling_utils.py
* Update test_modeling_utils.py
* Update test_modeling_utils.py
* better error messages
* Update test_modeling_utils.py
* Update test_modeling_utils.py
2025-03-13 17:07:30 +01:00
Marc Sun
fbb18ce68b
Update config.torch_dtype correctly ( #36679 )
...
* fix
* style
* new test
2025-03-13 12:08:02 +01:00
Joao Gante
c4161238bd
[Cache] Don't initialize the cache on meta device ( #36543 )
2025-03-13 10:13:29 +00:00
Cyril Vallez
071a161d3e
[core] Large/full refactor of from_pretrained ( #36033 )
...
* squash everything together
start to simplify inner logic
Update modeling_utils.py
Update modeling_utils.py
Update modeling_utils.py
Update modeling_utils.py
continue refactor
fix
small fixes
add type hints/docstring
Update modeling_utils.py
remove _fast_init
keep improving
Update modeling_utils.py
Update modeling_utils.py
new first tp loading version
style
fix weird in-place op
trigger CIs
Update modeling_utils.py
much clearer renaming of keys
fix
update
Update test_modeling_common.py
trigger CIs
update
update
style
Update modeling_utils.py
Update modeling_utils.py
Update modeling_utils.py
fix
fast download first prototype
remove old function
remove old functions
Remove unused function and move back _get_tp_registry
fix tp plan registry
simplify
CIs
Update hub.py
Update modeling_utils.py
simplify
simplify renaming logic
remove unused check
add sanity check back (a test depends on it)
Update modeling_utils.py
finalize sound renaming logic
style
add forgotten check
Update modeling_utils.py
add key_mapping keyword
style
Update modeling_utils.py
add comment
minor updates
minor change for clarity
fix small prefix issue and simplify
style
trigger CIs
typo fix
Post rebase fix
post rebase cleanup
simplify tp
typo
oupsi
typo
correctly escape
improvements based on Marc's review
finalize Marc's review comments
squash everything
* improve
* Update modeling_utils.py
* Update modeling_utils.py
* fix
* Update modeling_utils.py
* Update modeling_utils.py
* style
* Update modeling_utils.py
* simplify
* style
* Update modeling_utils.py
* Update modeling_utils.py
* Update modeling_utils.py
* Update modeling_utils.py
* Update modeling_utils.py
* Update modeling_utils.py
* fix dtype issue
* Update modeling_utils.py
* style
* remove test that does not make sense
* style
* small fixes
* style
* fix
* cleanup after rebase
* style
* typo
* escape
* tp for task specific top modules
* Update modeling_utils.py
* Update modeling_utils.py
* fix allocation
* CIs
* CIs
* CIs
* improve docstring
* CIs
* Update modeling_utils.py
* fix
2025-03-12 13:39:25 +01:00
Joao Gante
8aed019764
[generate] torch.distributed-compatible DynamicCache ( #36373 )
...
* test
* docstring
* prepare distributed cache data
* fix cat dim
* test mvp
* add test checks
* like this?
* working test and solution
* nit
* nit
* add shape info
2025-02-27 11:48:57 +00:00
Arthur
1603018e7a
Update from_pretrained to make TP a first-class citizen ( #36335 )
...
* clean code
* oups
* fix merge
* yups
* fix if
* now you can play
* fix shape issue
* try non blocking
* fix
* updates
* up
* updates
* fix most of the tests
* update
* update
* small updates
* up
* fix the remaining bug?
* update
* rename when you read from the file
* buffer issues
* current status
* cleanup
* properly allocate dumb memory
* update a small bug
* fix colwise rep issue
* fix keep in float 32 that was keeping everything in float 32
* typo
* more fixes with keep_in_fp32_modules as we used to search on it
* fix ROPE dtype for TP
* remove what's breaking the tests
* updates
* update and fixes
* small cleanup after merging
* allocate 2x to be safe
* style, auto
* update
* yup nit
* fix
* remove slow as fuck torch api :(
* work
* fixup
* update
* bringing the fix back
* fix and update
* fixes
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* updates because some suggestions were wrong 👀
* update?
* fuck this bloated function
* typo
* fix the dumb prefix thing once and for all
* fixes here and there
* updates
* remove prints
* fix strict cases
* style
* properly fix keys on load!
* update
* fix base model prefix issue
* style
* update
* fix all?
* remove 1 print
* fix the final tests
* fixup
* last nits
* fix the detach issue which cause a 2x slowdown
* fixup
* small fixes
* ultra nit
* fix
* fix
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-26 20:12:38 +01:00
Zach Mueller
41925e4213
Add retry hf hub decorator ( #35213 )
...
* Add retry torch decorator
* New approach
* Empty commit
* Empty commit
* Style
* Use logger.error
* Add a test
* Update src/transformers/testing_utils.py
Co-authored-by: Lucain <lucainp@gmail.com>
* Fix err
* Update tests/utils/test_modeling_utils.py
---------
Co-authored-by: Lucain <lucainp@gmail.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-02-25 20:53:11 +01:00
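A sketch of a Hub-retry decorator for flaky network tests (the name, exception set, and backoff are illustrative; the real helper lives in `testing_utils.py`):

```python
import functools
import time

import requests

def hub_retry(max_attempts: int = 5, wait_before_retry: float = 2.0):
    """Sketch: retry a test on transient Hub/network failures."""

    def decorator(test_func):
        @functools.wraps(test_func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return test_func(*args, **kwargs)
                except (requests.exceptions.ConnectionError, requests.exceptions.Timeout) as err:
                    if attempt == max_attempts - 1:
                        raise
                    print(f"Hub request failed ({err!r}), retrying in {wait_before_retry}s")
                    time.sleep(wait_before_retry)

        return wrapper

    return decorator
```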
Joao Gante
678885bbbd
[CI] Check test if the GenerationTesterMixin inheritance is correct 🐛 🔫 ( #36180 )
2025-02-21 10:18:20 +00:00
Ilyas Moutawwakil
5e2183f344
Make cache traceable ( #35873 )
...
simply make cache traceable
2025-02-20 09:59:25 +01:00
Joao Gante
e3d99ec2f5
[tests] make test_from_pretrained_low_cpu_mem_usage_equal less flaky ( #36255 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-19 15:14:02 +00:00
Joao Gante
99adc74462
[tests] remove flax-pt equivalence and cross tests ( #36283 )
2025-02-19 15:13:27 +00:00
Joao Gante
0863eef248
[tests] remove pt_tf equivalence tests ( #36253 )
2025-02-19 11:55:11 +00:00
Yoni Gozlan
e6a7981711
Fix make_batched_videos and add tests ( #36143 )
...
* add support for initial shift in video processing and other fixes
* revert modifications video loading functions
2025-02-13 17:14:30 -05:00
Arthur
b079dd1fa2
Fix red CI ( #36174 )
...
test was weird
2025-02-13 14:27:55 +01:00
Lucain
e60ae0d078
Replace deprecated update_repo_visibility ( #35970 )
2025-02-13 11:27:55 +01:00
Sambhav Dixit
d6897b46bd
Add utility to reload the Transformers import cache for development workflow #35508 ( #35858 )
...
* Reload transformers fix form cache
* add imports
* add test fn for clearing import cache
* ruff fix to core import logic
* ruff fix to test file
* fixup for imports
* fixup for test
* lru restore
* test check
* fix style changes
* added documentation for usecase
* fixing
---------
Co-authored-by: sambhavnoobcoder <indosambahv@gmail.com>
2025-02-12 12:45:11 +01:00
Zach Mueller
1ce0e2992e
Nail in edge case of torch dtype being overridden permanently in the case of an error ( #35845 )
...
* Nail in edge case of torch dtype
* Rm unused func
* Apply suggestions from code review
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
* Refactor tests to only mock what we need, don't introduce injection functions
* SetUp/TearDown
* Do super
---------
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
2025-02-06 09:05:23 -05:00
Marc Sun
9f486badd5
Display warning for unknown quants config instead of an error ( #35963 )
...
* add supports_quant_method check
* fix
* add test and fix suggestions
* change logic slightly
---------
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-02-04 15:17:01 +01:00
Yoni Gozlan
d7188ba600
Add support for nested images to LLava and VipLLava ( #35558 )
...
* move make_flat_list_of_images and make_batched_videos to image_utils
* remove unnecessary is_vision_available
* move make_nested_list_of_images to image_utils
* fix fast pixtral image processor
* fix import mllama
* fix make_nested_list_of_images
* add tests
* convert 4d arrays/tensors to list
* add test_make_batched_videos
* add support nested batch of videos
* fix image processing qwen2vl
2025-01-30 16:49:20 -05:00
Joao Gante
ece8c42488
Test: generate with torch.compile(model.forward) as a fast test ( #34544 )
2025-01-28 14:10:38 +00:00
Raushan Turganbay
b764c20b09
Fix: loading DBRX back from saved path ( #35728 )
...
* fix dtype as dict for some models + add test
* add comment in tests
2025-01-28 11:38:45 +01:00
Arthur
b912f5ee43
use torch.testing.assert_close instead to get more details about errors in CIs ( #35659 )
...
* use torch.testing.assert_close instead to get more details about errors in CIs
* fix
* style
* test_all
* revert for I bert
* fixes and updates
* more image processing fixes
* more image processors
* fix mamba and co
* style
* less strict
* ok I won't be strict
* skip and be done
* up
2025-01-24 16:55:28 +01:00
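For context, the assertion style this PR migrates to: on failure, `torch.testing.assert_close` reports the number of mismatched elements and the greatest absolute/relative difference, which is far more informative in CI logs than a bare `allclose` boolean.

```python
import torch

expected = torch.tensor([1.0, 2.0, 3.0])
actual = expected + 1e-6

# Passes: differences are within the given tolerances. On failure, the error
# message lists the mismatch count and the max abs/rel differences.
torch.testing.assert_close(actual, expected, rtol=1e-5, atol=1e-5)
```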
Cyril Vallez
d3af76df58
[Backend support] Allow num_logits_to_keep as Tensor + add flag ( #35757 )
...
* support
* Update modeling_utils.py
* style
* most models
* Other models
* fix-copies
* tests + generation utils
2025-01-23 09:47:54 +01:00
Raushan Turganbay
373e50e970
Init cache on meta device ( #35164 )
...
* init cache on meta device
* offloaded static + enable tests
* tests weren't running before :(
* update
* fix mamba
* fix copies
* update
* address comments and fix tests
* fix copies
* Update src/transformers/cache_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* update
* mamba fix
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-01-22 09:49:17 +01:00
Aymeric Roucher
44393df089
Tool calling: support more types ( #35776 )
...
* Tool calling: support NoneType for function return type
2025-01-20 19:15:34 +01:00
Ross Wightman
8c1b5d3782
🚨 🚨 🚨 An attempt to fix #29554 . Include 'LayerNorm.' in gamma/beta rename scope, optimize string search. ( #35615 )
...
* An attempt to fix #29554 . Include 'LayerNorm.' in gamma/beta rename scope, reduce number of characters searched on every load considerably.
* Fix fix on load issue
* Fix gamma/beta warning test
* A style complaint
* Improve efficiency of weight norm key rename. Add better comments about weight norm and layer norm renaming.
* Habitual elif redundant with the return
2025-01-16 17:25:44 -08:00
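A loose sketch of the idea in the commit title (the helper name and exact matching rules are illustrative, not the code in `modeling_utils.py`): only keys that look like LayerNorm parameters get the legacy `gamma`/`beta` rename, and the suffix check short-circuits before any string replacement is attempted.

```python
def rename_legacy_norm_key(key: str) -> str:
    """Sketch: map legacy LayerNorm 'gamma'/'beta' parameter names to 'weight'/'bias'."""
    if "LayerNorm" in key:  # restrict the rename scope to LayerNorm modules
        if key.endswith(".gamma"):
            return key[: -len("gamma")] + "weight"
        if key.endswith(".beta"):
            return key[: -len("beta")] + "bias"
    return key

assert rename_legacy_norm_key("encoder.layer.0.LayerNorm.gamma") == "encoder.layer.0.LayerNorm.weight"
```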
Joao Gante
aeeceb9916
[cache] add a test to confirm we can use cache at train time ( #35709 )
...
* add test
* augment test as suggested
* Update tests/utils/test_modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* rerun tests
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-01-16 17:02:34 +00:00
jiqing-feng
387663e571
Enable gptqmodel ( #35012 )
...
* gptqmodel
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix format
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* update readme
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* gptqmodel needs to use checkpoint_format (#1 )
* gptqmodel needs to use checkpoint_format
* fix quantize
* Update quantization_config.py
* Update quantization_config.py
* Update quantization_config.py
---------
Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
* Revert quantizer_gptq.py (#2 )
* revert quantizer_gptq.py change
* pass **kwargs
* limit gptqmodel and optimum version
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix format
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix warning
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix version check
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* revert unrelated changes
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* enable gptqmodel tests
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix requires gptq
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* Fix Transformer compat (#3 )
* revert quantizer_gptq.py change
* pass **kwargs
* add meta info
* cleanup
* cleanup
* Update quantization_config.py
* hf_select_quant_linear pass checkpoint_format and meta
* fix GPTQTestCUDA
* Update test_gptq.py
* gptqmodel.hf_select_quant_linear() now does not select ExllamaV2
* cleanup
* add backend
* cleanup
* cleanup
* no need check exllama version
* Update quantization_config.py
* lower checkpoint_format and backend
* check none
* cleanup
* Update quantization_config.py
* fix self.use_exllama == False
* spell
* fix unittest
* fix unittest
---------
Co-authored-by: LRL <lrl@lbx.dev>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
* fix format
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix format again
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* update gptqmodel version (#6 )
* update gptqmodel version
* update gptqmodel version
* fix unit test (#5 )
* update gptqmodel version
* update gptqmodel version
* "not self.use_exllama" is not equivalent to "self.use_exllama==False"
* fix unittest
* update gptqmodel version
* backend is loading_attributes (#7 )
* fix format and tests
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix memory check
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix device mismatch
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix result check
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* Update src/transformers/quantizers/quantizer_gptq.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update src/transformers/quantizers/quantizer_gptq.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update src/transformers/quantizers/quantizer_gptq.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* update tests
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* review: update docs (#10 )
* review: update docs (#12 )
* review: update docs
* fix typo
* update tests for gptqmodel
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* update document (#9 )
* update overview.md
* cleanup
* Update overview.md
* Update overview.md
* Update overview.md
* update gptq.md
* Update gptq.md
* Update gptq.md
* Update gptq.md
* Update gptq.md
* Update gptq.md
* Update gptq.md
---------
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
* typo
* doc note for asymmetric quant
* typo with apple silicon(e)
* typo for marlin
* column name revert: review
* doc rocm support
* Update docs/source/en/quantization/gptq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/gptq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/gptq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/gptq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/overview.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/overview.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: LRL-ModelCloud <165116337+LRL-ModelCloud@users.noreply.github.com>
Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
Co-authored-by: ZX-ModelCloud <165115237+ZX-ModelCloud@users.noreply.github.com>
Co-authored-by: LRL <lrl@lbx.dev>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-01-15 14:22:49 +01:00
Raushan Turganbay
84a6789145
Enable different torch dtype in sub models ( #34873 )
...
* fix
* fix test
* add tests
* add more tests
* fix tests
* supposed to be a torch.dtype test
* handle BC and make fp32 default
2025-01-13 13:42:08 +01:00
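An illustrative call for the feature above, assuming the dict form maps sub-config names to dtypes with `""` as the fallback for everything else (model id is a placeholder; see the PR for the exact semantics):

```python
import torch
from transformers import LlavaForConditionalGeneration

# Each key targets a sub-config; the "" entry (assumed here) is the default for the rest.
model = LlavaForConditionalGeneration.from_pretrained(
    "llava-hf/llava-1.5-7b-hf",
    torch_dtype={"text_config": torch.float16, "vision_config": torch.bfloat16, "": torch.float32},
)
```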
Cyril Vallez
965a2fb320
More model refactoring! ( #35359 )
...
* cohere
* style
* phi3
* style
* small fix
* small fix
* phi3 longrope
* oups
* Update rope (only for phi3 still)
* Update test_modeling_rope_utils.py
* Update modeling_phi3.py
* fix
* fix copies
* style
* Fix copied from bad renaming
2025-01-09 11:09:09 +01:00
Arthur
2c47618c1a
🚨 All attention refactor 🚨 ( #35235 )
...
* refactor LlamaAttention
* minimal changes
* fix llama
* update
* modular gemmas
* modular nits
* modular updates
* nits
* simplify
* gpt2
* more modular and fixes
* granite
* modular modular modular
* nits
* update
* qwen2 + starcoder2
* mostly gemma2
* Update image_processing_auto.py
* fix
* Update modular_starcoder2.py
* fix
* remove all copied from attentions
* remove gcv
* make fix-copies
* oups
* oups2.0
* fix some modulars + all copied from
* should be good now
* revert unwanted changes
* Update modeling_decision_transformer.py
* finish cleanup
* Update modeling_olmo.py
* consistency
* re-add gradient checkpointing attribute
* fix
* style
* make config necessary
* bis
* bis
* Update modeling_my_new_model2.py
* is_causal attr
* fix
* remove past kv return from decoder layer
* fix
* default rope config
* correctly fix rope config
* fix bias
* fix gpt2 attention output
* fix test
* fix inits
* fix default sdpa
* fix default sdpa implementation
* harmonize classes
* fix mistral
* fix sliding window models
* mixtral
* be more explicit
* style
* fix
* several fixes
* Update modeling_dbrx.py
* fix test
* olmo + phi
* rotary
* style
* phi
* phi again
* again
* kwargs
* Update test_modeling_common.py
* skip fx tracing tests
* Update modeling_utils.py
* gemma 2
* again
* Update modeling_recurrent_gemma.py
* gemma2
* granite
* style
* starcoder
* Update sdpa_attention.py
* switch args
* Update modeling_mllama.py
* fix
* cache type tests
* gpt2
* Update test_modeling_common.py
* fix
* consistency
* fix shape with encoder
* should be the last one
* tests non model
* most comments
* small oupsi
* be more explicit in modulars
* more explicit modulars
* CIs! it works locally
* add kwargs to _flash_attention_forward
---------
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2024-12-18 16:53:39 +01:00
Marc Sun
1eee1cedfd
Fix loading with only state dict and low_cpu_mem_usage = True ( #35217 )
...
* fix loading with only state dict and config
* style
* add tests
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-12-18 09:54:32 +01:00
Yih-Dar
b0a51e5cff
Fix flaky Hub CI (test_trainer.py) ( #35062 )
...
* fix
* Update src/transformers/testing_utils.py
Co-authored-by: Lucain <lucainp@gmail.com>
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* check
* check
* check
* check
* check
* check
* Update src/transformers/testing_utils.py
Co-authored-by: Lucain <lucainp@gmail.com>
* Update src/transformers/testing_utils.py
Co-authored-by: Lucain <lucainp@gmail.com>
* check
* check
* check
* Final space
* Final adjustment
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Lucain <lucainp@gmail.com>
2024-12-05 17:02:27 +01:00
Tibor Reiss
f297af55df
Fix: take into account meta device ( #34134 )
...
* Do not load for meta device
* Make some minor improvements
* Add test
* Update tests/utils/test_modeling_utils.py
Update test parameters
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Make the test simpler
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-11-20 11:32:07 +01:00