Ryan Mullins
487dab1b2b
Shieldgemma2 ( #36678 )
...
* single commit
* correct config
* fixup
* dummy pt
* Use ShieldGemma2Config in conversion script
* Update src/transformers/models/shieldgemma2/configuration_shieldgemma2.py
* Adding shieldgemma2 to models.__init__.py
* Adding ShieldGemma2 to main __init__.py
* Update shieldgemma2.md
* Update shieldgemma2.md
* Adding tests. Addressing review feedback.
* Minor docs update
* Fixing code quality feedback from CI
* Fixing empty messages bug reported by ghunkins
---------
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: Ren Pang <ain-soph@live.com>
2025-03-20 15:14:38 +01:00
HuangBugWei
a63e92e2f0
Fix: remove the redundant snippet of _whole_word_mask ( #36759 )
...
remove the redundant snippet of _whole_word_mask
2025-03-20 14:10:43 +00:00
Ryan Mullins
8124a234ca
Gemma 3: Adding explicit GenerationConfig and refactoring conversion … ( #36833 )
...
Gemma 3: Adding explicit GenerationConfig and refactoring conversion script
2025-03-20 15:03:32 +01:00
Pavel Iakubovskii
cf8091c017
Fix import for torch 2.0, 2.1 - guard typehint for "device_mesh" ( #36768 )
...
* Fix device_mesh
* Remove rebase leftover
2025-03-20 11:55:47 +00:00
Marc Sun
388e6659bf
Update min safetensors bis ( #36823 )
...
* update setup.py
* style
2025-03-20 12:50:07 +01:00
Joao Gante
b47d9b2f8a
[generate] clarify docstrings: when to inherit GenerationMixin ( #36605 )
2025-03-20 10:58:54 +00:00
Joao Gante
8e97b44087
[modular] Sort modular skips ( #36304 )
2025-03-20 10:55:12 +00:00
Artem Kudisov
63380b77d4
Pass state dict ( #35234 )
...
* Pass state_dict argument to get_peft_model_state_dict
* Style fix
* Change arguments order
2025-03-20 11:54:59 +01:00
Joao Gante
957b05b413
[qwen2 audio] remove redundant code and update docs ( #36282 )
2025-03-20 10:54:51 +00:00
rasmi
f0d5b2ff04
Update deprecated Jax calls ( #35919 )
...
* Remove deprecated arguments for jax.numpy.clip.
* Remove deprecated arguments for jax.numpy.clip.
* Update jax version to 0.4.27 to 0.4.38.
* Avoid use of deprecated xla_bridge.get_backend().platform
---------
Co-authored-by: Jake Vanderplas <jakevdp@google.com>
2025-03-20 11:51:51 +01:00
Pavel Iakubovskii
1ddb64937c
Fix fp16 ONNX export for RT-DETR and RT-DETRv2 ( #36460 )
...
* Fix FP16 ONNX export
* Fix typo
* Sync omdet-turbo
* Refactor encoder for better readability
* Fix _no_split_modules
* Fix int -> torch_int
* Fix rt_detr
* Apply to rt-detr-v2
* Fixup
* Fix copies
2025-03-20 10:43:51 +00:00
AbdelKarim ELJANDOUBI
e7337ee7be
Pass num_items_in_batch directly to loss computation ( #36753 )
...
* Pass num_items_in_batch directly to loss computation
* use self loss instead
* fix loss kwargs
* fix vocab size
2025-03-20 10:35:35 +00:00
yutong_liu
8b479e39bb
Saving Trainer.collator.tokenizer when Trainer.processing_class is None ( #36552 )
...
* feat: Saving tokenizer in collator when processing_class is None
* chore: Style issue
* chore: Typo
* dbg: Check why test failed
* dbg: Remove logic; another test that previously succeeded now fails, so this is likely a stability issue
* test: Init unit-test
* chore: Style
* chore: Add err log
* fix: Case
* Update tests/trainer/test_trainer.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* chore: Try to use get_regression_trainer
* fix: Impl and style
* fix: Style
* fix: Case
* fix: Import err
* fix: Missed import
* fix: Import block un-sorted problem
* fix: Try another tokenizer
* fix: Test logic
* chore: Light updates
* chore: Reformat
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-03-20 11:27:47 +01:00
Ita Zaporozhets
3f03c379d2
fix tiktoken convert to pass AddedToken to Tokenizer ( #36566 )
...
* pass AddedToken to Tokenizer
* ruff
* handle dict for special tokens
* option: test tokenizer from tiktoken same as fast
* ruff
* ruff
2025-03-20 11:26:49 +01:00
Stas Bekman
8f64b177f6
[ForCausalLMLoss] allow users to pass shifted labels ( #36607 )
...
* [ForCausalLMLoss] allow users to pass shifted labels
Signed-off-by: Stas Bekman <stas@stason.org>
* style
Signed-off-by: Stas Bekman <stas@stason.org>
---------
Signed-off-by: Stas Bekman <stas@stason.org>
2025-03-20 11:25:22 +01:00
HDCharles
94555437e2
Disable inductor config setter by default ( #36608 )
...
* Disable inductor config setter by default
This is hard to debug and should be off by default
* remove default settings in autoquant too
* Add info to torchao.md about recommended settings
* satisfying Ruff format
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-03-20 11:23:14 +01:00
Ze-Yi LIN
8733297b41
Fix swanlab global step ( #36728 )
...
* fix
* global step
2025-03-20 11:13:37 +01:00
Quentin Gallouédec
b815fae359
Move the warning to the documentation for DataCollatorWithFlattening ( #36707 )
...
Remove init warning
2025-03-20 11:09:57 +01:00
Matt
9be4728af8
Just import torch AdamW instead ( #36177 )
...
* Just import torch AdamW instead
* Update docs too
* Make AdamW undocumented
* make fixup
* Add a basic wrapper class
* Add it back to the docs
* Just remove AdamW entirely
* Remove some AdamW references
* Drop AdamW from the public init
* make fix-copies
* Cleanup some references
* make fixup
* Delete lots of transformers.AdamW references
* Remove extra references to adamw_hf
2025-03-19 18:29:40 +00:00
Michael Feil
51bd0ceb9e
Update configuration_qwen2.py ( #36735 )
...
* Update configuration_qwen2_moe.py
* Update modeling_qwen2_moe.py
* ruff fmt
* docstring add qkv_bias
2025-03-19 18:15:54 +00:00
JJJYmmm
107fedc1e2
quick fix fast_image_processor register error ( #36716 )
...
* fix fast_image_processor register error
* update error message
* remove redundant import
* fix format
2025-03-19 18:05:45 +00:00
Mohamed Mekkouri
258dd9cc69
Add Space to Bitsandbytes doc ( #36834 )
...
* add space
* address review
2025-03-19 18:56:07 +01:00
Tugsbayasgalan Manlaibaatar
f39f4960f3
Support traceable dynamicKVcache ( #36311 )
...
* Support traceable dynamicKVcache
* Fix lint
* More fine grained test
* Lint
* Update
* Update
* Fix up
* Apply suggestions from code review
* Update src/transformers/cache_utils.py
* Update tests/utils/test_cache_utils.py
* Apply suggestions from code review
* Update
* Change error message
* Rename
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
---------
Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-03-19 16:52:30 +00:00
Matt
63c3116530
One more fix for reviewer assignment ( #36829 )
...
* one more fix
* one more fix
* Trigger tests
2025-03-19 16:25:24 +00:00
Joao Gante
7c233980f4
[gemma 3] multimodal checkpoints + AutoModelForCausalLM ( #36741 )
2025-03-19 15:04:19 +00:00
Yao Matrix
b11050d6a2
enable OffloadedCache on XPU from PyTorch 2.7 ( #36654 )
...
* fix "Cannot copy out of meta tensor; no data!" issue for BartForConditionalGeneration model
* follow Marc's suggestion to use _tie_weights to fix
Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
* enable OffloadedCache on XPU since PyTorch 2.7
Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
* fix style
Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
* don't change bart
Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>
* make code more concise per review comments
Signed-off-by: N <matrix.yao@intel.com>
* fix review comments
Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>
* Revert "fix review comments"
This reverts commit acf1484b86.
* fix review comments
Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>
* fix style
Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>
---------
Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>
Signed-off-by: N <matrix.yao@intel.com>
Co-authored-by: root <root@a4bf01945cfe.jf.intel.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-03-19 15:15:52 +01:00
Driss Guessous
e8d960329e
Add option for ao base configs ( #36526 )
2025-03-19 14:59:47 +01:00
Arthur
fef8b7f8e9
Add attention visualization tool ( #36630 )
...
* add utils file
* style
* nits
* nits
* update
* updates
* update
* fix init issues
* big updates
* nits
* nits?
* small updates
* nits
* there were still some models left
* style
* fixes
* updates
* nits _ fixes
* push changes
* update
* update
* update
* Apply suggestions from code review
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
* style
* styling and return a string for testing
* small updates
* always bidirectional for now
* update
---------
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
2025-03-19 13:58:46 +01:00
Joao Gante
0fe0bae0a8
[Generation] remove leftover code from end-to-end compilation ( #36685 )
2025-03-19 11:28:33 +00:00
Mohamed Mekkouri
a861db01e5
Fix Device map for bitsandbytes tests ( #36800 )
...
fix
2025-03-19 11:57:13 +01:00
Yih-Dar
b9374a0763
Remove "dist": "loadfile" for pytest in CircleCI jobs ( #36811 )
...
* fasterrrrr
* avoid crash in example jobs
* avoid crash in TF example jobs
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-19 11:15:09 +01:00
Yao Matrix
4fa91b1be5
fix "Cannot copy out of meta tensor; no data!" issue for BartForConditionalGeneration model ( #36572 )
...
* fix "Cannot copy out of meta tensor; no data!" issue for BartForConditionalGeneration model
* follow Marc's suggestion to use _tie_weights to fix
Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
* fix review comments.
Signed-off-by: N <matrix.yao@intel.com>
* fix quality
Signed-off-by: N <matrix.yao@intel.com>
---------
Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
Signed-off-by: N <matrix.yao@intel.com>
2025-03-19 10:48:47 +01:00
ivarflakstad
706703bba6
Expectations test utils ( #36569 )
...
* Add expectation classes + tests
* Use typing Union instead of |
* Use bits to track score in properties cmp method
* Add exceptions and tests + comments
* Remove compute cap minor as it is not needed currently
* Simplify. Remove Properties class
* Add example Exceptions usage
* Expectations as dict subclass
* Update example Exceptions usage
* Refactor. Improve type name. Document score fn.
* Rename to DeviceProperties.
2025-03-18 23:39:50 +01:00
Joao Gante
179d02ffb8
[generate] ✨ vectorized beam search ✨ ( #35802 )
2025-03-18 18:39:36 +00:00
Yoni Gozlan
12f2ebef63
Support custom docstrings in modular ( #36726 )
...
* Override docstrings in modular if not none
* Update doc
2025-03-18 14:00:54 -04:00
Gar
00915d3041
Fix chameleon's TypeError because inputs_embeds may be None ( #36673 )
...
* fix chameleon TypeError when inputs_embeds is None
* reformat
* hotfix
2025-03-18 18:59:30 +01:00
Marc Sun
14b597f518
Fix casting dtype for quantization ( #36799 )
...
* fix
* remove print
2025-03-18 18:46:03 +01:00
Yoni Gozlan
30580f035b
Fix Mistral3 tests ( #36797 )
...
* fix processor tests
* fix modeling tests
* fix test processor chat template
* revert modeling test changes
2025-03-18 13:08:12 -04:00
Cyril Vallez
db1d4c5a0b
Loading optimizations ( #36742 )
...
* improvements
* Update modeling_utils.py
* add some doc about loading
* Update modeling_utils.py
2025-03-18 16:38:44 +01:00
Yih-Dar
7baf00089a
Update SHA for tj-actions/changed-files ( #36795 )
...
* trigger
* trigger
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-18 16:19:39 +01:00
Marc Sun
3017536ebf
fix hqq due to recent modeling changes ( #36771 )
...
* fix-hqq
* style
* test
2025-03-18 12:20:27 +01:00
Cyril Vallez
e959530b8f
Add Mistral3 ( #36790 )
...
* initial start
* style and dummies
* Create convert_mistral3_weights_to_hf.py
* update
* typo
* typo
* Update convert_mistral3_weights_to_hf.py
* Update convert_mistral3_weights_to_hf.py
* Update convert_mistral3_weights_to_hf.py
* Update convert_mistral3_weights_to_hf.py
* up
* Update convert_mistral3_weights_to_hf.py
* Update convert_mistral3_weights_to_hf.py
* update
* update
* Update image_processing_mistral3.py
* Update convert_mistral3_weights_to_hf.py
* fix patch merger
* Update convert_mistral3_weights_to_hf.py
* Update convert_mistral3_weights_to_hf.py
* up
* update modular to fit
* style
* Update convert_mistral3_weights_to_hf.py
* typo
* Update modular_mistral3.py
* simplify a lot all shape shenanigans
* simplify
* add working test processor
* Add partially working common modeling tests
* All tests working and remove mistral3 image processors
* add docs and fixup
* fix inference with image size >1540
* 🚨 fix test image proc pixtral
* Remove vision_feature_select_strategy
* Update convert_mistral3_weights_to_hf.py
* Update convert_mistral3_weights_to_hf.py
* Update convert_mistral3_weights_to_hf.py
* Update convert_mistral3_weights_to_hf.py
* clean
* fix test checkpoints
* Update test_modeling_mistral3.py
* Update test_modeling_mistral3.py
* style
* Use Pixtral processor
* up
* finish cleaning processor to use pixtral directly
* Update __init__.py
* Update processing_pixtral.py
* doc
* Update __init__.py
* Update mistral3.md
* Update _toctree.yml
---------
Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
Co-authored-by: yonigozlan <yoni.gozlan10@gmail.com>
2025-03-18 12:04:42 +01:00
Lysandre Debut
bd92073692
Fix gemma3_text tokenizer in mapping ( #36793 )
2025-03-18 11:50:22 +01:00
Zebin
7426d02ea8
Fixing typo in gemma3 image_processor_fast and adding a small test ( #36776 )
...
Co-authored-by: zebz13 <zeb@fedora>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-03-18 11:35:06 +01:00
Afanti
19b9d8ae13
chore: fix typos in tests directory ( #36785 )
...
* chore: fix typos in tests directory
* chore: fix typos in tests directory
* chore: fix typos in tests directory
* chore: fix typos in tests directory
* chore: fix typos in tests directory
* chore: fix typos in tests directory
* chore: fix typos in tests directory
2025-03-18 10:31:13 +01:00
Afanti
7f5077e536
fix typos in the tests directory ( #36717 )
2025-03-17 17:45:57 +00:00
Daniel Kleine
cbfb8d7b27
doc: Clarify is_decoder usage in PretrainedConfig documentation ( #36724 )
...
* fix: clarify decoder usage in PretrainedConfig documentation
* Apply suggestions from code review
updated doc
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-03-17 09:40:25 -07:00
Steven Liu
ac1a1b66b9
[docs] Update README ( #36265 )
...
* update
* feedback
* feedback
* update versions
2025-03-17 09:37:19 -07:00
Joao Gante
cff4caa0c1
[CI] remove redundant checks in test_eager_matches_sdpa_inference ( #36740 )
2025-03-17 16:29:18 +00:00
Christopher Akiki
e3af4fec91
[MINOR:TYPO] Update hubert.md ( #36733 )
...
* [MINOR:TYPO] Update hubert.md
- typo fix (wave2vec instead of hubert)
- make code snippet copiable and runnable
* Run tests
2025-03-17 09:07:51 -07:00