Cyril Vallez
ad30598923
Update Mistral converter ( #35967 )
...
* Update convert_mistral_weights_to_hf.py
* Update convert_mistral_weights_to_hf.py
* update
* style
* move it to integrations
* style
* trigger CIs
* trigger CIs
2025-02-04 11:13:12 +01:00
Ryoo Kwangrok
b1954fd64a
layernorm_decay_fix ( #35927 )
...
* layernorm_decay_fix
* W293 fix
* ruff format fix
* black format
* ruff format
* erase last layer
* add test_get_parameter_names_rmsnorm
* rmsnorm fix
2025-02-04 11:01:49 +01:00
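The layernorm_decay_fix entry above concerns which parameters are excluded from weight decay. Here is a minimal sketch of name-based filtering, assuming the fix works by matching parameter names. The log does not say so, and the patterns and the helper `split_decay_parameter_names` are hypothetical, not the library's API.

```python
import re
from typing import Iterable, List, Tuple

# Hypothetical name patterns for parameters that should skip weight decay
# (biases and normalization weights, including RMSNorm-style layers).
NO_DECAY_PATTERNS = [
    r"bias",
    r"layernorm",
    r"rmsnorm",
    r"(?:^|\.)norm(?:$|\.)",
    r"_norm(?:$|\.)",
]


def split_decay_parameter_names(names: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split parameter names into (decay, no_decay) lists by name pattern."""
    decay, no_decay = [], []
    for name in names:
        lowered = name.lower()
        if any(re.search(pattern, lowered) for pattern in NO_DECAY_PATTERNS):
            no_decay.append(name)
        else:
            decay.append(name)
    return decay, no_decay


names = [
    "model.layers.0.self_attn.q_proj.weight",
    "model.layers.0.input_layernorm.weight",
    "model.norm.weight",
    "lm_head.bias",
]
decay, no_decay = split_decay_parameter_names(names)
```

Matching on names rather than layer classes lets custom normalization modules be caught without importing their types, at the cost of depending on naming conventions.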
Dmitry Tarasov
2ba040a71f
apply_chat_template: consistent behaviour for return_assistant_tokens_mask=True return_tensors=True ( #35582 )
...
* apply_chat_template: consistent return_tensors behaviour with return_assistant_tokens_mask flag
* test_chat_template_return_assistant_tokens_mask: support tokenizers with no attention mask
* test_chat_template_return_assistant_tokens_mask: skip tokenizers with no padding token
* test_chat_template_return_assistant_tokens_mask: force tokenizer padding_side=right
---------
Co-authored-by: Eduard Allakhverdov <goncharova@airi.net>
Co-authored-by: d.tarasov <d.tarasov@airi.net>
2025-02-04 10:27:52 +01:00
Pavel Iakubovskii
9c02cb6233
Fix custom kernel for DeformableDetr, RT-Detr, GroundingDINO, OmDet-Turbo in PyTorch 2.6.0 ( #35979 )
...
Updates type().is_cuda() -> .is_cuda(); .data<> -> .data_ptr<>
2025-02-04 09:07:25 +00:00
Raushan Turganbay
5d75a25b03
Qwen2-VL: fix rope delta calculation ( #36013 )
...
* fix rope deltas calculation
* add test
* style
2025-02-04 09:48:29 +01:00
Alex Brooks
e284c7e954
Update Granite Vision Model Path / Tests ( #35998 )
...
* Update granite vision model path
Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>
* Enable granite vision test
Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>
---------
Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>
2025-02-03 20:06:03 +01:00
Gar
9d2056f12b
Add mean_resizing for every VLM's resizing_token_embeddings() ( #35717 )
...
* refine all resize_token_embedding()
* ruff format
* hotfix
2025-02-03 15:03:49 +01:00
Arthur
7eecdf2a86
Update-tp test ( #35844 )
...
* update test for now
* up
* cleanup
* update todo
2025-02-03 09:37:02 +01:00
Yih-Dar
62db3e6ed6
use torch 2.6 for daily CI ( #35985 )
...
use torch 2.6 for CI
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-31 18:58:23 +01:00
Yoni Gozlan
2b46943195
Add GOT-OCR 2.0 to Transformers ( #34721 )
...
* init modular got_ocr2
* Get correct got_ocr architecture
* add processing
* run modular with processing
* add working inference
* apply modular
* Refactor and fix style
* Refactor, cleanup, fix style
* fix init order
* Fix docs
* add base modeling tests
* fix style and consistency
* rename doc file
* fix repo consistency
* fix inference with box
* add image processing and support for crop_to_multi_page
* Fix batch inference
* add tests
* fixup
* fix slow test
* fix docstrings
* Add model doc
* update to new init
* fix input autocast pixel_values dtype
* update doc
* move doc to multimodal
* Reformat crop_image_to_patches and add docstrings
* Fix example in forward docstring
* Address Pablo review
* [run slow] got_ocr2
* remove defaults defined twice
* apply modular
* add torch_device to integration tests
* update modular
* follow-up Pavel review
* add device variable in doc
* fix doc multi-page
* Force eager attention for vision encoder to avoid attn implementation conflict
* revert qwen2vl doc changes
* use Qwen2ForCausalLM instead of Qwen2Model
* make fixup
* refactor gotocr2 to llava style
* uniformize function names and reduce checks
* final nits
* fix pixel_values dtype error
* change checkpoint names
* fix modular
2025-01-31 11:28:13 -05:00
Joao Gante
5bbee12ac9
[Moshi] disable automatic compilation if the model can't compile ( #35992 )
...
moshi can't compile
2025-01-31 15:53:06 +00:00
eustlb
e6f4a4ebbf
[Moonshine] compute head_dim_padding at init ( #35984 )
...
compute head_dim_padding at init
2025-01-31 14:26:52 +01:00
Yoni Gozlan
d7188ba600
Add support for nested images to LLava and VipLLava ( #35558 )
...
* move make_flat_list_of_images and make_batched_videos to image_utils
* remove unnecessary is_vision_available
* move make_nested_list_of_images to image_utils
* fix fast pixtral image processor
* fix import mllama
* fix make_nested_list_of_images
* add tests
* convert 4d arrays/tensors to list
* add test_make_batched_videos
* add support nested batch of videos
* fix image processing qwen2vl
2025-01-30 16:49:20 -05:00
Marcel
e4227eb4d4
Handle empty change indices in SAM's mask to rle conversion ( #35665 )
...
* Handle empty change indices in RLE conversion for masks
* [test] Add unit tests for RLE encoding of masks in SamProcessor
* [test] Update RLE conversion tests to use TensorFlow implementation
* [test] Fix formatting in SamProcessorTest according to check_code_quality action
* [test] Fix formatting in SamProcessorTest according to check_code_quality
* [test] Refactored rle test cases into one test and used tf tensors in tf test cases
* [test] Fix: removed self parameter from refactored methods
* [test] Removed nested methods in run-length encoding tests for PyTorch and TensorFlow
* [test] Added descriptions to individual run-length encoding tests for PyTorch and TensorFlow.
2025-01-30 19:08:38 +00:00
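The SAM entry above handles masks that produce no change indices, meaning masks that are all zeros or all ones. Here is a minimal pure-Python sketch of run-length encoding with that case made explicit. It assumes a column-major, uncompressed RLE whose counts start with a run of zeros, and `mask_to_rle` is a hypothetical helper, not the library's implementation.

```python
from typing import Dict, List


def mask_to_rle(mask: List[List[int]]) -> Dict[str, list]:
    """Encode a binary mask (rows of 0/1) as uncompressed, column-major RLE."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    # Flatten column by column (Fortran order).
    flat = [mask[r][c] for c in range(width) for r in range(height)]
    total = len(flat)
    # Indices where the pixel value differs from the previous pixel.
    change_idxs = [i for i in range(1, total) if flat[i] != flat[i - 1]]

    if not change_idxs:
        # Uniform (or empty) mask: there are no transitions to index into.
        # All zeros -> one run of zeros; all ones -> empty zero run, then ones.
        counts = [total] if total == 0 or flat[0] == 0 else [0, total]
        return {"size": [height, width], "counts": counts}

    # Counts always begin with the zero run, so emit 0 if the mask starts with 1.
    counts = [] if flat[0] == 0 else [0]
    boundaries = [0] + change_idxs + [total]
    counts += [end - start for start, end in zip(boundaries, boundaries[1:])]
    return {"size": [height, width], "counts": counts}


all_zeros = mask_to_rle([[0, 0], [0, 0]])
all_ones = mask_to_rle([[1, 1], [1, 1]])
half = mask_to_rle([[0, 1], [0, 1]])
```

In this sketch the run lengths always sum to height times width, which gives a cheap sanity check for the uniform-mask branch.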
Yih-Dar
47bd4296d6
not to use A100 for benchmark.yml ( #35974 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-30 18:55:36 +01:00
Nat Jeffries
693328f2bc
Support batching for UsefulSensors Moonshine ( #35922 )
...
* Add support for attention masking in moonshine.
Tested against Open ASR Leaderboard with batch size 256.
* Update comments and ensure attention masks are passed everywhere.
Perform attention mask downsampling inside of moonshine forward call.
* Hide padding behind conditional. Fix encoder/decoder masking.
- Correctly pipe encoder attention mask into decoder
- Add correct scaling factor if one is not already provided.
- Fix formatting with ruff
* Add auto generated modeling_moonshine file.
* Update formatting in generated model file.
* Address review comments.
* Fix typo.
* Add `pad_head_dim_to_multiple_of` to moonshine config.
* Correct args order for MoonshineConfig.
* Update configuration moonshine too.
* Update src/transformers/models/moonshine/modular_moonshine.py
* Update src/transformers/models/moonshine/configuration_moonshine.py
---------
Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>
2025-01-30 17:08:07 +01:00
Yih-Dar
5757681837
Less flaky for TimmBackboneModelTest::test_batching_equivalence ( #35971 )
...
* fix
* remove is_flaky
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-30 16:56:26 +01:00
Matt
e320d5542e
Revert p_mask to a list in DQA pipeline ( #35964 )
...
* p_mask back to being a list
* Remove breakpoint
2025-01-30 15:37:59 +00:00
Raushan Turganbay
365fecb4d0
Whisper: fix static cache CI ( #35852 )
...
* fix
* remove overridden method
* small change
2025-01-30 12:43:00 +01:00
Raushan Turganbay
9725e5be2f
Pixtral: vectorize patch embeddings and enable tests ( #35122 )
...
* initial POC
* - batch mix feature
* fix tests
* fix tests
* make style
* do not skip and instead fix tests
* update
* return back the test
* correct text with the correct ckpt
2025-01-30 12:40:18 +01:00
Joao Gante
8bc4c89ee9
[bart] minor test fixes ( #35965 )
...
fix tests
2025-01-30 10:00:11 +00:00
Ilyas Moutawwakil
19f2ec80cf
Fix is_causal being a tensor ( #35791 )
...
* fix is_causal being a tensor
* convert in sdpa attention only when jit tracing
2025-01-30 09:22:33 +01:00
Wing Lian
7547f55e5d
fix iterator overflow when gradient accumulation is 1 ( #35960 )
2025-01-29 14:45:09 -05:00
Joao Gante
4d3b1076a1
[generate] move max time tests ( #35962 )
...
* move max time tests to their right place
* move test to the right place
2025-01-29 17:56:46 +00:00
Boris Malashenko
4d1d489617
Update README.md ( #35958 )
...
There should be a dot after pip install .
2025-01-29 15:46:26 +00:00
Fanli Lin
f0ae65c198
[tests] further fix Tester object has no attribute '_testMethodName' ( #35781 )
...
* bug fix
* update with more cases
* more entries
* Fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-29 16:05:33 +01:00
Yih-Dar
ec7790f0d3
update docker file transformers-pytorch-deepspeed-latest-gpu ( #35940 )
...
update docker file for deepspeed
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-29 16:01:27 +01:00
Zach Mueller
5d257111c1
Trainer Refactor: Part 1 ( #35567 )
...
* start
* So far: 30%
* Small fix
* Continuing update
* Continuing
* Forgot to check if not None
* Continuing refactor
* Fix if else
* Fix ref
* Should make tests pass
* Keep grad norm same
* Document
* Apply suggestions from code review
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Err instead of info for logging RNG state error
* Separate out to func
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-01-29 09:50:54 -05:00
Jonas Rohw
23d782ead2
Output dicts support in text generation pipeline ( #35092 )
...
* Support for generate_argument: return_dict_in_generate=True, instead of returning an error
* fix: call test with return_dict_in_generate=True
* fix: Only import torch if it is present
* update: Encapsulate output_dict changes
* fix: added back original comments
---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-01-29 14:44:46 +00:00
Yih-Dar
cf90404807
Fix flaky test_assisted_decoding_matches_greedy_search ( #35951 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-29 14:50:07 +01:00
Yih-Dar
692afa102d
Update squad_convert_example_to_features to work with numpy v2 ( #35955 )
...
* Fix
* Fix
* Fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-29 14:33:06 +01:00
Yih-Dar
c600e89f5c
Update unwrap_and_save_reload_schedule to use weights_only=False ( #35952 )
...
* fix
* Fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-29 14:30:57 +01:00
Nadav Timor
42c8ccfd4c
fix test_generated_length_assisted_generation ( #34935 )
...
fix test_generated_length_assisted_generation
2025-01-29 12:03:45 +00:00
Mohamed Abu El-Nasr
ec7afad609
use torch constraints to check if covariance is positive definite during mean resizing. ( #35693 )
...
* use torch constraints to check for psd
* small nit
* Small change
* Small change for the ci
* nit
2025-01-28 17:33:42 +01:00
Ella Charlaix
61cbb723fc
Remove INC notebook reference in documentation ( #35936 )
...
remove INC notebook in documentation
2025-01-28 17:10:02 +01:00
NanoCode012
478c4f2d0d
fix(FA): QKV not being cast to target_dtype for FA with dpo lora ( #35834 )
...
fix(FA): QKV not being cast to target_dtype due to dtype check
2025-01-28 17:06:56 +01:00
Joao Gante
ece8c42488
Test: generate with torch.compile(model.forward) as a fast test ( #34544 )
2025-01-28 14:10:38 +00:00
Cyril Vallez
f48ecd7608
Fix TP initialization ( #35860 )
...
* fix tp
* Update modeling_utils.py
* style
* style
* Update test_tp.py
* Update test_tp.py
* style
* Update test_tp.py
* Update test_tp.py
* Update test_tp.py
* Update test_tp.py
2025-01-28 15:07:37 +01:00
Raushan Turganbay
f85ba20449
Qwen-2-5-VL: fix CI ( #35935 )
...
fix
2025-01-28 14:51:57 +01:00
Cyril Vallez
3f860dba55
Fix mask slicing for models with HybridCache ( #35681 )
...
* correctly slice
* check mask
* Update modular_gemma2.py
* fix
* add tests
* fix typo
* finally fix mask slicing
* Finally correctly slice in all cases!!
* add test for all attention functions
* small fix in tests
* trick around dynamo tracing issue
* last update
* more robust
* kwargs propagation
* make it explicit for checkpointing
* apply modular
2025-01-28 14:35:00 +01:00
Raushan Turganbay
b764c20b09
Fix: loading DBRX back from saved path ( #35728 )
...
* fix dtype as dict for some models + add test
* add comment in tests
2025-01-28 11:38:45 +01:00
Cyril Vallez
3613f568cd
Add default TP plan for all models with backend support ( #35870 )
...
* Add some tp plans!
* More tp plans!
* Add it in the comment
* style
* Update configuration_mixtral.py
* Update configuration_phi.py
* update the layout according to special archs
* fix mixtral
* style
* trigger CIs
* trigger CIs
* CIs
* olmo2
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-01-28 11:20:58 +01:00
ivarflakstad
96625d85fd
Use rocm6.2 for AMD images ( #35930 )
...
* Use rocm6.2 as rocm6.3 only has nightly pytorch wheels atm
* Use stable wheel index for torch libs
2025-01-28 11:10:28 +01:00
Yih-Dar
bf16a182ba
Remove _supports_static_cache = True for some model classes ( #34975 )
...
* use mask_fill
* remove comment
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-28 10:42:10 +01:00
Steven Liu
86d7564611
[docs] Fix Zamba2 ( #35916 )
...
fix code block
2025-01-27 11:44:10 -08:00
Matt
414658f94f
Close Zamba2Config code block ( #35914 )
...
* close zamba2 code block
* Add Zamba2 to toctree
2025-01-27 19:09:42 +00:00
Matt
63e9c941eb
Fix the config class comparison for remote code models ( #35592 )
...
* Fix the config class comparison when repeatedly saving and loading remote code models
* once again you have committed your debug breakpoint
2025-01-27 18:37:30 +00:00
Steven Liu
c550a1c640
[docs] uv install ( #35821 )
...
uv install
2025-01-27 08:49:28 -08:00
CalOmnie
cd6591bfb2
Fix typing in audio_utils.chroma_filter_bank ( #35888 )
...
* Fix typing in audio_utils.chroma_filter_bank
* Apply make style
---------
Co-authored-by: Louis Groux <louis.cal.groux@gmail.com>
2025-01-27 16:06:03 +00:00
Isotr0py
e57b459997
Split and clean up GGUF quantization tests ( #35502 )
...
* clean up ggml test
Signed-off-by: Isotr0py <2037008807@qq.com>
* port remaining tests
Signed-off-by: Isotr0py <2037008807@qq.com>
* further cleanup
Signed-off-by: Isotr0py <2037008807@qq.com>
* format
Signed-off-by: Isotr0py <2037008807@qq.com>
* fix broken tests
Signed-off-by: Isotr0py <2037008807@qq.com>
* update comment
Signed-off-by: Isotr0py <2037008807@qq.com>
* fix
Signed-off-by: Isotr0py <2037008807@qq.com>
* reorganize tests
Signed-off-by: Isotr0py <2037008807@qq.com>
* k-quants use qwen2.5-0.5B
Signed-off-by: Isotr0py <2037008807@qq.com>
* move ggml tokenization test
Signed-off-by: Isotr0py <2037008807@qq.com>
* remove dead code
Signed-off-by: Isotr0py <2037008807@qq.com>
* add assert for serialization test
Signed-off-by: Isotr0py <2037008807@qq.com>
* use str for parameterize
Signed-off-by: Isotr0py <2037008807@qq.com>
---------
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-01-27 15:46:57 +01:00