transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-18 03:58:25 +06:00

Author	SHA1	Message	Date
Cyril Vallez	ad30598923	Update Mistral converter (#35967 ) * Update convert_mistral_weights_to_hf.py * Update convert_mistral_weights_to_hf.py * update * style * move it to integrations * style * trigger CIs * trigger CIs	2025-02-04 11:13:12 +01:00
Ryoo Kwangrok	b1954fd64a	layernorm_decay_fix (#35927 ) * layernorm_decay_fix * W293 fix * ruff format fix * black format * ruff format * erase last layer * add test_get_parameter_names_rmsnorm * rmsnorm fix	2025-02-04 11:01:49 +01:00
Dmitry Tarasov	2ba040a71f	apply_chat_template: consistent behaviour for return_assistant_tokens_mask=True return_tensors=True (#35582 ) * apply_chat_template: consistent return_tensors behaviour with return_assistant_tokens_mask flag * test_chat_template_return_assistant_tokens_mask: support tokenizers with no attention mask * test_chat_template_return_assistant_tokens_mask: skip tokenizers with no padding token * test_chat_template_return_assistant_tokens_mask: force tokenizer padding_side=right --------- Co-authored-by: Eduard Allakhverdov <goncharova@airi.net> Co-authored-by: d.tarasov <d.tarasov@airi.net>	2025-02-04 10:27:52 +01:00
Pavel Iakubovskii	9c02cb6233	Fix custom kernel for DeformableDetr, RT-Detr, GroundingDINO, OmDet-Turbo in Pytorch 2.6.0 (#35979 ) Updates type().is_cuda() -> .is_cuda(); .data<> -> .data_ptr<>	2025-02-04 09:07:25 +00:00
Raushan Turganbay	5d75a25b03	Qwen2-VL: fix rope delta calculation (#36013 ) * fix rope deltas calculation * add test * style	2025-02-04 09:48:29 +01:00
Alex Brooks	e284c7e954	Update Granite Vision Model Path / Tests (#35998 ) * Update granite vision model path Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com> * Enable granite vision test Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com> --------- Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>	2025-02-03 20:06:03 +01:00
Gar	9d2056f12b	Add mean_resizing for every VLMs' resizing_token_embeddings() (#35717 ) * refine all resize_token_embedding() * ruff format * hotfix	2025-02-03 15:03:49 +01:00
Arthur	7eecdf2a86	Update-tp test (#35844 ) * update test for now * up * cleanup * update todo	2025-02-03 09:37:02 +01:00
Yih-Dar	62db3e6ed6	use torch 2.6 for daily CI (#35985 ) use torch 2.6 for CI Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-01-31 18:58:23 +01:00
Yoni Gozlan	2b46943195	Add GOT-OCR 2.0 to Transformers (#34721 ) * init modular got_ocr2 * Get correct got_ocr architecture * add processing * run modular with processing * add working inference * apply modular * Refactor and fix style * Refactor, cleanup, fix style * fix init order * Fix docs * add base modeling tests * fix style and consistency * rename doc file * fix repo consistency * fix inference with box * add image processing and support for crop_to_multi_page * Fix batch inference * add tests * fixup * fix slow test * fix docstrings * Add model doc * update to new init * fix input autocast pixel_values dtype * update doc * move doc to multimodal * Reformat crop_image_to_patches and add docstrings * Fix example in forward docstring * Address Pablo review * [run slow] got_ocr2 * remove defaults defined twice * apply modular * add torch_device to integration tests * update modular * follow-up Pavel review * add device variable in doc * fix doc multi-page * Force eager attention for vision encoder to avoid attn implementation conflict * revert qwen2vl doc changes * use Qwen2ForCausalLM instead of Qwen2Model * make fixup * refactor gotocr2 to llava style * uniformize function names and reduce checks * final nits * fix pixel_values dtype error * change checkpoint names * fix modular	2025-01-31 11:28:13 -05:00
Joao Gante	5bbee12ac9	[Moshi] disable automatic compilation if the model can't compile (#35992 ) moshi cant compile	2025-01-31 15:53:06 +00:00
eustlb	e6f4a4ebbf	[Moonshine] compute head_dim_padding at init (#35984 ) compute head_dim_padding at init	2025-01-31 14:26:52 +01:00
Yoni Gozlan	d7188ba600	Add support for nested images to LLava and VipLLava (#35558 ) * move make_flat_list_of_images and make_batched_videos to image_utils * remove unnecessary is_vision_available * move make_nested_list_of_images to image_utils * fix fast pixtral image processor * fix import mllama * fix make_nested_list_of_images * add tests * convert 4d arrays/tensors to list * add test_make_batched_videos * add support nested batch of videos * fix image processing qwen2vl	2025-01-30 16:49:20 -05:00
Marcel	e4227eb4d4	Handle empty change indices in SAM's mask to rle conversion (#35665 ) * Handle empty change indices in RLE conversion for masks * [test] Add unit tests for RLE encoding of masks in SamProcessor * [test] Update RLE conversion tests to use TensorFlow implementation * [test] Fix formatting in SamProcessorTest according to check_code_quality action * [test] Fix formatting in SamProcessorTest according to check_code_quality * [test] Refactored rle test cases into one test and used tf tensors in tf test cases * [test] Fix: removed self parameter from refactored methods * [test] Removed nested methods in run-length encoding tests for PyTorch and TensorFlow * [test] Added description to individual to run-length encoding tests for PyTorch and TensorFlow.	2025-01-30 19:08:38 +00:00
Yih-Dar	47bd4296d6	not to use A100 for `benchmark.yml` (#35974 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-01-30 18:55:36 +01:00
Nat Jeffries	693328f2bc	Support batching for UsefulSensors Moonshine (#35922 ) * Add support for attention masking in moonshine. Tested against Open ASR Leaderboard with batch size 256. * Update comments and ensure attention masks are passed everywhere. Perform attention mask downsampling inside of moonshine forward call. * Hide padding behind conditional. Fix encoder/decoder masking. - Correctly pipe encoder attention mask into decoder - Add correct scaling factor if one is not already provided. - Fix formatting with ruff * Add auto generated modeling_moonshine file. * Update formatting in generated model file. * Address review comments. * Fix typo. * Add `pad_head_dim_to_multiple_of` to moonshine config. * Correct args order for MoonshineConfig. * Update configuration moonshine too. * Update src/transformers/models/moonshine/modular_moonshine.py * Update src/transformers/models/moonshine/configuration_moonshine.py --------- Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>	2025-01-30 17:08:07 +01:00
Yih-Dar	5757681837	Less flaky for `TimmBackboneModelTest::test_batching_equivalence` (#35971 ) * fix * remove is_flaky * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-01-30 16:56:26 +01:00
Matt	e320d5542e	Revert p_mask to a list in DQA pipeline (#35964 ) * p_mask back to being a list * Remove breakpoint	2025-01-30 15:37:59 +00:00
Raushan Turganbay	365fecb4d0	Whisper: fix static cache CI (#35852 ) * fix * remove overridden method * small change	2025-01-30 12:43:00 +01:00
Raushan Turganbay	9725e5be2f	Pixtral: vectorize patch embeddings and enable tests (#35122 ) * initial POC * - batch mix feature * fix tests * fix tests * make style * do not skip and instead fix tests * update * return back the test * correct text with the correct ckpt	2025-01-30 12:40:18 +01:00
Joao Gante	8bc4c89ee9	[bart] minor test fixes (#35965 ) fix tests	2025-01-30 10:00:11 +00:00
Ilyas Moutawwakil	19f2ec80cf	Fix is_causal being a tensor (#35791 ) * fix is_causal being a tensor * convert in sdpa attention only when jit tracing	2025-01-30 09:22:33 +01:00
Wing Lian	7547f55e5d	fix iterator overflow when gradient accumulation is 1 (#35960 )	2025-01-29 14:45:09 -05:00
Joao Gante	4d3b1076a1	[generate] move max time tests (#35962 ) * move max time tests to their right place * move test to the right place	2025-01-29 17:56:46 +00:00
Boris Malashenko	4d1d489617	Update README.md (#35958 ) There should be a dot after pip install .	2025-01-29 15:46:26 +00:00
Fanli Lin	f0ae65c198	[tests] further fix `Tester object has no attribute '_testMethodName'` (#35781 ) * bug fix * update with more cases * more entries * Fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-01-29 16:05:33 +01:00
Yih-Dar	ec7790f0d3	update docker file `transformers-pytorch-deepspeed-latest-gpu` (#35940 ) update docker file for deepspeed Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-01-29 16:01:27 +01:00
Zach Mueller	5d257111c1	Trainer Refactor: Part 1 (#35567 ) * start * So far: 30% * Small fix * Continuing update * Continuing * Forgot to check if not None * Continuing refactor * Fix if else * Fix ref * Should make tests pass * Keep grad norm same * Document * Apply suggestions from code review Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Err instead of info for logging RNG state error * Separate out to func --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-01-29 09:50:54 -05:00
Jonas Rohw	23d782ead2	Output dicts support in text generation pipeline (#35092 ) * Support for generate_argument: return_dict_in_generate=True, instead of returning a error * fix: call test with return_dict_in_generate=True * fix: Only import torch if it is present * update: Encapsulate output_dict changes * fix: added back original comments --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2025-01-29 14:44:46 +00:00
Yih-Dar	cf90404807	Fix flaky `test_assisted_decoding_matches_greedy_search` (#35951 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-01-29 14:50:07 +01:00
Yih-Dar	692afa102d	Update `squad_convert_example_to_features` to work with numpy v2 (#35955 ) * Fix * Fix * Fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-01-29 14:33:06 +01:00
Yih-Dar	c600e89f5c	Update `unwrap_and_save_reload_schedule` to use `weights_only=False` (#35952 ) * fix * Fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-01-29 14:30:57 +01:00
Nadav Timor	42c8ccfd4c	fix `test_generated_length_assisted_generation` (#34935 ) fix test_generated_length_assisted_generation	2025-01-29 12:03:45 +00:00
Mohamed Abu El-Nasr	ec7afad609	use torch constraints to check if covariance is positive definite during mean resizing. (#35693 ) * use torch constraints to check for psd * small nit * Small change * Small change for the ci * nit	2025-01-28 17:33:42 +01:00
Ella Charlaix	61cbb723fc	Remove INC notebook reference in documentation (#35936 ) remove INC notebook in documentation	2025-01-28 17:10:02 +01:00
NanoCode012	478c4f2d0d	fix(FA): QKV not being casted to target_dtype for FA with dpo lora (#35834 ) fix(FA): QKV not being casted to target_dtype due to dtype check	2025-01-28 17:06:56 +01:00
Joao Gante	ece8c42488	Test: generate with `torch.compile(model.forward)` as a fast test (#34544 )	2025-01-28 14:10:38 +00:00
Cyril Vallez	f48ecd7608	Fix TP initialization (#35860 ) * fix tp * Update modeling_utils.py * style * style * Update test_tp.py * Update test_tp.py * style * Update test_tp.py * Update test_tp.py * Update test_tp.py * Update test_tp.py	2025-01-28 15:07:37 +01:00
Raushan Turganbay	f85ba20449	Qwen-2-5-VL: fix CI (#35935 ) fix	2025-01-28 14:51:57 +01:00
Cyril Vallez	3f860dba55	Fix mask slicing for models with HybridCache (#35681 ) * correctly slice * check mask * Update modular_gemma2.py * fix * add tests * fix typo * finally fix mask slicing * Finally correctly slice in all cases!! * add test for all attention functions * small fix in tests * trick around dynamo tracing issue * last update * more robust * kwargs propagation * make it explicit for checkpointing * apply modular	2025-01-28 14:35:00 +01:00
Raushan Turganbay	b764c20b09	Fix: loading DBRX back from saved path (#35728 ) * fix dtype as dict for some models + add test * add comment in tests	2025-01-28 11:38:45 +01:00
Cyril Vallez	3613f568cd	Add default TP plan for all models with backend support (#35870 ) * Add some tp plans! * More tp plans! * Add it in the comment * style * Update configuration_mixtral.py * Update configuration_phi.py * update the layout according to special archs * fix mixtral * style * trigger CIs * trigger CIs * CIs * olmo2 --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2025-01-28 11:20:58 +01:00
ivarflakstad	96625d85fd	Use rocm6.2 for AMD images (#35930 ) * Use rocm6.2 as rocm6.3 only has nightly pytorch wheels atm * Use stable wheel index for torch libs	2025-01-28 11:10:28 +01:00
Yih-Dar	bf16a182ba	Remove `_supports_static_cache = True` for some model classes (#34975 ) * use mask_fill * remove comment --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-01-28 10:42:10 +01:00
Steven Liu	86d7564611	[docs] Fix Zamba2 (#35916 ) fix code block	2025-01-27 11:44:10 -08:00
Matt	414658f94f	Close Zamba2Config code block (#35914 ) * close zamba2 code block * Add Zamba2 to toctree	2025-01-27 19:09:42 +00:00
Matt	63e9c941eb	Fix the config class comparison for remote code models (#35592 ) * Fix the config class comparison when repeatedly saving and loading remote code models * once again you have committed your debug breakpoint	2025-01-27 18:37:30 +00:00
Steven Liu	c550a1c640	[docs] uv install (#35821 ) uv install	2025-01-27 08:49:28 -08:00
CalOmnie	cd6591bfb2	Fix typing in audio_utils.chroma_filter_bank (#35888 ) * Fix typing in audio_utils.chroma_filter_bank * Apply make style --------- Co-authored-by: Louis Groux <louis.cal.groux@gmail.com>	2025-01-27 16:06:03 +00:00
Isotr0py	e57b459997	Split and clean up GGUF quantization tests (#35502 ) * clean up ggml test Signed-off-by: Isotr0py <2037008807@qq.com> * port remaining tests Signed-off-by: Isotr0py <2037008807@qq.com> * further cleanup Signed-off-by: Isotr0py <2037008807@qq.com> * format Signed-off-by: Isotr0py <2037008807@qq.com> * fix broken tests Signed-off-by: Isotr0py <2037008807@qq.com> * update comment Signed-off-by: Isotr0py <2037008807@qq.com> * fix Signed-off-by: Isotr0py <2037008807@qq.com> * reorganize tests Signed-off-by: Isotr0py <2037008807@qq.com> * k-quants use qwen2.5-0.5B Signed-off-by: Isotr0py <2037008807@qq.com> * move ggml tokenization test Signed-off-by: Isotr0py <2037008807@qq.com> * remove dead code Signed-off-by: Isotr0py <2037008807@qq.com> * add assert for serialization test Signed-off-by: Isotr0py <2037008807@qq.com> * use str for parameterize Signed-off-by: Isotr0py <2037008807@qq.com> --------- Signed-off-by: Isotr0py <2037008807@qq.com>	2025-01-27 15:46:57 +01:00

19383 Commits