transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-30 17:52:35 +06:00

Author	SHA1	Message	Date
jiqing-feng	737cbd2109	Fix llava xpu tests. (#37130 ) * fix llava 4bit xpu test Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix llava 4bit xpu test Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix format Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix format Signed-off-by: jiqing-feng <jiqing.feng@intel.com> --------- Signed-off-by: jiqing-feng <jiqing.feng@intel.com>	2025-04-01 11:10:13 +02:00
cyyever	786d9c5ed9	Fix more inefficient PT operations (#37060 ) * Fix inefficient operations * Remove cpu() call * Reorder detach() * Reorder detach() * tolist without detach * item without detach * Update src/transformers/models/rag/modeling_rag.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update tests/models/encodec/test_modeling_encodec.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Use detach().cpu().numpy * Revert some numpy operations * More fixes --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2025-03-31 16:31:24 +01:00
Cyril Vallez	f304318f5f	Remove low_cpu_mem_usage and _fast_init (#36963 ) * Remove low_cpu_mem_usage and _fast_init * Update deepspeed.py * Update modeling_utils.py * remove the first 2 tests everywhere * Update test_modeling_common.py * remove what was remaining about fast_init * fix logic and simplify * mismatched keys logic update * Update modeling_utils.py * Update modeling_utils.py * Update modeling_utils.py * Update modeling_utils.py * fix 2 models init_weights * extend to others * remove grad * Update modeling_fsmt.py * init weights in tests * style * Update test_modeling_fsmt.py * more old models * fix more init_weights * copies * fix * style * Update modeling_lxmert.py * fix inits * more and more * more * should finalize * style * Update modeling_dinov2_with_registers.py * fix * Update modeling_encoder_decoder.py * fix * style * Update modeling_lxmert.py * post rebase cleanup * Update modeling_informer.py * back to start for device * fix * add test to detect all failing cases correctly * Update test_modeling_common.py * fix * fix * sam * style * Update modeling_maskformer_swin.py * CIs * CIs * remove test - will add it on separate PR * fix * fix * Update modeling_sam.py * CIs * CIs * CIs * convnext * suggestions * CIs * fix copies after merge --------- Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-03-31 17:18:43 +02:00
Raushan Turganbay	8805600406	[qwen3] fix generation tests (#37142 ) * do not skip tests * fix qwen3-moe as well * fixup * fixup	2025-03-31 16:33:41 +02:00
Yih-Dar	a03cee7a1d	skip (#37141 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-03-31 15:38:40 +02:00
Guang Yang	3b07ca78bb	Export T5 (encoder-decoder) to ExecuTorch (#36486 ) Co-authored-by: Guang Yang <guangyang@fb.com>	2025-03-31 12:10:26 +02:00
Armaghan Shakir	0710e9b1e8	Create and Expose SamVisionModel as public for better accessibility (#36493 ) * move encoder below * auto modeling * write SamVisionTester * fix vision attention shape * fix SamVisionTest * minor changes to SamVisionTest * Revert "fix vision attention shape" This reverts commit `d2a4083ae5`. * fix attention output shape in new tests * remove encoder examples * run modular on got_ocr2 * code formatting * fix got_ocr2 * ruff fixes * code quality * add sam_vision in auto modeling and auto configuration * remove composite test * updated index.md * add TFSamVisionEncoder to __init__ * fix public TFSamVisionEncoder * remove outdated todo comment * set test_torch_exportable Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * rename: VisionEncoder -> VisionModel * bring back original SamVisionEncoder * rename back: VisionEncoderOutput -> VisionModelOutput * undo changes in SamModelTester * reuse SamVisionEncoder in SamVisionModel --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2025-03-31 11:45:07 +02:00
cyyever	f99c279d20	Remove deprecated code (#37059 ) * Remove deprecated code * fix get_loading_attributes * fix error * skip test --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>	2025-03-31 11:15:35 +02:00
Bo Zheng	6acd5aecb3	Adding Qwen3 and Qwen3MoE (#36878 ) * Initial commit for Qwen3 * fix and add tests for qwen3 & qwen3_moe * rename models for tests. * fix * fix * fix and add docs. * fix model name in docs. * simplify modular and fix configuration issues * Fix the red CI: ruff was updated * revert ruff, version was wrong * fix qwen3moe. * fix * make sure MOE can load * fix copies --------- Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>	2025-03-31 09:50:49 +02:00
Joao Gante	9fd9476005	[generate] beam search -- fix output cropping (#37080 ) * handle jagged beams * better comment * bart -- beam search tests print special tokens * more bart test updates * more tests! * better comment	2025-03-28 18:57:51 +01:00
Minho Ryu	eca74d1367	[WIP] add deepseek-v3 (#35926 ) * init commit * style * take comments into account * add deepseekv3 modeling * remove redundant code * apply make style * apply fix-copies * make format * add init files * rename deepseekv3 into deepseek_v3 based on its model_type * rename deepseekv3 into deepseek_v3 based on its model_type * deepseek-v3 not deepseek_v3 * set model_type as deepseek_v3 * use default docs * apply make * fill type and docstring * add rope_config_validation * use custom DeepseekV3MLP * hold code only for checkpoints congifuration; remove redundant * revise rope yarn for DeepSeek variation * rename DeepSeek-V3 * some refactoring * revise load_hook to work properly; make moe func trainable; use llama instead of mixtral * fix attention forward * use -1 for not-changing dim when to use exapnd * refactor DeepseekV3TopkRouter * use reshape_for_rope instead of load_hook; revise attention forward for TP; rename q_head_dim with qk_head_dim * register pre_hook and hook both * make style * use n_shared_experts * Update src/transformers/models/deepseek_v3/configuration_deepseek_v3.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * add test file * update modeling_file according to modular file * make style * add mapping for DeepseekV3ForSequenceClassification * remove aux_loss_alpha * add deepseek_v3 for perf * add deepseek_v3 * rename test as deepseekv3 * use tiny-deepseek-v3 * remove DeepseekV3ForSequenceClassification * cache before padding * remote output_router_logits * Revert "remote output_router_logits" This reverts commit `f264f800d0`. * remove output_router_logits * make e_score_correction_bias as buffer * skip tests not compatible * make style * make e_score_correction_bias as buffer * use rope_interleave instead of load_hook * skip tests not compatible with MLA * add doc for rope_interleave * fix typo * remove torch.no_grad for selecting topk * fix post merge issue * mrege with main and simplify * nits * final * small fixes * fix * support TP better * stash * changes currently requires * remove synch * more fixes for TP * temp fix for TP : some attention layers's FP8 scales are too small + shared is local colwise and anything is local if FP8 because weights are used * updates to have generation work! * push most of the changes * reorder functions + call for contributions! * update readme * nits * update * ruff was updated on main * merge with main and fix copies * revert unrelated changes * route all tokens to all experts when testing to avoid no gradient iddues * finish fixing all tests * fixup * nit * clean config * last readme changes * nit * do cnit * typo * last nit * one more one more --------- Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: arthur@huggingface.co <arthur@ip-26-0-165-131.ec2.internal>	2025-03-28 15:56:59 +01:00
Yih-Dar	1fcaad6df9	Use `lru_cache` for tokenization tests (#36818 ) * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-03-28 15:09:35 +01:00
cyyever	6cc9c8d7d1	Remove deprecated batch_size parameter (#37007 )	2025-03-27 15:01:56 +00:00
cyyever	41a0e58e5b	Set weights_only in torch.load (#36991 )	2025-03-27 14:55:50 +00:00
jiqing-feng	0e56fb69a2	fix pegasus init weights and other copied models (#36844 ) * fix pegasus init weights Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix the rest of models Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix test Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix informer init Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * init weight before checking Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix roformer tests Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix roformer tests Signed-off-by: jiqing-feng <jiqing.feng@intel.com> --------- Signed-off-by: jiqing-feng <jiqing.feng@intel.com>	2025-03-27 14:14:30 +01:00
Yih-Dar	d13c390d01	Mark 2 tests as flaky for now (#37038 ) * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-03-27 10:59:47 +01:00
Abu Bakr Soliman	49b5ab6a27	Support QuestionAnswering Module for ModernBert based models. (#35566 ) * push ModernBertForQuestionAnswering * update ModernBertForQuestionAnswering * update __init__ loading * set imports for ModernBertForQuestionAnswering * update ModernBertForQuestionAnswering * remove debugging logs * update init_weights method * remove custom initialization for ModernBertForQuestionAnswering * apply make fix-copies * apply make style * apply make fix-copies * append ModernBertForQuestionAnswering to the pipeline supported models * remove unused file * remove invalid autoload value * update en/model_doc/modernbert.md * apply make fixup command * make fixup * Update dummies * update usage tips for ModernBertForQuestionAnswering * update usage tips for ModernBertForQuestionAnswering * add init * add lint * add consistency * update init test * change text to trigger stuck text * use self.loss_function instead of custom loss By @Cyrilvallez Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com> * Update modeling_modernbert.py make comparable commit to even it out * Match whitespace * whitespace --------- Co-authored-by: Matt <rocketknight1@gmail.com> Co-authored-by: Orion Weller <wellerorion@gmail.com> Co-authored-by: Orion Weller <31665361+orionw@users.noreply.github.com> Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>	2025-03-26 21:24:18 +01:00
cyyever	e7139d06f5	Fix tensor dtype mismatch (#36985 ) * Fix tensor dtype mismatch * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-03-26 10:37:46 +01:00
Yih-Dar	c6814b4ee8	Update ruff to `0.11.2` (#36962 ) * update * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-03-25 16:00:11 +01:00
Raushan Turganbay	0f733110a6	Support `return_tensors` in audio chat templates (#34601 ) * add audio chat templates * update * update * nit * green ci * we dont care about the order anymore * clean up after rebase * overriden tests rename * rename shieldgemma also * one more rename * require_read_token * removde images/videos * retrigger CI flaky	2025-03-25 11:08:47 +01:00
Afanti	19085c28da	fix typos in the tests directory (#36932 ) * chore: fix typos in test codes * chore: fix typos in test codes * chore: fix typos in test codes * chore: fix typos in test codes * chore: fix typos in test codes * chore: fix typos in test codes * chore: fix typos in test codes * chore: fix typos in test codes * chore: format codes	2025-03-25 10:49:24 +01:00
Guang Yang	69bcb86c58	Export for Phi4-mini (#36780 ) * Export for Phi4-mini * Update tests/models/phi3/test_modeling_phi3.py --------- Co-authored-by: Guang Yang <guangyang@fb.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-03-25 10:46:38 +01:00
Cyril Vallez	4303d88c09	Add Phi4 multimodal (#36939 ) * raw start * update * update * add to imports * update * up * simplify configs * clean configs * style * typos * Update convert_phi4_multimodal_weights_to_hf.py * Update convert_phi4_multimodal_weights_to_hf.py * fix * up * up * up * Update convert_phi4_multimodal_weights_to_hf.py * Update convert_phi4_multimodal_weights_to_hf.py * up * up * up * Update feature_extraction_phi4_multimodal.py * up * up * up * up * up * simplify configs * typo * cut code * typo * typo * typo * re * typo * up * up * up * add tests * fix * fix * Update test_modeling_phi4_multimodal.py * up * Update test_modeling_phi4_multimodal.py * doc * fix * up * up * up * up * up * up * simplify * up * simplify * config docstrings * cleanup * clean * typo * typo * fix * Update phi4_multimodal.md * fix * fix * Update test_modeling_phi4_multimodal.py * update * simplify reshapes and permutes * up * simplify special tokens * simplify processor a lot * Update processing_phi4_multimodal.py * Update processing_phi4_multimodal.py * switch to fast processor * image processor * Update image_processing_phi4_multimodal_fast.py * add lora extraction to converter * Update convert_phi4_multimodal_weights_to_hf.py * Update __init__.py * add AudioInput type in audio_utils * rewrite feature_extraction: support torch batched FFT * input_audio_embeds -> audio_input_features, input_image_embeds -> image_pixel_values * test update * not mono channel warning update * remove auto maps from processor * kargs dispatch in processor * simplify kwargs dispatch * simplify merging * remove default sampling rate * style * Update test_modeling_phi4_multimodal.py * update doc * doc * torch only feature extractor * make fake tokens adjustable * Update feature_extraction_phi4_multimodal.py * fix * Update processing_phi4_multimodal.py * simplify mask * last touch * fix copies * style * Update audio_utils.py * style * Update feature_extraction_phi4_multimodal.py * Update __init__.py * docstrings * copies * fix all checks * back to fix-copies * trigger CIs * Update feature_extraction_phi4_multimodal.py * improve tests with multimodal inputs * trigger CIs --------- Co-authored-by: Eustache Le Bihan <eulebihan@gmail.com>	2025-03-25 09:55:21 +01:00
Raushan Turganbay	47e5432805	Deprecate #36741 and map Causal to Conditional (#36917 ) * deprecate the prev fix * reword warning and update docs * reword warning * tests * dont bloat `get_text_config()`	2025-03-25 09:13:56 +01:00
Yoni Gozlan	91455c1825	Fix processor kwargs qwen2 vl (#36890 ) * Fix qwen2_vl and qwen2_5_vl processors cutom images kwargs * change version warning	2025-03-24 13:19:26 -04:00
omahs	cbf924b76c	Fix typos (#36910 ) * fix typos * fix typos * fix typos * fix typos	2025-03-24 14:08:29 +00:00
Yih-Dar	340500b1a9	Use another repo. for Mistral3 processor testing (#36925 ) * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-03-24 14:36:05 +01:00
Raushan Turganbay	57f551c78d	[chameleon] fix num image token check (#36918 ) * [chameleon] fix num image token check * embed after merging image token * skip this also * mistral require_read_token	2025-03-24 12:36:08 +01:00
Yoni Gozlan	beb9b5b022	Fix Pan and Scan on batched images Gemma3 (#36864 ) * process flattened images in fast image proc * process flattened images in low proc and add tests * remove print * add unbalanced batch test pas image proc * fix integration tests	2025-03-21 13:56:00 -04:00
Cyril Vallez	dd3933dd65	Simplify keep_in_fp32_modules logic (#36722 ) * better regex everywhere * fix * Update test_modeling_instructblip.py * BC with explanations this time otherwise it makes no sense at all * Update test_modeling_instructblip.py * style * CIs * update _keep_in_fp32_modules in blip2 * Update modeling_utils.py * Update modeling_utils.py * style * CIs * add check * trigger CIs * Update modeling_utils.py * trigger CIs	2025-03-21 16:12:59 +01:00
Sukriti Sharma	90e2df5d55	fix: loss computation after embeddings resize - mllama (#36840 ) * move loss to generation class Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * code cleanup Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * test for resize and loss computation Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * fix tests Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * fix:test for resize and loss Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * fix resize embedding mllama test Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * review changes Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> --------- Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>	2025-03-21 14:47:59 +01:00
Pablo Montalvo	2638d54e78	Gemma 3 tests expect greedy decoding (#36882 ) tests expect greedy decoding	2025-03-21 12:36:39 +01:00
Joao Gante	94f487626a	[generate] model defaults being inherited only happens for newer models (#36881 )	2025-03-21 11:01:09 +00:00
Afanti	26c83490d2	chore: fix typos in the tests directory (#36813 ) * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * fix: format codes * chore: fix copy mismatch issue * fix: format codes * chore: fix copy mismatch issue * chore: fix copy mismatch issue * chore: fix copy mismatch issue * chore: restore previous words * chore: revert unexpected changes	2025-03-21 10:20:05 +01:00
Raushan Turganbay	97d2f9d8ae	Mllama: raise better error (#35934 ) * fix mllama * update test * fix test	2025-03-21 09:35:37 +01:00
Raushan Turganbay	42c489f2ae	Gemma3: fix test (#36820 ) * fix test * require_read_token and public repo ids * flash-attn test uncomment * fix torchscript	2025-03-20 18:14:53 +01:00
Haotong LIN	6515c25953	Add Prompt Depth Anything Model (#35401 ) * add prompt depth anything model by modular transformer * add prompt depth anything docs and imports * update code style according transformers doc * update code style: import order issue is fixed by custom_init_isort * fix depth shape from B,1,H,W to B,H,W which is as the same as Depth Anything * move prompt depth anything to vision models in _toctree.yml * update backbone test; there is no need for resnet18 backbone test * update init file & pass RUN_SLOW tests * update len(prompt_depth) to prompt_depth.shape[0] Co-authored-by: Joshua Lochner <admin@xenova.com> * fix torch_int/model_doc * fix typo * update PromptDepthAnythingImageProcessor * fix typo * fix typo for prompt depth anything doc * update promptda overview image link of huggingface repo * fix some typos in promptda doc * Update image processing to include pad_image, prompt depth position, and related explanations for better clarity and functionality. * add copy disclaimer for prompt depth anything image processing * fix some format typos in image processing and conversion scripts * fix nn.ReLU(False) to nn.ReLU() * rename residual layer as it's a sequential layer * move size compute to a separate line/variable for easier debug in modular prompt depth anything * fix modular format for prompt depth anything * update modular prompt depth anything * fix scale to meter and some internal funcs warp * fix code style in image_processing_prompt_depth_anything.py * fix issues in image_processing_prompt_depth_anything.py * fix issues in image_processing_prompt_depth_anything.py * fix issues in prompt depth anything * update converting script similar to mllamma * update testing for modeling prompt depth anything * update testing for image_processing_prompt_depth_anything * fix assertion in image_processing_prompt_depth_anything * Update src/transformers/models/prompt_depth_anything/modular_prompt_depth_anything.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update src/transformers/models/prompt_depth_anything/modular_prompt_depth_anything.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update docs/source/en/model_doc/prompt_depth_anything.md Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update docs/source/en/model_doc/prompt_depth_anything.md Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * update some testing * fix testing * fix * add return doc for forward of prompt depth anything * Update src/transformers/models/prompt_depth_anything/modular_prompt_depth_anything.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update tests/models/prompt_depth_anything/test_modeling_prompt_depth_anything.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * fix prompt depth order * fix format for testing prompt depth anything * fix minor issues in prompt depth anything doc * fix format for modular prompt depth anything * revert format for modular prompt depth anything * revert format for modular prompt depth anything * update format for modular prompt depth anything * fix parallel testing errors * fix doc for prompt depth anything * Add header * Fix imports * Licence header --------- Co-authored-by: Joshua Lochner <admin@xenova.com> Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2025-03-20 16:12:44 +00:00
Pavel Iakubovskii	66291778dd	Refactor Attention implementation for ViT-based models (#36545 ) * Refactor vit attention * Refactor ViT-based models * 🚨🚨🚨 Fix prefix for DPT * Update params order * trigger tests * Fix Dinov2 attention * Fix DPT attention impl propagation for backbone config * Common test fix: config is modif. inplace - avoid it * view->reshape * Fixup * Fixup * Enable IJepa FA2 * Add FA2 in corresponding model docs	2025-03-20 15:15:01 +00:00
Ryan Mullins	487dab1b2b	Shieldgemma2 (#36678 ) * single commit * correct config * fixup * dummy pt * Use ShieldGemma2Config in conversion script * Update src/transformers/models/shieldgemma2/configuration_shieldgemma2.py * Adding shieldgemma2 to models.__init__.py * Adding ShieldGemma2 to main __init__.py * Update shieldgemma2.md * Update shieldgemma2.md * Adding tests. Addressing review feedback. * Minor docs update * Fixing code quality feedback from CI * Fixing empty messages bug reported by ghunkins --------- Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com> Co-authored-by: Ren Pang <ain-soph@live.com>	2025-03-20 15:14:38 +01:00
Ita Zaporozhets	3f03c379d2	fix tiktoken convert to pass AddedToken to Tokenizer (#36566 ) * pass AddedToken to Tokenizer * ruff * handle dict for special tokens * option: test tokenizer from tiktoken same as fast * ruff * ruff	2025-03-20 11:26:49 +01:00
Michael Feil	51bd0ceb9e	Update configuration_qwen2.py (#36735 ) * Update configuration_qwen2_moe.py * Update modeling_qwen2_moe.py * ruff fmt * docstring add qkv_bias	2025-03-19 18:15:54 +00:00
Joao Gante	7c233980f4	[gemma 3] multimodal checkpoints + AutoModelForCausalLM (#36741 )	2025-03-19 15:04:19 +00:00
ivarflakstad	706703bba6	Expectations test utils (#36569 ) * Add expectation classes + tests * Use typing Union instead of \| * Use bits to track score in properties cmp method * Add exceptions and tests + comments * Remove compute cap minor as it is not needed currently * Simplify. Remove Properties class * Add example Exceptions usage * Expectations as dict subclass * Update example Exceptions usage * Refactor. Improve type name. Document score fn. * Rename to DeviceProperties.	2025-03-18 23:39:50 +01:00
Joao Gante	179d02ffb8	[generate] ✨ vectorized beam search ✨ (#35802 )	2025-03-18 18:39:36 +00:00
Yoni Gozlan	30580f035b	Fix Mistral3 tests (#36797 ) * fix processor tests * fix modeling tests * fix test processor chat template * revert modeling test changes	2025-03-18 13:08:12 -04:00
Cyril Vallez	e959530b8f	Add Mistral3 (#36790 ) * initial start * style and dummies * Create convert_mistral3_weights_to_hf.py * update * typo * typo * Update convert_mistral3_weights_to_hf.py * Update convert_mistral3_weights_to_hf.py * Update convert_mistral3_weights_to_hf.py * Update convert_mistral3_weights_to_hf.py * up * Update convert_mistral3_weights_to_hf.py * Update convert_mistral3_weights_to_hf.py * update * update * Update image_processing_mistral3.py * Update convert_mistral3_weights_to_hf.py * fix patch merger * Update convert_mistral3_weights_to_hf.py * Update convert_mistral3_weights_to_hf.py * up * update modular to fit * style * Update convert_mistral3_weights_to_hf.py * typo * Update modular_mistral3.py * simplify a lot all shape shenanigans * simplify * add working test processor * Add partially working common modeling tests * All tests working and remove mistral3 image processors * add docs and fixup * fix inference with image size >1540 * 🚨fix test image proc pixtral * Remove vision_feature_select_strategy * Update convert_mistral3_weights_to_hf.py * Update convert_mistral3_weights_to_hf.py * Update convert_mistral3_weights_to_hf.py * Update convert_mistral3_weights_to_hf.py * clean * fix test checkpoints * Update test_modeling_mistral3.py * Update test_modeling_mistral3.py * style * Use Pixtral processor * up * finish cleaning processor to use pixtral directly * Update __init__.py * Update processing_pixtral.py * doc * Update __init__.py * Update mistral3.md * Update _toctree.yml --------- Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co> Co-authored-by: yonigozlan <yoni.gozlan10@gmail.com>	2025-03-18 12:04:42 +01:00
Zebin	7426d02ea8	Fixing typo in gemma3 image_processor_fast and adding a small test (#36776 ) Co-authored-by: zebz13 <zeb@fedora> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-03-18 11:35:06 +01:00
Afanti	19b9d8ae13	chore: fix typos in tests directory (#36785 ) * chore: fix typos in tests directory * chore: fix typos in tests directory * chore: fix typos in tests directory * chore: fix typos in tests directory * chore: fix typos in tests directory * chore: fix typos in tests directory * chore: fix typos in tests directory	2025-03-18 10:31:13 +01:00
jiqing-feng	27361bd218	fix xpu tests (#36656 ) * fix awq xpu tests Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * update Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix llava next video bnb tests Signed-off-by: jiqing-feng <jiqing.feng@intel.com> --------- Signed-off-by: jiqing-feng <jiqing.feng@intel.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-03-17 15:57:49 +01:00
jiqing-feng	2256875a77	fix can_generate (#36570 ) * fix can_generate Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix can generate for speecht5 and blip Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix speecht5 tests Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix Signed-off-by: jiqing-feng <jiqing.feng@intel.com> --------- Signed-off-by: jiqing-feng <jiqing.feng@intel.com> Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>	2025-03-17 14:56:18 +01:00

1 2 3 4 5 ...

1731 Commits