transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-05 22:00:09 +06:00

Author	SHA1	Message	Date
Vinh H. Pham	49b9a69a36	Add Fast Image Processor for Flava (#37135 ) * support flava fast image processor * run style and quality * update test * update according to reviews * make style * update comment on BICUBIC * make style --------- Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>	2025-04-14 15:05:31 +02:00
Vinh H. Pham	e7f5724efd	Add Fast Image Processor for Perceiver (#37176 ) * add test and fast image processor * make style * Update src/transformers/models/perceiver/image_processing_perceiver_fast.py Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com> * make style --------- Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>	2025-04-14 13:49:13 +02:00
BakerBunker	4b8c6d4cf8	Add Qwen2.5-Omni (#36752 ) * Add qwen2.5-omni * Remove einops dependency * Add torchdiffeq dependency * Sort init * Add torchdiffeq to extras['diffeq'] * Fix repo consistency * use cached_file * del odeint * renew pytest * format * Remove torchdiffeq * format * fixed batch infer bug * Change positional_embedding to parameter * Change default speaker * Config revision * Use modular & code clean * code clean * decouple padding with model & code cleaning * sort init * fix * fix * Second code review * fix * fix * rename vars to full name + some comments * update pytest * Code clean & fix * fix * style * more clean up * fixup * smaller vision model in tests * fix processor test * deflake a bit the tests (still flaky though) * de-flake tests finally + add generation mixin * final nits i hope * make sure processor tests are complete * replace with Qwen2_5OmniForConditionalGeneration * fix tests after updating ckpt * fix typos when cleaning, also we can't change ckpt * fixup * images and videos kwargs for processor * thinker and talker loadable from hub ckpt * address comments and update tests after rebase * fixup * skip for now * fixup * fixup * remove torch dependency in processors --------- Co-authored-by: lvyuanjun.lyj <lvyuanjun.lyj@alibaba-inc.con> Co-authored-by: feizi.wx <feizi.wx@alibaba-inc.com> Co-authored-by: raushan <raushan@huggingface.co>	2025-04-14 12:36:41 +02:00
Yih-Dar	ac1df5fccd	Fix tests failed with gated repos. (#37484 ) * fix * slow --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-04-14 12:08:13 +02:00
Yao Matrix	47b9f06aa2	make test_snowman_image_captioning pass on XPU, by sharing same atol w/ ROCM (#37480 ) Signed-off-by: YAO Matrix <matrix.yao@intel.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-04-14 11:39:45 +02:00
Joao Gante	aaf129cdae	[agents] remove agents 🧹 (#37368 )	2025-04-11 18:42:37 +01:00
Alex Brooks	623d395aff	Add Granite Speech Support (#36801 ) * First pass at speech granite Add encoder / projector, rename things * Combine into one model file with causal lm outputs for forward * Add loss calc * Fix config loading Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com> * Split new / old loading logic * Use transformers integration for loading peft adapters * Add generation wrapper for selective lora enablement * Add note for qformer encoder automodel * Guard torch/audio imports in feature extractor * Handle granite speech autoclasses * Handle optional deps in package structure for granite speech * Add granite pretrained model def for init * Add dummy objects for torch/torchaudio * Add tests for granite speech processor * Minor formatting fixes and refactoring * Add options for falling back to config in forward * Tentative model docstrings for granite speech * Fix config type * Remove legacy load * Allow non-lora variants for granite speech * Override weight tying for llm * Use text config instead of llm config * Add output embeddings getter to fix weight tying * Fix relative imports * computing the number of audio features, based on the raw audio sequence. * collating audio inputs, and keeping the original lengths. * asserted we have text. otherwise we can't specify the audio special token. * assering the number of audio-symbols/audios match correctly. running get validated_audios only when audio is present * indentation bugfix + supporting different feature lengths when expanding audio. * redundant, done in _get_validated_text * adapting the tests: - we must have text (not either audio or text) - _get_num_audio_features takes a list of raw lengths, provided it insetad. * Minor cleanup, remove unused import * Add more tests for batch feature processing * Allow setting offset in rel position embeddings * Add config option for warning if peft is not installed w/ lora * Port blip2 qformer code into granite speech * Add sad test for numpy arr processing * Allow numpy arrays / tuples in granite speech processor * Fix config type for projector * - pad instead of creating a zeros tensor, to keep the original dtype/device (support bfloat16) - cast input_features to the model dtype (support bfloat16) * merge Blip2QFormerConfig to GraniteSpeechProjectorConfig * prevent a crash when re-saving/loading the model (line 109) * consider additional edge cases during preprocessing. * consider additional edge cases during preprocessing. * add features mask for batched inference (bugfix) * Minor refactor, remove multiaudio processor tests * Add set input/output embeddings for granite speech * Fix feature dim check in processor test * Pop input features in embed test for granite speech * Small fixes for test edge cases Add granite speech to seq2seq causal lm mapping names * Add small tests for granite speech model * Fix data parallelism test * Standardize model class names * Fix check for copies * Fix misaligned init check * Skip granite speech in checkpoint check * Use default for tie_word_embeddings in granite speech * Fix non documentation granite speech repo issues * Fix comments and docstring checks * Add placeholder docs for granite speech * Fix test naming collision * Code formatting * Rerun torch dummy obj regen * Fix save pretrained for granite speech * Import sorting * Fix tests typo * Remove offset hack * Pass args through encoder config * Remove unused prune heads from blip2 * removing einsum. replaced with explicit multiplication (relative positional encodings) and sdpa attention. * remove Sequential from ConformerFeedForward and ConformerConvModule. + fix for sdpa attention * remove GraniteSpeechConformerScale * rename to hidden_states * rename conformer layers to self.layers, remove the first linear from the list to keep the list homogenous. * move pre-norm to the attention/feedforward blocks (avoid complex module wrapping) * adding pre_norm into forward * feature extractor refactoring to resemble how it's done in phi4multimodal. * rename feature_extractor to audio_processor * bugfix: input_feature_mask fix to get the exact number tokens. * Fix pytest decorator in processor test * Add (disabled) integration tests for granite speech * Fix handling of optional feature masking * Loosen validation in processing for vLLM compatability * Formatting fixes * Update init structure to mirror llama * Make granite speech projector generic * Update test config to reflect generic projector * Formatting fixes * Fix typos, add license * Fix undefined var in input processing * Cleanup and expose ctc encoder * Add missing config docstrings * Better var names, type hints, etc * Set attn context size in init * Add max pos emb to encoder config * Cleanup feature extractor * Add granite speech architecture details * Remove granite speech qformer ref * Add paper link, explicit calc for qkv * Calculate padding directly in depthwise conv1d init * Raise value error instead of asserting * Reorder class defs (classes used at top) * Precompute relpos distances * Run formatting * Pass attention distances through forward * Apply suggestions from code review Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com> * Add todo for using common batch feature extraction * Rename audios/features * Ensure chat template may be provided to processor * Move granite speech docs to audio models * Add todos for input proc refactoring * Fix import order * Guard torch import * Use relative imports * Require torch backend for processor in granite speech * Add backend guards in feature extractor --------- Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com> Co-authored-by: Avihu Dekel <avihu.dekel@ibm.com> Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>	2025-04-11 18:52:00 +02:00
Matt	bf46e44878	🚨 🚨 Allow saving and loading multiple "raw" chat template files (#36588 ) * Add saving in the new format (but no loading yet!) * Add saving in the new format (but no loading yet!) * A new approach to template files! * make fixup * make fixup, set correct dir * Some progress but need to rework for cached_file * Rework loading handling again * Small fixes * Looks like it's working now! * make fixup * Working! * make fixup * make fixup * Add TODO so I don't miss it * Cleaner control flow with one less indent * Copy the new logic to processing_utils as well * Proper support for dicts of templates * make fixup * define the file/dir names in a single place * Update the processor chat template reload test as well * Add processor loading of multiple templates * Flatten correctly to match tokenizers * Better support when files are empty sometimes * Stop creating those empty templates * Revert changes now we don't have empty templates * Revert changes now we don't have empty templates * Don't support separate template files on the legacy path * Rework/simplify loading code * Make sure it's always a chat_template key in chat_template.json * Update processor handling of multiple templates * Add a full save-loading test to the tokenizer tests as well * Correct un-flattening * New test was incorrect * Correct error/offline handling * Better exception handling * More error handling cleanup * Add skips for test failing on main * Reorder to fix errors * make fixup * clarify legacy processor file docs and location * Update src/transformers/processing_utils.py Co-authored-by: Lucain <lucainp@gmail.com> * Update src/transformers/processing_utils.py Co-authored-by: Lucain <lucainp@gmail.com> * Update src/transformers/processing_utils.py Co-authored-by: Lucain <lucainp@gmail.com> * Update src/transformers/processing_utils.py Co-authored-by: Lucain <lucainp@gmail.com> * Rename to _jinja and _legacy * Stop saving multiple templates in the legacy format * Cleanup the processing code * Cleanup the processing code more * make fixup * make fixup * correct reformatting * Use correct dir name * Fix import location * Use save_jinja_files instead of save_raw_chat_template_files * Correct the test for saving multiple processor templates * Fix type hint * Update src/transformers/utils/hub.py Co-authored-by: Julien Chaumond <julien@huggingface.co> * Patch llava_onevision test * Update src/transformers/processing_utils.py Co-authored-by: Julien Chaumond <julien@huggingface.co> * Update src/transformers/tokenization_utils_base.py Co-authored-by: Julien Chaumond <julien@huggingface.co> * Refactor chat template saving out into a separate function * Update tests for the new default * Don't do chat template saving logic when chat template isn't there * Ensure save_jinja_files is propagated to tokenizer correctly * Trigger tests * Update more tests to new default * Trigger tests --------- Co-authored-by: Lucain <lucainp@gmail.com> Co-authored-by: Julien Chaumond <julien@huggingface.co>	2025-04-11 16:37:23 +01:00
Bowen Bao	6cef03ba66	[Regression] Fix Quark quantized model loading after refactorization (#37407 )	2025-04-11 13:43:36 +02:00
Raushan Turganbay	a563999a02	[processor] clean up mulitmodal tests (#37362 ) * clkea up mulitmodal processor tests * fixup * fix tests * fix one last test * forgot	2025-04-11 13:32:19 +02:00
Lysandre Debut	54a123f068	Simplify soft dependencies and update the dummy-creation process (#36827 ) * Reverse dependency map shouldn't be created when test_all is set * [test_all] Remove dummies * Modular fixes * Update utils/check_repo.py Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> * [test_all] Better docs * [test_all] Update src/transformers/commands/chat.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * [test_all] Remove deprecated AdaptiveEmbeddings from the tests * [test_all] Doc builder * [test_all] is_dummy * [test_all] Import utils * [test_all] Doc building should not require all deps --------- Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2025-04-11 11:08:36 +02:00
Yao Matrix	c7064cdba1	enhance require_deterministic_for_xpu (#37437 ) * enhance require_deterministic_for_xpu Signed-off-by: YAO Matrix <matrix.yao@intel.com> * fix style Signed-off-by: YAO Matrix <matrix.yao@intel.com> * fix style Signed-off-by: YAO Matrix <matrix.yao@intel.com> --------- Signed-off-by: YAO Matrix <matrix.yao@intel.com>	2025-04-11 08:06:08 +02:00
cyyever	371c44d0ef	Remove old code for PyTorch, Accelerator and tokenizers (#37234 ) * Remove unneeded library version checks Signed-off-by: cyy <cyyever@outlook.com> * Remove PyTorch condition Signed-off-by: cyy <cyyever@outlook.com> * Remove PyTorch condition Signed-off-by: cyy <cyyever@outlook.com> * Fix ROCm get_device_capability Signed-off-by: cyy <cyyever@outlook.com> * Revert "Fix ROCm get_device_capability" This reverts commit `0e756434bd`. * Remove unnecessary check Signed-off-by: cyy <cyyever@outlook.com> * Revert changes Signed-off-by: cyy <cyyever@outlook.com> --------- Signed-off-by: cyy <cyyever@outlook.com>	2025-04-10 20:54:21 +02:00
Isotr0py	6daec12d0b	Add GGUF support to Gemma3 Text backbone (#37424 ) * add gemma3 gguf support Signed-off-by: Isotr0py <2037008807@qq.com> * fix typo and add gguf limit Signed-off-by: Isotr0py <2037008807@qq.com> * fix a typo Signed-off-by: Isotr0py <2037008807@qq.com> * add vision conversion test Signed-off-by: Isotr0py <2037008807@qq.com> * fix typos Signed-off-by: Isotr0py <2037008807@qq.com> --------- Signed-off-by: Isotr0py <2037008807@qq.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-04-10 17:15:43 +02:00
Mohamed Mekkouri	9c0c323e12	Fix require_read_token (#37422 ) * nit * fix * fix	2025-04-10 17:01:40 +02:00
Mario Michael Krell	bde41d69b4	Correctly drop tokens in SwitchTransformer (#37123 ) Previously, the identity function was used for dropped tokens with a weight from the expert that was not applied to the hidden states. This was misleading, because dropping means, the expert weight is zero. Instead of trying to fix the weight, we take an easier approach by initializing with zeros. Fixes issue https://github.com/huggingface/transformers/issues/37017	2025-04-10 16:58:57 +02:00
AbdelKarim ELJANDOUBI	7ecc5b88c0	Add image classifier donut & update loss calculation for all swins (#37224 ) * add classifier head to donut * add to transformers __init__ * add to auto model * fix typo * add loss for image classification * add checkpoint * remove no needed import * reoder import * format * consistency * add test of classifier * add doc * try ignore * update loss for all swin models	2025-04-10 15:00:42 +02:00
Mohamed Mekkouri	5ae9b2cac0	Quark Quantization gated repo (#37412 ) * fix * empty commit * empty * nit * fix maybe ?	2025-04-10 14:57:15 +02:00
Raushan Turganbay	1ae8d54b04	[chat-template] Unify tests and clean up 🧼 (#37275 ) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now	2025-04-10 14:42:32 +02:00
ivarflakstad	aa478567f8	Allow rocm systems to run these tests (#37278 ) * Allow rocm systems to run these tests * Fix skipTest logic * Use get_device_properties to check system capabilities	2025-04-10 13:33:01 +02:00
Sangyun_LEE (이상윤)	ad340908e4	Fix warning message for PEFT models in text-generation pipeline #36783 (#36887 ) * add peft model in constant * add test * fix formating * make fixup execute * change code * check by self.task * add test * fixup test code * fix minor typo * fix pipeline test * apply maintainers reqests	2025-04-09 15:36:52 +01:00
Arthur	e3eda6d188	Add glm4 (#37388 ) * add changed * Revert "add changed" This reverts commit `0a0166a1fe`. * update with NEW MODEL class called GLM4 * update * Update glm4.md * Name * style * fix copies * fixup test --------- Co-authored-by: Yuxuan Zhang <2448370773@qq.com>	2025-04-09 14:02:04 +02:00
Raushan Turganbay	6f4058aee3	Update composition flag usage (#36263 ) * update composition flag usage * remove print * fix tests * actually fix * oh c'mon * now should be fixed right? * fix copies	2025-04-09 11:48:49 +02:00
Matt	4d0de5f73a	🚨 🚨 Setup -> setupclass conversion (#37282 ) * More limited setup -> setupclass conversion * make fixup * Trigger tests * Fixup UDOP * Missed a spot * tearDown -> tearDownClass where appropriate * Couple more class fixes * Fixups for UDOP and VisionTextDualEncoder * Ignore errors when removing the tmpdir, in case it already got cleaned up somewhere * CLIP fixes * More correct classmethods * Wav2Vec2Bert fixes * More methods become static * More class methods * More class methods * Revert changes for integration tests / modeling files * Use a different tempdir for tests that actually write to it * Remove addClassCleanup and just use teardownclass * Remove changes in modeling files * Cleanup get_processor_dict() for got_ocr2 * Fix regression on Wav2Vec2BERT test that was masked by this before * Rework tests that modify the tmpdir * make fix-copies * revert clvp modeling test changes * Fix CLIP processor test * make fix-copies	2025-04-08 17:15:37 +01:00
Jonathan Mamou	121f91d36c	prune LM Head for USD (#36695 ) * initial commit * fix * fix style * set default to prune * add tests * comment * remove prune flag from generate * address Joao's comments * deprecate_kwarg * add doc * fix target_vocab_size * Update src/transformers/generation/candidate_generator.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/generation/candidate_generator.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/generation/candidate_generator.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/generation/candidate_generator.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * fix deprecated argument assistant_model_device --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2025-04-08 16:44:10 +01:00
Joao Gante	4321b0648c	[core] remove `GenerationMixin` inheritance by default in `PreTrainedModel` (#37173 )	2025-04-08 16:42:05 +01:00
cyyever	1e6b546ea6	Use Python 3.9 syntax in tests (#37343 ) Signed-off-by: cyy <cyyever@outlook.com>	2025-04-08 14:12:08 +02:00
Matt	f789f960c8	Avoid build crashes when torch.version.xpu doesn't exist and fix Llama4 processor tests (#37346 ) * Avoid build crashes when torch.version.xpu doesn't exist * Trigger tests * Fix image token and skip inappropriate test * Remove ignore_errors=True * Add another skip	2025-04-07 17:05:54 +01:00
Yao Matrix	12bf24d6ae	enable 2 llama UT cases on xpu (#37126 ) * enable tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits and tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits_bf16 on xpu Signed-off-by: YAO Matrix <matrix.yao@intel.com> * switch to use Expectations Signed-off-by: YAO Matrix <matrix.yao@intel.com> * fix style Signed-off-by: YAO Matrix <matrix.yao@intel.com> * extract gen bits from architecture and use it Signed-off-by: YAO Matrix <matrix.yao@intel.com> * add cross refererence Signed-off-by: YAO Matrix <matrix.yao@intel.com> * fix style Signed-off-by: YAO Matrix <matrix.yao@intel.com> --------- Signed-off-by: YAO Matrix <matrix.yao@intel.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-04-07 16:02:14 +02:00
Yih-Dar	e7ad077012	byebye torch 2.0 (#37277 ) * bump Torch 2.1 with broken compatibility `torch.compile` * dep table * remove usage of is_torch_greater_or_equal_than_2_1 * remove usage of is_torch_greater_or_equal_than_2_1 * remove if is_torch_greater_or_equal("2.1.0") * remove torch >= "2.1.0" * deal with 2.0.0 * PyTorch 2.0+ --> PyTorch 2.1+ * ruff 1 * difficult ruff * address comment * address comment --------- Co-authored-by: Jirka B <j.borovec+github@gmail.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-04-07 15:19:47 +02:00
jiqing-feng	99f9f1042f	Fix torchao usage (#37034 ) * fix load path Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix path Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * Fix torchao usage Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix tests Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix format Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * revert useless change Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * format Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * revert fp8 test Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix fp8 test Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix fp8 test Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix torch dtype Signed-off-by: jiqing-feng <jiqing.feng@intel.com> --------- Signed-off-by: jiqing-feng <jiqing.feng@intel.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-04-07 14:50:48 +02:00
Arthur	25b7f27234	Add llama4 (#37307 ) * remove one of the last deps * update fast image processor after refactor * styling * more quality of life improvements * nit * update * cleanups * some cleanups * vllm updates * update fake image token * [convert] Fix typo * [convert] Strip extraneous bytes from shards * [convert] Minor fixes * [convert] Use num_experts * multi-image fixes in modeling + processor * fixup size * 128 experts * Use default rope * Unfuse mlp * simplify a lot inputs embeds merging * remove .item() 👀 * fix from review * Address feedback * Use None "default" for rope_scaling. Add eot. * set seed * return aspect ratios and bug fixes * Moe 128 rebased (#8) * 128 experts * Use default rope * Unfuse mlp * Address feedback * Use None "default" for rope_scaling. Add eot. * Meta/llama quant compat (#7) * add quant compatible model & conversion code for llama4 * fix a few issues * fix a few issues * minor type mapping fix --------- Co-authored-by: Lu Fang <fanglu@fb.com> * use a new config parameter to determine which model definition to use for MoE --------- Co-authored-by: Pedro Cuenca <pedro@huggingface.co> Co-authored-by: Lu Fang <fanglu@fb.com> * un-comment write_tokenizer from converting script * remove un-used imports * [llama4] Pop aspect_ratios from image processor output in Llama4Processor Signed-off-by: Jon Swenson <jmswen@gmail.com> * Fix parameter_count name * Update src/transformers/models/llama4/configuration_llama4.py * nit * Add changes for no_rope, moe_layers, chunked attention. Just need to test all * Update src/transformers/models/llama4/image_processing_llama4_fast.py * nit * fix post merge with main * support flex attention * fixes * fix * add layer * small updates * rebase and delete llm_compressor * nit * [llama4/mm] Add back <\|image\|> token that delimits global tile * [llama4/mm] Fix Llama 4 image processing unit tests * add explicit dtype Signed-off-by: Jon Swenson <jmswen@gmail.com> * sdpa works * comment todo small * fix model loading Signed-off-by: Zijing Liu <liuzijing2014@gmail.com> * revert * nits * small fix for TP on 1 node * Read new params from config * Add <\|eom\|> * lol don't know how this got here * adding fp8 * Save processor, fix chat template * style * Add boi/eoi tokens We don't use them. * fixes for now flex seems to work :) * updates * nits * updates * missking keys * add context parallel * update * update * fix * nits * add worldsize and make eager attn work for vision * Ignore new key present in base models * add tp_plan * fix nope Signed-off-by: Zijing Liu <liuzijing2014@gmail.com> * minor fix Signed-off-by: Zijing Liu <liuzijing2014@gmail.com> * Clean up Llama4 vision model * current updates * add support for `attn_temperature_tuning` * add floor scale * add missing attn scales * push what works, dirty trick for the device synch * oups * Fix pad_token_id See https://huggingface.co/ll-re/Llama-4-Scout-17B-16E/discussions/2/files Confirmed in the original codebase. * fix causallml loading * rm * fix tied-weights * fix sdpa * push current version * should work with both short and long * add compressed_tensos & fix fbgemm tp * Fix flex impl * style * chunking * try to revert the potentially breaking change * fix auto factory * fix shapes in general * rm processing * commit cache utils cleanup * Fix context length * fix * allocate * update tp_plan * fix SDPA! * Add support for sparse `Llama4TextMoe` layer from the kernel hub * cleanup * better merge * update * still broken fixing now * nits * revert print * Write max_position_embeddings and max_model_length * Update modeling_llama4.py * Save attention_chunk_size * Sync eos terminators * Read initializer_range * style * remove `dict` * fix * eager should use `chunked_attention_mask` * revert * fixup * fix config * Revert "Merge pull request #36 from huggingface/sparse-llama4-moe" This reverts commit `ccda19f050`, reversing changes made to `a515579aed`. * Fix typo and remove warning with compiled flex and chunked prefill * Fix MoE vs FF (#41) * fix * Use correct no_rope_layers if provided one is empty list * update tests * fix * skipping some tests * fix fp8 loading Signed-off-by: Zijing Liu <liuzijing2014@gmail.com> * fix text geneartion pipeline Signed-off-by: Zijing Liu <liuzijing2014@gmail.com> * eager needs 4D mask * fix * Some cleanup * fix * update * fix * replace correctly module * patch * modulelist * update * update * clean up * Don't move to `cuda:0` in distributed mode * restrict to compressed tensors for now * rm print * Docs! * Fixes * Update docs/source/en/model_doc/llama4.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Fixes * cuda graph fix * revert some stuff * fixup * styling * Update src/transformers/models/llama4/modeling_llama4.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fixup * commit licence, cleanup here and there and style * more styling changes * fix dummies * fix and clean docstrings * remove comment * remove warning * Only fast image processor is supported * nit * trigger CI * fix issue with flex encoder * fix dynamic cache * Code quality * Code quality * fix more tests for now * Code quality * Code quality * Nuke bunch of failing stuff * Code quality * Code quality * cleanup removal of slow image processor * ruff fix fast image processor * fix * fix styling * Docs * Repo consistency * Repo consistency * fix sliding window issue * separate llama cache * styling * Repo consistency * Repo consistency * push waht works * L4 Repo consistency * Docs * fix last last alst alst alst alstsaltlsltlaslt --------- Signed-off-by: Jon Swenson <jmswen@gmail.com> Signed-off-by: Zijing Liu <liuzijing2014@gmail.com> Co-authored-by: yonigozlan <yoni.gozlan10@gmail.com> Co-authored-by: Pedro Cuenca <pedro@huggingface.co> Co-authored-by: Pablo Montalvo <pablo.montalvo.leroux@gmail.com> Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> Co-authored-by: Keyun Tong <tongkeyun@gmail.com> Co-authored-by: Zijing Liu <liuzijing2014@users.noreply.github.com> Co-authored-by: Lu Fang <fanglu@fb.com> Co-authored-by: Zijing Liu <liuzijing2014@gmail.com> Co-authored-by: Jon Swenson <jmswen@gmail.com> Co-authored-by: jmswen <jmswen@users.noreply.github.com> Co-authored-by: MekkCyber <mekk.cyber@gmail.com> Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com> Co-authored-by: Mohit Sharma <mohit21sharma.ms@gmail.com> Co-authored-by: Yong Hoon Shin <yhshin@meta.com> Co-authored-by: Marc Sun <marc@huggingface.co> Co-authored-by: drisspg <drisspguessous@gmail.com> Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com> Co-authored-by: Daniël de Kok <me@danieldk.eu> Co-authored-by: Lysandre <hi@lysand.re> Co-authored-by: Ye (Charlotte) Qi <ye.charlotte.qi@gmail.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-04-05 22:02:22 +02:00
Rahul Tuli	ebe47ce3e9	Fix: Unexpected Keys, Improve `run_compressed`, Rename Test Folder (#37077 )	2025-04-04 21:30:11 +02:00
byi8220	a4e55fcff8	Disable delay_optimizer_creation in `Trainer` to support fsdp2 (#37147 ) * github why you do this * fix * make fixup * disable cpu offload test * fixup * tmp reworks * git branch movement * make fixup * add require_fsdp_v2_version * dep issues * update ruff and fixup	2025-04-04 20:11:37 +02:00
Matt	8ebc435267	Fix llava_onevision tests (#37280 ) * Fix llava_onevision tests * Trigger tests	2025-04-04 15:03:38 +01:00
Joao Gante	acbcb5d07d	[Tests] flaky `test_constrained_beam_search_generate_dict_output` (#37276 )	2025-04-04 13:38:42 +01:00
cyyever	edd345b52e	Fix deprecated PT functions (#37237 ) * Fix deprecated PT functions Signed-off-by: cyy <cyyever@outlook.com> * Revert some changes Signed-off-by: cyy <cyyever@outlook.com> --------- Signed-off-by: cyy <cyyever@outlook.com>	2025-04-04 12:31:11 +01:00
Nikos Antoniou	f74d7da836	Introduce modular files for speech models (#35902 ) * WAV_2_VEC_2 to WAV2VEC2 * added modular files for hubert, wavlm, wav2vec2_bert, data2vec_audio * remove unnessary definitions in modulars * added modular files for UniSpeech, UniSpeechSat, Wav2Vec2Conformer * docstring fix for UniSpeechForCTC * removed unneccessary re-definition of modular classes * reverted lazy imports change on modular_model_converter, type-alias for Wav2Vec2BaseModelOutput * top-level import of deepspeed in seamless_m4t, speecht5 * avoid tracking imports inside classes, relocate lazy deepspeed, peft imports in their original locations * convert modular * tiny modular typing fixes * some more modular fixes * make style --------- Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com> Co-authored-by: Eustache Le Bihan <eulebihan@gmail.com>	2025-04-04 11:46:27 +02:00
Raushan Turganbay	41b9b92b52	[qwen-vl] fix image processor (#37258 ) * fix * add test	2025-04-03 19:48:56 +02:00
Matt	2d46a08b63	Purge unused ModelTester code (#37085 ) * Purge correctly this time * Remove more methods from recent PRs * make fixup	2025-04-03 17:48:35 +01:00
Joao Gante	9a1c1fe7ed	[CI] green llama tests (#37244 ) * green llama tests * use cleanup instead * better test comment; cleanup upgrade * better test comment; cleanup upgrade	2025-04-03 14:15:53 +01:00
Matt	782d7d945d	Allow flexible generation params arg when checking pipeline specs (#37211 ) * Allow flexible generation params arg * Trigger tests * Add docstring and rename js_generate to hub_generate	2025-04-03 13:29:36 +01:00
Yao Matrix	f697b3f824	enable 2 types of case on XPU (#37198 ) enable 2 types of case on XPU 1. test_resize_tokens_embeddings_with_deepspeed_multi_gpu 2. test_resize_embeddings_untied_with_deepspeed_multi_gpu Signed-off-by: YAO Matrix <matrix.yao@intel.com>	2025-04-03 11:37:55 +02:00
Joao Gante	2099287a59	[CI] lazy loading external datasets (#37218 )	2025-04-03 09:57:45 +01:00
Fanli Lin	a0803a9555	[tests] fix mamba integration simple inference precision issue (#37193 ) * fix precision issue * use float32	2025-04-03 10:38:03 +02:00
Cyril Vallez	6ce238fe7a	Fix test (#37213 ) * Update test_modeling_common.py * style	2025-04-03 10:24:34 +02:00
Matt	3d133cc557	Stop DOSing the Hub in the CI (#37209 ) * As the title suggests, stop hammering the same files * make fixup * Use shutil instead of pathlib	2025-04-02 17:19:33 +01:00
Joao Gante	e90d55ebcc	[Tests] add `min_new_tokens` to prevent flaky length checks (#37175 )	2025-04-02 15:24:00 +01:00
Matt	cbfa14823b	No more dtype_byte_size() (#37144 ) * No more dtype_byte_size() * Remove function once again * Fix rebase cruft * Trigger tests	2025-04-02 14:58:38 +01:00
Yih-Dar	adfc91cd46	Try to avoid/reduce some remaining CI job failures (#37202 ) * try * try * Update tests/pipelines/test_pipelines_video_classification.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2025-04-02 14:39:57 +02:00
Xavier Dupré	6f5dc9c82e	Fixes DynamicCache export issues due to control flow and inplace modifications (#36652 ) * Remove unnecessary masked_fill in deberta models * Enable some code when exporting but not compiling * add missing import * style * replace if by torch.cond * style * use numel * style * add unit tests * style * change empty value for dynamic cache * replace != [] by numel() * fix import issue * style	2025-04-02 12:04:40 +01:00
Jerry Zhang	a165458901	Add device workaround for int4 weight only quantization after API update (#36980 ) * merge * fix import * format * reformat * reformat --------- Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>	2025-04-02 12:42:22 +02:00
Raushan Turganbay	211e4dc9a4	[chat-template] fix video loading (#37146 ) * fix * add video * trigger * push new iamges * fix tests * revert --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-04-02 11:27:50 +02:00
Yih-Dar	35253076f4	Avoid pipeline test failing related to Hub call (#37170 ) * cls * cls * cls --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-04-01 18:22:45 +02:00
Pavel Iakubovskii	3249c5dc15	Refactor attention for SigLIP based models (#36981 ) * Update Siglip attention implementation * Update tests for Siglip * Remove one level of indentation * Update test to be more specific * Fixup * Idefics2 * Idefics3 * Emu3 * SmolVLM * Phi4 (just init small update) * Idefics2 (test fix) * Update siglip2 tests * Update eager * trigger * Clean up * Transfer inputs to device in test * Fixing test * Fixing test * Revert contiguous * Remove unused is_flash_attn_2_available * Move flaky to specific models	2025-04-01 15:37:25 +02:00
Yao Matrix	24e311f42b	fix XPU UT error case brough by RNG difference btw XPU and CUDA (#37121 ) * fix XPU UT error case brough by RNG difference btw XPU and CUDA Signed-off-by: YAO Matrix <matrix.yao@intel.com> * enable tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits and tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits_bf16 on xpu Signed-off-by: YAO Matrix <matrix.yao@intel.com> * Revert "enable tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits and tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits_bf16 on xpu" This reverts commit `3ef83a4f02`. --------- Signed-off-by: YAO Matrix <matrix.yao@intel.com>	2025-04-01 13:52:55 +01:00
Tom Aarsen	897ff9af0e	[`ModernBERT`] Never save 'reference_compile' config; should be set based on end user (#36305 ) * Never save 'reference_compile' config; should be set based on end user * Reformat (I ran 'make style' from the wrong env) * Use pop instead of del Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * Use pop instead of del Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> --------- Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>	2025-04-01 14:14:39 +02:00
Qizhi Chen	fac70ff3c0	Convert `_VALID_DICT_FIELDS` to class attribute for shared dict parsing in subclasses (#36736 ) * make _VALID_DICT_FIELDS as a class attribute * fix test case about TrainingArguments	2025-04-01 12:29:12 +02:00
Yao Matrix	8f6b27eb5c	enable `test_assisted_decoding_in_different_gpu` test on XPU (#37120 ) Signed-off-by: YAO Matrix <matrix.yao@intel.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-04-01 11:22:59 +02:00
jiqing-feng	737cbd2109	Fix llava xpu tests. (#37130 ) * fix llava 4bit xpu test Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix llava 4bit xpu test Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix format Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix format Signed-off-by: jiqing-feng <jiqing.feng@intel.com> --------- Signed-off-by: jiqing-feng <jiqing.feng@intel.com>	2025-04-01 11:10:13 +02:00
jiqing-feng	3a6ab46a0b	add gpt2 test on XPU (#37028 ) * add gpt2 test on XPU Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * auto dtype has been fixed Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * convert model to train mode Signed-off-by: jiqing-feng <jiqing.feng@intel.com> --------- Signed-off-by: jiqing-feng <jiqing.feng@intel.com>	2025-04-01 11:09:29 +02:00
cyyever	786d9c5ed9	Fix more inefficient PT operations (#37060 ) * Fix inefficient operations * Remove cpu() call * Reorder detach() * Reorder detach() * tolist without detach * item without detach * Update src/transformers/models/rag/modeling_rag.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update tests/models/encodec/test_modeling_encodec.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Use detach().cpu().numpy * Revert some numpy operations * More fixes --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2025-03-31 16:31:24 +01:00
Pavel Iakubovskii	a1e389e637	Refactor `return_dict` logic to remove complicated if/else paths (#36794 ) * SAM * CLIP * SigLIP * GOT-OCR2 (depends on SAM) * SigLIP2 (depends on SigLIP) * trigger tests * Fix SAM * Fix missed indexing, use named attributes * Llama * Aria * Bamba * Update llama: missed outputs return type * (fixup) Aria * DiffLlama * Emu3 * Gemma * Gemma2 * Paligemma * Fix paligemma * Gemma3 * GLM * Helium * JetMoe * Jamba * Mistral * Mistral * Mixtral * Nemotron * Olmo * Olmo2 * Persimmon * Phi * Phi3 * PhiMoe * Qwen2 * Qwen2_moe * StableLM * Starcoder2 * Add return_dict decorator * SAM * Update decorator: compile, export, trace - friendly * Llama (decorator) * SAM (decorator) * Add decorator `can_return_tuple` * Llama * Update to decorator * Update CLIP * Update decorator to store `_is_top_level_module` in self * Update decorator to correctly handle compile/export * Remove is_torchdynamo_compiling constraint, all work fine with self attribute assignment * Typing * GPT NeoX * Fixup * Fix attribute Granite * Fix return type mixtral * Update Gemma3 * Fix Cohere amd Cohere2 * Fixup * Fix corner case for Phi4, when activation is shared * (fix-copies) deepseekv3, phi4 * Fixup * Apply to qwen3/qwen3_moe * Fix	2025-03-31 16:23:37 +01:00
Cyril Vallez	f304318f5f	Remove low_cpu_mem_usage and _fast_init (#36963 ) * Remove low_cpu_mem_usage and _fast_init * Update deepspeed.py * Update modeling_utils.py * remove the first 2 tests everywhere * Update test_modeling_common.py * remove what was remaining about fast_init * fix logic and simplify * mismatched keys logic update * Update modeling_utils.py * Update modeling_utils.py * Update modeling_utils.py * Update modeling_utils.py * fix 2 models init_weights * extend to others * remove grad * Update modeling_fsmt.py * init weights in tests * style * Update test_modeling_fsmt.py * more old models * fix more init_weights * copies * fix * style * Update modeling_lxmert.py * fix inits * more and more * more * should finalize * style * Update modeling_dinov2_with_registers.py * fix * Update modeling_encoder_decoder.py * fix * style * Update modeling_lxmert.py * post rebase cleanup * Update modeling_informer.py * back to start for device * fix * add test to detect all failing cases correctly * Update test_modeling_common.py * fix * fix * sam * style * Update modeling_maskformer_swin.py * CIs * CIs * remove test - will add it on separate PR * fix * fix * Update modeling_sam.py * CIs * CIs * CIs * convnext * suggestions * CIs * fix copies after merge --------- Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-03-31 17:18:43 +02:00
Raushan Turganbay	8805600406	[qwen3] fix generation tests (#37142 ) * do not skip tests * fix qwen3-moe as well * fixup * fixup	2025-03-31 16:33:41 +02:00
Zhen	e686fed635	[Feature] Support using FlashAttention2 on Ascend NPU (#36696 ) * [Feature] Support using flash-attention on Ascend NPU * Fix qwen3 and qwen3_moe moduler conversion mismatch	2025-03-31 16:12:58 +02:00
Yih-Dar	a03cee7a1d	skip (#37141 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-03-31 15:38:40 +02:00
Guang Yang	3b07ca78bb	Export T5 (encoder-decoder) to ExecuTorch (#36486 ) Co-authored-by: Guang Yang <guangyang@fb.com>	2025-03-31 12:10:26 +02:00
Fanli Lin	475664e2c6	[tests] remove cuda-only test marker in `AwqConfigTest` (#37032 ) * enable on xpu * add xpu support --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-03-31 11:53:02 +02:00
Armaghan Shakir	0710e9b1e8	Create and Expose SamVisionModel as public for better accessibility (#36493 ) * move encoder below * auto modeling * write SamVisionTester * fix vision attention shape * fix SamVisionTest * minor changes to SamVisionTest * Revert "fix vision attention shape" This reverts commit `d2a4083ae5`. * fix attention output shape in new tests * remove encoder examples * run modular on got_ocr2 * code formatting * fix got_ocr2 * ruff fixes * code quality * add sam_vision in auto modeling and auto configuration * remove composite test * updated index.md * add TFSamVisionEncoder to __init__ * fix public TFSamVisionEncoder * remove outdated todo comment * set test_torch_exportable Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * rename: VisionEncoder -> VisionModel * bring back original SamVisionEncoder * rename back: VisionEncoderOutput -> VisionModelOutput * undo changes in SamModelTester * reuse SamVisionEncoder in SamVisionModel --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2025-03-31 11:45:07 +02:00
cyyever	f99c279d20	Remove deprecated code (#37059 ) * Remove deprecated code * fix get_loading_attributes * fix error * skip test --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>	2025-03-31 11:15:35 +02:00
huismiling	d0b65bb479	[MLU] Fix FA2 check error, remove deepspeed-mlu deps. (#36159 ) * add Cambricon MLUs support * fix mlu device rng state * up for quality check * up mlu to support fp16 * fix mlu device dependency error * fix mlu device dependency error * enable mlu device for bf16 * fix mlu device memory tracker * Cambricon support SDPA and flash_attn * MLU devices : Checks if `mlu` is available via an `cndev-based` check which won't trigger the drivers and leave mlu * Fix mlu FA2 check. Remove deepspeed-mlu check. add mlu tests support. * fix testing errors. * Merge branch 'hf/main' into main * fix get_device_count error. * fix mlu testing utils. * fix code quality and style. * switch to @require_torch_multi_accelerator	2025-03-31 11:02:49 +02:00
jiqing-feng	286393fbb1	enable tp on CPU (#36299 ) * enable tp on CPU Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * get rank from cpu Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * update Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * enable TP tests Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix comment Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * em print Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix model id Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix conflict Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix index and add doc Signed-off-by: jiqing-feng <jiqing.feng@intel.com> --------- Signed-off-by: jiqing-feng <jiqing.feng@intel.com>	2025-03-31 10:55:47 +02:00
Bo Zheng	6acd5aecb3	Adding Qwen3 and Qwen3MoE (#36878 ) * Initial commit for Qwen3 * fix and add tests for qwen3 & qwen3_moe * rename models for tests. * fix * fix * fix and add docs. * fix model name in docs. * simplify modular and fix configuration issues * Fix the red CI: ruff was updated * revert ruff, version was wrong * fix qwen3moe. * fix * make sure MOE can load * fix copies --------- Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>	2025-03-31 09:50:49 +02:00
Joao Gante	9fd9476005	[generate] beam search -- fix output cropping (#37080 ) * handle jagged beams * better comment * bart -- beam search tests print special tokens * more bart test updates * more tests! * better comment	2025-03-28 18:57:51 +01:00
Minho Ryu	eca74d1367	[WIP] add deepseek-v3 (#35926 ) * init commit * style * take comments into account * add deepseekv3 modeling * remove redundant code * apply make style * apply fix-copies * make format * add init files * rename deepseekv3 into deepseek_v3 based on its model_type * rename deepseekv3 into deepseek_v3 based on its model_type * deepseek-v3 not deepseek_v3 * set model_type as deepseek_v3 * use default docs * apply make * fill type and docstring * add rope_config_validation * use custom DeepseekV3MLP * hold code only for checkpoints congifuration; remove redundant * revise rope yarn for DeepSeek variation * rename DeepSeek-V3 * some refactoring * revise load_hook to work properly; make moe func trainable; use llama instead of mixtral * fix attention forward * use -1 for not-changing dim when to use exapnd * refactor DeepseekV3TopkRouter * use reshape_for_rope instead of load_hook; revise attention forward for TP; rename q_head_dim with qk_head_dim * register pre_hook and hook both * make style * use n_shared_experts * Update src/transformers/models/deepseek_v3/configuration_deepseek_v3.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * add test file * update modeling_file according to modular file * make style * add mapping for DeepseekV3ForSequenceClassification * remove aux_loss_alpha * add deepseek_v3 for perf * add deepseek_v3 * rename test as deepseekv3 * use tiny-deepseek-v3 * remove DeepseekV3ForSequenceClassification * cache before padding * remote output_router_logits * Revert "remote output_router_logits" This reverts commit `f264f800d0`. * remove output_router_logits * make e_score_correction_bias as buffer * skip tests not compatible * make style * make e_score_correction_bias as buffer * use rope_interleave instead of load_hook * skip tests not compatible with MLA * add doc for rope_interleave * fix typo * remove torch.no_grad for selecting topk * fix post merge issue * mrege with main and simplify * nits * final * small fixes * fix * support TP better * stash * changes currently requires * remove synch * more fixes for TP * temp fix for TP : some attention layers's FP8 scales are too small + shared is local colwise and anything is local if FP8 because weights are used * updates to have generation work! * push most of the changes * reorder functions + call for contributions! * update readme * nits * update * ruff was updated on main * merge with main and fix copies * revert unrelated changes * route all tokens to all experts when testing to avoid no gradient iddues * finish fixing all tests * fixup * nit * clean config * last readme changes * nit * do cnit * typo * last nit * one more one more --------- Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: arthur@huggingface.co <arthur@ip-26-0-165-131.ec2.internal>	2025-03-28 15:56:59 +01:00
Yih-Dar	1fcaad6df9	Use `lru_cache` for tokenization tests (#36818 ) * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-03-28 15:09:35 +01:00
cyyever	6cc9c8d7d1	Remove deprecated batch_size parameter (#37007 )	2025-03-27 15:01:56 +00:00
cyyever	41a0e58e5b	Set weights_only in torch.load (#36991 )	2025-03-27 14:55:50 +00:00
Joao Gante	29f322d04d	[generate, cache] handle more complex device maps (#37014 )	2025-03-27 14:33:20 +00:00
eustlb	fb8e6c50e4	[audio utils] fix fft_bin_width computation (#36603 ) * fix fft_bin_width computation * update docstring + enforce correct params * update test with correct value * udpate test * update feature extractors for concerned models * update * make * udpate docstring * udpate docstring	2025-03-27 15:20:02 +01:00
Raushan Turganbay	e97c760006	[chat templates} support loading audio from video (#36955 ) * add audio from video * typos * delete print * comments	2025-03-27 14:46:11 +01:00
Sungyoon Jeong	d1eafe8d4e	Optimize `to_py_obj` for python-native numeric lists and scalars (#36885 ) * Optimize to_py_obj for python-native numeric lists and scalars * Fix bug that tuple is not converted to list * Try np.array for more robust type checking * Apply review and add tests for to_py_obj	2025-03-27 14:16:46 +01:00
jiqing-feng	0e56fb69a2	fix pegasus init weights and other copied models (#36844 ) * fix pegasus init weights Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix the rest of models Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix test Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix informer init Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * init weight before checking Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix roformer tests Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix roformer tests Signed-off-by: jiqing-feng <jiqing.feng@intel.com> --------- Signed-off-by: jiqing-feng <jiqing.feng@intel.com>	2025-03-27 14:14:30 +01:00
Mohamed Mekkouri	92429057d9	Skip FP8 linear tests For device capability < 9.0(#37008 ) * skip fp8 linear * add capability check * format	2025-03-27 12:38:37 +01:00
Yih-Dar	d13c390d01	Mark 2 tests as flaky for now (#37038 ) * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-03-27 10:59:47 +01:00
Abu Bakr Soliman	49b5ab6a27	Support QuestionAnswering Module for ModernBert based models. (#35566 ) * push ModernBertForQuestionAnswering * update ModernBertForQuestionAnswering * update __init__ loading * set imports for ModernBertForQuestionAnswering * update ModernBertForQuestionAnswering * remove debugging logs * update init_weights method * remove custom initialization for ModernBertForQuestionAnswering * apply make fix-copies * apply make style * apply make fix-copies * append ModernBertForQuestionAnswering to the pipeline supported models * remove unused file * remove invalid autoload value * update en/model_doc/modernbert.md * apply make fixup command * make fixup * Update dummies * update usage tips for ModernBertForQuestionAnswering * update usage tips for ModernBertForQuestionAnswering * add init * add lint * add consistency * update init test * change text to trigger stuck text * use self.loss_function instead of custom loss By @Cyrilvallez Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com> * Update modeling_modernbert.py make comparable commit to even it out * Match whitespace * whitespace --------- Co-authored-by: Matt <rocketknight1@gmail.com> Co-authored-by: Orion Weller <wellerorion@gmail.com> Co-authored-by: Orion Weller <31665361+orionw@users.noreply.github.com> Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>	2025-03-26 21:24:18 +01:00
cyyever	2b550c47b2	Remove deprecated training arguments (#36946 ) * Remove deprecated training arguments * More fixes * More fixes * More fixes	2025-03-26 16:44:48 +00:00
cyyever	e7139d06f5	Fix tensor dtype mismatch (#36985 ) * Fix tensor dtype mismatch * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-03-26 10:37:46 +01:00
湛露先生	ebd2029483	Change GPUS to GPUs (#36945 ) Signed-off-by: zhanluxianshen <zhanluxianshen@163.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-03-25 17:25:39 +01:00
Yih-Dar	c6814b4ee8	Update ruff to `0.11.2` (#36962 ) * update * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-03-25 16:00:11 +01:00
Joao Gante	bc1c90a755	[Utils] torch version checks optionally accept dev versions (#36847 )	2025-03-25 10:58:58 +00:00
Raushan Turganbay	0f733110a6	Support `return_tensors` in audio chat templates (#34601 ) * add audio chat templates * update * update * nit * green ci * we dont care about the order anymore * clean up after rebase * overriden tests rename * rename shieldgemma also * one more rename * require_read_token * removde images/videos * retrigger CI flaky	2025-03-25 11:08:47 +01:00
Afanti	19085c28da	fix typos in the tests directory (#36932 ) * chore: fix typos in test codes * chore: fix typos in test codes * chore: fix typos in test codes * chore: fix typos in test codes * chore: fix typos in test codes * chore: fix typos in test codes * chore: fix typos in test codes * chore: fix typos in test codes * chore: format codes	2025-03-25 10:49:24 +01:00
Guang Yang	69bcb86c58	Export for Phi4-mini (#36780 ) * Export for Phi4-mini * Update tests/models/phi3/test_modeling_phi3.py --------- Co-authored-by: Guang Yang <guangyang@fb.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-03-25 10:46:38 +01:00
Cyril Vallez	4303d88c09	Add Phi4 multimodal (#36939 ) * raw start * update * update * add to imports * update * up * simplify configs * clean configs * style * typos * Update convert_phi4_multimodal_weights_to_hf.py * Update convert_phi4_multimodal_weights_to_hf.py * fix * up * up * up * Update convert_phi4_multimodal_weights_to_hf.py * Update convert_phi4_multimodal_weights_to_hf.py * up * up * up * Update feature_extraction_phi4_multimodal.py * up * up * up * up * up * simplify configs * typo * cut code * typo * typo * typo * re * typo * up * up * up * add tests * fix * fix * Update test_modeling_phi4_multimodal.py * up * Update test_modeling_phi4_multimodal.py * doc * fix * up * up * up * up * up * up * simplify * up * simplify * config docstrings * cleanup * clean * typo * typo * fix * Update phi4_multimodal.md * fix * fix * Update test_modeling_phi4_multimodal.py * update * simplify reshapes and permutes * up * simplify special tokens * simplify processor a lot * Update processing_phi4_multimodal.py * Update processing_phi4_multimodal.py * switch to fast processor * image processor * Update image_processing_phi4_multimodal_fast.py * add lora extraction to converter * Update convert_phi4_multimodal_weights_to_hf.py * Update __init__.py * add AudioInput type in audio_utils * rewrite feature_extraction: support torch batched FFT * input_audio_embeds -> audio_input_features, input_image_embeds -> image_pixel_values * test update * not mono channel warning update * remove auto maps from processor * kargs dispatch in processor * simplify kwargs dispatch * simplify merging * remove default sampling rate * style * Update test_modeling_phi4_multimodal.py * update doc * doc * torch only feature extractor * make fake tokens adjustable * Update feature_extraction_phi4_multimodal.py * fix * Update processing_phi4_multimodal.py * simplify mask * last touch * fix copies * style * Update audio_utils.py * style * Update feature_extraction_phi4_multimodal.py * Update __init__.py * docstrings * copies * fix all checks * back to fix-copies * trigger CIs * Update feature_extraction_phi4_multimodal.py * improve tests with multimodal inputs * trigger CIs --------- Co-authored-by: Eustache Le Bihan <eulebihan@gmail.com>	2025-03-25 09:55:21 +01:00
Raushan Turganbay	47e5432805	Deprecate #36741 and map Causal to Conditional (#36917 ) * deprecate the prev fix * reword warning and update docs * reword warning * tests * dont bloat `get_text_config()`	2025-03-25 09:13:56 +01:00
Yoni Gozlan	91455c1825	Fix processor kwargs qwen2 vl (#36890 ) * Fix qwen2_vl and qwen2_5_vl processors cutom images kwargs * change version warning	2025-03-24 13:19:26 -04:00
gautham	48385aa4f4	Added support for seed in `DataCollatorForWholeWordMask` (#36903 ) * Added support for seed in `DataCollatorForWholeWordMask`, and also wrote tests. Also fixed bugs where the code hardcoded values for mask replacement probability and random replacement probability, instead of using the values passed by the user. * formatting issues * Used better way to generate seed in TF. Made tests more consistent.	2025-03-24 16:57:17 +00:00
omahs	cbf924b76c	Fix typos (#36910 ) * fix typos * fix typos * fix typos * fix typos	2025-03-24 14:08:29 +00:00
Yih-Dar	340500b1a9	Use another repo. for Mistral3 processor testing (#36925 ) * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-03-24 14:36:05 +01:00
Raushan Turganbay	57f551c78d	[chameleon] fix num image token check (#36918 ) * [chameleon] fix num image token check * embed after merging image token * skip this also * mistral require_read_token	2025-03-24 12:36:08 +01:00
Yoni Gozlan	beb9b5b022	Fix Pan and Scan on batched images Gemma3 (#36864 ) * process flattened images in fast image proc * process flattened images in low proc and add tests * remove print * add unbalanced batch test pas image proc * fix integration tests	2025-03-21 13:56:00 -04:00
Cyril Vallez	dd3933dd65	Simplify keep_in_fp32_modules logic (#36722 ) * better regex everywhere * fix * Update test_modeling_instructblip.py * BC with explanations this time otherwise it makes no sense at all * Update test_modeling_instructblip.py * style * CIs * update _keep_in_fp32_modules in blip2 * Update modeling_utils.py * Update modeling_utils.py * style * CIs * add check * trigger CIs * Update modeling_utils.py * trigger CIs	2025-03-21 16:12:59 +01:00
Sukriti Sharma	90e2df5d55	fix: loss computation after embeddings resize - mllama (#36840 ) * move loss to generation class Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * code cleanup Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * test for resize and loss computation Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * fix tests Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * fix:test for resize and loss Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * fix resize embedding mllama test Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * review changes Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> --------- Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>	2025-03-21 14:47:59 +01:00
Raushan Turganbay	523f6e743c	Fix: dtype cannot be str (#36262 ) * fix * this wan't supposed to be here, revert * refine tests a bit more	2025-03-21 13:27:47 +01:00
Pablo Montalvo	2638d54e78	Gemma 3 tests expect greedy decoding (#36882 ) tests expect greedy decoding	2025-03-21 12:36:39 +01:00
Joao Gante	94f487626a	[generate] model defaults being inherited only happens for newer models (#36881 )	2025-03-21 11:01:09 +00:00
Afanti	26c83490d2	chore: fix typos in the tests directory (#36813 ) * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * fix: format codes * chore: fix copy mismatch issue * fix: format codes * chore: fix copy mismatch issue * chore: fix copy mismatch issue * chore: fix copy mismatch issue * chore: restore previous words * chore: revert unexpected changes	2025-03-21 10:20:05 +01:00
Raushan Turganbay	97d2f9d8ae	Mllama: raise better error (#35934 ) * fix mllama * update test * fix test	2025-03-21 09:35:37 +01:00
gautham	9e771bf402	Add support for seed in `DataCollatorForLanguageModeling` (#36497 ) Add support for `seed` in `DataCollatorForLanguageModeling`. Also wrote tests for verifying behaviour.	2025-03-20 18:27:43 +00:00
Raushan Turganbay	42c489f2ae	Gemma3: fix test (#36820 ) * fix test * require_read_token and public repo ids * flash-attn test uncomment * fix torchscript	2025-03-20 18:14:53 +01:00
Haotong LIN	6515c25953	Add Prompt Depth Anything Model (#35401 ) * add prompt depth anything model by modular transformer * add prompt depth anything docs and imports * update code style according transformers doc * update code style: import order issue is fixed by custom_init_isort * fix depth shape from B,1,H,W to B,H,W which is as the same as Depth Anything * move prompt depth anything to vision models in _toctree.yml * update backbone test; there is no need for resnet18 backbone test * update init file & pass RUN_SLOW tests * update len(prompt_depth) to prompt_depth.shape[0] Co-authored-by: Joshua Lochner <admin@xenova.com> * fix torch_int/model_doc * fix typo * update PromptDepthAnythingImageProcessor * fix typo * fix typo for prompt depth anything doc * update promptda overview image link of huggingface repo * fix some typos in promptda doc * Update image processing to include pad_image, prompt depth position, and related explanations for better clarity and functionality. * add copy disclaimer for prompt depth anything image processing * fix some format typos in image processing and conversion scripts * fix nn.ReLU(False) to nn.ReLU() * rename residual layer as it's a sequential layer * move size compute to a separate line/variable for easier debug in modular prompt depth anything * fix modular format for prompt depth anything * update modular prompt depth anything * fix scale to meter and some internal funcs warp * fix code style in image_processing_prompt_depth_anything.py * fix issues in image_processing_prompt_depth_anything.py * fix issues in image_processing_prompt_depth_anything.py * fix issues in prompt depth anything * update converting script similar to mllamma * update testing for modeling prompt depth anything * update testing for image_processing_prompt_depth_anything * fix assertion in image_processing_prompt_depth_anything * Update src/transformers/models/prompt_depth_anything/modular_prompt_depth_anything.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update src/transformers/models/prompt_depth_anything/modular_prompt_depth_anything.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update docs/source/en/model_doc/prompt_depth_anything.md Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update docs/source/en/model_doc/prompt_depth_anything.md Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * update some testing * fix testing * fix * add return doc for forward of prompt depth anything * Update src/transformers/models/prompt_depth_anything/modular_prompt_depth_anything.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update tests/models/prompt_depth_anything/test_modeling_prompt_depth_anything.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * fix prompt depth order * fix format for testing prompt depth anything * fix minor issues in prompt depth anything doc * fix format for modular prompt depth anything * revert format for modular prompt depth anything * revert format for modular prompt depth anything * update format for modular prompt depth anything * fix parallel testing errors * fix doc for prompt depth anything * Add header * Fix imports * Licence header --------- Co-authored-by: Joshua Lochner <admin@xenova.com> Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2025-03-20 16:12:44 +00:00
Pavel Iakubovskii	66291778dd	Refactor Attention implementation for ViT-based models (#36545 ) * Refactor vit attention * Refactor ViT-based models * 🚨🚨🚨 Fix prefix for DPT * Update params order * trigger tests * Fix Dinov2 attention * Fix DPT attention impl propagation for backbone config * Common test fix: config is modif. inplace - avoid it * view->reshape * Fixup * Fixup * Enable IJepa FA2 * Add FA2 in corresponding model docs	2025-03-20 15:15:01 +00:00
fxmarty-amd	1a374799ce	Support loading Quark quantized models in Transformers (#36372 ) * add quark quantizer * add quark doc * clean up doc * fix tests * make style * more style fixes * cleanup imports * cleaning * precise install * Update docs/source/en/quantization/quark.md Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update tests/quantization/quark_integration/test_quark.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/utils/quantization_config.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * remove import guard as suggested * update copyright headers * add quark to transformers-quantization-latest-gpu Dockerfile * make tests pass on transformers main + quark==0.7 * add missing F8_E4M3 and F8_E5M2 keys from str_to_torch_dtype --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: Bowen Bao <bowenbao@amd.com> Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>	2025-03-20 15:40:51 +01:00
mobicham	3e8f0fbf44	Fix hqq skipped modules and dynamic quant (#36821 ) * Fix hqq skip_modules and dynamic_quant * fix skipped modules loading * add dynamic/skip HqqConfig test	2025-03-20 15:31:49 +01:00
Ryan Mullins	487dab1b2b	Shieldgemma2 (#36678 ) * single commit * correct config * fixup * dummy pt * Use ShieldGemma2Config in conversion script * Update src/transformers/models/shieldgemma2/configuration_shieldgemma2.py * Adding shieldgemma2 to models.__init__.py * Adding ShieldGemma2 to main __init__.py * Update shieldgemma2.md * Update shieldgemma2.md * Adding tests. Addressing review feedback. * Minor docs update * Fixing code quality feedback from CI * Fixing empty messages bug reported by ghunkins --------- Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com> Co-authored-by: Ren Pang <ain-soph@live.com>	2025-03-20 15:14:38 +01:00
yutong_liu	8b479e39bb	Saving `Trainer.collator.tokenizer` in when `Trainer.processing_class` is `None` (#36552 ) * feat: Saving tokenizer in collator when processing_class is None * chore: Style issue * chore: Typo * dbg: Check why test failed * dbg: Remove logics and another test failed which successed before, so should be the stablibility issue * test: Init unit-test * chore: Style * chore: Add err log * fix: Case * Update tests/trainer/test_trainer.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * chore: Try to use get_regression_trainer * fix: Impl and style * fix: Style * fix: Case * fix: Import err * fix: Missed import * fix: Import block un-sorted problem * fix: Try another tokenizer * fix: Test logic * chore: Light updates * chore: Reformat --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-03-20 11:27:47 +01:00
Ita Zaporozhets	3f03c379d2	fix tiktoken convert to pass AddedToken to Tokenizer (#36566 ) * pass AddedToken to Tokenizer * ruff * handle dict for special tokens * option: test tokenizer from tiktoken same as fast * ruff * ruff	2025-03-20 11:26:49 +01:00
Matt	9be4728af8	Just import torch AdamW instead (#36177 ) * Just import torch AdamW instead * Update docs too * Make AdamW undocumented * make fixup * Add a basic wrapper class * Add it back to the docs * Just remove AdamW entirely * Remove some AdamW references * Drop AdamW from the public init * make fix-copies * Cleanup some references * make fixup * Delete lots of transformers.AdamW references * Remove extra references to adamw_hf	2025-03-19 18:29:40 +00:00
Michael Feil	51bd0ceb9e	Update configuration_qwen2.py (#36735 ) * Update configuration_qwen2_moe.py * Update modeling_qwen2_moe.py * ruff fmt * docstring add qkv_bias	2025-03-19 18:15:54 +00:00
Tugsbayasgalan Manlaibaatar	f39f4960f3	Support tracable dynamicKVcache (#36311 ) * Support tracable dynamicKVcache * Fix lint * More fine grained test * Lint * Update * Update * Fix up * Apply suggestions from code review * Update src/transformers/cache_utils.py * Update tests/utils/test_cache_utils.py * Apply suggestions from code review * Update * Change error message * Rename * Apply suggestions from code review * Apply suggestions from code review * Apply suggestions from code review --------- Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2025-03-19 16:52:30 +00:00
Joao Gante	7c233980f4	[gemma 3] multimodal checkpoints + AutoModelForCausalLM (#36741 )	2025-03-19 15:04:19 +00:00
Yao Matrix	b11050d6a2	enable OffloadedCache on XPU from PyTorch 2.7 (#36654 ) * fix "Cannot copy out of meta tensor; no data!" issue for BartForConditionalGeneration model * follow Marc's suggestion to use _tie_weights to fix Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * enable OffloadedCache on XPU since PyTorch 2.7 Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * fix style Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * don't change bart Signed-off-by: root <root@a4bf01945cfe.jf.intel.com> * make code more concise per review comments Signed-off-by: N <matrix.yao@intel.com> * fix review comments Signed-off-by: root <root@a4bf01945cfe.jf.intel.com> * Revert "fix review comments" This reverts commit `acf1484b86`. * fix review comments Signed-off-by: root <root@a4bf01945cfe.jf.intel.com> * fix style Signed-off-by: root <root@a4bf01945cfe.jf.intel.com> --------- Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Signed-off-by: root <root@a4bf01945cfe.jf.intel.com> Signed-off-by: N <matrix.yao@intel.com> Co-authored-by: root <root@a4bf01945cfe.jf.intel.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-03-19 15:15:52 +01:00
Driss Guessous	e8d960329e	Add option for ao base configs (#36526 )	2025-03-19 14:59:47 +01:00
Mohamed Mekkouri	a861db01e5	Fix Device map for bitsandbytes tests (#36800 ) fix	2025-03-19 11:57:13 +01:00
ivarflakstad	706703bba6	Expectations test utils (#36569 ) * Add expectation classes + tests * Use typing Union instead of \| * Use bits to track score in properties cmp method * Add exceptions and tests + comments * Remove compute cap minor as it is not needed currently * Simplify. Remove Properties class * Add example Exceptions usage * Expectations as dict subclass * Update example Exceptions usage * Refactor. Improve type name. Document score fn. * Rename to DeviceProperties.	2025-03-18 23:39:50 +01:00
Joao Gante	179d02ffb8	[generate] ✨ vectorized beam search ✨ (#35802 )	2025-03-18 18:39:36 +00:00
Yoni Gozlan	30580f035b	Fix Mistral3 tests (#36797 ) * fix processor tests * fix modeling tests * fix test processor chat template * revert modeling test changes	2025-03-18 13:08:12 -04:00
Marc Sun	3017536ebf	fix hqq due to recent modeling changes (#36771 ) * fix-hqq * style * test	2025-03-18 12:20:27 +01:00
Cyril Vallez	e959530b8f	Add Mistral3 (#36790 ) * initial start * style and dummies * Create convert_mistral3_weights_to_hf.py * update * typo * typo * Update convert_mistral3_weights_to_hf.py * Update convert_mistral3_weights_to_hf.py * Update convert_mistral3_weights_to_hf.py * Update convert_mistral3_weights_to_hf.py * up * Update convert_mistral3_weights_to_hf.py * Update convert_mistral3_weights_to_hf.py * update * update * Update image_processing_mistral3.py * Update convert_mistral3_weights_to_hf.py * fix patch merger * Update convert_mistral3_weights_to_hf.py * Update convert_mistral3_weights_to_hf.py * up * update modular to fit * style * Update convert_mistral3_weights_to_hf.py * typo * Update modular_mistral3.py * simplify a lot all shape shenanigans * simplify * add working test processor * Add partially working common modeling tests * All tests working and remove mistral3 image processors * add docs and fixup * fix inference with image size >1540 * 🚨fix test image proc pixtral * Remove vision_feature_select_strategy * Update convert_mistral3_weights_to_hf.py * Update convert_mistral3_weights_to_hf.py * Update convert_mistral3_weights_to_hf.py * Update convert_mistral3_weights_to_hf.py * clean * fix test checkpoints * Update test_modeling_mistral3.py * Update test_modeling_mistral3.py * style * Use Pixtral processor * up * finish cleaning processor to use pixtral directly * Update __init__.py * Update processing_pixtral.py * doc * Update __init__.py * Update mistral3.md * Update _toctree.yml --------- Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co> Co-authored-by: yonigozlan <yoni.gozlan10@gmail.com>	2025-03-18 12:04:42 +01:00
Zebin	7426d02ea8	Fixing typo in gemma3 image_processor_fast and adding a small test (#36776 ) Co-authored-by: zebz13 <zeb@fedora> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-03-18 11:35:06 +01:00
Afanti	19b9d8ae13	chore: fix typos in tests directory (#36785 ) * chore: fix typos in tests directory * chore: fix typos in tests directory * chore: fix typos in tests directory * chore: fix typos in tests directory * chore: fix typos in tests directory * chore: fix typos in tests directory * chore: fix typos in tests directory	2025-03-18 10:31:13 +01:00
Afanti	7f5077e536	fix typos in the tests directory (#36717 )	2025-03-17 17:45:57 +00:00
Joao Gante	cff4caa0c1	[CI] remove redundant checks in `test_eager_matches_sdpa_inference` (#36740 )	2025-03-17 16:29:18 +00:00
Petr Kuderov	c8a2b25f91	Fix `TrainingArguments.torch_empty_cache_steps` post_init check (#36734 ) Mistaken use of De Morgan's law. Fixed "not (X or Y)" to correct "not (X and Y)" check to raise a ValueError. Added corresponding test to check "positive int or None" condition. Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-03-17 16:09:46 +01:00
Sambhav Dixit	8e67230860	Fix test isolation for clear_import_cache utility (#36345 ) * test fixup * test fixup * fixing tests for unused imports * style fixes * fix * style fixes * styke fix * remove isolated module cache * rm custom subprocess defination * run using exsiting fn * style fixup * make fixup * remove redundant comments * rm redundat skipif + style changes	2025-03-17 16:09:09 +01:00
jiqing-feng	27361bd218	fix xpu tests (#36656 ) * fix awq xpu tests Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * update Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix llava next video bnb tests Signed-off-by: jiqing-feng <jiqing.feng@intel.com> --------- Signed-off-by: jiqing-feng <jiqing.feng@intel.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-03-17 15:57:49 +01:00
jiqing-feng	2256875a77	fix can_generate (#36570 ) * fix can_generate Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix can generate for speecht5 and blip Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix speecht5 tests Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix Signed-off-by: jiqing-feng <jiqing.feng@intel.com> --------- Signed-off-by: jiqing-feng <jiqing.feng@intel.com> Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>	2025-03-17 14:56:18 +01:00
Marc Sun	9e94801146	enable/disable compile for quants methods (#36519 ) * disable compile for most quants methods * fix * Update src/transformers/generation/configuration_utils.py Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com> * Update tests/quantization/bnb/test_mixed_int8.py Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com> * Update src/transformers/generation/configuration_utils.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * changes from joao suggestions --------- Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2025-03-17 11:38:21 +01:00
Armaghan Shakir	c53d53da89	🚨🚨🚨 Fix sdpa in SAM and refactor relative position embeddings (#36422 ) * fall back to eager if output_attentions * improve relative position embeddings * run modular on got_ocr2 * run-slow: sam * fix run-length encoding * fix tf processor errors * update tf_sam * fix compile error * re-run tests	2025-03-17 09:39:52 +00:00
Joao Gante	fc8764c9a6	[Generation, Gemma 3] When passing a custom `generation_config`, overwrite default values with the model's base `generation_config` (#36684 )	2025-03-15 12:40:09 +00:00
Ilyas Moutawwakil	6f3e0b68e0	Fix grad accum arbitrary value (#36691 )	2025-03-14 22:03:01 +01:00
Joao Gante	42ebb6c23e	[tests] Parameterized `test_eager_matches_sdpa_inference` (#36650 )	2025-03-14 14:41:27 +00:00
Matt	9215cc62d4	Try working around the processor registration bugs (#36184 ) * Try working around the processor registration bugs * oops * Update error message * Clarify error * Docstring docstring docstring * The extra content is indexed by config class, so let's grab some values out of there * Commit my confusion as a TODO * Resolve my confusion * Cleanup and mostly revert to the original * Better autoclass fallback * Don't nest f-strings you lunatic * Clearer error message * Less getattr() * Revert a lot of changes to try a different approach! * Try the global registry * Check the dynamic list as well as the transformers root * Move the dynamic list somewhere safer * Move the dynamic list somewhere even safer * More import cleanup * Simplify all the register_for_auto_class methods * Set _auto_class in the register() methods * Stop setting the cls attribute in register() * Restore specifying the model class for Model derivatives only * Fix accidentally taking the .__class__ of a class * Revert register_for_auto_class changes * Fix get_possibly_dynamic_module * No more ALL_CUSTOM_CLASSES * Fix up get_possibly_dynamic_module as well * Revert unnecessary formatting changes * Trigger tests	2025-03-14 13:56:21 +00:00
Sean (Seok-Won) Yi	691d1b52c3	Fix/best model checkpoint fix (#35885 ) * Set best_model_checkpoint only when ckpt exists. Rather than set it explicitly without checking if the checkpoint directory even exists as before, now we moved the setting logic inside of _save_checkpoint and are only setting it if it exists. * Added best_global_step to TrainerState. * Added tests for best_model_checkpoint. * Fixed hard-coded values in test to prevent fail. * Added helper func and removed hard-coded best_step. * Added side effect patch generator for _eval. * Added evaluate side effect func. * Removed erroneous patching. * Fixed minor bug. * Applied Ruff. * Fixed Ruff problem in make style. * Used Trainer.set_initial_training_values.	2025-03-14 14:24:53 +01:00
Matt	72861e11eb	Make the flaky list a little more general (#36704 ) * Make the flaky list a little more general * Trigger tests * Make the flaky list a little more general	2025-03-14 12:15:32 +00:00
Kingsley	53742b11f5	Gemma3 processor typo (#36710 ) * fix typo when is on * tiny * add test and remove 'text_crops' * lint	2025-03-14 13:07:55 +01:00
Matt	48ef468c74	Final CI cleanup (#36703 ) * make fixup * make fixup * Correct skip decorator * Add TODOs * add is_flaky() parentheses	2025-03-13 17:26:09 +00:00
Cyril Vallez	2a004f9ff1	Add loading speed test (#36671 ) * Update test_modeling_utils.py * Update test_modeling_utils.py * Update test_modeling_utils.py * Update test_modeling_utils.py * Update test_modeling_utils.py * Update test_modeling_utils.py * trigger CIs * Update test_modeling_utils.py * Update test_modeling_utils.py * Update test_modeling_utils.py * better error messages * Update test_modeling_utils.py * Update test_modeling_utils.py	2025-03-13 17:07:30 +01:00
bd793fcb	87b30c3589	fix wandb hp search unable to resume from sweep_id (#35883 ) * fix wandb hp search unable to resume from sweep_id * format styles --------- Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-03-13 12:32:26 +01:00
Mohamed Mekkouri	47cc4da351	Changing the test model in Quanto kv cache (#36670 ) changing model	2025-03-13 12:23:34 +01:00
Marc Sun	fbb18ce68b	Update config.torch_dtype correctly (#36679 ) * fix * style * new test	2025-03-13 12:08:02 +01:00
Joao Gante	c4161238bd	[Cache] Don't initialize the cache on `meta` device (#36543 )	2025-03-13 10:13:29 +00:00
Yoni Gozlan	ea219ed164	Remove differences between init and preprocess kwargs for fast image processors (#36186 ) * Remove differences between init and preprocess kwargs in fast image processors * make modifs got_ocr2 * update gemma3	2025-03-12 19:44:05 -04:00
Yoni Gozlan	bc3253f076	Remove hardcoded slow image processor class in processors supporting fast ones (#36266 ) * Add fast image processor class to processors supporting them * fix test kosmos2	2025-03-12 18:39:25 -04:00
Mohamed Mekkouri	0013ba61e5	Fix Failing GPTQ tests (#36666 ) fix tests	2025-03-12 20:03:02 +01:00
Matt	c7eb95581a	Don't accidentally mutate the base_model_tp_plan (#36677 ) * Don't accidentally mutate the base_model_tp_plan * Co-authored by: Joao Gante <joaofranciscocardosogante@gmail.com> * Trigger tests * Marking grad accum test as slow * Add a flaky decorator * Add a flaky decorator * Use cyril's codeblock * Don't copy() when it's None * Use cyril's new codeblock * make fixup	2025-03-12 18:59:13 +00:00
Cyril Vallez	071a161d3e	[core] Large/full refactor of `from_pretrained` (#36033 ) * squash everything together start to simplify inner logic Update modeling_utils.py Update modeling_utils.py Update modeling_utils.py Update modeling_utils.py continue refactor fix small fixes add type hints/docstring Update modeling_utils.py remove _fast_init keep improving Update modeling_utils.py Update modeling_utils.py new first tp loading version style fix weird in-place op trigger CIs Update modeling_utils.py much clearer renaming of keys fix update Update test_modeling_common.py trigger CIs update update style Update modeling_utils.py Update modeling_utils.py Update modeling_utils.py fix fast download first prototype remove old function remove old functions Remove unused function and move back _get_tp_registry fix tp plan registry simplify CIs Update hub.py Update modeling_utils.py simplify simplify renaming logic remove unused check add sanity check back (a test depends on it) Update modeling_utils.py finalize sound renaming logic style add forgotten check Update modeling_utils.py add key_mapping keyword style Update modeling_utils.py add comment minor updates minor change for clarity fix small prefix issue and simplify style trigger CIs typo fix Post rebase fix post rebase cleanup simplify tp typo oupsi typo correctly escape improvements based on Marc's review finalize Marc's review comments squash everything * improve * Update modeling_utils.py * Update modeling_utils.py * fix * Update modeling_utils.py * Update modeling_utils.py * style * Update modeling_utils.py * simplify * style * Update modeling_utils.py * Update modeling_utils.py * Update modeling_utils.py * Update modeling_utils.py * Update modeling_utils.py * Update modeling_utils.py * fix dtype issue * Update modeling_utils.py * style * remove test that does not make sense * style * small fixes * style * fix * cleanup after rebase * style * typo * escape * tp for task specific top modules * Update modeling_utils.py * Update modeling_utils.py * fix allocation * CIs * CIs * CIs * improve docstring * CIs * Update modeling_utils.py * fix	2025-03-12 13:39:25 +01:00
Ilyas Moutawwakil	89f6956015	HPU support (#36424 ) * test * fix * fix * skip some and run some first * test fsdp * fix * patches for generate * test distributed * copy * don't test distributed loss for hpu * require fp16 and run first * changes from marc's PR fixing zero3 * better alternative * return True when fp16 support on gaudi without creating bridge * fix * fix tested dtype in deepspeed inference test * test * fix * test * fix * skip * require fp16 * run first fsdp * Apply suggestions from code review * address comments * address comments and refactor test * reduce precison * avoid doing gaudi1 specific stuff in the genreation loop * document test_gradient_accumulation_loss_alignment_with_model_loss test a bit more	2025-03-12 09:08:12 +01:00
Ryan Mullins	50d3530aa0	Gemma3 (#36658 ) * Fix converter * [Broken] Adds Gemma 3 to Hugging Face Transformers * Consolidating Config and Processor params across impls * Sorting out configuration parameters. Adds qk_norm before RoPE. Still not sure if RoPE is right. * Additional plumbing for CausalLM and ConditionalGeneration variants * incomplete draft of Orbax conversion script * More complete checkpoint conversion * Supporting Gemma 3 1B checkpoints * Updating RoPE for multiple frequencies * Adjustments to rotary embedder * Proof of life for text-only operation * Updating the conversion script to handle multimodal projection weights * Fixing tet-only conversions * Cleaner conversion script with multimodal support and a simpler processor * Additional refatcors to the Gemma3Processor * Simplified Processor to work over text representations * Updated conversion script to join text and vision embeddings at converion time * Logging for debugging * Update src/transformers/models/gemma2/modeling_gemma2.py Co-authored-by: Joshua Lochner <admin@xenova.com> * Removed extraneous Config params * Switching to fast tokenizer for checkpoint conversions * isolating siglip for performance tetsing * Minor changes for debugging tests against baselines * Adding average pooling for soft tokens * Updating processor code to enable simpler embedding interleaving for arbitrary number of images in prompts * Updating conversion script for ShieldGemma 2 conversion compatibility * Allow disable_compile to be provided as a kwarg * Refresh from modular * Updated conversion script and corrected sliding window * Fix type mismatch in cache_position (#4) * Fix dtype (#5) * Fix type mismatch in cache_position * Actually fix in the modular file Co-authored-by: Aritra Roy Gosthipaty <aritra.born2fly@gmail.com> --------- Co-authored-by: Aritra Roy Gosthipaty <aritra.born2fly@gmail.com> * fixes for embedding table overflow and missing image_soft_token_mask from Gemma3Processor * Adding 2D pooling for image embeddings * Revert "Adding 2D pooling for image embeddings" This reverts commit `65350cf531`. * Gemma3 average pooling changed from 1D to 2D * Major refactor to Gemma3MultimodalInputProjection * Updating Gemm 3 Auto* registrations * Add option to save Gemma 3 chat template with tokenizer during weights conversion * Removing unused imports * Moving out-of-vocab handling from Gemma3Processor to Gemma3ForConditionalGeneration * Removing duplicate config property * Removing final logit softcapping and 1-indexing of position ids * Fixing image processor config and none --> None typo * Fixing sliding window size for 1B * Updating image_mean and image_std in Image Processor * Attention masking changed to lower triangular * Moving image special tokens to conversion script * Mirror image processor defaults from conversion script into Gemma3ProcessorKwargs * Remove special token variables from symbol space * Moving image soft token mask computation from Gemma3Processor to Gemma3ForConditionalGeneration * tie lm_head and embedding weights Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com> * Correct tied weights in Gemma3CausalLM * iterative bidirectional attention * resolving merge conflicts * Reverting to Gemma 2 HybridCache with sldiing window support and a sliding_window_pattern of 6 * Correcting RoPE scaling * clean up first pass, dummy model geenration works * final clean up before fixing tests * causal lm test works, so fine * Fix conversion * Update src/transformers/models/gemma3/processing_gemma3.py * model tests are happy * processor tests are happy * image processing tests added * fixup * Fix pre-processing in conversion * Inputs merging * Do not normalize vision embeddings * Apply Ryan's (and team) changes to attention * token type ids + mask * template * move embed scale, add rope scale, fix tests * Add chat template to tokenizer * Use prefix for causal model loading * use existing code for sliding mask from gemma2 * self.embed_tokens already normalizes * Correcting Gemma3TextConfig parameters in conversion script * typo, modular overwrites my fixes * enable device map for text model * Conversion updates * ultra nit: no einsums * update image token * copy deepcopy config + some docs * add some test, still WIP * Refactoring --include_chat_tempalte logic in converter * Update src/transformers/models/gemma3/modular_gemma3.py Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com> * Add eos tokens for instruct models * dump so i can work on dgx * Removing add_bos by default * dump * add fast im proc * docs for PaS + fixup * another fixup * one more fixup * fix tests * Inverting prior BOS change * ultra nit * Reverting to Tokenizer saved with add_bos_token=True and chat template starting with BOS * resize embeds, remove sqrt, add slow test outputs * FA2 but quality is meh * nit * skip FA2, no idea what happened * last bit for green CI * please, green CI for docs * T_T * Fix for Gemma3 logits * Support both options for system prompt * Update src/transformers/models/gemma3/image_processing_gemma3_fast.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update docs/source/en/model_doc/gemma3.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update docs/source/en/model_doc/gemma3.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update docs/source/en/model_doc/gemma3.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update docs/source/en/model_doc/gemma3.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update docs/source/en/model_doc/gemma3.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Docs updates now that assets are live * Style fixes --------- Co-authored-by: Joshua Lochner <admin@xenova.com> Co-authored-by: Pedro Cuenca <pedro@huggingface.co> Co-authored-by: Aritra Roy Gosthipaty <aritra.born2fly@gmail.com> Co-authored-by: Mayank Chaturvedi <imayank@google.com> Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com> Co-authored-by: raushan <raushan@huggingface.co> Co-authored-by: Raushan Turganbay <raushan.turganbay@alumni.nu.edu.kz> Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com> Co-authored-by: Lysandre <hi@lysand.re>	2025-03-12 09:06:17 +01:00
Travis Johnson	d8663cb8c5	Fix bugs in mllama image processing (#36156 ) * fix: handle input_channel_dim == channels_last Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com> * fix: default PIL images to channels_last Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com> * Apply suggestions from code review Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * fixup from review batch Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com> * test: add 1x1 PIL image to ambiguous channel test Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com> * fix(mllama): avoid 0 dimension for image with impractical aspect ratio Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com> --------- Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com> Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2025-03-11 10:22:48 +01:00
Arthur	1c4b62b219	Refactor some core stuff (#36539 ) * some config changes * update * current state * update * update * updates and cleanup * something that works * fixup * fixes * nits * nit * nits and fix * Update src/transformers/integrations/tensor_parallel.py Co-authored-by: Lysandre Debut <hi@lysand.re> * Update src/transformers/integrations/tensor_parallel.py Co-authored-by: Lysandre Debut <hi@lysand.re> * cleanup * style * safe import * fix * updates * rename stuff an clean * style * small updates * ups * oups * nit * protect imports * update tp * rodfl * arf * turbo nit on init * fix import error * frumble gumbgle * try to fix the import error * should fix the non model test * update keep in float32 * update * fix * nits * fix subvconfigs * test was weird * nit * fix failing test * fix instruct blip * fixes * style * x.com * fix overwrite * ok last bit of failing test --------- Co-authored-by: Lysandre Debut <hi@lysand.re>	2025-03-11 09:26:28 +01:00
Joao Gante	858545047c	[`HybridCache`] disable automatic compilation (#36620 )	2025-03-10 09:24:26 +00:00
gautham	a1cf9f3390	Fixed datatype related issues in `DataCollatorForLanguageModeling` (#36457 ) Fixed 2 issues regarding `tests/trainer/test_data_collator.py::TFDataCollatorIntegrationTest::test_all_mask_replacement`: 1. I got the error `RuntimeError: "bernoulli_tensor_cpu_p_" not implemented for 'Long'`. This is because the `mask_replacement_prob=1` and `torch.bernoulli` doesn't accept this type (which would be a `torch.long` dtype instead. I fixed this by manually casting the probability arguments in the `__post_init__` function of `DataCollatorForLanguageModeling`. 2. I also got the error `tensorflow.python.framework.errors_impl.InvalidArgumentError: cannot compute Equal as input #1(zero-based) was expected to be a int64 tensor but is a int32 tensor [Op:Equal]` due to the line `tf.reduce_all((batch["input_ids"] == inputs) \| (batch["input_ids"] == tokenizer.mask_token_id))` in `test_data_collator.py`. This occurs because the type of the `inputs` variable is `tf.int32`. Solved this by manually casting it to `tf.int64` in the test, as the expected return type of `batch["input_ids"]` is `tf.int64`.	2025-03-07 14:09:27 +00:00
Joao Gante	5275ef6f3d	[XGLM] tag tests as slow (#36592 ) these tests should be slow	2025-03-06 17:54:41 +00:00
co63oc	996f512d52	Fix typos in tests (#36547 ) Signed-off-by: co63oc <co63oc@users.noreply.github.com>	2025-03-05 15:04:06 -08:00
ivarflakstad	c0c5acff07	Fix bamba tests amd (#36535 )	2025-03-04 15:24:27 +01:00
Arthur	84f0186e89	Add aya (#36521 ) * initial commit * small fix * move stuff to image processing file * remove stuff in validate turn and fix return tensor * remove liquid stuff * in the process of addressing comments * changes to get the right tokenization * new __init__ works * fixing defulat std and mean * works * small testing scipt -- to be deleted before merge * remove redundant code * addressing comments * fix inits, add docs templates * refactor processor, switch to gotocr image processor * remove image proc from init * refactor to working llava-style architecture * Change AyaVisionModel to AyaVisionForConditionalGeneration * add tests * fixups * update doc * Adding logits_to_keep explicitly in ayavision forward to enable compatibility with cohere model * better variable names + remove code paths * Updates to aya_vision.md * address comments * adding copied from * make style and remove unused projector_hidden_act from config * sort init * include usage of fast image proc and proc on cuda in doc * update checkpoint iin test processor * update checkpoint in test processor 2 * remove test_model and update docstring * skip failing tests --------- Co-authored-by: Saurabh Dash <saurabh@cohere.com> Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>	2025-03-04 12:24:33 +01:00
Steven Liu	c0f8d055ce	[docs] Redesign (#31757 ) * toctree * not-doctested.txt * collapse sections * feedback * update * rewrite get started sections * fixes * fix * loading models * fix * customize models * share * fix link * contribute part 1 * contribute pt 2 * fix toctree * tokenization pt 1 * Add new model (#32615) * v1 - working version * fix * fix * fix * fix * rename to correct name * fix title * fixup * rename files * fix * add copied from on tests * rename to `FalconMamba` everywhere and fix bugs * fix quantization + accelerate * fix copies * add `torch.compile` support * fix tests * fix tests and add slow tests * copies on config * merge the latest changes * fix tests * add few lines about instruct * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix * fix tests --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * "to be not" -> "not to be" (#32636) * "to be not" -> "not to be" * Update sam.md * Update trainer.py * Update modeling_utils.py * Update test_modeling_utils.py * Update test_modeling_utils.py * fix hfoption tag * tokenization pt. 2 * image processor * fix toctree * backbones * feature extractor * fix file name * processor * update not-doctested * update * make style * fix toctree * revision * make fixup * fix toctree * fix * make style * fix hfoption tag * pipeline * pipeline gradio * pipeline web server * add pipeline * fix toctree * not-doctested * prompting * llm optims * fix toctree * fixes * cache * text generation * fix * chat pipeline * chat stuff * xla * torch.compile * cpu inference * toctree * gpu inference * agents and tools * gguf/tiktoken * finetune * toctree * trainer * trainer pt 2 * optims * optimizers * accelerate * parallelism * fsdp * update * distributed cpu * hardware training * gpu training * gpu training 2 * peft * distrib debug * deepspeed 1 * deepspeed 2 * chat toctree * quant pt 1 * quant pt 2 * fix toctree * fix * fix * quant pt 3 * quant pt 4 * serialization * torchscript * scripts * tpu * review * model addition timeline * modular * more reviews * reviews * fix toctree * reviews reviews * continue reviews * more reviews * modular transformers * more review * zamba2 * fix * all frameworks * pytorch * supported model frameworks * flashattention * rm check_table * not-doctested.txt * rm check_support_list.py * feedback * updates/feedback * review * feedback * fix * update * feedback * updates * update --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2025-03-03 10:33:46 -08:00
Matt	1975be4d97	Fix edge case for continue_final_message (#36404 ) * Fix edge case for continue_final_message * lstrip() correctly * Add regression test * Add a clearer error message when the final message is not present * Add a clearer error message when the final message is not present * Fix massive bug!	2025-03-03 18:03:03 +00:00
Matt	2aff938992	Fix pipeline+peft interaction (#36480 ) * Fix pipeline-peft interaction * once again you have committed a debug breakpoint * Remove extra testing line * Add a test to check adapter loading * Correct adapter path * make fixup * Remove unnecessary check * Make check a little more stringent	2025-03-03 18:01:43 +00:00
Zach Mueller	4d8259d245	Fix loading zero3 weights (#36455 ) * Check if fixes * Fix zero3 loading * Quality * Fix marc nit * Add fast tests * Migrate to integrations.deepspeed rather than modeling_utils * Style	2025-03-03 15:05:58 +01:00
Yoni Gozlan	2c5d038f92	Add Got-OCR 2 Fast image processor and refactor slow one (#36185 ) * refactor image processor slow got ocr * add working image processor fast * fix fast image processor, update doc * use one big loop for processing patches	2025-03-01 00:56:00 -05:00
Eduardo Pacheco	222505c7e4	[GroundingDino] Fix grounding dino loss 🚨 (#31828 ) * Starting to fix GroundingDinoLoss and GroundingDinoHungarianMatcher * More updates * More updates * fixed: GroundingDinoLoss * fixed: failing tests * Update src/transformers/models/grounding_dino/modeling_grounding_dino.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/grounding_dino/test_modeling_grounding_dino.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/grounding_dino/modeling_grounding_dino.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/grounding_dino/modeling_grounding_dino.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/grounding_dino/modeling_grounding_dino.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/grounding_dino/modeling_grounding_dino.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/grounding_dino/modeling_grounding_dino.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/grounding_dino/modeling_grounding_dino.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/grounding_dino/modeling_grounding_dino.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/grounding_dino/modeling_grounding_dino.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Addressed comments * Update src/transformers/models/grounding_dino/modeling_grounding_dino.py Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com> * add: cardinality loss and make box loss as copy from * change: default for reduction loss is sum * fix: vectorized generate fake box * fix copies * Addressed comments * addressed comments * addressed one-hot * Update tests/models/grounding_dino/test_modeling_grounding_dino.py Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com> * Addressed comments * fixed test * Update src/transformers/models/grounding_dino/modeling_grounding_dino.py * Update tests/models/grounding_dino/test_modeling_grounding_dino.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Starting to fix GroundingDinoLoss and GroundingDinoHungarianMatcher * More updates * More updates * fixed: GroundingDinoLoss * add: cardinality loss and make box loss as copy from * fix copies * Revert "Update tests/models/grounding_dino/test_modeling_grounding_dino.py" This reverts commit aa74c4c57c430e54cc74c414d6269edb65c73e83. * [run-slow] groundigdino * remove nestedtensor * [run-slow] groundig_dino * [run-slow] grounding_dino * [run-slow] grounding_dino * [run-slow] grounding_dino * check * check * add: enconder intermediate outputs to ImageLoss forward * add: GroundingDinoForObjectDetectionLoss in the loss directory * make style * fix the loss function * remove class_reduction since it sum is default * remove class_reduction * Update src/transformers/loss/loss_grounding_dino.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * simple fix * Update src/transformers/loss/loss_grounding_dino.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * minor fix * Update src/transformers/loss/loss_for_object_detection.py --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com> Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> Co-authored-by: sangbumchoi <danielsejong55@gmail.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-02-27 19:15:58 +00:00
Yih-Dar	482d17be60	Fix `hub_retry` (#36449 ) * cry * trigger --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-02-27 14:38:25 +01:00
Joao Gante	8aed019764	[generate] `torch.distributed`-compatible `DynamicCache` (#36373 ) * test * docstring * prepare distributed cache data * fix cat dim * test mvp * add test checks * like this? * working test and solution * nit * nit * add shape info	2025-02-27 11:48:57 +00:00
Mohamed Mekkouri	a7fbab33ae	Fix Expected output for compressed-tensors tests (#36425 ) fix	2025-02-26 21:17:24 +01:00
Arthur	1603018e7a	Update form pretrained to make TP a first class citizen (#36335 ) * clean code * oups * fix merge * yups * fix if * now you can play * fix shape issue * try non blocking * fix * updates * up * updates * fix most of thetests * update * update * small updates * up * fix the remaining bug? * update * rename when you read from the file * buffer issues * current status * cleanup * properly allocate dumb memory * update a small bug * fix colwise rep issue * fix keep in float 32 that was keeping everything in float 32 * typo * more fixes with keep_in_fp32_modules as we use to serach on it * fix ROPE dtype for TP * remove what's breaking the tests * updates * update and fixes * small cleanup after merging * allocate 2x to be safe * style, auto * update * yup nit * fix * remove slow as fuck torch api :( * work * fixup * update * brting the fix back * fix and update * fixes Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * updates because some suggestions were wrong 👀 * update? * fuck this bloated function * typo * fix the dumb prefix thing once and forall * fixes here and there * updates * remove prints * fix strict cases * styel * properly fix keys on load! * update * fix base model prefix issue * style * update * fix all? * remoce 1 print * fix the final etsts * fixup * last nits * fix the detach issue which cause a 2x slowdown * fixup * small fixes * ultra nit * fix * fix --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-02-26 20:12:38 +01:00
Nadav Timor	d18d9c3205	Universal Speculative Decoding `CandidateGenerator` (#35029 ) * move `TestAssistedCandidateGeneratorDifferentTokenizers` into a new testing file * refactor * NOTHING. add space to rerun github actions tests * remove it... * `UniversalSpeculativeDecodingGenerator` * Use `UniversalSpeculativeDecodingGenerator` when `generation_config.do_sample=True` * assistant tokenizes only the target's new suffix * formatting * fix code * fix code * formatting * add `TestGenerateWithDifferentModels` * `TestGenerateWithDifferentModels` parameterize on `do_sample` * `AssistantVocabMapping` & `AssistantVocabMappingCache` * formatting * `AssistantToTargetTranslator`: `get_target_input_ids` & `get_target_logits` * improve `_get_assistant_to_target_input_ids` & formatting * renaming * WIP: debugging `min_new_tokens` * fix get_target_ids * `UniversalSpeculativeDecodingGenerator` * assistant tokenizes only the target's new suffix * formatting * fix code * fix code * formatting * `TestGenerateWithDifferentModels` parameterize on `do_sample` * `AssistantVocabMapping` & `AssistantVocabMappingCache` * formatting * `AssistantToTargetTranslator`: `get_target_input_ids` & `get_target_logits` * improve `_get_assistant_to_target_input_ids` & formatting * renaming * WIP: debugging `min_new_tokens` * fix get_target_ids * fix device issue * fix get_assistant_input_ids * add `TestAssistedCandidateGeneratorDifferentTokenizers` * formatting * `AssistantVocabTranslatorCache` refactor & tests * revert changes in `src/transformers/generation/logits_process.py` * refactor `AssistedCandidateGenerator` * refactor `AssistedCandidateGeneratorDifferentTokenizers` * formatting * refactor `UniversalSpeculativeDecodingGenerator` * fix negative value for max_new_tokens * fix generation length target + attention_mask vs. assistant + attent * fix device * fix negative max_new_tokens bug * fix UAG * minor * formatting * `AssistedCandidateGeneratorDifferentTokenizers` `lookbehind`s init * resolve conflict & formatting * rerun CI tests * remove space... * remove old code * fix candidate_input_ids device * minor * formatting * Fix prepare + apply (#7) * fix prepare + apply * move to cpu * simplity suppress_tokens * fix bugs and refacatoring * device move * handle self.config.vocab_size > len(target_tokenizer.get_vocab()) * no need to normalize in candidate_generator * address Nadav's comments + minor * optimize device move + SuppressTokensLogitsProcessor * AssistantToTargetTranslator, SuppressTokensLogitsProcessor and tokenizers mapping improvements * padding size * padding improvement * fix and simplify get_target_logits * renaming in get_target_logits * minor * add filter_value and suppress_tokens_id * style + rename * remove TODO * restore original SelectTokensLogitsProcessor with modification * fix style * fix _update_past_and_masks and optimize code * remove assistant_vocab_size arg * fix attention_mask * call _prepare_attention_mask also if not has_past_key_values * handling attention mask for first generation * comment * restore test * remove SelectTokensLogitsProcessor * _update_past_and_masks implementation for USD * Add unittests for Universal Assisted generation * fix style * update tests * Remove unused import and fix `test_speculation_depth` test * exclude special and reserved tokens from tokenizer for UAG * mv `test_universal_assisted_generation.py` to `generation/test_candidate_generator.py` * Remove unused imports and fix style using `make style` (#9) * formatting * Swap gated `meta-llama/llama-3.2` with `allenai/llama` (#10) * Fix space sign disagreement (#12) * default values for AssistantToTargetTranslator fileds * fix space sign * minor * fix test + style * Default values for some fields of assistant to target translator (#11) * default values for AssistantToTargetTranslator fileds * fix * add support to empty logit_processors * Update candidate_generator.py (#15) fix typo * BUG fix in _prepare_assistant_input_ids (#14) * fix _prepare_assistant_input_ids * target_to_assistant_input_ids * Update src/transformers/generation/candidate_generator.py Co-authored-by: Nadav Timor <nadav.timor@weizmann.ac.il> --------- Co-authored-by: Nadav Timor <nadav.timor@weizmann.ac.il> * typo (`target_to_assistant_input_ids`) * formatting * merge upstream/main * Fix minor review comments (#16) * Fix: `token_ids.to(torch.int64)` (#18) * tok ids to `torch.int64` (reference: https://huggingface.co/docs/transformers.js/en/api/tokenizers) * `LongTensor` * fix dtype * `assistant_input_ids.to(dtype=torch.long)` * Remove unused import from test_candidate_generator.py * Remove unused import from test_candidate_generator.py * Remove `numpy` import * resolve pr comments (#19) * `AssistantToTargetTranslator` docstring * (per gante's comment) `filter_value` and `suppress_tokens_id` to class constants * update `AssistantToTargetTranslator` docstring * (gante's comment) replace `match-case` * formatting * Fix Joao's comments (#21) * remove threading * fix logits_processor * fix test device * fix style (#23) * Move atm (#24) * move AssistantToTargetTranslator * fixup * fix logit_processor * add atm_translator test * refactor test * remove threading from test * add require_torch in tests * move AssistantVocabTranslatorCache + add tests * ruff fix --------- Co-authored-by: jmamou <jonathan.mamou@intel.com> Co-authored-by: Gaurav <gauravj@d-matrix.ai> Co-authored-by: Gaurav Jain <gaurjain14@gmail.com> Co-authored-by: gauravjain14 <41287729+gauravjain14@users.noreply.github.com>	2025-02-26 16:14:02 +00:00
Manny Cortes	082834dd79	fix: prevent model access error during Optuna hyperparameter tuning (#36395 ) * fix: prevent model access error during Optuna hyperparameter tuning The `transformers.integrations.integration_utils.run_hp_search_optuna` function releases model memory and sets trainer.model to None after each trial. This causes an AttributeError when subsequent Trainer.train calls attempt to access the model before reinitialization. This is only an issue when `fp16_full_eval` or `bf16_full_eval` flags are enabled. * Update src/transformers/trainer.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-02-26 17:06:48 +01:00
Zach Mueller	41925e4213	Add retry hf hub decorator (#35213 ) * Add retry torch decorator * New approach * Empty commit * Empty commit * Style * Use logger.error * Add a test * Update src/transformers/testing_utils.py Co-authored-by: Lucain <lucainp@gmail.com> * Fix err * Update tests/utils/test_modeling_utils.py --------- Co-authored-by: Lucain <lucainp@gmail.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-02-25 20:53:11 +01:00
Chulhwa (Evan) Han	9ebfda3263	Fixed VitDet for non-squre Images (#35969 ) * size tuple * delete original input_size * use zip * process the other case * Update src/transformers/models/vitdet/modeling_vitdet.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * [VITDET] Test non-square image * [Fix] Make Quality * make fix style * Update src/transformers/models/vitdet/modeling_vitdet.py --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2025-02-25 19:31:24 +00:00
Pavel Iakubovskii	fb83befb14	Fix pytorch integration tests for SAM (#36397 ) Fix device in tests	2025-02-25 14:53:34 +00:00
jiqing-feng	7c8916ddb5	fix audio classification pipeline fp16 test on cuda (#36359 ) * fix audio classification pipeline fp16 test on cuda Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix format Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * add comments Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * Update tests/pipelines/test_pipelines_audio_classification.py --------- Signed-off-by: jiqing-feng <jiqing.feng@intel.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-02-25 15:01:25 +01:00
Fanli Lin	c3700b0eee	[tests] enable autoawq tests on XPU (#36327 ) add autoawq Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-02-25 13:38:09 +01:00
Dmitry Rogozhkin	b4b9da6d9b	tests: revert change of torch_require_multi_gpu to be device agnostic (#35721 ) * tests: revert change of torch_require_multi_gpu to be device agnostic The `11c27dd33` modified `torch_require_multi_gpu()` to be device agnostic instead of being CUDA specific. This broke some tests which are rightfully CUDA specific, such as: * `tests/trainer/test_trainer_distributed.py::TestTrainerDistributed` In the current Transformers tests architecture `require_torch_multi_accelerator()` should be used to mark multi-GPU tests agnostic to device. This change addresses the issue introduced by `11c27dd33` and reverts modification of `torch_require_multi_gpu()`. Fixes: `11c27dd33` ("Enable BNB multi-backend support (#31098)") Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> * fix bug: modification of frozen set --------- Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> Co-authored-by: Titus von Koeller <9048635+Titus-von-Koeller@users.noreply.github.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-02-25 13:36:10 +01:00
MAHIR DAIYAN	d80d52b007	addressing the issue #34611 to make FlaxDinov2 compatible with any batch size (#35138 ) fixed the batch_size error, all tests are passing Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2025-02-25 10:44:44 +00:00
jiqing-feng	9d6abf9778	enable torchao quantization on CPU (#36146 ) * enable torchao quantization on CPU Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix int4 Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix format Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * enable CPU torchao tests Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix cuda tests Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix cpu tests Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * update tests Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix style Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix cuda tests Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix torchao available Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix torchao available Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix torchao config cannot convert to json * fix docs Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * rm to_dict to rebase Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * limited torchao version for CPU Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix format Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix skip Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix format Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * Update src/transformers/testing_utils.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * fix cpu test Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix format Signed-off-by: jiqing-feng <jiqing.feng@intel.com> --------- Signed-off-by: jiqing-feng <jiqing.feng@intel.com> Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-02-25 11:06:52 +01:00
Jerry Zhang	2af272c101	Add autoquant support for torchao quantizer (#35503 ) * Add autoquant support for torchao quantizer Summary: att, also verified that autoquantized model can be saved and loaded: save: https://gist.github.com/jerryzh168/01d367aaf44dbbbfd4068a4a10a00061 load: https://gist.github.com/jerryzh168/d5c6c401b2abdf18e0b6771341f1525c Test Plan: tested locally with above script model uploaded to https://huggingface.co/jerryzh168/llama3-8b-autoquant Reviewers: Subscribers: Tasks: Tags: * add test * ruff fix * ruff reformat * add docs and min_sqnr support * format * format * fix test * update doc * format * remove disable_compile * format	2025-02-24 15:54:16 +01:00
Rahul Tuli	884a8ea1f0	Improve model loading for compressed tensor models (#36152 ) * Disable warnings for stacked compressors * Introduce two new hooks in HfQuantizer lifecycle to allow updates to missing and unexpected keys * Update missing and unexpected keys for stacked compressors * Add tests * Fix: run_compressed cases * Fix: uncompressed cases * Rename compressed_tensor folder to compressed_tensors Move RunCompressedTest to the same file Update tests to unittest	2025-02-24 13:47:21 +01:00
Fanli Lin	4dbf17c17f	[tests] enable bnb tests on xpu (#36233 ) * fix failed test * fix device * fix more device cases * add more cases * fix empty cache * Update test_4bit.py --------- Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-02-24 11:30:15 +01:00
CalOmnie	547911e727	Uses Collection in transformers.image_transforms.normalize (#36301 ) * Uses Collection instead of Sequence in transformers.image_transforms.normalize * Uses collections.abc.Collection in lieu of deprecated typing one	2025-02-21 18:38:41 +01:00
Fanli Lin	7c5bd24ffa	[tests] make quanto tests device-agnostic (#36328 ) * make device-agnostic * name change	2025-02-21 14:20:40 +01:00
Joao Gante	678885bbbd	[CI] Check test if the `GenerationTesterMixin` inheritance is correct 🐛 🔫 (#36180 )	2025-02-21 10:18:20 +00:00
Pavel Iakubovskii	a957b7911a	Add SigLIP 2 (#36323 ) * Docs * Inits * Auto classes * Add siglip base * Add base tests * Fix Siglip V1 for fix res version * Add image processor * Update conversion * Experimenting with vectorized embeddings * Fixup * Add modular Siglip2Processor * Add modular configuration * Rename num patches * Correct image and text features merging * Working conversion script * Refactoring conversion script * Remove unused code in conversion script * Shorten dict a bit * Refactoring conversion * Done conversion refactoring * Fixup * Modular siglip2 * Make model exportable and compilable without graph breaks * Remove position_ids from image_processor * REmove position ids from modeling file * Update modular * Type hint * Fixup * Set defaults to processor * Add integration test * Revert spatial shapes back to tensor * Change order * Fix most of the tests * Fix docstring * Remove interpolate_pos_encoding arg (not needed) * Update docs * Standardize processing * Fix attention_mask in vision head * Siglip v1: remove double transpose in FA2 * Update modular file * Update FA2 test * Update expected logits * Fix interpolation for siglip2 image processor * Skip init test * Skip dispatch on flash test * Fix modeling tests * Fixup * Add dummy objects * Fix some docstrings * Add siglip2 in index.md * Fix consistency * Add docs * Remove size and data format * Add image processor tests * Fix * Add fast image processor * Fix style * Fix * Docs * Set lowercase for tokenizer * Adjust head size for Siglip v1 * Update siglip2 for consistency with siglip1 * Update siglip2 conversion * Update pipeline * Update checkpoints in tests * Update checkpoint name * Fix pooling for image classification model * Fix FA2 test * Update processor * Fix check repo * Update docs * Fix typos * Fix docstring for fast image processor * Add siglip2 to FA2 docs * Fix fast ip tests * Fix constitency * Fix tokenizer class for siglip v1 * Fix missing header * Refactor scaling for clip, siglip, siglip2 * Remove unused imports * Make fast IP default for siglip2 * Update docs * Update checkpoints * Update modular * Update paper link * Fixup * Fix name in toctree * Fix test	2025-02-21 09:04:19 +00:00
Nosimus	effaef334b	fix: prevent second save in the end of training if last step was saved already (#36219 ) * fix: prevent second save in the end of training * fix: prevent second save in the end of training * test: added test for no duplicate save on epoch save strategy * fix: removed TrainerControl * chore: style formatting --------- Co-authored-by: JaktensTid <jaktenstid1@gmail.com>	2025-02-20 17:38:52 +01:00
Orr Zohar	4397dfcb71	SmolVLM2 (#36126 ) * smolvlm init * updates * fixing bugs * minimal run, no checks * minimal run, no checks * passing first check + adding url support * updating video dataloading logic * fixing image logic * trying modular, but fails * modular is working, changing processor to match PR comments and general transformers logic * fixing kwargs * offloading video loading logic to image_util * fixing circleci code formatting errors * fixing circleci code formatting errors * fixing circleci code formatting errors * fixing circleci code formatting errors * fixing circleci code formatting errors * fixing circleci code formatting errors * fixing circleci code formatting errors * fixing circleci code formatting errors * fixing circleci code formatting errors * fixing circleci code formatting errors * fixing circleci code formatting errors * fixing circleci code formatting errors * fixing circleci code formatting errors * fixing circleci code formatting errors * update * add idefics3-based tests * add keyword to all * add PreTrainedModel * updateing video loading logic * working inference * updates for PR comments * updates for PR comments * moving SmolVLMPretrainedModel higher to fix import error * CI test pass * CI test pass * removing lambda * CI test pass * CI test pass * CI test pass * CI test pass * CI test pass * CI test pass * processor tests * add example in docs * typo * fix copies * skip compile tests - sdpa for VisionTransformer * fix init * raise import error for num2words * update doc for FA2 * more doc fix * CI * updates for PR comments * Update docs/source/en/model_doc/smolvlm.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update docs/source/en/model_doc/smolvlm.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update docs/source/en/model_doc/smolvlm.md Co-authored-by: Joshua Lochner <admin@xenova.com> * Update docs/source/en/model_doc/smolvlm.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update docs/source/en/model_doc/smolvlm.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * fixing processor -- tokenizer not defined properly, (gpt2 tokenizer), and does not have the attributes of fake image token, etc * adding smolvlm to VQA models * removing vqa auto class * Update src/transformers/models/smolvlm/processing_smolvlm.py Co-authored-by: Joshua Lochner <admin@xenova.com> * removing smolvlmvisiontransformer from index.md * my bad, video processing had typos * fixing docs * renaming params in SmolVLMModel.inputs_merger * removing un-needed dtype/device in model forward * ruff for CI * update docs * Update docs/source/en/model_doc/smolvlm.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * return cache position * return cache position * return cache also in modular * needed to run modular again * fix training tests * push vectorized inputs merger * format * format * reduce number of mappings * addressing PR comments * happy CI, happy me :) * skip non-nested images * adjust integration test for smaller GPUs * format * fix kwargs in chat template apply * skip this for now --------- Co-authored-by: raushan <raushan@huggingface.co> Co-authored-by: Pablo <pablo.montalvo.leroux@gmail.com> Co-authored-by: Pedro Cuenca <pedro@huggingface.co> Co-authored-by: Joshua Lochner <admin@xenova.com>	2025-02-20 15:00:26 +01:00
Yih-Dar	e8531a0e33	Fix broken CI on release branch due to missing conversion files (#36275 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-02-20 13:22:10 +01:00
Ilyas Moutawwakil	5e2183f344	Make cache traceable (#35873 ) simply make cache traceable	2025-02-20 09:59:25 +01:00
Joao Gante	e3d99ec2f5	[tests] make `test_from_pretrained_low_cpu_mem_usage_equal` less flaky (#36255 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-02-19 15:14:02 +00:00
Joao Gante	99adc74462	[tests] remove flax-pt equivalence and cross tests (#36283 )	2025-02-19 15:13:27 +00:00
Joao Gante	fa8cdccd91	[tests] deflake dither test (#36284 )	2025-02-19 15:13:10 +00:00
Cyril Vallez	60226c6ff3	TP initialization module-by-module (#35996 ) * module-by-module loading! * Update modeling_utils.py * dtyle and comments * Update modeling_utils.py * Update modeling_utils.py * Update test * Update modeling_utils.py * Update modeling_utils.py * Update test_tp.py * Update test_tp.py * Update modeling_utils.py * re-trigger CIs * re-trigger CIs	2025-02-19 14:04:57 +01:00
Joao Gante	0863eef248	[tests] remove `pt_tf` equivalence tests (#36253 )	2025-02-19 11:55:11 +00:00
Karel Vesely	1a81d774b1	Add dithering to the `Speech2TextFeatureExtractor` API. (#34638 ) * Add dithering to the `Speech2TextFeatureExtractor` API. - in kaldi : `4a8b7f6732/src/feat/feature-window.cc (L145)` - with dithering without a seed, the features become non-deterministic due to small Gaussian noise added to the audio (i.e. 2 runs lead to little different outputs) * update the PR - add dithering also for WhisperFeatureExtractor - not adding to Wav2Vec2FeatureExtractor (no FBANK computation) * add unit-tests for dithering, fix docstrings * ruff * utils/check_copies.py --fix_and_overwrite * update code, add seed to unit-test * adding explanation of dithering	2025-02-19 11:50:02 +01:00
Yoni Gozlan	9f51dc2535	Add support for post-processing kwargs in image-text-to-text pipeline (#35374 ) * fix error and improve pipeline * add processing_kwargs to apply_chat_template * change default post_process kwarg to args * Fix slow tests * fix copies	2025-02-18 17:43:36 -05:00
Yoni Gozlan	9b479a245b	Uniformize LlavaNextVideoProcessor kwargs (#35613 ) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs	2025-02-18 14:13:51 -05:00
ivarflakstad	07182b2e10	GitModelIntegrationTest - flatten the expected slice tensor (#36260 ) Flatten the expected slice tensor	2025-02-18 16:04:19 +01:00
Damiano Amatruda	4d2de5f63c	Fix XGLM loss computation (PyTorch and TensorFlow) (#35878 ) * Fix XGLM loss computation (PyTorch and TensorFlow) * Update expected output string in XGLM sample test This updates the expected output string of test_xglm_sample for torch 2.0 to the correct one and removes the one for torch 1.13.1 + cu116 (transformers moved to torch 2.0 with PR #35358). * Update expected output IDs in XGLM generation test	2025-02-18 15:37:48 +01:00
Raushan Turganbay	e6cc410d5b	Remove flakiness in VLMs (#36242 ) * fix * nit * no logits processor needed * two more tests on assisted decoding	2025-02-18 11:41:07 +01:00
andrewor14	fdcfdbfd22	Fix TorchAoConfig not JSON serializable (#36206 ) Summary: TorchAoConfig optionally contains a `torchao.dtypes.Layout` object which is a dataclass and not JSON serializable, and so the following fails: ``` import json from torchao.dtypes import TensorCoreTiledLayout from transformers import TorchAoConfig config = TorchAoConfig("int4_weight_only", layout=TensorCoreTiledLayout()) config.to_json_string() json.dumps(config.to_dict()) ``` This also causes `quantized_model.save_pretrained(...)` to fail because the first step of this call is to JSON serialize the config. Fixes https://github.com/pytorch/ao/issues/1704. Test Plan: python tests/quantization/torchao_integration/test_torchao.py -k test_json_serializable Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-02-18 11:05:42 +01:00
Yih-Dar	626666c444	Au revoir flaky `test_fast_is_faster_than_slow` (#36240 ) * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-02-17 18:30:07 +01:00
Joao Gante	429f1a682d	[tests] remove `test_export_to_onnx` (#36241 )	2025-02-17 16:52:44 +00:00
Joao Gante	55493f1390	[tests] remove tf/flax tests in `/generation` (#36235 )	2025-02-17 14:59:22 +00:00
ivarflakstad	7ec35bc3bd	Add missing atol to torch.testing.assert_close where rtol is specified (#36234 )	2025-02-17 14:57:50 +01:00
Joao Gante	dad513e0c2	[generate] remove cache v4.47 deprecations (#36212 )	2025-02-17 13:55:03 +00:00
Yih-Dar	23d6095e8f	Fix `LlavaForConditionalGenerationModelTest::test_config` after #36077 (#36230 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-02-17 11:49:07 +01:00
Fanli Lin	fae0f3dde8	[tests] fix `EsmModelIntegrationTest::test_inference_bitsandbytes` (#36225 ) fix failed test	2025-02-17 11:10:33 +01:00
Yih-Dar	dd16acb8a3	set `test_torchscript = False` for Blip2 testing (#35972 ) * just skip * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-02-14 17:43:32 +01:00
Mayank Mishra	a570e2ba87	add shared experts for upcoming Granite 4.0 language models (#35894 ) * Modular GraniteMoE with shared Experts. Signed-off-by: Shawn Tan <shawntan@ibm.com> * Modified * Import order. * Modified for style * Fix space. * Test * Remove extra granitemoe file. * New converted file and tests * Modified __init__ files. * Formatting. * Dummy PT objects * register granitemoe shared model Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * fix linting of a file Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * fix import in modeling file Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * update generated modeling file Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * add documentation Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * update docstrings Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * update generated modeling file Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * fix docstrings in config class Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> * merge main Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> --------- Signed-off-by: Shawn Tan <shawntan@ibm.com> Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> Co-authored-by: Shawn Tan <shawntan@ibm.com> Co-authored-by: Shawn Tan <shawn@wtf.sg> Co-authored-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com> Co-authored-by: Sukriti Sharma <Ssukriti@users.noreply.github.com>	2025-02-14 16:55:28 +01:00
ivarflakstad	7ae7e87a09	Add @require_bitsandbytes to Aria test_batched_generation (#36192 )	2025-02-14 15:48:47 +01:00
Kyle Sayers	bcfc9d795e	[Bugfix] Fix reloading of pixtral/llava configs (#36077 ) * add is_composition flag to LlavaConfig Signed-off-by: Kyle Sayers <kylesayrs@gmail.com> * WIP: pixtral text config Signed-off-by: Kyle Sayers <kylesayrs@gmail.com> * fix style Signed-off-by: Kyle Sayers <kylesayrs@gmail.com> * add test Signed-off-by: Kyle Sayers <kylesayrs@gmail.com> * use is_composition for pixtral Signed-off-by: Kyle Sayers <kylesayrs@gmail.com> * Revert "use is_composition for pixtral" This reverts commit `a53d5f9fc5`. * Revert "Revert "use is_composition for pixtral"" This reverts commit `3ab1c99404`. --------- Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>	2025-02-14 15:27:05 +01:00
Raushan Turganbay	0c78ef6cd3	🔴 VLM: compile compatibility (#35724 ) * llavas * add mroe models * fix `compile_forward` test for all models * fix copies * make style * also doesn't support cache class * fix some tests * not copied from * ci green? * fix tests * fix copies * fix tests * check with `numel` and remove `item` * fix copies * fix copies * Update src/transformers/models/cohere2/modeling_cohere2.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * opt remove cross attn * gemma2 * fixup * fixup * fix newly added test * maybe fixed? * green please? --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2025-02-14 15:23:49 +01:00
David LaPalomento	b45cf0e90a	Guard against unset resolved_archive_file (#35628 ) * archive_file may not be specified When loading a pre-trained model from a gguf file, resolved_archive_file may not be set. Guard against that case in the safetensors availability check. * Remap partial disk offload to cpu for GGUF files GGUF files don't support disk offload so attempt to remap them to the CPU when device_map is auto. If device_map is anything else but None, raise a NotImplementedError. * Don't remap auto device_map and raise RuntimeError If device_map=auto and modules are selected for disk offload, don't attempt to map them to any other device. Raise a runtime error when a GGUF model is configured to map any modules to disk. --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-02-14 14:44:31 +01:00
Arthur	96f01a36ac	Revert qwen2 breaking changes related to attention refactor (#36162 ) * dito * add a test * upsate * test needs fa2 * update test and configuration * test requires fa2 * style	2025-02-14 13:44:14 +01:00
Mohamed Mekkouri	cb586a3999	Add require_read_token to fp8 tests (#36189 ) fix	2025-02-14 12:27:35 +01:00
Andrei Panferov	5f726f8b8e	New HIGGS quantization interfaces, JIT kernel compilation support. (#36148 ) * new flute * new higgs working * small adjustments * progress and quallity * small updates * style --------- Co-authored-by: Andrey Panferov <panferov.andrey3@wb.ru> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>	2025-02-14 12:26:45 +01:00
Raushan Turganbay	15ec971b8e	Prepare processors for VideoLLMs (#36149 ) * allow processor to preprocess conversation + video metadata * allow callable * add test * fix test * nit: fix * add metadata frames_indices * Update src/transformers/processing_utils.py Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> * Update src/transformers/processing_utils.py Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> * port updates from Orr and add one more test * Update src/transformers/processing_utils.py Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> * typo * as dataclass * style * docstring + maek sure tests green --------- Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>	2025-02-14 11:34:08 +01:00
Isotr0py	33d1d715b0	Add ImageProcessorFast to Qwen2.5-VL processor (#36164 ) * add qwen2 fast image processor to modular file Signed-off-by: isotr0py <2037008807@qq.com> * fix modular Signed-off-by: isotr0py <2037008807@qq.com> * fix circle import Signed-off-by: isotr0py <2037008807@qq.com> * add docs Signed-off-by: isotr0py <2037008807@qq.com> * fix typo Signed-off-by: isotr0py <2037008807@qq.com> * add modular generated files Signed-off-by: isotr0py <2037008807@qq.com> * revert qwen2vl fast image processor Signed-off-by: isotr0py <2037008807@qq.com> * remove qwen2.5-vl image processor from modular Signed-off-by: isotr0py <2037008807@qq.com> * re-generate qwen2.5-vl files Signed-off-by: isotr0py <2037008807@qq.com> * remove unnecessary test Signed-off-by: isotr0py <2037008807@qq.com> * fix auto map Signed-off-by: isotr0py <2037008807@qq.com> * cleanup Signed-off-by: isotr0py <2037008807@qq.com> * fix model_input_names Signed-off-by: isotr0py <2037008807@qq.com> * remove import Signed-off-by: isotr0py <2037008807@qq.com> * make fix-copies Signed-off-by: isotr0py <2037008807@qq.com> --------- Signed-off-by: isotr0py <2037008807@qq.com>	2025-02-14 17:34:55 +08:00
Raushan Turganbay	3bf02cf440	CI: fix `test-save-trainer` (#36191 ) * fix * also the docstring	2025-02-14 10:20:56 +01:00
Yoni Gozlan	336dc69d63	Uniformize OwlViT and Owlv2 processors (#35700 ) * uniformize owlvit processor * uniformize owlv2 * nit * add positional arg test owlvit * run-slow: owlvit, owlv2 * run-slow: owlvit, owlv2 * remove one letter variable	2025-02-13 17:30:26 -05:00
Yoni Gozlan	e6a7981711	Fix make_batched_videos and add tests (#36143 ) * add support for initial shift in video processing and other fixes * revert modifications video loading functions	2025-02-13 17:14:30 -05:00
Joao Gante	62c7ea0201	CI: avoid human error, automatically infer generative models (#33212 ) * tmp commit * move tests to the right class * remove ALL all_generative_model_classes = ... * skip tf roberta * skip InstructBlipForConditionalGenerationDecoderOnlyTest * videollava * reduce diff * reduce diff * remove on vlms * fix a few more * manual rebase bits * more manual rebase * remove all manual generative model class test entries * fix up to ernie * a few more removals * handle remaining cases * recurrent gemma * it's better here * make fixup * tf idefics is broken * tf bert + generate is broken * don't touch tf :() * don't touch tf :( * make fixup * better comments for test skips * revert tf changes * remove empty line removal * one more * missing one	2025-02-13 16:27:11 +01:00
Elvir Crnčević	845b0a2616	Efficient Inference Kernel for SpQR (#34976 ) * Resolve vptq conflict * Rename spqr package to spqr_quant * Get rid of aqlm mention * Start working on tests * Resolve ruff code checks * Ruff format * Isort * Test updates * Add gpu tag * Rename to modules_to_not_convert * Config update * Docs and config update * Docs and config update * Update to update_torch_dtype * spqr config parameter validation * Ruff update * Apply ruff fixes * Test fixes * Ruff update * Mark tests as @slow again; Ruff; Docstring update * Ruff * Remove absolute path * Resolve typo * Remove redundandt log * Check accelerate/spqr availability * Ruff fix * Check if the config contains proper shapes * Ruff test * Documentation update * overview update * Ruff checks * Ruff code quality * Make style * Update docs/source/en/quantization/spqr.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update spqr.md * Enable gptqmodel (#35012) * gptqmodel Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix format Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * update readme Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * gptqmodel need use checkpoint_format (#1) * gptqmodel need use checkpoint_format * fix quantize * Update quantization_config.py * Update quantization_config.py * Update quantization_config.py --------- Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai> Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai> * Revert quantizer_gptq.py (#2) * revert quantizer_gptq.py change * pass *kwargs limit gptqmodel and optimum version Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix format Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix warning Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix version check Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * revert unrelated changes Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * enable gptqmodel tests Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix requires gptq Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * Fix Transformer compat (#3) * revert quantizer_gptq.py change * pass *kwargs add meta info * cleanup * cleanup * Update quantization_config.py * hf_select_quant_linear pass checkpoint_format and meta * fix GPTQTestCUDA * Update test_gptq.py * gptqmodel.hf_select_quant_linear() now does not select ExllamaV2 * cleanup * add backend * cleanup * cleanup * no need check exllama version * Update quantization_config.py * lower checkpoint_format and backend * check none * cleanup * Update quantization_config.py * fix self.use_exllama == False * spell * fix unittest * fix unittest --------- Co-authored-by: LRL <lrl@lbx.dev> Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai> * fix format Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix format again Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * update gptqmodel version (#6) * update gptqmodel version * update gptqmodel version * fix unit test (#5) * update gptqmodel version * update gptqmodel version * "not self.use_exllama" is not equivalent to "self.use_exllama==False" * fix unittest * update gptqmodel version * backend is loading_attibutes (#7) * fix format and tests Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix memory check Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix device mismatch Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix result check Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * Update src/transformers/quantizers/quantizer_gptq.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/quantizers/quantizer_gptq.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/quantizers/quantizer_gptq.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * update tests Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * review: update docs (#10) * review: update docs (#12) * review: update docs * fix typo * update tests for gptqmodel Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * update document (#9) * update overview.md * cleanup * Update overview.md * Update overview.md * Update overview.md * update gptq.md * Update gptq.md * Update gptq.md * Update gptq.md * Update gptq.md * Update gptq.md * Update gptq.md --------- Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai> * typo * doc note for asymmetric quant * typo with apple silicon(e) * typo for marlin * column name revert: review * doc rocm support * Update docs/source/en/quantization/gptq.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/quantization/gptq.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/quantization/gptq.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/quantization/gptq.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/quantization/overview.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/quantization/overview.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Signed-off-by: jiqing-feng <jiqing.feng@intel.com> Co-authored-by: LRL-ModelCloud <165116337+LRL-ModelCloud@users.noreply.github.com> Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai> Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai> Co-authored-by: ZX-ModelCloud <165115237+ZX-ModelCloud@users.noreply.github.com> Co-authored-by: LRL <lrl@lbx.dev> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Fix : Nemotron Processor in GGUF conversion (#35708) * fixing nemotron processor * make style * Update docs/source/en/quantization/spqr.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Add missing TOC to doc --------- Signed-off-by: jiqing-feng <jiqing.feng@intel.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: jiqing-feng <jiqing.feng@intel.com> Co-authored-by: LRL-ModelCloud <165116337+LRL-ModelCloud@users.noreply.github.com> Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai> Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai> Co-authored-by: ZX-ModelCloud <165115237+ZX-ModelCloud@users.noreply.github.com> Co-authored-by: LRL <lrl@lbx.dev> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2025-02-13 16:22:58 +01:00
Joao Gante	636ee57489	[generate] revert change in Aria: the maximum cache length must match `max_length` (#36120 ) * revert inputs_embeds len * Update test_utils.py * make fixup	2025-02-13 14:36:33 +00:00
Arthur	b079dd1fa2	Fix red CI (#36174 ) test was weird	2025-02-13 14:27:55 +01:00
Joao Gante	d114a6f78e	[Modular] skip modular checks based on diff (#36130 ) skip modular checks based on diff	2025-02-13 12:53:21 +00:00
Mohamed Mekkouri	efe72fe21f	Adding FP8 Quantization to transformers (#36026 ) * first commit * adding kernels * fix create_quantized_param * fix quantization logic * end2end * fix style * fix imports * fix consistency * update * fix style * update * udpate after review * make style * update * update * fix * update * fix docstring * update * update after review * update * fix scheme * update * update * fix * update * fix docstring * add source * fix test --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-02-13 13:01:19 +01:00
Thomas Bauwens	8f137b2427	Move `DataCollatorForMultipleChoice` from the docs to the package (#34763 ) * Add implementation for DataCollatorForMultipleChoice based on docs. * Add DataCollatorForMultipleChoice to import structure. * Remove custom DataCollatorForMultipleChoice implementations from example scripts. * Remove custom implementations of DataCollatorForMultipleChoice from docs in English, Spanish, Japanese and Korean. * Refactor torch version of DataCollatorForMultipleChoice to be more easily understandable. * Apply suggested changes and run make fixup. * fix copies, style and fixup * add missing documentation * nits * fix docstring * style * nits * isort --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>	2025-02-13 12:01:28 +01:00
CL-ModelCloud	35c155052d	Fix PretrainedTokenizerFast check => Fix PretrainedTokenizerFast Save (#35835 ) * Fix the bug in tokenizer.save_pretrained when saving tokenizer_class to tokenizer_config.json * Update tokenization_utils_base.py * Update tokenization_utils_base.py * Update tokenization_utils_base.py * add tokenizer class type test * code review * code opt * fix bug * Update test_tokenization_fast.py * ruff check * make style * code opt * Update test_tokenization_fast.py --------- Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai> Co-authored-by: LRL-ModelCloud <165116337+LRL-ModelCloud@users.noreply.github.com>	2025-02-13 12:00:33 +01:00
Marco Edward Gorelli	3c912c9089	docs: fix return type annotation of `get_default_model_revision` (#35982 )	2025-02-13 11:59:15 +01:00
gewenbin0992	6a1ab634b6	qwen2.5vl: fix bugs when using flash2+bf16 or num_return_sequences>1 (#36083 ) * qwen2.5vl: fix bugs when using flash2+bf16 or num_return_sequences>1 * fix * fix * fix * fix * add tests * fix test bugs * fix * fix failed tests * fix	2025-02-13 11:35:28 +01:00
Pavel Iakubovskii	d419862889	Fix tests for vision models (#35654 ) * Trigger tests * [run-slow] beit, detr, dinov2, vit, textnet * Fix BEiT interpolate_pos_encoding * Fix DETR test * Update DINOv2 test * Fix textnet * Fix vit * Fix DPT * fix data2vec test * Fix textnet test * Update interpolation check * Fix ZoeDepth tests * Update interpolate embeddings for BEiT * Apply suggestions from code review	2025-02-13 10:28:37 +00:00
Lucain	e60ae0d078	Replace deprecated update_repo_visibility (#35970 )	2025-02-13 11:27:55 +01:00
Sambhav Dixit	950cfb0b4f	Fix PaliGemma Pad Token Masking During Training #35855 (#35859 ) * change order of unmasking of tokens * library import * class setup * test function * refactor * add commit message * test modified * explict initiliasation of weights + made model smaller * removed sepete testing file * fixup * fixup core * test attention mask with token types * tests fixup * removed PaliGemmaAttentionMaskTest class --------- Co-authored-by: sambhavnoobcoder <indosambahv@gmail.com>	2025-02-13 10:11:44 +01:00
Yih-Dar	9985d06add	skip `test_initialization` for `VitPoseBackboneModelTest` for now (#36154 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-02-12 18:24:24 +01:00
Zach Mueller	1fae54c721	Add more rigerous non-slow grad accum tests (#35668 ) * Add more rigerous non-slow grad accum tests * Further nits * Re-add space * Readbility * Use tinystories instead * Revert transformer diff * tweak threshs	2025-02-12 10:26:21 -05:00
hsilva664	281c0c8b5b	adding option to save/reload scaler (#34932 ) * Adding option to save/reload scaler * Removing duplicate variable * Adding save/reload test * Small fixes on deterministic algorithm call * Moving LLM test to another file to isolate its environment * Moving back to old file and using subprocess to run test isolated * Reverting back accidental change * Reverting back accidental change	2025-02-12 15:48:16 +01:00
kang sheng	a33ac830af	Fix multi gpu loss sync condition, add doc and test (#35743 ) * Fix multi gpu loss sync condition, add doc and test * rename function and class * loss should not scale during inference * fix typo	2025-02-12 15:41:31 +01:00
zhuHQ	08c4959a23	Optim: APOLLO optimizer integration (#36062 ) * Added APOLLO optimizer integration * fix comment * Remove redundancy: Modularize low-rank optimizer construction * Remove redundancy: Remove useless comment * Fix comment: Add typing * Fix comment: Rewrite apollo desc	2025-02-12 15:33:43 +01:00
Raushan Turganbay	8fc6ecba4f	VLM: enable skipped tests (#35746 ) * fix cached tests * fix some tests * fix pix2struct * fix	2025-02-12 12:55:46 +01:00
Sambhav Dixit	d6897b46bd	Add utility for Reload Transformers imports cache for development workflow #35508 (#35858 ) * Reload transformers fix form cache * add imports * add test fn for clearing import cache * ruff fix to core import logic * ruff fix to test file * fixup for imports * fixup for test * lru restore * test check * fix style changes * added documentation for usecase * fixing --------- Co-authored-by: sambhavnoobcoder <indosambahv@gmail.com>	2025-02-12 12:45:11 +01:00
Joao Gante	1cc7ca3295	Whisper: remove redundant assisted generation tests (#34814 ) * remove redundant test * delete another test * revert default max_length * (wrong place, moving)	2025-02-12 11:37:19 +00:00
nhamanasu	377d8e2b9c	add RAdamScheduleFree optimizer (#35313 ) * add RAdamScheduleFree optimizer * revert schedulefree version to the minimum requirement * refine is_schedulefree_available so that it can take min_version * refine documents --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2025-02-12 11:31:51 +01:00
Sambhav Dixit	1b57de8dcf	Make `output_dir` Optional in `TrainingArguments` #27866 (#35735 ) * make output_dir optional * inintaied a basic testing module to validate and verify the changes * Test output_dir default to 'tmp_trainer' when unspecified. * test existing functionality of output_dir. * test that output dir only created when needed * final check * added doc string and changed the tmp_trainer to trainer_output * amke style fixes to test file. * another round of fixup --------- Co-authored-by: sambhavnoobcoder <indosambahv@gmail.com>	2025-02-11 18:54:36 +01:00
Pablo Montalvo	3a5c328fd8	Fix CI issues (#35662 ) * make explicit gpu dep * [run-slow] bamba	2025-02-11 18:17:01 +01:00
Pavel Iakubovskii	f42d46ccb4	Add common test for `torch.export` and fix some vision models (#35124 ) * Add is_torch_greater_or_equal test decorator * Add common test for torch.export * Fix bit * Fix focalnet * Fix imagegpt * Fix seggpt * Fix swin2sr * Enable torch.export test for vision models * Enable test for video models * Remove json * Enable for hiera * Enable for ijepa * Fix detr * Fic conditional_detr * Fix maskformer * Enable test maskformer * Fix test for deformable detr * Fix custom kernels for export in rt-detr and deformable-detr * Enable test for all DPT * Remove custom test for deformable detr * Simplify test to use only kwargs for export * Add comment * Move compile_compatible_method_lru_cache to utils * Fix beit export * Fix deformable detr * Fix copies data2vec<->beit * Fix typos, update test to work with dict * Add seed to the test * Enable test for vit_mae * Fix beit tests * [run-slow] beit, bit, conditional_detr, data2vec, deformable_detr, detr, focalnet, imagegpt, maskformer, rt_detr, seggpt, swin2sr * Add vitpose test * Add textnet test * Add dinov2 with registers * Update tests/test_modeling_common.py * Switch to torch.testing.assert_close * Fix masformer * Remove save-load from test * Add dab_detr * Add depth_pro * Fix and test RT-DETRv2 * Fix dab_detr	2025-02-11 11:37:31 +00:00
Arthur	1779f5180e	Fix nighlty CIs: missing atols (#35903 ) fix osme missing atols	2025-02-11 10:49:21 +01:00
ivarflakstad	1feebb5b41	AutoformerForPrediction test add atol (#36017 )	2025-02-10 19:22:24 +01:00
Joao Gante	be2ac0916a	[generate] shape checks in tests compatible with fixed-length caches (+ some minor fixes) (#35993 ) * shape checks compatible with static cache * add test * tmp * manually turn on eager attn when we want to output attn * typo * generalize to encoder-decoder models * force compilation on cpu * tmp commit * fix static cache shape checks * models with odd caches * fix copies * shorter cache search loop * use decoder_past_key_values everywhere * better test variable names and comments * signature * rename _check_outputs into _check_generate_outputs * add comments * HybridCache future test note	2025-02-10 17:50:54 +00:00
Yih-Dar	3897f2caf8	Enable pytest live log and show warning logs on GitHub Actions CI runs (#35912 ) * fix * remove * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-02-10 13:36:20 +01:00
Jingze Shi	48a309d0d2	Support constant lr with cooldown (#35453 ) * Add support for constant learning rate with cooldown * Add support for constant learning rate with cooldown * Add support for constant learning rate with cooldown * Add support for constant learning rate with cooldown * Add support for constant learning rate with cooldown * Add support for constant learning rate with cooldown * Add support for constant learning rate with cooldown * Add more warmup and cooldown methods to 'get_wsc_schedule' * Add more warmup and cooldown methods to 'get_wsc_schedule' * Add more warmup and cooldown methods to 'get_wsc_schedule' * Add more warmup and cooldown methods to 'get_wsc_schedule' * Add more warmup and decay methods to 'get_wsd_schedule' * support num_training_steps and num_stable_steps for get_wsd_schedule * support num_training_steps and num_stable_steps for get_wsd_schedule * get wsd scheduler before the `num_training_steps` decision * fix code_quality * Update stable branch logic * fix code_quality * Move stable stage decide to `get_wsd_schedule` * Update docstring of `get_wsd_schedule` * Update `num_train_steps` to optional * Update `num_train_steps` to optional * Update docstring of `get_wsd_schedule` * Update src/transformers/optimization.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-02-10 13:21:55 +01:00
Armaghan Shakir	9a6be63fdb	Add Apple's Depth-Pro for depth estimation (#34583 ) * implement config and model building blocks * refactor model architechture * update model outputs * update init param to include use_fov_model * update param name in config * fix hidden_states and attentions outputs for fov * sort config * complete minor todos * update patching * update config for encoder * fix config * use correct defaults in config * update merge for compatibility with different image size * restructure encoder for custom configuration * make fov model compatible with custom config * replace word "decoder" with "fusion" * weight conversion script * fix fov squeeze * update conversion script (without test) * upload ruff image processing * create fast image processing * use torch interpolation for image processing * complete post_process_depth_estimation * config: fix imports and sort args * apply inference in weight conversion * use mllama script instead for weight conversion * clean weight conversion script * add depth-pro status in other files * fill docstring in config * formatting * more formatting * formatting with ruff * formatting with style * fix copied classes * add examples; update weight convert script * fix using check_table.py and isort * fix config docstring * add depth pro to sdpa docs * undo unintentional changes in configuration_gemma.py * minor fixes * test image processing * fixes and tests * more fixes * use output states from image_encoder instead * Revert "use output states from image_encoder instead" This reverts commit `2408ec54e4`. * make embeddings dynamic * reshape output hidden states and attentions as part of computation graph * fix ruff formating * fix docstring failure * use num_fov_head_layers in tests * update doc * check consistency with config * ruff formatting * update test case * fix ruff formatting * add tests for fov * use interpolation in postprocess * run and fix slow tests locally * use scaled_images_features for image and fov encoder * return fused_hidden_states in fusion stage * fix example * fix ruff * fix copyright license for all files * add __all__ for each file * minor fixes - fix download spell - add push_to_hub option - fix Optional type hinting - apply single loop for DepthProImageProcessor.preprocess * return list in post_process_depth_estimation * minor fixes - capitalize start of docstring - use ignore copy - fix examples - move docstring templates and custom output classes to top - remove "-> None" typehinting from __init__ - type hinting for forward passes - fix docstrings for custom output classes * fix "ruff check" * update upsample and projection * major changes: (image size and merge optimization) - add support for images of any size - optimize merge operation - remove image_size from config - use full names instead of B, C, H, W - remove interpolation from fusion stage - add interpolation after merge - move validations to config - update integration test - add type hints for functions * fix push_to_hub option in weights conversion * remove image_size in weights conversion * major changes in the architecture - remove all DepthProViT modules and support different backbones using the AutoModel API - set default use_fov_model to False - validate parameters in configuration - update interpolate function: use "nearest" for faster computation - update reshape_feature function: remove all special tokens, possible from different backbones - update merge function: use padding from config instead of merge_out_size - remove patch_to_batch and batch_to_patch conversions for now - calculate out_size dynamically in the encoder - leave head_mask calculation to the backbone - fix bugs with merge - add more comments - update tests * placeholder for unused config attributes * improve docs amid review * minor change in docs * further optimize merge * fix formatting * remove unused patch/batch convertion functions * use original F.interpolate * improve function naming * minor chages - use torch_int instead of int - use proper for newly initialized tensors - use user provided return_dict for patch_encoder - use if-else block instead in self.use_fov_model * rearchitect upsample block for improved modularity * update upsample keys in weight conversion * improve padding in merge_patches * use double-loop for merge * update comments * create feature_extractor, reduce some forward code * introduce config.use_mask_token in dinov2 * minor fixes * minor fixes for onnx * update __init__ to latest format * remove DepthProConfig.to_dict() * major changes in backbone * update config in weight conversion * formatting * converted model is fp32 * improve naming and docs for feature_extractor->reconstruct_feature_maps * minor fixes; amid review * create intermediate vars in func call * use torch.testing.assert_close * use ModuleList instead of Sequential and ModuleDict * update docs * include fov in integraiton tests * update docs * improve initialization of convolution layers * fix unused fov keys * update tests * ruff format * fix test, amid kaimming initialization * add depthpro to toctree * add residual layer to _no_split_modules * architecture rework * Update src/transformers/models/depth_pro/image_processing_depth_pro.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update src/transformers/models/depth_pro/image_processing_depth_pro_fast.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * update docs * improve merge_patches * use flatten with fov_output * ruff formatting * update resources section in docs Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * fix typo "final_kernal_size" Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * fix output typehint for DepthProDepthEstimator Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * residual operation in 2 steps Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * use image_size instead of global patch_size in interpolation * replace all Sequential with ModuleList * update fov * update heads * fix and update conversion script for heads * ruff formatting * remove float32 conversion * use "Fov" instead of "FOV" in class names * use "Fov" instead of "FOV" in config docs * remove prune_heads * update fusion stage * use device in examples * update processor * ruff fixes * add do_rescale in image_processor_dict * skip test: test_fast_is_faster_than_slow * ruff formatting * DepthProImageProcessorFast in other files * revert antialias removal * add antialias in BaseImageProcessorFast * Revert "revert antialias removal" This reverts commit `5caa0bd8f9`. * Revert "add antialias in BaseImageProcessorFast" This reverts commit `3ae1134780`. * update processor for grouping and antialias * try test_fast_is_faster_than_slow without "skip" or "flanky" * update checkpoint * update checkpoint * use @is_flanky for processor test * update checkpoint to "apple/DepthPro-hf" --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2025-02-10 11:32:45 +00:00
Raushan Turganbay	eebd2c972c	Chat template: update for processor (#35953 ) * update * we need batched nested input to always process correctly * update a bit * fix copies	2025-02-10 09:52:19 +01:00
Matt	a18b7fdd9e	Move audio top_k tests to the right file and add slow decorator (#36072 ) * Move audio top_k tests to the right file and add slow decorator because we load a real model * empty commit to trigger tests	2025-02-07 14:32:30 +00:00
Jade Choghari	006d9249ec	Adding RT-DETRv2 for object detection (#34773 ) * cookiecutter add rtdetrv2 * make modular working * working modelgit add . * working modelgit add . * finalize moduar inheritence * finalize moduar inheritence * Update src/transformers/models/rtdetrv2/modular_rtdetrv2.py Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com> * update modular and add rename * remove output ckpt * define loss_kwargs * fix CamelCase naming * fix naming + files * fix modular and convert file * additional changes * fix modular * fix import error (switch to lazy) * fix autobackbone * make style * add * update testing * fix loss * remove old folder * fix testing for v2 * update docstring * fix docstring * add resnetv2 (with modular bug to fix) * remove resnetv2 backbone * fix changes * small fixes * remove rtdetrv2resnetconfig * add rtdetrv2 name to convert * make style * Update docs/source/en/model_doc/rt_detr_v2.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update src/transformers/models/rt_detr_v2/modular_rt_detr_v2.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update src/transformers/models/rt_detr_v2/modular_rt_detr_v2.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * fix modular typo after review * add reviewed changes * add final review changes * Update docs/source/en/model_doc/rt_detr_v2.md Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com> * Update src/transformers/models/rt_detr_v2/__init__.py Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com> * Update src/transformers/models/rt_detr_v2/convert_rt_detr_v2_weights_to_hf.py Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com> * add review changes * remove rtdetrv2 resnet * removing this weird project change * change ckpt name from jadechoghari to author * implement review and update testing * update naming and remove wrong ckpt * name * make fix-copies * Fix RT-DETR loss * Add resources, fix name * Fix repo in docs * Fix table name --------- Co-authored-by: jadechoghari <jadechoghari@users.noreply.huggingface.co> Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: qubvel <qubvel@gmail.com>	2025-02-06 19:28:45 +00:00
Matt	4563ba2c6f	Fix StopStringCriteria to handle tokens above len(tokenizer) (#35797 ) * Fix StopStringCriteria to handle tokens above len(tokenizer) This fixes #35244 by clipping token IDs to be within the tokenizer's vocabulary size before performing the embedding lookup. This prevents index errors when model.config.vocab_size > len(tokenizer). The fix: 1. Adds a clamp operation to ensure token IDs are within bounds 2. Adds a test case to verify the behavior * Use self.stop_strings instead of stop_strings * Handle clipping correctly * make fixup * Update test to the new embedding vecs * Use much bigger values in the mismatch test * Typo fix * Slight simplification --------- Co-authored-by: openhands <openhands@all-hands.dev>	2025-02-06 16:53:28 +00:00
Zach Mueller	28f73bc307	Fix model kwargs (#35875 ) * Save state * Make a failing test * Better test * mpt -> done, many more to go * Rm extranious * Bamba * Bert * big_bird * biogpt * bloom * codegen * ctrl * data2vec * dbrx * Through up to Dbrx * electra * ernie * falcon * Fuyu/persimmon * Include noop kwargs to base models * Rebase * Skip musigen * Refactor/skip mllama * Revert makefile * Rm file * Fix PT failing, need to modify rest of loss funcs to not resize * Propagate some * Continue * More * More options * Mostly fixed * Proved that it's the same * Bloom is good * Make ability to override loss func possible * Fixup * Clean * Fix xglm * Quality tests * Skip OCR2 * Make specific loss for xglm * Make order the same/line up 1:1 * xglm * Skip fx output loss bloom model * Didn't pass in pad_token_id * Fix quality	2025-02-06 11:35:25 -05:00
湛露先生	1590c66430	Fix words typos in ggml test. (#36060 ) Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>	2025-02-06 15:32:40 +00:00
Zach Mueller	1ce0e2992e	Nail in edge case of torch dtype being overriden permantly in the case of an error (#35845 ) * Nail in edge case of torch dtype * Rm unused func * Apply suggestions from code review Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com> * Refactor tests to only mock what we need, don't introduce injection functions * SetUp/TearDown * Do super --------- Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>	2025-02-06 09:05:23 -05:00
Raushan Turganbay	3dd1de39bb	Paligemma: fix generation with Gemma2 (#36044 ) * fix paligemma * nit * use `kwargs` in models that can load any LM	2025-02-06 14:31:32 +01:00
Yih-Dar	dce9970884	Update `test_flash_attn_2_can_dispatch_composite_models` (#36050 ) * update * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-02-06 12:09:49 +01:00
Yaswanth Gali	7aee036e54	Iterative generation using Input embeds and `past_key_values` (#35890 ) * Iterative generation using input embeds * ruff fix * Added Testcase * Updated comment * ♻️ Refactored testcase * Skip test for these models * Continue generation using input embeds and cache * Skip generate_continue_from_embeds test * Refactor `prepare_input_for_generation` func * Continue generation using input embeds and cache * Modular changes fix * Overwrite 'prepare_inputs_for_generation' function	2025-02-06 11:06:05 +01:00
Sambhav Dixit	0de15c988b	Fix Audio Classification Pipeline top_k Documentation Mismatch and Bug #35736 (#35771 ) * added condition for top_k Doc mismatch fix * initilation of test file for top_k changes * added test for returning all labels * added test for few labels * tests/test_audio_classification_top_k.py * final fix * ruff fix --------- Co-authored-by: sambhavnoobcoder <indosambahv@gmail.com>	2025-02-05 16:25:08 +00:00
Stas Bekman	9dc1efa5d4	DeepSpeed github repo move sync (#36021 ) deepspeed github repo move	2025-02-05 08:19:31 -08:00
Yoni Gozlan	fa56dcc2ab	Refactoring of ImageProcessorFast (#35069 ) * add init and base image processing functions * add add_fast_image_processor to transformers-cli * add working fast image processor clip * add fast image processor to doc, working tests * remove "to be implemented" SigLip * fix unprotected import * fix unprotected vision import * update ViTImageProcessorFast * increase threshold slow fast ewuivalence * add fast img blip * add fast class in tests with cli * improve cli * add fast image processor convnext * add LlavaPatchingMixin and fast image processor for llava_next and llava_onevision * add device kwarg to ImagesKwargs for fast processing on cuda * cleanup * fix unprotected import * group images by sizes and add batch processing * Add batch equivalence tests, skip when center_crop is used * cleanup * update init and cli * fix-copies * refactor convnext, cleanup base * fix * remove patching mixins, add piped torchvision transforms for ViT * fix unbatched processing * fix f strings * protect imports * change llava onevision to class transforms (test) * fix convnext * improve formatting (following Pavel review) * fix handling device arg * improve cli * fix * fix inits * Add distinction between preprocess and _preprocess, and support for arbitrary kwargs through valid_extra_kwargs * uniformize qwen2_vl fast * fix docstrings * add add fast image processor llava * remove min_pixels max_pixels from accepted size * nit * nit * refactor fast image processors docstrings * cleanup and remove fast class transforms * update add fast image processor transformers cli * cleanup docstring * uniformize pixtral fast and make _process_image explicit * fix prepare image structure llava next/onevision * Use typed kwargs instead of explicit args * nit fix import Unpack * clearly separate pops and gets in base preprocess. Use explicit typed kwargs * make qwen2_vl preprocess arguments hashable	2025-02-04 17:52:31 -05:00
David	8d73a38606	Add DAB-DETR for object detection (#30803 ) * initial commit * encoder+decoder layer changes WIP * architecture checks * working version of detection + segmentation * fix modeling outputs * fix return dict + output att/hs * found the position embedding masking bug * pre-training version * added iamge processors * typo in init.py * iterupdate set to false * fixed num_labels in class_output linear layer bias init * multihead attention shape fixes * test improvements * test update * dab-detr model_doc update * dab-detr model_doc update2 * test fix:test_retain_grad_hidden_states_attentions * config file clean and renaming variables * config file clean and renaming variables fix * updated convert_to_hf file * small fixes * style and qulity checks * return_dict fix * Merge branch main into add_dab_detr * small comment fix * skip test_inputs_embeds test * image processor updates + image processor test updates * check copies test fix update * updates for check_copies.py test * updates for check_copies.py test2 * tied weights fix * fixed image processing tests and fixed shared weights issues * added numpy nd array option to get_Expected_values method in test_image_processing_dab_detr.py * delete prints from test file * SafeTensor modification to solve HF Trainer issue * removing the safetensor modifications * make fix copies and hf uplaod has been added. * fixed index.md * fixed repo consistency * styel fix and dabdetrimageprocessor docstring update * requested modifications after the first review * Update src/transformers/models/dab_detr/image_processing_dab_detr.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * repo consistency has been fixed * update copied NestedTensor function after main merge * Update src/transformers/models/dab_detr/modeling_dab_detr.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * temp commit * temp commit2 * temp commit 3 * unit tests are fixed * fixed repo consistency * updated expected_boxes varible values based on related notebook results in DABDETRIntegrationTests file. * temporarialy config modifications and repo consistency fixes * Put dilation parameter back to config * pattern embeddings have been added to the rename_keys method * add dilation comment to config + add as an exception in check_config_attributes SPECIAL CASES * delete FeatureExtractor part from docs.md * requested modifications in modeling_dab_detr.py * [run_slow] dab_detr * deleted last segmentation code part, updated conversion script and changed the hf path in test files * temp commit of requested modifications * temp commit of requested modifications 2 * updated config file, resolved codepaths and refactored conversion script * updated decodelayer block types and refactored conversion script * style and quality update * small modifications based on the request * attentions are refactored * removed loss functions from modeling file, added loss function to lossutils, tried to move the MLP layer generation to config but it failed * deleted imageprocessor * fixed conversion script + quality and style * fixed config_att * [run_slow] dab_detr * changing model path in conversion file and in test file * fix Decoder variable naming * testing the old loss function * switched back to the new loss function and testing with the odl attention functions * switched back to the new last good result modeling file * moved back to the version when I asked the review * missing new line at the end of the file * old version test * turn back to newest mdoel versino but change image processor * style fix * style fix after merge main * [run_slow] dab_detr * [run_slow] dab_detr * added device and type for head bias data part * [run_slow] dab_detr * fixed model head bias data fill * changed test_inference_object_detection_head assertTrues to torch test assert_close * fixes part 1 * quality update * self.bbox_embed in decoder has been restored * changed Assert true torch closeall methods to torch testing assertclose * modelcard markdown file has been updated * deleted intemediate list from decoder module --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2025-02-04 17:28:27 +00:00
Yih-Dar	fe52679e74	Update tests regarding attention types after #35235 (#36024 ) * update * update * update * dev-ci * more changes * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-02-04 18:04:47 +01:00
Marc Sun	9f486badd5	Display warning for unknown quants config instead of an error (#35963 ) * add supports_quant_method check * fix * add test and fix suggestions * change logic slightly --------- Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>	2025-02-04 15:17:01 +01:00
Sumit Vij	bc9a6d8302	Fix device mismatch error in Whisper model during feature extraction (#35866 ) * Fix device mismatch error in whisper feature extraction * Set default device * Address code review feedback --------- Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>	2025-02-04 12:23:08 +01:00
Cyril Vallez	9afb904b15	Refactor (and fix) gpt_neox (#35610 ) * start a nice modular * Update modular_gpt_neox.py * Update modular_gpt_neox.py * Update modular_gpt_neox.py * Update modular_gpt_neox.py * update * Update modular_gpt_neox.py * convert * fix attribute * fix attrs * oups * fix * fix * fix * fix * fix * fix order to pass test (see with accelerate team) * trigger CIs * modular * update * up * Update test_modeling_gpt_neox.py * Update test_modeling_gpt_neox.py * trigger CIs * correctly pass arg * simplify * remove key warning * update tp -> it's compatible since the view is before * trigger CIs	2025-02-04 11:18:43 +01:00
Ryoo Kwangrok	b1954fd64a	layernorm_decay_fix (#35927 ) * layernorm_decay_fix * W293 fix * ruff format fix * black format * ruff format * erase last layer * add test_get_parameter_names_rmsnorm * rmsnorm fix	2025-02-04 11:01:49 +01:00
Dmitry Tarasov	2ba040a71f	apply_chat_template: consistent behaviour for return_assistant_tokens_mask=True return_tensors=True (#35582 ) * apply_chat_template: consistent return_tensors behaviour with return_assistant_tokens_mask flag * test_chat_template_return_assistant_tokens_mask: support tokenizers with no attention mask * test_chat_template_return_assistant_tokens_mask: skip tokenizers with no padding token * test_chat_template_return_assistant_tokens_mask: force tokenizer padding_side=right --------- Co-authored-by: Eduard Allakhverdov <goncharova@airi.net> Co-authored-by: d.tarasov <d.tarasov@airi.net>	2025-02-04 10:27:52 +01:00
Raushan Turganbay	5d75a25b03	Qwen2-VL: fix rope delta calculation (#36013 ) * fix rope delats calculation * add test * style	2025-02-04 09:48:29 +01:00
Alex Brooks	e284c7e954	Update Granite Vision Model Path / Tests (#35998 ) * Update granite vision model path Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com> * Enable granite vision test Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com> --------- Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>	2025-02-03 20:06:03 +01:00
Arthur	7eecdf2a86	Update-tp test (#35844 ) * update test for now * up * cleanup * update todo	2025-02-03 09:37:02 +01:00
Yoni Gozlan	2b46943195	Add GOT-OCR 2.0 to Transformers (#34721 ) * init modular got_ocr2 * Get correct got_ocr architecture * add processing * run modular with processing * add working inference * apply modular * Refactor and fix style * Refactor, cleanup, fix style * fix init order * Fix docs * add base modeling tests * fix style and consistency * rename doc file * fix repo consistency * fix inference with box * add image processing and support for crop_to_multi_page * Fix batch inference * add tests * fixup * fix slow test * fix docstrings * Add model doc * update to new init * fix input autocast pixel_values dtype * update doc * move doc to multimodal * Reformat crop_image_to_patches and add docstrings * Fix example in forward docstring * Address Pablo review * [run slow] got_ocr2 * remove defaults defined twice * apply modular * add torch_device to integration tests * update modular * follow-up Pavel review * add device variable in doc * fix doc multi-page * Force eager attention for vision encoder to avoid attn implementation conflict * revert qwen2vl doc changes * use Qwen2ForCausalLM instead of Qwen2Model * make fixup * refactor gotocr2 to llava style * uniformize function names and reduce checks * final nits * fix pixel_values dtype error * change checkpoint names * fix modular	2025-01-31 11:28:13 -05:00
Yoni Gozlan	d7188ba600	Add support for nested images to LLava and VipLLava (#35558 ) * move make_flat_list_of_images and make_batched_videos to image_utils * remove unnecessary is_vision_available * move make_nested_list_of_images to image_utils * fix fast pixtral image processor * fix import mllama * fix make_nested_list_of_images * add tests * convert 4d arrays/tensors to list * add test_make_batched_videos * add support nested batch of videos * fix image processing qwen2vl	2025-01-30 16:49:20 -05:00
Marcel	e4227eb4d4	Handle empty change indices in SAM's mask to rle conversion (#35665 ) * Handle empty change indices in RLE conversion for masks * [test] Add unit tests for RLE encoding of masks in SamProcessor * [test] Update RLE conversion tests to use TensorFlow implementation * [test] Fix formatting in SamProcessorTest according to check_code_quality action * [test] Fix formatting in SamProcessorTest according to check_code_quality * [test] Refactored rle test cases into one test and used tf tensors in tf test cases * [test] Fix: removed self parameter from refactored methods * [test] Removed nested methods in run-length encoding tests for PyTorch and TensorFlow * [test] Added description to individual to run-length encoding tests for PyTorch and TensorFlow.	2025-01-30 19:08:38 +00:00
Yih-Dar	5757681837	Less flaky for `TimmBackboneModelTest::test_batching_equivalence` (#35971 ) * fix * remove is_flaky * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-01-30 16:56:26 +01:00
Raushan Turganbay	365fecb4d0	Whisper: fix static cache CI (#35852 ) * fix * remove overriden method * small change	2025-01-30 12:43:00 +01:00
Raushan Turganbay	9725e5be2f	Pixtral: vectorize patch embeddings and enable tests (#35122 ) * initial POC * - batch mix feature * fix tests * fix tests * make style * do not skip and instead fix tests * update * return back the test * correct text with the correct ckpt	2025-01-30 12:40:18 +01:00
Joao Gante	8bc4c89ee9	[bart] minor test fixes (#35965 ) fix tests	2025-01-30 10:00:11 +00:00
Joao Gante	4d3b1076a1	[generate] move max time tests (#35962 ) * move max time tests to their right place * move test to the right place	2025-01-29 17:56:46 +00:00
Fanli Lin	f0ae65c198	[tests] further fix `Tester object has no attribute '_testMethodName'` (#35781 ) * bug fix * update with more cases * more entries * Fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-01-29 16:05:33 +01:00
Jonas Rohw	23d782ead2	Output dicts support in text generation pipeline (#35092 ) * Support for generate_argument: return_dict_in_generate=True, instead of returning a error * fix: call test with return_dict_in_generate=True * fix: Only import torch if it is present * update: Encapsulate output_dict changes * fix: added back original comments --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2025-01-29 14:44:46 +00:00
Yih-Dar	cf90404807	Fix flaky `test_assisted_decoding_matches_greedy_search` (#35951 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-01-29 14:50:07 +01:00
Yih-Dar	c600e89f5c	Update `unwrap_and_save_reload_schedule` to use `weights_only=False` (#35952 ) * fix * Fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-01-29 14:30:57 +01:00
Nadav Timor	42c8ccfd4c	fix `test_generated_length_assisted_generation` (#34935 ) fix test_generated_length_assisted_generation	2025-01-29 12:03:45 +00:00
Joao Gante	ece8c42488	Test: generate with `torch.compile(model.forward)` as a fast test (#34544 )	2025-01-28 14:10:38 +00:00
Cyril Vallez	f48ecd7608	Fix TP initialization (#35860 ) * fix tp * Update modeling_utils.py * style * style * Update test_tp.py * Update test_tp.py * style * Update test_tp.py * Update test_tp.py * Update test_tp.py * Update test_tp.py	2025-01-28 15:07:37 +01:00
Raushan Turganbay	f85ba20449	Qwen-2-5-VL: fix CI (#35935 ) fix	2025-01-28 14:51:57 +01:00
Cyril Vallez	3f860dba55	Fix mask slicing for models with HybridCache (#35681 ) * correctly slice * check mask * Update modular_gemma2.py * fix * add tests * fix typo * finally fix mask slicing * Finally correctly slice in all cases!! * add test for all attention functions * small fix in tests * trick around dynamo tracing issue * last update * more robust * kwargs propagation * make it explicit for checkpointing * apply modular	2025-01-28 14:35:00 +01:00
Raushan Turganbay	b764c20b09	Fix: loading DBRX back from saved path (#35728 ) * fix dtype as dict for some models + add test * add comment in tests	2025-01-28 11:38:45 +01:00
Isotr0py	e57b459997	Split and clean up GGUF quantization tests (#35502 ) * clean up ggml test Signed-off-by: Isotr0py <2037008807@qq.com> * port remaining tests Signed-off-by: Isotr0py <2037008807@qq.com> * further cleanup Signed-off-by: Isotr0py <2037008807@qq.com> * format Signed-off-by: Isotr0py <2037008807@qq.com> * fix broken tests Signed-off-by: Isotr0py <2037008807@qq.com> * update comment Signed-off-by: Isotr0py <2037008807@qq.com> * fix Signed-off-by: Isotr0py <2037008807@qq.com> * reorganize tests Signed-off-by: Isotr0py <2037008807@qq.com> * k-quants use qwen2.5-0.5B Signed-off-by: Isotr0py <2037008807@qq.com> * move ggml tokenization test Signed-off-by: Isotr0py <2037008807@qq.com> * remove dead code Signed-off-by: Isotr0py <2037008807@qq.com> * add assert for serilization test Signed-off-by: Isotr0py <2037008807@qq.com> * use str for parameterize Signed-off-by: Isotr0py <2037008807@qq.com> --------- Signed-off-by: Isotr0py <2037008807@qq.com>	2025-01-27 15:46:57 +01:00
Mikhail Moskovchenko	5450e7c84a	🔴 🔴 🔴 Added `segmentation maps` support for DPT image processor (#34345 ) * Added `segmentation_maps` support for DPT image processor * Added tests for dpt image processor * Moved preprocessing into separate functions * Added # Copied from statements * Fixed # Copied from statements * Added `segmentation_maps` support for DPT image processor * Added tests for dpt image processor * Moved preprocessing into separate functions * Added # Copied from statements * Fixed # Copied from statements	2025-01-27 15:14:00 +01:00
pglorio	33cb1f7b61	Add Zamba2 (#34517 ) * First commit * Finish model implementation * First commit * Finish model implementation * Register zamba2 * generated modeling and configuration * generated modeling and configuration * added hybrid cache * fix attention_mask in mamba * dropped unused loras * fix flash2 * config docstrings * fix config and fwd pass * make fixup fixes * text_modeling_zamba2 * small fixes * make fixup fixes * Fix modular model converter * added inheritances in modular, renamed zamba cache * modular rebase * new modular conversion * fix generated modeling file * fixed import for Zamba2RMSNormGated * modular file cleanup * make fixup and model tests * dropped inheritance for Zamba2PreTrainedModel * make fixup and unit tests * Add inheritance of rope from GemmaRotaryEmbedding * moved rope to model init * drop del self.self_attn and del self.feed_forward * fix tests * renamed lora -> adapter * rewrote adapter implementation * fixed tests * Fix torch_forward in mamba2 layer * Fix torch_forward in mamba2 layer * Fix torch_forward in mamba2 layer * Dropped adapter in-place sum * removed rope from attention init * updated rope * created get_layers method * make fixup fix * make fixup fixes * make fixup fixes * update to new attention standard * update to new attention standard * make fixup fixes * minor fixes * cache_position * removed cache_position postion_ids use_cache * remove config from modular * removed config from modular (2) * import apply_rotary_pos_emb from llama * fixed rope_kwargs * Instantiate cache in Zamba2Model * fix cache * fix @slow decorator * small fix in modular file * Update docs/source/en/model_doc/zamba2.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * several minor fixes * inherit mamba2decoder fwd and drop position_ids in mamba * removed docstrings from modular * reinstate zamba2 attention decoder fwd * use regex for tied keys * Revert "use regex for tied keys" This reverts commit `9007a522b1`. * use regex for tied keys * add cpu to slow forward tests * dropped config.use_shared_mlp_adapter * Update docs/source/en/model_doc/zamba2.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * re-convert from modular --------- Co-authored-by: root <root@node-2.us-southcentral1-a.compute.internal> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2025-01-27 10:51:23 +01:00
Arthur	b912f5ee43	use torch.testing.assertclose instead to get more details about error in cis (#35659 ) * use torch.testing.assertclose instead to get more details about error in cis * fix * style * test_all * revert for I bert * fixes and updates * more image processing fixes * more image processors * fix mamba and co * style * less strick * ok I won't be strict * skip and be done * up	2025-01-24 16:55:28 +01:00
CalOmnie	b5aaf87509	Fix `test_pipelines_video_classification` that was always failing (#35842 ) * Fix test_pipelines_video_classification that was always failing * Update video pipeline docstring to reflect actual return type --------- Co-authored-by: Louis Groux <louis.cal.groux@gmail.com>	2025-01-23 19:22:32 +01:00
Alex Brooks	71cc8161b2	Granite Vision Support (#35579 ) * Add multimodal granite support Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> Support multiple image feature layres Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> * Remove failing validation for visual encoders with no cls Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> * Update llava based models / configs to support list of feature layers Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> * Add tests for multiple feature layers Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> * Use conditional instead of except for misaligned feature shapes Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com> * crop cls from each hidden state Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com> * Fix formatting Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> * Support single vision feature int in vipllava Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> * Fix typo in vision feature selection strategy validation Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com> * Add tentative integration test for granite vision models Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com> * Add granite vision docs Replace multimodal granite refs with granite vision Add granite vision / llava next alias Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com> * Use image url in granitevision example Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com> --------- Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>	2025-01-23 17:15:52 +01:00
Arthur	8f1509a96c	Fix more CI tests (#35661 ) add tooslow for the fat ones	2025-01-23 14:45:42 +01:00
ShuaiBai623	f3f6c86582	add qwen2.5vl (#35569 ) * add qwen2.5vl * fix * pass check table * add modular file * fix style * Update src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py Co-authored-by: Minho Shim <6764739+minostauros@users.noreply.github.com> * Update src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py Co-authored-by: Minho Shim <6764739+minostauros@users.noreply.github.com> * Update src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py Co-authored-by: Minho Shim <6764739+minostauros@users.noreply.github.com> * padd copy check * use modular * fix * fix * fix * update flashatt2&sdpa support_list * Update docs/source/en/_toctree.yml Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2_5_vl.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2_5_vl.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2_5_vl.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2_5_vl.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update src/transformers/models/qwen2_5_vl/modular_qwen2_5_vl.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * update config * update * fix hf path * rename Qwen2_5_VLVideosKwargs * fix * fix * update * excuted modular * rollback init * fix * formated * simpler init * fix * fix * fix * fix * fix * update docs * fix * fix * update Qwen2VLRotaryEmbedding for yarn * fix --------- Co-authored-by: Minho Shim <6764739+minostauros@users.noreply.github.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: gewenbin0992 <gewenbin292@163.com> Co-authored-by: gewenbin0992 <67409248+gewenbin0992@users.noreply.github.com>	2025-01-23 11:23:00 +01:00
Cyril Vallez	d3af76df58	[Backend support] Allow `num_logits_to_keep` as Tensor + add flag (#35757 ) * support * Update modeling_utils.py * style * most models * Other models * fix-copies * tests + generation utils	2025-01-23 09:47:54 +01:00
Arthur	8736e91ad6	[ `tests`] remove some flash attention class tests (#35817 ) remove class from tests	2025-01-23 09:44:21 +01:00
Mohamed Mekkouri	a7738f5a89	Fix : Nemotron tokenizer for GGUF format (#35836 ) fix nemotron gguf	2025-01-22 12:28:40 +01:00
Joao Gante	36c9181f5c	[gpt2] fix generation tests (#35822 ) fix gpt2 generation tests	2025-01-22 09:41:04 +00:00
Raushan Turganbay	373e50e970	Init cache on meta device (#35164 ) * init cache on meta device * offloaded static + enable tests * tests weren't running before :( * update * fix mamba * fix copies * update * address comments and fix tests * fix copies * Update src/transformers/cache_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * update * mamba fix --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2025-01-22 09:49:17 +01:00
Joao Gante	90b46e983f	Remove old `benchmark` code (#35730 ) * remove traces of the old deprecated benchmarks * also remove old tf benchmark example, which uses deleted code * run doc builder	2025-01-21 17:56:43 +00:00
eustlb	870eb7b41b	[Mimi] update test expected values for t4 runners (#35696 ) update values for t4	2025-01-21 18:23:36 +01:00
Yoni Gozlan	107f9f5127	add Qwen2-VL image processor fast (#35733 ) * add qwen2_vl image processor fast * add device to ImagesKwargs * remove automatic fix copies * fix fast_is_faster_than_slow * remove unnecessary import	2025-01-21 11:49:05 -05:00
Mohamed Mekkouri	dbd8474125	Fix : BLOOM tie_word_embeddings in GGUF (#35812 ) * fix bloom ggml * fix falcon output * make style	2025-01-21 15:35:54 +01:00
fzyzcjy	dc10f7906a	Support adamw_torch_8bit (#34993 ) * var * more * test	2025-01-21 14:17:49 +01:00
Yih-Dar	fd8d61fdb2	Byebye `test_batching_equivalence`'s flakiness (#35729 ) * fix * fix * skip * better error message --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-01-21 13:11:33 +01:00
NielsRogge	78f5ee0217	Add LlavaImageProcessor (#33191 ) * First draft * Add equivalence test * Update docstrings * Add tests * Use numpy * Fix tests * Improve variable names * Improve docstring * Add link * Remove script * Add copied from * Address comment * Add note in docs * Add docstring, data format * Improve test * Add test * update * Update src/transformers/models/llava/image_processing_llava.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update src/transformers/models/llava/image_processing_llava.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * loop once only --------- Co-authored-by: raushan <raushan@huggingface.co> Co-authored-by: Raushan Turganbay <raushan.turganbay@alumni.nu.edu.kz> Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2025-01-21 12:47:04 +01:00
Aymeric Roucher	44393df089	Tool calling: support more types (#35776 ) * Tool calling: support NoneType for function return type	2025-01-20 19:15:34 +01:00
jiqing-feng	f19135afc7	fix low-precision audio classification pipeline (#35435 ) * fix low-precision audio classification pipeline Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * add test Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix format Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix torch import Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix torch import Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix format Signed-off-by: jiqing-feng <jiqing.feng@intel.com> --------- Signed-off-by: jiqing-feng <jiqing.feng@intel.com>	2025-01-20 16:20:51 +00:00
jiqing-feng	641238eb76	Fix vits low-precision dtype (#35418 ) * fix vits dtype Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * add tests Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * use weight dtype Signed-off-by: jiqing-feng <jiqing.feng@intel.com> --------- Signed-off-by: jiqing-feng <jiqing.feng@intel.com>	2025-01-20 16:19:31 +00:00
jiqing-feng	729b569531	fix document qa bf16 pipeline (#35456 ) * fix document qa bf16 pipeline Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * add test Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix test Signed-off-by: jiqing-feng <jiqing.feng@intel.com> --------- Signed-off-by: jiqing-feng <jiqing.feng@intel.com>	2025-01-20 16:18:07 +00:00
eustlb	5f0f4b1b93	Patch moonshine (#35731 ) * udpate expected logits for T4 runners * update doc * correct order of the args for better readability * remove generate wrap * convert modular	2025-01-20 16:19:29 +01:00
Mohamed Mekkouri	b80e334e71	Skip Falcon 7B GGML Test (#35783 ) skip test	2025-01-20 15:00:34 +01:00
StevenBucaille	abe57b6f17	Add SuperGlue model (#29886 ) * Initial commit with template code generated by transformers-cli * Multiple additions to SuperGlue implementation : - Added the SuperGlueConfig - Added the SuperGlueModel and its implementation - Added basic weight conversion script - Added new ImageMatchingOutput dataclass * Few changes for SuperGlue * Multiple changes : - Added keypoint detection config to SuperGlueConfig - Completed convert_superglue_to_pytorch and succesfully run inference * Reverted unintentional change * Multiple changes : - Added SuperGlue to a bunch of places - Divided SuperGlue into SuperGlueForImageMatching and SuperGlueModel - Added testing images * Moved things in init files * Added docs (to be finished depending on the final implementation) * Added necessary imports and some doc * Removed unnecessary import * Fixed make fix-copies bug and ran it * Deleted SuperGlueModel Fixed convert script * Added SuperGlueImageProcessor * Changed SuperGlue to support batching pairs of images and modified ImageMatchingOutput in consequences * Changed convert_superglue_to_hf.py script to experiment different ways of reading an image and seeing its impact on performances * Added initial tests for SuperGlueImageProcessor * Added AutoModelForImageMatching in missing places and tests * Fixed keypoint_detector_output instructions * Fix style * Adapted to latest main changes * Added integration test * Fixed bugs to pass tests * Added keypoints returned by keypoint detector in the output of SuperGlue * Added doc to SuperGlue * SuperGlue returning all attention and hidden states for a fixed number of keypoints * Make style * Changed SuperGlueImageProcessor tests * Revert "SuperGlue returning all attention and hidden states for a fixed number of keypoints" Changed tests accordingly This reverts commit 5b3b669c * Added back hidden_states and attentions masked outputs with tests * Renamed ImageMatching occurences into KeypointMatching * Changed SuperGlueImageProcessor to raise error when batch_size is not even * Added docs and clarity to hidden state and attention grouping function * Fixed some code and done refactoring * Fixed typo in SuperPoint output doc * Fixed some of the formatting and variable naming problems * Removed useless function call * Removed AutoModelForKeypointMatching * Fixed SuperGlueImageProcessor to only accept paris of images * Added more fixes to SuperGlueImageProcessor * Simplified the batching of attention and hidden states * Simplified stack functions * Moved attention instructions into class * Removed unused do_batch_norm argument * Moved weight initialization to the proper place * Replaced deepcopy for instantiation * Fixed small bug * Changed from stevenbucaille to magic-leap repo * Renamed London Bridge images to Tower Bridge * Fixed formatting * Renamed remaining "london" to "tower" * Apply suggestions from code review Small changes in the docs Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Added AutoModelForKeypointMatching * Changed images used in example * Several changes to image_processing_superglue and style * Fixed resample type hint * Changed SuperGlueImageProcessor and added test case for list of 2 images * Changed list_of_tuples implementation * Fix in dummy objects * Added normalize_keypoint, log_sinkhorn_iterations and log_optimal_transport docstring * Added missing docstring * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Moved forward block at bottom * Added docstring to forward method * Added docstring to match_image_pair method * Changed test_model_common_attributes to test_model_get_set_embeddings test method signature * Removed AutoModelForKeypointMatching * Removed image fixtures and added load_dataset * Added padding of images in SuperGlueImageProcessor * Cleaned up convert_superglue_to_hf script * Added missing docs and fixed unused argument * Fixed SuperGlueImageProcessor tests * Transposed all hidden states from SuperGlue to reflect the standard (..., seq_len, feature_dim) shape * Added SuperGlueForKeypointMatching back to modeling_auto * Fixed image processor padding test * Changed SuperGlue docs * changes: - Abstraction to batch, concat and stack of inconsistent tensors - Changed conv1d's to linears to match standard attention implementations - Renamed all tensors to be tensor0 and not tensor_0 and be consistent - Changed match image pair to run keypoint detection on all image first, create batching tensors and then filling these tensors matches after matches - Various changes in docs, etc * Changes to SuperGlueImageProcessor: - Reworked the input image pairs checking function and added tests accordingly - Added Copied from statements - Added do_grayscale tag (also for SuperPointImageProcessor) - Misc changes for better code * Formatting changes * Reverted conv1d to linear conversion because of numerical differences * fix: changed some code to be more straightforward (e.g. filtering keypoints) and converted plot from opencv to matplotlib * fix: removed unnecessary test * chore: removed commented code and added back hidden states transpositions * chore: changed from "inconsistent" to "ragged" function names as suggested Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * docs: applied suggestions Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * docs: updated to display matched output * chore: applied suggestion for check_image_pairs_input function Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * chore: changed check_image_pairs_input function name to validate_and_format_image_pairs and used validate_preprocess_arguments function * tests: simplified tests for image input format and shapes * feat: converted SuperGlue's use of Conv1d with kernel_size of 1 with Linear layers. Changed tests and conversion script accordingly * feat: several changes to address comments Conversion script: - Reverted fuse batchnorm to linear conversion - Changed all 'nn.Module' to respective SuperGlue models - Changed conversion script to use regex mapping and match other recent scripts Modeling SuperGlue: - Added batching with mask and padding to attention - Removed unnecessary concat, stack and batch ragged pairs functions - Reverted batchnorm layer - Renamed query, key, value and merge layers into q, k, v, out proj - Removed Union of different Module into nn.Module in _init_weights method typehint - Changed several method's signature to combine image0 and image1 inputs with appropriate doc changes - Updated SuperGlue's doc with torch.no_grad() Updated test to reflect changes in SuperGlue model * refactor: changed validate_and_format_image_pairs function with clarity * refactor: changed from one SuperGlueMLP class to a list of SuperGlueMLP class * fix: fixed forgotten init weight change from last commit * fix: fixed rebase mistake * fix: removed leftover commented code * fix: added typehint and changed some of arguments default values * fix: fixed attribute default values for SuperGlueConfig * feat: added SuperGlueImageProcessor post process keypoint matching method with tests * fix: fixed SuperGlue attention and hidden state tuples aggregation * chore: fixed mask optionality and reordered tensor reshapes to be cleaner * chore: fixed docs and error message returned in validate_and_format_image_pairs function * fix: fixed returned keypoints to be the ones that SuperPoint returns * fix: fixed check on number of image sizes for post process compared to the pairs in outputs of SuperGlue * fix: fixed check on number of image sizes for post process compared to the pairs in outputs of SuperGlue (bis) * fix: Changed SuperGlueMultiLayerPerceptron instantiation to avoid if statement * fix: Changed convert_superglue_to_hf script to reflect latest SuperGlue changes and got rid of nn.Modules * WIP: implement Attention from an existing class (like BERT) * docs: Changed docs to include more appealing matching plot * WIP: Implement Attention * chore: minor typehint change * chore: changed convert superglue script by removing all classes and apply conv to linear conversion in state dict + rearrange keys to comply with changes in model's layers organisation * Revert "Fixed typo in SuperPoint output doc" This reverts commit `2120390e82`. * chore: added comments in SuperGlueImageProcessor * chore: changed SuperGlue organization HF repo to magic-leap-community * [run-slow] refactor: small change in layer instantiation * [run-slow] chore: replaced remaining stevenbucaille org to magic-leap-community * [run-slow] chore: make style * chore: update image matching fixture dataset HF repository * [run-slow] superglue * tests: overwriting test_batching_equivalence * [run-slow] superglue * tests: changed test to cope with value changing depending on cuda version * [run-slow] superglue * tests: changed matching_threshold value * [run-slow] superglue * [run-slow] superglue * tests: changed tests for integration * [run-slow] superglue * fix: Changed tensor view and permutations to match original implementation results * fix: updated convert script and integration test to include last change in model * fix: increase tolerance for CUDA variances * Apply suggestions from code review Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * [run-slow] superglue * chore: removed blank whitespaces * [run-slow] superglue * Revert SuperPoint image processor accident changes * [run-slow] superglue * refactor: reverted copy from BERT class * tests: lower the tolerance in integration tests for SuperGlue * [run-slow] superglue * chore: set do_grayscale to False in SuperPoint and SuperGlue image processors * [run-slow] superglue * fix: fixed imports in SuperGlue files * chore: changed do_grayscale SuperGlueImageProcessing default value to True * docs: added typehint to post_process_keypoint_matching method in SuperGlueImageProcessor * fix: set matching_threshold default value to 0.0 instead of 0.2 * feat: added matching_threshold to post_process_keypoint_matching method * docs: update superglue.md to include matching_threshold parameter * docs: updated SuperGlueConfig docstring for matching_threshold default value * refactor: removed unnecessary parameters in SuperGlueConfig * fix: changed from matching_threshold to threshold * fix: re-revert changes to make SuperGlue attention classes copies of BERT * [run-slow] superglue * fix: added missing device argument in post_processing method * [run-slow] superglue * fix: add matches different from -1 to compute valid matches in post_process_keypoint_matching (and docstring) * fix: add device to image_sizes tensor instantiation * tests: added checks on do_grayscale test * chore: reordered and added Optional typehint to KeypointMatchingOutput * LightGluePR suggestions: - use `post_process_keypoint_matching` as default docs example - add `post_process_keypoint_matching` in autodoc - add `SuperPointConfig` import under TYPE_CHECKING condition - format SuperGlueConfig docstring - add device in convert_superglue_to_hf - Fix typo - Fix KeypointMatchingOutput docstring - Removed unnecessary line - Added missing SuperGlueConfig in __init__ methods * LightGluePR suggestions: - use batching to get keypoint detection * refactor: processing images done in 1 for loop instead of 4 * fix: use @ instead of torch.einsum for scores computation * style: added #fmt skip to long tensor values * refactor: rollbacked validate_and_format_image_pairs valid and invalid case to more simple ones * refactor: prepare_imgs * refactor: simplified `validate_and_format_image_pairs` * docs: fixed doc --------- Co-authored-by: steven <steven.bucaillle@gmail.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Steven Bucaille <steven.bucaille@buawei.com> Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2025-01-20 10:32:39 +00:00
NielsRogge	872dfbdd46	[ViTPose] Convert more checkpoints (#35638 ) * Convert more checkpoints * Update docs, convert huge variant * Update model name * Update src/transformers/models/vitpose/modeling_vitpose.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Remove print statements * Update docs/source/en/model_doc/vitpose.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Link to collection --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-01-20 11:29:47 +01:00
Raushan Turganbay	8571bb145a	Fix CI for VLMs (#35690 ) * fix some easy test * more tests * remove logit check here also * add require_torch_large_gpu in Emu3	2025-01-20 11:15:39 +01:00
Dmitry Rogozhkin	7d4b3ddde4	ci: fix xpu skip condition for test_model_parallel_beam_search (#35742 ) `return unittest.skip()` used in the `test_model_parallel_beam_search` in skip condition for xpu did not actually mark test to be skipped running under pytest: * 148 passed, 1 skipped Other tests use `self.skipTest()`. Reusing this approach and moving the condition outside the loop (since it does not depend on it) allows to skip for xpu correctly: * 148 skipped Secondly, `device_map="auto"` is now implemented for XPU for IPEX>=2.5 and torch>=2.6, so we can now enable these tests for XPU for new IPEX/torch versions. Fixes: `1ea3ad1ae` ("[tests] use `torch_device` instead of `auto` for model testing (#29531)") Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>	2025-01-17 16:47:27 +01:00
Pavel Iakubovskii	099d93d2e9	Grounding DINO Processor standardization (#34853 ) * Add input ids to model output * Add text preprocessing for processor * Fix snippet * Add test for equivalence * Add type checking guard * Fixing typehint * Fix test for added `input_ids` in output * Add deprecations and "text_labels" to output * Adjust tests * Fix test * Update code examples * Minor docs and code improvement * Remove one-liner functions and rename class to CamelCase * Update docstring * Fixup	2025-01-17 14:18:16 +00:00
Pavel Iakubovskii	42b2857b01	OmDet Turbo processor standardization (#34937 ) * Fix docstring * Fix docstring * Add `classes_structure` to model output * Update omdet postprocessing * Adjust tests * Update code example in docs * Add deprecation to "classes" key in output * Types, docs * Fixing test * Fix missed clip_boxes * [run-slow] omdet_turbo * Apply suggestions from code review Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com> * Make CamelCase class --------- Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>	2025-01-17 14:10:19 +00:00
Pavel Iakubovskii	94ae9a8da1	OwlViT/Owlv2 post processing standardization (#34929 ) * Refactor owlvit post_process_object_detection + add text_labels * Fix copies in grounding dino * Sync with Owlv2 postprocessing * Add post_process_grounded_object_detection method to processor, deprecate post_process_object_detection * Add test cases * Move text_labels to processors only * [run-slow] owlvit owlv2 * [run-slow] owlvit, owlv2 * Update snippets * Update docs structure * Update deprecated objects for check_repo * Update docstring for post processing of image guided object detection	2025-01-17 13:58:28 +00:00
Cyril Vallez	ab1afd56f5	Fix some tests (#35682 ) * cohere tests * glm tests * cohere2 model name * create decorator * update * fix cohere2 completions * style * style * style * add cuda in comments	2025-01-17 12:10:43 +00:00
Ross Wightman	8c1b5d3782	🚨🚨🚨 An attempt to fix #29554 . Include 'LayerNorm.' in gamma/beta rename scope, optimize string search. (#35615 ) * An attempt to fix #29554. Include 'LayerNorm.' in gamma/beta rename scope, reduce number of characters searched on every load considerably. * Fix fix on load issue * Fix gamma/beta warning test * A style complaint * Improve efficiency of weight norm key rename. Add better comments about weight norm and layer norm renaming. * Habitual elif redunant with the return	2025-01-16 17:25:44 -08:00
Joao Gante	94af1c0aa2	[generate] return Cache object even if passed in a legacy format (#35673 ) * generate returns a Cache object by default * fix tests * fix test for encoder-decoder models	2025-01-16 17:06:24 +00:00
Joao Gante	2818307e93	[generate] can instantiate `GenerationConfig(cache_implementation="static")` (#35679 ) fix failing instantiation	2025-01-16 17:04:54 +00:00
Joao Gante	aeeceb9916	[cache] add a test to confirm we can use cache at train time (#35709 ) * add test * augment test as suggested * Update tests/utils/test_modeling_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * rerun tests --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2025-01-16 17:02:34 +00:00
kang sheng	2cbcc5877d	Fix condition when GA loss bug fix is not performed (#35651 ) * fix condition when GA loss bug fix is not performed * max loss diff is 2.29 * fix typo * add an extra validation that loss should not vary too much	2025-01-16 13:59:53 +01:00
jiqing-feng	387663e571	Enable gptqmodel (#35012 ) * gptqmodel Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix format Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * update readme Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * gptqmodel need use checkpoint_format (#1) * gptqmodel need use checkpoint_format * fix quantize * Update quantization_config.py * Update quantization_config.py * Update quantization_config.py --------- Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai> Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai> * Revert quantizer_gptq.py (#2) * revert quantizer_gptq.py change * pass *kwargs limit gptqmodel and optimum version Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix format Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix warning Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix version check Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * revert unrelated changes Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * enable gptqmodel tests Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix requires gptq Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * Fix Transformer compat (#3) * revert quantizer_gptq.py change * pass *kwargs add meta info * cleanup * cleanup * Update quantization_config.py * hf_select_quant_linear pass checkpoint_format and meta * fix GPTQTestCUDA * Update test_gptq.py * gptqmodel.hf_select_quant_linear() now does not select ExllamaV2 * cleanup * add backend * cleanup * cleanup * no need check exllama version * Update quantization_config.py * lower checkpoint_format and backend * check none * cleanup * Update quantization_config.py * fix self.use_exllama == False * spell * fix unittest * fix unittest --------- Co-authored-by: LRL <lrl@lbx.dev> Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai> * fix format Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix format again Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * update gptqmodel version (#6) * update gptqmodel version * update gptqmodel version * fix unit test (#5) * update gptqmodel version * update gptqmodel version * "not self.use_exllama" is not equivalent to "self.use_exllama==False" * fix unittest * update gptqmodel version * backend is loading_attibutes (#7) * fix format and tests Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix memory check Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix device mismatch Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix result check Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * Update src/transformers/quantizers/quantizer_gptq.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/quantizers/quantizer_gptq.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/quantizers/quantizer_gptq.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * update tests Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * review: update docs (#10) * review: update docs (#12) * review: update docs * fix typo * update tests for gptqmodel Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * update document (#9) * update overview.md * cleanup * Update overview.md * Update overview.md * Update overview.md * update gptq.md * Update gptq.md * Update gptq.md * Update gptq.md * Update gptq.md * Update gptq.md * Update gptq.md --------- Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai> * typo * doc note for asymmetric quant * typo with apple silicon(e) * typo for marlin * column name revert: review * doc rocm support * Update docs/source/en/quantization/gptq.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/quantization/gptq.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/quantization/gptq.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/quantization/gptq.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/quantization/overview.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/quantization/overview.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Signed-off-by: jiqing-feng <jiqing.feng@intel.com> Co-authored-by: LRL-ModelCloud <165116337+LRL-ModelCloud@users.noreply.github.com> Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai> Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai> Co-authored-by: ZX-ModelCloud <165115237+ZX-ModelCloud@users.noreply.github.com> Co-authored-by: LRL <lrl@lbx.dev> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-01-15 14:22:49 +01:00
Raushan Turganbay	09d5f76274	Clean-up composite configs (#34603 ) * remove manual assignment tie-word-embeddings * remove another unused attribute * fix tests * fix tests * remove unnecessary overwrites * fix * decoder=True * clean pix2struct * run-all * forgot `_tied_weights_keys` when adding Emu3 * also Aria + fix-copies * and clean aria	2025-01-15 10:04:07 +01:00
Mahdi Baghbanzadeh	c61fcde910	Enhance DataCollatorForLanguageModeling with Configurable Token Replacement Probabilities (#35251 ) * DataCollatorForLanguageModeling class was updated with new parameters that provides more control over the token masking and relacing * DataCollatorForLanguageModeling class was updated with new parameters that provides more control over the token masking and relacing * Addressed review comments, modified the docstring and made a test for the DataCollatorForLanguageModeling	2025-01-14 17:01:10 +00:00
Mohamed Mekkouri	a11041ffad	Fix : add require_read_token for gemma2 gated model (#35687 ) fix gemma2 gated model test	2025-01-14 11:47:05 +01:00
Mohamed Mekkouri	df2a812e95	Fix expected output for ggml test (#35686 ) fix expected output	2025-01-14 11:46:55 +01:00
Mohamed Mekkouri	050636518a	Fix : HQQ config when hqq not available (#35655 ) * fix * make style * adding require_hqq * make style	2025-01-14 11:37:37 +01:00

... 5 6 7 8 9 ...

5042 Commits