transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 10:12:23 +06:00

Author	SHA1	Message	Date
Yih-Dar	8fd4bc7d1d	Fix a mistake in #36175 (#36179 ) fix my bad Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-02-13 18:33:02 +01:00
Mohamed Mekkouri	b1a2de075d	Follow up to SpQR integration (#36176 ) fix	2025-02-13 17:40:59 +01:00
Wizyoung	12962fe84b	Fix the key name for _load_rng_state under torch.cuda (#36138 ) fix load key name for _load_rng_state under torch.cuda Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-02-13 11:35:08 -05:00
Yih-Dar	bfe46c98b5	Make `check_repository_consistency` run faster by MP (#36175 ) * speeddddd * speeddddd * speeddddd * speeddddd --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-02-13 17:25:17 +01:00
Jiahao Li	5f0fd1185b	Optimize Qwen2VL vision model by precomputing cos/sin embeds before ViT blocks (#35837 ) * Optimize Qwen2VL vision model by precomputing cos/sin embeds before ViT blocks * Make rotary_pos_emb optional & fix type * Adapt pre-computed cos/sin to Qwen2.5VL * More concise	2025-02-13 17:10:58 +01:00
மனோஜ்குமார் பழனிச்சாமி	d72642bccc	Use tqdm auto (#35726 ) * Remove traces of the progressbar * Use tqdm auto	2025-02-13 15:41:30 +00:00
Joao Gante	62c7ea0201	CI: avoid human error, automatically infer generative models (#33212 ) * tmp commit * move tests to the right class * remove ALL all_generative_model_classes = ... * skip tf roberta * skip InstructBlipForConditionalGenerationDecoderOnlyTest * videollava * reduce diff * reduce diff * remove on vlms * fix a few more * manual rebase bits * more manual rebase * remove all manual generative model class test entries * fix up to ernie * a few more removals * handle remaining cases * recurrent gemma * it's better here * make fixup * tf idefics is broken * tf bert + generate is broken * don't touch tf :() * don't touch tf :( * make fixup * better comments for test skips * revert tf changes * remove empty line removal * one more * missing one	2025-02-13 16:27:11 +01:00
Arthur	06231fdfc7	add disable compile option (#36161 ) * add disable compile code * fix	2025-02-13 16:24:46 +01:00
Arthur	0ca7259217	fix training issues (#36158 ) * fix training issues * Update Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-02-13 16:24:28 +01:00
Elvir Crnčević	845b0a2616	Efficient Inference Kernel for SpQR (#34976 ) * Resolve vptq conflict * Rename spqr package to spqr_quant * Get rid of aqlm mention * Start working on tests * Resolve ruff code checks * Ruff format * Isort * Test updates * Add gpu tag * Rename to modules_to_not_convert * Config update * Docs and config update * Docs and config update * Update to update_torch_dtype * spqr config parameter validation * Ruff update * Apply ruff fixes * Test fixes * Ruff update * Mark tests as @slow again; Ruff; Docstring update * Ruff * Remove absolute path * Resolve typo * Remove redundandt log * Check accelerate/spqr availability * Ruff fix * Check if the config contains proper shapes * Ruff test * Documentation update * overview update * Ruff checks * Ruff code quality * Make style * Update docs/source/en/quantization/spqr.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update spqr.md * Enable gptqmodel (#35012) * gptqmodel Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix format Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * update readme Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * gptqmodel need use checkpoint_format (#1) * gptqmodel need use checkpoint_format * fix quantize * Update quantization_config.py * Update quantization_config.py * Update quantization_config.py --------- Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai> Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai> * Revert quantizer_gptq.py (#2) * revert quantizer_gptq.py change * pass *kwargs limit gptqmodel and optimum version Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix format Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix warning Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix version check Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * revert unrelated changes Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * enable gptqmodel tests Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix 
requires gptq Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * Fix Transformer compat (#3) * revert quantizer_gptq.py change * pass *kwargs add meta info * cleanup * cleanup * Update quantization_config.py * hf_select_quant_linear pass checkpoint_format and meta * fix GPTQTestCUDA * Update test_gptq.py * gptqmodel.hf_select_quant_linear() now does not select ExllamaV2 * cleanup * add backend * cleanup * cleanup * no need check exllama version * Update quantization_config.py * lower checkpoint_format and backend * check none * cleanup * Update quantization_config.py * fix self.use_exllama == False * spell * fix unittest * fix unittest --------- Co-authored-by: LRL <lrl@lbx.dev> Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai> * fix format Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix format again Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * update gptqmodel version (#6) * update gptqmodel version * update gptqmodel version * fix unit test (#5) * update gptqmodel version * update gptqmodel version * "not self.use_exllama" is not equivalent to "self.use_exllama==False" * fix unittest * update gptqmodel version * backend is loading_attibutes (#7) * fix format and tests Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix memory check Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix device mismatch Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix result check Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * Update src/transformers/quantizers/quantizer_gptq.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/quantizers/quantizer_gptq.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/quantizers/quantizer_gptq.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * update tests Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * review: update docs (#10) * review: update docs (#12) * review: update docs 
* fix typo * update tests for gptqmodel Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * update document (#9) * update overview.md * cleanup * Update overview.md * Update overview.md * Update overview.md * update gptq.md * Update gptq.md * Update gptq.md * Update gptq.md * Update gptq.md * Update gptq.md * Update gptq.md --------- Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai> * typo * doc note for asymmetric quant * typo with apple silicon(e) * typo for marlin * column name revert: review * doc rocm support * Update docs/source/en/quantization/gptq.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/quantization/gptq.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/quantization/gptq.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/quantization/gptq.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/quantization/overview.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/quantization/overview.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Signed-off-by: jiqing-feng <jiqing.feng@intel.com> Co-authored-by: LRL-ModelCloud <165116337+LRL-ModelCloud@users.noreply.github.com> Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai> Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai> Co-authored-by: ZX-ModelCloud <165115237+ZX-ModelCloud@users.noreply.github.com> Co-authored-by: LRL <lrl@lbx.dev> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Fix : Nemotron Processor in GGUF conversion (#35708) * fixing nemotron processor * make style * Update docs/source/en/quantization/spqr.md Co-authored-by: Arthur 
<48595927+ArthurZucker@users.noreply.github.com> * Add missing TOC to doc --------- Signed-off-by: jiqing-feng <jiqing.feng@intel.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: jiqing-feng <jiqing.feng@intel.com> Co-authored-by: LRL-ModelCloud <165116337+LRL-ModelCloud@users.noreply.github.com> Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai> Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai> Co-authored-by: ZX-ModelCloud <165115237+ZX-ModelCloud@users.noreply.github.com> Co-authored-by: LRL <lrl@lbx.dev> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2025-02-13 16:22:58 +01:00
dependabot[bot]	c5506f4f00	Bump transformers from 4.38.0 to 4.48.0 in /examples/research_projects/adversarial (#36168 ) Bump transformers in /examples/research_projects/adversarial Bumps [transformers](https://github.com/huggingface/transformers) from 4.38.0 to 4.48.0. - [Release notes](https://github.com/huggingface/transformers/releases) - [Commits](https://github.com/huggingface/transformers/compare/v4.38.0...v4.48.0) --- updated-dependencies: - dependency-name: transformers dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-02-13 15:06:16 +00:00
dependabot[bot]	d7c5d1b539	Bump transformers from 4.38.0 to 4.48.0 in /examples/tensorflow/language-modeling-tpu (#36167 ) Bump transformers in /examples/tensorflow/language-modeling-tpu Bumps [transformers](https://github.com/huggingface/transformers) from 4.38.0 to 4.48.0. - [Release notes](https://github.com/huggingface/transformers/releases) - [Commits](https://github.com/huggingface/transformers/compare/v4.38.0...v4.48.0) --- updated-dependencies: - dependency-name: transformers dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-02-13 14:46:38 +00:00
Joao Gante	636ee57489	[generate] revert change in Aria: the maximum cache length must match `max_length` (#36120 ) * revert inputs_embeds len * Update test_utils.py * make fixup	2025-02-13 14:36:33 +00:00
Mohamed Mekkouri	b41591d847	Fix : fix doc fp8 (#36173 ) * fix * fix	2025-02-13 15:29:59 +01:00
Arthur	b079dd1fa2	Fix red CI (#36174 ) test was weird	2025-02-13 14:27:55 +01:00
Joao Gante	d114a6f78e	[Modular] skip modular checks based on diff (#36130 ) skip modular checks based on diff	2025-02-13 12:53:21 +00:00
Pavel Iakubovskii	6397916dd2	Remove loading custom kernel for RT-DETRv2 (#36098 ) * Remove loading custom kernels * Remove config param * Fixup	2025-02-13 12:01:53 +00:00
Mohamed Mekkouri	efe72fe21f	Adding FP8 Quantization to transformers (#36026 ) * first commit * adding kernels * fix create_quantized_param * fix quantization logic * end2end * fix style * fix imports * fix consistency * update * fix style * update * udpate after review * make style * update * update * fix * update * fix docstring * update * update after review * update * fix scheme * update * update * fix * update * fix docstring * add source * fix test --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-02-13 13:01:19 +01:00
Lysandre Debut	c82319b493	Helium documentation fixes (#36170 ) * Helium documentation fixes * Update helium.md * Update helium.md * Update helium.md	2025-02-13 12:20:53 +01:00
Thomas Bauwens	8f137b2427	Move `DataCollatorForMultipleChoice` from the docs to the package (#34763 ) * Add implementation for DataCollatorForMultipleChoice based on docs. * Add DataCollatorForMultipleChoice to import structure. * Remove custom DataCollatorForMultipleChoice implementations from example scripts. * Remove custom implementations of DataCollatorForMultipleChoice from docs in English, Spanish, Japanese and Korean. * Refactor torch version of DataCollatorForMultipleChoice to be more easily understandable. * Apply suggested changes and run make fixup. * fix copies, style and fixup * add missing documentation * nits * fix docstring * style * nits * isort --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>	2025-02-13 12:01:28 +01:00
CL-ModelCloud	35c155052d	Fix PretrainedTokenizerFast check => Fix PretrainedTokenizerFast Save (#35835 ) * Fix the bug in tokenizer.save_pretrained when saving tokenizer_class to tokenizer_config.json * Update tokenization_utils_base.py * Update tokenization_utils_base.py * Update tokenization_utils_base.py * add tokenizer class type test * code review * code opt * fix bug * Update test_tokenization_fast.py * ruff check * make style * code opt * Update test_tokenization_fast.py --------- Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai> Co-authored-by: LRL-ModelCloud <165116337+LRL-ModelCloud@users.noreply.github.com>	2025-02-13 12:00:33 +01:00
Marco Edward Gorelli	3c912c9089	docs: fix return type annotation of `get_default_model_revision` (#35982 )	2025-02-13 11:59:15 +01:00
gewenbin0992	6a1ab634b6	qwen2.5vl: fix bugs when using flash2+bf16 or num_return_sequences>1 (#36083 ) * qwen2.5vl: fix bugs when using flash2+bf16 or num_return_sequences>1 * fix * fix * fix * fix * add tests * fix test bugs * fix * fix failed tests * fix	2025-02-13 11:35:28 +01:00
Pavel Iakubovskii	d419862889	Fix tests for vision models (#35654 ) * Trigger tests * [run-slow] beit, detr, dinov2, vit, textnet * Fix BEiT interpolate_pos_encoding * Fix DETR test * Update DINOv2 test * Fix textnet * Fix vit * Fix DPT * fix data2vec test * Fix textnet test * Update interpolation check * Fix ZoeDepth tests * Update interpolate embeddings for BEiT * Apply suggestions from code review	2025-02-13 10:28:37 +00:00
Lucain	e60ae0d078	Replace deprecated update_repo_visibility (#35970 )	2025-02-13 11:27:55 +01:00
Nerogar	9065cf0d92	Fix Gemma2 dtype issue when storing weights in float16 precision (#35398 ) fix gemma2 dtype issue when storing weights in float16 precision	2025-02-13 11:17:37 +01:00
Ben Schneider	08ab1abff4	Add reminder config to issue template and print DS version in env (#35156 ) * update env command to log deepspeed version * suppress deepspeed import logging * Add reminder to include configs to repro description in bug report. * make fixup * [WIP] update import utils for deepspeed * Change to using is_deepspeed_available() from integrations. * make fixup	2025-02-13 10:55:49 +01:00
Sambhav Dixit	950cfb0b4f	Fix PaliGemma Pad Token Masking During Training #35855 (#35859 ) * change order of unmasking of tokens * library import * class setup * test function * refactor * add commit message * test modified * explict initiliasation of weights + made model smaller * removed sepete testing file * fixup * fixup core * test attention mask with token types * tests fixup * removed PaliGemmaAttentionMaskTest class --------- Co-authored-by: sambhavnoobcoder <indosambahv@gmail.com>	2025-02-13 10:11:44 +01:00
Benjamin Badger	1614d196e8	Mllama fsdp (#36000 ) * pixel input assignment revoked * double send * Update src/transformers/models/mllama/modeling_mllama.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2025-02-13 09:49:39 +01:00
ivarflakstad	847854b023	Add git LFS to AMD docker image (#36016 ) Add git lfs to AMD docker image	2025-02-12 22:27:21 +01:00
Yih-Dar	9985d06add	skip `test_initialization` for `VitPoseBackboneModelTest` for now (#36154 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-02-12 18:24:24 +01:00
Yih-Dar	4a5a7b991a	Fix test fetcher (#36129 ) * fix * fix * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-02-12 17:35:41 +01:00
Zach Mueller	1fae54c721	Add more rigerous non-slow grad accum tests (#35668 ) * Add more rigerous non-slow grad accum tests * Further nits * Re-add space * Readbility * Use tinystories instead * Revert transformer diff * tweak threshs	2025-02-12 10:26:21 -05:00
Ke Wen	f869d486d3	Update doc re list of models supporting TP (#35864 ) Update doc about models' TP support	2025-02-12 15:53:27 +01:00
hsilva664	281c0c8b5b	adding option to save/reload scaler (#34932 ) * Adding option to save/reload scaler * Removing duplicate variable * Adding save/reload test * Small fixes on deterministic algorithm call * Moving LLM test to another file to isolate its environment * Moving back to old file and using subprocess to run test isolated * Reverting back accidental change * Reverting back accidental change	2025-02-12 15:48:16 +01:00
kang sheng	a33ac830af	Fix multi gpu loss sync condition, add doc and test (#35743 ) * Fix multi gpu loss sync condition, add doc and test * rename function and class * loss should not scale during inference * fix typo	2025-02-12 15:41:31 +01:00
zhuHQ	08c4959a23	Optim: APOLLO optimizer integration (#36062 ) * Added APOLLO optimizer integration * fix comment * Remove redundancy: Modularize low-rank optimizer construction * Remove redundancy: Remove useless comment * Fix comment: Add typing * Fix comment: Rewrite apollo desc	2025-02-12 15:33:43 +01:00
Dmitry Rogozhkin	2440512723	multi-gpu: fix tensor device placements for various models (#35763 ) * milti-gpu: fix inputs_embeds + position_embeds Fixing the following errors in few models: ``` > hidden_states = inputs_embeds + pos_embeds E RuntimeError: Expected all tensors to be on the same device, but found at least two devices, xpu:2 and xpu:3! ``` Fixes: #35762 Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> * multi-gpu: fix tensor device placements for various models Fixes: #35762 Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> * Apply make fix-copies Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> --------- Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>	2025-02-12 15:28:18 +01:00
Lucain	befea8c4f0	🚨 Remove cache migration script (#35810 ) * Remove cache migration script * remove dummy move_cache	2025-02-12 15:12:38 +01:00
dependabot[bot]	d52a9d08ce	Bump cryptography from 43.0.1 to 44.0.1 in /examples/research_projects/decision_transformer (#36142 ) Bump cryptography in /examples/research_projects/decision_transformer Bumps [cryptography](https://github.com/pyca/cryptography) from 43.0.1 to 44.0.1. - [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst) - [Commits](https://github.com/pyca/cryptography/compare/43.0.1...44.0.1) --- updated-dependencies: - dependency-name: cryptography dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-02-12 13:34:52 +00:00
dependabot[bot]	31e4831b98	Bump transformers from 4.38.0 to 4.48.0 in /examples/research_projects/vqgan-clip (#36136 ) Bump transformers in /examples/research_projects/vqgan-clip Bumps [transformers](https://github.com/huggingface/transformers) from 4.38.0 to 4.48.0. - [Release notes](https://github.com/huggingface/transformers/releases) - [Commits](https://github.com/huggingface/transformers/compare/v4.38.0...v4.48.0) --- updated-dependencies: - dependency-name: transformers dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-02-12 13:21:09 +00:00
Leon Engländer	243aeb7c4a	Fix Gradient Checkpointing for Deberta & Deberta-V2 using PEFT / Adapters (#35898 ) Replace In-Place Operations for Deberta and Deberta-V2	2025-02-12 14:21:01 +01:00
Joao Gante	8a2f062eac	[commands] remove deprecated/inoperational commands (#35718 ) rm deprecated/inoperational commands	2025-02-12 12:23:58 +00:00
Raushan Turganbay	8fc6ecba4f	VLM: enable skipped tests (#35746 ) * fix cached tests * fix some tests * fix pix2struct * fix	2025-02-12 12:55:46 +01:00
Sambhav Dixit	d6897b46bd	Add utility for Reload Transformers imports cache for development workflow #35508 (#35858 ) * Reload transformers fix form cache * add imports * add test fn for clearing import cache * ruff fix to core import logic * ruff fix to test file * fixup for imports * fixup for test * lru restore * test check * fix style changes * added documentation for usecase * fixing --------- Co-authored-by: sambhavnoobcoder <indosambahv@gmail.com>	2025-02-12 12:45:11 +01:00
Joao Gante	1cc7ca3295	Whisper: remove redundant assisted generation tests (#34814 ) * remove redundant test * delete another test * revert default max_length * (wrong place, moving)	2025-02-12 11:37:19 +00:00
MilkClouds	0cd5e2dfd0	added warning to Trainer when label_names is not specified for PeftModel (#32085 ) * feat: added warning to Trainer when label_names is not specified for PeftModel * Update trainer.py * feat: peft detectw ith `_is_peft_model` * Update src/transformers/trainer.py Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com> * Applied formatting in trainer.py --------- Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>	2025-02-12 12:34:47 +01:00
nhamanasu	377d8e2b9c	add RAdamScheduleFree optimizer (#35313 ) * add RAdamScheduleFree optimizer * revert schedulefree version to the minimum requirement * refine is_schedulefree_available so that it can take min_version * refine documents --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2025-02-12 11:31:51 +01:00
Harry Mellor	f5fff672db	Add pipeline parallel plan to `PretrainedConfig` and `PreTrainedModel` (#36091 ) * Add `base_model_pp_plan` to `PretrainedConfig` Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> * Add `_pp_plan` to `PreTrainedModel` Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> * Add both to Llama for testing Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> * Fix type error Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> * Update to suggested schema Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> * `_pp_plan` keys are not patterns Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> * Simplify schema Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> * Fix typing error Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> * Update input name for Llama Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> * Add pp plan to Aria Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> * Add pp plan to Bamba Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> * Add pp plan to Cohere 1 & 2 Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> * Add pp plan to diffllama and emu3 Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> * Add pp plan to Gemma 1 & 2 Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> * Add pp plan to GLM and GPT NeoX Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> * Add pp plan to Granite and Helium Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> * Add pp plan to Mistral and Mixtral Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> * Add pp plan to OLMo 1 & 2 Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> * Add pp plan to Phi and Phi 3 Signed-off-by: Harry 
Mellor <19981378+hmellor@users.noreply.github.com> * Add pp plan for Qwen 2, 2 MoE, 2 VL and 2.5 VL Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> * Add pp plan for Starcoder 2 Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> * Add enum for accessing inputs and outputs Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> * Update type hints to use tuples Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> * Change outer list to tuple Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> --------- Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-02-12 10:51:48 +01:00
Fanli Lin	11afab19c0	[docs] update awq doc (#36079 ) * update awq doc * Update docs/source/en/quantization/awq.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/quantization/awq.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/quantization/awq.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/quantization/awq.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * add note for inference --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-02-11 10:35:28 -08:00

18046 Commits