* no image
* test
* revert jax version updates
* make fixup
* update autodoc path for model_addition_debugger
* shieldgemma2
* add missing pages to toctree
* draft of model tracer visualiser
* add context manager in addition to decorator
* add debug utils to init
* move model debugging utils to dedicated file
* add documentation
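The commits above describe a model-addition debugger exposed both as a decorator and as a context manager. A minimal sketch of that dual-interface pattern, assuming nothing about the actual transformers API (all names here are illustrative, not the real debug utils):

```python
from contextlib import contextmanager

# Illustrative pattern only, not the real transformers debug utils: a single
# tracing helper usable either as a context manager or as a decorator.
@contextmanager
def trace_forward(tag: str):
    print(f"enter {tag}")
    try:
        yield
    finally:
        print(f"exit {tag}")

def traced(tag: str):
    def wrap(fn):
        def inner(*args, **kwargs):
            with trace_forward(tag):  # the decorator reuses the context manager
                return fn(*args, **kwargs)
        return inner
    return wrap
```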
* protect some imports
* format
* move and protect imports
* format
* doc: improve errors in case of broken dummy imports.
* format
* use automatic torch backend
* update doc
* fix backend
* (TEMP) move to dummies while waiting on the backend
* update documentation
* doc
* add prompt depth anything model by modular transformer
* add prompt depth anything docs and imports
* update code style according transformers doc
* update code style: import order issue is fixed by custom_init_isort
* fix depth shape from (B, 1, H, W) to (B, H, W), which is the same as Depth Anything
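For illustration, the shape change above is a single channel-dimension squeeze (tensor names and sizes here are illustrative):

```python
import torch

# A (B, 1, H, W) depth map squeezed to (B, H, W), matching the
# Depth Anything convention mentioned above.
depth = torch.rand(2, 1, 480, 640)   # hypothetical batch of depth maps
depth = depth.squeeze(1)             # drop the channel dim -> (2, 480, 640)
assert depth.shape == (2, 480, 640)
```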
* move prompt depth anything to vision models in _toctree.yml
* update backbone test; there is no need for resnet18 backbone test
* update init file & pass RUN_SLOW tests
* update len(prompt_depth) to prompt_depth.shape[0]
Co-authored-by: Joshua Lochner <admin@xenova.com>
* fix torch_int/model_doc
* fix typo
* update PromptDepthAnythingImageProcessor
* fix typo
* fix typo for prompt depth anything doc
* update promptda overview image link of huggingface repo
* fix some typos in promptda doc
* Update image processing to include pad_image, prompt depth position, and related explanations for better clarity and functionality.
* add copy disclaimer for prompt depth anything image processing
* fix some format typos in image processing and conversion scripts
* fix nn.ReLU(False) to nn.ReLU()
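`nn.ReLU`'s only constructor argument is `inplace`, which already defaults to `False`, so the two spellings are equivalent:

```python
import torch.nn as nn

# nn.ReLU(False) passes inplace=False explicitly, which is the default,
# so nn.ReLU() expresses the same module more idiomatically.
assert nn.ReLU(False).inplace == nn.ReLU().inplace == False
```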
* rename residual layer as it's a sequential layer
* move size compute to a separate line/variable for easier debug in modular prompt depth anything
* fix modular format for prompt depth anything
* update modular prompt depth anything
* fix scale to meter and wrap some internal funcs
* fix code style in image_processing_prompt_depth_anything.py
* fix issues in image_processing_prompt_depth_anything.py
* fix issues in image_processing_prompt_depth_anything.py
* fix issues in prompt depth anything
* update conversion script to be similar to mllama
* update testing for modeling prompt depth anything
* update testing for image_processing_prompt_depth_anything
* fix assertion in image_processing_prompt_depth_anything
* Update src/transformers/models/prompt_depth_anything/modular_prompt_depth_anything.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update src/transformers/models/prompt_depth_anything/modular_prompt_depth_anything.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update docs/source/en/model_doc/prompt_depth_anything.md
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update docs/source/en/model_doc/prompt_depth_anything.md
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* update some testing
* fix testing
* fix
* add return doc for forward of prompt depth anything
* Update src/transformers/models/prompt_depth_anything/modular_prompt_depth_anything.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update tests/models/prompt_depth_anything/test_modeling_prompt_depth_anything.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* fix prompt depth order
* fix format for testing prompt depth anything
* fix minor issues in prompt depth anything doc
* fix format for modular prompt depth anything
* revert format for modular prompt depth anything
* revert format for modular prompt depth anything
* update format for modular prompt depth anything
* fix parallel testing errors
* fix doc for prompt depth anything
* Add header
* Fix imports
* Licence header
---------
Co-authored-by: Joshua Lochner <admin@xenova.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Disable inductor config setter by default
This is hard to debug and should be off by default
* remove default settings in autoquant too
* Add info to torchao.md about recommended settings
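A hedged sketch of the torchao usage these commits document; the checkpoint is a placeholder, and `int4_weight_only` is one of the schemes torchao supports (torchao must be installed):

```python
import torch
from transformers import AutoModelForCausalLM, TorchAoConfig

# Sketch only: quantize a placeholder checkpoint to int4 weight-only via torchao.
quant_config = TorchAoConfig("int4_weight_only", group_size=128)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B",      # placeholder checkpoint
    torch_dtype=torch.bfloat16,
    device_map="auto",
    quantization_config=quant_config,
)
```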
* satisfying Ruff format
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Just import torch AdamW instead
* Update docs too
* Make AdamW undocumented
* make fixup
* Add a basic wrapper class
* Add it back to the docs
* Just remove AdamW entirely
* Remove some AdamW references
* Drop AdamW from the public init
* make fix-copies
* Cleanup some references
* make fixup
* Delete lots of transformers.AdamW references
* Remove extra references to adamw_hf
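The replacement these commits converge on is a one-line import swap; a minimal sketch (the model is illustrative):

```python
import torch
from torch.optim import AdamW  # replaces the removed transformers.AdamW

model = torch.nn.Linear(10, 2)  # any model's parameters work here
optimizer = AdamW(model.parameters(), lr=5e-5, weight_decay=0.01)
```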
* add support for fast image processors in add-new-model-like
* fix header not found add-fast-image-processor-cli
* Encourage adding fast image processor
* nit
* start improve doc
* update docs
* make requested modifs
* Fix converter
* [Broken] Adds Gemma 3 to Hugging Face Transformers
* Consolidating Config and Processor params across impls
* Sorting out configuration parameters. Adds qk_norm before RoPE. Still not sure if RoPE is right.
* Additional plumbing for CausalLM and ConditionalGeneration variants
* incomplete draft of Orbax conversion script
* More complete checkpoint conversion
* Supporting Gemma 3 1B checkpoints
* Updating RoPE for multiple frequencies
* Adjustments to rotary embedder
* Proof of life for text-only operation
* Updating the conversion script to handle multimodal projection weights
* Fixing text-only conversions
* Cleaner conversion script with multimodal support and a simpler processor
* Additional refactors to the Gemma3Processor
* Simplified Processor to work over text representations
* Updated conversion script to join text and vision embeddings at conversion time
* Logging for debugging
* Update src/transformers/models/gemma2/modeling_gemma2.py
Co-authored-by: Joshua Lochner <admin@xenova.com>
* Removed extraneous Config params
* Switching to fast tokenizer for checkpoint conversions
* isolating siglip for performance testing
* Minor changes for debugging tests against baselines
* Adding average pooling for soft tokens
* Updating processor code to enable simpler embedding interleaving for arbitrary number of images in prompts
* Updating conversion script for ShieldGemma 2 conversion compatibility
* Allow disable_compile to be provided as a kwarg
* Refresh from modular
* Updated conversion script and corrected sliding window
* Fix type mismatch in cache_position (#4)
* Fix dtype (#5)
* Fix type mismatch in cache_position
* Actually fix in the modular file
Co-authored-by: Aritra Roy Gosthipaty <aritra.born2fly@gmail.com>
---------
Co-authored-by: Aritra Roy Gosthipaty <aritra.born2fly@gmail.com>
* fixes for embedding table overflow and missing image_soft_token_mask from Gemma3Processor
* Adding 2D pooling for image embeddings
* Revert "Adding 2D pooling for image embeddings"
This reverts commit 65350cf531.
* Gemma3 average pooling changed from 1D to 2D
* Major refactor to Gemma3MultimodalInputProjection
* Updating Gemma 3 Auto* registrations
* Add option to save Gemma 3 chat template with tokenizer during weights conversion
* Removing unused imports
* Moving out-of-vocab handling from Gemma3Processor to Gemma3ForConditionalGeneration
* Removing duplicate config property
* Removing final logit softcapping and 1-indexing of position ids
* Fixing image processor config and none --> None typo
* Fixing sliding window size for 1B
* Updating image_mean and image_std in Image Processor
* Attention masking changed to lower triangular
* Moving image special tokens to conversion script
* Mirror image processor defaults from conversion script into Gemma3ProcessorKwargs
* Remove special token variables from symbol space
* Moving image soft token mask computation from Gemma3Processor to Gemma3ForConditionalGeneration
* tie lm_head and embedding weights
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>
* Correct tied weights in Gemma3CausalLM
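A minimal sketch of the weight-tying pattern the commits above refer to, assuming conventional `embed_tokens`/`lm_head` names rather than quoting the actual Gemma3 code:

```python
import torch.nn as nn

# Sketch of input/output embedding tying (not the actual Gemma3 code):
# the LM head shares its weight matrix with the token embedding table.
vocab_size, hidden_size = 256_000, 1152  # illustrative dimensions
embed_tokens = nn.Embedding(vocab_size, hidden_size)
lm_head = nn.Linear(hidden_size, vocab_size, bias=False)
lm_head.weight = embed_tokens.weight  # one tensor, two views
assert lm_head.weight.data_ptr() == embed_tokens.weight.data_ptr()
```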
* iterative bidirectional attention
* resolving merge conflicts
* Reverting to Gemma 2 HybridCache with sliding window support and a sliding_window_pattern of 6
* Correcting RoPE scaling
* clean up first pass, dummy model generation works
* final clean up before fixing tests
* causal lm test works, so fine
* Fix conversion
* Update src/transformers/models/gemma3/processing_gemma3.py
* model tests are happy
* processor tests are happy
* image processing tests added
* fixup
* Fix pre-processing in conversion
* Inputs merging
* Do not normalize vision embeddings
* Apply Ryan's (and team) changes to attention
* token type ids + mask
* template
* move embed scale, add rope scale, fix tests
* Add chat template to tokenizer
* Use prefix for causal model loading
* use existing code for sliding mask from gemma2
* self.embed_tokens already normalizes
* Correcting Gemma3TextConfig parameters in conversion script
* typo, modular overwrites my fixes
* enable device map for text model
* Conversion updates
* ultra nit: no einsums
* update image token
* copy deepcopy config + some docs
* add some test, still WIP
* Refactoring --include_chat_template logic in converter
* Update src/transformers/models/gemma3/modular_gemma3.py
Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>
* Add eos tokens for instruct models
* dump so i can work on dgx
* Removing add_bos by default
* dump
* add fast im proc
* docs for PaS + fixup
* another fixup
* one more fixup
* fix tests
* Inverting prior BOS change
* ultra nit
* Reverting to Tokenizer saved with add_bos_token=True and chat template starting with BOS
* resize embeds, remove sqrt, add slow test outputs
* FA2 but quality is meh
* nit
* skip FA2, no idea what happened
* last bit for green CI
* please, green CI for docs
* T_T
* Fix for Gemma3 logits
* Support both options for system prompt
* Update src/transformers/models/gemma3/image_processing_gemma3_fast.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update docs/source/en/model_doc/gemma3.md
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update docs/source/en/model_doc/gemma3.md
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update docs/source/en/model_doc/gemma3.md
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update docs/source/en/model_doc/gemma3.md
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update docs/source/en/model_doc/gemma3.md
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Docs updates now that assets are live
* Style fixes
---------
Co-authored-by: Joshua Lochner <admin@xenova.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Aritra Roy Gosthipaty <aritra.born2fly@gmail.com>
Co-authored-by: Mayank Chaturvedi <imayank@google.com>
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>
Co-authored-by: raushan <raushan@huggingface.co>
Co-authored-by: Raushan Turganbay <raushan.turganbay@alumni.nu.edu.kz>
Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>
Co-authored-by: Lysandre <hi@lysand.re>
* update
* doc
* update
* Update docs/source/en/gguf.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* fix
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* add swanlab integration
* feat(integrate): add SwanLab as an optional experiment tracking tool in transformers
- Integrated SwanLab into the transformers library as an alternative for experiment tracking.
- Users can now log training metrics, hyperparameters, and other experiment details to SwanLab by setting `report_to="swanlab"` in the `TrainingArguments`.
- Added necessary dependencies and documentation for SwanLab integration.
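A minimal sketch of the usage described in the bullet above (other arguments are shown only to make the call complete; SwanLab must be installed for logging to occur):

```python
from transformers import TrainingArguments

# Enable SwanLab logging as described above.
args = TrainingArguments(
    output_dir="out",
    report_to="swanlab",  # routes metrics and hyperparameters to SwanLab
)
```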
* Fix the spelling error of SwanLabCallback in callback.md
* Apply suggestions from code review
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Fix typo in comment
* Fix typo in comment
* Fix typos and update comments
* fix annotation
* chore: opt some comments
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: AAssets <20010618@qq.com>
Co-authored-by: ZeYi Lin <944270057@qq.com>
Co-authored-by: KAAANG <79990647+SAKURA-CAT@users.noreply.github.com>
* initial commit
* small fix
* move stuff to image processing file
* remove stuff in validate turn and fix return tensor
* remove liquid stuff
* in the process of addressing comments
* changes to get the right tokenization
* new __init__ works
* fixing default std and mean
* works
* small testing script -- to be deleted before merge
* remove redundant code
* addressing comments
* fix inits, add docs templates
* refactor processor, switch to gotocr image processor
* remove image proc from init
* refactor to working llava-style architecture
* Change AyaVisionModel to AyaVisionForConditionalGeneration
* add tests
* fixups
* update doc
* Adding logits_to_keep explicitly in ayavision forward to enable compatibility with cohere model
* better variable names + remove code paths
* Updates to aya_vision.md
* address comments
* adding copied from
* make style and remove unused projector_hidden_act from config
* sort init
* include usage of fast image proc and proc on cuda in doc
* update checkpoint in test processor
* update checkpoint in test processor 2
* remove test_model and update docstring
* skip failing tests
---------
Co-authored-by: Saurabh Dash <saurabh@cohere.com>
Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
* refactor image processor slow got ocr
* add working image processor fast
* fix fast image processor, update doc
* use one big loop for processing patches
* decompose chat template docs
* add docs
* update model docs
* qwen2-5
* pixtral
* remove old chat template
* also video as list frames supported
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* remove audio for now
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Add implementation for DataCollatorForMultipleChoice based on docs.
* Add DataCollatorForMultipleChoice to import structure.
* Remove custom DataCollatorForMultipleChoice implementations from example scripts.
* Remove custom implementations of DataCollatorForMultipleChoice from docs in English, Spanish, Japanese and Korean.
* Refactor torch version of DataCollatorForMultipleChoice to be more easily understandable.
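A hedged sketch of the resulting usage, assuming the collator is exported from `transformers` after this change and accepts a tokenizer like the per-script implementations it replaces:

```python
# Assumption: DataCollatorForMultipleChoice is importable from transformers
# after this change and takes a tokenizer, mirroring the removed doc examples.
from transformers import AutoTokenizer, DataCollatorForMultipleChoice

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
collator = DataCollatorForMultipleChoice(tokenizer=tokenizer)
```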
* Apply suggested changes and run make fixup.
* fix copies, style and fixup
* add missing documentation
* nits
* fix docstring
* style
* nits
* isort
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
* add RAdamScheduleFree optimizer
* revert schedulefree version to the minimum requirement
* refine is_schedulefree_available so that it can take min_version
* refine documents
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* update awq doc
* Update docs/source/en/quantization/awq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/awq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/awq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/awq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* add note for inference
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* implement config and model building blocks
* refactor model architechture
* update model outputs
* update init param to include use_fov_model
* update param name in config
* fix hidden_states and attentions outputs for fov
* sort config
* complete minor todos
* update patching
* update config for encoder
* fix config
* use correct defaults in config
* update merge for compatibility with different image size
* restructure encoder for custom configuration
* make fov model compatible with custom config
* replace word "decoder" with "fusion"
* weight conversion script
* fix fov squeeze
* update conversion script (without test)
* upload ruff image processing
* create fast image processing
* use torch interpolation for image processing
* complete post_process_depth_estimation
* config: fix imports and sort args
* apply inference in weight conversion
* use mllama script instead for weight conversion
* clean weight conversion script
* add depth-pro status in other files
* fill docstring in config
* formatting
* more formatting
* formatting with ruff
* formatting with style
* fix copied classes
* add examples; update weight convert script
* fix using check_table.py and isort
* fix config docstring
* add depth pro to sdpa docs
* undo unintentional changes in configuration_gemma.py
* minor fixes
* test image processing
* fixes and tests
* more fixes
* use output states from image_encoder instead
* Revert "use output states from image_encoder instead"
This reverts commit 2408ec54e4.
* make embeddings dynamic
* reshape output hidden states and attentions as part of computation graph
* fix ruff formatting
* fix docstring failure
* use num_fov_head_layers in tests
* update doc
* check consistency with config
* ruff formatting
* update test case
* fix ruff formatting
* add tests for fov
* use interpolation in postprocess
* run and fix slow tests locally
* use scaled_images_features for image and fov encoder
* return fused_hidden_states in fusion stage
* fix example
* fix ruff
* fix copyright license for all files
* add __all__ for each file
* minor fixes
- fix "download" spelling
- add push_to_hub option
- fix Optional type hinting
- apply single loop for DepthProImageProcessor.preprocess
* return list in post_process_depth_estimation
* minor fixes
- capitalize start of docstring
- use ignore copy
- fix examples
- move docstring templates and custom output classes to top
- remove "-> None" typehinting from __init__
- type hinting for forward passes
- fix docstrings for custom output classes
* fix "ruff check"
* update upsample and projection
* major changes: (image size and merge optimization)
- add support for images of any size
- optimize merge operation
- remove image_size from config
- use full names instead of B, C, H, W
- remove interpolation from fusion stage
- add interpolation after merge
- move validations to config
- update integration test
- add type hints for functions
* fix push_to_hub option in weights conversion
* remove image_size in weights conversion
* major changes in the architecture
- remove all DepthProViT modules and support different backbones using the AutoModel API
- set default use_fov_model to False
- validate parameters in configuration
- update interpolate function: use "nearest" for faster computation
- update reshape_feature function: remove all special tokens, possible from different backbones
- update merge function: use padding from config instead of merge_out_size
- remove patch_to_batch and batch_to_patch conversions for now
- calculate out_size dynamically in the encoder
- leave head_mask calculation to the backbone
- fix bugs with merge
- add more comments
- update tests
* placeholder for unused config attributes
* improve docs amid review
* minor change in docs
* further optimize merge
* fix formatting
* remove unused patch/batch conversion functions
* use original F.interpolate
* improve function naming
* minor changes
- use torch_int instead of int
- use proper initialization for newly initialized tensors
- use user provided return_dict for patch_encoder
- use if-else block instead in self.use_fov_model
* rearchitect upsample block for improved modularity
* update upsample keys in weight conversion
* improve padding in merge_patches
* use double-loop for merge
* update comments
* create feature_extractor, reduce some forward code
* introduce config.use_mask_token in dinov2
* minor fixes
* minor fixes for onnx
* update __init__ to latest format
* remove DepthProConfig.to_dict()
* major changes in backbone
* update config in weight conversion
* formatting
* converted model is fp32
* improve naming and docs for feature_extractor->reconstruct_feature_maps
* minor fixes; amid review
* create intermediate vars in func call
* use torch.testing.assert_close
* use ModuleList instead of Sequential and ModuleDict
* update docs
* include fov in integration tests
* update docs
* improve initialization of convolution layers
* fix unused fov keys
* update tests
* ruff format
* fix test, amid Kaiming initialization
* add depthpro to toctree
* add residual layer to _no_split_modules
* architecture rework
* Update src/transformers/models/depth_pro/image_processing_depth_pro.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update src/transformers/models/depth_pro/image_processing_depth_pro_fast.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* update docs
* improve merge_patches
* use flatten with fov_output
* ruff formatting
* update resources section in docs
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* fix typo "final_kernal_size"
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* fix output typehint for DepthProDepthEstimator
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* residual operation in 2 steps
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* use image_size instead of global patch_size in interpolation
* replace all Sequential with ModuleList
* update fov
* update heads
* fix and update conversion script for heads
* ruff formatting
* remove float32 conversion
* use "Fov" instead of "FOV" in class names
* use "Fov" instead of "FOV" in config docs
* remove prune_heads
* update fusion stage
* use device in examples
* update processor
* ruff fixes
* add do_rescale in image_processor_dict
* skip test: test_fast_is_faster_than_slow
* ruff formatting
* DepthProImageProcessorFast in other files
* revert antialias removal
* add antialias in BaseImageProcessorFast
* Revert "revert antialias removal"
This reverts commit 5caa0bd8f9.
* Revert "add antialias in BaseImageProcessorFast"
This reverts commit 3ae1134780.
* update processor for grouping and antialias
* try test_fast_is_faster_than_slow without "skip" or "flaky"
* update checkpoint
* update checkpoint
* use @is_flaky for processor test
* update checkpoint to "apple/DepthPro-hf"
---------
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* change cuda to DEVICE
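The pattern behind this doc change is selecting the device once instead of hard-coding "cuda":

```python
import torch

# Pick a device once and reuse it everywhere instead of hard-coding "cuda".
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
inputs = torch.ones(1, 8, dtype=torch.long).to(DEVICE)
```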
* Update docs/source/en/llm_tutorial.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* add init and base image processing functions
* add add_fast_image_processor to transformers-cli
* add working fast image processor clip
* add fast image processor to doc, working tests
* remove "to be implemented" SigLip
* fix unprotected import
* fix unprotected vision import
* update ViTImageProcessorFast
* increase threshold for slow/fast equivalence
* add fast img blip
* add fast class in tests with cli
* improve cli
* add fast image processor convnext
* add LlavaPatchingMixin and fast image processor for llava_next and llava_onevision
* add device kwarg to ImagesKwargs for fast processing on cuda
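A hedged example of the new kwarg: fast (torchvision-backed) image processors can run their transforms on an accelerator. The checkpoint is illustrative, and a CUDA device is assumed to be available:

```python
from PIL import Image
from transformers import AutoImageProcessor

# Sketch only: request the fast processor and run its transforms on CUDA.
processor = AutoImageProcessor.from_pretrained(
    "openai/clip-vit-base-patch32", use_fast=True
)
image = Image.new("RGB", (224, 224))  # stand-in image
inputs = processor(images=image, device="cuda", return_tensors="pt")
```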
* cleanup
* fix unprotected import
* group images by sizes and add batch processing
* Add batch equivalence tests, skip when center_crop is used
* cleanup
* update init and cli
* fix-copies
* refactor convnext, cleanup base
* fix
* remove patching mixins, add piped torchvision transforms for ViT
* fix unbatched processing
* fix f strings
* protect imports
* change llava onevision to class transforms (test)
* fix convnext
* improve formatting (following Pavel review)
* fix handling device arg
* improve cli
* fix
* fix inits
* Add distinction between preprocess and _preprocess, and support for arbitrary kwargs through valid_extra_kwargs
* uniformize qwen2_vl fast
* fix docstrings
* add add fast image processor llava
* remove min_pixels max_pixels from accepted size
* nit
* nit
* refactor fast image processors docstrings
* cleanup and remove fast class transforms
* update add fast image processor transformers cli
* cleanup docstring
* uniformize pixtral fast and make _process_image explicit
* fix prepare image structure llava next/onevision
* Use typed kwargs instead of explicit args
* nit fix import Unpack
* clearly separate pops and gets in base preprocess. Use explicit typed kwargs
* make qwen2_vl preprocess arguments hashable
* initial commit
* encoder+decoder layer changes WIP
* architecture checks
* working version of detection + segmentation
* fix modeling outputs
* fix return dict + output att/hs
* found the position embedding masking bug
* pre-training version
* added image processors
* typo in init.py
* iterupdate set to false
* fixed num_labels in class_output linear layer bias init
* multihead attention shape fixes
* test improvements
* test update
* dab-detr model_doc update
* dab-detr model_doc update2
* test fix: test_retain_grad_hidden_states_attentions
* config file clean and renaming variables
* config file clean and renaming variables fix
* updated convert_to_hf file
* small fixes
* style and quality checks
* return_dict fix
* Merge branch main into add_dab_detr
* small comment fix
* skip test_inputs_embeds test
* image processor updates + image processor test updates
* check copies test fix update
* updates for check_copies.py test
* updates for check_copies.py test2
* tied weights fix
* fixed image processing tests and fixed shared weights issues
* added NumPy ndarray option to the get_Expected_values method in test_image_processing_dab_detr.py
* delete prints from test file
* SafeTensor modification to solve HF Trainer issue
* removing the safetensor modifications
* make fix-copies and HF upload have been added.
* fixed index.md
* fixed repo consistency
* style fix and dabdetrimageprocessor docstring update
* requested modifications after the first review
* Update src/transformers/models/dab_detr/image_processing_dab_detr.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* repo consistency has been fixed
* update copied NestedTensor function after main merge
* Update src/transformers/models/dab_detr/modeling_dab_detr.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* temp commit
* temp commit2
* temp commit 3
* unit tests are fixed
* fixed repo consistency
* updated expected_boxes variable values based on related notebook results in DABDETRIntegrationTests file.
* temporary config modifications and repo consistency fixes
* Put dilation parameter back to config
* pattern embeddings have been added to the rename_keys method
* add dilation comment to config + add as an exception in check_config_attributes SPECIAL CASES
* delete FeatureExtractor part from docs.md
* requested modifications in modeling_dab_detr.py
* [run_slow] dab_detr
* deleted last segmentation code part, updated conversion script and changed the hf path in test files
* temp commit of requested modifications
* temp commit of requested modifications 2
* updated config file, resolved codepaths and refactored conversion script
* updated decoder layer block types and refactored conversion script
* style and quality update
* small modifications based on the request
* attentions are refactored
* removed loss functions from modeling file, added loss function to lossutils, tried to move the MLP layer generation to config but it failed
* deleted imageprocessor
* fixed conversion script + quality and style
* fixed config_att
* [run_slow] dab_detr
* changing model path in conversion file and in test file
* fix Decoder variable naming
* testing the old loss function
* switched back to the new loss function and testing with the old attention functions
* switched back to the new last good result modeling file
* moved back to the version when I asked the review
* missing new line at the end of the file
* old version test
* turn back to newest model version but change image processor
* style fix
* style fix after merge main
* [run_slow] dab_detr
* [run_slow] dab_detr
* added device and type for head bias data part
* [run_slow] dab_detr
* fixed model head bias data fill
* changed test_inference_object_detection_head assertTrues to torch test assert_close
* fixes part 1
* quality update
* self.bbox_embed in decoder has been restored
* changed assertTrue/torch.allclose checks to torch.testing.assert_close
* modelcard markdown file has been updated
* deleted intermediate list from decoder module
---------
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* layernorm_decay_fix
* W293 fix
* ruff format fix
* black format
* ruff format
* erase last layer
* add test_get_parameter_names_rmsnorm
* rmsnorm fix
* First commit
* Finish model implementation
* First commit
* Finish model implementation
* Register zamba2
* generated modeling and configuration
* generated modeling and configuration
* added hybrid cache
* fix attention_mask in mamba
* dropped unused loras
* fix flash2
* config docstrings
* fix config and fwd pass
* make fixup fixes
* text_modeling_zamba2
* small fixes
* make fixup fixes
* Fix modular model converter
* added inheritances in modular, renamed zamba cache
* modular rebase
* new modular conversion
* fix generated modeling file
* fixed import for Zamba2RMSNormGated
* modular file cleanup
* make fixup and model tests
* dropped inheritance for Zamba2PreTrainedModel
* make fixup and unit tests
* Add inheritance of rope from GemmaRotaryEmbedding
* moved rope to model init
* drop del self.self_attn and del self.feed_forward
* fix tests
* renamed lora -> adapter
* rewrote adapter implementation
* fixed tests
* Fix torch_forward in mamba2 layer
* Fix torch_forward in mamba2 layer
* Fix torch_forward in mamba2 layer
* Dropped adapter in-place sum
* removed rope from attention init
* updated rope
* created get_layers method
* make fixup fix
* make fixup fixes
* make fixup fixes
* update to new attention standard
* update to new attention standard
* make fixup fixes
* minor fixes
* cache_position
* removed cache_position postion_ids use_cache
* remove config from modular
* removed config from modular (2)
* import apply_rotary_pos_emb from llama
* fixed rope_kwargs
* Instantiate cache in Zamba2Model
* fix cache
* fix @slow decorator
* small fix in modular file
* Update docs/source/en/model_doc/zamba2.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* several minor fixes
* inherit mamba2decoder fwd and drop position_ids in mamba
* removed docstrings from modular
* reinstate zamba2 attention decoder fwd
* use regex for tied keys
* Revert "use regex for tied keys"
This reverts commit 9007a522b1.
* use regex for tied keys
* add cpu to slow forward tests
* dropped config.use_shared_mlp_adapter
* Update docs/source/en/model_doc/zamba2.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* re-convert from modular
---------
Co-authored-by: root <root@node-2.us-southcentral1-a.compute.internal>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Add the Arabic translation: masked_language_modeling.md
* Update docs/source/ar/tasks/masked_language_modeling.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/masked_language_modeling.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/masked_language_modeling.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/masked_language_modeling.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/masked_language_modeling.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/masked_language_modeling.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/masked_language_modeling.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/masked_language_modeling.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/masked_language_modeling.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/masked_language_modeling.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/masked_language_modeling.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/masked_language_modeling.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/masked_language_modeling.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update _toctree.yml
* Update _toctree.yml
* Add language_modeling.md
* Add sequence_classification.md
* Update _toctree.yml
---------
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* adding blog post to model doc
* Update docs/source/en/model_doc/timm_wrapper.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* review suggestions
* review suggestions
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Initial commit with template code generated by transformers-cli
* Multiple additions to SuperGlue implementation :
- Added the SuperGlueConfig
- Added the SuperGlueModel and its implementation
- Added basic weight conversion script
- Added new ImageMatchingOutput dataclass
* Few changes for SuperGlue
* Multiple changes :
- Added keypoint detection config to SuperGlueConfig
- Completed convert_superglue_to_pytorch and successfully ran inference
* Reverted unintentional change
* Multiple changes :
- Added SuperGlue to a bunch of places
- Divided SuperGlue into SuperGlueForImageMatching and SuperGlueModel
- Added testing images
* Moved things in init files
* Added docs (to be finished depending on the final implementation)
* Added necessary imports and some doc
* Removed unnecessary import
* Fixed make fix-copies bug and ran it
* Deleted SuperGlueModel
Fixed convert script
* Added SuperGlueImageProcessor
* Changed SuperGlue to support batching pairs of images and modified ImageMatchingOutput accordingly
* Changed convert_superglue_to_hf.py script to experiment with different ways of reading an image and see their impact on performance
* Added initial tests for SuperGlueImageProcessor
* Added AutoModelForImageMatching in missing places and tests
* Fixed keypoint_detector_output instructions
* Fix style
* Adapted to latest main changes
* Added integration test
* Fixed bugs to pass tests
* Added keypoints returned by keypoint detector in the output of SuperGlue
* Added doc to SuperGlue
* SuperGlue returning all attention and hidden states for a fixed number of keypoints
* Make style
* Changed SuperGlueImageProcessor tests
* Revert "SuperGlue returning all attention and hidden states for a fixed number of keypoints"
Changed tests accordingly
This reverts commit 5b3b669c
* Added back hidden_states and attentions masked outputs with tests
* Renamed ImageMatching occurences into KeypointMatching
* Changed SuperGlueImageProcessor to raise error when batch_size is not even
* Added docs and clarity to hidden state and attention grouping function
* Fixed some code and done refactoring
* Fixed typo in SuperPoint output doc
* Fixed some of the formatting and variable naming problems
* Removed useless function call
* Removed AutoModelForKeypointMatching
* Fixed SuperGlueImageProcessor to only accept pairs of images
* Added more fixes to SuperGlueImageProcessor
* Simplified the batching of attention and hidden states
* Simplified stack functions
* Moved attention instructions into class
* Removed unused do_batch_norm argument
* Moved weight initialization to the proper place
* Replaced deepcopy for instantiation
* Fixed small bug
* Changed from stevenbucaille to magic-leap repo
* Renamed London Bridge images to Tower Bridge
* Fixed formatting
* Renamed remaining "london" to "tower"
* Apply suggestions from code review
Small changes in the docs
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Added AutoModelForKeypointMatching
* Changed images used in example
* Several changes to image_processing_superglue and style
* Fixed resample type hint
* Changed SuperGlueImageProcessor and added test case for list of 2 images
* Changed list_of_tuples implementation
* Fix in dummy objects
* Added normalize_keypoint, log_sinkhorn_iterations and log_optimal_transport docstring
* Added missing docstring
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Moved forward block at bottom
* Added docstring to forward method
* Added docstring to match_image_pair method
* Changed test_model_common_attributes to test_model_get_set_embeddings test method signature
* Removed AutoModelForKeypointMatching
* Removed image fixtures and added load_dataset
* Added padding of images in SuperGlueImageProcessor
* Cleaned up convert_superglue_to_hf script
* Added missing docs and fixed unused argument
* Fixed SuperGlueImageProcessor tests
* Transposed all hidden states from SuperGlue to reflect the standard (..., seq_len, feature_dim) shape
* Added SuperGlueForKeypointMatching back to modeling_auto
* Fixed image processor padding test
* Changed SuperGlue docs
* changes:
- Abstraction to batch, concat and stack of inconsistent tensors
- Changed conv1d's to linears to match standard attention implementations
- Renamed all tensors to be tensor0 and not tensor_0 and be consistent
- Changed match image pair to run keypoint detection on all images first, create batching tensors, and then fill these tensors match after match
- Various changes in docs, etc
* Changes to SuperGlueImageProcessor:
- Reworked the input image pairs checking function and added tests accordingly
- Added Copied from statements
- Added do_grayscale tag (also for SuperPointImageProcessor)
- Misc changes for better code
* Formatting changes
* Reverted conv1d to linear conversion because of numerical differences
* fix: changed some code to be more straightforward (e.g. filtering keypoints) and converted plot from opencv to matplotlib
* fix: removed unnecessary test
* chore: removed commented code and added back hidden states transpositions
* chore: changed from "inconsistent" to "ragged" function names as suggested
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* docs: applied suggestions
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* docs: updated to display matched output
* chore: applied suggestion for check_image_pairs_input function
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* chore: changed check_image_pairs_input function name to validate_and_format_image_pairs and used validate_preprocess_arguments function
* tests: simplified tests for image input format and shapes
* feat: converted SuperGlue's use of Conv1d with kernel_size of 1 with Linear layers. Changed tests and conversion script accordingly
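The Conv1d-to-Linear swap works because a kernel-size-1 convolution is a per-position affine map, which is exactly what `nn.Linear` computes over the last dimension; a small equivalence check:

```python
import torch
import torch.nn as nn

# Conv1d(kernel_size=1) and Linear apply the same per-position affine map;
# only the tensor layout differs (channels-first vs. channels-last).
conv = nn.Conv1d(64, 128, kernel_size=1)
linear = nn.Linear(64, 128)
linear.weight.data = conv.weight.data.squeeze(-1)  # (128, 64, 1) -> (128, 64)
linear.bias.data = conv.bias.data

x = torch.randn(2, 64, 10)                          # (batch, channels, length)
out_conv = conv(x)                                  # (2, 128, 10)
out_linear = linear(x.transpose(1, 2)).transpose(1, 2)
torch.testing.assert_close(out_conv, out_linear)
```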
* feat: several changes to address comments
Conversion script:
- Reverted fuse batchnorm to linear conversion
- Changed all 'nn.Module' to respective SuperGlue models
- Changed conversion script to use regex mapping and match other recent scripts
Modeling SuperGlue:
- Added batching with mask and padding to attention
- Removed unnecessary concat, stack and batch ragged pairs functions
- Reverted batchnorm layer
- Renamed query, key, value and merge layers into q, k, v, out proj
- Removed Union of different Module into nn.Module in _init_weights method typehint
- Changed several methods' signatures to combine image0 and image1 inputs, with appropriate doc changes
- Updated SuperGlue's doc with torch.no_grad()
Updated test to reflect changes in SuperGlue model
* refactor: changed validate_and_format_image_pairs function with clarity
* refactor: changed from one SuperGlueMLP class to a list of SuperGlueMLP class
* fix: fixed forgotten init weight change from last commit
* fix: fixed rebase mistake
* fix: removed leftover commented code
* fix: added typehint and changed some of arguments default values
* fix: fixed attribute default values for SuperGlueConfig
* feat: added SuperGlueImageProcessor post process keypoint matching method with tests
* fix: fixed SuperGlue attention and hidden state tuples aggregation
* chore: fixed mask optionality and reordered tensor reshapes to be cleaner
* chore: fixed docs and error message returned in validate_and_format_image_pairs function
* fix: fixed returned keypoints to be the ones that SuperPoint returns
* fix: fixed check on number of image sizes for post process compared to the pairs in outputs of SuperGlue
* fix: fixed check on number of image sizes for post process compared to the pairs in outputs of SuperGlue (bis)
* fix: Changed SuperGlueMultiLayerPerceptron instantiation to avoid if statement
* fix: Changed convert_superglue_to_hf script to reflect latest SuperGlue changes and got rid of nn.Modules
* WIP: implement Attention from an existing class (like BERT)
* docs: Changed docs to include more appealing matching plot
* WIP: Implement Attention
* chore: minor typehint change
* chore: changed convert superglue script by removing all classes and apply conv to linear conversion in state dict + rearrange keys to comply with changes in model's layers organisation
* Revert "Fixed typo in SuperPoint output doc"
This reverts commit 2120390e82.
* chore: added comments in SuperGlueImageProcessor
* chore: changed SuperGlue organization HF repo to magic-leap-community
* [run-slow] refactor: small change in layer instantiation
* [run-slow] chore: replaced remaining stevenbucaille org to magic-leap-community
* [run-slow] chore: make style
* chore: update image matching fixture dataset HF repository
* [run-slow] superglue
* tests: overwriting test_batching_equivalence
* [run-slow] superglue
* tests: changed test to cope with value changing depending on cuda version
* [run-slow] superglue
* tests: changed matching_threshold value
* [run-slow] superglue
* [run-slow] superglue
* tests: changed tests for integration
* [run-slow] superglue
* fix: Changed tensor view and permutations to match original implementation results
* fix: updated convert script and integration test to include last change in model
* fix: increase tolerance for CUDA variances
* Apply suggestions from code review
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* [run-slow] superglue
* chore: removed blank whitespaces
* [run-slow] superglue
* Revert SuperPoint image processor accident changes
* [run-slow] superglue
* refactor: reverted copy from BERT class
* tests: lower the tolerance in integration tests for SuperGlue
* [run-slow] superglue
* chore: set do_grayscale to False in SuperPoint and SuperGlue image processors
* [run-slow] superglue
* fix: fixed imports in SuperGlue files
* chore: changed do_grayscale SuperGlueImageProcessing default value to True
* docs: added typehint to post_process_keypoint_matching method in SuperGlueImageProcessor
* fix: set matching_threshold default value to 0.0 instead of 0.2
* feat: added matching_threshold to post_process_keypoint_matching method
* docs: update superglue.md to include matching_threshold parameter
* docs: updated SuperGlueConfig docstring for matching_threshold default value
* refactor: removed unnecessary parameters in SuperGlueConfig
* fix: changed from matching_threshold to threshold
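A hedged sketch of the post-processing flow these commits build up; the checkpoint name and the call signature are inferred from the commit messages above (threshold renamed from matching_threshold), so treat them as assumptions rather than the authoritative API:

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModel

# Sketch only: match one pair of stand-in images and post-process the result.
ckpt = "magic-leap-community/superglue_outdoor"  # repo name per these commits
processor = AutoImageProcessor.from_pretrained(ckpt)
model = AutoModel.from_pretrained(ckpt)

pair = [Image.new("RGB", (640, 480)), Image.new("RGB", (640, 480))]
inputs = processor(pair, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

sizes = [[(im.height, im.width) for im in pair]]
matches = processor.post_process_keypoint_matching(outputs, sizes, threshold=0.2)
```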
* fix: re-revert changes to make SuperGlue attention classes copies of BERT
* [run-slow] superglue
* fix: added missing device argument in post_processing method
* [run-slow] superglue
* fix: add matches different from -1 to compute valid matches in post_process_keypoint_matching (and docstring)
* fix: add device to image_sizes tensor instantiation
* tests: added checks on do_grayscale test
* chore: reordered and added Optional typehint to KeypointMatchingOutput
* LightGluePR suggestions:
- use `post_process_keypoint_matching` as default docs example
- add `post_process_keypoint_matching` in autodoc
- add `SuperPointConfig` import under TYPE_CHECKING condition
- format SuperGlueConfig docstring
- add device in convert_superglue_to_hf
- Fix typo
- Fix KeypointMatchingOutput docstring
- Removed unnecessary line
- Added missing SuperGlueConfig in __init__ methods
* LightGluePR suggestions:
- use batching to get keypoint detection
* refactor: processing images done in 1 for loop instead of 4
* fix: use @ instead of torch.einsum for scores computation
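The two score computations are equivalent; `@` just avoids parsing an einsum string:

```python
import torch

# Equivalent attention-score computations (shapes are illustrative).
q = torch.randn(2, 8, 100, 64)  # (batch, heads, keypoints, dim)
k = torch.randn(2, 8, 100, 64)
scores_einsum = torch.einsum("bhnd,bhmd->bhnm", q, k)
scores_matmul = q @ k.transpose(-2, -1)
torch.testing.assert_close(scores_einsum, scores_matmul)
```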
* style: added `# fmt: skip` to long tensor values
* refactor: rollbacked validate_and_format_image_pairs valid and invalid case to more simple ones
* refactor: prepare_imgs
* refactor: simplified `validate_and_format_image_pairs`
* docs: fixed doc
---------
Co-authored-by: steven <steven.bucaillle@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Steven Bucaille <steven.bucaille@buawei.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Convert more checkpoints
* Update docs, convert huge variant
* Update model name
* Update src/transformers/models/vitpose/modeling_vitpose.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Remove print statements
* Update docs/source/en/model_doc/vitpose.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Link to collection
---------
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Add input ids to model output
* Add text preprocessing for processor
* Fix snippet
* Add test for equivalence
* Add type checking guard
* Fixing typehint
* Fix test for added `input_ids` in output
* Add deprecations and "text_labels" to output
* Adjust tests
* Fix test
* Update code examples
* Minor docs and code improvement
* Remove one-liner functions and rename class to CamelCase
* Update docstring
* Fixup
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
Enhanced installation section with troubleshooting, GPU setup, and OS-specific details.
* Update README.md
Enhanced installation section with troubleshooting, GPU setup, and OS-specific details.
* Update installation.md
Updated installation.md to include virtual environment and GPU setup instructions.
* Update installation.md
Updated installation.md to include virtual environment and GPU setup instructions.
* Update installation.md
Updated installation.md to include virtual environment, troubleshooting and GPU setup instructions.
* Update installation.md
* Update installation.md
* Update installation.md
* Update installation.md
Updated installation.md to include virtual environment, troubleshooting functions and GPU setup instructions.
* Update installation.md
Updated installation.md to include virtual environment, troubleshooting functions and GPU setup instructions.
* Update installation.md
Updated installation.md to include virtual environment, troubleshooting functions and GPU setup instructions.
* Update README.md
Removed numbering from README.md.
* Update README.md
Removed unnecessary "a)" formatting as per maintainer feedback.
* Update README.md
Added blank lines around code snippets for better readability.
* Update README.md
Removed the line "b) Install a backend framework:" from README.md as per feedback.
* Update README.md
Simplified "For Windows:" to "Windows" in README.md as per feedback as well as "For macOS/Linux:" to "macOS/Linux"
* Update README.md
Removed unnecessary heading and retained valid code snippet.
* Update README.md
Removed unnecessary heading "d) Optional: Install from source for the latest updates" as per feedback.
* Update README.md
Removed "GPU Setup (Optional)" section to align with minimal design feedback.
* Update installation.md
Removed "Create and Activate a Virtual Environment" section from installation.md as per feedback.
* Update installation.md
Adjusted "Troubleshooting" to a second-level heading and added an introductory line as per feedback.
* Update installation.md
Updated troubleshooting section with simplified headings and formatted code blocks as per feedback.
* Update installation.md
Integrated GPU setup instructions into the "Install with pip" section for better content flow.
* Update README.md
Removed Troubleshooting section from README.md for minimalism as per maintainer feedback.
* Update torchao.md: use auto-compilation
* Update torchao.md: indicate updating transformers to the latest
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Add the helium model.
* Add a missing helium.
* And add another missing helium.
* Use float for the rmsnorm mul.
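A sketch of the pattern that commit describes, not Helium's actual code: do the normalization and the weight multiplication in float32, then cast back to the input dtype:

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """Illustrative RMSNorm with the mul kept in float32 for stability."""

    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(dim))
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        dtype = x.dtype
        x = x.to(torch.float32)
        x = x * torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps)
        return (self.weight.float() * x).to(dtype)  # float mul, cast back last
```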
* Add the Helium tokenizer converter.
* Add the pad token as suggested by Arthur.
* Update the RMSNorm + some other tweaks.
* Fix more rebase issues.
* fix copies and style
* fixes and add helium.md
* add missing tests
* update the backlink
* oups
* style
* update init, and expected results
* small fixes
* match test outputs
* style fixup, fix doc builder
* add dummies and we should be good to go!
* update sdpa and fa2 documentation
---------
Co-authored-by: laurent <laurent.mazare@gmail.com>
* Create token_classification.md
* Update token_classification.md
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update _toctree.yml
---------
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* model can convert to HF and be loaded back
* nit
* works in single batch generation but hallucinates
* use the image tokens
* add image generation
* now it works
* add tests
* update
* add modular but it doesn't work for porting docstrings :(
* skip some tests
* add slow tests
* modular removed the import?
* guess this works
* update
* update
* fix copies
* fix test
* fix copies
* update
* docs
* fix tests
* last fix tests?
* pls
* repo consistency
* more style
* style
* remove file
* address comments
* tiny bits
* update after the new modular
* fix tests
* add one more cond in check attributes
* decompose down/up/mid blocks
* allow static cache generation in VLMs
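A hedged example of the generate-time option this commit enables, shown with a small text checkpoint whose architecture supports a static cache (the model id is illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Sketch only: request a static (pre-allocated) KV cache at generation time.
tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-135M")
model = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM2-135M")
inputs = tokenizer("Hello", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=8, cache_implementation="static")
```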
* nit
* fix copies
* Update docs/source/en/model_doc/emu3.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/emu3.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/emu3.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/emu3.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/emu3.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/emu3.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/emu3.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/emu3.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* fix VAE upsampling
* Update src/transformers/models/emu3/modular_emu3.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* address comments
* state overwritten stuff explicitly
* fix copies
* add the flag for flex attn
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* bug fixes
* organize imports
* wrap cpu warning in reference_compile
* Avoid needing repad_logits_with_grad, always repad with grads when training
I'm not 100% sure that the conditional with "or labels is None" makes sense, though - not sure what the intention is there. Perhaps we can remove it?
* Revert "Avoid needing repad_logits_with_grad, always repad with grads when training"
This reverts commit cedcb4e89b.
* Fix grammar: keep -> keeps
* Propagate grammar fix with modular_model_converter
---------
Co-authored-by: Tom Aarsen <Cubiegamedev@gmail.com>
Co-authored-by: Tom Aarsen <37621491+tomaarsen@users.noreply.github.com>
* universal checkpoint
* Update docs/source/en/deepspeed.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Add the Arabic translation: multiple_choice.md
* Update multiple_choice.md
* Update docs/source/ar/tasks/multiple_choice.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update _toctree.yml
* Add files via upload
* Update _toctree.yml
---------
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* add audio_token attribute to proc
* expand input_ids
* handle legacy and expanded input_ids
* test update
* split lines
* add possibility not to provide eos and bos audio tokens
* raise errors
* test incorrect number of audio tokens
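A minimal sketch of the validation idea, with hypothetical names (not the actual processor code):
```python
def validate_audio_tokens(text: str, audio_token: str, num_audios: int) -> None:
    # The number of audio placeholder tokens in the text must match the
    # number of audio inputs; raise a clear error otherwise.
    found = text.count(audio_token)
    if found != num_audios:
        raise ValueError(
            f"Expected {num_audios} '{audio_token}' token(s) in the text, found {found}."
        )
```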
* add example
* fmt
* typo
* first adding diffllama
* add Diff Attention and other parts, but still with errors
* complete making attention Diff-Attention
* fix some bugs which may be caused by transformer-cli while adding model
* fix a bug caused by forgetting KV cache...
* Update src/transformers/models/diffllama/modeling_diffllama.py
You don't need to divide by 2 if we use the same number of attention heads as llama; instead you can just split in forward.
Co-authored-by: Minho Ryu <ryumin93@gmail.com>
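A minimal sketch of the suggested approach, with illustrative names (not the actual DiffLlama code):
```python
import torch

def split_heads_for_diff_attention(query_states, num_heads, head_dim):
    # Keep the same number of heads as llama in the projections and split
    # into the two differential-attention groups only at forward time.
    batch, seq_len, _ = query_states.shape
    q = query_states.view(batch, seq_len, num_heads, head_dim).transpose(1, 2)
    q1, q2 = torch.chunk(q, 2, dim=1)  # two groups of num_heads // 2 heads each
    return q1, q2
```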
* Update src/transformers/models/diffllama/modeling_diffllama.py
fit to the changed "num_heads // 2" placement
Co-authored-by: Minho Ryu <ryumin93@gmail.com>
* Update src/transformers/models/diffllama/modeling_diffllama.py
new code is more meaningful than before
Co-authored-by: Minho Ryu <ryumin93@gmail.com>
* Update src/transformers/models/diffllama/modeling_diffllama.py
new code is more meaningful than before
Co-authored-by: Minho Ryu <ryumin93@gmail.com>
* Update src/transformers/models/diffllama/modeling_diffllama.py
fit to the changed "num_heads // 2" placement
Co-authored-by: Minho Ryu <ryumin93@gmail.com>
* Update src/transformers/models/diffllama/modeling_diffllama.py
fix dividing twice by sqrt(self.head_dim)
Co-authored-by: Minho Ryu <ryumin93@gmail.com>
* Update src/transformers/models/diffllama/modeling_diffllama.py
fix dividing twice by sqrt(self.head_dim)
Co-authored-by: Minho Ryu <ryumin93@gmail.com>
* Update src/transformers/models/diffllama/modeling_diffllama.py
fit to the changed "num_heads // 2" placement,
and make it more visible
Co-authored-by: Minho Ryu <ryumin93@gmail.com>
* I found the Attention was mis-implemented from the paper, still at e072544a3b.
* re-implemented
* adding groupnorm
Co-authored-by: Minho Ryu <ryumin93@gmail.com>
* align with transformers code style
Co-authored-by: Minho Ryu <ryumin93@gmail.com>
* fix typo
Co-authored-by: Minho Ryu <ryumin93@gmail.com>
* adding groupnorm
Co-authored-by: Minho Ryu <ryumin93@gmail.com>
* change SdpaAttention to DiffSdpaAttention
Co-authored-by: Minho Ryu <ryumin93@gmail.com>
* fix bug
* Update src/transformers/models/diffllama/modeling_diffllama.py
resolve "not same outputs" problem
Co-authored-by: Minho Ryu <ryumin93@gmail.com>
* fix bugs in the placement of "GroupNorm with scale", etc.
* Revert "fix bugs in the placement of "GroupNorm with scale", etc."
This reverts commit 26307d92f6.
* simplify multiple attention (matmul) operations into one by repeating value_states
Co-authored-by: Minho Ryu <ryumin93@gmail.com>
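A minimal sketch of the fused form, assuming illustrative shapes (attn1/attn2: (batch, num_heads // 2, q_len, kv_len); value_states: (batch, num_heads // 2, kv_len, head_dim)):
```python
import torch

def fused_diff_attention_output(attn1, attn2, value_states, lambda_full):
    # Instead of two separate matmuls combined afterwards, stack the weights
    # and repeat the values so a single matmul covers both groups.
    attn_weights = torch.cat([attn1, attn2], dim=1)
    values = value_states.repeat(1, 2, 1, 1)
    out = torch.matmul(attn_weights, values)
    out1, out2 = torch.chunk(out, 2, dim=1)
    return out1 - lambda_full * out2
```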
* remove missed type
* add diffllama model_doc
* apply make style/quality
* apply review comment about model
* apply review comment about test
* place diffllama alphabetically on the src/transformers/__init__.py
* fix forgot code
* Support parameters that are not initialized with standard deviation 0 in the conventional method
* add DiffLlamaConfig to CONFIG_CLASSES_TO_IGNORE_FOR_DOCSTRING_CHECKPOINT_CHECK on utils/check_config_docstrings.py
* remove unused property of config
* add to supported model list
* add to spda supported model list
* fix copyright, remove pretraining_tensor_parallel, and modify for initialization test
* remove unused import and etc.
* empty commit
* empty commit
* empty commit
* apply modular transformers but with bugs
* revert prev commit
* create src/transformers/model/diffllama/modular_diffllama.py
* run utils/modular_model_converter.py
* empty commit
* leaner modular diffllama
* remove more and more in modular_diffllama.py
* resolve missing docstring entries
* force reset
* convert modular
---------
Co-authored-by: Minho Ryu <ryumin93@gmail.com>
* Add French translation of task_summary and tasks_explained
---------
Co-authored-by: Aymeric Roucher <69208727+aymeric-roucher@users.noreply.github.com>
* Add the Arabic translation: summarization.md
* Update docs/source/ar/tasks/summarization.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update _toctree.yml
---------
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Add the Arabic translation: question_answering.md
* Update question_answering.md
* Update docs/source/ar/tasks/question_answering.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update _toctree.yml
---------
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Improve modular transformers documentation
- Adds hints to general contribution guides
- Lists which utils scripts are available to generate single-files from modular files and check their content
* Show commands in copyable code cells
---------
Co-authored-by: Joel Koch <joel@bitcrowd.net>
* initial cut of modernbert for transformers
* small bug fixes
* fixes
* Update import
* Use compiled mlp->mlp_norm to match research implementation
* Propagate changes in modular to modeling
* Replace duplicate attn_out_dropout in favor of attention_dropout
cc @warner-benjamin let me know if the two should remain separate!
* Update BOS to CLS and EOS to SEP
Please confirm @warner-benjamin
* Set default classifier bias to False, matching research repo
* Update tie_word_embeddings description
* Fix _init_weights for ForMaskedLM
* Match base_model_prefix
* Add compiled_head to match research repo outputs
* Fix imports for ModernBertForMaskedLM
* Just use "gelu" default outright for classifier
* Fix config name typo: initalizer -> initializer
* Remove some unused parameters in docstring. Still lots to edit there!
* Compile the embeddings forward
Not having this resulted in very slight differences - so small they weren't even noticed for the base model, only for the large model.
But the tiny difference for the large model propagated from the embedding layer through the rest of the model, leading to notable differences of ~0.0084 average per value, up to 0.2343 in the worst case.
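A minimal sketch of the idea, with illustrative module names (not the actual ModernBERT code):
```python
import torch
import torch.nn as nn

class Embeddings(nn.Module):
    def __init__(self, vocab_size=50368, hidden_size=768):
        super().__init__()
        self.tok_embeddings = nn.Embedding(vocab_size, hidden_size)
        self.norm = nn.LayerNorm(hidden_size)
        self.drop = nn.Dropout(0.0)

    @torch.compile
    def compiled_embeddings(self, input_ids):
        # Compiling this path keeps the numerics aligned with the compiled
        # research implementation instead of drifting at the first layer.
        return self.drop(self.norm(self.tok_embeddings(input_ids)))

    def forward(self, input_ids):
        return self.compiled_embeddings(input_ids)
```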
* Add drafts for ForSequenceClassification/ForTokenClassification
* Add initial SDPA support (not exactly equivalent to FA2 yet!)
During testing, FA2 and SDPA still differ by about 0.0098 per value in the token embeddings. It still predicts the correct mask fills, but I'd like to get it fully 1-1 if possible.
* Only use attention dropout if training
* Add initial eager attention support (also not equivalent to FA2 yet!)
Frustratingly, I also can't get eager to be equivalent to FA2 (or sdpa), but it does get really close, i.e. avg ~0.010 difference per value.
Especially if I use fp32 for both FA2 & eager: avg ~0.0029 difference per value.
The fill-mask results are good with eager.
* Add initial tests, output_attentions, output_hidden_states, prune_heads
Tests are based on BERT; not all tests pass yet: 23 failed, 79 passed, 100 skipped
* Remove kwargs from ModernBertForMaskedLM
Disable sparse_prediction by default to match the normal HF, can be enabled via config
* Remove/adjust/skip improper tests; warn if padding but no attn mask
* Run formatting etc.
* Run python utils/custom_init_isort.py
* FlexAttention with unpadded sequences (matches FA2 within bf16 numerics)
* Reformat init_weights based on review
* self -> module in attention forwards
* Remove if config.tie_word_embeddings
* Reformat output projection on a different line
* Remove pruning
* Remove assert
* Call contiguous() to simplify paths
* Remove prune_qkv_linear_layer
* Format code
* Keep as kwargs, only use if needed
* Remove unused codepaths & related config options
* Remove 3d attn_mask test; fix token classification tuple output
* Reorder: attention_mask above position_ids, fixes gradient checkpointing
* Fix usage if no FA2 or torch v2.5+
* Make torch.compile/triton optional
Should we rename 'compile'? It's a bit vague
* Separate pooling options into separate functions (cls, mean) - cls as default
* Simplify _pad_modernbert_output, remove unused labels path
* Update tied weights to remove decoder.weight, simplify decoder loading
* Adaptively set config.compile based on hf_device_map/device/resize, etc.
* Update ModernBertConfig docstring
* Satisfy some consistency checks, add unfinished docs
* Only set compile to False if there's more than 1 device
* Add docstrings for public ModernBert classes
* Don't replace docstring returns - ends up being a duplicate
* Fix mistake in toctree
* Reformat toctree
* Patched FlexAttention, SDPA, Eager with Local Attention
* Implement FA2 -> SDPA -> Eager attn_impl defaulting, crucial
both to match the original performance, and to get the highest inference speed without requiring users to manually pick FA2
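A minimal sketch of the fallback chain (not the actual `_autoset_attn_implementation` logic):
```python
import torch
from transformers.utils import is_flash_attn_2_available

def default_attn_implementation() -> str:
    # Take the fastest backend that is actually usable on this setup.
    if is_flash_attn_2_available():
        return "flash_attention_2"
    if hasattr(torch.nn.functional, "scaled_dot_product_attention"):  # torch >= 2.0
        return "sdpa"
    return "eager"
```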
* Patch test edge case with Idefics3 not working with 'attn_implementation="sdpa"'
* Repad all_hidden_states as well
* rename config.compile to reference_compile
* disable flex_attention since it crashes
* Update modernbert.md
* Using dtype min to mask in eager
* Fully remove flex attention for now
It's only compatible with the nightly torch 2.6, so we'll leave it be for now. It's also slower than eager/sdpa.
Also, update compile -> reference_compile in one more case
* Call contiguous to allow for .view()
* Copyright 2020 -> 2024
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update/simplify __init__ structure
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Remove "... if dropout_prob > 0 else identity"
As dropout with 0.0 should be efficient like identity
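A quick check of that claim:
```python
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.0)
x = torch.randn(2, 3)
# With p=0.0 the mask keeps everything and the 1/(1-p) scale is exactly 1.0,
# so dropout is already a no-op; no separate nn.Identity branch is needed.
assert torch.equal(drop(x), x)
```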
* re-use existing pad/unpad functions instead of creating new ones
* remove flexattention method
* Compute attention_mask and local_attention_mask once in modeling
* Simplify sequence classification prediction heads, only CLS now
Users can make custom heads if they feel like it
Also removes the unnecessary pool parameter
* Simplify module.training in eager attn
* Also export ModernBertPreTrainedModel
* Update the documentation with links to finetuning scripts
* Explain local_attention_mask parameter in docstring
* Simplify _autoset_attn_implementation, rely on super()
* Keep "in" to initialize Prediction head
Double-checked with Benjamin that it's correct/what we used for pretraining
* add back mean pooling
* Use the pooling head in TokenClassification
* update copyright
* Reset config._attn_implementation_internal on failure
* Allow optional attention_mask in ForMaskedLM head
* fix failing run_slow tests
* Add links to the paper
* Remove unpad_no_grad, always pad/unpad without gradients
* local_attention_mask -> sliding_window_mask
* Revert "Use the pooling head in TokenClassification"
This reverts commit 99c38badd1.
There was no real motivation, no info on whether having this bigger head does anything useful.
* Simplify pooling, 2 options via if-else
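A minimal sketch of the two options, with illustrative names:
```python
import torch

def pool(hidden_states, attention_mask, method: str = "cls"):
    # hidden_states: (batch, seq_len, hidden); attention_mask: (batch, seq_len)
    if method == "cls":
        return hidden_states[:, 0]
    # mean pooling over non-padding positions
    mask = attention_mask.unsqueeze(-1).to(hidden_states.dtype)
    return (hidden_states * mask).sum(dim=1) / mask.sum(dim=1)
```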
---------
Co-authored-by: Tom Aarsen <37621491+tomaarsen@users.noreply.github.com>
Co-authored-by: Tom Aarsen <Cubiegamedev@gmail.com>
Co-authored-by: Said Taghadouini <taghadouinisaid@gmail.com>
Co-authored-by: Benjamin Clavié <ben@clavie.eu>
Co-authored-by: Antoine Chaffin <ant54600@hotmail.fr>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* docs: fix typo quickstart snippet in ColPali's model card
* docs: clean the ColPali's model card
* docs: make the `ColPaliForRetrieval`'s docstring more concise
* docs: add missing bash command used to convert weights for `vidore/colpali-v1.3-hf`
* initial commit for PR
Co-authored-by: Gabe Goodhart <gabe.l.hart@gmail.com>
* rename dynamic cache
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* add more unit tests
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* add integration test
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* add integration test
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* Add modular bamba file
* Remove trainer changes from unrelated PR
* Modify modular and config to get model running
* Fix some CI errors and beam search
* Fix a plethora of bugs from CI/docs/etc
* Add bamba to models with special caches
* Update to newer mamba PR for mamba sublayer
* fix test_left_padding_compatibility
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* fix style
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* fix remaining tests
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* missed this test
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* ran make style
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* move slow tag to integration obj
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* make style
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* address comments
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* fix modular
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* left out one part of modular
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* change model
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* Make Rotary modular as well
* Update bamba.md
Added overview, updated model inference card, and added config
* Update bamba.md
* Update bamba.md
* Update bamba.md
Minor fixes
* Add docs for config and model back
Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>
* Add warning when using fast kernels
* replaced generate example
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* Address comments from PR
Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>
* Propagate attention fixes
Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>
* Fix attention interfaces to the new API
Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>
* Fix API for decoder layer
Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>
* Remove extra weights
Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>
---------
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>
Co-authored-by: Gabe Goodhart <gabe.l.hart@gmail.com>
Co-authored-by: Antoni Viros i Martin <aviros@ibm.com>
Co-authored-by: divya-kumari32 <72085811+divya-kumari32@users.noreply.github.com>
Co-authored-by: Antoni Viros <ani300@gmail.com>
* Add Cohere2 docs details
* Update docs/source/en/model_doc/cohere2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Add Falcon3 documentation
* Update Falcon3 documentation
* Change Falcon to Falcon3
* Update docs and run make fix-copies
* Add blog post and huggingface models links
* refactor image_processing_auto logic
* fix fast image processor tests
* Fix tests fast vit image processor
* Add safeguard when use_fast True and torchvision not available
* change default use_fast back to None, add warnings
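A minimal sketch of the safeguard (hypothetical helper, not the actual AutoImageProcessor logic):
```python
import warnings
from transformers.utils import is_torchvision_available

def resolve_use_fast(use_fast):
    # Warn and fall back instead of crashing when torchvision is missing.
    if use_fast and not is_torchvision_available():
        warnings.warn(
            "`use_fast=True` requires torchvision; falling back to the slow image processor."
        )
        return False
    return bool(use_fast)  # None keeps the slow processor as the default
```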
* remove debugging print
* call get_image_processor_class_from_name once
* Add files
* Init
* Add TimmWrapperModel
* Fix up
* Some fixes
* Fix up
* Remove old file
* Sort out import orders
* Fix some model loading
* Compatible with pipeline and trainer
* Fix up
* Delete test_timm_model_1/config.json
* Remove accidentally committed files
* Delete src/transformers/models/modeling_timm_wrapper.py
* Remove empty imports; fix transformations applied
* Tidy up
* Add image classification model to special cases
* Create pretrained model; enable device_map='auto'
* Enable most tests; fix init order
* Sort imports
* [run-slow] timm_wrapper
* Pass num_classes into timm.create_model
* Remove train transforms from image processor
* Update timm creation with pretrained=False
* Fix gamma/beta issue for timm models
* Fixing gamma and beta renaming for timm models
* Simplify config and model creation
* Remove attn_implementation diff
* Fixup
* Docstrings
* Fix warning msg text according to test case
* Fix device_map auto
* Set dtype and device for pixel_values in forward
* Enable output hidden states
* Enable tests for hidden_states and model parallel
* Remove default scriptable arg
* Refactor inner model
* Update timm version
* Fix _find_mismatched_keys function
* Change inheritance for Classification model (fix weights loading with device_map)
* Minor bugfix
* Disable save pretrained for image processor
* Rename hook method for loaded keys correction
* Rename state dict keys on save, remove `timm_model` prefix, make checkpoint compatible with `timm`
* Managing num_labels <-> num_classes attributes
* Enable loading checkpoints in Trainer to resume training
* Update error message for output_hidden_states
* Add output hidden states test
* Decouple base and classification models
* Add more test cases
* Add save-load-to-timm test
* Fix test name
* Fixup
* Add do_pooling
* Add test for do_pooling
* Fix doc
* Add tests for TimmWrapperModel
* Add validation for `num_classes=0` in timm config + test for DINO checkpoint
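For context, a small example of what such a checkpoint looks like in timm (illustrative model name):
```python
import timm
import torch

# Backbone-only checkpoints such as DINO are exported with num_classes=0, i.e.
# no classification head; forward then returns pooled features instead of logits,
# so a classification wrapper should validate this up front.
model = timm.create_model("resnet18", pretrained=False, num_classes=0)
features = model(torch.randn(1, 3, 224, 224))
print(features.shape)  # torch.Size([1, 512])
```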
* Adjust atol for test
* Fix docs
* dev-ci
* dev-ci
* Add tests for image processor
* Update docs
* Update init to new format
* Update docs in configuration
* Fix some docs in image processor
* Improve docs for modeling
* fix for is_timm_checkpoint
* Update code examples
* Fix header
* Fix typehint
* Increase tolerance a bit
* Fix Path
* Fixing model parallel tests
* Disable "parallel" tests
* Add comment for metadata
* Refactor AutoImageProcessor for timm wrapper loading
* Remove custom test_model_outputs_equivalence
* Add require_timm decorator
* Fix comment
* Make image processor work with older timm versions and tensor input
* Save config instead of whole model in image processor tests
* Add docstring for `image_processor_filename`
* Sanitize kwargs for timm image processor
* Fix doc style
* Update check for tensor input
* Update normalize
* Remove _load_timm_model function
---------
Co-authored-by: Amy Roberts <22614925+amyeroberts@users.noreply.github.com>
* add "Translating Benchmarks.md to Chinese "
* Removed all the English original text (which was previously kept as comments in the document) and refined some of the Chinese expressions.
* Add docs/source/ar/community.md to Add_docs_source_ar_community.md
* Update community.md
* Update community.md
* Update community.md
* Update _toctree.yml - add community.md
* Update docs/source/ar/community.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Create how_to_hack_models.md
* Create modular_transformers.md
* Create tiktoken.md
* Update _toctree.yml
* Update docs/source/ar/how_to_hack_models.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/modular_transformers.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/tiktoken.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
---------
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* add comment to offloading
* Update docs/source/en/kv_cache.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Fixed typo in multi gpu docs and OLMoE version
* Fixed typos in docs for agents, agents advanced, knowledge distillation, and image feature extraction
* Fixed incorrect usage of model.image_guided_detection in zero shot object detection docs
* explain release_memory
* Update docs/source/en/llm_tutorial_optimization.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Add docs/source/ar/benchmarks.md to Add_docs_source_ar_benchmarks.md
* Update docs/source/ar/benchmarks.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update _toctree.yml
* Update benchmarks.md
---------
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Updated documentation and added conversion utility
* Update docs/source/en/tiktoken.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/tiktoken.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Moved util function to integration folder + allow for str
* Update formatting
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Updated formatting
* style changes
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Add Nemotron GGUF Loading Support
* fix the Nemotron architecture assignment
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Added image-text-to-text pipeline to task guide
* Update docs/source/en/tasks/image_text_to_text.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Merge codeblocks
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* add deformable detr image processor fast
* add fast processor to doc
* fix copies
* nit docstring
* Add tests gpu/cpu and fix docstrings
* fix docstring
* import changes from detr
* fix imports
* rebase and fix
* fix input data format change in detr and rtdetr fast
* Fix post process function called in the instance segmentation example of mask2former
* fix description and additional notes for post_process_instance_segmentation of maskformers
* remove white space in maskformers post_process_instance_segmentation doc
* change image.size[::-1] to height and width for clarity in segmentation examples
* Add model skeleton with transformers-cli add-new-model-like
* Convert config to modular, add rms_norm_eps, delete clip_qkv
* Convert model to modular, add RMSNorm
* Add flash attention with qk norm and no qkv clipping
* Add decoder layer with RMSNorm after attention/feedforward layers
* Add base and causal model
* Add converter improvements from OLMo repo
* Update weight loading in OLMo to HF converter
* Set correct default for rms_norm_eps
* Set correct pipeline_model_mapping in test
* Run make fixup
* Fix model type
* Re-run modular conversion
* Manually set config docs to fix build errors
* Convert olmo-1124 to olmo_1124 to fix flash attention docs errors
* Start updating tests
* Update tests
* Copy upstream test_eager_matches_sdpa_inference_1_bfloat16 changes to olmo_1124
* Rename input_layernorm and post_attention_layernorm to reflect their ops better
* Use correct tokenizer
* Remove test unsupported by GPT2 tokenizer
* Create GenerationConfig outside of from_pretrained call
* Use simpler init file structure
* Add explicit __all__ to support simplified init
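A sketch of the pattern (file and class names are illustrative):
```python
# models/olmo_1124/__init__.py
from .configuration_olmo_1124 import Olmo1124Config
from .modeling_olmo_1124 import Olmo1124ForCausalLM, Olmo1124Model

# An explicit __all__ tells the simplified lazy-import machinery exactly
# which names this module exports.
__all__ = ["Olmo1124Config", "Olmo1124ForCausalLM", "Olmo1124Model"]
```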
* Make safetensor serialization the default
* Update OLMo November 2024 docs
* add XPU path
* use accelerate API
* Update docs/source/en/tasks/semantic_segmentation.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* update more places with accelerate API
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Add docs/source/ar/torchscript.md to Add_docs_source_ar_torchscript.md
* Update docs/source/ar/torchscript.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Merge troubleshooting.md into this branch
* Update _toctree.yml
* Update torchscript.md
* Update troubleshooting.md
---------
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Add docs/source/ar/trainer.md to Add_docs_source_ar_trainer.md
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update trainer.md
* Update trainer.md
* Update trainer.md
* Create _toctree.yml
* Delete docs/source/ar/_toctree.yml
* Update _toctree.yml - add trainer
* Update _toctree.yml
* merge serialization.md into this branch
* merge sagemaker.md into this PR
* Update _toctree.yml
* Update docs/source/ar/trainer.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* kinda works
* update
* add tests
* update
* use special tokens in processors
* typo
* fix copies
* fix
* fix moshi after rebase
* update
* fix tests
* update
* Update docs/source/en/main_classes/tokenizer.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* update docs
* test for load time adding tokens
* fix some more tests which are now fetched better
* one more fix
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Add docs/source/ar/multilingual.md to Add_docs_source_ar_multilingual.md
* Update docs/source/ar/multilingual.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update _toctree.yml
* Update _toctree.yml
* Add translated files to branch for merge
* Update _toctree.yml
* Update _toctree.yml
* Update custom_models.md
* Update chat_templating.md
* Update docs/source/ar/create_a_model.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update create_a_model.md
* Update gguf.md
* Update gguf.md
* Update gguf.md
* Update gguf.md
---------
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* update doc
* Update docs/source/en/perf_train_cpu.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* delete closing tip
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Standardize image-text-to-text-models-output
add post_process_image_text_to_text to chameleon and cleanup
Fix legacy kwarg behavior and deprecation warning
add post_process_image_text_to_text to qwen2_vl and llava_onevision
Add post_process_image_text_to_text to idefics3, mllama, pixtral processor
* nit var name post_process_image_text_to_text udop
* nit fix deprecation warnings
* Add image-text-to-text pipeline
* add support for image url in chat template for pipeline
* Reformat to be fully compatible with chat templates
* Add tests chat template
* Fix imports and tests
* Add pipeline tag
* change logic for handling a single prompt and multiple images
* add pipeline mapping to models
* fix batched inference
* fix tests
* Add manual batching for preprocessing
* Fix outputs with nested images
* Add support for all common processing kwargs
* Add default padding when multiple text inputs (batch size>1)
* nit change version deprecation warning
* Add support for text only inference
* add chat_template warnings
* Add pipeline tests and add copied from post process function
* Fix batched pipeline tests
* nit
* Fix pipeline tests blip2
* remove unnecessary max_new_tokens
* revert processing kosmos2 and remove unnecessary max_new_tokens
* fix pipeline tests idefics
* Force try loading processor if pipeline supports it
* revert load_processor change
* hardcode loading only processor
* remove unnecessary try except
* skip imagetexttotext tests for kosmos2 as tiny model causes problems
* Make code clearer
* Address review comments
* remove preprocessing logic from pipeline
* fix fuyu
* add BC resize fuyu
* Move post_process_image_text_to_text to ProcessorMixin
* add guard in post_process
* fix zero shot object detection pipeline
* add support for generator input in pipeline
* nit
* change default image-text-to-text model to llava onevision
* fix owlv2 size dict
* Change legacy deprecation warning to only show when True
* add fast image processor rtdetr
* add gpu/cpu test and fix docstring
* remove prints
* add to doc
* nit docstring
* avoid iterating over images/annotations several times
* change torch typing
* Add image processor fast documentation
* add mamba architecture for gguf
* add logic for weights conversion, some fixes and refactoring
* add lm_head layers, unit test refactoring
* more fixes for tests
* remove lm_head creation
* remove unused comments
* feat: Added int conversion and unwrapping
* test: added tests for post_process_keypoint_detection of SuperPointImageProcessor
* docs: changed docs to include post_process_keypoint_detection method and switched from opencv to matplotlib
* test: changed test to not depend on SuperPointModel forward
* test: added missing require_torch decorator
* docs: changed pyplot parameters for the keypoints to be more visible in the example
* tests: changed import torch location to make test_flax and test_tf
* Revert "tests: changed import torch location to make test_flax and test_tf"
This reverts commit 39b32a2f69.
* tests: fixed import
* chore: applied suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* tests: fixed import
* tests: fixed import (bis)
* tests: fixed import (ter)
* feat: added choice of type for target_size and changed tests accordingly
* docs: updated code snippet to reflect the addition of target size type choice in post process method
* tests: fixed imports (...)
* tests: fixed imports (...)
* style: formatting file
* docs: fixed typo from image[0] to image.size[0]
* docs: added output image and fixed some tests
* Update docs/source/en/model_doc/superpoint.md
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* fix: included SuperPointKeypointDescriptionOutput in TYPE_CHECKING if statement and changed test results to reflect changes to SuperPoint from absolute keypoint coordinates to relative
* docs: changed SuperPoint's docs to print output instead of just accessing
* style: applied make style
* docs: added missing output type and precision in docstring of post_process_keypoint_detection
* perf: deleted loop to perform keypoint conversion in one statement
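A minimal sketch of the vectorized conversion (illustrative helper, not the exact code):
```python
import torch

def to_relative(keypoints: torch.Tensor, height: int, width: int) -> torch.Tensor:
    # keypoints: (num_keypoints, 2) absolute (x, y) pixel coordinates.
    # One broadcasted division replaces a Python loop over keypoints.
    return keypoints / torch.tensor([width, height], dtype=keypoints.dtype)
```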
* fix: moved keypoint conversion at the end of model forward
* docs: changed SuperPointInterestPointDecoder to SuperPointKeypointDecoder class name and added relative (x, y) coordinates information to its method
* fix: changed type hint
* refactor: removed unnecessary brackets
* revert: SuperPointKeypointDecoder to SuperPointInterestPointDecoder
* Update docs/source/en/model_doc/superpoint.md
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
---------
Co-authored-by: Steven Bucaille <steven.bucaille@buawei.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Add docs/source/ar/fast_tokenizers.md to Add_docs_source_ar_fast_tokenizers.md
* Update _toctree.yml
* Update _toctree.yml
* Update docs/source/ar/_toctree.yml
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/fast_tokenizers.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
---------
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* translated gguf.md into chinese
* Apply suggestions from code review
I have updated the PR accordingly. Thank you very much for the detailed guidance, and I'll pay more attention to the details next time.
Co-authored-by: Isotr0py <2037008807@qq.com>
* Apply suggestions from code review
Co-authored-by: Isotr0py <2037008807@qq.com>
---------
Co-authored-by: Isotr0py <2037008807@qq.com>
* Add SynthIDTextWatermarkLogitsProcessor
* Resolving comments.
* Resolving comments.
* Resolving comments.
* Improving SynthIDWatermark tests.
* switch to PT version
* detector as pretrained model + style
* update training + style
* rebase
* Update logits_process.py
* Improving SynthIDWatermark tests.
* Shift detector training to wikitext negatives and stabilize with lower learning rate.
* Clean up.
* in for 7B
* cleanup
* Support Python 3.8.
* README and final cleanup.
* HF Hub upload and initialize.
* Update requirements for synthid_text.
* Adding SynthIDTextWatermarkDetector.
* Detector testing.
* Documentation changes.
* Copyrights fix.
* Fix detector api.
* ironing out errors
* training checks
* make fixup and make fix-copies
* docstrings and add to docs
* copyright
* BC
* test docstrings
* move import
* protect type hints
* top level imports
* watermarking example
* direct imports
* TPR/FPR meaning
* process_kwargs
* SynthIDTextWatermarkingConfig docstring
* assert -> exception
* example updates
* no immutable dict (can't be serialized)
* pack fn
* einsum equivalent
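A quick illustration of the equivalence (hypothetical shapes):
```python
import torch

g = torch.randn(4, 8)    # e.g. g-values per (depth, batch) slot
mask = torch.rand(4, 8)  # e.g. a weighting mask

out_einsum = torch.einsum("db,db->b", g, mask)
out_naive = (g * mask).sum(dim=0)
assert torch.allclose(out_einsum, out_naive)
```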
* import order
* fix test on gpu
* add detector example
---------
Co-authored-by: Sumedh Ghaisas <sumedhg@google.com>
Co-authored-by: Marc Sun <marc@huggingface.co>
Co-authored-by: sumedhghaisas2 <138781311+sumedhghaisas2@users.noreply.github.com>
Co-authored-by: raushan <raushan@huggingface.co>
* add colorize_depth and matplotlib availability check
* add post_process_depth_estimation for zoedepth + tests
* add post_process_depth_estimation for DPT + tests
* add post_process_depth_estimation in DepthEstimationPipeline & special case for zoedepth
* run `make fixup`
* fix import related error on tests
* fix more import related errors on test
* forgot some `torch` calls in declarations
* remove `torch` call in zoedepth tests that caused an error
* updated docs for depth estimation
* small fix for `colorize` input/output types
* remove `colorize_depth`, fix various names, remove matplotlib dependency
* fix formatting
* run fixup
* different images for test
* update examples in `forward` functions
* fixed broken links
* fix output types for docs
* possible format fix inside `<Tip>`
* Readability related updates
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Readability related update
* cleanup after merge
* refactor `post_process_depth_estimation` to return dict; simplify ZoeDepth's `post_process_depth_estimation`
* rewrite dict merging to support python 3.8
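The compatible merge, for reference:
```python
defaults = {"predicted_depth": None}
extras = {"predicted_depth": 1.0, "depth": 2.0}

# `defaults | extras` needs Python 3.9; unpacking works on 3.8 too, same result.
merged = {**defaults, **extras}
assert merged == {"predicted_depth": 1.0, "depth": 2.0}
```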
---------
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* first try
* codestyle
* idefics2 is happy
* [run-slow] llava, llava_next, video_llava, vipllava, llava_next_video, idefics, idefics2, kosmos2, fuyu, blip, blip_2, instructblip, instructblipvideo, paligemma
* fix-copies
* [run-slow] llava, llava_next, video_llava, vipllava, llava_next_video, idefics, idefics2, kosmos2, fuyu, blip, blip_2, instructblip, instructblipvideo
* blip-2 needs to init vision from config
* when was this removed O_o
* minor fix
* tests
* this way?
* tests
* model-agnostic code
* codestyle
* add tests for idefics
* modify general test for VLMs
* no generation test for vlm yet!
* no generation test here also
* warn in ViT-SDPA if output attn
* add more tests
* user can pass dict as attn impl
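A sketch of the call pattern, assuming the dict is keyed by sub-config name (model id is just an example):
```python
from transformers import LlavaForConditionalGeneration

model = LlavaForConditionalGeneration.from_pretrained(
    "llava-hf/llava-1.5-7b-hf",
    attn_implementation={"vision_config": "sdpa", "text_config": "flash_attention_2"},
)
```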
* repo consistency
* update
* musicgen
* no prints
* forgot speech enc-dec and clip
* how many composite models do we have?
* musicgen melody is the same as musicgen
* +siglip
* fix tests + add some more
* remove idefics custom overriden code
* make idefics2 automappable
* nits
* skip tests
* doctests
* Update src/transformers/models/idefics2/configuration_idefics2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/clip/test_modeling_clip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/idefics2/test_modeling_idefics2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/idefics2/test_modeling_idefics2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/configuration_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* major update, no need for automap
* clean up
* add FA2 test
* more tests
* style
* skip tests
* why did these start failing now?
* no attributes for FA2 needed
* one tiny test
* address comment about FA2 false warning
* style
* add new models and resolve conflicts
* fix copies
* let it be this way for now, come back tomorrow to review
* some more fixes
* update
* more updates
* update
* fix copies
* style and tests
* another big update
* fix tests
* fix tests
* update
* another update
* fix tests
* fix copies
* fix tests
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* mistral qna start
* mixtral qna
* oops
* qwen2 qna
* qwen2moe qna
* add missing input embed methods
* add "Copied from" to all methods; can't copy directly from llama due to the prefix
* make top level copied from
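For reference, the repo convention this relies on (illustrative class, body elided):
```python
# Copied from transformers.models.llama.modeling_llama.LlamaForQuestionAnswering with Llama->Mistral
class MistralForQuestionAnswering:
    ...
```
`make fix-copies` keeps such bodies in sync with their source, rewriting the name prefix.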
* add sdpa to OPT
* chore: remove redundant whitespace in OPTDecoder class
* fixup
* bug fix
* add sdpa and attention generate test
* fixup
* Refactor OPTAttention forward method for improved readability and maintainability
* undo refactor for _shape and key,val states
* add OPT to doc, fixup didn't find it for some reason
* change order
* change default attn_implementation in testing to eager
* [run-slow] opt
* change test_eager_matches_sdpa_generate to the llama one
* Update default attention implementation in testing common
* [run-slow] opt
* remove unneeded print
* [run-slow] opt
* refactor model testers to have attn_implementation="eager"
* [run-slow] opt
* convert test_eager_matches_sdpa_generate to opt-350M
* bug fix when creating mask for opt
* [run-slow] opt
* if layer_head_mask is passed, default to eager
* if head_mask is not None, fall back to eager
* [run-slow] opt
* Update src/transformers/models/opt/modeling_opt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
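The head-mask commits above implement the standard dispatch rule: `torch.nn.functional.scaled_dot_product_attention` can neither return attention weights nor apply a per-head mask, so those cases drop back to the manual path (the same pattern fixes DPT/Dinov2 in #33660 below). A hedged sketch of the rule, not the actual OPT code; `eager_forward` is a hypothetical callable:
```python
import warnings
import torch.nn.functional as F

def sdpa_or_eager(q, k, v, attn_mask=None, layer_head_mask=None,
                  output_attentions=False, eager_forward=None):
    # SDPA cannot expose attention weights or take a per-head mask,
    # so those requests are routed to the eager implementation.
    if output_attentions or layer_head_mask is not None:
        warnings.warn("SDPA does not support these options; falling back to eager.")
        return eager_forward(q, k, v, attn_mask, layer_head_mask)
    out = F.scaled_dot_product_attention(q, k, v, attn_mask=attn_mask)
    return out, None  # no attention weights on the SDPA path
```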
* Clean up Unpack imports (#33631)
clean up Unpack imports
* Fix DPT /Dinov2 sdpa regression on main (#33660)
* fall back to eager if output_attentions is set.
* fix copies
* handle dependency errors in check_imports (#33622)
* handle dependency errors in check_imports
* change log level to warning
* add back self.max_position_embeddings = config.max_position_embeddings (#33550)
* add back self.max_position_embeddings = config.max_position_embeddings
* fix-copies
* Fix Llava conversion for LlavaQwen2ForCausalLM with Clip vision tower (#33613)
fix llavaqwen2 model conversion
* Uniformize kwargs for Udop processor and update docs (#33628)
* Add optional kwargs and uniformize udop
* cleanup Unpack
* nit Udop
* Generation: deprecate `PreTrainedModel` inheriting from `GenerationMixin` (#33203)
* Enable BNB multi-backend support (#31098)
* enable cpu bnb path
* fix style
* fix code style
* fix 4 bit path
* Update src/transformers/utils/import_utils.py
Co-authored-by: Aarni Koskela <akx@iki.fi>
* add multi backend refactor tests
* fix style
* tweak 4bit quantizer + fix corresponding tests
* tweak 8bit quantizer + *try* fixing corresponding tests
* fix dequant bnb 8bit
* account for Intel CPU in variability of expected outputs
* enable cpu and xpu device map
* further tweaks to account for Intel CPU
* fix autocast to work with both cpu + cuda
* fix comments
* fix comments
* switch to testing_utils.torch_device
* allow for xpu in multi-gpu tests
* fix tests 4bit for CPU NF4
* fix bug with is_torch_xpu_available needing to be called as func
* avoid issue where test reports attr err due to other failure
* fix formatting
* fix typo from resolving of merge conflict
* polish based on last PR review
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* fix CI
* Update src/transformers/integrations/integration_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/integrations/integration_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix error log
* fix error msg
* add \n in error log
* make quality
* rm bnb cuda restriction in doc
* cpu model doesn't need dispatch
* fix doc
* fix style
* check cuda available in testing
* fix tests
* Update docs/source/en/model_doc/chameleon.md
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update docs/source/en/model_doc/llava_next.md
Co-authored-by: Aarni Koskela <akx@iki.fi>
* Update tests/quantization/bnb/test_4bit.py
Co-authored-by: Aarni Koskela <akx@iki.fi>
* Update tests/quantization/bnb/test_4bit.py
Co-authored-by: Aarni Koskela <akx@iki.fi>
* fix doc
* fix check multibackends
* fix import sort
* remove check torch in bnb
* docs: update bitsandbytes references with multi-backend info
* docs: fix small mistakes in bnb paragraph
* run formatting
* revert bnb check
* move bnb multi-backend check to import_utils
* Update src/transformers/utils/import_utils.py
Co-authored-by: Aarni Koskela <akx@iki.fi>
* fix bnb check
* minor fix for bnb
* check lib first
* fix code style
* Revert "run formatting"
This reverts commit ac108c6d6b.
* fix format
* give warning when bnb version is low and no cuda is found
* fix device assignment check to be multi-device capable
* address akx feedback on get_avlbl_dev fn
* revert partially, as we don't want the function to be public; the (enforced) docs would be too much
---------
Co-authored-by: Aarni Koskela <akx@iki.fi>
Co-authored-by: Titus von Koeller <9048635+Titus-von-Koeller@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
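From the user side, the multi-backend work above should keep the loading code unchanged; only the device checks behind it widen. A hedged sketch, assuming a multi-backend bitsandbytes build is installed:
```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m",
    quantization_config=quant_config,
    device_map="auto",  # resolves to cuda, xpu, or cpu depending on availability
)
```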
* Fix error string after refactoring into get_chat_template (#33652)
* Fix error string after refactoring into get_chat_template
* Take suggestion from CR
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
---------
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
* uniformize git processor (#33668)
* uniformize git processor
* update docstring
* Modular `transformers`: modularity and inheritance for new model additions (#33248)
* update example
* update
* push the converted diff files for testing and ci
* correct one example
* fix class attributes and docstring
* nits
* oops
* fixed config!
* update
* nits
* class attributes are not matched against each other; this is missing
* fixed overwriting of self.xxx onto the attributes now, I think
* partial fix, now order with docstring
* fix docstring order?
* more fixes
* update
* fix missing docstrings!
* examples don't all work yet
* fixup
* nit
* updated
* hick
* update
* delete
* update
* update
* update
* fix
* all default
* no local import
* fix more diff
* some fixes related to "safe imports"
* push fixed
* add helper!
* style
* add a check
* all by default
* add the
* update
* FINALLY!
* nit
* fix config dependencies
* man that is it
* fix fix
* update diffs
* fix the last issue
* re-default to all
* all the fixes
* nice
* fix properties vs setter
* fixup
* updates
* update dependencies
* make sure to install what needs to be installed
* fixup
* quick fix for now
* fix!
* fixup
* update
* update
* updates
* whitespaces
* nit
* fix
* simplify everything, and make it file agnostic (should work for image processors)
* style
* finish fixing all import issues
* fixup
* empty modeling should not be written!
* Add logic to find who depends on what
* update
* cleanup
* update
* update gemma to support positions
* some small nits
* this is the correct docstring for gemma2
* fix merging of docstrings
* update
* fixup
* update
* take doc into account
* styling
* update
* fix hidden activation
* more fixes
* final fixes!
* fixup
* fixup instruct blip video
* update
* fix bugs
* align gemma2 with the rest as well
* updates
* revert
* update
* more reversion
* grind
* more
* arf
* update
* order will matter
* finish del stuff
* update
* rename to modular
* fixup
* nits
* update makefile
* fixup
* update order of the checks!
* fix
* fix docstring that has a call inside
* fix conversion check
* style
* add some initial documentation
* update
* update doc
* some fixup
* updates
* yups
* Mostly todo, gimme a minute
* update
* fixup
* revert some stuff
* Review docs for the modular transformers (#33472)
Docs
* good update
* fixup
* mmm current updates lead to this code
* okay, this fixes it
* cool
* fixes
* update
* nit
* updates
* nits
* fix doc
* update
* revert bad changes
* update
* updates
* proper update
* update
* update?
* up
* update
* cool
* nits
* nits
* bon bon
* fix
* ?
* minimise changes
* update
* update
* update
* updates?
* fixed gemma2
* kind of a hack
* nits
* update
* remove `diffs` in favor of `modular`
* fix make fix copies
---------
Co-authored-by: Lysandre Debut <hi@lysand.re>
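The gist of the modular format, as a hedged minimal sketch ("MyModel" is hypothetical): a `modular_*.py` file declares a new model by inheriting from an existing one, and the converter unrolls it into a self-contained `modeling_*.py`:
```python
# modular_my_model.py -- hedged sketch of a modular definition file.
# The converter (utils/modular_model_converter.py, wired into the Makefile
# above) expands the inheritance into a standalone modeling file.
from transformers.models.llama.configuration_llama import LlamaConfig
from transformers.models.llama.modeling_llama import LlamaForCausalLM, LlamaModel


class MyModelConfig(LlamaConfig):
    model_type = "my_model"


class MyModelModel(LlamaModel):
    pass  # inherits the full Llama architecture unchanged


class MyModelForCausalLM(LlamaForCausalLM):
    pass
```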
* Fix CIs post merging modular transformers (#33681)
update
* Fixed docstring for cohere model regarding unavailability of prune_he… (#33253)
* Fixed docstring for cohere model regarding unavailability of the prune_heads() method
The docstring mentioned that the cohere model supports the prune_heads() method. I have fixed the docstring by explicitly stating that it doesn't support that functionality.
* Update src/transformers/models/cohere/modeling_cohere.py
---------
Co-authored-by: Lysandre Debut <hi@lysand.re>
* Generation tests: update imagegpt input name, remove unused functions (#33663)
* Improve Error Messaging for Flash Attention 2 on CPU (#33655)
Update flash-attn error message on CPU
Rebased to latest branch
* Gemma2: fix config initialization (`cache_implementation`) (#33684)
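For context, `cache_implementation` lives on the config, and the fix makes sure a value set at init is honored; a hedged sketch:
```python
from transformers import Gemma2Config

# Gemma2 uses a "hybrid" cache (sliding-window + global attention layers);
# after the fix, a value set at init is carried through to generation.
config = Gemma2Config(cache_implementation="hybrid")
print(config.cache_implementation)  # "hybrid"
```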
* Fix ByteLevel alphabet missing when Sequence pretokenizer is used (#33556)
* Fix ByteLevel alphabet missing when Sequence pretokenizer is used
* Fixed formatting with `ruff`.
* Uniformize kwargs for image-text-to-text processors (#32544)
* uniformize FUYU processor kwargs
* Uniformize instructblip processor kwargs
* Fix processor kwargs and tests Fuyu, InstructBlip, Kosmos2
* Uniformize llava_next processor
* Fix save_load test for processor with chat_template only as extra init args
* Fix import Unpack
* Fix Fuyu Processor import
* Fix FuyuProcessor import
* Fix FuyuProcessor
* Add defaults for specific kwargs kosmos2
* Fix Udop to return BatchFeature instead of BatchEncoding and uniformize kwargs
* Add tests processor Udop
* remove Copied from in processing Udop, as the change of input order was caused by BatchEncoding -> BatchFeature
* Fix overwrite tests kwargs processors
* Add warnings and BC for changes in processor inputs order, change docs, add BC for text_pair as arg for Udop
* Fix processing test fuyu
* remove unnecessary pad_token check in instructblip ProcessorTest
* Fix BC tests and cleanup
* Fix imports fuyu
* Uniformize Pix2Struct
* Fix wrong name for FuyuProcessorKwargs
* Fix slow tests with reversed inputs, align fuyu and llava-next, change udop warning
* Fix wrong logging import udop
* Add check images text input order
* Fix copies
* change text_pair handling when passed as a positional arg
* rebase on main, fix imports in test_processing_common
* remove optional args and udop uniformization from this PR
* fix failing tests
* remove unnecessary test, fix processing utils and test processing common
* cleanup Unpack
* cleanup
* fix conflict grounding dino
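After uniformization, image-text-to-text processors share one call shape: images and text up front, everything else as typed kwargs. A hedged sketch with a dummy image:
```python
from PIL import Image
from transformers import AutoProcessor

processor = AutoProcessor.from_pretrained("llava-hf/llava-v1.6-mistral-7b-hf")
image = Image.new("RGB", (336, 336))  # dummy image for illustration

batch = processor(
    images=image,
    text="USER: <image>\nDescribe it. ASSISTANT:",
    padding=True,
    return_tensors="pt",
)
```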
* 🚨🚨 Setting default behavior of assisted decoding (#33657)
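Assisted decoding pairs the main model with a small draft model that proposes tokens for cheap verification; the PR above changes its default behavior. A hedged usage sketch:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("facebook/opt-1.3b")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b")
assistant = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")  # draft model

inputs = tok("The capital of France is", return_tensors="pt")
out = model.generate(**inputs, assistant_model=assistant, max_new_tokens=20)
print(tok.decode(out[0], skip_special_tokens=True))
```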
* tests: fix pytorch tensor placement errors (#33485)
This commit fixes the following errors:
* Fix "expected all tensors to be on the same device" error
* Fix "can't convert device type tensor to numpy"
According to the PyTorch documentation, torch.Tensor.numpy(force=False)
performs the conversion only if the tensor is on the CPU (plus a few other
restrictions), which is not the case here. For our case we need force=True,
since we just need the data and don't care about tensor coherency.
Fixes: #33517
See: https://pytorch.org/docs/2.4/generated/torch.Tensor.numpy.html
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
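A minimal reproduction of the point above:
```python
import torch

t = torch.ones(3, device="cuda") if torch.cuda.is_available() else torch.ones(3)

# t.numpy() raises for a CUDA tensor; force=True detaches and copies to CPU
# first, which is fine here since only the data is needed.
arr = t.numpy(force=True)
print(arr)
```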
* bump tokenizers, fix added tokens fast (#32535)
* update based on tokenizers release
* update
* nits
* update
* revert re-addition
* don't break that yet
* fmt
* revert unwanted
* update tokenizers version
* update dep table
* update
* update in conversion script as well
* some fix
* revert
* fully revert
* fix training
* remove set trace
* fixup
* update
* update
* [Pixtral] Improve docs, rename model (#33491)
* Improve docs, rename model
* Fix style
* Update repo id
* fix code quality after merge
* HFQuantizer implementation for compressed-tensors library (#31704)
* Add compressed-tensors HFQuantizer implementation
* flag serializable as False
* run
* revive lines deleted by ruff
* fixes to load+save from sparseml, edit config to quantization_config, and load back
* address satrat comment
* compressed_tensors to compressed-tensors and revert back is_serializable
* rename quant_method from sparseml to compressed-tensors
* tests
* edit tests
* clean up tests
* make style
* cleanup
* cleanup
* add test skip for when compressed tensors is not installed
* remove pydantic import + style
* delay torch import in test
* initial docs
* update main init for compressed tensors config
* make fix-copies
* docstring
* remove fill_docstring
* Apply suggestions from code review
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* review comments
* review comments
* comments - suppress warnings on state dict load, tests, fixes
* bug-fix - remove unnecessary call to apply quant lifecycle
* run_compressed compatibility
* revert changes not needed for compression
* no longer need unexpected keys fn
* unexpected keys not needed either
* Apply suggestions from code review
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* add to_diff_dict
* update docs and expand testing
* Update _toctree.yml with compressed-tensors
* Update src/transformers/utils/quantization_config.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* update doc
* add note about saving a loaded model
---------
Co-authored-by: George Ohashi <george@neuralmagic.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Sara Adkins <sara@neuralmagic.com>
Co-authored-by: Sara Adkins <sara.adkins65@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Dipika Sikka <ds3822@columbia.edu>
Co-authored-by: Dipika <dipikasikka1@gmail.com>
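Loading, from the user side, should go through the ordinary path once the quantizer is registered; a hedged sketch with a hypothetical repo id:
```python
from transformers import AutoModelForCausalLM

# The checkpoint's quantization_config (quant_method="compressed-tensors")
# routes loading through the new HFQuantizer; the repo id is illustrative.
model = AutoModelForCausalLM.from_pretrained(
    "org/llama-w8a8-compressed-tensors",  # hypothetical checkpoint
    device_map="auto",
)
```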
* update model card for opt
* add batch size to inference table
* [slow-run] opt
* [run-slow] opt
---------
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Co-authored-by: Avishai Elmakies <avishai.elma@cs.huji.ac.il>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
Co-authored-by: chengchengpei <5881383+chengchengpei@users.noreply.github.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Aarni Koskela <akx@iki.fi>
Co-authored-by: Titus von Koeller <9048635+Titus-von-Koeller@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Tibor Reiss <75096465+tibor-reiss@users.noreply.github.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
Co-authored-by: Lysandre Debut <hi@lysand.re>
Co-authored-by: Muhammad Naufil <m.naufil1@gmail.com>
Co-authored-by: sizhky <yyeshr@gmail.com>
Co-authored-by: Umar Butler <umar@umar.au>
Co-authored-by: Jonathan Mamou <jonathan.mamou@intel.com>
Co-authored-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: Benjamin Fineran <bfineran@users.noreply.github.com>
Co-authored-by: George Ohashi <george@neuralmagic.com>
Co-authored-by: Sara Adkins <sara@neuralmagic.com>
Co-authored-by: Sara Adkins <sara.adkins65@gmail.com>
Co-authored-by: Dipika Sikka <ds3822@columbia.edu>
Co-authored-by: Dipika <dipikasikka1@gmail.com>
Add Arabic translation of docs - CONCEPTUAL GUIDES section files
---------
Philosophy [i18n-ar] Translated file: docs/source/ar/philosophy.md into Arabic #33064
Glossary [i18n-ar] Translated file: docs/source/ar/glossary.md into Arabic #33038
What 🤗 Transformers can do [i18n-ar] Translated file: docs/source/ar/task_summary.md into Arabic #33073
How 🤗 Transformers solve tasks [i18n-ar] Translated file: docs/source/ar/tasks_explained.md into Arabic #33074
The Transformer model family [i18n-ar] Translated file: docs/source/ar/model_summary.md into Arabic #33047
Summary of the tokenizers [i18n-ar] Translated file: docs/source/ar/tokenizer_summary.md into Arabic #33078
Attention [i18n-ar] Translated file: docs/source/ar/attention.md into Arabic #33021
Padding and truncation [i18n-ar] Translated file: docs/source/ar/pad_truncation.md into Arabic #33050
BERTology [i18n-ar] Translated file: docs/source/ar/bertology.md into Arabic #33024
Perplexity of fixed-length models [i18n-ar] Translated file: docs/source/ar/perplexity.md into Arabic #33063
Pipelines for webserver inference [i18n-ar] Translated file: docs/source/ar/pipeline_webserver.md into Arabic #33066
Model training anatomy [i18n-ar] Translated file: docs/source/ar/model_memory_anatomy.md into Arabic #33045
Getting the most out of LLMs [i18n-ar] Translated file: docs/source/ar/llm_tutorial_optimization.md into Arabic #33043
* rebasing changes
* fixing style
* adding some doc to functions
* remove bitblas
* change dtype
* fixing check_code_quality
* fixing import order
* adding doc to tree
* Small update on BitLinear
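For orientation, BitLinear here is a BitNet-style quantized linear layer. A hedged, minimal sketch of the b1.58 ternary weight scheme such a layer is generally built on (not the repo's exact code):
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def ternarize(w: torch.Tensor) -> torch.Tensor:
    # BitNet b1.58 style: per-tensor mean-abs scale, weights rounded to {-1, 0, 1}
    scale = w.abs().mean().clamp(min=1e-5)
    return (w / scale).round().clamp(-1, 1) * scale

class BitLinearSketch(nn.Linear):
    """Hedged sketch, not the library's BitLinear."""
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.weight
        # straight-through estimator: quantized forward, full-precision gradient
        w_q = w + (ternarize(w) - w).detach()
        return F.linear(x, w_q, self.bias)

layer = BitLinearSketch(8, 4, bias=False)
print(layer(torch.randn(2, 8)).shape)  # torch.Size([2, 4])
```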
* adding some tests
* sorting imports
* small update
* reformatting
* reformatting
* reformatting with ruff
* adding assert
* changes after review
* update disk offloading
* adapting after review
* Update after review
* add is_serializable back
* fixing style
* adding serialization test
* make style
* small updates after review