transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-03 21:00:08 +06:00

Author	SHA1	Message	Date
Yih-Dar	d4e7aa5526	Fix `qwen_2_5 omni` (#38658 ) * fix * fix * break style * break style * Apply style fixes * break style * Apply style fixes * fix modular --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-06-12 14:43:54 +02:00
Jesse Cai	e1812864ab	[docs] Add int4wo + 2:4 sparsity example to TorchAO README (#38592 ) * update quantization readme * update --------- Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>	2025-06-12 12:17:07 +00:00
Quentin Gallouédec	bc68defcac	Update PULL_REQUEST_TEMPLATE.md (#38770 )	2025-06-12 14:03:33 +02:00
Quentin Gallouédec	960fda25d1	Reduce verbosity for `average_tokens_across_devices=True` and `world size = 1` (#38785 ) * Warning to info for average_tokens_across_devices and world size = 1 * Update src/transformers/training_args.py	2025-06-12 14:02:53 +02:00
Yih-Dar	89c46b648d	Skip some export tests on torch 2.7 (#38677 ) * skip * fix * better check * Update import_utils.py --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>	2025-06-12 12:47:15 +02:00
Raushan Turganbay	27459025b8	[video processors] support frame sampling within processors (#38105 ) * apply updates smolVLM (still needs workaround for chat template) * add other models * dump qwen omni for now, come back later * port qwen omni from their impl * wait, all qwens sample videos in same way! * clean up * make smolvlm backwards compatible and fix padding * dix some tests * fox smolvlm tests * more clean up and test fixing * delete unused arg * fix * address comments * style * fix test	2025-06-12 09:34:30 +00:00
Cyril Vallez	887054c714	Fix masking utils (#38783 ) * fix * Update masking_utils.py * Update masking_utils.py	2025-06-12 11:00:46 +02:00
Yih-Dar	7c58336949	[Hotfix] Fix style bot (#38779 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-12 10:20:36 +02:00
Raushan Turganbay	7c6b1707c3	[masking utils] check `None` instead of try/except (#38561 ) * fix vllm's compile backend * fix the test * apply the same changes in other masking strategies	2025-06-12 06:50:28 +00:00
rileyafox	9487765f07	Add Qwen2 MoE model card (#38649 ) * Add Qwen2 MoE model card * Revisions to qwen2 moe model card * Add Qwen2 MoE model card	2025-06-11 15:14:01 -07:00
Emile Aydar	32dbf4bddb	Update altCLIP model card (#38306 ) * Update altclip.md * Update altclip.md * Update altclip.md * Update altclip.md * Update altclip.md * Update altclip.md * Rename altclip.md to altclip.mdx * Rename altclip.mdx to altclip.md * Update altclip.md * Update altclip.md * Update altclip.md --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-06-11 14:48:34 -07:00
Dongruixuan Li	1dcb022e8f	chore(pixtral): emit block attention mask when using flash attention (#38741 ) * chore(pixtral): emit block attention mask when using flash attention Since flash_attention_2 relies solely on position_ids, emitting the block attention mask avoids unnecessary memory usage and prevents OOM on large inputs. * remove unnecessary attention_mask assignment	2025-06-11 18:55:23 +00:00
Yih-Dar	60d4b35b20	Make style bot trigger CI after push (#38754 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-11 20:40:04 +02:00
Drew Ross	bb44d2a0f6	Update pegasus model card (#38675 ) * Update Pegasus model card * Fix transformers-cli command * Update code examples to use bfloat16 * Reverted code examples to use float16 * Fix typo, update checkpoints link * Update str formatting in code examples * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Fix typo * Remove inaccurate badges * Revert badge removal * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Include cache_implementation argument in quantization example --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-06-11 10:56:25 -07:00
L	b84ebb7f3c	fix(qwen3_moe): pass kwargs to self_attn (#38691 ) This is needed to avoid `.item()` calls in `_flash_attention_forward`.	2025-06-11 19:26:08 +02:00
Matt	9f563ada70	Deprecate TF + JAX (#38758 ) * Scatter deprecation warnings around * Delete the tests * Make logging work properly!	2025-06-11 17:28:06 +01:00
Matt	337757cbd5	Update repo consistency check (#38763 )	2025-06-11 17:02:03 +01:00
Matthew Douglas	e2bdc13375	Remove IPEX requirement for bitsandbytes on CPU (#38594 ) Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-06-11 17:46:34 +02:00
Matt	063bef0865	Prepare for TF+Jax deprecation (#38760 ) * Prepare for TF+Jax deprecation * Remove .circleci jobs	2025-06-11 16:03:31 +01:00
Marc Sun	11ad9be153	Better typing for num_items_in_batch (#38728 ) * fix * style * type checking ? * maybe this ? * fix * can't be an int anymore * fix	2025-06-11 16:26:41 +02:00
Pavel Iakubovskii	84710a4291	Add V-JEPA 2 (#38746 ) * adding model and conversion scripts * add imports to test vjepa conversion * fix imports and make conversion work * fix computation for short side * replace attention with library attention function * cleanup more attention classes * remove config overrides * add test cases, fix some of the failing ones * fix the model outputs * fix outputs of the model per review * fix too big model test case * fix styling __init__.py * fix initialization test * remove all asserts per review * update sorting unsorting logic as per feedback * remove is_video per review * remove another is_video segment * remove unwanted stuff * small fixes * add docstrings for the model * revert adding vjepa2 config here * update styling * add config docstrings (wip) * fix dpr issue * removed test failing issues * update styles * merge predictor configs into main config * remove processing code, add video processor * remove permute which is not necessary now * fix styles * updated vjepa2 to be in video_processing_auto * update comment for preprocessing * test integration test and fix the outputs * update test values, change test to look at repeated frames for a given image * add a simple video processing test * refactoring pixel_values_videos and upload ckpts to original * fix torch_fx test cases * remove unused config * add all config docstrings * add more integration tests * add basic doc * revert unwanted styling changes * working make fixup * Fix model_type in config * update attention implementation to fit new hf standards * fix the preprocessing logic, ensure it matches the original model * remove use_rope logic, cleanup * fix docstrings * Further cleanup, update doc * Fix model prefix * fix get_vision_features * VJEPA2Embeddings style refactor * nit, style comment * change modules default values * Only `str` activation in config * GradientCheckpointingLayer * fixup * fix conversion script * Remove return_dict * remove None return typehint 
* Refactor VJEPA2Layer, remove use_SiLU * Fix fx tests * dpr -> drop_path_rates * move ModelOutput on top format docs bit * update docs * update docs * update doc example * remove prune_heads from model * remove unused config params * refactor embed signature * Add vjepa to docs * Fix config docstring * update defaults * Update docs/source/en/model_doc/vjepa2.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update docs/source/en/model_doc/vjepa2.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Fix import * Min refactoring * Update HUB_SOURCE and HUB_REPO in conversion script * Add missing headers * VJEPA -> V-JEPA in docs * Add image to doc * fix style * fix init weights * change checkpoint name in modeling tests --------- Co-authored-by: Koustuv Sinha <koustuv.sinha@mail.mcgill.ca> Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co> Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com> Co-authored-by: Koustuv Sinha <koustuvsinha@gmail.com> Co-authored-by: Pedro Cuenca <pedro@huggingface.co>	2025-06-11 15:00:08 +01:00
Davis Wertheimer	a6f0e2b64a	Add z-loss to Bamba for v2 (#37842 ) * Remove const * Fix arg ref * Sharded save * Add z_loss flag * Add modeling zloss * Demodularize clm forward for zloss * Also demodularize init for z_loss flag * PR comments (mostly modularizing right) * Demodularize forward * Better name zloss and explain typematch * Fully propagate coeff name * style fixes * zloss default float * Remove conflicting annotations --------- Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>	2025-06-11 15:29:17 +02:00
Yih-Dar	6b610d89f1	Revert "Trigger doc-builder job after style bot" (#38735 ) Revert "Trigger doc-builder job after style bot (#38398)" This reverts commit `51e0fac29f`.	2025-06-11 14:56:39 +02:00
Minho Ryu	0bf53e69e2	[DeepSeek-V3] implement when q_lora_rank is None (#38743 ) * implement when q_lora_rank is None * make style and quality	2025-06-11 13:35:10 +01:00
ye	b426c2b313	fix: bf16 with TPU is allowed in configuration (#38670 ) * fix: tpu bf16 * fix: style --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-06-11 12:35:01 +00:00
Yao Matrix	c8c1e525ed	from 1.11.0, torchao.prototype.low_bit_optim is promoted to torchao.optim (#38689 ) * since 1.11.0, torchao.prototype.low_bit_optim is promoted to torchao.optim Signed-off-by: YAO Matrix <matrix.yao@intel.com> * fix review comments Signed-off-by: YAO Matrix <matrix.yao@intel.com> --------- Signed-off-by: YAO Matrix <matrix.yao@intel.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-06-11 12:16:25 +00:00
Yushun Xiang	56a7cf5546	fix: Add method to get image features in PaliGemmaForConditionalGeneration (#38730 ) * fix: Add method to retrieve image features in PaliGemmaForConditionalGeneration * feat: Add get_image_features method to multiple models for image feature extraction * fix: reformat the files with ruff. * feat: Add methods for packing and retrieving image and video features across multiple models modified: - modeling_chameleon.py - modeling_llava_next.py - modular_llava_next_video.py - modeling_qwen2_vl.py and generate the: - modeling_llava_next_video.py - modeling_llava_onevision.py - modeling_qwen2_5_vl.py * feat: Implement get_image_features method in Aria, Mistral3, and VipLlava models with updated parameters * fix: reformatted the code with fix-style	2025-06-11 10:26:31 +00:00
Raushan Turganbay	380e6ea406	[llava] fix integration tests with Siglip (#38732 ) fix llava siglip test	2025-06-11 08:09:16 +00:00
Rémi Ouazan	f1849eab22	Fixed a multiple-devices issue in SmolVLM model (#38736 ) Fixed a multiple-devices issue in SmolVLMModel (#38557) * Fixed a multiple-devices issue in SmolVLMModel * Changed the modular to reflect changes	2025-06-11 10:08:01 +02:00
RogerSinghChugh	aa798b7ac9	New canine model card (#38631 ) * Updated BERTweet model card. * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * updated toctree (EN). * Updated BERTweet model card. 
* Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * updated toctree (EN). * Updated BERTweet model card. * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * updated toctree (EN). * Commit for new_gpt_model_card. 
* Update docs/source/en/model_doc/gpt_neo.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/gpt_neo.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/gpt_neo.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/gpt_neo.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/gpt_neo.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/gpt_neo.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/gpt_neo.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/gpt_neo.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * commit for new canine model card. * Update docs/source/en/model_doc/canine.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/canine.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/canine.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/canine.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/canine.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/canine.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * implemented suggestion by @stevhliu. * Update canine.md --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-06-10 09:30:05 -07:00
Matt	e28fb26e7d	Add AGENTS.md (#38734 ) * More name sync * repeatedly underlining "WRITE LESS, ROBOT" * fewer, commas, please * Clarify "copied from" * Clarify "copied from" * Mention test dependencies * Added a line on preferring `modular` style	2025-06-10 16:27:37 +00:00
Francisco R Castro Garcia	cb4c56ce0d	Fix typo in Language Modeling example scripts and update TPU type (#38652 ) * Fix typo that prevents the examples to be run correctly * return .TPU in accelerator.distributedtype comparison	2025-06-10 13:43:35 +00:00
alexzms	8ff22e9d3b	[add-new-model-like] Robust search & proper outer '),' in tokenizer mapping (#38703 ) * [add-new-model-like] Robust search & proper outer '),' in tokenizer mapping * code-style: arrange the importation in add_new_model_like.py * Apply style fixes --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-06-10 12:25:12 +00:00
Yuanyuan Chen	8340e8746e	Use OSError (#38712 ) Signed-off-by: cyy <cyyever@outlook.com>	2025-06-10 12:13:49 +00:00
Yih-Dar	8257734b5f	Fix `llava` tests (#38722 ) * update * fix 1 * fix 2 * fix 3 * fix 4 * fix 5 * fix 6 * fix 7 * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-10 13:53:17 +02:00
वेदांत	71f7385942	Logging message for `` `is_bitsandbytes_available()` `` (#38528 ) * bnb import log * bnb import log * log mesage change * moved error issue in qunatizer_bnb_4_bit.py * ruff * arg added for bnb check * required changes --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-06-10 10:15:01 +00:00
Yih-Dar	04cdf83244	Update some tests for torch 2.7.1 (#38701 ) * fix 1 * fix 2 * fix 3 * fix 4 * fp16 * break * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-10 11:46:52 +02:00
rdonggroq	afdb821318	Fix smart resize (#38706 ) * Fix smart_resize bug * Add smart_resize test * Remove unnecessary error checking * Fix smart_resize tests --------- Co-authored-by: Richard Dong <rdong@rdong.c.groq-143208.internal>	2025-06-10 08:59:22 +00:00
Yana Mishula	81799d8b55	Standardize ByT5 model card format (#38699 ) * Standardize ByT5 model card format * Apply review feedback from @stevhliu * Fix Notes formatting and wording * Fix `aya_vision` test (#38674) * fix 1: load_in_4bit=True, * fix 2: decorateor * fixfix 2: breakpoint * fixfix 3: update * fixfix 4: fast * fixfix 5: cond * fixfix 5: cond * fixfix 6: cuda 8 * ruff * breakpoint * dtype * a10 * a10 --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Fix autodoc formatting for ByT5Tokenizer --------- Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-09 15:02:50 -07:00
Yih-Dar	e55983e2b9	Fix `aya_vision` test (#38674 ) * fix 1: load_in_4bit=True, * fix 2: decorateor * fixfix 2: breakpoint * fixfix 3: update * fixfix 4: fast * fixfix 5: cond * fixfix 5: cond * fixfix 6: cuda 8 * ruff * breakpoint * dtype * a10 * a10 --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-09 22:18:52 +02:00
Aashish Anand	b61c47f5a5	Created model card for xlm-roberta-xl (#38597 ) * Created model card for xlm-roberta-xl * Update XLM-RoBERTa-XL model card with improved descriptions and usage examples * Minor option labeling fix * Added MaskedLM version of XLM RoBERTa XL to model card * Added quantization example for XLM RoBERTa XL model card * minor fixes to xlm roberta xl model card * Minor fixes to mask format in xlm roberta xl model card	2025-06-09 13:00:38 -07:00
Aashish Anand	e594e75f1b	Update XLM-RoBERTa model documentation with enhanced usage examples and improved layout (#38596 ) * Update XLM-RoBERTa model documentation with enhanced usage examples and improved layout * Added CLI command example and quantization example for XLM RoBERTa model card. * Minor change to transformers CLI and quantization example for XLM roberta model card	2025-06-09 12:26:31 -07:00
Aashish Anand	29ca043856	Created model card for XLM model (#38595 ) * Created model card for XLM model * Revised model card structure and content of XLM model * Update XLM model documentation with improved examples and code snippets for predicting <mask> tokens using Pipeline and AutoModel.	2025-06-09 12:26:23 -07:00
Marcel Ambo Ndowah	25f711aa89	Drop as_target_processor from the _call_ and pad methods (#38642 ) Drop as_target_processor from _call_ and pad methods; reformat docstrings for readability	2025-06-09 12:26:09 -07:00
Matthew Douglas	837ddac1ec	Docs: update bitsandbytes torch.compile compatibility (#38651 )	2025-06-09 14:51:57 -04:00
dbleyl	b9faf2f930	Fix TypeError: 'NoneType' object is not iterable for esm (#38667 ) (#38668 ) Add post_init() calls to EsmForMaskedLM, EsmForTokenClassification and EsmForSequenceClassification.	2025-06-09 15:23:20 +00:00
Fiona Waters	11dca07a10	Fix retrieve function signature and remove faiss requirement (#38624 ) Signed-off-by: Fiona Waters <fiwaters6@gmail.com>	2025-06-09 15:17:33 +00:00
xiao	b31d462c61	Fix some models import (#38694 ) Fix models import	2025-06-09 16:09:24 +01:00
pweglik	282d6684dc	Fix attention mask expansion when converting to executorch (#38637 )	2025-06-09 15:00:55 +00:00
Anthony	19224c3642	fix: "check out" as verb (#38678 ) "check out" as verb	2025-06-09 14:07:31 +00:00

19276 Commits