transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-03 04:40:06 +06:00

Author	SHA1	Message	Date
Matt	063bef0865	Prepare for TF+Jax deprecation (#38760 ) * Prepare for TF+Jax deprecation * Remove .circleci jobs	2025-06-11 16:03:31 +01:00
Marc Sun	11ad9be153	Better typing for num_items_in_batch (#38728 ) * fix * style * type checking ? * maybe this ? * fix * can't be an int anymore * fix	2025-06-11 16:26:41 +02:00
Pavel Iakubovskii	84710a4291	Add V-JEPA 2 (#38746 ) * adding model and conversion scripts * add imports to test vjepa conversion * fix imports and make conversion work * fix computation for short side * replace attention with library attention function * cleanup more attention classes * remove config overrides * add test cases, fix some of the failing ones * fix the model outputs * fix outputs of the model per review * fix too big model test case * fix styling __init__.py * fix initialization test * remove all asserts per review * update sorting unsorting logic as per feedback * remove is_video per review * remove another is_video segment * remove unwanted stuff * small fixes * add docstrings for the model * revert adding vjepa2 config here * update styling * add config docstrings (wip) * fix dpr issue * removed test failing issues * update styles * merge predictor configs into main config * remove processing code, add video processor * remove permute which is not necessary now * fix styles * updated vjepa2 to be in video_processing_auto * update comment for preprocessing * test integration test and fix the outputs * update test values, change test to look at repeated frames for a given image * add a simple video processing test * refactoring pixel_values_videos and upload ckpts to original * fix torch_fx test cases * remove unused config * add all config docstrings * add more integration tests * add basic doc * revert unwanted styling changes * working make fixup * Fix model_type in config * update attention implementation to fit new hf standards * fix the preprocessing logic, ensure it matches the original model * remove use_rope logic, cleanup * fix docstrings * Further cleanup, update doc * Fix model prefix * fix get_vision_features * VJEPA2Embeddings style refactor * nit, style comment * change modules default values * Only `str` activation in config * GradientCheckpointingLayer * fixup * fix conversion script * Remove return_dict * remove None return typehint * Refactor VJEPA2Layer, remove use_SiLU * Fix fx tests * dpr -> drop_path_rates * move ModelOutput on top format docs bit * update docs * update docs * update doc example * remove prune_heads from model * remove unused config params * refactor embed signature * Add vjepa to docs * Fix config docstring * update defaults * Update docs/source/en/model_doc/vjepa2.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update docs/source/en/model_doc/vjepa2.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Fix import * Min refactoring * Update HUB_SOURCE and HUB_REPO in conversion script * Add missing headers * VJEPA -> V-JEPA in docs * Add image to doc * fix style * fix init weights * change checkpoint name in modeling tests --------- Co-authored-by: Koustuv Sinha <koustuv.sinha@mail.mcgill.ca> Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co> Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com> Co-authored-by: Koustuv Sinha <koustuvsinha@gmail.com> Co-authored-by: Pedro Cuenca <pedro@huggingface.co>	2025-06-11 15:00:08 +01:00
Davis Wertheimer	a6f0e2b64a	Add z-loss to Bamba for v2 (#37842 ) * Remove const * Fix arg ref * Sharded save * Add z_loss flag * Add modeling zloss * Demodularize clm forward for zloss * Also demodularize init for z_loss flag * PR comments (mostly modularizing right) * Demodularize forward * Better name zloss and explain typematch * Fully propagate coeff name * style fixes * zloss default float * Remove conflicting annotations --------- Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>	2025-06-11 15:29:17 +02:00
Yih-Dar	6b610d89f1	Revert "Trigger doc-builder job after style bot" (#38735 ) Revert "Trigger doc-builder job after style bot (#38398)" This reverts commit `51e0fac29f`.	2025-06-11 14:56:39 +02:00
Minho Ryu	0bf53e69e2	[DeepSeek-V3] implement when q_lora_rank is None (#38743 ) * implement when q_lora_rank is None * make style and quality	2025-06-11 13:35:10 +01:00
ye	b426c2b313	fix: bf16 with TPU is allowed in configuration (#38670 ) * fix: tpu bf16 * fix: style --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-06-11 12:35:01 +00:00
Yao Matrix	c8c1e525ed	from 1.11.0, torchao.prototype.low_bit_optim is promoted to torchao.optim (#38689 ) * since 1.11.0, torchao.prototype.low_bit_optim is promoted to torchao.optim Signed-off-by: YAO Matrix <matrix.yao@intel.com> * fix review comments Signed-off-by: YAO Matrix <matrix.yao@intel.com> --------- Signed-off-by: YAO Matrix <matrix.yao@intel.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-06-11 12:16:25 +00:00
Yushun Xiang	56a7cf5546	fix: Add method to get image features in PaliGemmaForConditionalGeneration (#38730 ) Some checks are pending Self-hosted runner (benchmark) / Benchmark (aws-g5-4xlarge-cache) (push) Waiting to run Details Build documentation / build (push) Waiting to run Details New model PR merged notification / Notify new model (push) Waiting to run Details Slow tests on important models (on Push - A10) / Get all modified files (push) Waiting to run Details Slow tests on important models (on Push - A10) / Slow & FA2 tests (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Check if setup was changed (push) Waiting to run Details Self-hosted runner (push-caller) / build-docker-containers (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Trigger Push CI (push) Blocked by required conditions Details Secret Leaks / trufflehog (push) Waiting to run Details Update Transformers metadata / build_and_package (push) Waiting to run Details * fix: Add method to retrieve image features in PaliGemmaForConditionalGeneration * feat: Add get_image_features method to multiple models for image feature extraction * fix: reformat the files with ruff. * feat: Add methods for packing and retrieving image and video features across multiple models modified: - modeling_chameleon.py - modeling_llava_next.py - modular_llava_next_video.py - modeling_qwen2_vl.py and generate the: - modeling_llava_next_video.py - modeling_llava_onevision.py - modeling_qwen2_5_vl.py * feat: Implement get_image_features method in Aria, Mistral3, and VipLlava models with updated parameters * fix: reformatted the code with fix-style	2025-06-11 10:26:31 +00:00
Raushan Turganbay	380e6ea406	[llava] fix integration tests with Siglip (#38732 ) fix llava siglip test	2025-06-11 08:09:16 +00:00
Rémi Ouazan	f1849eab22	Fixed a multiple-devices issue in SmolVLM model (#38736 ) Fixed a multiple-devices issue in SmolVLMModel (#38557) * Fixed a multiple-devices issue in SmolVLMModel * Changed the modular to reflect changes	2025-06-11 10:08:01 +02:00
RogerSinghChugh	aa798b7ac9	New canine model card (#38631 ) Some checks are pending Self-hosted runner (benchmark) / Benchmark (aws-g5-4xlarge-cache) (push) Waiting to run Details Build documentation / build (push) Waiting to run Details New model PR merged notification / Notify new model (push) Waiting to run Details Slow tests on important models (on Push - A10) / Get all modified files (push) Waiting to run Details Slow tests on important models (on Push - A10) / Slow & FA2 tests (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Check if setup was changed (push) Waiting to run Details Self-hosted runner (push-caller) / build-docker-containers (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Trigger Push CI (push) Blocked by required conditions Details Secret Leaks / trufflehog (push) Waiting to run Details Update Transformers metadata / build_and_package (push) Waiting to run Details * Updated BERTweet model card. * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * updated toctree (EN). * Updated BERTweet model card. * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * updated toctree (EN). * Updated BERTweet model card. * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bertweet.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * updated toctree (EN). * Commit for new_gpt_model_card. * Update docs/source/en/model_doc/gpt_neo.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/gpt_neo.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/gpt_neo.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/gpt_neo.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/gpt_neo.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/gpt_neo.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/gpt_neo.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/gpt_neo.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * commit for new canine model card. * Update docs/source/en/model_doc/canine.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/canine.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/canine.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/canine.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/canine.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/canine.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * implemented suggestion by @stevhliu. * Update canine.md --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-06-10 09:30:05 -07:00
Matt	e28fb26e7d	Add AGENTS.md (#38734 ) * More name sync * repeatedly underlining "WRITE LESS, ROBOT" * fewer, commas, please * Clarify "copied from" * Clarify "copied from" * Mention test dependencies * Added a line on preferring `modular` style	2025-06-10 16:27:37 +00:00
Francisco R Castro Garcia	cb4c56ce0d	Fix typo in Language Modeling example scripts and update TPU type (#38652 ) * Fix typo that prevents the examples to be run correctly * return .TPU in accelerator.distributedtype comparison	2025-06-10 13:43:35 +00:00
alexzms	8ff22e9d3b	[add-new-model-like] Robust search & proper outer '),' in tokenizer mapping (#38703 ) * [add-new-model-like] Robust search & proper outer '),' in tokenizer mapping * code-style: arrange the importation in add_new_model_like.py * Apply style fixes --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-06-10 12:25:12 +00:00
Yuanyuan Chen	8340e8746e	Use OSError (#38712 ) Signed-off-by: cyy <cyyever@outlook.com>	2025-06-10 12:13:49 +00:00
Yih-Dar	8257734b5f	Fix `llava` tests (#38722 ) * update * fix 1 * fix 2 * fix 3 * fix 4 * fix 5 * fix 6 * fix 7 * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-10 13:53:17 +02:00
वेदांत	71f7385942	Logging message for `` `is_bitsandbytes_available()` `` (#38528 ) Some checks are pending Self-hosted runner (benchmark) / Benchmark (aws-g5-4xlarge-cache) (push) Waiting to run Details Build documentation / build (push) Waiting to run Details Slow tests on important models (on Push - A10) / Get all modified files (push) Waiting to run Details Slow tests on important models (on Push - A10) / Slow & FA2 tests (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Check if setup was changed (push) Waiting to run Details Self-hosted runner (push-caller) / build-docker-containers (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Trigger Push CI (push) Blocked by required conditions Details Secret Leaks / trufflehog (push) Waiting to run Details Update Transformers metadata / build_and_package (push) Waiting to run Details * bnb import log * bnb import log * log mesage change * moved error issue in qunatizer_bnb_4_bit.py * ruff * arg added for bnb check * required changes --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-06-10 10:15:01 +00:00
Yih-Dar	04cdf83244	Update some tests for torch 2.7.1 (#38701 ) * fix 1 * fix 2 * fix 3 * fix 4 * fp16 * break * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-10 11:46:52 +02:00
rdonggroq	afdb821318	Fix smart resize (#38706 ) * Fix smart_resize bug * Add smart_resize test * Remove unnecessary error checking * Fix smart_resize tests --------- Co-authored-by: Richard Dong <rdong@rdong.c.groq-143208.internal>	2025-06-10 08:59:22 +00:00
Yana Mishula	81799d8b55	Standardize ByT5 model card format (#38699 ) Some checks are pending Self-hosted runner (benchmark) / Benchmark (aws-g5-4xlarge-cache) (push) Waiting to run Details Build documentation / build (push) Waiting to run Details Slow tests on important models (on Push - A10) / Get all modified files (push) Waiting to run Details Slow tests on important models (on Push - A10) / Slow & FA2 tests (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Check if setup was changed (push) Waiting to run Details Self-hosted runner (push-caller) / build-docker-containers (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Trigger Push CI (push) Blocked by required conditions Details Secret Leaks / trufflehog (push) Waiting to run Details Update Transformers metadata / build_and_package (push) Waiting to run Details * Standardize ByT5 model card format * Apply review feedback from @stevhliu * Fix Notes formatting and wording * Fix `aya_vision` test (#38674) * fix 1: load_in_4bit=True, * fix 2: decorateor * fixfix 2: breakpoint * fixfix 3: update * fixfix 4: fast * fixfix 5: cond * fixfix 5: cond * fixfix 6: cuda 8 * ruff * breakpoint * dtype * a10 * a10 --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Fix autodoc formatting for ByT5Tokenizer --------- Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-09 15:02:50 -07:00
Yih-Dar	e55983e2b9	Fix `aya_vision` test (#38674 ) * fix 1: load_in_4bit=True, * fix 2: decorateor * fixfix 2: breakpoint * fixfix 3: update * fixfix 4: fast * fixfix 5: cond * fixfix 5: cond * fixfix 6: cuda 8 * ruff * breakpoint * dtype * a10 * a10 --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-09 22:18:52 +02:00
Aashish Anand	b61c47f5a5	Created model card for xlm-roberta-xl (#38597 ) * Created model card for xlm-roberta-xl * Update XLM-RoBERTa-XL model card with improved descriptions and usage examples * Minor option labeling fix * Added MaskedLM version of XLM RoBERTa XL to model card * Added quantization example for XLM RoBERTa XL model card * minor fixes to xlm roberta xl model card * Minor fixes to mask format in xlm roberta xl model card	2025-06-09 13:00:38 -07:00
Aashish Anand	e594e75f1b	Update XLM-RoBERTa model documentation with enhanced usage examples and improved layout (#38596 ) * Update XLM-RoBERTa model documentation with enhanced usage examples and improved layout * Added CLI command example and quantization example for XLM RoBERTa model card. * Minor change to transformers CLI and quantization example for XLM roberta model card	2025-06-09 12:26:31 -07:00
Aashish Anand	29ca043856	Created model card for XLM model (#38595 ) * Created model card for XLM model * Revised model card structure and content of XLM model * Update XLM model documentation with improved examples and code snippets for predicting <mask> tokens using Pipeline and AutoModel.	2025-06-09 12:26:23 -07:00
Marcel Ambo Ndowah	25f711aa89	Drop as_target_processor from the _call_ and pad methods (#38642 ) Drop as_target_processor from _call_ and pad methods; reformat docstrings for readability	2025-06-09 12:26:09 -07:00
Matthew Douglas	837ddac1ec	Docs: update bitsandbytes torch.compile compatibility (#38651 ) Some checks are pending Self-hosted runner (benchmark) / Benchmark (aws-g5-4xlarge-cache) (push) Waiting to run Details Build documentation / build (push) Waiting to run Details New model PR merged notification / Notify new model (push) Waiting to run Details Slow tests on important models (on Push - A10) / Get all modified files (push) Waiting to run Details Slow tests on important models (on Push - A10) / Slow & FA2 tests (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Check if setup was changed (push) Waiting to run Details Self-hosted runner (push-caller) / build-docker-containers (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Trigger Push CI (push) Blocked by required conditions Details Secret Leaks / trufflehog (push) Waiting to run Details Update Transformers metadata / build_and_package (push) Waiting to run Details	2025-06-09 14:51:57 -04:00
dbleyl	b9faf2f930	Fix TypeError: 'NoneType' object is not iterable for esm (#38667 ) (#38668 ) Add post_init() calls to EsmForMaskedLM, EsmForTokenClassification and EsmForSequenceClassification.	2025-06-09 15:23:20 +00:00
Fiona Waters	11dca07a10	Fix retrieve function signature and remove faiss requirement (#38624 ) Signed-off-by: Fiona Waters <fiwaters6@gmail.com>	2025-06-09 15:17:33 +00:00
xiao	b31d462c61	Fix some models import (#38694 ) Fix models import	2025-06-09 16:09:24 +01:00
pweglik	282d6684dc	Fix attention mask expansion when converting to executorch (#38637 )	2025-06-09 15:00:55 +00:00
Anthony	19224c3642	fix: "check out" as verb (#38678 ) "check out" as verb	2025-06-09 14:07:31 +00:00
StevenBucaille	237ff80387	Fixed modeling_auto.py MODEL_FOR_MASK_GENERATION_MAPPING_NAMES variable (#38664 ) fix: grouped the two MODEL_FOR_MASK_GENERATION_MAPPING_NAMES variables	2025-06-09 13:40:46 +00:00
Isotr0py	d7b87b415a	Fix qwen2-audio chat template audio placeholder insertion (#38640 ) Some checks are pending Self-hosted runner (benchmark) / Benchmark (aws-g5-4xlarge-cache) (push) Waiting to run Details Build documentation / build (push) Waiting to run Details Slow tests on important models (on Push - A10) / Get all modified files (push) Waiting to run Details Slow tests on important models (on Push - A10) / Slow & FA2 tests (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Check if setup was changed (push) Waiting to run Details Self-hosted runner (push-caller) / build-docker-containers (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Trigger Push CI (push) Blocked by required conditions Details Secret Leaks / trufflehog (push) Waiting to run Details Update Transformers metadata / build_and_package (push) Waiting to run Details * fix qwen2-audio template Signed-off-by: Isotr0py <2037008807@qq.com> * add message['type'] back Signed-off-by: Isotr0py <2037008807@qq.com> --------- Signed-off-by: Isotr0py <2037008807@qq.com>	2025-06-09 09:56:42 +00:00
Yih-Dar	10627c1a0f	Use torch 2.7.1 on daily CI (#38620 ) Some checks are pending Self-hosted runner (benchmark) / Benchmark (aws-g5-4xlarge-cache) (push) Waiting to run Details Build documentation / build (push) Waiting to run Details Slow tests on important models (on Push - A10) / Get all modified files (push) Waiting to run Details Slow tests on important models (on Push - A10) / Slow & FA2 tests (push) Blocked by required conditions Details Secret Leaks / trufflehog (push) Waiting to run Details Update Transformers metadata / build_and_package (push) Waiting to run Details * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-08 14:37:45 +02:00
Yih-Dar	ebeec13609	Fix `InternVL` integration test (#38612 ) Some checks failed Self-hosted runner (benchmark) / Benchmark (aws-g5-4xlarge-cache) (push) Has been cancelled Details Build documentation / build (push) Has been cancelled Details Slow tests on important models (on Push - A10) / Get all modified files (push) Has been cancelled Details Self-hosted runner (push-caller) / Check if setup was changed (push) Has been cancelled Details Secret Leaks / trufflehog (push) Has been cancelled Details Update Transformers metadata / build_and_package (push) Has been cancelled Details Slow tests on important models (on Push - A10) / Slow & FA2 tests (push) Has been cancelled Details Self-hosted runner (push-caller) / build-docker-containers (push) Has been cancelled Details Self-hosted runner (push-caller) / Trigger Push CI (push) Has been cancelled Details * fix * fix * fix OOM --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-07 08:30:47 +02:00
Yih-Dar	3fb7e7bc01	Skip torchscript tests for 2 models (#38643 ) Some checks are pending Self-hosted runner (benchmark) / Benchmark (aws-g5-4xlarge-cache) (push) Waiting to run Details Build documentation / build (push) Waiting to run Details Slow tests on important models (on Push - A10) / Get all modified files (push) Waiting to run Details Slow tests on important models (on Push - A10) / Slow & FA2 tests (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Check if setup was changed (push) Waiting to run Details Self-hosted runner (push-caller) / build-docker-containers (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Trigger Push CI (push) Blocked by required conditions Details Secret Leaks / trufflehog (push) Waiting to run Details Update Transformers metadata / build_and_package (push) Waiting to run Details fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-06 20:17:37 +02:00
Yao Matrix	dc76eff12b	remove ipex_optimize_model usage (#38632 ) * remove ipex_optimize_model usage Signed-off-by: YAO Matrix <matrix.yao@intel.com> * update Dockerfile Signed-off-by: root <root@a4bf01945cfe.jf.intel.com> --------- Signed-off-by: YAO Matrix <matrix.yao@intel.com> Signed-off-by: root <root@a4bf01945cfe.jf.intel.com> Co-authored-by: root <root@a4bf01945cfe.jf.intel.com>	2025-06-06 20:04:44 +02:00
Yih-Dar	5009252a05	Better CI (#38552 ) Some checks are pending Self-hosted runner (benchmark) / Benchmark (aws-g5-4xlarge-cache) (push) Waiting to run Details Build documentation / build (push) Waiting to run Details Slow tests on important models (on Push - A10) / Get all modified files (push) Waiting to run Details Slow tests on important models (on Push - A10) / Slow & FA2 tests (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Check if setup was changed (push) Waiting to run Details Self-hosted runner (push-caller) / build-docker-containers (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Trigger Push CI (push) Blocked by required conditions Details Secret Leaks / trufflehog (push) Waiting to run Details Update Transformers metadata / build_and_package (push) Waiting to run Details better CI Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-06 17:59:14 +02:00
jiqing-feng	2e889c18e1	fix torch_dtype on awq (#38463 ) Signed-off-by: jiqing-feng <jiqing.feng@intel.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-06-06 17:14:00 +02:00
inkcherry	871901cb3d	fix total batch size calculation in trainer (#38286 ) * fix total batch size calculation * update Signed-off-by: inkcherry <mingzhi.liu@intel.com> * Update src/transformers/trainer.py --------- Signed-off-by: inkcherry <mingzhi.liu@intel.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-06-06 14:54:00 +00:00
Yih-Dar	02f946a038	Don't run `AriaForConditionalGenerationModelTest` on CircleCI (#38615 ) Some checks are pending Self-hosted runner (benchmark) / Benchmark (aws-g5-4xlarge-cache) (push) Waiting to run Details Build documentation / build (push) Waiting to run Details Slow tests on important models (on Push - A10) / Get all modified files (push) Waiting to run Details Slow tests on important models (on Push - A10) / Slow & FA2 tests (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Check if setup was changed (push) Waiting to run Details Self-hosted runner (push-caller) / build-docker-containers (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Trigger Push CI (push) Blocked by required conditions Details Secret Leaks / trufflehog (push) Waiting to run Details Update Transformers metadata / build_and_package (push) Waiting to run Details git rid of this model Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-06 11:30:31 +02:00
Mehant Kammakomati	3d15606e64	fix: support grad clipping for TP through replicating non-sharded modules (#36132 ) * feat: fix tp grad norm: Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com> * feat: use implicit replication Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com> --------- Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-06-06 11:07:22 +02:00
Yih-Dar	fca6748246	Improve `test_initialization` for `SwiftFormer` (#38636 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-06 10:47:10 +02:00
Yih-Dar	92a87134ea	update `ColQwen2ModelIntegrationTest` (#38583 ) * update * update * update * update * 4 bit * 8 bit * final --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-06 10:41:17 +02:00
Raushan Turganbay	dbfc79c17c	[generation] bring back tests on vision models (#38603 ) * bring back geenration tests on VLMs * remove head mask tests overwritten	2025-06-06 08:23:15 +00:00
Yih-Dar	90c4b90a10	Use torch 2.7.1 on CircleCI jobs (#37856 ) 2.7.1 Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-06 10:16:57 +02:00
Yih-Dar	3e35ea1782	Improve `test_initialization` (#38607 ) * fix flaky init tests * fix flaky init tests --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-06 10:08:05 +02:00
Yao Matrix	89542fb81c	enable more test cases on xpu (#38572 ) * enable glm4 integration cases on XPU, set xpu expectation for blip2 Signed-off-by: Matrix YAO <matrix.yao@intel.com> * more Signed-off-by: YAO Matrix <matrix.yao@intel.com> * fix style Signed-off-by: YAO Matrix <matrix.yao@intel.com> * refine wording Signed-off-by: YAO Matrix <matrix.yao@intel.com> * refine test case names Signed-off-by: YAO Matrix <matrix.yao@intel.com> * run Signed-off-by: YAO Matrix <matrix.yao@intel.com> * add gemma2 and chameleon Signed-off-by: YAO Matrix <matrix.yao@intel.com> * fix review comments Signed-off-by: YAO Matrix <matrix.yao@intel.com> --------- Signed-off-by: Matrix YAO <matrix.yao@intel.com> Signed-off-by: YAO Matrix <matrix.yao@intel.com>	2025-06-06 09:29:51 +02:00
Armaghan Shakir	31023b6909	Fix `MiniMax` (docs and integration tests checkpoint) (#38575 ) * update checkpoints for integration tests * minor fixes in docs	2025-06-06 08:43:11 +02:00

1 2 3 4 5 ...

19358 Commits