transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-03 04:40:06 +06:00

Author	SHA1	Message	Date
Cyril Vallez	0725cd6953	Remove deprecated classes in modeling_utils.py (#38919 ) Some checks are pending Self-hosted runner (benchmark) / Benchmark (aws-g5-4xlarge-cache) (push) Waiting to run Details Build documentation / build (push) Waiting to run Details Slow tests on important models (on Push - A10) / Get all modified files (push) Waiting to run Details Slow tests on important models (on Push - A10) / Slow & FA2 tests (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Check if setup was changed (push) Waiting to run Details Self-hosted runner (push-caller) / build-docker-containers (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Trigger Push CI (push) Blocked by required conditions Details Secret Leaks / trufflehog (push) Waiting to run Details Update Transformers metadata / build_and_package (push) Waiting to run Details * remove deprecated classes * style	2025-06-19 19:25:20 +02:00
Hamza Benchekroun	797860c68c	feat: add flexible Liger Kernel configuration to TrainingArguments (#38911 ) Some checks are pending Self-hosted runner (benchmark) / Benchmark (aws-g5-4xlarge-cache) (push) Waiting to run Details Build documentation / build (push) Waiting to run Details New model PR merged notification / Notify new model (push) Waiting to run Details Slow tests on important models (on Push - A10) / Get all modified files (push) Waiting to run Details Slow tests on important models (on Push - A10) / Slow & FA2 tests (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Check if setup was changed (push) Waiting to run Details Self-hosted runner (push-caller) / build-docker-containers (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Trigger Push CI (push) Blocked by required conditions Details Secret Leaks / trufflehog (push) Waiting to run Details Update Transformers metadata / build_and_package (push) Waiting to run Details * feat: add flexible Liger Kernel configuration to TrainingArguments Add support for granular Liger Kernel configuration through a new `liger_kernel_config` parameter in TrainingArguments. This allows users to selectively enable/disable specific kernels (rope, swiglu, cross_entropy, etc.) instead of the current approach that rely on default configuration. Features: - Add `liger_kernel_config` dict parameter to TrainingArguments - Support selective kernel application for all supported models - Maintain full backward compatibility with existing `use_liger_kernel` flag Example usage: ```python TrainingArguments( use_liger_kernel=True, liger_kernel_config={ "rope": True, "swiglu": True, "cross_entropy": False, "fused_linear_cross_entropy": True } ) Closes #38905 * Address comments and update Liger section in Trainer docs	2025-06-19 15:54:08 +00:00
Matt	89b35be618	Allow make-fixup on main branch, albeit slowly (#38892 ) * Allow make-fixup on main branch, albeit slowly * Make the other style checks work correctly on main too * More update * More makefile update	2025-06-19 15:22:59 +01:00
Gabe Goodhart	9a02e7602d	feat: Add granite architectures to auto tokenizer name mappings (#38802 ) Branch: GraniteTokenizerMapping Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>	2025-06-19 15:20:42 +01:00
Matt	54a02160eb	Fix ReDOS in tokenizer digit substitution (#38844 ) * Fix regexes vulnerable to ReDOS * Let's just use regex * Import regex/re correctly	2025-06-19 14:53:52 +01:00
ivarflakstad	af6120b3eb	Skip sdpa tests if submodule does not support sdpa (#38907 )	2025-06-19 13:11:01 +00:00
Yih-Dar	5d26a38735	Fix `FalconMambaIntegrationTests` (#38566 ) * update * update * update * update * update * update * update * update * update * update * update * update * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-19 13:50:33 +02:00
Yao Matrix	a9ce8c69c9	align xpu's autocast behavior w/ cuda by using device agnostic torch APIs (#38284 ) * siwtch to device agnostic autocast in nemotron to align xpu behavior w/ cuda Signed-off-by: Matrix Yao <matrix.yao@intel.com> * fix issue Signed-off-by: Matrix Yao <matrix.yao@intel.com> * fix style Signed-off-by: Matrix Yao <matrix.yao@intel.com> * use torch.cast as other modeling code for decision_transformer&gpt2&imagegpt Signed-off-by: Matrix Yao <matrix.yao@intel.com> * refine Signed-off-by: Matrix Yao <matrix.yao@intel.com> * update get_autocast_gpu_dtype to device agnostic one Signed-off-by: Matrix YAO <matrix.yao@intel.com> * fix style Signed-off-by: Matrix YAO <matrix.yao@intel.com> * fix comments Signed-off-by: YAO Matrix <matrix.yao@intel.com> * fix style Signed-off-by: YAO Matrix <matrix.yao@intel.com> --------- Signed-off-by: Matrix Yao <matrix.yao@intel.com> Signed-off-by: Matrix YAO <matrix.yao@intel.com> Signed-off-by: YAO Matrix <matrix.yao@intel.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-06-19 11:48:23 +00:00
Yuanyuan Chen	0a53df1a77	Fix unnecessary super calls (#38897 ) Signed-off-by: cyy <cyyever@outlook.com>	2025-06-19 11:45:51 +00:00
Yih-Dar	b949747b54	Fix `fsmt` tests (#38904 ) * fix 1 * fix 2 * fix 3 * fix 4 * fix 5 --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-19 10:56:34 +02:00
eustlb	11738f8537	[phi-4] use mel filters from audio utils (#36966 ) Some checks are pending Self-hosted runner (benchmark) / Benchmark (aws-g5-4xlarge-cache) (push) Waiting to run Details Build documentation / build (push) Waiting to run Details Slow tests on important models (on Push - A10) / Get all modified files (push) Waiting to run Details Slow tests on important models (on Push - A10) / Slow & FA2 tests (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Check if setup was changed (push) Waiting to run Details Self-hosted runner (push-caller) / build-docker-containers (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Trigger Push CI (push) Blocked by required conditions Details Secret Leaks / trufflehog (push) Waiting to run Details Update Transformers metadata / build_and_package (push) Waiting to run Details * use mel_filter_bank from audio utils * Apply style fixes --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-06-19 12:35:32 +09:00
Lucain	f7b21822e3	Use `raise from e` in `hub.py` utility (#37241 ) Use raise from e Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2025-06-19 03:06:25 +00:00
Isaac Breen	3756bf192c	Add support for specifying revisions when pushing to Hub via internal Trainer call (#36852 ) * Update training_args.py * Update trainer.py * fixes * fix * remove extraneous comments * explicit revision arg * add msg * fixup * fix field name * rename field revision to hub_revision * restore gradient_checkpointing doc * fix ws --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2025-06-19 02:35:33 +00:00
Dhruv	458e0b376c	Update bamba model card (#38853 ) Some checks are pending Self-hosted runner (benchmark) / Benchmark (aws-g5-4xlarge-cache) (push) Waiting to run Details Build documentation / build (push) Waiting to run Details Slow tests on important models (on Push - A10) / Get all modified files (push) Waiting to run Details Slow tests on important models (on Push - A10) / Slow & FA2 tests (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Check if setup was changed (push) Waiting to run Details Self-hosted runner (push-caller) / build-docker-containers (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Trigger Push CI (push) Blocked by required conditions Details Secret Leaks / trufflehog (push) Waiting to run Details Update Transformers metadata / build_and_package (push) Waiting to run Details * Update bamba model card * Update the doc for bamba * Update docs/source/en/model_doc/bamba.md Bamba paragraph Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bamba.md Bamba collection url Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bamba.md Update Padding-Free Training to Notes heading Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bamba.md update examples Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bamba.md Update additional info Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bamba.md consistent casing Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bamba.md simplify sentences Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Include pipeline and cli examples + fix formatting * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/bamba.md update cli id * Update quantization example * Fix auto code formatter changes * Update cli command + include BambaModel * Update docs/source/en/model_doc/bamba.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-06-18 16:01:25 -07:00
Raushan Turganbay	ea01334873	[video processor] fix slow tests (#38881 ) * we need to check against mapping to be safe * need to check only when inferring from image type, otherwise messes custom code --------- Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-06-18 22:39:56 +02:00
Sam Rae	b922b22ec2	36978 \| Fast image processor for DPT model (#37481 ) * chore: ran codegen script * test: test_image_processor_properties * test: test_image_processor_from_dict_with_kwargs * test: wip - test_padding * test: test_padding * test: test_keep_aspect_ratio * wip * test * test: wip * test: wip * test: test_call_segmentation_maps, wip * chore: tidy up * test: test_call_segmentation_maps * fix: test_save_load_fast_slow * test: reduce labels * chore: make fixup * chore: rm comment * chore: tidy * chore remove comment * refactor: no need to infer channel dimesnion * refactor: encapsulate logic for preparing segmentation maps * refactor: improve readability of segmentation_map preparation * improvement: batched version of pad_image * chore: fixup * docs * chore: make quality * chore: remove unecessary comment * fix: add SemanticSegmentationMixin * feat: add post_process_depth_estimation to fast dpt image processor * chore: fix formatting * remove max_height, max_width * fix: better way of processin segmentation maps - copied from Beit Fast processor * chore: formatting + remove TODO * chore: fixup styles * chore: remove unecessary line break * chore: core review suggestion to remove autodocstring * fix: add do_reduce_labels logic + refactor - refactor preprocess logic to make it consistent with other processors - add missing reduce labels logic * refactor: remove deprecated mixin * chore: fixup * use modular for dpt + final nit changes * fix style --------- Co-authored-by: Samuel Rae <samuelrae@Samuels-Air.fritz.box> Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>	2025-06-18 17:33:29 +00:00
Ashu_kun	c27f628e98	Docs: Add custom fine-tuning tutorial to TrOCR model page (#38847 ) * Update trocr.md Docs: add community fine‑tuning notebook link to TrOCR page * apply suggested changes from PR review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/trocr.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-06-18 09:38:58 -07:00
Keshav Singh	0a289d1630	log: Add logging when using split_batches and per_device_train_batch_size (#38633 ) * log: Add logging when user uses split_batches and per_device_train_batch_size * refactor: remove whitespace from blank line * Update src/transformers/training_args.py Change logging level to info Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-06-18 16:26:46 +00:00
Quyu Kong	c55d806355	[bugfix] fix ATTN_MASK_NPU device mismatch error on multi-device NPU … (#38876 ) [bugfix] fix ATTN_MASK_NPU device mismatch error on multi-device NPU setups	2025-06-18 16:26:22 +00:00
Matt	9cd7570f34	Fix loop var naming (#38885 ) Some checks are pending Self-hosted runner (benchmark) / Benchmark (aws-g5-4xlarge-cache) (push) Waiting to run Details Build documentation / build (push) Waiting to run Details New model PR merged notification / Notify new model (push) Waiting to run Details Slow tests on important models (on Push - A10) / Get all modified files (push) Waiting to run Details Slow tests on important models (on Push - A10) / Slow & FA2 tests (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Check if setup was changed (push) Waiting to run Details Self-hosted runner (push-caller) / build-docker-containers (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Trigger Push CI (push) Blocked by required conditions Details Secret Leaks / trufflehog (push) Waiting to run Details Update Transformers metadata / build_and_package (push) Waiting to run Details	2025-06-18 13:45:01 +00:00
Yuanyuan Chen	1fc67a25c6	More PYUP fixes (#38883 ) More pyup fixes Signed-off-by: cyy <cyyever@outlook.com>	2025-06-18 14:38:08 +01:00
Wing Lian	12d4c5b66f	null deepspeed_plugin in args for wandb callback fake trainer (#38867 )	2025-06-18 13:10:22 +00:00
Stefan	3620b32cc8	Fixed markdown for BertTokenizer's '[CLS]' token. (#38506 )	2025-06-18 13:09:58 +00:00
艾梦	cb0f604192	Fix HQQ model param device transfer issue (#38466 ) * Fix HQQ model param device transfer issue * modify a comment * clear the code and add test for hqq device/dtype * fix test hqq code quality of imports --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-06-18 15:09:00 +02:00
Yih-Dar	c77bcd889f	Fix `qwen3_moe` tests (#38865 ) * try 1 * try 2 * try 3 * try 4 * try 5 * try 6 * try 7 * try 8 * try 9 * try 10 --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-18 14:36:03 +02:00
Cyril Vallez	5a95ed5ca0	🚨🚨 Fix initialization of Mask2Former (#38864 ) * Correctly fix init Co-authored-by: BUI Van Tuan <buivantuan07@gmail.com> * add back the block, breaking BC but this is correct author's code * override the test for params needing it --------- Co-authored-by: BUI Van Tuan <buivantuan07@gmail.com>	2025-06-18 09:46:22 +02:00
Yih-Dar	309e8c96f2	Fix `phi4_multimodal` tests (#38816 ) * fix * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-18 09:39:17 +02:00
Yao Matrix	3526e25d3d	enable misc test cases on XPU (#38852 ) Some checks are pending Self-hosted runner (benchmark) / Benchmark (aws-g5-4xlarge-cache) (push) Waiting to run Details Build documentation / build (push) Waiting to run Details Slow tests on important models (on Push - A10) / Get all modified files (push) Waiting to run Details Slow tests on important models (on Push - A10) / Slow & FA2 tests (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Check if setup was changed (push) Waiting to run Details Self-hosted runner (push-caller) / build-docker-containers (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Trigger Push CI (push) Blocked by required conditions Details Secret Leaks / trufflehog (push) Waiting to run Details Update Transformers metadata / build_and_package (push) Waiting to run Details * enable misc test cases on XPU Signed-off-by: YAO Matrix <matrix.yao@intel.com> * fix style Signed-off-by: YAO Matrix <matrix.yao@intel.com> * tweak bamba ground truth on XPU Signed-off-by: YAO Matrix <matrix.yao@intel.com> * remove print Signed-off-by: YAO Matrix <matrix.yao@intel.com> * one more Signed-off-by: YAO Matrix <matrix.yao@intel.com> * fix style Signed-off-by: YAO Matrix <matrix.yao@intel.com> --------- Signed-off-by: YAO Matrix <matrix.yao@intel.com>	2025-06-18 09:20:49 +02:00
Matt	d058f81e5b	Post-PR fixes! (#38868 ) Some checks are pending Self-hosted runner (benchmark) / Benchmark (aws-g5-4xlarge-cache) (push) Waiting to run Details Build documentation / build (push) Waiting to run Details New model PR merged notification / Notify new model (push) Waiting to run Details Slow tests on important models (on Push - A10) / Get all modified files (push) Waiting to run Details Slow tests on important models (on Push - A10) / Slow & FA2 tests (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Check if setup was changed (push) Waiting to run Details Self-hosted runner (push-caller) / build-docker-containers (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Trigger Push CI (push) Blocked by required conditions Details Secret Leaks / trufflehog (push) Waiting to run Details Update Transformers metadata / build_and_package (push) Waiting to run Details * Post-PR fixes! * make fix-copies	2025-06-17 19:58:47 +01:00
Matt	508a704055	No more Tuple, List, Dict (#38797 ) * No more Tuple, List, Dict * make fixup * More style fixes * Docstring fixes with regex replacement * Trigger tests * Redo fixes after rebase * Fix copies * [test all] * update * [test all] * update * [test all] * make style after rebase * Patch the hf_argparser test * Patch the hf_argparser test * style fixes * style fixes * style fixes * Fix docstrings in Cohere test * [test all] --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-17 19:37:18 +01:00
SohamPrabhu	a396f4324b	Update roc bert docs (#38835 ) * Moved the sources to the right * small Changes * Some Changes to moonshine * Added the install to pipline * updated the monshine model card * Update docs/source/en/model_doc/moonshine.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/moonshine.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/moonshine.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/moonshine.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Updated Documentation According to changes * Fixed the model with the commits * Changes to the roc_bert * Final Update to the branch * Adds Quantizaiton to the model * Finsihed Fixing the Roc_bert docs * Fixed Moshi * Fixed Problems * Fixed Problems * Fixed Problems * Fixed Problems * Fixed Problems * Fixed Problems * Added the install to pipline * updated the monshine model card * Update docs/source/en/model_doc/moonshine.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/moonshine.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/moonshine.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Updated Documentation According to changes * Fixed the model with the commits * Fixed the problems * Final Fix * Final Fix * Final Fix * Update roc_bert.md --------- Co-authored-by: Your Name <sohamprabhu@Mac.fios-router.home> Co-authored-by: Your Name <sohamprabhu@Sohams-MacBook-Air.local> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-06-17 11:02:18 -07:00
Md. Muhaimin Rahman	3ae52cc312	Update CvT documentation with improved usage examples and additional … (#38731 ) * Update CvT documentation with improved usage examples and additional notes * initial update * cvt * Update docs/source/en/model_doc/cvt.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update cvt.md --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-06-17 10:30:03 -07:00
StevenBucaille	e5a9ce48f7	Add LightGlue model (#31718 ) * init * chore: various changes to LightGlue * chore: various changes to LightGlue * chore: various changes to LightGlue * chore: various changes to LightGlue * Fixed dynamo bug and image padding tests * refactor: applied refactoring changes from SuperGlue's concat, batch and stack functions to LightGlue file * tests: removed sdpa support and changed expected values * chore: added some docs and refactoring * chore: fixed copy to superpoint.image_processing_superpoint.convert_to_grayscale * feat: adding batch implementation * feat: added validation for preprocess and post process method to LightGlueImageProcessor * chore: changed convert_lightglue_to_hf script to comply with new standard * chore: changed lightglue test values to match new lightglue config pushed to hub * chore: simplified convert_lightglue_to_hf conversion map * feat: adding batching implementation * chore: make style * feat: added threshold to post_process_keypoint_matching method * fix: added missing instructions that turns keypoints back to absolute coordinate before matching forward * fix: added typehint and docs * chore: make style * [run-slow] lightglue * fix: add matches different from -1 to compute valid matches in post_process_keypoint_matching * tests: added CUDA proof tests similar to SuperGlue * chore: various changes to modeling_lightglue.py - Added "Copies from" statements for copied functions from modeling_superglue.py - Added missing docstrings - Removed unused functions or classes - Removed unnecessary statements - Added missing typehints - Added comments to the main forward method * chore: various changes to convert_lightglue_to_hf.py - Added model saving - Added model reloading * chore: fixed imports in lightglue files * [run-slow] lightglue * chore: make style * [run-slow] lightglue * Apply suggestions from code review Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * [run-slow] lightglue * chore: Applied some suggestions from review - Added missing typehints - Refactor "cuda" to device variable - Variable renaming - LightGlue output order changed - Make style * fix: added missing grayscale argument in image processor in case use of SuperPoint keypoint detector * fix: changed lightglue HF repo to lightglue_superpoint with grayscale default to True * refactor: make keypoints `(batch_size, num_keypoints, keypoint_dim)` through forward and unsqueeze only before attention layer * refactor: refactor do_layer_keypoint_pruning * tests: added tests with no early stop and keypoint pruning * refactor: various refactoring to modeling_lightglue.py - Removed unused functions - Renamed variables for consistency - Added comments for clarity - Set methods to private in LightGlueForKeypointMatching - Replaced tensor initialization to list then concatenation - Used more pythonic list comprehension for repetitive instructions * refactor: added comments and renamed filter_matches to get_matches_from_scores * tests: added copied from statement with superglue tests * docs: added comment to prepare_keypoint_matching_output function in tests * [run-slow] lightglue * refactor: reordered _concat_early_stopped_outputs in LightGlue class * [run-slow] lightglue * docs: added lightglue.md model doc * docs: added Optional typehint to LightGlueKeypointMatchingOutput * chore: removed pad_images function * chore: set do_grayscale default value to True in LightGlueImageProcessor * Apply suggestions from code review Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Apply suggestions from code review Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * docs: added missing LightGlueConfig typehint in nn.Module __init__ methods * docs: removed unnecessary code in docs * docs: import SuperPointConfig only from a TYPE_CHECKING context * chore: use PretrainedConfig arguments `num_hidden_layers` and `num_attention_heads` instead of `num_layers` and `num_heads` * chore: added organization as arg in convert_lightglue_to_hf.py script * refactor: set device variable * chore: added "gelu" in LightGlueConfig as hidden_act parameter * docs: added comments to reshape.flip.reshape instruction to perform cross attention * refactor: used batched inference for keypoint detector forward pass * fix: added fix for SDPA tests * docs: fixed docstring for LightGlueImageProcessor * [run-slow] lightglue * refactor: removed unused line * refactor: added missing arguments in LightGlueConfig init method * docs: added missing LightGlueConfig typehint in init methods * refactor: added checkpoint url as default variable to verify models output only if it is the default url * fix: moved print message inside if statement * fix: added log assignment r removal in convert script * fix: got rid of confidence_thresholds as registered buffers * refactor: applied suggestions from SuperGlue PR * docs: changed copyright to 2025 * refactor: modular LightGlue * fix: removed unnecessary import * feat: added plot_keypoint_matching method to LightGlueImageProcessor with matplotlib soft dependency * fix: added missing import error for matplotlib * Updated convert script to push on ETH org * fix: added missing licence * fix: make fix-copies * refactor: use cohere apply_rotary_pos_emb function * fix: update model references to use ETH-CVG/lightglue_superpoint * refactor: add and use intermediate_size attribute in config to inherit CLIPMLP for LightGlueMLP * refactor: explicit variables instead of slicing * refactor: use can_return_tuple decorator in LightGlue model * fix: make fix-copies * docs: Update model references in `lightglue.md` to use the correct pretrained model from ETH-CVG * Refactor LightGlue configuration and processing classes - Updated type hints for `keypoint_detector_config` in `LightGlueConfig` to use `SuperPointConfig` directly. - Changed `size` parameter in `LightGlueImageProcessor` to be optional. - Modified `position_embeddings` in `LightGlueAttention` and `LightGlueAttentionBlock` to be optional tuples. - Cleaned up import statements across multiple files for better readability and consistency. * refactor: Update LightGlue configuration to enforce eager attention implementation - Added `attn_implementation="eager"` to `keypoint_detector_config` in `LightGlueConfig` and `LightGlueAttention` classes. - Removed unnecessary logging related to attention implementation fallback. - Cleaned up import statements for better readability. * refactor: renamed message into attention_output * fix: ensure device compatibility in LightGlueMatchAssignmentLayer descriptor normalization - Updated the normalization of `m_descriptors` to use the correct device for the tensor, ensuring compatibility across different hardware setups. * refactor: removed Conv layers from init_weights since LightGlue doesn't have any * refactor: replace add_start_docstrings with auto_docstring in LightGlue models - Updated LightGlue model classes to utilize the new auto_docstring utility for automatic documentation generation. - Removed legacy docstring handling to streamline the code and improve maintainability. * refactor: simplify LightGlue image processing tests by inheriting from SuperGlue - Refactored `LightGlueImageProcessingTester` and `LightGlueImageProcessingTest` to inherit from their SuperGlue counterparts, reducing code duplication. - Removed redundant methods and properties, streamlining the test setup and improving maintainability. * test: forced eager attention implementation to LightGlue model tests - Updated `LightGlueModelTester` to include `attn_implementation="eager"` in the model configuration. - This change aligns the test setup with the recent updates in LightGlue configuration for eager attention. * refactor: update LightGlue model references * fix: import error * test: enhance LightGlue image processing tests with setup method - Added a setup method in `LightGlueImageProcessingTest` to initialize `LightGlueImageProcessingTester`. - Included a docstring for `LightGlueImageProcessingTester` to clarify its purpose. * refactor: added LightGlue image processing implementation to modular file * refactor: moved attention blocks into the transformer layer * fix: added missing import * fix: added missing import in __all__ variable * doc: added comment about enforcing eager attention because of SuperPoint * refactor: added SuperPoint eager attention comment and moved functions to the closest they are used --------- Co-authored-by: Steven Bucaille <steven.bucaille@buawei.com> Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2025-06-17 18:10:23 +02:00
Yih-Dar	2507169bf6	Fix `qwen3` tests (#38862 ) Some checks are pending Self-hosted runner (benchmark) / Benchmark (aws-g5-4xlarge-cache) (push) Waiting to run Details Build documentation / build (push) Waiting to run Details New model PR merged notification / Notify new model (push) Waiting to run Details Slow tests on important models (on Push - A10) / Get all modified files (push) Waiting to run Details Slow tests on important models (on Push - A10) / Slow & FA2 tests (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Check if setup was changed (push) Waiting to run Details Self-hosted runner (push-caller) / build-docker-containers (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Trigger Push CI (push) Blocked by required conditions Details Secret Leaks / trufflehog (push) Waiting to run Details Update Transformers metadata / build_and_package (push) Waiting to run Details * fix * update * update * update * update * update * update * format --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-17 15:21:36 +02:00
Mikhail Moskovchenko	41e0c921cb	Improve `auxiliary_in_channels` default behavior in UperNet (#37540 ) Improve auxiliary_in_channels behavior in UperNet Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2025-06-17 12:56:46 +00:00
Yih-Dar	c61ca64aaa	Fix `qwen2_5_vl` tests (#38845 ) * fix * breakpoint() * breakpoint() * update * update * update * update * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-17 10:55:24 +02:00
Kimish Patel	37367c7d9f	Allow customization of sdpa in executorch.py (#38827 ) Earlier PR put executorch specific sdpa and mask function in the export function. This prevent any customization that can be done to sdpa, prior to export. By moving this to __init__, we still keep the original behavior but allow users like optimum-executorch to override sdpa by setting model.config._attn_implementation.	2025-06-17 10:38:20 +02:00
Jingxiang Zhang	9c878d2f64	Fix incorrect width ratio calculation in Llama4 image processor (#38842 )	2025-06-17 07:33:36 +00:00
Raushan Turganbay	bf370e446b	[video processor] fix BC when no video config if found (#38840 ) fix auto video processor	2025-06-17 09:20:16 +02:00
Dhruv	e61160c5db	Remove merge conflict artifacts in Albert model doc (#38849 ) Some checks are pending Self-hosted runner (benchmark) / Benchmark (aws-g5-4xlarge-cache) (push) Waiting to run Details Build documentation / build (push) Waiting to run Details New model PR merged notification / Notify new model (push) Waiting to run Details Slow tests on important models (on Push - A10) / Get all modified files (push) Waiting to run Details Slow tests on important models (on Push - A10) / Slow & FA2 tests (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Check if setup was changed (push) Waiting to run Details Self-hosted runner (push-caller) / build-docker-containers (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Trigger Push CI (push) Blocked by required conditions Details Secret Leaks / trufflehog (push) Waiting to run Details Update Transformers metadata / build_and_package (push) Waiting to run Details	2025-06-16 14:21:18 -07:00
Vanshu	64e9b049d9	Updated aya_vision.md (#38749 ) * Update aya_vision.md * Suggested changes made to aya_vision.md * Quantization Example added - aya_vision.md * Polished - aya_vision.md * Update aya_vision.md --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-06-16 10:46:30 -07:00
Shawn Tan	5ab0f447ab	GraniteMoeHybrid: Allow for only shared expert case. (#38801 ) * Allow for only shared expert case. * Style	2025-06-16 16:15:42 +01:00
Yusuf Shihata	a7593a1d1f	[BugFix] QA pipeline edge case: `align_to_words=True` in `QuestionAnsweringPipeline` can lead to duplicate answers (#38761 ) * fixing the problem align_to_words=True leading to duplicate solutions * adding tests * some fixes * some fixes * changing the handle_duplicate_answers=False by default * some fixese * some fixes * make the duplicate handling the default behaviour and merge duplicates * make the duplicate handling the default behaviour	2025-06-16 15:01:22 +00:00
Drew Ross	18c7f32daa	Fix broken tag in Longformer model card (#38828 )	2025-06-16 07:44:40 -07:00
VolodymyrBg	b44b04ee9a	Fix broken notebooks link in Italian training docs (#38834 ) Some checks are pending Self-hosted runner (benchmark) / Benchmark (aws-g5-4xlarge-cache) (push) Waiting to run Details Build documentation / build (push) Waiting to run Details New model PR merged notification / Notify new model (push) Waiting to run Details Slow tests on important models (on Push - A10) / Get all modified files (push) Waiting to run Details Slow tests on important models (on Push - A10) / Slow & FA2 tests (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Check if setup was changed (push) Waiting to run Details Self-hosted runner (push-caller) / build-docker-containers (push) Blocked by required conditions Details Self-hosted runner (push-caller) / Trigger Push CI (push) Blocked by required conditions Details Secret Leaks / trufflehog (push) Waiting to run Details Update Transformers metadata / build_and_package (push) Waiting to run Details	2025-06-16 07:38:51 -07:00
Cyril Vallez	9300728665	Fix peft integration (#38841 ) Update peft.py	2025-06-16 10:39:25 +02:00
Cyril Vallez	608884960e	add default mapping to peft integration	2025-06-16 10:23:51 +02:00
Manuel Faysse	ce6ac53ac1	bugfix: propage weight key_mapping to peft to fix 3.52 VLM renaming (#38627 ) * propage key mapping to peft * propage key mapping to peft * make requested changes * revert	2025-06-16 10:10:23 +02:00
Yaswanth Gali	925da8ac56	Fix redundant code in Janus (#38826 ) * minor mistake * modify return statements	2025-06-16 06:53:59 +00:00
Raushan Turganbay	d2fd3868bb	[internvl] fix video inference (#38811 ) fix	2025-06-16 08:37:30 +02:00

1 2 3 4 5 ...

19357 Commits