transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Garrett Goon	390f153469	Add padding-free to bamba (#35861 ) * add seq_idx and fa kwargs * update tests * docs and grad ckpt support * fmt * better names * test_raise_missing_padding_free_kwarg_errs * + seq_idx in doc strings * padding free training docs * add link to pr plots * raise err on attn_mask with padding free * rm raising missing padding free err test * BambaFlashAttentionKwargs * run modular util for modular_granitemoehybrid.py	2025-05-20 17:13:59 +02:00
ivarflakstad	3f0b7d0fac	Mamba2 remove unnecessary test parameterization (#38227 )	2025-05-20 13:54:04 +00:00
Pablo Montalvo	9cde2f5d42	Minor llama4 fixes (#38123 ) * fix wrong scaling value/default Cache init * style * fix various issues on integration tests * change expected outputs * fixup * fix config access * protect default scaling	2025-05-20 13:15:54 +00:00
ivarflakstad	de70c8426e	Disable torchscript tests for AriaForConditionalGenerationModelTest (#38225 ) Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-05-20 14:37:55 +02:00
Manuel de Prada Corral	d34e21e7dd	New cache tests and refactored Hybrid Cache (#37972 )	2025-05-20 12:46:13 +02:00
Titus	f022bf9322	Remove trust_remote_code=True tests from bnb quantization tests (MPT now integrated) (#38206 ) bnb quant tests: remove obsolete trust_remote_code test The MPT model is now natively integrated in Transformers and no longer requires trust_remote_code=True. This removes the failing test_get_keys_to_not_convert_trust_remote_code and related usage, which depended on remote code and caused CI issues due to missing dependencies (e.g., triton_pre_mlir).	2025-05-20 11:43:11 +02:00
Raushan Turganbay	0a52bd2403	[fix] sliding window attention mask (#38045 ) * fix sliding attn * make style * Update tests/test_modeling_common.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * on second thought, should default to `True` for BC --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2025-05-20 09:32:19 +00:00
Yao Matrix	3bd1c20149	enable misc cases on XPU & use device agnostic APIs for cases in tests (#38192 ) * use device agnostic APIs in tests Signed-off-by: Matrix Yao <matrix.yao@intel.com> * more Signed-off-by: Matrix Yao <matrix.yao@intel.com> * fix style Signed-off-by: Matrix Yao <matrix.yao@intel.com> * add reset_peak_memory_stats API Signed-off-by: YAO Matrix <matrix.yao@intel.com> * update --------- Signed-off-by: Matrix Yao <matrix.yao@intel.com> Signed-off-by: YAO Matrix <matrix.yao@intel.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-05-20 10:09:01 +02:00
Matej Sirovatka	46a4b7c909	Feat: save_pretrained for tensor parallel (and other parallelisms) models (#37919 ) * tmp: initial save pretrained with dtensors * Feat: add correctness tests * Refactor: version checks * Temp: 1:1 checkpoint llama4 * refactor * Tests * Feat: works * Style * Feat: version checks + minor fixes * Style * Fix: version checks in tests * Feat: move more stuff into tensor_parallel.py	2025-05-19 18:16:21 +00:00
Joao Gante	9c500015c5	🚨🚨🚨 [pipelines] update defaults in pipelines that can `generate` (#38129 ) * pipeline generation defaults * add max_new_tokens=20 in test pipelines * pop all kwargs that are used to parameterize generation config * add class attr that tell us whether a pipeline calls generate * tmp commit * pt text gen pipeline tests passing * remove failing tf tests * fix text gen pipeline mixin test corner case * update text_to_audio pipeline tests * trigger tests * a few more tests * skips * some more audio tests * not slow * broken * lower severity of generation mode errors * fix all asr pipeline tests * nit * skip * image to text pipeline tests * text2test pipeline * last pipelines * fix flaky * PR comments * handle generate attrs more carefully in models that cant generate * same as above	2025-05-19 18:02:06 +01:00
NielsRogge	7c9b0ca08c	[SAM-HQ] Update names in the docs (#38058 ) Update names	2025-05-19 09:21:14 -07:00
Shane A	aef12349b6	Make HF implementation match original OLMo 2 models for lower precisions (#38131 ) * Make HF implementation match OLMo models for lower precisions * Add test of 1B logits in bfloat16 * Run make fixup	2025-05-19 15:35:23 +02:00
Lysandre Debut	003deb16f1	Support for transformers explicit filename (#38152 ) * Support for transformers explicit filename * Tests * Rerun tests	2025-05-19 14:33:47 +02:00
Joao Gante	dbb9813dff	[generation] Less verbose warnings by default (#38179 ) * tmp commit (imports broken) * working version; update tests * remove line break * shorter msg * dola checks need num_beams=1; other minor PR comments * update early trainer failing on bad gen config * make fixup * test msg	2025-05-19 10:03:37 +00:00
Joao Gante	40a493c7ed	[tests] remove `test_sdpa_equivalence` (redundant) (#37911 ) * rm test_sdpa_equivalence * make fixup --------- Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-05-16 18:37:27 +01:00
kang sheng	ea29f61ed9	fix bug in distributed loss test (#38166 ) * fix bug in distributed loss test and change some config to pass at both 2&8 gpus * fix doc	2025-05-16 16:21:35 +00:00
Yoni Gozlan	0ba95564b7	Add args support for fast image processors (#37018 ) * add args support to fast image processors * add comment for clarity * fix-copies * Handle child class args passed as both args or kwargs in call and preprocess functions * revert support args passed as kwargs in overwritten preprocess * fix image processor errors	2025-05-16 12:01:46 -04:00
Peter St. John	d69945e5fc	[ESM] Add flash-attention-2 backend for ESM-2 (#38023 ) * Add flash-attention-2 backend for ESM-2 Signed-off-by: Peter St. John <pstjohn@nvidia.com> * update extended_attention_mask for fa2 Signed-off-by: Peter St. John <pstjohn@nvidia.com> * add test_flash_attn_2_equivalence test Signed-off-by: Peter St. John <pstjohn@nvidia.com> --------- Signed-off-by: Peter St. John <pstjohn@nvidia.com>	2025-05-16 14:11:56 +01:00
Yao Matrix	7f28da2850	clean autoawq cases on xpu (#38163 ) * clean autoawq cases on xpu Signed-off-by: Matrix Yao <matrix.yao@intel.com> * fix style Signed-off-by: Matrix Yao <matrix.yao@intel.com> --------- Signed-off-by: Matrix Yao <matrix.yao@intel.com>	2025-05-16 13:56:43 +02:00
Raushan Turganbay	01ad9f4b49	Bart: new cache format (#35314 ) * bart compile * add mbart * some more models touched by fix-copies * more * more models * even more models * fix copies * fix tests * fix copies * fix * biogpt accepts position ids now (breaking?) * fix failing non-slow tests * fix some tests * should not be removed * small update * Update src/transformers/models/bart/modeling_bart.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * update for last `main` * fix copies * clone `update_causal_mask` from llama * tmp * fixup * why? how? * fix bart tests * dont skip test * address comments * fix tests * fix * fixup and delete the file --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2025-05-16 13:26:54 +02:00
Jerry Zhang	44fa04ae8d	Include output embedding as well with `include_embedding` flag (#37935 ) * Include output embedding as well with `include_embedding` flag Summary: att Test Plan: python tests/quantization/torchao_integration/test_torchao.py -k test_include_embedding Reviewers: Subscribers: Tasks: Tags: * format * rename include_embedding to include_input_output_embeddings --------- Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>	2025-05-16 12:06:11 +02:00
Yao Matrix	34c1e29cdd	enable autoround cases on XPU (#38167 ) * enable autoround cases on XPU Signed-off-by: Matrix Yao <matrix.yao@intel.com> * fix style Signed-off-by: Matrix Yao <matrix.yao@intel.com> --------- Signed-off-by: Matrix Yao <matrix.yao@intel.com>	2025-05-16 09:08:35 +00:00
Yao Matrix	7caa57e85e	enable trainer test cases on xpu (#38138 ) * enable trainer test cases on xpu Signed-off-by: Matrix Yao <matrix.yao@intel.com> * fix style Signed-off-by: Matrix Yao <matrix.yao@intel.com> --------- Signed-off-by: Matrix Yao <matrix.yao@intel.com>	2025-05-15 12:17:44 +00:00
Joao Gante	0e0e5c1044	[generate] Run custom generation code from the Hub (#36405 ) * mvp * remove trust_remote_code * generate_from_hub * handle requirements; docs * english * doc PR suggestions * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * changed remote code path to generate/generate.py * model repo has custom generate -> override base generate * check for proper inheritance * some doc updates (missing: tag-related docs) * update docs to model repo * nit * nit * nits * Update src/transformers/dynamic_module_utils.py * Apply suggestions from code review * Update docs/source/en/generation_strategies.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * trust remote code is required * use new import utils for requirements version parsing * use org examples * add tests * Apply suggestions from code review Co-authored-by: Manuel de Prada Corral <6536835+manueldeprada@users.noreply.github.com> * ascii file structure; tag instructions on readme.md --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Pedro Cuenca <pedro@huggingface.co> Co-authored-by: Manuel de Prada Corral <6536835+manueldeprada@users.noreply.github.com>	2025-05-15 10:35:54 +01:00
Raushan Turganbay	955e61b0da	Remove head mask in generative models (#35786 ) * just squash into one commit * delete print	2025-05-15 10:44:19 +02:00
Yao Matrix	0173a99e73	enable csm integration cases on xpu, all passed (#38140 ) * enable csm test cases on XPU, all passed Signed-off-by: Matrix Yao <matrix.yao@intel.com> * fix style Signed-off-by: Matrix Yao <matrix.yao@intel.com> --------- Signed-off-by: Matrix Yao <matrix.yao@intel.com>	2025-05-15 09:46:29 +02:00
Kirire	935bbbc711	Add config validation and style tweaks (#37589 ) * Add config validation and style tweaks * Fix style issues * Fix style issues * style * Small fixes for copy/paste errors --------- Co-authored-by: Cyrile <cyrile.delestre@arkea.com>	2025-05-14 12:22:10 +00:00
ivarflakstad	1b00966395	Fix auto batch size finder test (#38125 ) Ensure --auto_find_batch_size is the last test arg so indexing is correct	2025-05-14 12:12:04 +00:00
Ritwick Chaudhry	fe918d13b9	Fix temporal padding in Qwen2VLImageProcessor when the number of frames is not divisible by temporal_patch_size (#38076 ) Qwen2VL: Fix temporal padding in Qwen2VLImageProcessor when frames are not divisible by temporal_patch_size	2025-05-14 12:28:21 +02:00
Raushan Turganbay	aaf224d570	[video processor] fix tests (#38104 ) * fix tests * delete * fix one more test * fix qwen + some tests are failing irrespective of `VideoProcessor` * delete file	2025-05-14 10:24:07 +00:00
Yao Matrix	9b5ce556aa	enable finegrained_fp8 and granite_speech cases on XPU (#38036 ) * enable finegrained_fp8 cases on XPU Signed-off-by: Yao Matrix <matrix.yao@intel.com> * fix style Signed-off-by: Yao Matrix <matrix.yao@intel.com> * change back to auto Signed-off-by: Yao Matrix <matrix.yao@intel.com> * rename per comments Signed-off-by: Matrix Yao <matrix.yao@intel.com> --------- Signed-off-by: Yao Matrix <matrix.yao@intel.com> Signed-off-by: Matrix Yao <matrix.yao@intel.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-05-14 08:58:40 +00:00
eustlb	e0f225cb10	[CSM] update test for t4 runners (#38110 ) update test for t4 runners	2025-05-13 11:59:26 -04:00
Jinyong Lee	342961f669	Add Fast Image Processor for vilt (#37304 ) * init vilt image processor fast * Refactor image processor tests to use loop for all processors * Add ViltImageProcessorFast with PyTorch-based optimized image processing * Change made automatically by make fixup command * Change made automatically by make fix-copies command * Fix type hints in ViltImageProcessorFast for Python compatibility * Define constants for image resizing based on COCO dataset aspect ratio * Add missing property initializations to ViltImageProcessorFast * Extract resize logic into dedicated method in ViltImageProcessorFast * Extract padding logic into dedicated method * Implement shape-based image grouping for optimized processing in Vilt * Update test suite to verify ViltImageProcessorFast attributes * Move variable declarations to _preprocess method parameters * Remove unused parameters * Rename _resize method to resize to override existing function * Remove whitespace * Remove unnecessary type check and conversion for stacked_images * Remove redundant loop and apply padding directly to stacked images * Refactor pad function to return images and mask as tuple instead of dict * Add tests comparing padding masks in slow and fast implementations * Update ViltImageProcessor tests to ensure compatibility between slow and fast implementations * Replace add_start_docstrings with auto_docstring in ViltImageProcessorFast * Move docstrings of custom args to ViltFastImageProcessorKwargs * Use reorder_images function for both masks and images --------- Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>	2025-05-13 15:40:53 +00:00
youngrok cha	a5cc7a67d7	[bug] fix llava processor to calculate unpadding size correctly (#37988 ) * fix llava processor to calculate unpad size correctly * repo consistency * Revert "repo consistency" & "setUp in llava family" This reverts commit `26a50af8db`. * add edge case test for padding & unpadding * compute unpadding size from original size * make test config explicit * Revert "compute unpadding size from original size" This reverts commit `752cd27ad9`. * Revert "add edge case test for padding & unpadding" This reverts commit `ccbd094d69`. * revert unpad logic * remove irrelevant tests * model test * remove processor from model test --------- Co-authored-by: jaycha <jaycha@ncsoft.com>	2025-05-13 13:49:09 +00:00
Raushan Turganbay	e40f301f1f	[smolvlm] skip the test (#38099 ) skip the test	2025-05-13 12:50:43 +00:00
ivarflakstad	e27d230ddd	Disable report callbacks for certain training tests (#38088 ) * Disable report callbacks for certain training tests * Disable report callbacks for test_auto_batch_size_finder	2025-05-13 14:49:55 +02:00
Yih-Dar	3ad35d0bca	update `require_read_token` (#38093 ) * update require_read_token * new repo * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-05-13 12:07:07 +02:00
Yoni Gozlan	e3b70b0d1c	Refactor image processor phi4 (#36976 ) * refactor image processor phi4 * nits fast image proc * add image tests phi4 * Fix image processing tests * update integration tests * remove revision and add comment in integration tests	2025-05-12 15:13:40 -04:00
Shiyu	a63cb7578e	update seed_worker to set seed based on worker_id and rank (#37980 ) * update seed_worker to set seed based on worker_id and rank * test case * set output_dir as remove tmp dir	2025-05-12 15:59:16 +00:00
efsotr	e387821a96	Fix tot update in trainer (#37923 ) * fix total updates in epoch * add test; fix max_steps * replace with multi-gpu decorator	2025-05-12 17:45:24 +02:00
ivarflakstad	8ea72d12a2	Fix mt5 test on AMD devices (#38081 )	2025-05-12 16:59:00 +02:00
ivarflakstad	7eaa90b87b	Add AMD expectation to test_gpt2_sample (#38079 )	2025-05-12 16:51:21 +02:00
Pavel Iakubovskii	4220039b29	Fix OneFormer integration test (#38016 ) * Fix integration tests * format	2025-05-12 16:02:41 +02:00
Raushan Turganbay	a5c6172c81	[VLM] fix loading issues (#38051 ) * fix qwen2-vl loading * fix a few more models * delete print * fix copies	2025-05-12 10:14:04 +00:00
Raushan Turganbay	a31fa218ad	🔴 Video processors as a separate class (#35206 ) * initial design * update all video processors * add tests * need to add qwen2-vl (not tested yet) * add qwen2-vl in auto map * fix copies * isort * resolve conflicts kinda * nit: * qwen2-vl is happy now * qwen2-5 happy * other models are happy * fix copies * fix tests * add docs * CI green now? * add more tests * even more changes + tests * doc builder fail * nit * Update src/transformers/models/auto/processing_auto.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * small update * imports correctly * dump, otherwise this is getting unmanageable T-T * dump * update * another update * update * tests * move * modular * docs * test * another update * init * remove flakiness in tests * fixup * clean up and remove commented lines * docs * skip this one! * last fix after rebasing * run fixup * delete slow files * remove unnecessary tests + clean up a bit * small fixes * fix tests * more updates * docs * fix tests * update * style * fix qwen2-5-vl * fixup * fixup * unflatten batch when preparing * dump, come back soon * add docs and fix some tests * how to guard this with new dummies? * chat templates in qwen * address some comments * remove `Fast` suffix * fixup * oops should be imported from transforms * typo in requires dummies * new model added with video support * fixup once more * last fixup I hope * revert image processor name + comments * oh, this is why fetch test is failing * fix tests * fix more tests * fixup * add new models: internvl, smolvlm * update docs * import once * fix failing tests * do we need to guard it here again, why? * new model was added, update it * remove testcase from tester * fix tests * make style * not related CI fail, let's just fix here * mark flaky for now, fails 15 out of 100 * style * maybe we can do this way? * don't download images in setup class --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2025-05-12 11:55:51 +02:00
Arjuna Sky Kok	716819b830	fix(conversion): Fix size mismatch error during TF->PT model loading (#38014 )	2025-05-10 11:11:07 +00:00
Yao Matrix	8f08318769	enable generation fsdp/utils cases on XPU (#38009 ) * enable generation fsdp/utils test cases on XPU Signed-off-by: Yao Matrix <matrix.yao@intel.com> * fix style Signed-off-by: Yao Matrix <matrix.yao@intel.com> * xx Signed-off-by: Yao Matrix <matrix.yao@intel.com> * use backend_xx APIs Signed-off-by: Yao Matrix <matrix.yao@intel.com> * fix style Signed-off-by: Yao Matrix <matrix.yao@intel.com> --------- Signed-off-by: Yao Matrix <matrix.yao@intel.com>	2025-05-09 20:52:41 +00:00
Lysandre Debut	23d79cea75	Support for version spec in requires & arbitrary mismatching depths across folders (#37854 ) * Support for version spec in requires & arbitrary mismatching depths * Quality * Testing	2025-05-09 15:26:27 +02:00
Yao Matrix	a72cb31434	enable utils test cases on XPU (#38005 ) * enable utils test cases on XPU Signed-off-by: Yao Matrix <matrix.yao@intel.com> * fix style Signed-off-by: Yao Matrix <matrix.yao@intel.com> * Update tests/utils/test_skip_decorators.py Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com> * fix comment Signed-off-by: Yao Matrix <matrix.yao@intel.com> --------- Signed-off-by: Yao Matrix <matrix.yao@intel.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>	2025-05-09 08:45:01 +02:00
Yao Matrix	1dfad4beb2	make mistral3 pass on xpu (#37882 ) * enabled mistral3 test cases on XPU Signed-off-by: Yao Matrix <matrix.yao@intel.com> * calibrate A100 expectation Signed-off-by: YAO Matrix <matrix.yao@intel.com> * update * update * update * update * update * update --------- Signed-off-by: Yao Matrix <matrix.yao@intel.com> Signed-off-by: YAO Matrix <matrix.yao@intel.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-05-09 06:41:11 +00:00


4913 Commits