transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-04 13:20:12 +06:00

Author	SHA1	Message	Date
eustlb	798f948e88	Add CSM model (#36719 ) * draft structure * depth decoder with forward pre hook * full model forward draft * draft update * depth decoder update * ConversationalSpeechModelForCausalLM udpates * add generate * max length criteria small fix * udpate * updates * generation update * update in loss compute * conversion script * update for correct input embeddings * handle interleaved rope * update * update * update * support compile * update training * add doc * update doc * correct inits * ConversationalSpeechModel -> Csm * conf update * name update * tests CsmForCausalLMTest * convert use cached_file * conf + modeling updates * generate utils handle third dim shape * integration test * modeling + conf updates * common test handle more than 2 dims * add nested audio list utils * processing handle nested audio list * csm processing draft * mimi util * init updates * modular update * convert modular * processing update * csm tests update * generate tests handle third dim * generate utils handle third dim * propagate _get_initial_cache_position update * tied_weight_keys update + convert correctly * fix inputs_embeds * revert audio nested list * batch inference update + return audio * audio_utils update * processor update * some more integration tests * remove old test * porcessing output labels * improve * fix * update rope values with equivalent ones * conversion update * udpate tests * handle depth decoder generation config * remove default eos_token_id * make style * revert modeling_mimi * add default generation_config * remove sdpa since handled by default * make * fix conflict * fix conflicts * correct naming * correct imports * make * causal -> conditional naming * causal -> conditional naming * auto update * make * make * add doc * test update * fix weight init * audio tokens offsets as buffer * 4d mask in conditional class * make * doc update * fix causal mask * fix causal mask * doc update * doc update * add processor doc * update doc * fix 4d 
causal mask * update make_list_of_audio * do not default to mutable * remove duplicates * remove useless reset_parameters * use GradientCheckpointingLayer * use can_return_tuple * formatting * prepend placeholder in _sample * torch compile fix * some more fixies * convert modular * fix * default max_length in convert * handle depth decoder generation config correctly * clearer formulation * handle output_loading_info * handle softmax warning * add doc * propagate _get_initial_cache_position changes * generation in its own module * add processor tests * fix compile witu cuda graphs * fix compile with cuda graphs * add csm.md * include CSM loss * doc nit * doc nit * doc nit * Update docs/source/en/model_doc/csm.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * add save_audio to processor * Update src/transformers/models/csm/modular_csm.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * doc update * simplify audio_codes_mask computation * doc update * simplify loss computation * fix static cache test * fix * remove comment * simplify encoded length computation * use hf-internal-testing * doc update * cast to float before numpy * nit * mem efficient codebook head * nit * cat input values with cutoffs --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2025-05-07 10:20:13 -04:00
Yao Matrix	34f26e2c3e	enable internvl UTs on XPU (#37779 ) * enable internvl UTs on XPU Signed-off-by: YAO Matrix <matrix.yao@intel.com> * fix style Signed-off-by: YAO Matrix <matrix.yao@intel.com> * fix style per comments Signed-off-by: Yao Matrix <matrix.yao@intel.com> --------- Signed-off-by: YAO Matrix <matrix.yao@intel.com> Signed-off-by: Yao Matrix <matrix.yao@intel.com>	2025-04-30 10:29:40 +02:00
co63oc	d5fa7d2d19	Fix typos in strings and comments (#37799 )	2025-04-28 11:39:11 +01:00
Cyril Vallez	58e5e976e0	Small fix on context manager detection (#37562 ) * small fixes * Update modeling_utils.py * test * Update test_modeling_common.py * Update test_modeling_timm_backbone.py * more general * simpler	2025-04-17 15:39:44 +02:00
Cyril Vallez	688f4707bf	All models can be initialized on meta device (#37563 ) * Update test_modeling_common.py * fix all * more fixes	2025-04-16 23:26:44 +02:00
Yih-Dar	5a6de703a7	Run `test_can_load_with_global_device_set` using a subprocess (#37553 ) * fix * fix * fix * Update tests/test_modeling_common.py Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co> * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>	2025-04-16 19:48:30 +02:00
Garrett Goon	503541d7ef	add FlashAttentionKwargs and seq_idx to flat collator (#36456 ) * add flash attn kwargs to flattening collator * add return_seq_idx option * doc string edits * cleaner max len updates * various fixes * temp testing code * return int32 seq_idx and FlashAttnKwargs * DataCollatorIntegrationTest impl * fix batch dims and dtypes * fill out remaining collator tests * test name change and fmt * rm unused var * fmt * minor change * fmt * add missing pos_ids check * consistent {np,pt,tf} tests * split pt tests into 3, like np/tf tests * mv comment, rename fa test * remove batch dim comment * simply wrapping * compute cu_seq_len/max_length once * fmt * remove tf code * rm warning * move separator_id back to 2nd pos * use cleaner lists in tests * ret -> batch * fmt * attr ordering * use py ints for max_length_{k,q}	2025-04-16 15:45:03 +02:00
Yao Matrix	33f6c5a5c8	enable several cases on XPU (#37516 ) * enable several cases on XPU Signed-off-by: YAO Matrix <matrix.yao@intel.com> * Update tests/test_modeling_common.py Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * fix style Signed-off-by: YAO Matrix <matrix.yao@intel.com> --------- Signed-off-by: YAO Matrix <matrix.yao@intel.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-04-16 11:01:04 +02:00
Cyril Vallez	c8e0e603de	Detect and use device context manager or global device in `from_pretrained` (#37216 ) * Update modeling_utils.py * improve * Update modeling_utils.py * Update test_modeling_common.py * Update test_modeling_timm_backbone.py * Update test_modeling_common.py * Update test_modeling_common.py * Update test_modeling_common.py * Update test_modeling_common.py * CIs	2025-04-15 09:59:20 +02:00
Cyril Vallez	4e53840920	Detect and fix most `_init_weights()` issues - make it work for composite models (#37070 ) * Update test_modeling_common.py * Fix Llama and its modular children * Update test_modeling_common.py * qwen3 * first try at prioritizing models * Update test_modeling_common.py * Update test_modeling_common.py * Update test_modeling_common.py * test * fix * fix * more models * more * more * more * smarter init for composite models! * fix post rebase * smol * fix missing args * more * typo * Super elegant and efficient init for submodels * Update modeling_utils.py * style * last fixes * cleanup * finalize cleanup * CIs * improve docstring * Update modeling_utils.py * llama4 * style * CIs * style * add dpt * granite speech * qwen 2.5 omni * better fix * Parse the config file instead * CIs	2025-04-14 16:19:04 +02:00
Lysandre Debut	54a123f068	Simplify soft dependencies and update the dummy-creation process (#36827 ) * Reverse dependency map shouldn't be created when test_all is set * [test_all] Remove dummies * Modular fixes * Update utils/check_repo.py Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> * [test_all] Better docs * [test_all] Update src/transformers/commands/chat.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * [test_all] Remove deprecated AdaptiveEmbeddings from the tests * [test_all] Doc builder * [test_all] is_dummy * [test_all] Import utils * [test_all] Doc building should not require all deps --------- Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2025-04-11 11:08:36 +02:00
cyyever	371c44d0ef	Remove old code for PyTorch, Accelerator and tokenizers (#37234 ) * Remove unneeded library version checks Signed-off-by: cyy <cyyever@outlook.com> * Remove PyTorch condition Signed-off-by: cyy <cyyever@outlook.com> * Remove PyTorch condition Signed-off-by: cyy <cyyever@outlook.com> * Fix ROCm get_device_capability Signed-off-by: cyy <cyyever@outlook.com> * Revert "Fix ROCm get_device_capability" This reverts commit `0e756434bd`. * Remove unnecessary check Signed-off-by: cyy <cyyever@outlook.com> * Revert changes Signed-off-by: cyy <cyyever@outlook.com> --------- Signed-off-by: cyy <cyyever@outlook.com>	2025-04-10 20:54:21 +02:00
ivarflakstad	aa478567f8	Allow rocm systems to run these tests (#37278 ) * Allow rocm systems to run these tests * Fix skipTest logic * Use get_device_properties to check system capabilities	2025-04-10 13:33:01 +02:00
cyyever	1e6b546ea6	Use Python 3.9 syntax in tests (#37343 ) Signed-off-by: cyy <cyyever@outlook.com>	2025-04-08 14:12:08 +02:00
Yao Matrix	f697b3f824	enable 2 types of case on XPU (#37198 ) enable 2 types of case on XPU 1. test_resize_tokens_embeddings_with_deepspeed_multi_gpu 2. test_resize_embeddings_untied_with_deepspeed_multi_gpu Signed-off-by: YAO Matrix <matrix.yao@intel.com>	2025-04-03 11:37:55 +02:00
Cyril Vallez	6ce238fe7a	Fix test (#37213 ) * Update test_modeling_common.py * style	2025-04-03 10:24:34 +02:00
Pavel Iakubovskii	3249c5dc15	Refactor attention for SigLIP based models (#36981 ) * Update Siglip attention implementation * Update tests for Siglip * Remove one level of indentation * Update test to be more specific * Fixup * Idefics2 * Idefics3 * Emu3 * SmolVLM * Phi4 (just init small update) * Idefics2 (test fix) * Update siglip2 tests * Update eager * trigger * Clean up * Transfer inputs to device in test * Fixing test * Fixing test * Revert contiguous * Remove unused is_flash_attn_2_available * Move flaky to specific models	2025-04-01 15:37:25 +02:00
cyyever	786d9c5ed9	Fix more inefficient PT operations (#37060 ) * Fix inefficient operations * Remove cpu() call * Reorder detach() * Reorder detach() * tolist without detach * item without detach * Update src/transformers/models/rag/modeling_rag.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update tests/models/encodec/test_modeling_encodec.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Use detach().cpu().numpy * Revert some numpy operations * More fixes --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2025-03-31 16:31:24 +01:00
Cyril Vallez	f304318f5f	Remove low_cpu_mem_usage and _fast_init (#36963 ) * Remove low_cpu_mem_usage and _fast_init * Update deepspeed.py * Update modeling_utils.py * remove the first 2 tests everywhere * Update test_modeling_common.py * remove what was remaining about fast_init * fix logic and simplify * mismatched keys logic update * Update modeling_utils.py * Update modeling_utils.py * Update modeling_utils.py * Update modeling_utils.py * fix 2 models init_weights * extend to others * remove grad * Update modeling_fsmt.py * init weights in tests * style * Update test_modeling_fsmt.py * more old models * fix more init_weights * copies * fix * style * Update modeling_lxmert.py * fix inits * more and more * more * should finalize * style * Update modeling_dinov2_with_registers.py * fix * Update modeling_encoder_decoder.py * fix * style * Update modeling_lxmert.py * post rebase cleanup * Update modeling_informer.py * back to start for device * fix * add test to detect all failing cases correctly * Update test_modeling_common.py * fix * fix * sam * style * Update modeling_maskformer_swin.py * CIs * CIs * remove test - will add it on separate PR * fix * fix * Update modeling_sam.py * CIs * CIs * CIs * convnext * suggestions * CIs * fix copies after merge --------- Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-03-31 17:18:43 +02:00
Abu Bakr Soliman	49b5ab6a27	Support QuestionAnswering Module for ModernBert based models. (#35566 ) * push ModernBertForQuestionAnswering * update ModernBertForQuestionAnswering * update __init__ loading * set imports for ModernBertForQuestionAnswering * update ModernBertForQuestionAnswering * remove debugging logs * update init_weights method * remove custom initialization for ModernBertForQuestionAnswering * apply make fix-copies * apply make style * apply make fix-copies * append ModernBertForQuestionAnswering to the pipeline supported models * remove unused file * remove invalid autoload value * update en/model_doc/modernbert.md * apply make fixup command * make fixup * Update dummies * update usage tips for ModernBertForQuestionAnswering * update usage tips for ModernBertForQuestionAnswering * add init * add lint * add consistency * update init test * change text to trigger stuck text * use self.loss_function instead of custom loss By @Cyrilvallez Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com> * Update modeling_modernbert.py make comparable commit to even it out * Match whitespace * whitespace --------- Co-authored-by: Matt <rocketknight1@gmail.com> Co-authored-by: Orion Weller <wellerorion@gmail.com> Co-authored-by: Orion Weller <31665361+orionw@users.noreply.github.com> Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>	2025-03-26 21:24:18 +01:00
Yih-Dar	c6814b4ee8	Update ruff to `0.11.2` (#36962 ) * update * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-03-25 16:00:11 +01:00
Pavel Iakubovskii	66291778dd	Refactor Attention implementation for ViT-based models (#36545 ) * Refactor vit attention * Refactor ViT-based models * 🚨🚨🚨 Fix prefix for DPT * Update params order * trigger tests * Fix Dinov2 attention * Fix DPT attention impl propagation for backbone config * Common test fix: config is modif. inplace - avoid it * view->reshape * Fixup * Fixup * Enable IJepa FA2 * Add FA2 in corresponding model docs	2025-03-20 15:15:01 +00:00
Joao Gante	cff4caa0c1	[CI] remove redundant checks in `test_eager_matches_sdpa_inference` (#36740 )	2025-03-17 16:29:18 +00:00
Joao Gante	42ebb6c23e	[tests] Parameterized `test_eager_matches_sdpa_inference` (#36650 )	2025-03-14 14:41:27 +00:00
Cyril Vallez	071a161d3e	[core] Large/full refactor of `from_pretrained` (#36033 ) * squash everything together start to simplify inner logic Update modeling_utils.py Update modeling_utils.py Update modeling_utils.py Update modeling_utils.py continue refactor fix small fixes add type hints/docstring Update modeling_utils.py remove _fast_init keep improving Update modeling_utils.py Update modeling_utils.py new first tp loading version style fix weird in-place op trigger CIs Update modeling_utils.py much clearer renaming of keys fix update Update test_modeling_common.py trigger CIs update update style Update modeling_utils.py Update modeling_utils.py Update modeling_utils.py fix fast download first prototype remove old function remove old functions Remove unused function and move back _get_tp_registry fix tp plan registry simplify CIs Update hub.py Update modeling_utils.py simplify simplify renaming logic remove unused check add sanity check back (a test depends on it) Update modeling_utils.py finalize sound renaming logic style add forgotten check Update modeling_utils.py add key_mapping keyword style Update modeling_utils.py add comment minor updates minor change for clarity fix small prefix issue and simplify style trigger CIs typo fix Post rebase fix post rebase cleanup simplify tp typo oupsi typo correctly escape improvements based on Marc's review finalize Marc's review comments squash everything * improve * Update modeling_utils.py * Update modeling_utils.py * fix * Update modeling_utils.py * Update modeling_utils.py * style * Update modeling_utils.py * simplify * style * Update modeling_utils.py * Update modeling_utils.py * Update modeling_utils.py * Update modeling_utils.py * Update modeling_utils.py * Update modeling_utils.py * fix dtype issue * Update modeling_utils.py * style * remove test that does not make sense * style * small fixes * style * fix * cleanup after rebase * style * typo * escape * tp for task specific top modules * Update modeling_utils.py 
* Update modeling_utils.py * fix allocation * CIs * CIs * CIs * improve docstring * CIs * Update modeling_utils.py * fix	2025-03-12 13:39:25 +01:00
Ilyas Moutawwakil	89f6956015	HPU support (#36424 ) * test * fix * fix * skip some and run some first * test fsdp * fix * patches for generate * test distributed * copy * don't test distributed loss for hpu * require fp16 and run first * changes from marc's PR fixing zero3 * better alternative * return True when fp16 support on gaudi without creating bridge * fix * fix tested dtype in deepspeed inference test * test * fix * test * fix * skip * require fp16 * run first fsdp * Apply suggestions from code review * address comments * address comments and refactor test * reduce precison * avoid doing gaudi1 specific stuff in the genreation loop * document test_gradient_accumulation_loss_alignment_with_model_loss test a bit more	2025-03-12 09:08:12 +01:00
co63oc	996f512d52	Fix typos in tests (#36547 ) Signed-off-by: co63oc <co63oc@users.noreply.github.com>	2025-03-05 15:04:06 -08:00
Yih-Dar	482d17be60	Fix `hub_retry` (#36449 ) * cry * trigger --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-02-27 14:38:25 +01:00
Zach Mueller	41925e4213	Add retry hf hub decorator (#35213 ) * Add retry torch decorator * New approach * Empty commit * Empty commit * Style * Use logger.error * Add a test * Update src/transformers/testing_utils.py Co-authored-by: Lucain <lucainp@gmail.com> * Fix err * Update tests/utils/test_modeling_utils.py --------- Co-authored-by: Lucain <lucainp@gmail.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-02-25 20:53:11 +01:00
Joao Gante	678885bbbd	[CI] Check test if the `GenerationTesterMixin` inheritance is correct 🐛 🔫 (#36180 )	2025-02-21 10:18:20 +00:00
Pavel Iakubovskii	a957b7911a	Add SigLIP 2 (#36323 ) * Docs * Inits * Auto classes * Add siglip base * Add base tests * Fix Siglip V1 for fix res version * Add image processor * Update conversion * Experimenting with vectorized embeddings * Fixup * Add modular Siglip2Processor * Add modular configuration * Rename num patches * Correct image and text features merging * Working conversion script * Refactoring conversion script * Remove unused code in conversion script * Shorten dict a bit * Refactoring conversion * Done conversion refactoring * Fixup * Modular siglip2 * Make model exportable and compilable without graph breaks * Remove position_ids from image_processor * REmove position ids from modeling file * Update modular * Type hint * Fixup * Set defaults to processor * Add integration test * Revert spatial shapes back to tensor * Change order * Fix most of the tests * Fix docstring * Remove interpolate_pos_encoding arg (not needed) * Update docs * Standardize processing * Fix attention_mask in vision head * Siglip v1: remove double transpose in FA2 * Update modular file * Update FA2 test * Update expected logits * Fix interpolation for siglip2 image processor * Skip init test * Skip dispatch on flash test * Fix modeling tests * Fixup * Add dummy objects * Fix some docstrings * Add siglip2 in index.md * Fix consistency * Add docs * Remove size and data format * Add image processor tests * Fix * Add fast image processor * Fix style * Fix * Docs * Set lowercase for tokenizer * Adjust head size for Siglip v1 * Update siglip2 for consistency with siglip1 * Update siglip2 conversion * Update pipeline * Update checkpoints in tests * Update checkpoint name * Fix pooling for image classification model * Fix FA2 test * Update processor * Fix check repo * Update docs * Fix typos * Fix docstring for fast image processor * Add siglip2 to FA2 docs * Fix fast ip tests * Fix constitency * Fix tokenizer class for siglip v1 * Fix missing header * Refactor scaling for clip, 
siglip, siglip2 * Remove unused imports * Make fast IP default for siglip2 * Update docs * Update checkpoints * Update modular * Update paper link * Fixup * Fix name in toctree * Fix test	2025-02-21 09:04:19 +00:00
Orr Zohar	4397dfcb71	SmolVLM2 (#36126 ) * smolvlm init * updates * fixing bugs * minimal run, no checks * minimal run, no checks * passing first check + adding url support * updating video dataloading logic * fixing image logic * trying modular, but fails * modular is working, changing processor to match PR comments and general transformers logic * fixing kwargs * offloading video loading logic to image_util * fixing circleci code formatting errors * fixing circleci code formatting errors * fixing circleci code formatting errors * fixing circleci code formatting errors * fixing circleci code formatting errors * fixing circleci code formatting errors * fixing circleci code formatting errors * fixing circleci code formatting errors * fixing circleci code formatting errors * fixing circleci code formatting errors * fixing circleci code formatting errors * fixing circleci code formatting errors * fixing circleci code formatting errors * fixing circleci code formatting errors * update * add idefics3-based tests * add keyword to all * add PreTrainedModel * updateing video loading logic * working inference * updates for PR comments * updates for PR comments * moving SmolVLMPretrainedModel higher to fix import error * CI test pass * CI test pass * removing lambda * CI test pass * CI test pass * CI test pass * CI test pass * CI test pass * CI test pass * processor tests * add example in docs * typo * fix copies * skip compile tests - sdpa for VisionTransformer * fix init * raise import error for num2words * update doc for FA2 * more doc fix * CI * updates for PR comments * Update docs/source/en/model_doc/smolvlm.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update docs/source/en/model_doc/smolvlm.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update docs/source/en/model_doc/smolvlm.md Co-authored-by: Joshua Lochner <admin@xenova.com> * Update docs/source/en/model_doc/smolvlm.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update 
docs/source/en/model_doc/smolvlm.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * fixing processor -- tokenizer not defined properly, (gpt2 tokenizer), and does not have the attributes of fake image token, etc * adding smolvlm to VQA models * removing vqa auto class * Update src/transformers/models/smolvlm/processing_smolvlm.py Co-authored-by: Joshua Lochner <admin@xenova.com> * removing smolvlmvisiontransformer from index.md * my bad, video processing had typos * fixing docs * renaming params in SmolVLMModel.inputs_merger * removing un-needed dtype/device in model forward * ruff for CI * update docs * Update docs/source/en/model_doc/smolvlm.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * return cache position * return cache position * return cache also in modular * needed to run modular again * fix training tests * push vectorized inputs merger * format * format * reduce number of mappings * addressing PR comments * happy CI, happy me :) * skip non-nested images * adjust integration test for smaller GPUs * format * fix kwargs in chat template apply * skip this for now --------- Co-authored-by: raushan <raushan@huggingface.co> Co-authored-by: Pablo <pablo.montalvo.leroux@gmail.com> Co-authored-by: Pedro Cuenca <pedro@huggingface.co> Co-authored-by: Joshua Lochner <admin@xenova.com>	2025-02-20 15:00:26 +01:00
Joao Gante	99adc74462	[tests] remove flax-pt equivalence and cross tests (#36283 )	2025-02-19 15:13:27 +00:00
Joao Gante	0863eef248	[tests] remove `pt_tf` equivalence tests (#36253 )	2025-02-19 11:55:11 +00:00
Raushan Turganbay	0c78ef6cd3	🔴 VLM: compile compatibility (#35724 ) * llavas * add mroe models * fix `compile_forward` test for all models * fix copies * make style * also doesn't support cache class * fix some tests * not copied from * ci green? * fix tests * fix copies * fix tests * check with `numel` and remove `item` * fix copies * fix copies * Update src/transformers/models/cohere2/modeling_cohere2.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * opt remove cross attn * gemma2 * fixup * fixup * fix newly added test * maybe fixed? * green please? --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2025-02-14 15:23:49 +01:00
Joao Gante	62c7ea0201	CI: avoid human error, automatically infer generative models (#33212 ) * tmp commit * move tests to the right class * remove ALL all_generative_model_classes = ... * skip tf roberta * skip InstructBlipForConditionalGenerationDecoderOnlyTest * videollava * reduce diff * reduce diff * remove on vlms * fix a few more * manual rebase bits * more manual rebase * remove all manual generative model class test entries * fix up to ernie * a few more removals * handle remaining cases * recurrent gemma * it's better here * make fixup * tf idefics is broken * tf bert + generate is broken * don't touch tf :() * don't touch tf :( * make fixup * better comments for test skips * revert tf changes * remove empty line removal * one more * missing one	2025-02-13 16:27:11 +01:00
Pavel Iakubovskii	f42d46ccb4	Add common test for `torch.export` and fix some vision models (#35124 ) * Add is_torch_greater_or_equal test decorator * Add common test for torch.export * Fix bit * Fix focalnet * Fix imagegpt * Fix seggpt * Fix swin2sr * Enable torch.export test for vision models * Enable test for video models * Remove json * Enable for hiera * Enable for ijepa * Fix detr * Fic conditional_detr * Fix maskformer * Enable test maskformer * Fix test for deformable detr * Fix custom kernels for export in rt-detr and deformable-detr * Enable test for all DPT * Remove custom test for deformable detr * Simplify test to use only kwargs for export * Add comment * Move compile_compatible_method_lru_cache to utils * Fix beit export * Fix deformable detr * Fix copies data2vec<->beit * Fix typos, update test to work with dict * Add seed to the test * Enable test for vit_mae * Fix beit tests * [run-slow] beit, bit, conditional_detr, data2vec, deformable_detr, detr, focalnet, imagegpt, maskformer, rt_detr, seggpt, swin2sr * Add vitpose test * Add textnet test * Add dinov2 with registers * Update tests/test_modeling_common.py * Switch to torch.testing.assert_close * Fix masformer * Remove save-load from test * Add dab_detr * Add depth_pro * Fix and test RT-DETRv2 * Fix dab_detr	2025-02-11 11:37:31 +00:00
Zach Mueller	28f73bc307	Fix model kwargs (#35875 ) * Save state * Make a failing test * Better test * mpt -> done, many more to go * Rm extranious * Bamba * Bert * big_bird * biogpt * bloom * codegen * ctrl * data2vec * dbrx * Through up to Dbrx * electra * ernie * falcon * Fuyu/persimmon * Include noop kwargs to base models * Rebase * Skip musigen * Refactor/skip mllama * Revert makefile * Rm file * Fix PT failing, need to modify rest of loss funcs to not resize * Propagate some * Continue * More * More options * Mostly fixed * Proved that it's the same * Bloom is good * Make ability to override loss func possible * Fixup * Clean * Fix xglm * Quality tests * Skip OCR2 * Make specific loss for xglm * Make order the same/line up 1:1 * xglm * Skip fx output loss bloom model * Didn't pass in pad_token_id * Fix quality	2025-02-06 11:35:25 -05:00
Yih-Dar	dce9970884	Update `test_flash_attn_2_can_dispatch_composite_models` (#36050 ) * update * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-02-06 12:09:49 +01:00
Yih-Dar	fe52679e74	Update tests regarding attention types after #35235 (#36024 ) * update * update * update * dev-ci * more changes * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-02-04 18:04:47 +01:00
Yih-Dar	5757681837	Less flaky for `TimmBackboneModelTest::test_batching_equivalence` (#35971 ) * fix * remove is_flaky * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-01-30 16:56:26 +01:00
Raushan Turganbay	9725e5be2f	Pixtral: vectorize patch embeddings and enable tests (#35122 ) * initial POC * - batch mix feature * fix tests * fix tests * make style * do not skip and instead fix tests * update * return back the test * correct text with the correct ckpt	2025-01-30 12:40:18 +01:00
Joao Gante	8bc4c89ee9	[bart] minor test fixes (#35965 ) fix tests	2025-01-30 10:00:11 +00:00
Raushan Turganbay	b764c20b09	Fix: loading DBRX back from saved path (#35728 ) * fix dtype as dict for some models + add test * add comment in tests	2025-01-28 11:38:45 +01:00
Arthur	b912f5ee43	use torch.testing.assertclose instead to get more details about error in cis (#35659 ) * use torch.testing.assertclose instead to get more details about error in cis * fix * style * test_all * revert for I bert * fixes and updates * more image processing fixes * more image processors * fix mamba and co * style * less strick * ok I won't be strict * skip and be done * up	2025-01-24 16:55:28 +01:00
Cyril Vallez	d3af76df58	[Backend support] Allow `num_logits_to_keep` as Tensor + add flag (#35757 ) * support * Update modeling_utils.py * style * most models * Other models * fix-copies * tests + generation utils	2025-01-23 09:47:54 +01:00
Arthur	8736e91ad6	[ `tests`] remove some flash attention class tests (#35817 ) remove class from tests	2025-01-23 09:44:21 +01:00
Raushan Turganbay	373e50e970	Init cache on meta device (#35164 ) * init cache on meta device * offloaded static + enable tests * tests weren't running before :( * update * fix mamba * fix copies * update * address comments and fix tests * fix copies * Update src/transformers/cache_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * update * mamba fix --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2025-01-22 09:49:17 +01:00
Yih-Dar	fd8d61fdb2	Byebye `test_batching_equivalence`'s flakiness (#35729 ) * fix * fix * skip * better error message --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-01-21 13:11:33 +01:00
Raushan Turganbay	09d5f76274	Clean-up composite configs (#34603 ) * remove manual assignment tie-word-embeddings * remove another unused attribute * fix tests * fix tests * remove unnecessary overwrites * fix * decoder=True * clean pix2struct * run-all * forgot `_tied_weights_keys` when adding Emu3 * also Aria + fix-copies * and clean aria	2025-01-15 10:04:07 +01:00

434 Commits