transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
ZuoChen_BUPT	c772d4d91e	fix a typo bug where 'id2label' was incorrectly written as 'i2label' when reading config (#34637 ) fix a bug where 'id2label' was incorrectly written as 'i2label' when reading the config from pretrained config	2024-11-18 14:41:48 +01:00
Ofek Lev	eb0ab3ed4b	Fix broken link (#34618 )	2024-11-18 14:13:26 +01:00
Raushan Turganbay	1646ffb4d1	VLMs: `patch_size` -> `num_image_tokens` in processing (#33424 ) * use num additional tokens * fix copies + docs * another fix copies :) * add docs * move order for BC	2024-11-18 13:21:07 +01:00
Shane A	3ee24e2208	Add OLMo November 2024 (#34551 ) * Add model skeletion with transformers-cli add-new-model-like * Convert config to modular, add rms_norm_eps, delete clip_qkv * Convert model to modular, add RMSNorm * Add flash attention with qk norm and no qkv clipping * Add decoder layer with RMSNorm after attention/feedforward layers * Add base and causal model * Add converter improvements from OLMo repo * Update weight loading in OLMo to HF converter * Set correct default for rms_norm_eps * Set correct pipeline_model_mapping in test * Run make fixup * Fix model type * Re-run modular conversion * Manually set config docs to fix build errors * Convert olmo-1124 to olmo_1124 to fix flash attention docs errors * Start updating tests * Update tests * Copy upstream test_eager_matches_sdpa_inference_1_bfloat16 changes to olmo_1124 * Rename input_layernorm and post_attention_layernorm to reflect their ops better * Use correct tokenizer * Remove test unsupported by GPT2 tokenizer * Create GenerationConfig outside of from_pretrained call * Use simpler init file structure * Add explicit __all__ to support simplified init * Make safetensor serialization the default * Update OLMo November 2024 docs	2024-11-18 10:43:10 +01:00
Joao Gante	13493215ab	🧼 remove v4.44 deprecations (#34245 ) * remove v4.44 deprecations * PR comments * deprecations scheduled for v4.50 * hub version update * make fiuxp --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-11-15 23:07:24 +01:00
AbdelKarim ELJANDOUBI	8d50fda644	Remove FSDP wrapping from sub-models. (#34452 ) * Remove FSDP wrapping from sub-models. * solve conflict trainer.py * make fixup * add unit test for fsdp_auto_wrap_policy when using auto_find_batch_size * put back extract_model_from_parallel * use transformers unwrap_model	2024-11-15 23:00:03 +01:00
Wing Lian	b0c0ba7b4d	FSDP grad accum fix (#34645 ) * add gradient accumulation steps tests for fsdp * invert no_sync context to fix training for fsdp	2024-11-15 22:28:06 +01:00
jiqing-feng	52ea4aa589	add xpu path for awq (#34712 ) * add xpu path for awq * update readme	2024-11-15 15:45:24 +01:00
CezaPasc	7b3d615bc2	fix(wandb): pass fake dataset to avoid exception in trainer (see #34455 ) (#34720 )	2024-11-15 15:44:02 +01:00
Lysandre Debut	f5dbfab7f3	Update llava.md (#34749 ) LLava -> Llava	2024-11-15 15:39:57 +01:00
lewtun	8ba3e1505e	Retain newlines in chat template when `continue_final_message=True` (#34253 ) * Retain newlines in chat template when * Add try/except * Add regression test * Simplify test * Apply suggestions from code review Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> --------- Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>	2024-11-15 14:27:04 +00:00
Fanli Lin	a3d69a8994	[docs] add xpu device check (#34684 ) * add XPU path * use accelerate API * Update docs/source/en/tasks/semantic_segmentation.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * update more places with accelerate API --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-11-13 14:16:59 -08:00
Xiao Yuan	68f8186a89	Fix example in EsmConfig docstring (#34653 )	2024-11-13 13:55:58 -08:00
Pedro Cuenca	e7c36a9d57	[docs] Broken link in generation_strategies (#34717 ) [docs] Broken link	2024-11-13 13:44:42 -08:00
MaCAT	be8748a53c	🌐 [i18n-KO] Translated marian.md to Korean (#34698 ) * initial translation * removed english * Fixed Trivial Typos, updated _toctree.yml	2024-11-13 13:14:23 -08:00
Aymeric Roucher	33eef99250	Agents: Small fixes in streaming to gradio + add tests (#34549 ) * Better support transformers.agents in gradio: small fixes and additional tests	2024-11-11 20:52:09 +01:00
Ahmed Almaghz	6de2a4d1f1	[i18n-ar] Translated file : `docs/source/ar/torchscript.md` into Arabic (#33079 ) * Add docs/source/ar/torchscript.md to Add_docs_source_ar_torchscript.md * Update docs/source/ar/torchscript.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/torchscript.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/torchscript.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/torchscript.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/torchscript.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/torchscript.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/torchscript.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/torchscript.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/torchscript.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/torchscript.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/torchscript.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/torchscript.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/torchscript.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/torchscript.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/torchscript.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/torchscript.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/torchscript.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/torchscript.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/torchscript.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/torchscript.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Merge troubleshooting.md with this Branch * Update _toctree.yml * Update torchscript.md * Update troubleshooting.md --------- Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>	2024-11-11 10:41:01 -08:00
Fanli Lin	25f510a9c6	[docs] update not-working model revision (#34682 ) update revision	2024-11-11 07:09:31 -08:00
Aymeric Roucher	3ea3ab62d8	Agents: turn any Space into a Tool with `Tool.from_space()` (#34561 ) * Agents: you can now load a Space as a tool	2024-11-10 12:22:40 +01:00
Louis Brulé Naudet	134ba90da9	Update llm_engine.py (#33332 ) * Update llm_engine.py - Added support for optional token and max_tokens parameters in the constructor. - Provided usage examples and detailed documentation for each method.	2024-11-10 12:19:20 +01:00
Ahmed Almaghz	768f3c016e	[i18n-ar] Translated file : `docs/source/ar/trainer.md` into Arabic (#33080 ) * Add docs/source/ar/trainer.md to Add_docs_source_ar_trainer.md * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update trainer.md * Update trainer.md * Update trainer.md * Create _toctree.yml * Delete docs/source/ar/_toctree.yml * Update _toctree.yml - add trainer * Update _toctree.yml * merge serialization.md into this branch * merge sagemaker.md into this PR * Update _toctree.yml * Update docs/source/ar/trainer.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ar/trainer.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-11-09 11:26:28 -08:00
MaCAT	a06a0d1263	🌐 [i18n-KO] Translated bert.md to Korean (#34627 ) * Translated bert.md, Need additional check * Translation 2nd ver, changed _toctree.yml * Fixed Typo * Update bert.md Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com> * Update bert.md Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com> * Update bert.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update bert.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-11-07 18:56:09 -08:00
Jiwook Han	1cf17077bf	🌐 [i18n-KO] Translated `timesformer.md` to Korean (#33972 ) * docs: ko: model_doc/timesformer.md * feat: nmt draft * fix: manual edits * fix_toctree * fix toctree on Video Models	2024-11-07 11:04:27 -08:00
Ivan Shcheklein	6938524a28	fix(dvclive): pass fake dataset to avoid exception in trainer init (#34455 ) fix(dvclive): pass fake dataset to avoid exception in trainer	2024-11-07 15:57:34 +01:00
Ahnjj_DEV	7bbc624743	🌐 [i18n-KO] Translated `convbert.md` to Korean (#34599 ) * docs: ko: convbert.md * Update _toctree.yml * feat: nmt draft	2024-11-05 09:32:17 -08:00
Isotr0py	e83aaaa86b	Fix `use_parallel_residual` and `qkv_bias` for StableLM GGUF config extraction (#34450 ) * fix stablelm qkv_bias * fix stablelm qkv_bias and use_parallel_residual * remove original_model.config for stablelm gguf test	2024-11-05 18:26:20 +01:00
Yoni Gozlan	9f28d0c5d0	Fix torchvision interpolation CI (#34539 ) fix-torch-interpolation-ci	2024-11-05 11:02:14 -05:00
Mohamed Mekkouri	d2bae7ee9d	Changing __repr__ in torchao to show quantized Linear (#34202 ) * Changing __repr__ in torchao * small update * make style * small update * add LinearActivationQuantizedTensor * remove some cases * update imports & handle return None * update	2024-11-05 16:11:02 +01:00
Yih-Dar	f2d5dfbab2	Remove `@slow` for `test_eager_matches_sdpa_inference` (#34558 ) * update * update * update * update * update * update * update * update * update * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-11-05 16:10:42 +01:00
Yoni Gottesman	082e57e0d4	Fix #34494 assistant tokens when truncated (#34531 ) * Fix assistant tokens when truncated * fix test * fix test * step	2024-11-05 15:10:15 +00:00
Yih-Dar	74d3824cc0	Revert "Fix Whisper CI" (#34605 ) Revert "Fix Whisper CI (#34541)" This reverts commit `eb811449a2`.	2024-11-05 15:12:47 +01:00
Eon Kim	45b0c7680c	Remove unused test_dataset (#34516 )	2024-11-05 14:01:25 +00:00
Guang Yang	663c851239	DistilBERT is ExecuTorch compatible (#34475 ) * DistillBERT is ExecuTorch compatible * [run_slow] distilbert * [run_slow] distilbert --------- Co-authored-by: Guang Yang <guangyang@fb.com>	2024-11-05 13:41:48 +01:00
Raushan Turganbay	893ad04fad	Load sub-configs from composite configs (#34410 ) * save/load sub-configs * nit forgot these * fix copies * move test to common * use dict for sub-configs * add load-save-laod test * clean up modeling check * oops this are correct keys * fix some tests, missed some composite configs * this model was missed	2024-11-05 11:34:01 +01:00
Benjamin Bossan	5e1fd4e204	FIX: Broken repr of TorchAoConfig (#34560 ) FIX Broken repr of TorchAoConfig The __repr__ method references a non-existent self.kwargs. This is now fixed. There does not appear to be a uniform way of defining __repr__ for quantization configs. I copied the method as implemented for HQQ: `e2ac16b28a/src/transformers/utils/quantization_config.py (L285-L287)`	2024-11-05 10:26:13 +01:00
AbdelKarim ELJANDOUBI	d0b1d8d888	Skip DeepSpeed ZeRO Stage 3 model initialization when bnb (#34395 ) * Skip DeepSpeed ZeRO Stage 3 model initialization when it is intended to be quantized. * Propagate the quantization state using a context manager * make fixup	2024-11-05 10:06:07 +01:00
Yih-Dar	eb811449a2	Fix Whisper CI (#34541 ) update Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-11-04 21:35:37 +01:00
kang sheng	bfa021be05	fix TrainerState doc because num_input_tokens_seen is unused by defau… (#34593 ) fix TrainerState doc because num_input_tokens_seen is unused by default config Co-authored-by: kangsheng <kangsheng@meituan.com>	2024-11-04 09:42:20 -08:00
Ju Hoon Park	0a6795af12	🌐 [i18n-KO] Update README_ko.md (#33098 ) * Update README_ko.md Delete the blank paragraph in the language selection button and Edit to synchronize with the English version of README.md * [i18n-KO] Update README_ko.md * Additional edit for keep consistency with main [documentation](https://huggingface.co/docs/transformers/v4.44.2/ko/index). (메인 문서와 일관성 유지를 위한 수정) * Update README_ko.md Additional update. * Change docs link to Korean translated page if it exists. * Change doc link to korean translated if it exists. Change the link of doc and delete a row 'migration' of the table Learn more[더 알아보기], since it does not exist in the main version of doc. * modify a link of the main README.md from `https://huggingface.co/docs/transformers/index#supported-frameworks` to `https://huggingface.co/docs/transformers/index#supported-models-and-frameworks` since the title of 'supported table' changed. * [i18n-ko] edit links and sync with main `README.md` * docs/change comment to Korean1 Change English comment to Korean Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com> * docs/change comment to Korean2 Change English comment to Korean Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com> * revise to original to seperate `edit_README_ko_md` and `README.md` * Synchronization with English documentation. Synchronization with English documentation, and translated a line of comment from English to Korean. --------- Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>	2024-11-04 09:42:07 -08:00
MaCAT	1112c54604	🌐 [i18n-KO] Translated perf_train_special.md to Korean (#34590 ) * Translated to Ko, 1st version * updated _toctree.yml	2024-11-04 09:41:44 -08:00
Karthik Vallamsetla	a86bd6f2d8	[i18n-HI] Translated TFLite page to Hindi (#34572 ) * [i18n-HI] Translated TFLite page to Hindi * [i18n-HI] Translated TFLite page to Hindi * Update docs/source/hi/tflite.md Co-authored-by: K.B.Dharun Krishna <kbdharunkrishna@gmail.com> --------- Co-authored-by: K.B.Dharun Krishna <kbdharunkrishna@gmail.com>	2024-11-04 09:40:30 -08:00
JacobLinCool	48831b7d11	Add text support to the Trainer's TensorBoard integration (#34418 ) * feat: add text support to TensorBoardCallback * feat: ignore long strings in trainer progress * docs: add docstring for max_str_len * style: remove trailing whitespace --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2024-11-04 17:36:27 +01:00
Joao Gante	34927b0f73	MPS: `isin_mps_friendly` can support 0D tensors (#34538 ) * apply fix * tested * make fixup	2024-11-04 16:18:50 +00:00
Raushan Turganbay	187439c3fa	VLM: special multimodal Tokenizer (#34461 ) * kinda works * update * add tests * update * use special tokens in processors * typo * fix copies * fix * fix moshi after rebase * update * fix tests * update * Update docs/source/en/main_classes/tokenizer.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * update docs * test for load time adding tokens * fix some more tests which are now fetched better * one more fix --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-11-04 16:37:51 +01:00
Zach Mueller	ef976a7e18	Update trainer for easier handling of accumulate, compile fixes, and proper reporting (#34511 ) * Update trainer for easier handling of accumulate + proper reporting * test * Fixup tests * Full fix * Fix style * rm comment * Fix tests * Minimize test + remove py 311 check * Unused import * Forward contrib credits from discussions * Fix reported metrics * Refactor, good as it's going to get * rm pad tok id check * object detection and audio are being annoying * Fin * Fin x2 --------- Co-authored-by: Gyanateet Dutta <Ryukijano@users.noreply.github.com>	2024-11-04 07:47:34 -05:00
Karthik Vallamsetla	33868a057c	[i18n-HI] Translated accelerate page to Hindi (#34443 ) * [i18n-HI] Translated accelerate page to Hindi * Update docs/source/hi/accelerate.md Co-authored-by: K.B.Dharun Krishna <kbdharunkrishna@gmail.com> * Update docs/source/hi/accelerate.md Co-authored-by: K.B.Dharun Krishna <kbdharunkrishna@gmail.com> * Update docs/source/hi/accelerate.md Co-authored-by: K.B.Dharun Krishna <kbdharunkrishna@gmail.com> * Update docs/source/hi/accelerate.md Co-authored-by: K.B.Dharun Krishna <kbdharunkrishna@gmail.com> --------- Co-authored-by: Kay <kay@Kays-MacBook-Pro.local> Co-authored-by: K.B.Dharun Krishna <kbdharunkrishna@gmail.com>	2024-11-01 08:26:45 -07:00
Cyril Vallez	e2ac16b28a	Large modular logic refactoring (#34487 ) * rework converter * Update modular_model_converter.py * Update modular_model_converter.py * Update modular_model_converter.py * Update modular_model_converter.py * cleaning * cleaning * finalize imports * imports * Update modular_model_converter.py * Better renaming to avoid visiting same file multiple times * start converting files * style * address most comments * style * remove unused stuff in get_needed_imports * style * move class dependency functions outside class * Move main functions outside class * style * Update modular_model_converter.py * rename func * add augmented dependencies * Update modular_model_converter.py * Add types_to_file_type + tweak annotation handling * Allow assignment dependency mapping + fix regex * style + update modular examples * fix modular_roberta example (wrong redefinition of __init__) * slightly correct order in which dependencies will appear * style * review comments * Performance + better handling of dependencies when they are imported * style * Add advanced new classes capabilities * style * add forgotten check * Update modeling_llava_next_video.py * Add prority list ordering in check_conversion as well * Update check_modular_conversion.py * Update configuration_gemma.py	2024-11-01 10:13:51 +01:00
Pablo Montalvo	86701f2b6f	🔴 🔴 fix `query_pre_attn_scalar` different of `num_heads` in default gemma2 config (#34540 ) * fix query_pre_attn_scalar different of num_heads in default config * propagate modular changes * fix copies * fix modular copies * fix copies? * correct copies fix	2024-11-01 09:06:17 +01:00
Raushan Turganbay	4cc0813e28	BLIP: enable generation tests (#34174 ) * blip2 tests * instructblips * copies * fix slow tests * fix * uncomment this * clean up after rebase * should be model main input * fix overwritten tests * oops len should be multiple of frame number * style * fix some tests	2024-11-01 08:54:48 +01:00
Raushan Turganbay	6beb3f1691	Blip: get/set input embeddings correctly (#34152 ) * set-get embeds * add tests * fix tests * remove * return dict True * fix tests * why did i remove this * enabel torchscript tests	2024-11-01 08:39:39 +01:00

1 2 3 4 5 ...

17364 Commits