transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-30 17:52:35 +06:00

Author	SHA1	Message	Date
Cyril Vallez	17806d11ba	Improve modular converter (#33991 ) * improve modular * style * Update modular_model_converter.py * pretty print warning * style * Support to remove unused classes as part of added dependencies as well * nits * correct bug * add example * style * Add documentation	2024-10-08 14:53:58 +02:00
Matt	fb360a6c7a	BatchFeature.to() supports non-tensor keys (#33918 ) * Fix issue in oneformer preprocessing * [run slow] oneformer * [run_slow] oneformer * Make the same fixes in DQA and object detection pipelines * Fix BatchFeature.to() instead * Revert pipeline-specific changes * Add the same check in Pixtral's methods * Add the same check in BatchEncoding * make sure torch is imported	2024-10-08 13:43:32 +01:00
Matt	3b44d2f042	Image pipelines spec compliance (#33899 ) * Update many similar visual pipelines * Add input tests * Add ImageToText as well * Add output tests * Add output tests * Add output tests * OutputElement -> Output * Correctly test elements * make fixup * fix typo in the task list * Fix VQA testing * Add copyright to image_classification.py * Revert changes to VQA pipeline because outputs have differences - will move to another PR * make fixup * Remove deprecation warnings	2024-10-08 13:34:28 +01:00
Yoni Gozlan	e2001c3413	Add auto model for image-text-to-text (#32472 ) * Add Auto model for image-text-to-text * Remove donut from processing auto, add chameleon ti image text to text models * add qwen2_vl and llava_onevision * add pixtral to auto model for image-text-to-text * add mllama and idefics3 * remove models in IGNORE_NON_AUTO_CONFIGURED * add AutoModelForImageTextToText to tests and doc	2024-10-08 14:26:43 +02:00
Raushan Turganbay	0dbc7090ba	Processors: don't default padding side (#33942 ) * don't default padding side * fix	2024-10-08 10:58:49 +02:00
Arthur	a3add29097	Add support for __all__ and potentilly deleting functions (#33859 ) * Add support for __all__ and potentailly deleting functions * updates * update * nits * remove dummies * fix warning * fixup * style * update * fixup * skip copied from when # skip * remove log * bring dummies back * fixup * remove copied from * fixup * remove warnings from `make fix-copies` * fix doc issues * nits * Better error message ! * add support for more flexible naming! * style * breaking style? * fix super() renaming issues * del not needed when you don't call super().__init__() * style * no more fmt on :) * properly remove `self` * fixup * fix * doc nits * add some doc 🫡	2024-10-08 10:19:17 +02:00
Raushan Turganbay	bead0fa8dc	Cache: slight change in naming (#32421 ) * squash * codestyle * Update src/transformers/cache_utils.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * propagate changes to all cache classes * + whisper * fix tests * more fixes * add deprecation warning * fix copies * address comments * fix mistral also * these didn't have "copied from" --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2024-10-08 09:43:40 +02:00
Yijun Lee	d6ba1ac041	🌐 [i18n-KO] Translated `gemma.md` to Korean (#33936 ) * docs: ko: gemma.md * feat: nmt draft * fix: manual edits	2024-10-07 15:59:14 -07:00
Jiwook Han	46f146a2b5	🌐 [i18n-KO] Translated `vit.md` to Korean (#33884 ) * docs: ko: model_doc/vit.md * feat: nmt draft * fix: manual edits * fix: manual edits * Update docs/source/ko/model_doc/vit.md Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com> * Update docs/source/ko/model_doc/vit.md Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr> --------- Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com> Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>	2024-10-07 15:35:11 -07:00
Jiwook Han	1ecca92f03	🌐 [i18n-KO] Translated `swin2sr.md` to Korean (#33795 ) * ko: doc: model_doc/swin2sr.md * feat: nmt draft * Update docs/source/ko/model_doc/swin2sr.md Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com> --------- Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>	2024-10-07 15:34:56 -07:00
boyunJang	8258219c4c	🌐 [i18n-KO] Translated `auto.md` to Korean (#33590 ) * docs: ko: model_doc/auto.md * feat: nmt draft * fix: manual edits * fix: resolve suggestions Co-authored-by: wony617 <49024958+Jwaminju@users.noreply.github.com> Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com> * fix: resolve suggestions --------- Co-authored-by: wony617 <49024958+Jwaminju@users.noreply.github.com> Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>	2024-10-07 15:34:45 -07:00
Chaewon Song	253a9a9d6f	🌐 [i18n-KO] Translated `logging.md` to Korean (#33543 ) * docs: ko: main_classes/logging.md * feat: nmt-draft * fix: update toctree.yml * Update docs/source/ko/main_classes/logging.md Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com> * Update docs/source/ko/main_classes/logging.md Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com> Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com> * Apply suggestions from code review Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com> --------- Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com> Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com> Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>	2024-10-07 15:34:34 -07:00
Yijun Lee	178d707b7e	🌐 [i18n-KO] Translated `chameleon.md` to Korean (#33799 ) * docs: ko: chameleon.md * feat: nmt draft * fix: manual edits * fix: resolve suggestions Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com> Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr> --------- Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com> Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>	2024-10-07 15:06:13 -07:00
Yijun Lee	13432f8409	🌐 [i18n-KO] Translated `trainer.md` to Korean (#33797 ) * docs: ko: trainer.md * feat: nmt draft * fix: manual edits * fix: resolve suggestions Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com> Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr> --------- Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com> Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>	2024-10-07 15:05:57 -07:00
Yijun Lee	e9fbe62965	🌐 [i18n-KO] Translated `pipelines_utils.md` to Korean (#33809 ) * docs: ko: pipelines_utils.md * feat: nmt draft * fix: manual edits	2024-10-07 15:05:17 -07:00
Yijun Lee	9c61ba2f25	🌐 [i18n-KO] Translated `time_series_utils.md` to Korean (#33806 ) * docs: ko: time_series_utils.md * feat: nmt draft * fix: manual edits	2024-10-07 15:05:00 -07:00
Yijun Lee	9c8bd3fc1b	🌐 [i18n-KO] Translated `esm.md` to Korean (#33796 ) * docs: ko: esm.md * feat: nmt draft * fix: manual edits	2024-10-07 13:39:22 -07:00
Yijun Lee	6996f2186a	🌐 [i18n-KO] Translated `audio_utils.md` to Korean (#33802 ) * docs: ko: audio_utils.md * feat: nmt draft * fix: manual edits	2024-10-07 13:39:10 -07:00
Jiwook Han	410c73af1d	🌐 [i18n-KO] Translated `swinv2.md` to Korean (#33566 ) * docs: ko: model_doc/swinv2.md * feat: nmt draft * fix: manual edits * fix: manual edits	2024-10-07 12:50:43 -07:00
Yijun Lee	6c18cefed0	🌐 [i18n-KO] Translated `gguf.md` to Korean (#33764 ) * docs: ko: gguf.md * feat nmt draft * fix: manual edits * fix: resolve suggestions Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com> Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr> --------- Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com> Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>	2024-10-07 12:49:08 -07:00
Michael Goin	c91fe85b78	Fix undefined default_config in configuration_utils.py (#33934 )	2024-10-07 18:32:20 +02:00
Arthur	736c7cde51	[`pytes collection`] Fix flax test collection (#34004 ) bit weird but to filter I had to use this	2024-10-07 18:11:13 +02:00
roy	55be7c4c48	Enable customized optimizer for DeepSpeed (#32049 ) * transformers: enable custom optimizer for DeepSpeed * transformers: modify error message --------- Co-authored-by: datakim1201 <roy.kim@maum.ai>	2024-10-07 15:36:54 +02:00
Arthur	7bae833728	properly fix and RUN_SLOW (#33965 ) * properly fix and RUN_SLOW * lots of models were affected * fix-copies * more fixes	2024-10-07 14:45:57 +02:00
Kaito	e782e95e34	Fix Tensor + Embedding error in some cases when using SiglipVisionModel (#33994 ) Fix Tensor + Embedding error in some cases Co-authored-by: kaitolucifer <kaito.o@ghelia.com>	2024-10-07 11:17:34 +02:00
Arthur	9b4b0c07db	[`Red CIs`] Fix hub failures (#34001 ) maybe setup should work?	2024-10-07 10:56:24 +02:00
Magnus	ad1a250719	[Docs] Add Developer Guide: How to Hack Any Transformers Model (#33979 ) * docs: add example for separating q, k, v projections in SAM * docs: How to Hack Any Transformers Model * docs: remove changes from sam model docs * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-10-07 10:08:20 +02:00
NielsRogge	f5aeb7c1a5	[Docs] Improve VLM docs (#33393 ) * Improve docs * Update docs/source/en/model_doc/llava.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/llava.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Address comment * Address comment * Improve pixtral docs --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-10-07 09:54:07 +02:00
Cyril Vallez	1f33023cfa	Flash-attn performance: remove cuda sync during inference (#33570 ) Switch conditions to use short-circuit during inference	2024-10-07 09:52:19 +02:00
Avishai Elmakies	4953ddf036	Add position ids in forward pass to opt model (#33121 ) * start working on adding position ids * add docs * Refactor modeling_biogpt.py and modeling_opt.py for code consistency * fix 2 PR comments * move position_ids to end of args * remove trailing white space * add comment with TODO * bug fix gradient checkpointing * fixup * missed on position_ids * remove _attention_to_position_ids and refactor embedding class * remove redundent code --------- Co-authored-by: Avishai Elmakies <avishai.elma@cs.huji.ac.il>	2024-10-07 09:20:49 +02:00
TomLim	1bd604d11c	[WIP] Add Tokenizer for MyT5 Model (#31286 ) * Initial commit for MyT5 model * custom implementation of MyT5 tokenizer, unused files deleted * unittest for myt5 tokenizer * upadate of import structure and style * removed remmanents of MyT5Config * fixed docstrings * Updates after review: filled documentaion file, new docstrings and tests added * Fixed code style issues * fixed copied from to refer to function * updated loading myt5 tokenizer in tests, added sample byte map file to fixtures * changes after review * removed redundant copied from * removed redundant copied from * optimalization and loading model from hf * [run_slow] myt5 * [run-slow] myt5 * Updated en documentation for myt5 Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-10-06 10:33:16 +02:00
Anton Vlasjuk	5ef432e474	[`TF`] Fix Tensorflow XLA Generation on limited seq_len models (#33903 ) * fix tf xla generation on limited seq_len models * [run-slow] opt * [run-slow] opt	2024-10-05 16:20:50 +02:00
Vladislav Bronzov	22e102ad98	Bug fix gguf qwen2moe (#33940 ) * fix qwen2moe tensors mapping, add unit tests * add expert tensor split logic, test refactoring * small params refactoring * add comment to tensor reshaping	2024-10-05 16:19:01 +02:00
Yehoshua Cohen	56be9f1925	add test for Jamba with new model jamba-tiny-dev (#33863 ) * add test for jamba with new model * ruff fix --------- Co-authored-by: Yehoshua Cohen <yehoshuaco@ai21.com>	2024-10-05 16:03:12 +02:00
Adam Pocock	a7e4e1a77c	Updating `char_to_token` documentation to note behaviour when `trim_offsets` is True (#33919 ) Updating char_to_token documentation.	2024-10-05 14:13:26 +02:00
Raushan Turganbay	612065efeb	Paligemma: fix static cache test (#33941 ) * fix * not flaky anymore + style	2024-10-05 09:47:37 +02:00
Joao Gante	38f9f10dd9	Cache: revert DynamicCache init for BC (#33861 ) * tmp commit * tmp commit * make fixup * missing removal * fix condition * fix end-to-end compilation * if -> elif * BC * BC * use @deprecate_kwarg("num_hidden_layers", version="4.47.0") * wups the import * 🥴 --------- Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>	2024-10-04 22:47:08 +02:00
Arthur	f92d354823	fix red check-copies (#33964 )	2024-10-04 22:45:37 +02:00
pglorio	f319ba16fa	Add Zamba (#30950 ) * Update index.md * Rebase * Rebase * Updates from make fixup * Update zamba.md * Batched inference * Update * Fix tests * Fix tests * Fix tests * Fix tests * Update docs/source/en/model_doc/zamba.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/en/model_doc/zamba.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update configuration_zamba.py * Update src/transformers/models/zamba/modeling_zamba.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/zamba/modeling_zamba.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/zamba/modeling_zamba.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/zamba/modeling_zamba.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update modeling_zamba.py * Update modeling_zamba.py * Update modeling_zamba.py * Update configuration_zamba.py * Update modeling_zamba.py * Update modeling_zamba.py * Merge branch 'main' of https://github.com/Zyphra/transformers_zamba * Update ZambaForCausalLM * Update ZambaForCausalLM * Describe diffs with original mamba layer * Moved mamba init into `_init_weights` * Update index.md * Rebase * Rebase * Updates from make fixup * Update zamba.md * Batched inference * Update * Fix tests * Fix tests * Fix tests * Fix tests * Update docs/source/en/model_doc/zamba.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/en/model_doc/zamba.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update configuration_zamba.py * Update src/transformers/models/zamba/modeling_zamba.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/zamba/modeling_zamba.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/zamba/modeling_zamba.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/zamba/modeling_zamba.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update modeling_zamba.py * Update modeling_zamba.py * Update modeling_zamba.py * Update configuration_zamba.py * Update modeling_zamba.py * Update modeling_zamba.py * Merge branch 'main' of https://github.com/Zyphra/transformers_zamba * Update ZambaForCausalLM * Moved mamba init into `_init_weights` * Update ZambaForCausalLM * Describe diffs with original mamba layer * make fixup fixes * quality test fixes * Fix Zamba model path * circleci fixes * circleci fixes * circleci fixes * circleci fixes * circleci fixes * circleci fixes * circleci fixes * circleci fixes * circleci fixes * Update * circleci fixes * fix zamba test from merge * fix ValueError for disabling mamba kernels * add HF copyright Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * shared_transf --> shared_transformer * Update src/transformers/models/zamba/modeling_zamba.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/zamba/modeling_zamba.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Fixes * Move attention head dim to config * Fix circle/ci tests * Update modeling_zamba.py * apply GenerationMixin inheritance change from upstream * apply import ordering * update needed transformers version for zamba Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * add contribution author * add @slow to avoid CI * Update src/transformers/models/zamba/modeling_zamba.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Define attention_hidden_size * Added doc for attention_head_size * trigger CI * Fix doc of attention_hidden_size * [run-slow] zamba * Fixed shared layer logic, swapped up<->gate in mlp * shared_transformer -> shared_transf * reformat HybridLayer __init__ * fix docstrings in zamba config * added definition of _get_input_ids_and_config * fixed formatting of _get_input_ids_and_config --------- Co-authored-by: root <root@node-4.us-southcentral1-a.compute.internal> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: root <root@node-1.us-southcentral1-a.compute.internal> Co-authored-by: Quentin Anthony <qganthony@yahoo.com>	2024-10-04 22:28:05 +02:00
Amit Garg	e3775539c8	PhiMoE (#33363 ) * onboard phimoe model * removed debug code * added unit tests * updated docs * formatted * fixed unit tests * fixed test case * fixed format * refactored code * fixed expected outputs in the integration tests * Added a warning msg * Addressed comments * Addressed comments * fixed test cases * added paper link * Addressed comments * Refactored PhimoeForCausalLM forward fn * Refactored PhimoeRotaryEmbedding class * fixed test cases * fixed testcase * fixed test case * Addressed comments * fixed test cases * fixed testcases * Used cache position instead to get the seq len	2024-10-04 21:39:45 +02:00
Arthur	46579c0e77	hot fix `self.position_embeddings->self.position_embedding` (#33958 )	2024-10-04 21:35:31 +02:00
Longjie Zheng	0d1692a49b	Fix attn mask ignore logic in training-time trace (#32613 ) * fix attn mask logic for training-time trace * add test * fix * fix * fix * fix * fix * format * [run-slow] llama * avoid accelearate * [run-slow] llama	2024-10-04 19:00:45 +02:00
karan-uppal3	614660fdb9	Removed unnecessary transpose in Switch Transformer Routing (#33582 ) removed switch transformer routing transpose	2024-10-04 17:39:03 +02:00
Mohamed Abu El-Nasr	78ef58325c	🔴 🚨 Resizing tokens embeddings: initialize from old embeddings' normal distribution. (#33325 ) * intilize new embeddings from normal distrib * Fix typo in comments * Fix typo in comments * Fix style * Fix variables naming * Add tests * Fix style * code consistency nit * Add deepspeed support * Add deepspeed support * Conver embeddings weights to float32 before computations * Add deepspeed tests * Cover when vocab_size is smaller than embedding_size * Style fix * Add tests for vocab_size smaller than hiddin_size * Style fix * Nits in tests * Nits in tests * Check for deepspeed before importing it * Increase vocab_size for positive definite covariance matrix test * Add warning * Add multivariate_resizing flag and implement resizing for lm_heads * Fix typo * Fix wrong bias indexing * Fix bias is zero check * remove multivariate_resizing flag from tests * Intialize bias from old bias normal distribution * Fixup * Code usability * Use mean_resizing instead of multivariate_resizing * Fix up * Fix comments and docs	2024-10-04 16:29:55 +02:00
jiqing-feng	b916efcb3c	Enables CPU AWQ model with IPEX version. (#33460 ) * enable cpu awq ipex linear * add doc for cpu awq with ipex kernel * add tests for cpu awq * fix code style * fix doc and tests * Update docs/source/en/quantization/awq.md Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update tests/quantization/autoawq/test_awq.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * fix comments * fix log * fix log * fix style --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2024-10-04 16:25:10 +02:00
Matt	de4112e4d2	Add a section on writing tool templates to the chat template docs (#33924 ) * Add a section on writing tool templates to the chat template docs * Small cleanups	2024-10-04 14:40:44 +01:00
Arthur	2e719e35fd	[`PR run-slow`] (#33939 ) * force latest torch * Update .github/workflows/self-pr-slow-ci.yml Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> --------- Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2024-10-04 14:46:15 +02:00
Raushan Turganbay	061c2c4c38	Ignore keys on `validate_rope` (#33753 ) * ignore keys on check rope * add tests * fix tests, so maybe better leave at logger lvl	2024-10-04 12:39:37 +02:00
Artyom Semidolin	4a173b88b5	[i18n-ru] Fixes typo in the README_ru.md (#33882 )	2024-10-04 11:21:38 +02:00
Deepak Saldanha	b6a01df6e9	[Doc]: Broken link in Kubernetes doc (#33879 ) * add relative path in .md and redirects to conf.py * add redirects to conf.py and update .md * modify links in .md	2024-10-04 11:20:56 +02:00

17063 Commits