transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
zzaebok	02ed609285	Replace tokenizer to processing_class in Seq2SeqTrainer (#35452 )	2025-01-07 09:51:12 +00:00
Dmitry Rogozhkin	9fd123ac31	ci: mark model_parallel tests as cuda specific (#35269 ) `parallelize()` API is deprecated in favor of accelerate's `device_map="auto"` and therefore is not accepting new features. At the same time `parallelize()` implementation is currently CUDA-specific. This commit marks respective ci tests with `@require_torch_gpu`. Fixes: #35252 Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>	2025-01-07 10:16:34 +01:00
pglorio	bd442c6d3a	Zamba new attention standard (#35375 ) * updated zamba to new attention standard * make fixup fixes	2025-01-07 10:08:45 +01:00
NielsRogge	12ba96aa3c	[Dinov2 with Registers] Some fixes (#35411 ) * First draft * Thanks claude * Remove print statement * Use torch_int * Address comments * Address comment	2025-01-06 21:10:59 +01:00
Sarthak Karandikar	ca00950057	added logic for deleting adapters once loaded (#34650 ) * added logic for deleting adapters once loaded * updated to the latest version of transformers, merged utility function into the source * updated with missing check * added peft version check * Apply suggestions from code review Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com> * changes according to reviewer * added test for deleting adapter(s) * styling changes * styling changes in test * removed redundant code * formatted my contributions with ruff * optimized error handling * ruff formatted with correct config * resolved formatting issues --------- Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>	2025-01-06 18:36:40 +00:00
Mukund Sudarshan	1650e0e514	Fixed typo in Llama configuration docstring (#35520 ) Update configuration_llama.py There is no `num_heads` parameter, only `num_attention_heads`	2025-01-06 09:54:08 -08:00
Woojun Jung	3b1be043cd	🌐 [i18n-KO] Remove duplicates in toctree (#35496 ) fix(docs): remove duplicates in toctree	2025-01-06 09:14:22 -08:00
Isotr0py	3951da1a6b	[GGUF] Refactor and decouple gguf checkpoint loading logic (#34385 ) * draft load_gguf refactor * update Signed-off-by: Isotr0py <2037008807@qq.com> * remove llama mapping Signed-off-by: Isotr0py <2037008807@qq.com> * remove qwen2 mapping Signed-off-by: Isotr0py <2037008807@qq.com> * remove unused function Signed-off-by: Isotr0py <2037008807@qq.com> * deprecate stablelm mapping Signed-off-by: Isotr0py <2037008807@qq.com> * deprecate phi3 mapping Signed-off-by: Isotr0py <2037008807@qq.com> * deprecate t5 mapping Signed-off-by: Isotr0py <2037008807@qq.com> * deprecate bloom mapping Signed-off-by: Isotr0py <2037008807@qq.com> * fix bloom Signed-off-by: Isotr0py <2037008807@qq.com> * deprecate starcoder2 mapping Signed-off-by: Isotr0py <2037008807@qq.com> * deprecate gpt2 mapping Signed-off-by: Isotr0py <2037008807@qq.com> * deprecate mistral mapping Signed-off-by: Isotr0py <2037008807@qq.com> * deprecate nemotron mapping Signed-off-by: Isotr0py <2037008807@qq.com> * deprecate mamba mapping Signed-off-by: Isotr0py <2037008807@qq.com> * deprecate mamba mapping Signed-off-by: Isotr0py <2037008807@qq.com> * code format Signed-off-by: Isotr0py <2037008807@qq.com> * code format Signed-off-by: Isotr0py <2037008807@qq.com> * fix mamba Signed-off-by: Isotr0py <2037008807@qq.com> * fix qwen2moe Signed-off-by: Isotr0py <2037008807@qq.com> * remove qwen2moe mapping Signed-off-by: Isotr0py <2037008807@qq.com> * clean up Signed-off-by: Isotr0py <2037008807@qq.com> * remove falcon 7b map Signed-off-by: Isotr0py <2037008807@qq.com> * remove all ggml tensors mapping Signed-off-by: Isotr0py <2037008807@qq.com> * add comments Signed-off-by: Isotr0py <2037008807@qq.com> * update messages Signed-off-by: Isotr0py <2037008807@qq.com> * fix tensors in parsed parameters Signed-off-by: Isotr0py <2037008807@qq.com> * add gguf check Signed-off-by: Isotr0py <2037008807@qq.com> --------- Signed-off-by: Isotr0py <2037008807@qq.com>	2025-01-06 18:02:38 +01:00
dependabot[bot]	86fa3cedad	Bump jinja2 from 3.1.4 to 3.1.5 in /examples/research_projects/decision_transformer (#35408 ) Bump jinja2 in /examples/research_projects/decision_transformer Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.4 to 3.1.5. - [Release notes](https://github.com/pallets/jinja/releases) - [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst) - [Commits](https://github.com/pallets/jinja/compare/3.1.4...3.1.5) --- updated-dependencies: - dependency-name: jinja2 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-01-06 16:58:29 +00:00
Jacky Lee	44a26c871c	Update llm_optims docs for `sdpa_kernel` (#35481 ) update: use sdpa_kernel	2025-01-06 08:54:31 -08:00
Chulhwa (Evan) Han	18e896bd8f	🌐 [i18n-KO] Translated `altclip.md` to Korean (#34594 ) * docs: ko: model_doc/timesformer.md * feat: nmt draft * Apply suggestions from code review Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com> Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com> Co-authored-by: timdalxx <48753785+jeongiin@users.noreply.github.com> * Update docs/source/ko/model_doc/altclip.md * add snippet --------- Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com> Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com> Co-authored-by: timdalxx <48753785+jeongiin@users.noreply.github.com>	2025-01-06 08:45:26 -08:00
Zach Mueller	a821b9c7ab	Add check for if num_items_in_batch is not None (#35102 )	2025-01-06 10:11:21 -05:00
Yih-Dar	203e978826	Add `position_ids` in `XLMRobertaXLForCausalLM.prepare_inputs_for_generation` (#35044 ) * fix * fix * cleanup * style --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-01-06 16:10:21 +01:00
Antoine Dussolle	c451a72cd7	Add French translation of task_summary and tasks_explained (#33407 ) * Add French translation of task_summary and tasks_explained --------- Co-authored-by: Aymeric Roucher <69208727+aymeric-roucher@users.noreply.github.com>	2025-01-06 14:23:52 +01:00
Raushan Turganbay	9895f7df81	Idefics: fix docstring (#35079 ) nit: fix docstring	2025-01-06 10:58:04 +01:00
Isotr0py	32aa2db04a	Fix Llava conversion for models that use safetensors to store weights (#35406 ) * fix llava-med-v1.5-mistral-7b conversion Signed-off-by: Isotr0py <2037008807@qq.com> * add weights_only=True Signed-off-by: Isotr0py <2037008807@qq.com> --------- Signed-off-by: Isotr0py <2037008807@qq.com>	2025-01-06 09:59:38 +01:00
Lysandre Debut	b2f2977533	Applies the rest of the init refactor except to modular files (#35238 ) * [test_all] Applies the rest of the init refactor except to modular files * Revert modular that doesn't work * [test_all] TFGPT2Tokenizer	2025-01-05 18:30:08 +01:00
Yijun Lee	e5fd865eba	Add Gemma2 GGUF support (#34002 ) * initial setup for ggml.py * initial setup of GGUFGemma2Converter class * Add gemma2 model to gguf.md doc * Partial work on GGUF_TENSOR_MAPPING * initial setup of GGUF_TENSOR_MAPPING for Gemma2 * refactor: rename GemmaConvert class to GemmaConverter for naming consistency * feat: complete gemma2 tensor mapping implementation * feat: add initial implementation of GGUFGemmaConverter * feat: complete GGUFGemmaConverter implementation * feat: add test code for gemma2 * refactor: minor code cleanup * refactor: minor code cleanup * fix: resolve suggestions * Update tests/quantization/ggml/test_ggml.py Co-authored-by: Isotr0py <2037008807@qq.com> --------- Co-authored-by: Isotr0py <2037008807@qq.com>	2025-01-03 14:50:07 +01:00
湛露先生	1fe2d53d4e	Reuse "if not" logic in image_processing. (#35405 )	2025-01-03 14:44:57 +01:00
Jacky Lee	30a9971632	Use `sdpa_kernel` in tests (#35472 ) * update: use sdpa_kernel * update: rerun test	2025-01-03 14:39:52 +01:00
Blanchon	cba49cb2a6	Change `is_soundfile_availble` to `is_soundfile_available` (#35030 )	2025-01-03 14:37:42 +01:00
hoshi-hiyouga	42865860ec	Fix paligemma warning message (#35486 ) fix log input	2025-01-02 11:36:53 +01:00
湛露先生	b2b04e86e7	Fix docs typos. (#35465 ) Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>	2025-01-02 11:29:46 +01:00
Matthew Douglas	6b1e86fd4d	Fix new BNB test failures (#35345 )	2025-01-02 11:24:52 +01:00
Tom Aarsen	5b516b06c8	Reintroduce Python 3.9 support for ModernBERT (#35458 ) Co-authored-by: Koichi Yasuoka <yasuoka@kanji.zinbun.kyoto-u.ac.jp>	2025-01-02 11:23:07 +01:00
Jacky Lee	919220dab1	Update translated docs for `sdpa_kernel` (#35461 ) * docs: update sdpa_kernel for translation * fix: nn.attention * update: infer many	2024-12-31 08:37:58 -08:00
Ahmed Almaghz	eb2b452432	[i18n-ar] Translated file: `docs/source/ar/tasks/summarization.md` into Arabic (#35195 ) * Add Arabic translation: summarization.md * Update docs/source/ar/tasks/summarization.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/summarization.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/summarization.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/summarization.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/summarization.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/summarization.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/summarization.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/summarization.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/summarization.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/summarization.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/summarization.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/summarization.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/summarization.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/summarization.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/summarization.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update _toctree.yml --------- Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>	2024-12-31 08:35:54 -08:00
Ahmed Almaghz	d5aebc6465	[i18n-ar] Translated file: `docs/source/ar/tasks/question_answering.md` into Arabic (#35196 ) * Add Arabic translation: question_answering.md * Update question_answering.md * Update docs/source/ar/tasks/question_answering.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/question_answering.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/question_answering.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/question_answering.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/question_answering.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/question_answering.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/question_answering.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/question_answering.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/question_answering.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/question_answering.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/question_answering.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/question_answering.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/question_answering.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/question_answering.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/question_answering.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update _toctree.yml --------- Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>	2024-12-30 11:56:05 -08:00
Jacky Lee	b5f97977ed	Update docs for `sdpa_kernel` (#35410 ) update: sdp_kernel -> sdpa_kernel	2024-12-30 09:50:34 -08:00
Cheng-Han Chiang	5cabc75b4b	Add compute_loss_func to Seq2SeqTrainer (#35136 )	2024-12-29 15:01:35 +01:00
Martin	90f256c90c	Update perf_infer_gpu_one.md: fix a typo (#35441 )	2024-12-29 14:57:08 +01:00
Pavel Iakubovskii	5c75087aee	Fix `model_accepts_loss_kwargs` for timm model (#35257 ) * Fix for timm model * Add comment	2024-12-27 16:33:44 +00:00
Kyle Safran	3b0a94ef9e	Fix f-string to show `ACCELERATE_MIN_VERSION` on error (#35189 ) fix f-string to show ACCELERATE_MIN_VERSION on error Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2024-12-27 13:21:44 +01:00
Thien Tran	f63da20a9f	CLIP conversion script - Change fairseq to OpenAI (#35384 ) Change fairseq to OpenAI	2024-12-27 13:12:32 +01:00
宁宇	7f97d01675	Fix: Rename keyword argument in_channels to num_channels (#35289 ) Fix: Rename keyword argument in_channels to num_channels in some default backbone configs	2024-12-27 13:07:31 +01:00
Quentin Gallouédec	4eb17b26e7	Drop inplace operation for loss computation with gradient accumulation (#35416 ) Fix inplace loss computation	2024-12-26 14:58:53 +01:00
Anton Vlasjuk	24c91f095f	[`GPTQ`, `CompressedTensors`] Fix unsafe imports and metadata check (#34815 ) * fix gptq creation when optimum is not installed + fix metadata checking * fix compressed tensors as well * style * pray for ci luck on flaky tests :prayge: * trigger ci --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>	2024-12-24 19:32:44 +01:00
NielsRogge	6e0515e99c	Add DINOv2 with registers (#35348 ) * added changes from 32905 * fixed mistakes caused by select all paste * rename diff_dinov2... * ran tests * Fix modular * Fix tests * Use new init * Simplify drop path * Convert all checkpoints * Add figure and summary * Update paths * Update docs * Update docs * Update toctree * Update docs --------- Co-authored-by: BernardZach <bernardzach00@gmail.com> Co-authored-by: Zach Bernard <132859071+BernardZach@users.noreply.github.com>	2024-12-24 13:21:59 +01:00
jiqing-feng	d8c1db2f56	enable non-cuda awq model support without modify version (#35334 ) Signed-off-by: jiqing-feng <jiqing.feng@intel.com>	2024-12-24 12:36:00 +01:00
Yih-Dar	ccc4a5a59b	Disable `.github/workflows/self-comment-ci.yml` for now (#35366 ) * disable * disable --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-12-24 10:53:57 +01:00
Yoni Gozlan	93aafdc620	Add compile test for fast image processor (#35184 ) * add compile test for fast image processor * override pixtral test	2024-12-23 13:12:45 -05:00
Mohamed Mekkouri	82fcac0a7e	Adding logger.info about update_torch_dtype in some quantizers (#35046 ) adding logger.info	2024-12-23 17:01:00 +01:00
Miquel Farré	a1780b7ba5	bugfix Idefics3 processor - handle gracefully cases with text and no images (#35363 ) * bugfix processing empty images * fix * fix * Update src/transformers/models/idefics3/processing_idefics3.py Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com> * adding tests * fix * fix * fix --------- Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>	2024-12-23 16:59:01 +01:00
Andrei Panferov	64c05eecd6	HIGGS Quantization Support (#34997 ) * higgs init * working with crunches * per-model workspaces * style * style 2 * tests and style * higgs tests passing * protecting torch import * removed torch.Tensor type annotations * torch.nn.Module inheritance fix maybe * hide inputs inside quantizer calls * style structure something * Update src/transformers/quantizers/quantizer_higgs.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * reworked num_sms * Update src/transformers/integrations/higgs.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * revamped device checks * docstring upd * Update src/transformers/quantizers/quantizer_higgs.py Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com> * edited tests and device map assertions * minor edits * updated flute cuda version in docker * Added p=1 and 2,3bit HIGGS * flute version check update * incorporated `modules_to_not_convert` * less hardcoding * Fixed comment * Added docs * Fixed gemma support * example in docs * fixed torch_dtype for HIGGS * Update docs/source/en/quantization/higgs.md Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Collection link * dequantize interface * newer flute version, torch.compile support * unittest message fix * docs update compile * isort * ValueError instead of assert --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>	2024-12-23 16:54:49 +01:00
Huazhong Ji	ef1f54a0a7	add bnb support for Ascend NPU (#31512 ) * add bnb support for Ascend NPU * delete comment	2024-12-23 16:36:16 +01:00
Mohamed Mekkouri	59178780a6	Fix : VPTQ test (#35394 ) fix_test	2024-12-23 16:27:46 +01:00
Alvaro Bartolome	3a4ced9ab4	Fix typing in docstring for `PaliGemmaProcessor` (#35278 ) Updated typing for `tokenizer` in the `PaliGemmaProcessor` to be `GemmaTokenizerFast` instead of `LlamaTokenizerFast`	2024-12-23 16:22:04 +01:00
Quentin Gallouédec	3cd3cd50ac	Scale loss before backward (#35207 )	2024-12-23 16:16:38 +01:00
Mohamed Mekkouri	f5264a86ee	Deprecate _is_quantized_training_enabled (#34991 ) deprecate Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2024-12-23 15:51:31 +01:00
Tibor Reiss	e10be82b71	uniformize kwargs for SAM (#34578 ) * Make kwargs uniform for SAM * Remove unused attribute * Make point_pad_value part of image_kwargs * Update annotations * Code review - use existing methods * Use ProcessorTesterMixin * Do not add ProcessorTesterMixin everywhere	2024-12-23 13:54:57 +01:00

17891 Commits