transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Omar Salman	e7c8af7f33	Add sdpa for DistilBert (#33724 ) * Add sdpa for DistilBert * [run_slow] distilbert * [run_slow] distilbert * [run_slow] distilbert * Try without slow tests * [run_slow] distilbert * [run_slow] distilbert	2024-10-02 13:55:19 +01:00
Kyle Sayers	614c79a9b0	Fix kwargs passed by AutoQuantizationConfig.from_pretrained (#33798 ) fix kwargs Co-authored-by: kylesayrs <kyle@neuralmagic.com>	2024-10-02 14:12:03 +02:00
Kyle Sayers	b09234cfc1	Allow for nightly packages of `compressed_tensors` (#33828 ) * only check spec * correct typo in nightly package name	2024-10-02 14:11:44 +02:00
g-prz	fe484726aa	Add falcon gguf (#33437 ) * feat(gguf): add falcon q2 k * fix(gguf): remove useless renaming * feat(gguf): seperate falcon 7b and 40b * feat(gguf): apply fixup * fix(test): error rebase * feat(gguf): add fp16 weight comparison for falcon * feat(gguf): test weight of all layers * test(gguf): add falcon 40b under skip decorator * feat(gguf): quick example for extracting model size	2024-10-02 14:10:39 +02:00
George	181c962aab	populate quantization_config for kv-cache-scheme only configs (#33874 )	2024-10-02 14:06:40 +02:00
Yih-Dar	e5d14f39ad	Don't run reminder bot for now (#33883 ) update Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-10-02 11:51:01 +02:00
Pablo Montalvo	50290cf7a0	Uniformize model processors (#31368 ) * add initial design for uniform processors + align model * add uniform processors for altclip + chinese_clip * add uniform processors for blip + blip2 * fix mutable default 👀 * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * rebase * update processor to generic kwargs + test * fix style * add sensible kwargs merge * update test * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update common processor testing * add altclip * add chinese_clip * add pad_size * [run-slow]align, clip, chinese_clip, altclip * remove duplicated tests * fix * add blip, blip2, bridgetower Added tests for bridgetower which override common. Also modified common tests to force center cropping if existing * fix * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. 
* Raise if kwargs are specified in two places * fix * removed copied from * match defaults * force padding * fix tokenizer test * clean defaults * move tests to common * add missing import * fix * adapt bridgetower tests to shortest edge * uniformize donut processor + tests * add wav2vec2 * extend common testing to audio processors * add testing + bert version * propagate common kwargs to different modalities * BC order of arguments * check py version * revert kwargs merging * add draft overlap test * update * fix blip2 and wav2vec due to updates * fix copies * ensure overlapping kwargs do not disappear * replace .pop by .get to handle duplicated kwargs * fix copies * fix missing import * add clearly wav2vec2_bert to uniformized models * fix copies * increase number of features * fix style * [run-slow] blip, blip2, bridgetower, donut, wav2vec2, wav2vec2_bert * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * fix concatenation * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * Update tests/test_processing_common.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * 🧹 * address comments * clean up + tests * [run-slow] instructblip, blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-10-02 10:41:08 +02:00
TrickEye	2292be6c1b	Fix: typo (#33880 ) Update llm_tutorial.md: typo	2024-10-02 09:12:21 +01:00
Yoni Gozlan	61ac161a9d	Add support for custom inputs and batched inputs in ProcessorTesterMixin (#33711 ) * add support for custom inputs and batched inputs in ProcessorTesterMixin * Fix batch_size behavior ProcessorTesterMixin * Change format prepare inputs batched * Remove override test pixtral processor * Remove unnecessary tests and cleanup after new prepare_inputs functions * Fix instructBlipVideo image processor	2024-10-01 23:52:03 +02:00
amyeroberts	1baa08897d	Repo consistency fix after #33339 (#33873 ) * Repo consistency fix after #33339 * [run-slow] omdet_turbo	2024-10-01 21:03:15 +01:00
Prakarsh Kaushik	68a2b50069	[Fix] ViViT interpolate_pos_encoding (#33815 ) * fix:test_inference_interpolate_pos_encoding * style:make style;make fixup * test: add suggestion to test_modeling_vivit * chore:add suggestions * style:make style * [run_slow] vivit * ci:slow test fix * [run_slow] vivit	2024-10-01 20:14:35 +01:00
g-prz	8635802af9	Move weight initialization deformabledetr (#33339 ) * fix(copy): fixup copy * fix(deformable_detr): move weight initialization to the right place * fix(grounding_dino): move weight initialization to the right place * fix(rt_detr): move weight initialization to the right place * [run-slow] deformable_detr, grounding_dino, rt_detr	2024-10-01 20:08:57 +01:00
Matt	a43e84cb3b	Make ASR pipeline compliant with Hub spec + add tests (#33769 ) * Remove max_new_tokens arg * Add ASR pipeline to testing * make fixup * Factor the output test out into a util * Full error reporting * Full error reporting * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Lysandre Debut <hi@lysand.re> * Small comment --------- Co-authored-by: Lysandre Debut <hi@lysand.re>	2024-10-01 18:15:04 +01:00
Nicola De Angeli	0256520794	fix: repair depth estimation multiprocessing (#33759 ) * fix: repair depth estimation multiprocessing * test: add test for multiprocess depth estimation	2024-10-01 17:59:59 +01:00
Yih-Dar	f205da9660	Avoid using context that is not accessible from external contributors (#33866 ) * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-10-01 17:42:45 +02:00
Manal ML	0c4c2d7e07	Add include_loss_for_metrics (#33088 ) * Add include_loss_for_metrics * Fix styling * Initialize inputs and losses to avoid AttributeError * Ruff styling * Refactor compute_metrics and update EvalPrediction * Change Naming * Added include_for_metrics to group both args * Fix style * Change warnings to logger Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2024-10-01 16:51:41 +02:00
jackyjinjing	5f9f58fc59	Validate the eval dataset in advance. (#33743 ) * Validate the eval dataset in advance. * format * format * format * Update src/transformers/trainer.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * format --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2024-10-01 16:45:06 +02:00
Kyle Sayers	f8110a6ddf	Raise `accelerate` dependency error in case of defaulting `low_cpu_mem_usage=True` (#33830 ) Clarify warning, add import check	2024-10-01 16:44:38 +02:00
aroun-coumar	326b2bad1c	This PR contains additional changes for #33143 (#33581 ) * fix: Fix optimizer bug in ModelCard * fix: fix W293 * Fixes in modelcard.py for issue #33143 --------- Co-authored-by: moontidef <53668275+relic-yuexi@users.noreply.github.com>	2024-10-01 16:42:30 +02:00
Raushan Turganbay	b1c914e463	Fix device mismatch errors (#33851 ) fix device mismatch errors	2024-10-01 15:55:57 +02:00
Matt	ac28a23b3d	Workaround for bark issue in pipelines (#33824 ) * Quick workaround for bark + generation_config issue * make fixup * [run slow] bark	2024-10-01 14:40:12 +01:00
Francesco Ortu	acdfdd9387	add attention weight up-cast to float32 in chameleon (#33822 ) add attention weight float32 cast in chameleon	2024-10-01 15:19:16 +02:00
Fabian David Schmidt	351873a145	fix: skip dropout in eval for flash_attn in various models (#33844 ) * fix(m2m_100): skip dropout in eval for flash_attn * fix(misc): skip dropout in eval for flash attn various models * chore(m2m_100): copy flash attn from bart * chore: run make fix-copies * [run-slow] bart, m2m_100	2024-10-01 14:39:21 +02:00
Kenza Bouzid	88d960937c	Refactor image features selection in LlaVa (#33696 ) * refactor image features selection * break line * remove whitespace * add pr comments: include projection and rename function * make fix-copies * fix get_image_feature in vip llava	2024-10-01 14:37:31 +02:00
Joao Gante	22266be970	Generate: move llama `prepare_inputs_for_generation` to `GenerationMixin` (#33677 )	2024-10-01 12:32:54 +01:00
Yih-Dar	d19ab15421	post reminder comment only once (#33848 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-10-01 12:52:53 +02:00
Wing Lian	fbde09c8c9	fix check for hidden size in text model for deepspeed zero3 auto entries (#33829 ) * fix check for hidden size in text model for deepspeed zero3 auto entries * fix typo	2024-10-01 12:28:26 +02:00
Guang Yang	808997a634	Fix passing str dtype to static cache (#33741 ) Co-authored-by: Guang Yang <guangyang@fb.com>	2024-10-01 09:50:17 +02:00
Adibvafa Fallahpour	c269c5c74d	Fix Mamba slow path bug with dtype mismatch. (#32691 ) * Fix Mamba slow path bug with dtype mismatch. * Update test_modeling_mamba.py * Improve style. * Fix issue with cache position of dtype mismatch test. * Change test for slow path. * Revert changes. * Switch to buggy code and add test to catch it. * Fix the dtype mismatch bug and add test code to verify it. * Fix minor bug with test. * Fix incorrect dtype of model output. * Fix incorrect dtype of cache. * Fix incorrect dtype of ssm cache. * Fix incorrect dtype of conv state. * Remove assertion for ssm state. * Add assertion for conv state dtype. * Fix all issues with dtype mismatch test.	2024-10-01 09:28:40 +02:00
dependabot[bot]	570c89625b	Bump torch from 1.13.1 to 2.2.0 in /examples/research_projects/lxmert (#33821 ) Bumps [torch](https://github.com/pytorch/pytorch) from 1.13.1 to 2.2.0. - [Release notes](https://github.com/pytorch/pytorch/releases) - [Changelog](https://github.com/pytorch/pytorch/blob/main/RELEASE.md) - [Commits](https://github.com/pytorch/pytorch/compare/v1.13.1...v2.2.0) --- updated-dependencies: - dependency-name: torch dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-09-30 21:57:57 +02:00
Aryan	90dca5a71b	minor typo fix (#33784 ) fix typo	2024-09-30 21:42:22 +02:00
pogpog	b77846a6e6	Fix link in gguf.md (#33768 ) Change hyphen to underscore for URL in link to convert_hf_to_gguf.py	2024-09-30 20:17:33 +02:00
aroun-coumar	baa765f813	Fixes for issue #33763 in idefics2 model (#33766 )	2024-09-30 18:08:48 +01:00
Joshua Lochner	18c5b216f1	Fix ViT-MAE decoder interpolate (#33330 ) * Fix ViT-MAE decoder interpolate * Add unit test for `interpolate_pos_encoding` w/ custom sizes * [run_slow] vit_mae	2024-09-30 18:47:13 +02:00
Arthur	1dba608df9	[`modular`] fixes! (#33820 ) * fix converter for function definitions * small changes * no prints * style	2024-09-30 16:43:55 +02:00
Yih-Dar	1d29a75a6a	Add Slow CI reminder bot (#33506 ) * add workflow * update * fix * Update .github/workflows/slow_ci_remainder.yml Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-09-30 16:26:54 +02:00
mobicham	f5247aca01	Hqq serialization (#33141 ) * HQQ model serialization attempt * fix hqq dispatch and unexpected keys * style * remove check_old_param * revert to check HQQLinear in quantizer_hqq.py * revert to check HQQLinear in quantizer_hqq.py * update HqqConfig default params * make ci happy * make ci happy * revert to HQQLinear check in quantizer_hqq.py * check hqq_min version 0.2.0 * set axis=1 as default in quantization_config.py * validate_env with hqq>=0.2.0 version message * deprecated hqq kwargs message * make ci happy * remove run_expected_keys_check hack + bump to 0.2.1 min hqq version * fix unexpected_keys hqq update * add pre_quantized check * add update_expected_keys to base quantizerr * ci base.py fix? * ci base.py fix? * fix "quantization typo" src/transformers/utils/quantization_config.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix post merge --------- Co-authored-by: Marc Sun <marc@huggingface.co> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-09-30 14:47:18 +02:00
Quentin Gallouédec	4d5b458704	Fix typo in documentation (#33805 ) fix typo	2024-09-30 12:02:23 +02:00
Jerry Zhang	4bb49d4e00	Enable non-safetensor ser/deser for TorchAoConfig quantized model 🔴 (#33456 ) * Enable non-safetensor serialization and deserialization for TorchAoConfig quantized model Summary: After https://github.com/huggingface/huggingface_hub/pull/2440 we added non-safetensor serialization and deserialization in huggingface, with this we can now add the support in transformers Note that we don't plan to add safetensor serialization due to different goals of wrapper tensor subclass and safetensor see README for more details Test Plan: tested locally Reviewers: Subscribers: Tasks: Tags: * formatting * formatting * minor fix * formatting * address comments * comments * minor fix * update doc * refactor compressed tensor quantizer	2024-09-30 11:30:29 +02:00
Philip May	2e24ee4dfa	Fix typing in `load_balancing_loss_func` function of `modeling_mixtral.py`. (#33641 ) * fix return type * update to union * fix gate_logits typing * fix num_experts type * fix typing * run fix-copies * add doc for top_k * run fix-copies * empty commit to trigger CI	2024-09-27 18:10:07 +02:00
Matt	d3821c4aed	Make audio classification pipeline spec-compliant and add test (#33730 ) * Make audio classification pipeline spec-compliant and add test * Check that test actually running in CI * Try a different pipeline for the CI * Move the test so it gets triggered * Move it again, this time into task_tests! * make fixup * indentation fix * comment * Move everything from testing_utils to test_pipeline_mixin * Add output testing too * revert small diff with main * make fixup * Clarify comment * Update tests/pipelines/test_pipelines_audio_classification.py Co-authored-by: Lucain <lucainp@gmail.com> * Update tests/test_pipeline_mixin.py Co-authored-by: Lucain <lucainp@gmail.com> * Rename function and js_args -> hub_args * Cleanup the spec recursion * Check keys for all outputs --------- Co-authored-by: Lucain <lucainp@gmail.com>	2024-09-27 17:01:06 +01:00
Lysandre Debut	4973fc5769	Model addition timeline (#33762 ) * Model addition timeline * Link guide * Update docs/source/en/add_new_model.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/en/add_new_model.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Review comments * Add contact email --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-09-27 17:15:13 +02:00
Matt	75cd270e5e	Cleanup return_text and return_full_text options in TextGenerationPipeline (#33542 ) * Cleanup return_text and return_full_text options in TextGenerationPipeline * Cleanup return_text and return_full_text options in TextGenerationPipeline * Cleanup return_text and return_full_text options in TextGenerationPipeline * Cleanup return_text and return_full_text options in TextGenerationPipeline * Revert pipeline code, but update docs instead * Restore pipeline test	2024-09-27 15:01:31 +01:00
Ita Zaporozhets	0d09c44bd4	remove warning v2 (#33761 )	2024-09-27 14:54:28 +02:00
dependabot[bot]	4196590aa0	Bump torch from 1.13.1 to 2.2.0 in /examples/flax/vision (#33748 ) Bumps [torch](https://github.com/pytorch/pytorch) from 1.13.1 to 2.2.0. - [Release notes](https://github.com/pytorch/pytorch/releases) - [Changelog](https://github.com/pytorch/pytorch/blob/main/RELEASE.md) - [Commits](https://github.com/pytorch/pytorch/compare/v1.13.1...v2.2.0) --- updated-dependencies: - dependency-name: torch dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-09-27 13:24:11 +02:00
Vladislav Bronzov	9d200cfbee	Add gguf support for bloom (#33473 ) * add bloom arch support for gguf * apply format * small refactoring, bug fix in GGUF_TENSOR_MAPPING naming * optimize bloom GGUF_TENSOR_MAPPING * implement reverse reshaping for bloom gguf * add qkv weights test * add q_8 test for bloom	2024-09-27 12:13:40 +02:00
Raushan Turganbay	3e039d3827	Paligemma support for multi-image (#33447 ) * upadte * Update src/transformers/models/paligemma/processing_paligemma.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * update docs * better example in tests * support image tokens * read token * Update tests/models/paligemma/test_processing_paligemma.py Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> * nit: naming * Update docs/source/en/model_doc/paligemma.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * conflicts after rebasing --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>	2024-09-27 11:23:14 +02:00
John B Nelson	55b7a0404e	Make siglip examples clearer and error free (#33667 ) Update siglip.md This was already partially fixed relative to the deployed docs. But the partial fix made it inconsistent. Additionally, giving the full text ("This is a photo of...") is likely not the desired output.	2024-09-27 10:33:55 +02:00
Arthur	7f9a9ca1e0	[`MllamaImageProcessing`] Update doc (#33747 ) * update docstring * style	2024-09-27 10:27:11 +02:00
Arthur	5f4420587a	[`clean_up_tokenization_spaces`] Pl bart was failing, updating (#33735 ) `clean_up_tokenization_spaces=True` for pl bart	2024-09-27 10:26:51 +02:00

16990 Commits