transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Deepak Saldanha	b6a01df6e9	[Doc]: Broken link in Kubernetes doc (#33879 ) * add relative path in .md and redirects to conf.py * add redirects to conf.py and update .md * modify links in .md	2024-10-04 11:20:56 +02:00
Yoach Lacombe	124713c32b	Fix distil whisper segment computation (#33920 ) * Fix distil whisper segment computation * [run-slow] whisper	2024-10-04 11:18:01 +02:00
Hamza Tahboub	2bd4d5897d	Minor error condition bug fix (#33781 ) * Error condition bug fix * Update error message * Update src/transformers/models/qwen2_vl/modeling_qwen2_vl.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Making change in the rest of the repo * Formatting * Formatting with ruff --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2024-10-04 08:25:32 +02:00
Matthew Hoffman	550673a70c	Remove `logits.float()` (#33902 ) * Remove logits.float() if not computing loss * Remove warning about 4.46 logits dtype change if not computing loss	2024-10-04 08:21:12 +02:00
Yoni Gozlan	074aa3b3fd	Uniformize kwargs for Idefics/2 processors (#32568 ) * Add uniformize idefics processor kwargs and tests * Uniformize idefics2 processor kwargs * add image_processor tests idefics * add BC args order change idefics2 processor and update doc * Add support for multiple images per prompt in image-text-to-text mode idefics * Fix processor input args in idefics tests * improve test processing common, remove unnecessary tests, update process uniformization * fix doctrings idefics * fix tests processors idefics/2	2024-10-03 18:08:24 +02:00
Joao Gante	b0c5660e88	Config: lower `save_pretrained` exception to warning (#33906 ) * lower to warning * msg * make fixup * rm extra comma	2024-10-03 16:45:14 +01:00
Jerry Zhang	15a4d24805	Add support for `weights_only` flag when loading state_dict (#32481 ) * Add support for `weights_only` flag when loading state_dict Summary: This is to enable loading a state_dict with wrapper tensor subclasses (used in torchao for quantized weights) Test Plan: tested locally with torchao weights, also need https://github.com/huggingface/transformers/pull/32306:
```
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers import TorchAoConfig
from torchao.utils import benchmark_model
import torchao

DEVICE_TYPE = "cuda"

def init_model_and_benchmark(model_id, torch_dtype=torch.bfloat16, quantization_config=None):
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    if quantization_config is not None:
        model = AutoModelForCausalLM.from_pretrained(model_id, device_map=DEVICE_TYPE, torch_dtype=torch.bfloat16, quantization_config=quantization_config)
    else:
        # weights_only=False is the flag added by this change
        model = AutoModelForCausalLM.from_pretrained(model_id, device_map=DEVICE_TYPE, torch_dtype=torch.bfloat16, weights_only=False)

    # sanity check: run the model
    input_text = "What are we having for dinner?"
    input_ids = tokenizer(input_text, return_tensors="pt").to(DEVICE_TYPE)
    output = model.generate(**input_ids, max_new_tokens=1000)
    print(tokenizer.decode(output[0], skip_special_tokens=True))

    NUM_WARMUP = 1
    NUM_RUNS = 5

    if quantization_config is not None:
        torchao.quantization.utils.recommended_inductor_config_setter()

    model = torch.compile(model, mode="max-autotune")

    # warm-up pass before the timed runs
    benchmark_model(model.generate, NUM_WARMUP, kwargs=input_ids, device_type=DEVICE_TYPE)
    print("running benchmark")
    results = benchmark_model(model.generate, NUM_RUNS, kwargs=input_ids, device_type=DEVICE_TYPE)
    return model, results

model_id = "jerryzh168/test-model"
torchao.quantization.utils.recommended_inductor_config_setter()
bf16_model, bf16_time = init_model_and_benchmark(model_id)
print(f"bf16: {bf16_time}")
```
Reviewers: Subscribers: Tasks: Tags: format	2024-10-03 17:03:42 +02:00
Arthur	a220c5b99f	add setter for trainer processor (#33911 ) * add setter for trainer processor * Update src/transformers/trainer.py Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2024-10-03 16:34:10 +02:00
Benjamin Bossan	6500f78c86	[PEFT] Support low_cpu_mem_usage option for PEFT loading adapters (#33725 ) * [PEFT] Support low_cpu_mem_usage for PEFT loading PEFT added support for low_cpu_mem_usage=True when loading adapters in https://github.com/huggingface/peft/pull/1961. This feature is now available when installing PEFT v0.13.0. With this PR, this option is also supported when loading PEFT adapters directly into transformers models. Additionally, with this PR, https://github.com/huggingface/diffusers/pull/9510 will be unblocked, which implements this option in diffusers. * Fix typo	2024-10-03 16:15:36 +02:00
Yoach Lacombe	bf0ffe3d29	[Tests] Diverse Whisper fixes (#33665 ) * fix beam indices in token_timestamps * fix attention_mask in FA2 * correct translation example with the right example * correct how somes tests are using outputs + correct num_frames * fix shortform batch prev cond tests * make fix-copies * make fix-copies * take care of shifting beam indices * [run-slow] whisper * [run-slow] whisper	2024-10-03 15:59:01 +02:00
KanTakahiro	ab97a78130	Fix: use unidic-lite instead of ipadic as the tokenizer dictionary for Japanese (#33372 ) * Fix: use unidic-lite instead of ipadic as the tokenizer dictionary of Japanese Signed-off-by: Kan Takahiro <kan@Kans-Mac-mini.local> * fix the default name --------- Signed-off-by: Kan Takahiro <kan@Kans-Mac-mini.local> Co-authored-by: Kan Takahiro <kan@Kans-Mac-mini.local> Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>	2024-10-03 15:30:03 +02:00
Joao Gante	d29738f5b4	Generate tests: modality-agnostic input preparation (#33685 )	2024-10-03 14:01:24 +01:00
Arie Pratama Sutiono	f2bf4fcf3d	Add `SplinterTokenizer` unit test (#32652 ) * add unit tests for splinter_tokenizer * add unit test for splinter tokenizer, pass in the question_token to be saved on save_pretrained called * remove unused import * remove vocab_splinter.txt, add Copied from, use fmt:on and fmt:off to prevent autoformatting on long lines * remove all the spaces Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-10-03 14:49:56 +02:00
Ben Schneider	95a2f5f6c3	Fix module initialization for root module under Zero3 (#33632 ) * Use all state dict keys when checking if root module is initialized. * Apply style corrections * Add comment explaining change. * Change comment phrasing.	2024-10-03 14:41:50 +02:00
Guillaume LEGENDRE	4df3ccddb7	Migrate the CI runners to the new clusters (#33849 ) * try fixing push-ci * move to new runners * move benchmark.yml to new runners * move doctest_job.yml to new runners * move doctests.yml to new runners * move push-important-models.yml to new runners * move self-pr-slow-ci.yml to new runners * fix typo Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * fix working directory Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * fix working directory Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * improve code Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> --------- Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2024-10-03 14:39:49 +02:00
Joao Gante	6f0ce52760	VLM Generate: tag `test_static_cache_matches_dynamic` as flaky (#33630 ) flaky	2024-10-03 12:27:02 +01:00
Nonthachai Plodthong	f1a5f81296	Update an keyerror on _save_check_point prevent confusion of missing … (#33832 ) * Update an keyerror on _save_check_point prevent confusion of missing metric keys * Update grammar error and case sensitive. Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * adding update KeyError on _evaluate function to align with _save_checkpoint function --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2024-10-03 10:27:49 +02:00
HofitBata	dc8156fdd8	Fix dt proj bias reassigned (#33314 ) * When we set self.dt_proj.bias = None, it removes the bias parameter from the model. When we later tried to assign a tensor to self.dt_proj.bias, it caused a TypeError because PyTorch expects a Parameter object. * When we set self.dt_proj.bias = None, it removes the bias parameter from the model. When we later tried to assign a tensor to self.dt_proj.bias, it caused a TypeError because PyTorch expects a Parameter object. * When we set self.dt_proj.bias = None, it removes the bias parameter from the model. When we later tried to assign a tensor to self.dt_proj.bias, it caused a TypeError because PyTorch expects a Parameter object.	2024-10-03 09:51:03 +02:00
Yoni Gozlan	d7950bff82	uniformize processor Mllama (#33876 ) * uniformize processor Mllama * nit syntax * nit	2024-10-02 16:50:15 +02:00
Yoni Gozlan	62e8c759c3	rename all test_processing_.py to test_processor_.py (#33878 ) * rename all test_processing_.py to test_processor_.py ans fix duplicate test processor paligemma * fix copies * fix broken tests * fix-copies * fix test processor bridgetower	2024-10-02 16:43:43 +02:00
Pavel Iakubovskii	2f25ab95db	Handle Trainer `tokenizer` kwarg deprecation with decorator (#33887 ) * Handle deprecation with decorator * Fix for seq2seq Trainer	2024-10-02 15:28:20 +01:00
Yoni Gozlan	ee71c9853a	Optim deformable detr (#33600 ) * optimize deformable detr * fix copies * remove deformable_detr_basline * fix hardcoded float16 and .float() * [run slow] deformable-detr,grounding-dino,mask2former,oneformer,rt-detr * [run slow] deformable_detr,grounding_dino,mask2former,oneformer,rt_detr	2024-10-02 15:46:27 +02:00
Marc Sun	cac4a4876b	[Quantization] Switch to optimum-quanto (#31732 ) * switch to optimum-quanto rebase squach * fix import check * again * test try-except * style	2024-10-02 15:14:34 +02:00
amyeroberts	b7474f211d	Trainer - deprecate tokenizer for processing_class (#32385 ) * Trainer - deprecate tokenizer for processing_class * Extend chage across Seq2Seq trainer and docs * Add tests * Update to FutureWarning and add deprecation version	2024-10-02 14:08:46 +01:00
Omar Salman	e7c8af7f33	Add sdpa for DistilBert (#33724 ) * Add sdpa for DistilBert * [run_slow] distilbert * [run_slow] distilbert * [run_slow] distilbert * Try without slow tests * [run_slow] distilbert * [run_slow] distilbert	2024-10-02 13:55:19 +01:00
Kyle Sayers	614c79a9b0	Fix kwargs passed by AutoQuantizationConfig.from_pretrained (#33798 ) fix kwargs Co-authored-by: kylesayrs <kyle@neuralmagic.com>	2024-10-02 14:12:03 +02:00
Kyle Sayers	b09234cfc1	Allow for nightly packages of `compressed_tensors` (#33828 ) * only check spec * correct typo in nightly package name	2024-10-02 14:11:44 +02:00
g-prz	fe484726aa	Add falcon gguf (#33437 ) * feat(gguf): add falcon q2 k * fix(gguf): remove useless renaming * feat(gguf): seperate falcon 7b and 40b * feat(gguf): apply fixup * fix(test): error rebase * feat(gguf): add fp16 weight comparison for falcon * feat(gguf): test weight of all layers * test(gguf): add falcon 40b under skip decorator * feat(gguf): quick example for extracting model size	2024-10-02 14:10:39 +02:00
George	181c962aab	populate quantization_config for kv-cache-scheme only configs (#33874 )	2024-10-02 14:06:40 +02:00
Yih-Dar	e5d14f39ad	Don't run reminder bot for now (#33883 ) update Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-10-02 11:51:01 +02:00
Pablo Montalvo	50290cf7a0	Uniformize model processors (#31368 ) * add initial design for uniform processors + align model * add uniform processors for altclip + chinese_clip * add uniform processors for blip + blip2 * fix mutable default 👀 * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * rebase * update processor to generic kwargs + test * fix style * add sensible kwargs merge * update test * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update common processor testing * add altclip * add chinese_clip * add pad_size * [run-slow]align, clip, chinese_clip, altclip * remove duplicated tests * fix * add blip, blip2, bridgetower Added tests for bridgetower which override common. Also modified common tests to force center cropping if existing * fix * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. 
* Raise if kwargs are specified in two places * fix * removed copied from * match defaults * force padding * fix tokenizer test * clean defaults * move tests to common * add missing import * fix * adapt bridgetower tests to shortest edge * uniformize donut processor + tests * add wav2vec2 * extend common testing to audio processors * add testing + bert version * propagate common kwargs to different modalities * BC order of arguments * check py version * revert kwargs merging * add draft overlap test * update * fix blip2 and wav2vec due to updates * fix copies * ensure overlapping kwargs do not disappear * replace .pop by .get to handle duplicated kwargs * fix copies * fix missing import * add clearly wav2vec2_bert to uniformized models * fix copies * increase number of features * fix style * [run-slow] blip, blip2, bridgetower, donut, wav2vec2, wav2vec2_bert * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * fix concatenation * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * Update tests/test_processing_common.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * 🧹 * address comments * clean up + tests * [run-slow] instructblip, blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-10-02 10:41:08 +02:00
TrickEye	2292be6c1b	Fix: typo (#33880 ) Update llm_tutorial.md: typo	2024-10-02 09:12:21 +01:00
Yoni Gozlan	61ac161a9d	Add support for custom inputs and batched inputs in ProcessorTesterMixin (#33711 ) * add support for custom inputs and batched inputs in ProcessorTesterMixin * Fix batch_size behavior ProcessorTesterMixin * Change format prepare inputs batched * Remove override test pixtral processor * Remove unnecessary tests and cleanup after new prepare_inputs functions * Fix instructBlipVideo image processor	2024-10-01 23:52:03 +02:00
amyeroberts	1baa08897d	Repo consistency fix after #33339 (#33873 ) * Repo consistency fix after #33339 * [run-slow] omdet_turbo	2024-10-01 21:03:15 +01:00
Prakarsh Kaushik	68a2b50069	[Fix] ViViT interpolate_pos_encoding (#33815 ) * fix:test_inference_interpolate_pos_encoding * style:make style;make fixup * test: add suggestion to test_modeling_vivit * chore:add suggestions * style:make style * [run_slow] vivit * ci:slow test fix * [run_slow] vivit	2024-10-01 20:14:35 +01:00
g-prz	8635802af9	Move weight initilization deformabledetr (#33339 ) * fix(copy): fixup copy * fix(deformable_detr): move weight initialization to the right place * fix(grounding_dino): move weight initialization to the right place * fix(rt_detr): move weight initialization to the right place * [run-slow] deformable_detr, grounding_dino, rt_detr	2024-10-01 20:08:57 +01:00
Matt	a43e84cb3b	Make ASR pipeline compliant with Hub spec + add tests (#33769 ) * Remove max_new_tokens arg * Add ASR pipeline to testing * make fixup * Factor the output test out into a util * Full error reporting * Full error reporting * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Lysandre Debut <hi@lysand.re> * Small comment --------- Co-authored-by: Lysandre Debut <hi@lysand.re>	2024-10-01 18:15:04 +01:00
Nicola De Angeli	0256520794	fix: repair depth estimation multiprocessing (#33759 ) * fix: repair depth estimation multiprocessing * test: add test for multiprocess depth estimation	2024-10-01 17:59:59 +01:00
Yih-Dar	f205da9660	Avoid using context that is not accessable from external contributors (#33866 ) * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-10-01 17:42:45 +02:00
Manal ML	0c4c2d7e07	Add include_loss_for_metrics (#33088 ) * Add include_loss_for_metrics * Fix styling * Initialize inputs and losses to avoid AttributeError * Ruff styling * Refactor compute_metrics and update EvalPrediction * Change Naming * Added include_for_metrics to group both args * Fix style * Change warnings to logger Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2024-10-01 16:51:41 +02:00
jackyjinjing	5f9f58fc59	Validate the eval dataset in advance. (#33743 ) * Validate the eval dataset in advance. * format * format * format * Update src/transformers/trainer.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * format --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2024-10-01 16:45:06 +02:00
Kyle Sayers	f8110a6ddf	Raise `accelerate` dependency error in case of defaulting `low_cpu_mem_usage=True` (#33830 ) Clarify warning, add import check	2024-10-01 16:44:38 +02:00
aroun-coumar	326b2bad1c	This PR contains additional changes for #33143 (#33581 ) * fix: Fix optimizer bug in ModelCard * fix: fix W293 * Fixes in modelcard.py for issue #33143 --------- Co-authored-by: moontidef <53668275+relic-yuexi@users.noreply.github.com>	2024-10-01 16:42:30 +02:00
Raushan Turganbay	b1c914e463	Fix device mismatch errors (#33851 ) fix device mismatch errors	2024-10-01 15:55:57 +02:00
Matt	ac28a23b3d	Workaround for bark issue in pipelines (#33824 ) * Quick workaround for bark + generation_config issue * make fixup * [run slow] bark	2024-10-01 14:40:12 +01:00
Francesco Ortu	acdfdd9387	add attention weight up-cast to float32 in chameleon (#33822 ) add attention weight float32 cast in chameleon	2024-10-01 15:19:16 +02:00
Fabian David Schmidt	351873a145	fix: skip dropout in eval for flash_attn in various models (#33844 ) * fix(m2m_100): skip dropout in eval for flash_attn * fix(misc): skip dropout in eval for flash attn various models * chore(m2m_100): copy flash attn from bart * chore: run make fix-copies * [run-slow] bart, m2m_100	2024-10-01 14:39:21 +02:00
Kenza Bouzid	88d960937c	Refactor image features selection in LlaVa (#33696 ) * refactor image features selection * break line * remove whitespace * add pr comments: include projection and rename function * make fix-copies * fix get_image_feature in vip llava	2024-10-01 14:37:31 +02:00
Joao Gante	22266be970	Generate: move llama `prepare_inputs_for_generation` to `GenerationMixin` (#33677 )	2024-10-01 12:32:54 +01:00
Yih-Dar	d19ab15421	post reminder comment only once (#33848 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-10-01 12:52:53 +02:00

17014 Commits