Commit Graph

127 Commits

Author SHA1 Message Date
Joao Gante
a6b51e7341
[Whisper + beam search] fix usage of beam_indices (#38259)
Some checks are pending
Self-hosted runner (benchmark) / Benchmark (aws-g5-4xlarge-cache) (push) Waiting to run
Build documentation / build (push) Waiting to run
New model PR merged notification / Notify new model (push) Waiting to run
Slow tests on important models (on Push - A10) / Get all modified files (push) Waiting to run
Slow tests on important models (on Push - A10) / Slow & FA2 tests (push) Blocked by required conditions
Self-hosted runner (push-caller) / Check if setup was changed (push) Waiting to run
Self-hosted runner (push-caller) / build-docker-containers (push) Blocked by required conditions
Self-hosted runner (push-caller) / Trigger Push CI (push) Blocked by required conditions
Secret Leaks / trufflehog (push) Waiting to run
Update Transformers metadata / build_and_package (push) Waiting to run
* tmp

* fix test_tiny_token_timestamp_batch_generation

* better comments

* test

* comments

* Apply suggestions from code review

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

---------

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
2025-05-23 10:05:44 +00:00
Joao Gante
e4decee9c0
[whisper] small changes for faster tests (#38236) 2025-05-21 14:11:08 +01:00
Yao Matrix
3bd1c20149
enable misc cases on XPU & use device agnostic APIs for cases in tests (#38192)
* use device agnostic APIs in tests

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* more

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* fix style

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* add reset_peak_memory_stats API

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* update

---------

Signed-off-by: Matrix Yao <matrix.yao@intel.com>
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-20 10:09:01 +02:00
Raushan Turganbay
955e61b0da
Remove head mask in generative models (#35786)
* just squash into one commit

* delete print
2025-05-15 10:44:19 +02:00
Fanli Lin
c7c2f08994
make test_speculative_decoding_non_distil device-agnostic (#38010)
* make device-agnostic

* use condition

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-05-08 19:19:47 +00:00
co63oc
d5fa7d2d19
Fix typos in strings and comments (#37799) 2025-04-28 11:39:11 +01:00
Joao Gante
85665a4263
[tests] Stricter generate + compilation test -- no recompilations allowed (#37629)
* tmp commit

* stricter compilation test

* trigger tests

* rm todo
2025-04-22 11:12:18 +01:00
cyyever
1e6b546ea6
Use Python 3.9 syntax in tests (#37343)
Signed-off-by: cyy <cyyever@outlook.com>
2025-04-08 14:12:08 +02:00
Cyril Vallez
f304318f5f
Remove low_cpu_mem_usage and _fast_init (#36963)
* Remove low_cpu_mem_usage and _fast_init

* Update deepspeed.py

* Update modeling_utils.py

* remove the first 2 tests everywhere

* Update test_modeling_common.py

* remove what was remaining about fast_init

* fix logic and simplify

* mismatched keys logic update

* Update modeling_utils.py

* Update modeling_utils.py

* Update modeling_utils.py

* Update modeling_utils.py

* fix 2 models init_weights

* extend to others

* remove grad

* Update modeling_fsmt.py

* init weights in tests

* style

* Update test_modeling_fsmt.py

* more old models

* fix more init_weights

* copies

* fix

* style

* Update modeling_lxmert.py

* fix inits

* more and more

* more

* should finalize

* style

* Update modeling_dinov2_with_registers.py

* fix

* Update modeling_encoder_decoder.py

* fix

* style

* Update modeling_lxmert.py

* post rebase cleanup

* Update modeling_informer.py

* back to start for device

* fix

* add test to detect all failing cases correctly

* Update test_modeling_common.py

* fix

* fix

* sam

* style

* Update modeling_maskformer_swin.py

* CIs

* CIs

* remove test - will add it on separate PR

* fix

* fix

* Update modeling_sam.py

* CIs

* CIs

* CIs

* convnext

* suggestions

* CIs

* fix copies after merge

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-03-31 17:18:43 +02:00
co63oc
996f512d52
Fix typos in tests (#36547)
Signed-off-by: co63oc <co63oc@users.noreply.github.com>
2025-03-05 15:04:06 -08:00
Joao Gante
678885bbbd
[CI] Check test if the GenerationTesterMixin inheritance is correct 🐛 🔫 (#36180) 2025-02-21 10:18:20 +00:00
Joao Gante
99adc74462
[tests] remove flax-pt equivalence and cross tests (#36283) 2025-02-19 15:13:27 +00:00
Joao Gante
0863eef248
[tests] remove pt_tf equivalence tests (#36253) 2025-02-19 11:55:11 +00:00
Joao Gante
62c7ea0201
CI: avoid human error, automatically infer generative models (#33212)
* tmp commit

* move tests to the right class

* remove ALL all_generative_model_classes = ...

* skip tf roberta

* skip InstructBlipForConditionalGenerationDecoderOnlyTest

* videollava

* reduce diff

* reduce diff

* remove  on vlms

* fix a few more

* manual rebase bits

* more manual rebase

* remove all manual generative model class test entries

* fix up to ernie

* a few more removals

* handle remaining cases

* recurrent gemma

* it's better here

* make fixup

* tf idefics is broken

* tf bert + generate is broken

* don't touch tf :()

* don't touch tf :(

* make fixup

* better comments for test skips

* revert tf changes

* remove empty line removal

* one more

* missing one
2025-02-13 16:27:11 +01:00
Joao Gante
be2ac0916a
[generate] shape checks in tests compatible with fixed-length caches (+ some minor fixes) (#35993)
* shape checks compatible with static cache

* add test

* tmp

* manually turn on eager attn when we want to output attn

* typo

* generalize to encoder-decoder models

* force compilation on cpu

* tmp commit

* fix static cache shape checks

* models with odd caches

* fix copies

* shorter cache search loop

* use decoder_past_key_values everywhere

* better test variable names and comments

* signature

* rename _check_outputs into _check_generate_outputs

* add comments

* HybridCache future test note
2025-02-10 17:50:54 +00:00
Raushan Turganbay
365fecb4d0
Whisper: fix static cache CI (#35852)
* fix

* remove overriden method

* small change
2025-01-30 12:43:00 +01:00
Joao Gante
ece8c42488
Test: generate with torch.compile(model.forward) as a fast test (#34544) 2025-01-28 14:10:38 +00:00
Arthur
b912f5ee43
use torch.testing.assertclose instead to get more details about error in cis (#35659)
* use torch.testing.assertclose instead to get more details about error in cis

* fix

* style

* test_all

* revert for I bert

* fixes and updates

* more image processing fixes

* more image processors

* fix mamba and co

* style

* less strick

* ok I won't be strict

* skip and be done

* up
2025-01-24 16:55:28 +01:00
eustlb
da334bcfa8
[Whisper] 🚨 Fix whisper decoding 🚨 (#34135)
* do not remove decoder_input_ids for the first segment

* do not remove eos token in generate_with_fallback

* when removing padding tokens, do not remove eos token

* remove eos token in generate (and not in generate_with_fallback!)

* reconciliate short-from/ long-form behavior

* correct avg_logprobs calculation

* handle eos token in segments

* handle decoder_input_ids and eos token in _prepare_decoder_input_ids

* fix incorrect time precision

* always remove eos token

* always remove decoder_input_ids

* no need to handle decoder_inputs_ids and eos token

* no need to remove decoder_input_ids

* no need to handle eos token

* fix num_beams in _retrieve_logit_processors

* remove todo unconsistency

* no need to add eos token

* last_timestamp_pos should indeed be timestamp token pos

* patch generate to enable compatibility with GenerationTesterMixin tests

* adapt test_generate_continue_from_past_key_values

* adapt test_prompt_lookup_decoding_matches_greedy_search

* adapt generic GenerationMixin tests to whisper's generate

* fix speculative decoding

* fix

* [run-slow] whisper

* change HF_HUB_TOKEN for require_read_token

* [run-slow] whisper

* prioritize kwargs over generation_config

* remove unnecessary args

* [run-slow] whisper

* update tests

* [run-slow] whisper

* add comment

* update test

* [run-slow] whisper

* update test + revert require_read_token

* docstring updates

* revert tokenizer decode args change

* do not use a patch + docstring updates

* [run-slow] whisper

* make

* [run-slow] whisper

* add a flag to force unique call to generate

* test update

* [run-slow] whisper

* add force_unique_generate_call arg

* do not use a patch

* correct the timestamps for the pad tokens

* docstring update

* docstring update

* docstring update

* upodate TF tests

* add require_read_token

* [run-slow] whisper

* test reset dynamo

* [run-slow] whisper

* fix

* [run-slow] whisper

* avoid iterating twice on current_segments

* [run-slow] whisper

* [run-slow] whisper

---------

Co-authored-by: Eustache Le Bihan <eustlb@users.noreply.huggingface.co>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-18 14:13:21 +01:00
eustlb
54aae121eb
[Whisper] Fix whisper tokenizer (#34537)
* handle single timestamp ending

* include last timestamp token

* handle single timestamp ending

* avoid floating points arithm limitations

* ensure float64 operations

* new test

* make fixup

* make copies

* handle edge case double tokens ending with different tokens

* handle single timestamp ending

* make fixup

* handle conditioning on prev segments

* fix

* Update src/transformers/models/whisper/generation_whisper.py

Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>

* [run-slow] whisper

* don't call item() to avoid unnecessary sync

* fix

---------

Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
Co-authored-by: Eustache Le Bihan <eustlb@users.noreply.huggingface.co>
2024-12-05 13:46:29 +01:00
Raushan Turganbay
5e8c1d713d
Offloaded cache: fix generate (#34921)
* fix cache impl

* require_torch_gpu

* fix mamba

* fix copies
2024-11-28 15:05:56 +01:00
eustlb
4d1d0f29a4
[Whisper] Fix whisper integration tests (#34111)
* fix test_tiny_timestamp_generation

* fix test_large_timestamp_generation

* fix test_whisper_shortform_single_batch_prev_cond

* fix test_whisper_shortform_multi_batch_hard_prev_cond

* return_timestamps necessary with long form

* fix test_default_multilingual_transcription_long_form

* fix test_tiny_token_timestamp_generation_longform

* fix test_whisper_longform_multi_batch_hard

* Update tests/models/whisper/test_modeling_whisper.py

Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>

* fix typo

* do not expect special tokens

* fix test_whisper_longform_single_batch_beam

* fix test_whisper_longform_multi_batch_hard_prev_cond

* update test_whisper_longform_multi_batch_hard_prev_cond

* update test_whisper_longform_multi_batch_hard_prev_cond

* these tests does not make sense anymore

* this test does not make sense anymore

* make fixup

* suggested nits

* add test with forced_decoder_ids

* this test does not make sense anymore

* change assert for unittest test cases

* make fixup

* test with prompt_ids and task and language

* fix unittest test case call

* fix test_tiny_generation

* fix test_tiny_en_generation

* fix test_tiny_en_batched_generation

* fix test_tiny_longform_timestamps_generation

* fix test_tiny_timestamp_generation

* fix test_large_generation

* fix test_large_batched_generation

* fix test_large_generation_multilingual

* fix test_large_timestamp_generation

* fix test_large_timestamp_generation

* fix test_tiny_token_timestamp_generation_longform

* fix test_tiny_en_batched_generation

* make fixup

* [run-slow] whisper

---------

Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
2024-11-26 12:23:08 +01:00
Joao Gante
8a734ea2c3
Tests: move generate tests to the right mixin and delete redundant tests (#34464)
* tmp commit

* tmp commit

* cull overwrites of deleted tests

* typo

* more specific docstring

* make fixup

* parameterize at the top?

* correction

* more deletions :D

* tmp commit

* for VLMs too

* fix _check_outputs

* test nit

* make fixup

* fix another flaky

* test_generate_from_inputs_embeds -- handle missing attention mask
2024-10-30 10:59:08 +00:00
Pavel Iakubovskii
48461c0fe2
Make pipeline able to load processor (#32514)
* Refactor get_test_pipeline

* Fixup

* Fixing tests

* Add processor loading in tests

* Restructure processors loading

* Add processor to the pipeline

* Move model loading on tom of the test

* Update `get_test_pipeline`

* Fixup

* Add class-based flags for loading processors

* Change `is_pipeline_test_to_skip` signature

* Skip t5 failing test for slow tokenizer

* Fixup

* Fix copies for T5

* Fix typo

* Add try/except for tokenizer loading (kosmos-2 case)

* Fixup

* Llama not fails for long generation

* Revert processor pass in text-generation test

* Fix docs

* Switch back to json file for image processors and feature extractors

* Add processor type check

* Remove except for tokenizers

* Fix docstring

* Fix empty lists for tests

* Fixup

* Fix load check

* Ensure we have non-empty test cases

* Update src/transformers/pipelines/__init__.py

Co-authored-by: Lysandre Debut <hi@lysand.re>

* Update src/transformers/pipelines/base.py

Co-authored-by: Lysandre Debut <hi@lysand.re>

* Rework comment

* Better docs, add note about pipeline components

* Change warning to error raise

* Fixup

* Refine pipeline docs

---------

Co-authored-by: Lysandre Debut <hi@lysand.re>
2024-10-09 16:46:11 +01:00
Yoach Lacombe
124713c32b
Fix distil whisper segment computation (#33920)
* Fix distil whisper segment computation

* [run-slow] whisper
2024-10-04 11:18:01 +02:00
Yoach Lacombe
bf0ffe3d29
[Tests] Diverse Whisper fixes (#33665)
* fix beam indices in token_timestamps

* fix attention_mask in FA2

* correct translation example with the right example

* correct how somes tests are using outputs + correct num_frames

* fix shortform batch prev cond tests

* make fix-copies

* make fix-copies

* take care of shifting beam indices

* [run-slow] whisper

* [run-slow] whisper
2024-10-03 15:59:01 +02:00
Joao Gante
d29738f5b4
Generate tests: modality-agnostic input preparation (#33685) 2024-10-03 14:01:24 +01:00
Joao Gante
a7734238ff
Generation tests: update imagegpt input name, remove unused functions (#33663) 2024-09-24 16:40:48 +01:00
Joao Gante
2fdb5e74cc
VLM generate: tests can't generate image/video tokens (#33623) 2024-09-20 15:43:27 +01:00
Fanli Lin
8bd1f2f338
[tests] make more tests device-agnostic (#33580)
* enable

* fix

* add xpu skip

* add marker

* skip for xpu

* add more

* enable on accelerator

* add more cases

* add more tests

* add more
2024-09-20 10:16:43 +01:00
Raushan Turganbay
d7975a5874
VLMs: enable generation tests (#33533)
* add tests

* fix whisper

* update

* nit

* add qwen2-vl

* more updates!

* better this way

* fix this one

* fix more tests

* fix final tests, hope so

* fix led

* Update tests/generation/test_utils.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* pr comments

* not pass pixels and extra for low-mem tests, very flaky because of visio tower

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2024-09-19 12:04:24 +02:00
Nikita Krasnytskyi
c29a8694b0
Fix missing sequences_scores in the Whisper beam search output (#32970)
* added sequences_scores to the output

* added beam_indices to output

* added test to check for beam_indices, sequences_scores and their shape

* removed redundant whitespaces

* make fixup
2024-09-17 19:36:11 +01:00
Yoach Lacombe
98adf24883
[Whisper test] Fix some failing tests (#33450)
* Fix failing tensor placement in Whisper

* fix long form generation tests

* more return_timestamps=True

* make fixup

* [run_slow] whisper

* [run_slow] whisper
2024-09-16 19:05:17 +02:00
benniekiss
5c6257d1fc
[whisper] Clarify error message when setting max_new_tokens (#33324)
* clarify error message when setting max_new_tokens

* sync error message in test_generate_with_prompt_ids_max_length

* there is no self
2024-09-12 18:48:36 +02:00
Amir Mohammad Fakhimi
3314fe1760
Add validation for maximum sequence length in modeling_whisper.py (#33196)
* Add validation for maximum sequence length in modeling_whisper.py

Added a validation check to ensure that the sequence length of labels does not exceed the maximum allowed length of 448 tokens. If the sequence length exceeds this limit, a ValueError is raised with a descriptive error message.

This change prevents the model from encountering errors or unexpected behavior due to excessively long sequences during training or fine-tuning, ensuring consistent input dimensions and improving overall robustness.

* Change exception message in src/transformers/models/whisper/modeling_whisper.py

The exception message is for whisper's label's sequence max length.

Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>

* Change 448 to config.max_target_positions in src/transformers/models/whisper/modeling_whisper.py

It's for whisper's config.max_target_positions.

Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>

* Change method's documentation in src/transformers/models/whisper/modeling_whisper.py

* Add test for maximum label's sequence length in test_modeling_whisper.py

* Add self to modeling_whisper.py

* Update test_modeling_whisper.py with respect to automatic validations

* Update modeling_whisper.py with respect to ci/circleci: check_code_quality

* Update test_modeling_whisper.py with respect to ci/circleci: check_code_quality

* Update test_modeling_whisper.py with respect to ci/circleci: tests_generate

* Update test_modeling_whisper.py with respect to ci/circleci: tests_generate

* Update test_modeling_whisper.py with respect to ci/circleci: check_code_quality

* Separate test_labels_sequence_max_length tests in test_modeling_whisper.py

* Update test_modeling_whisper.py with respect to ci/circleci: check_code_quality

* Remove assert from test_modeling_whisper.py

* Add max_target_positions to WhisperModelTester in test_modeling_whisper.py

* Update test_modeling_whisper.py with respect to ci/circleci: check_code_quality

* Update test_modeling_whisper.py with respect to ci/circleci: tests_generate

* Update test_modeling_whisper.py

* Change test_labels_sequence_max_length_error_after_changing_config in test_modeling_whisper.py

* Change self.config.max_target_positions to self.max_target_positions modeling_whisper.py

* Add new tests in test_modeling_whisper.py

* Update test_modeling_whisper.py

---------

Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
2024-09-06 14:09:49 +02:00
Sanchit Gandhi
51d15eb1c1
[whisper] alternative fix for long-form timestamps (#32131)
* [whisper] alternative fix for long-form timestamps

* update test
2024-09-06 12:57:08 +02:00
Joao Gante
970a16ec7f
Forbid PretrainedConfig from saving generate parameters; Update deprecations in generate-related code 🧹 (#32659)
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-08-23 11:12:53 +01:00
Joao Gante
70d5df6107
Generate: unify LogitsWarper and LogitsProcessor (#32626) 2024-08-16 11:20:41 +01:00
Raushan Turganbay
8f2b6d5e3d
Fix: FA2 with packed training (#32487)
* fix check

* add tests

* [run-slow] llama, gemma2

* oops, whisper actually runs but needed some special treatment
2024-08-12 13:40:07 +05:00
Pablo Montalvo
044281605f
Fix generate with inputs_embeds as input (#32493)
* I think inputs_embeds has ndim == 3

* fix sequence length catch

* add generate test

* [run-slow]olmo, persimmon, gemma, gemma2, qwen2, llama

* skip whisper

* fix bart test

* more fixes
2024-08-08 18:44:53 +02:00
Sanchit Gandhi
e234061cdd
[whisper] compile compatibility with long-form decoding (#31772)
* [whisper] compile compatibility with long-form decoding

* clarify comment

* fix after rebase

* finalise

* fix bsz

* fix cache split

* remove contiguous

* style

* finish

* update doc

* prevent cuda graph trace
2024-08-01 18:10:56 +08:00
Joao Gante
7ffe25f2b9
Generate: end-to-end compilation (#30788)
* mvp

* added test (a few models need fixes)

* fix a few test cases

* test nits

* harder test 😈

* revert changes in stablelm

* test with improved condition

* add todo

* tmp commit

* merged with main

* nits

* add todo

* final corrections

* add docs for generation compilation

* docs nits

* add  tip

* PR suggestions

* add more details to the compilation docs

* fix cache positions

* cache is now init in generate; update docs

* tag test as flaky

* docs

* post rebase make fixup and other nits

* remove unintended changes

* whisper (encoder-decoder) not supported

* move token default updates to ; add tests for token defaults

* push changes

* manual rebase

* chameleon doesn't support this

* fix test_static_cache_mha_mqa_gqa (broken in another PR)

* docs: dynamic is better with end-to-end compilation
2024-07-29 10:52:13 +01:00
Sanchit Gandhi
5658e749ad
[whisper] fix short-form output type (#32178)
* [whisper] fix short-form output type

* add test

* make style

* update long-form tests

* fixes

* last fix

* finalise test
2024-07-25 16:58:02 +08:00
Sanchit Gandhi
3263b34354
Revert "Incorrect Whisper long-form decoding timestamps " (#32148)
Revert "Incorrect Whisper long-form decoding timestamps  (#32003)"

This reverts commit cd48553fc8.
2024-07-23 18:34:30 +08:00
Sanchit Gandhi
f83c6f1d02
Remove trust_remote_code when loading Libri Dummy (#31748)
* [whisper integration] use parquet dataset for testing

* propagate to others

* more propagation

* last one
2024-07-23 14:54:38 +08:00
Kamil Akesbi
89575b567e
Support generating with fallback for short form audio in Whisper (#30984)
* remove is_shortform

* adapt _retrieve_max_frames_and_seek for short_form

* return bos token in short and long form

* add decoder_input_ids to short form audios

* add eos token for  short form

* handle short form token_timestamps

* no need to return scores

* add is_shortform conditions

* handle when max_new_tokens is None - short form

* handle assistant decoding

* fix

* handle return_dict_in_generate

* handle split_by_batch for encoder_attentions attribute

* handle num_beams>1

* handle num_return_sequences>1 in generate_with_fallback

* handle num_return_sequences>1 with return_dict_in_generate=True

* raise error if max_new_tokens + decoder_inputs_ids > max_target_pos

* fix

* apply review suggestions

* fix

* Update src/transformers/models/whisper/generation_whisper.py

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* Update src/transformers/models/whisper/generation_whisper.py

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* Update src/transformers/models/whisper/generation_whisper.py

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* fix

* logits for both short form and long form

* handle if logits_processor is None

* test

* apply review changes to num_return_sequences

* add _expand_variables_for_generation

* remove short form commented section

* update comments

* uncomment num_beams line in generate_with_fallback

* update assistant decoding

* handle return_segment with short form generation

* up

* fix output format is_shortform

* overwrite beam_sample test

* update _set_return_timestamps

* apply review suggestions

* apply review suggestions

* remove seek_outputs_short_form

* fix _stack_split_outputs

* fix stack dim in _stack_split_outputs

* update tests

* fix past_key_values + beam tests

* fix

* clean _expand_variables_for_generation

* make style

* fix slow tests

* make style

* max_length condition

* make style

* add slow tests for shortform fallback

* Update src/transformers/models/whisper/generation_whisper.py

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* Update src/transformers/models/whisper/generation_whisper.py

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* apply review changes

* Update src/transformers/models/whisper/generation_whisper.py

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* up

* fix slow tests

* apply review suggestions

* update test

* make style

* small fix

* fix

* fix test_new_cache_format

* fix past_key_values

* fix

* make style

* fix slow tests

* fix

---------

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
2024-07-19 13:42:22 +01:00
Kamil Akesbi
cd48553fc8
Incorrect Whisper long-form decoding timestamps (#32003)
* fix lo form timestamps in decode_batch

* Update src/transformers/models/whisper/tokenization_whisper.py

Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>

* Update src/transformers/models/whisper/tokenization_whisper.py

Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>

* add test

* make style

* fix copies

* Update src/transformers/models/whisper/tokenization_whisper_fast.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/whisper/tokenization_whisper.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/whisper/processing_whisper.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/whisper/tokenization_whisper.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* apply review suggestions

* fix

* fix copies

* fix

* Update src/transformers/models/whisper/tokenization_whisper_fast.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* fix-copies

---------

Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-07-19 09:26:38 +01:00
Pavel Iakubovskii
691586b0dc
Fix tests skip (#32012)
* [run-slow] clip

* [run-slow] clip

* Fix skip -> skipTest

* [run-slow] clip
2024-07-17 08:37:43 +01:00
Joao Gante
e4682de635
Masking: remove flakiness from test (#31939) 2024-07-15 18:49:37 +01:00
Sanchit Gandhi
a9701953ff
[whisper] static kv cache (#31166)
* make work with cache abstraction

* correct for static cache

* hacks for compile

* make fast

* fix

* fix pos ids

* generate

* fix sdpa

* fix sdpa cache pos

* fix fa2

* clean fa2

* integrate cache into generate

* make style

* copies

* more copies

* update eager

* update sdpa

* update fa2

* simplify

* use cache pos

* always compute cross-cache for debug

* avoid recompiles
Co-authored-by: Arthur Zucker <arthur@huggingface.co>

* fix fix

* fix fix fix

* more fix

* try encoder-decoder cache (too messy)

* revert encoder-decoder cache

* check cross-attn cache

* use enc-dec dataclass

* use richer enc-dec dataclass

* clean-up

* revert static cache changes

* small fixes

* revert to cpu flag

* fix copies

* add static slow test

* past k/v docstring

* more docstrings

* cache_position docstrings

* add to docs

* add enc-dec cache to docs

* make style

* fix after rebase

* fix beam

* style

* fix generation strategies

* fix most decoder-only tests

* style

* skip test

* more clean up

* small docstrings

* Apply suggestions from code review

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* add todo

* only crop self-attn

* check cache in mixin

* style

* fix re-compile after rebase

* move `is_updated` logic to enc-dec wrapper

* revert back

* revert cache back

* finalise design

* fix

* fix fix

* style

* Update src/transformers/cache_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* deprecate

* updates

* final updates

* style

* style

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-07-02 13:24:15 +01:00