transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00


Nima Yaqmuri 07ae53e6e7 Fix/speecht5 bug (#28481)	2024-01-16 14:14:28 +00:00

* Fix bug in SpeechT5 speech decoder prenet's forward method
  - Removed redundant `repeat` operation on speaker_embeddings in the forward method. This line was erroneously duplicating the embeddings, leading to incorrect input size for concatenation and performance issues.
  - Maintained original functionality of the method, ensuring the integrity of the speech decoder prenet's forward pass remains intact.
  - This change resolves a critical bug affecting the model's performance in handling speaker embeddings.
* Refactor SpeechT5 text to speech integration tests
  - Updated SpeechT5ForTextToSpeechIntegrationTests to accommodate the variability in sequence lengths due to dropout in the speech decoder pre-net. This change ensures that our tests are robust against random variations in generated speech, enhancing the reliability of our test suite.
  - Removed hardcoded dimensions in test assertions. Replaced with dynamic checks based on model configuration and seed settings, ensuring tests remain valid across different runs and configurations.
  - Added new test cases to thoroughly validate the shapes of generated spectrograms and waveforms. These tests leverage seed settings to ensure consistent and predictable behavior in testing, addressing potential issues in speech generation and vocoder processing.
  - Fixed existing test cases where incorrect assumptions about output shapes led to potential errors.
* Enhance handling of speaker embeddings in SpeechT5
  - Refined the generate and generate_speech functions in the SpeechT5 class to robustly handle two scenarios for speaker embeddings: matching the batch size (one embedding per sample) and one-to-many (a single embedding for all samples in the batch).
  - The update includes logic to repeat the speaker embedding when a single embedding is provided for multiple samples, and a ValueError is raised for any mismatched dimensions.
  - Also added corresponding test cases to validate both scenarios, ensuring complete coverage and functionality for diverse speaker embedding situations.
* Improve Test Robustness with Randomized Speaker Embeddings
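The speaker-embedding rule described in the commit message (one embedding per sample, or a single embedding repeated for the whole batch, otherwise a ValueError) can be sketched roughly as follows. This is an illustrative sketch only, not the code from the commit: the function name `match_speaker_embeddings` is hypothetical, and plain Python lists stand in for the tensors the real model uses.

```python
from typing import List, Sequence


def match_speaker_embeddings(
    speaker_embeddings: Sequence[Sequence[float]], batch_size: int
) -> List[List[float]]:
    """Return one speaker embedding per sample in the batch.

    Hypothetical helper for illustration; not part of the transformers API.
    """
    num_embeddings = len(speaker_embeddings)

    # Case 1: one embedding per sample, use them as given.
    if num_embeddings == batch_size:
        return [list(embedding) for embedding in speaker_embeddings]

    # Case 2: a single embedding shared by every sample, so repeat it.
    if num_embeddings == 1:
        return [list(speaker_embeddings[0]) for _ in range(batch_size)]

    # Anything else is a mismatch between embeddings and batch size.
    raise ValueError(
        f"Expected 1 or {batch_size} speaker embeddings, got {num_embeddings}."
    )
```

Passing a single embedding with `batch_size=3` yields three copies of it, while passing two embeddings with `batch_size=3` raises `ValueError`.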
benchmark	[Test refactor 1/5] Per-folder tests reorganization (#15725)	2022-02-23 15:46:28 -05:00
bettertransformer	Fixed malapropism error (#26660)	2023-10-09 11:04:57 +02:00
deepspeed	Fix initialization for missing parameters in `from_pretrained` under ZeRO-3 (#28245)	2024-01-09 14:58:21 +00:00
extended	Device agnostic trainer testing (#27131)	2023-10-30 18:16:40 +00:00
fixtures	[WIP] add SpeechT5 model (#18922)	2023-02-03 12:43:46 -05:00
fsdp	fix resuming from ckpt when using FSDP with FULL_STATE_DICT (#27891)	2023-12-16 19:41:43 +05:30
generation	Generate: consolidate output classes (#28494)	2024-01-15 17:04:08 +00:00
models	Fix/speecht5 bug (#28481)	2024-01-16 14:14:28 +00:00
optimization	Make schedulers picklable by making lr_lambda fns global (#21768)	2023-03-02 12:08:43 -05:00
peft_integration	[`Peft`] `modules_to_save` support for peft integration (#27466)	2023-11-14 10:32:57 +01:00
pipelines	Tokenizer kwargs in textgeneration pipe (#28362)	2024-01-15 16:52:18 +01:00
quantization	[GPTQ] Fix test (#28018)	2024-01-15 11:22:54 -05:00
repo_utils	Allow `# Ignore copy` (#27328)	2023-12-07 10:00:08 +01:00
sagemaker	Broken links fixed related to datasets docs (#27569)	2023-11-17 13:44:09 -08:00
tokenization	[`Styling`] stylify using ruff (#27144)	2023-11-16 17:43:19 +01:00
tools	Add support for for loops in python interpreter (#24429)	2023-06-26 09:58:14 -04:00
trainer	Support `DeepSpeed` when using auto find batch size (#28088)	2024-01-10 06:03:13 -05:00
utils	improve dev setup comments and hints (#28495)	2024-01-15 18:36:40 +00:00
__init__.py	GPU text generation: Moved the encoded_prompt to correct device	2020-01-06 15:11:12 +01:00
test_backbone_common.py	Align backbone stage selection with out_indices & out_features (#27606)	2023-12-20 18:33:17 +00:00
test_cache_utils.py	Generate: SinkCache can handle iterative prompts (#27907)	2023-12-08 20:02:20 +00:00
test_configuration_common.py	[`PretrainedConfig`] Improve messaging (#27438)	2023-11-15 14:10:39 +01:00
test_configuration_utils.py	F.scaled_dot_product_attention support (#26572)	2023-12-09 05:38:14 +09:00
test_feature_extraction_common.py	Split common test from core tests (#24284)	2023-06-15 07:30:24 -04:00
test_feature_extraction_utils.py	Remove-auth-token (#27060)	2023-11-13 14:20:54 +01:00
test_image_processing_common.py	Fix a couple of typos and add an illustrative test (#26941)	2023-12-11 15:51:51 +00:00
test_image_processing_utils.py	Remove-auth-token (#27060)	2023-11-13 14:20:54 +01:00
test_image_transforms.py	Normalize floating point cast (#27249)	2023-11-10 15:35:27 +00:00
test_modeling_common.py	Use mmap option to load_state_dict (#28331)	2024-01-10 09:57:30 +01:00
test_modeling_flax_common.py	Split common test from core tests (#24284)	2023-06-15 07:30:24 -04:00
test_modeling_flax_utils.py	Default to msgpack for safetensors (#27460)	2023-11-13 15:17:01 +01:00
test_modeling_tf_common.py	Replace build() with build_in_name_scope() for some TF tests (#28046)	2023-12-14 17:42:25 +00:00
test_modeling_tf_utils.py	Replace build() with build_in_name_scope() for some TF tests (#28046)	2023-12-14 17:42:25 +00:00
test_modeling_utils.py	Fix mismatching loading in from_pretrained with/without accelerate (#28414)	2024-01-16 14:29:51 +01:00
test_pipeline_mixin.py	Shorten the conversation tests for speed + fixing position overflows (#26960)	2023-10-31 14:20:04 +00:00
test_sequence_feature_extraction_common.py	Fix typo (#25966)	2023-09-05 10:12:25 +02:00
test_tokenization_common.py	[`Styling`] stylify using ruff (#27144)	2023-11-16 17:43:19 +01:00
test_tokenization_utils.py	Remove-auth-token (#27060)	2023-11-13 14:20:54 +01:00