transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-15 02:28:24 +06:00

Matthijs Hollemans ac2bc50a10 TTS fine-tuning for SpeechT5 (#21824)	2023-04-18 10:12:30 +01:00

* wrong argument name
* append eos_token_id
* all tokenizers need mask and ctc_blank tokens
* remove reduction factor from feature extractor
* add proper TTS loss
* did shifting the wrong way around
* mask out padded portions
* remove logits again (don't really need it)
* fix unit tests
* fixup
* pad also returns the decoder attention mask, since that's useful to have
* clean up feature extractor logic
* pad can handle TTS task too
* remove stop_labels from loss calculation
* simplify logic
* fixup
* do -100 masking properly
* small STFT optimization (calculate mel filterbanks only once)
* replace torchaudio fbanks with audio_utils
* remove torchaudio dependency
* simplify & speed up the STFT
* don't serialize window and mel filters
* output cross attentions when generating speech
* add guided attention loss
* fix failing test
* Update src/transformers/models/speecht5/feature_extraction_speecht5.py
  Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Update src/transformers/models/speecht5/modeling_speecht5.py
  Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* change type annotation of attention_mask to LongTensor
* extract loss into class
* remove unused frame_signal_scale argument
* use config object in loss class
* fix type annotations in doc comments
* change optional to just bool
* implement missing tokenizer method
* add deprecation warning
* Update src/transformers/models/speecht5/feature_extraction_speecht5.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/speecht5/feature_extraction_speecht5.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* add deprecation warning for stop_labels

---------

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

__init__.py	[WIP] add SpeechT5 model (#18922)	2023-02-03 12:43:46 -05:00
test_feature_extraction_speecht5.py	TTS fine-tuning for SpeechT5 (#21824)	2023-04-18 10:12:30 +01:00
test_modeling_speecht5.py	TTS fine-tuning for SpeechT5 (#21824)	2023-04-18 10:12:30 +01:00
test_processor_speecht5.py	TTS fine-tuning for SpeechT5 (#21824)	2023-04-18 10:12:30 +01:00
test_tokenization_speecht5.py	TTS fine-tuning for SpeechT5 (#21824)	2023-04-18 10:12:30 +01:00