transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Matthijs Hollemans e4bacf6614 [WIP] add SpeechT5 model (#18922)	2023-02-03 12:43:46 -05:00

* make SpeechT5 model by copying Wav2Vec2
* add paper to docs
* whoops added docs in wrong file
* remove SpeechT5Tokenizer + put CTC back in the name
* remove deprecated class
* remove unused docstring
* delete SpeechT5FeatureExtractor, use Wav2Vec2FeatureExtractor instead
* remove classes we don't need right now
* initial stab at speech encoder prenet
* add more speech encoder prenet stuff
* improve SpeechEncoderPrenet
* add encoder (not finished yet)
* add relative position bias to self-attention
* add encoder CTC layers
* fix formatting
* add decoder from BART, doesn't work yet
* make it work with generate loop
* wrap the encoder into a speech encoder class
* wrap the decoder in a text decoder class
* changed my mind
* changed my mind again ;-)
* load decoder weights, make it work
* add weights for text decoder postnet
* add SpeechT5ForCTC model that uses only the encoder
* clean up EncoderLayer and DecoderLayer
* implement _init_weights in SpeechT5PreTrainedModel
* cleanup config + Encoder and Decoder
* add head + cross attention masks
* improve doc comments
* fixup
* more cleanup
* more fixup
* TextDecoderPrenet works now, thanks Kendall
* add CTC loss
* add placeholders for other pre/postnets
* add type annotation
* fix freeze_feature_encoder
* set padding tokens to 0 in decoder attention mask
* encoder attention mask downsampling
* remove features_pen calculation
* disable the padding tokens thing again
* fixup
* more fixup
* code review fixes
* rename encoder/decoder wrapper classes
* allow checkpoints to be loaded into SpeechT5Model
* put encoder into wrapper for CTC model
* clean up conversion script
* add encoder for TTS model
* add speech decoder prenet
* add speech decoder post-net
* attempt to reconstruct the generation loop
* add speech generation loop
* clean up generate_speech
* small tweaks
* fix forward pass
* enable always dropout on speech decoder prenet
* sort declaration
* rename models
* fixup
* fix copies
* more fixup
* make consistency checker happy
* add Seq2SeqSpectrogramOutput class
* doc comments
* quick note about loss and labels
* add HiFi-GAN implementation (from Speech2Speech PR)
* rename file
* add vocoder to TTS model
* improve vocoder
* working on tokenizer
* more better tokenizer
* add CTC tokenizer
* fix decode and batch_code in CTC tokenizer
* fix processor
* two processors and feature extractors
* use SpeechT5WaveformFeatureExtractor instead of Wav2Vec2
* cleanup
* more cleanup
* even more fixup
* notebooks
* fix log-mel spectrograms
* support reduction factor
* fixup
* shift spectrograms to right to create decoder inputs
* return correct labels
* add labels for stop token prediction
* fix doc comments
* fixup
* remove SpeechT5ForPreTraining
* more fixup
* update copyright headers
* add usage examples
* add SpeechT5ProcessorForCTC
* fixup
* push unofficial checkpoints to hub
* initial version of tokenizer unit tests
* add slow test
* fix failing tests
* tests for CTC tokenizer
* finish CTC tokenizer tests
* processor tests
* initial test for feature extractors
* tests for spectrogram feature extractor
* fixup
* more fixup
* add decorators
* require speech for tests
* modeling tests
* more tests for ASR model
* fix imports
* add fake tests for the other models
* fixup
* remove jupyter notebooks
* add missing SpeechT5Model tests
* add missing tests for SpeechT5ForCTC
* add missing tests for SpeechT5ForTextToSpeech
* sort tests by name
* fix Hi-Fi GAN tests
* fixup
* add speech-to-speech model
* refactor duplicate speech generation code
* add processor for SpeechToSpeech model
* add usage example
* add tests for speech-to-speech model
* fixup
* enable gradient checkpointing for SpeechT5FeatureEncoder
* code review
* push_to_hub now takes repo_id
* improve doc comments for HiFi-GAN config
* add missing test
* add integration tests
* make number of layers in speech decoder prenet configurable
* rename variable
* rename variables
* add auto classes for TTS and S2S
* REMOVE CTC!!!
* S2S processor does not support save/load_pretrained
* fixup
* these models are now in an auto mapping
* fix doc links
* rename HiFiGAN to HifiGan, remove separate config file
* REMOVE auto classes
* there can be only one
* fixup
* replace assert
* reformat
* feature extractor can process input and target at same time
* update checkpoint names
* fix commit hash

test_module	AutoImageProcessor (#20111)	2022-11-08 19:54:41 +00:00
tf_ops	Check TF ops for ONNX compliance (#10025)	2021-02-15 07:55:10 -05:00
check_config_docstrings.py	Create dummy models (#19901)	2022-10-28 13:05:41 +02:00
check_copies.py	update template (#20885)	2023-01-04 10:15:45 +01:00
check_doc_toc.py	Split model list on modality (#18328)	2022-08-01 11:10:20 -05:00
check_doctest_list.py	check paths in `utils/documentation_tests.txt` (#21315)	2023-01-26 15:33:47 +01:00
check_dummies.py	Add some tests for check_dummies (#19146)	2022-09-21 14:54:09 -04:00
check_inits.py	Add ESMFold (#19977)	2022-10-31 21:32:58 -04:00
check_repo.py	[WIP] add SpeechT5 model (#18922)	2023-02-03 12:43:46 -05:00
check_self_hosted_runner.py	Add offline runners info in the Slack report (#19169)	2022-09-23 19:23:05 +02:00
check_table.py	Fix some typos. (#17560)	2022-07-11 05:00:13 -04:00
check_task_guides.py	Automated compatible models list for task guides (#21338)	2023-01-27 13:19:28 -05:00
check_tf_ops.py	Check TF ops for ONNX compliance (#10025)	2021-02-15 07:55:10 -05:00
create_dummy_models.py	Pipeline testing - using tiny models on Hub (#20426)	2023-01-30 10:39:43 +01:00
custom_init_isort.py	Fix init import_structure sorting (#20477)	2022-11-29 09:46:10 -05:00
documentation_tests.txt	[WIP] add SpeechT5 model (#18922)	2023-02-03 12:43:46 -05:00
download_glue_data.py	Raise exceptions instead of asserts (#13907)	2021-10-07 12:44:23 +05:30
extract_warnings.py	Update some GH action versions (#20537)	2022-12-06 16:54:40 +01:00
get_ci_error_statistics.py	Update Past CI report script (#19228)	2022-09-29 19:22:23 +02:00
get_github_job_time.py	add a script to get time info. from GA workflow jobs (#18822)	2022-09-01 12:02:52 +02:00
get_modified_files.py	Updates the default branch from master to main (#16326)	2022-03-23 03:46:59 -04:00
notification_service_doc_tests.py	fix missing block when there is no failure (#18775)	2022-08-29 09:10:13 +02:00
notification_service.py	extract warnings in GH workflows (#20487)	2022-11-29 15:58:54 +01:00
past_ci_versions.py	Fix past CI (#20967)	2023-01-12 18:04:21 +01:00
prepare_for_doc_test.py	Add a check regarding the number of occurrences of ``` (#18389)	2022-08-01 14:23:02 +02:00
print_env.py	Print more library versions in CI (#17384)	2022-06-02 10:24:16 +02:00
release.py	Clean README in post release job as well. (#17519)	2022-06-02 07:44:03 -04:00
sort_auto_mappings.py	Automatically sort auto mappings (#17250)	2022-05-16 13:24:20 -04:00
tests_fetcher.py	Add TF image classification example script (#19956)	2023-02-01 19:09:36 +00:00
update_metadata.py	Adapt repository creation to latest hf_hub (#21158)	2023-01-18 11:14:00 -05:00