transformers/utils/documentation_tests.txt
Matthijs Hollemans e4bacf6614
[WIP] add SpeechT5 model (#18922)
* make SpeechT5 model by copying Wav2Vec2

* add paper to docs

* whoops added docs in wrong file

* remove SpeechT5Tokenizer + put CTC back in the name

* remove deprecated class

* remove unused docstring

* delete SpeechT5FeatureExtractor, use Wav2Vec2FeatureExtractor instead

* remove classes we don't need right now

* initial stab at speech encoder prenet

* add more speech encoder prenet stuff

* improve SpeechEncoderPrenet

* add encoder (not finished yet)

* add relative position bias to self-attention

* add encoder CTC layers

* fix formatting

* add decoder from BART, doesn't work yet

* make it work with generate loop

* wrap the encoder into a speech encoder class

* wrap the decoder in a text decoder class

* changed my mind

* changed my mind again ;-)

* load decoder weights, make it work

* add weights for text decoder postnet

* add SpeechT5ForCTC model that uses only the encoder

* clean up EncoderLayer and DecoderLayer

* implement _init_weights in SpeechT5PreTrainedModel

* cleanup config + Encoder and Decoder

* add head + cross attention masks

* improve doc comments

* fixup

* more cleanup

* more fixup

* TextDecoderPrenet works now, thanks Kendall

* add CTC loss

* add placeholders for other pre/postnets

* add type annotation

* fix freeze_feature_encoder

* set padding tokens to 0 in decoder attention mask

* encoder attention mask downsampling

* remove features_pen calculation

* disable the padding tokens thing again

* fixup

* more fixup

* code review fixes

* rename encoder/decoder wrapper classes

* allow checkpoints to be loaded into SpeechT5Model

* put encoder into wrapper for CTC model

* clean up conversion script

* add encoder for TTS model

* add speech decoder prenet

* add speech decoder post-net

* attempt to reconstruct the generation loop

* add speech generation loop

* clean up generate_speech

* small tweaks

* fix forward pass

* enable always dropout on speech decoder prenet

* sort declaration

* rename models

* fixup

* fix copies

* more fixup

* make consistency checker happy

* add Seq2SeqSpectrogramOutput class

* doc comments

* quick note about loss and labels

* add HiFi-GAN implementation (from Speech2Speech PR)

* rename file

* add vocoder to TTS model

* improve vocoder

* working on tokenizer

* more better tokenizer

* add CTC tokenizer

* fix decode and batch_decode in CTC tokenizer

* fix processor

* two processors and feature extractors

* use SpeechT5WaveformFeatureExtractor instead of Wav2Vec2

* cleanup

* more cleanup

* even more fixup

* notebooks

* fix log-mel spectrograms

* support reduction factor

* fixup

* shift spectrograms to right to create decoder inputs

* return correct labels

* add labels for stop token prediction

* fix doc comments

* fixup

* remove SpeechT5ForPreTraining

* more fixup

* update copyright headers

* add usage examples

* add SpeechT5ProcessorForCTC

* fixup

* push unofficial checkpoints to hub

* initial version of tokenizer unit tests

* add slow test

* fix failing tests

* tests for CTC tokenizer

* finish CTC tokenizer tests

* processor tests

* initial test for feature extractors

* tests for spectrogram feature extractor

* fixup

* more fixup

* add decorators

* require speech for tests

* modeling tests

* more tests for ASR model

* fix imports

* add fake tests for the other models

* fixup

* remove jupyter notebooks

* add missing SpeechT5Model tests

* add missing tests for SpeechT5ForCTC

* add missing tests for SpeechT5ForTextToSpeech

* sort tests by name

* fix Hi-Fi GAN tests

* fixup

* add speech-to-speech model

* refactor duplicate speech generation code

* add processor for SpeechToSpeech model

* add usage example

* add tests for speech-to-speech model

* fixup

* enable gradient checkpointing for SpeechT5FeatureEncoder

* code review

* push_to_hub now takes repo_id

* improve doc comments for HiFi-GAN config

* add missing test

* add integration tests

* make number of layers in speech decoder prenet configurable

* rename variable

* rename variables

* add auto classes for TTS and S2S

* REMOVE CTC!!!

* S2S processor does not support save/load_pretrained

* fixup

* these models are now in an auto mapping

* fix doc links

* rename HiFiGAN to HifiGan, remove separate config file

* REMOVE auto classes

* there can be only one

* fixup

* replace assert

* reformat

* feature extractor can process input and target at same time

* update checkpoint names

* fix commit hash
2023-02-03 12:43:46 -05:00

docs/source/en/quicktour.mdx
docs/source/es/quicktour.mdx
docs/source/en/pipeline_tutorial.mdx
docs/source/en/autoclass_tutorial.mdx
docs/source/en/task_summary.mdx
docs/source/en/model_doc/markuplm.mdx
docs/source/en/model_doc/speech_to_text.mdx
docs/source/en/model_doc/switch_transformers.mdx
docs/source/en/model_doc/t5.mdx
docs/source/en/model_doc/t5v1.1.mdx
docs/source/en/model_doc/byt5.mdx
docs/source/en/model_doc/tapex.mdx
docs/source/en/model_doc/donut.mdx
docs/source/en/model_doc/encoder-decoder.mdx
src/transformers/generation/configuration_utils.py
src/transformers/generation/tf_utils.py
src/transformers/generation/utils.py
src/transformers/models/albert/configuration_albert.py
src/transformers/models/albert/modeling_albert.py
src/transformers/models/albert/modeling_tf_albert.py
src/transformers/models/audio_spectrogram_transformer/modeling_audio_spectrogram_transformer.py
src/transformers/models/bart/configuration_bart.py
src/transformers/models/bart/modeling_bart.py
src/transformers/models/beit/configuration_beit.py
src/transformers/models/beit/modeling_beit.py
src/transformers/models/bert/configuration_bert.py
src/transformers/models/bert/modeling_bert.py
src/transformers/models/bert/modeling_tf_bert.py
src/transformers/models/bert_generation/configuration_bert_generation.py
src/transformers/models/bigbird_pegasus/configuration_bigbird_pegasus.py
src/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py
src/transformers/models/big_bird/configuration_big_bird.py
src/transformers/models/big_bird/modeling_big_bird.py
src/transformers/models/blenderbot/configuration_blenderbot.py
src/transformers/models/blenderbot/modeling_blenderbot.py
src/transformers/models/blenderbot_small/configuration_blenderbot_small.py
src/transformers/models/blenderbot_small/modeling_blenderbot_small.py
src/transformers/models/blip/modeling_blip.py
src/transformers/models/bloom/configuration_bloom.py
src/transformers/models/camembert/configuration_camembert.py
src/transformers/models/canine/configuration_canine.py
src/transformers/models/canine/modeling_canine.py
src/transformers/models/clip/configuration_clip.py
src/transformers/models/clipseg/modeling_clipseg.py
src/transformers/models/codegen/configuration_codegen.py
src/transformers/models/conditional_detr/configuration_conditional_detr.py
src/transformers/models/conditional_detr/modeling_conditional_detr.py
src/transformers/models/convbert/configuration_convbert.py
src/transformers/models/convnext/configuration_convnext.py
src/transformers/models/convnext/modeling_convnext.py
src/transformers/models/ctrl/configuration_ctrl.py
src/transformers/models/ctrl/modeling_ctrl.py
src/transformers/models/cvt/configuration_cvt.py
src/transformers/models/cvt/modeling_cvt.py
src/transformers/models/data2vec/configuration_data2vec_audio.py
src/transformers/models/data2vec/configuration_data2vec_text.py
src/transformers/models/data2vec/configuration_data2vec_vision.py
src/transformers/models/data2vec/modeling_data2vec_audio.py
src/transformers/models/data2vec/modeling_data2vec_vision.py
src/transformers/models/deberta/configuration_deberta.py
src/transformers/models/deberta/modeling_deberta.py
src/transformers/models/deberta_v2/configuration_deberta_v2.py
src/transformers/models/deberta_v2/modeling_deberta_v2.py
src/transformers/models/decision_transformer/configuration_decision_transformer.py
src/transformers/models/deformable_detr/configuration_deformable_detr.py
src/transformers/models/deformable_detr/modeling_deformable_detr.py
src/transformers/models/deit/configuration_deit.py
src/transformers/models/deit/modeling_deit.py
src/transformers/models/deit/modeling_tf_deit.py
src/transformers/models/deta/configuration_deta.py
src/transformers/models/deta/modeling_deta.py
src/transformers/models/detr/configuration_detr.py
src/transformers/models/detr/modeling_detr.py
src/transformers/models/dinat/configuration_dinat.py
src/transformers/models/dinat/modeling_dinat.py
src/transformers/models/distilbert/configuration_distilbert.py
src/transformers/models/dpr/configuration_dpr.py
src/transformers/models/dpt/modeling_dpt.py
src/transformers/models/electra/configuration_electra.py
src/transformers/models/electra/modeling_electra.py
src/transformers/models/electra/modeling_tf_electra.py
src/transformers/models/ernie/configuration_ernie.py
src/transformers/models/flava/configuration_flava.py
src/transformers/models/fnet/configuration_fnet.py
src/transformers/models/fsmt/configuration_fsmt.py
src/transformers/models/git/modeling_git.py
src/transformers/models/glpn/modeling_glpn.py
src/transformers/models/gpt2/configuration_gpt2.py
src/transformers/models/gpt2/modeling_gpt2.py
src/transformers/models/gptj/modeling_gptj.py
src/transformers/models/gpt_neo/configuration_gpt_neo.py
src/transformers/models/gpt_neox/configuration_gpt_neox.py
src/transformers/models/gpt_neox_japanese/configuration_gpt_neox_japanese.py
src/transformers/models/groupvit/modeling_groupvit.py
src/transformers/models/groupvit/modeling_tf_groupvit.py
src/transformers/models/hubert/modeling_hubert.py
src/transformers/models/imagegpt/configuration_imagegpt.py
src/transformers/models/layoutlm/configuration_layoutlm.py
src/transformers/models/layoutlm/modeling_layoutlm.py
src/transformers/models/layoutlm/modeling_tf_layoutlm.py
src/transformers/models/layoutlmv2/configuration_layoutlmv2.py
src/transformers/models/layoutlmv2/modeling_layoutlmv2.py
src/transformers/models/layoutlmv3/configuration_layoutlmv3.py
src/transformers/models/layoutlmv3/modeling_layoutlmv3.py
src/transformers/models/layoutlmv3/modeling_tf_layoutlmv3.py
src/transformers/models/levit/configuration_levit.py
src/transformers/models/lilt/modeling_lilt.py
src/transformers/models/longformer/modeling_longformer.py
src/transformers/models/longformer/modeling_tf_longformer.py
src/transformers/models/longt5/modeling_longt5.py
src/transformers/models/marian/modeling_marian.py
src/transformers/models/markuplm/modeling_markuplm.py
src/transformers/models/mask2former/configuration_mask2former.py
src/transformers/models/mask2former/modeling_mask2former.py
src/transformers/models/maskformer/configuration_maskformer.py
src/transformers/models/maskformer/modeling_maskformer.py
src/transformers/models/mbart/configuration_mbart.py
src/transformers/models/mbart/modeling_mbart.py
src/transformers/models/mctct/configuration_mctct.py
src/transformers/models/megatron_bert/configuration_megatron_bert.py
src/transformers/models/mobilebert/configuration_mobilebert.py
src/transformers/models/mobilebert/modeling_mobilebert.py
src/transformers/models/mobilebert/modeling_tf_mobilebert.py
src/transformers/models/mobilenet_v1/modeling_mobilenet_v1.py
src/transformers/models/mobilenet_v2/modeling_mobilenet_v2.py
src/transformers/models/mobilevit/modeling_mobilevit.py
src/transformers/models/mobilevit/modeling_tf_mobilevit.py
src/transformers/models/nat/configuration_nat.py
src/transformers/models/nat/modeling_nat.py
src/transformers/models/nezha/configuration_nezha.py
src/transformers/models/oneformer/configuration_oneformer.py
src/transformers/models/oneformer/modeling_oneformer.py
src/transformers/models/openai/configuration_openai.py
src/transformers/models/opt/configuration_opt.py
src/transformers/models/opt/modeling_opt.py
src/transformers/models/opt/modeling_tf_opt.py
src/transformers/models/owlvit/modeling_owlvit.py
src/transformers/models/pegasus/configuration_pegasus.py
src/transformers/models/pegasus/modeling_pegasus.py
src/transformers/models/pegasus_x/configuration_pegasus_x.py
src/transformers/models/perceiver/modeling_perceiver.py
src/transformers/models/plbart/configuration_plbart.py
src/transformers/models/plbart/modeling_plbart.py
src/transformers/models/poolformer/configuration_poolformer.py
src/transformers/models/poolformer/modeling_poolformer.py
src/transformers/models/realm/configuration_realm.py
src/transformers/models/reformer/configuration_reformer.py
src/transformers/models/reformer/modeling_reformer.py
src/transformers/models/regnet/modeling_regnet.py
src/transformers/models/regnet/modeling_tf_regnet.py
src/transformers/models/resnet/configuration_resnet.py
src/transformers/models/resnet/modeling_resnet.py
src/transformers/models/resnet/modeling_tf_resnet.py
src/transformers/models/roberta/configuration_roberta.py
src/transformers/models/roberta/modeling_roberta.py
src/transformers/models/roberta/modeling_tf_roberta.py
src/transformers/models/roberta_prelayernorm/configuration_roberta_prelayernorm.py
src/transformers/models/roberta_prelayernorm/modeling_roberta_prelayernorm.py
src/transformers/models/roberta_prelayernorm/modeling_tf_roberta_prelayernorm.py
src/transformers/models/roc_bert/modeling_roc_bert.py
src/transformers/models/roc_bert/tokenization_roc_bert.py
src/transformers/models/segformer/modeling_segformer.py
src/transformers/models/sew/configuration_sew.py
src/transformers/models/sew/modeling_sew.py
src/transformers/models/sew_d/configuration_sew_d.py
src/transformers/models/sew_d/modeling_sew_d.py
src/transformers/models/speech_encoder_decoder/modeling_speech_encoder_decoder.py
src/transformers/models/speech_to_text/configuration_speech_to_text.py
src/transformers/models/speech_to_text/modeling_speech_to_text.py
src/transformers/models/speech_to_text_2/configuration_speech_to_text_2.py
src/transformers/models/speech_to_text_2/modeling_speech_to_text_2.py
src/transformers/models/speecht5/modeling_speecht5.py
src/transformers/models/speecht5/tokenization_speecht5.py
src/transformers/models/segformer/modeling_tf_segformer.py
src/transformers/models/squeezebert/configuration_squeezebert.py
src/transformers/models/swin/configuration_swin.py
src/transformers/models/swin/modeling_swin.py
src/transformers/models/swin2sr/modeling_swin2sr.py
src/transformers/models/swinv2/configuration_swinv2.py
src/transformers/models/table_transformer/modeling_table_transformer.py
src/transformers/models/time_series_transformer/configuration_time_series_transformer.py
src/transformers/models/time_series_transformer/modeling_time_series_transformer.py
src/transformers/models/trajectory_transformer/configuration_trajectory_transformer.py
src/transformers/models/transfo_xl/configuration_transfo_xl.py
src/transformers/models/trocr/configuration_trocr.py
src/transformers/models/trocr/modeling_trocr.py
src/transformers/models/unispeech/configuration_unispeech.py
src/transformers/models/unispeech/modeling_unispeech.py
src/transformers/models/unispeech_sat/modeling_unispeech_sat.py
src/transformers/models/upernet/modeling_upernet.py
src/transformers/models/van/modeling_van.py
src/transformers/models/videomae/modeling_videomae.py
src/transformers/models/vilt/modeling_vilt.py
src/transformers/models/vision_encoder_decoder/configuration_vision_encoder_decoder.py
src/transformers/models/vision_encoder_decoder/modeling_vision_encoder_decoder.py
src/transformers/models/vision_text_dual_encoder/configuration_vision_text_dual_encoder.py
src/transformers/models/vit/configuration_vit.py
src/transformers/models/vit/modeling_vit.py
src/transformers/models/vit/modeling_tf_vit.py
src/transformers/models/vit_mae/modeling_vit_mae.py
src/transformers/models/vit_mae/configuration_vit_mae.py
src/transformers/models/vit_msn/modeling_vit_msn.py
src/transformers/models/visual_bert/configuration_visual_bert.py
src/transformers/models/wav2vec2/configuration_wav2vec2.py
src/transformers/models/wav2vec2/modeling_wav2vec2.py
src/transformers/models/wav2vec2/tokenization_wav2vec2.py
src/transformers/models/wav2vec2_conformer/configuration_wav2vec2_conformer.py
src/transformers/models/wav2vec2_conformer/modeling_wav2vec2_conformer.py
src/transformers/models/wav2vec2_with_lm/processing_wav2vec2_with_lm.py
src/transformers/models/wavlm/configuration_wavlm.py
src/transformers/models/wavlm/modeling_wavlm.py
src/transformers/models/whisper/configuration_whisper.py
src/transformers/models/whisper/modeling_whisper.py
src/transformers/models/whisper/modeling_tf_whisper.py
src/transformers/models/xlm/configuration_xlm.py
src/transformers/models/xlm_roberta/configuration_xlm_roberta.py
src/transformers/models/xlm_roberta_xl/configuration_xlm_roberta_xl.py
src/transformers/models/xlnet/configuration_xlnet.py
src/transformers/models/yolos/configuration_yolos.py
src/transformers/models/yolos/modeling_yolos.py
src/transformers/models/x_clip/modeling_x_clip.py
src/transformers/models/yoso/configuration_yoso.py
src/transformers/pipelines/