# Speech Encoder Decoder Models The [`SpeechEncoderDecoderModel`] can be used to initialize a speech-sequence-to-text-sequence model with any pretrained speech autoencoding model as the encoder (*e.g.* [Wav2Vec2](wav2vec2), [Hubert](hubert)) and any pretrained autoregressive model as the decoder. The effectiveness of initializing speech-sequence-to-text-sequence models with pretrained checkpoints for speech recognition and speech translation has *e.g.* been shown in [Large-Scale Self- and Semi-Supervised Learning for Speech Translation](https://arxiv.org/abs/2104.06678) by Changhan Wang, Anne Wu, Juan Pino, Alexei Baevski, Michael Auli, Alexis Conneau. An example of how to use a [`SpeechEncoderDecoderModel`] for inference can be seen in [Speech2Text2](speech_to_text_2). ## SpeechEncoderDecoderConfig [[autodoc]] SpeechEncoderDecoderConfig ## SpeechEncoderDecoderModel [[autodoc]] SpeechEncoderDecoderModel - forward - from_encoder_decoder_pretrained ## FlaxSpeechEncoderDecoderModel [[autodoc]] FlaxSpeechEncoderDecoderModel - __call__ - from_encoder_decoder_pretrained