* do not use prefix="val" for test
The dummy example fails when test_epoch_end is called. The prefix="test" should be dynamic in the log metrics too.
* Create test.source
* Create test.target
For IterableDataset, return DataLoader using self._train_batch_size. This is consistent with how we generate a regular DataLoader, and leads to the correct args.per_device_train_batch_size eventually ending up on each GPU.
* Add tutorial doc for TF + TPU
* Fix all those extra asterisks in the markdown
* Use the actual Tip formatting
* Remove unnecessary spaces
* Reformat checklist
* Fix checklist and reformat tips slightly
* Update docs/source/en/perf_train_tpu_tf.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update docs/source/en/perf_train_tpu_tf.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update docs/source/en/perf_train_tpu_tf.mdx
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* Update docs/source/en/perf_train_tpu_tf.mdx
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* Add link to TPU notebook in the notebooks list
* Add links to the TPU notebook in the tutorial doc
* Make the markdown table a bit less wild
* Fix notebook link
* More notebook links
* More fixes to wild tables
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* make SpeechT5 model by copying Wav2Vec2
* add paper to docs
* whoops added docs in wrong file
* remove SpeechT5Tokenizer + put CTC back in the name
* remove deprecated class
* remove unused docstring
* delete SpeechT5FeatureExtractor, use Wav2Vec2FeatureExtractor instead
* remove classes we don't need right now
* initial stab at speech encoder prenet
* add more speech encoder prenet stuff
* improve SpeechEncoderPrenet
* add encoder (not finished yet)
* add relative position bias to self-attention
* add encoder CTC layers
* fix formatting
* add decoder from BART, doesn't work yet
* make it work with generate loop
* wrap the encoder into a speech encoder class
* wrap the decoder in a text decoder class
* changed my mind
* changed my mind again ;-)
* load decoder weights, make it work
* add weights for text decoder postnet
* add SpeechT5ForCTC model that uses only the encoder
* clean up EncoderLayer and DecoderLayer
* implement _init_weights in SpeechT5PreTrainedModel
* cleanup config + Encoder and Decoder
* add head + cross attention masks
* improve doc comments
* fixup
* more cleanup
* more fixup
* TextDecoderPrenet works now, thanks Kendall
* add CTC loss
* add placeholders for other pre/postnets
* add type annotation
* fix freeze_feature_encoder
* set padding tokens to 0 in decoder attention mask
* encoder attention mask downsampling
* remove features_pen calculation
* disable the padding tokens thing again
* fixup
* more fixup
* code review fixes
* rename encoder/decoder wrapper classes
* allow checkpoints to be loaded into SpeechT5Model
* put encoder into wrapper for CTC model
* clean up conversion script
* add encoder for TTS model
* add speech decoder prenet
* add speech decoder post-net
* attempt to reconstruct the generation loop
* add speech generation loop
* clean up generate_speech
* small tweaks
* fix forward pass
* enable always dropout on speech decoder prenet
* sort declaration
* rename models
* fixup
* fix copies
* more fixup
* make consistency checker happy
* add Seq2SeqSpectrogramOutput class
* doc comments
* quick note about loss and labels
* add HiFi-GAN implementation (from Speech2Speech PR)
* rename file
* add vocoder to TTS model
* improve vocoder
* working on tokenizer
* more better tokenizer
* add CTC tokenizer
* fix decode and batch_code in CTC tokenizer
* fix processor
* two processors and feature extractors
* use SpeechT5WaveformFeatureExtractor instead of Wav2Vec2
* cleanup
* more cleanup
* even more fixup
* notebooks
* fix log-mel spectrograms
* support reduction factor
* fixup
* shift spectrograms to right to create decoder inputs
* return correct labels
* add labels for stop token prediction
* fix doc comments
* fixup
* remove SpeechT5ForPreTraining
* more fixup
* update copyright headers
* add usage examples
* add SpeechT5ProcessorForCTC
* fixup
* push unofficial checkpoints to hub
* initial version of tokenizer unit tests
* add slow test
* fix failing tests
* tests for CTC tokenizer
* finish CTC tokenizer tests
* processor tests
* initial test for feature extractors
* tests for spectrogram feature extractor
* fixup
* more fixup
* add decorators
* require speech for tests
* modeling tests
* more tests for ASR model
* fix imports
* add fake tests for the other models
* fixup
* remove jupyter notebooks
* add missing SpeechT5Model tests
* add missing tests for SpeechT5ForCTC
* add missing tests for SpeechT5ForTextToSpeech
* sort tests by name
* fix Hi-Fi GAN tests
* fixup
* add speech-to-speech model
* refactor duplicate speech generation code
* add processor for SpeechToSpeech model
* add usage example
* add tests for speech-to-speech model
* fixup
* enable gradient checkpointing for SpeechT5FeatureEncoder
* code review
* push_to_hub now takes repo_id
* improve doc comments for HiFi-GAN config
* add missing test
* add integration tests
* make number of layers in speech decoder prenet configurable
* rename variable
* rename variables
* add auto classes for TTS and S2S
* REMOVE CTC!!!
* S2S processor does not support save/load_pretrained
* fixup
* these models are now in an auto mapping
* fix doc links
* rename HiFiGAN to HifiGan, remove separate config file
* REMOVE auto classes
* there can be only one
* fixup
* replace assert
* reformat
* feature extractor can process input and target at same time
* update checkpoint names
* fix commit hash
* no dot scale gradient in bf16 mode
* fix since args.fp16 might be none
* fixed typo
* typo
* only do if grad scaling is true
* self.amp_dtype == torch.float16 is true
* put back prop when fsdp is not none
* updated resources for LayoutLM
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* fixed formatting, removed extra section
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
input_ids_seq_length doesn't exist in the GenerationConfig, it exists as local variable in the function.
Setting exponential_decay_length_penalty therefore results in an error:
`AttributeError: 'GenerationConfig' object has no attribute 'input_ids_seq_length'`
This simple change fixes this issue, and the exponential_decay_length_penalty works as expected.
* force `memory_efficient_backward=True`
* enhancements
- trainer support
- add new flag
* some changes
- internal changes in `Trainer`
- small refactor
* make quality
* Fixes
- add new testing util
- add new test
- change test in Trainer
* fix CI test
* educate users on how to ft 8bit models
* more checks
* fix `logger` error
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* adapt from review
* fix
* add comment
* use return instead
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>