Patrick von Platen
|
bc6f51e539
|
[Wav2Vec2ForPretraining] Correct checkpoints wav2vec2 & fix tests (#12089)
* fix_torch_device_generate_test
* remove @
* fix tests
|
2021-06-09 20:41:59 +01:00 |
|
Anton Lozhkov
|
d472bd7b18
|
Wav2Vec2 Pretraining (#11306)
* Working quantizer forward
* Working quantizer forward
* Clean up unused model parts, test reproducibility
* Working quantizer forward
* Clean up unused model parts, test reproducibility
* Remove custom outputs from the shared ones
* correct conversion
* correct bug
* add first pretrain script
* save intermediate
* static shapes
* save intermediate
* finish first pretrain script version
* more refactor
* remove wandb
* refactor more
* improve test
* correct perplexity compute bug
* finish model implementation
* add to docs
* finish docs
* finish pretraining script
* finish pretraining script
* remove wandb
* finish PR for merge
* finish config
* finish
* make deepspeed work
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* apply suggestions
* fix flaky test
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
|
2021-06-09 18:40:56 +01:00 |
|
Patrick von Platen
|
7630c11f32
|
[Wav2Vec2] SpecAugment Fast (#11764)
* first try
* finish
|
2021-05-25 13:59:52 +01:00 |
|
Patrick von Platen
|
3e3e41ae20
|
Pytorch - Lazy initialization of models (#11471)
* lazy_init_weights
* remove ipdb
* save int
* add necessary code
* remove unnecessary utils
* Update src/transformers/models/t5/modeling_t5.py
* clean
* add tests
* correct
* finish tests
* finish tests
* fix some more tests
* fix xlnet & transfo-xl
* fix more tests
* make sure tests are independent
* fix tests more
* finish tests
* final touches
* Update src/transformers/modeling_utils.py
* Apply suggestions from code review
* Update src/transformers/modeling_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Update src/transformers/modeling_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* clean tests
* give arg positive name
* add more mock weights to xlnet
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
|
2021-05-05 17:22:20 +02:00 |
|
Sylvain Gugger
|
acc3bd9d2a
|
Enforce string-formatting with f-strings (#10980)
* First third
* Styling and fix mistake
* Quality
* All the rest
* Treat %s and %d
* typo
* Missing )
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
|
2021-03-31 10:00:27 -04:00 |
|
Patrick von Platen
|
0486ccdd3d
|
small improvements (#10773)
|
2021-03-17 18:10:17 +03:00 |
|
Patrick von Platen
|
d9e693e1d0
|
make wav2vec2 test deterministic (#10714)
|
2021-03-15 09:50:05 -04:00 |
|
Patrick von Platen
|
0234de8418
|
Add Fine-Tuning for Wav2Vec2 (#10145)
* add encode labels function to tokenizer
* start adding finetuning
* init dropout
* upload
* correct convert script
* apply changes
* fix second typo
* make first dummy training run
* adapt convert script
* push config for comparison
* remove conf
* finish training
* adapt data collator
* add research folder
* update according to fairseq feedback
* some minor corrections
* refactor masking indices a bit
* some minor changes
* clean tokenizer
* finish clean-up
* remove previous logic
* update run script
* correct training
* finish changes
* finish model
* correct bug
* fix training a bit more
* add some tests
* finish gradient checkpointing
* finish example
* correct gradient checkpointing
* improve tokenization method
* revert changes in tokenizer
* revert general change
* adapt fine-tuning
* update
* save intermediate test
* Update README.md
* finish finetuning
* delete conversion script
* Update src/transformers/models/wav2vec2/configuration_wav2vec2.py
* Update src/transformers/models/wav2vec2/processing_wav2vec2.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* finish wav2vec2 script
* finish wav2vec2 fine-tuning
* finalize test
* correct test
* adapt tests
* finish
* remove test file
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
|
2021-03-01 12:13:17 +03:00 |
|
Patrick von Platen
|
cb38ffcc5e
|
[PretrainedFeatureExtractor] + Wav2Vec2FeatureExtractor, Wav2Vec2Processor, Wav2Vec2Tokenizer (#10324)
* push to show
* small improvement
* small improvement
* Update src/transformers/feature_extraction_utils.py
* Update src/transformers/feature_extraction_utils.py
* implement base
* add common tests
* make all tests pass for wav2vec2
* make padding work & add more tests
* finalize feature extractor utils
* add call method to feature extraction
* finalize feature processor
* finish tokenizer
* finish general processor design
* finish tests
* typo
* remove bogus file
* finish docstring
* add docs
* finish docs
* small fix
* correct docs
* save intermediate
* load changes
* apply changes
* apply changes to doc
* change tests
* apply surajs recommend
* final changes
* Apply suggestions from code review
* fix typo
* fix import
* correct docstring
|
2021-02-25 17:42:46 +03:00 |
|
Patrick von Platen
|
495c157d6f
|
[Wav2Vec2] Improve Tokenizer & Model for batched inference (#10117)
* save intermediate
* finish batch the same as fairseq
* add normalization
* fix batched input
* add better comment
* Update src/transformers/models/wav2vec2/modeling_wav2vec2.py
* add nice docstring
* add tokenizer tests
* make all slow tests pass
* finish PR
* correct import
|
2021-02-11 15:40:54 +03:00 |
|
Patrick von Platen
|
b972125ced
|
Deprecate Wav2Vec2ForMaskedLM and add Wav2Vec2ForCTC (#10089)
* add wav2vec2CTC and deprecate for maskedlm
* remove from docs
|
2021-02-09 03:49:02 -05:00 |
|
Patrick von Platen
|
d6217fb30c
|
Wav2Vec2 (#9659)
* add raw scaffold
* implement feat extract layers
* make style
* remove +
* correctly convert weights
* make feat extractor work
* make feature extraction proj work
* run forward pass
* finish forward pass
* Successful decoding example
* remove unused files
* more changes
* add wav2vec tokenizer
* add new structure
* fix run forward
* add other layer norm architecture
* finish 2nd structure
* add model tests
* finish tests for tok and model
* clean-up
* make style
* finish docstring for model and config
* make style
* correct docstring
* correct tests
* change checkpoints to fairseq
* fix examples
* finish wav2vec2
* make style
* apply sylvains suggestions
* apply lysandres suggestions
* change print to log.info
* re-add assert statement
* add input_values as required input name
* finish wav2vec2 tokenizer
* Update tests/test_tokenization_wav2vec2.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* apply sylvains suggestions
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
|
2021-02-02 15:52:10 +03:00 |
|