Commit Graph

5 Commits

Author SHA1 Message Date
Patrick von Platen
880154d2e1
[Wav2Vec2] Fix special tokens for Wav2Vec2 tokenizer (#11349)
* fix wav2vec2 tok

* up
2021-04-22 12:23:08 +02:00
Lysandre Debut
2adc8c926a
W2v2 test require torch (#10665)
* Adds a @require_torch to a test that requires it

* Tokenizer too

* Style
2021-03-11 12:56:12 -05:00
Patrick von Platen
cb38ffcc5e
[PretrainedFeatureExtractor] + Wav2Vec2FeatureExtractor, Wav2Vec2Processor, Wav2Vec2Tokenizer (#10324)
* push to show

* small improvement

* small improvement

* Update src/transformers/feature_extraction_utils.py

* Update src/transformers/feature_extraction_utils.py

* implement base

* add common tests

* make all tests pass for wav2vec2

* make padding work & add more tests

* finalize feature extractor utils

* add call method to feature extraction

* finalize feature processor

* finish tokenizer

* finish general processor design

* finish tests

* typo

* remove bogus file

* finish docstring

* add docs

* finish docs

* small fix

* correct docs

* save intermediate

* load changes

* apply changes

* apply changes to doc

* change tests

* apply surajs recommend

* final changes

* Apply suggestions from code review

* fix typo

* fix import

* correct docstring
2021-02-25 17:42:46 +03:00
Patrick von Platen
495c157d6f
[Wav2Vec2] Improve Tokenizer & Model for batched inference (#10117)
* save intermediate

* finish batch the same as fairseq

* add normalization

* fix batched input

* add better comment

* Update src/transformers/models/wav2vec2/modeling_wav2vec2.py

* add nice docstring

* add tokenizer tests

* make all slow tests pass

* finish PR

* correct import
2021-02-11 15:40:54 +03:00
Patrick von Platen
d6217fb30c
Wav2Vec2 (#9659)
* add raw scaffold

* implement feat extract layers

* make style

* remove +

* correctly convert weights

* make feat extractor work

* make feature extraction proj work

* run forward pass

* finish forward pass

* Succesful decoding example

* remove unused files

* more changes

* add wav2vec tokenizer

* add new structure

* fix run forward

* add other layer norm architecture

* finish 2nd structure

* add model tests

* finish tests for tok and model

* clean-up

* make style

* finish docstring for model and config

* make style

* correct docstring

* correct tests

* change checkpoints to fairseq

* fix examples

* finish wav2vec2

* make style

* apply sylvains suggestions

* apply lysandres suggestions

* change print to log.info

* re-add assert statement

* add input_values as required input name

* finish wav2vec2 tokenizer

* Update tests/test_tokenization_wav2vec2.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* apply sylvains suggestions

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-02-02 15:52:10 +03:00