Mirror of https://github.com/huggingface/transformers.git — synced 2025-07-05 22:00:09 +06:00

wav2vec2: support datasets other than LibriSpeech

* Formatting run_asr.py to pass code quality test
* bundled orthography options and added verbose logs
* fixing a typo in timit fine-tuning script
* update comment for clarity
* resize_lm_head and load custom vocab from file
* adding a max_duration_in_seconds filter
* do not assign `duration_filter` lambda, use a def
* log untransliterated text as well
* fix base model for arabic
* fix duration filter when target_sr is not set
* drop duration_in_seconds when unneeded
* script for wav2vec2-large-lv60-timit-asr
* fix for "tha" in arabic corpus (huggingface#10581)
* adding more options to work with common_voice
* PR feedback (huggingface#10581)
* small README change
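One of the bundled changes above adds a `max_duration_in_seconds` filter and replaces an inline `duration_filter` lambda with a named `def`. A minimal sketch of what such a filter might look like, independent of run_asr.py itself (the `duration_in_seconds` field name and the sample records are assumptions for illustration, not the actual implementation):

```python
# Hypothetical maximum clip length, mirroring the --max_duration_in_seconds option
max_duration_in_seconds = 10.0

def duration_filter(example):
    """Keep only examples at or below the configured maximum duration.

    Defined as a named function rather than a lambda so it can be
    pickled cleanly when preprocessing runs in multiple workers.
    """
    return example["duration_in_seconds"] <= max_duration_in_seconds

# Illustrative sample records (not real TIMIT data)
samples = [
    {"duration_in_seconds": 3.2},
    {"duration_in_seconds": 12.5},
    {"duration_in_seconds": 9.9},
]
kept = [s for s in samples if duration_filter(s)]
```

In the actual script this predicate would be passed to a `datasets` `.filter()` call; a named `def` also gives clearer tracebacks than an anonymous lambda.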
23 lines · 610 B · Bash · Executable File
#!/usr/bin/env bash
python run_asr.py \
--output_dir="./wav2vec2-base-timit-asr" \
--num_train_epochs="30" \
--per_device_train_batch_size="20" \
--per_device_eval_batch_size="20" \
--evaluation_strategy="steps" \
--save_steps="500" \
--eval_steps="100" \
--logging_steps="50" \
--learning_rate="5e-4" \
--warmup_steps="3000" \
--model_name_or_path="facebook/wav2vec2-base" \
--fp16 \
--dataset_name="timit_asr" \
--train_split_name="train" \
--validation_split_name="test" \
--orthography="timit" \
--preprocessing_num_workers="$(nproc)" \
--group_by_length \
--freeze_feature_extractor \
--verbose_logging