* Initial commit
* Another bunch of updates
* make style quality + delete debug arg from bash script
* Use compute_metrics func
* Do a few fixes
* Add copyright
* Fix typos
* Add NER example with accelerate library
* This commit contains the first (still quite unfinished)
version of a script showing how to train a HuggingFace model
with their new accelerate library.
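For context, a minimal sketch of the "no Trainer" pattern the script follows (heavily simplified and illustrative only; the actual script adds argument parsing, a learning-rate scheduler, evaluation, and many more options):

```python
# Illustrative only: the core "no Trainer" training loop with accelerate.
from accelerate import Accelerator

def train(model, optimizer, train_dataloader, num_epochs=3):
    accelerator = Accelerator()
    # prepare() wraps the objects for whatever device / distributed setup is active
    model, optimizer, train_dataloader = accelerator.prepare(
        model, optimizer, train_dataloader
    )
    model.train()
    for _ in range(num_epochs):
        for batch in train_dataloader:
            loss = model(**batch).loss
            accelerator.backward(loss)  # replaces loss.backward()
            optimizer.step()
            optimizer.zero_grad()
```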
* Fix metric calculation
* make style quality
* mv ner_no_trainer to token-classification dir
* Delete --debug flag from running script
* hf_datasets -> raw_datasets
* Make a few slight adjustments
* Add an informative comment + rewrite a help comment
* Change header
* Fix a few things
* Enforce the use of fast tokenizers only
* DataCollatorWithPadding -> DataCollatorForTokenClassification
* Change bash script: python3 -> accelerate launch
* make style
* Add a few missing things (see below)
* Add max-length padding to predictions and labels to
enable accelerate gather functionality
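For context, a hedged sketch of why that padding is needed: gathering across processes requires equal-shaped tensors, so predictions and labels are padded to a common length first (the helper below is illustrative, not the script's exact code):

```python
# Illustrative only: pad, then gather, so every process contributes equal-shaped tensors.
def gather_for_metrics(accelerator, predictions, labels):
    predictions = accelerator.pad_across_processes(predictions, dim=1, pad_index=-100)
    labels = accelerator.pad_across_processes(labels, dim=1, pad_index=-100)
    # -100 marks padded positions so the metric computation can ignore them
    return accelerator.gather(predictions), accelerator.gather(labels)
```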
* Add PyTorch no trainer example to the example README.md
* Remove --do_train from args as it is redundant for now
* DataCollatorWithPadding -> DataCollatorForTokenClassification
* Remove some obsolete args.do_train conditions from the script
* Delete --do_train from bash running script
* Delete use_slow_tokenizer from args
* Add unintentionally removed flag --label_all_tokens
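For reference, an illustrative sketch (not the script's exact code) of what `--label_all_tokens` controls when word-level labels are aligned to sub-word tokens:

```python
# Illustrative only: aligning word-level NER labels to word-piece tokens.
def align_labels(word_labels, word_ids, label_all_tokens=False):
    labels, previous_word = [], None
    for word_id in word_ids:
        if word_id is None:                 # special tokens are ignored by the loss
            labels.append(-100)
        elif word_id != previous_word:      # first sub-token of a word keeps its label
            labels.append(word_labels[word_id])
        else:                               # remaining sub-tokens: label or ignore
            labels.append(word_labels[word_id] if label_all_tokens else -100)
        previous_word = word_id
    return labels
```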
* Delete --debug flag from running script
* [examples/seq2seq] fix t5 examples
This PR:
* fixes T5 examples to include `--source_prefix` - it's **not** optional. If you give it a try you will see that you get 10x worse BLEU scores w/o it: w/ `27.6849`, w/o `2.374`
* adds a normal translation example w/o the peculiarities of MBart and T5
* reduces the default max samples to 50 so it's much faster to test quickly
Summarization seems to be broken score-wise for T5: https://github.com/huggingface/transformers/issues/10733
@sgugger
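For context, an illustrative sketch of what `--source_prefix` does for T5 (the prefix string and the field name below are placeholders, not the script's exact code): T5 was pre-trained with task prefixes, so each source sentence must be prepended with one, otherwise the model is not told which task to perform and BLEU collapses.

```python
# Illustrative only: how a task prefix is applied before tokenization.
prefix = "translate English to Romanian: "  # the value passed via --source_prefix

def preprocess(batch, tokenizer, max_source_length=128):
    # "en" is a placeholder name for the source-language column
    inputs = [prefix + text for text in batch["en"]]
    return tokenizer(inputs, max_length=max_source_length, truncation=True)
```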
* explicitly specify the T5 models requiring the special handling
* one more
* update the t5 summarization example to use cnn_dailymail
* move `max*samples` into the top level README.md
* better wording
* better wording
* add initial script
* finish script
* add shell script example
* accept chars_to_ignore as CLI arg
* align the script with other example scripts
* add torchaudio dep
* wav2vec2: support datasets other than LibriSpeech
* Formatting run_asr.py to pass code quality test
* bundled orthography options and added verbose logs
* fixing a typo in timit fine-tuning script
* update comment for clarity
* resize_lm_head and load custom vocab from file
* adding a max_duration_in_seconds filter
* do not assign `duration_filter` lambda, use a def
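For reference, an illustrative sketch of such a filter written as a named function rather than an assigned lambda (assigned lambdas are flagged by the style check, flake8 E731); the names and threshold below are examples, not the script's exact code:

```python
# Illustrative only: a named predicate instead of an assigned lambda (flake8 E731).
MAX_DURATION_IN_SECONDS = 20.0

def is_short_enough(duration_in_seconds):
    return duration_in_seconds <= MAX_DURATION_IN_SECONDS

# typical use with datasets, after which the helper column can be dropped again:
# dataset = dataset.filter(is_short_enough, input_columns=["duration_in_seconds"])
# dataset = dataset.remove_columns(["duration_in_seconds"])
```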
* log untransliterated text as well
* fix base model for arabic
* fix duration filter when target_sr is not set
* drop duration_in_seconds when unneeded
* script for wav2vec2-large-lv60-timit-asr
* fix for "tha" in arabic corpus (huggingface#10581)
* adding more options to work with common_voice
* PR feedback (huggingface#10581)
* small README change
* pass hf optimizer and scheduler to deepspeed if not specified in ds config
* update
* make init_deepspeed support config dict
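For context, an illustrative sketch (not the actual Trainer integration) of passing the HF-built optimizer and scheduler to DeepSpeed when the config defines none of its own; all names here are hypothetical:

```python
# Illustrative only: hand DeepSpeed the HF-built optimizer/scheduler unless the
# config (assumed here to be a plain dict) defines its own "optimizer"/"scheduler" sections.
import deepspeed

def init_deepspeed(model, hf_optimizer, hf_lr_scheduler, ds_config):
    optimizer = None if "optimizer" in ds_config else hf_optimizer
    lr_scheduler = None if "scheduler" in ds_config else hf_lr_scheduler
    engine, optimizer, _, lr_scheduler = deepspeed.initialize(
        model=model,
        optimizer=optimizer,
        lr_scheduler=lr_scheduler,
        config_params=ds_config,  # a config dict is accepted as well as a JSON file path
    )
    return engine, optimizer, lr_scheduler
```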
* fix docstring formatting
* clean up trainer's comments
* add new tests
* fix type
* composite argparse doesn't work
* style
* add a new test, rename others
* document new functionality
* complete tests, add docs
* style
* correct level
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* add new methods to the doc
* must tell DS we are using a non-native optimizer
* add protection against cpu_offload + HF optimizer combo
* fix the cli overrides
* sync docs + tests
* restore AdamW
* better docs
* need new version
* no longer needed
* remove outdated information
* refactor duplicated code
Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>