Commit Graph

37 Commits

Author SHA1 Message Date
Jackmin801
145109382a
Allow trust_remote_code in example scripts (#25248)
* pytorch examples

* pytorch mim no trainer

* cookiecutter

* flax examples

* missed line in pytorch run_glue

* tensorflow examples

* tensorflow run_clip

* tensorflow run_mlm

* tensorflow run_ner

* tensorflow run_clm

* pytorch example from_configs

* pytorch no trainer examples

* Revert "tensorflow run_clip"

This reverts commit 261f86ac1f.

* fix: duplicated argument
2023-08-07 16:32:25 +02:00
Yih-Dar
149cb0cce2
Add token arugment in example scripts (#25172)
* fix

* fix

* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-08-02 11:17:31 +02:00
Yih-Dar
d53b8ad780
Update use_auth_token -> token in example scripts (#25167)
* pytorch examples

* tensorflow examples

* flax examples

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-28 15:33:45 +02:00
Zach Mueller
aa1b09c5d1
Change logic for logging in the examples (#24956)
Change logic
2023-07-20 12:30:10 -04:00
Sylvain Gugger
e9ad51306f
4.32.0.dev0 2023-07-17 13:30:44 -04:00
Sylvain Gugger
ba695c1efd
v4.31.0.dev0 2023-06-07 16:49:00 -04:00
Sylvain Gugger
a0c0a78233
v4.30.0.dev0 2023-05-09 14:59:38 -04:00
Sylvain Gugger
888c4a2ae0
v4.29.0.dev0 2023-04-12 20:04:29 -04:00
Mikel Penagarikano
d5239bab5b
Sync preprocesses before loading the processor at run_speech_recognition_ctc.py (#21926)
* Update run_speech_recognition_ctc.py

Make sure all processes wait until data is saved before loading the processor from the output_dit

* Make sure all processes wait until data is saved before loading the processor from the output_dit

* Update run_speech_recognition_ctc.py

* Update run_speech_recognition_seq2seq.py
2023-04-05 09:36:04 -04:00
Sylvain Gugger
ebdb185bef
v4.28.0.dev0 2023-03-14 13:49:10 -04:00
bofeng huang
6192549c1f
[examples/speech-recognition] Add SpecAugment to run_speech_recognition_seq2seq.py (#21942)
* Add specaugment to run_speech_recognition_seq2seq.py

* Remove useless argument: text_column

* Fix quality

* Update return_attention_mask condition

* Update specaugment arguments only for whisper models

* Remove SpecAugment arguments from ModelArguments, only leave default values for simplicity

* Apply suggestions from code review

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* Update apply_spec_augment only for whisper models

* Apply suggestions from code review

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* Rename return_attention_mask to forward_attention_mask to avoid confusion with wav2vec2 models

---------

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
2023-03-08 17:59:31 +01:00
Sylvain Gugger
6f79d26442
Update quality tooling for formatting (#21480)
* Result of black 23.1

* Update target to Python 3.7

* Switch flake8 to ruff

* Configure isort

* Configure isort

* Apply isort with line limit

* Put the right black version

* adapt black in check copies

* Fix copies
2023-02-06 18:10:56 -05:00
Sylvain Gugger
7119bb052a
v4.27.0.dev0 2023-01-23 16:52:35 -05:00
Emmanuel Schmidbauer
0526a075c5
run_speech_recognition_seq2seq.py: add cache_dir param to dataset (#20540) 2022-12-07 18:23:16 +00:00
Sylvain Gugger
60d1f31bb0
v4.26.0.dev0 2022-12-01 16:19:33 -05:00
Sanchit Gandhi
af1a7c8ca3
[Examples] Generalise Seq2Seq ASR to handle Whisper (#19519)
* merge conflicts

* bos and eos in datacollator

* (temp) hardcode removal of attention mask

* freeze encoder

* actually freeze encoder

* set max length / num beams according to gen kwargs

* (temp) fix tests

* don't pop attn mask

* override return attention mask config from Hub

* Hub configs updated 🤗

* final fixes

* update type annotations

* backward comp
2022-11-14 17:45:46 +00:00
Sylvain Gugger
c3a93d8d82
v4.25.0.dev0 2022-10-31 21:48:40 -04:00
Sanchit Gandhi
f38a145418
[ASR] Update 'tasks' for model card (#19986) 2022-10-31 16:50:17 +00:00
Sanchit Gandhi
eefcecaa35
[Examples] Fix typos in run speech recognition seq2seq (#19514) 2022-10-12 15:33:22 +01:00
Lysandre
10100979ed Dev version 2022-10-10 17:25:40 -04:00
Lysandre
16913b3c92 Dev version 2022-09-14 14:58:20 -04:00
Julien Chaumond
9129fd0377
transformers-cli login => huggingface-cli login (#18490)
* zero chance anyone's using that constant no?

* `transformers-cli login` => `huggingface-cli login`

* `transformers-cli repo create` => `huggingface-cli repo create`

* `make style`
2022-08-06 09:42:55 +02:00
atturaioe
1f84399171
Migrate metric to Evaluate in Pytorch examples (#18369)
* Migrate metric to Evaluate in pytorch examples

* Remove unused imports
2022-08-01 07:40:25 -04:00
Lysandre
c89a592e87 Dev version 2022-07-27 17:13:57 +02:00
Sylvain Gugger
7c6ec195ad v4.21.0.dev0 2022-06-16 12:20:53 -04:00
Sylvain Gugger
3cab90279f
Add examples telemetry (#17552)
* Add examples telemetry

* Alternative approach

* Add to all other examples

* Add to templates as well

* Put framework separately

* Same for TensorFlow
2022-06-07 11:57:52 -04:00
Sylvain Gugger
afe5d42d8d
Black preview (#17217)
* Black preview

* Fixup too!

* Fix check copies

* Use the same version as the CI

* Bump black
2022-05-12 16:25:55 -04:00
Lysandre Debut
5294fa12ee Dev version 2022-05-12 11:04:23 -04:00
Lysandre Debut
a180efe7fd Dev version 2022-04-06 11:08:12 -04:00
Karim Foda
24a85cca61
Add use_auth to load_datasets for private datasets to PT and TF examples (#16521)
* fix formatting and remove use_auth

* Add use_auth_token to Flax examples
2022-04-04 10:27:45 -04:00
Sylvain Gugger
79d28e80b6 v4.18.0.dev.0 2022-03-03 10:19:58 -05:00
Anton Lozhkov
a459f7f97d
Add ASR CTC streaming example (#15309)
* Single-epoch run

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Infinite dataset

* Trainer fix + distributed benchmark

* Benchmark fix

* unused import

* interleaved splits

* interleaved splits

* has_length util

* Move to research projects

* Leftover Sized checks

* Bump min version

* Unused import

* Revert trainer changes

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-02-07 18:35:37 +03:00
Lysandre
eab338104d Docs for version v4.16.0 2022-01-27 13:11:51 -05:00
Lysandre
f87db5e412 Release: v4.16.0 2022-01-27 13:06:33 -05:00
flozi00
b67f345d00
Update run_speech_recognition_seq2seq.py (#14967) 2022-01-06 19:26:45 +03:00
Patrick von Platen
600496fa50
[Wav2Vec2] Rename model's feature extractor to feature encoder (#14959)
* rename classes

* clean up more namings

* remove bogus file

* Apply suggestions from code review

* Apply suggestions from code review

* replace more names

* more regex replace

* make style

* correct

* correct more

* make style

* finish

* correct more in wav2vec2

* make style

* improve freeze_extractor

* add aliases

* add tf aliases
2021-12-28 20:33:23 +01:00
Patrick von Platen
1c121916f3
Add Speech Seq2Seq Training script (#14792)
* start

* add gradient checkpointing and feature extractor freezing

* Apply suggestions from code review

* up

* up

* up

* correct

* up

* more changes

* up

* up

* up

* remove rst
2021-12-28 10:20:51 +01:00