Sylvain Gugger
7119bb052a
v4.27.0.dev0
2023-01-23 16:52:35 -05:00
Emmanuel Schmidbauer
0526a075c5
run_speech_recognition_seq2seq.py: add cache_dir param to dataset ( #20540 )
2022-12-07 18:23:16 +00:00
Francisco Kurucz
f821bea0ad
Fix link to speech encoder decoder model in speech recognition readme ( #20633 )
2022-12-06 15:46:41 -05:00
Sylvain Gugger
60d1f31bb0
v4.26.0.dev0
2022-12-01 16:19:33 -05:00
Sanchit Gandhi
c29a2f7c9c
[ASR Examples] Update README for Whisper ( #20230 )
...
* [ASR Examples] Update README for seq2seq
* add language info
* add training results
* re-word
2022-11-18 11:24:25 +00:00
Sanchit Gandhi
af1a7c8ca3
[Examples] Generalise Seq2Seq ASR to handle Whisper ( #19519 )
...
* merge conflicts
* bos and eos in datacollator
* (temp) hardcode removal of attention mask
* freeze encoder
* actually freeze encoder
* set max length / num beams according to gen kwargs
* (temp) fix tests
* don't pop attn mask
* override return attention mask config from Hub
* Hub configs updated 🤗
* final fixes
* update type annotations
* backward comp
2022-11-14 17:45:46 +00:00
bhuang
3502c202f9
Update README.md ( #20063 )
2022-11-04 08:56:54 -04:00
Sylvain Gugger
c3a93d8d82
v4.25.0.dev0
2022-10-31 21:48:40 -04:00
Sanchit Gandhi
f38a145418
[ASR] Update 'tasks' for model card ( #19986 )
2022-10-31 16:50:17 +00:00
Sanchit Gandhi
eefcecaa35
[Examples] Fix typos in run speech recognition seq2seq ( #19514 )
2022-10-12 15:33:22 +01:00
Lysandre
10100979ed
Dev version
2022-10-10 17:25:40 -04:00
ddobokki
fa4bcd5274
edit: cast attention_mask to long in DataCollatorCTCWithPadding ( #19369 )
...
* edit: casting attention_mask to long in DataCollatorCTCWithPadding
* edit: casting attention_mask to long in DataCollatorCTCWithPadding
2022-10-07 10:05:48 -04:00
Lysandre
16913b3c92
Dev version
2022-09-14 14:58:20 -04:00
Zachary Mueller
358fc18613
Add evaluate to examples requirements ( #18666 )
2022-08-18 10:57:39 -04:00
Julien Chaumond
9129fd0377
transformers-cli login
=> huggingface-cli login
(#18490 )
...
* zero chance anyone's using that constant no?
* `transformers-cli login` => `huggingface-cli login`
* `transformers-cli repo create` => `huggingface-cli repo create`
* `make style`
2022-08-06 09:42:55 +02:00
atturaioe
1f84399171
Migrate metric to Evaluate in Pytorch examples ( #18369 )
...
* Migrate metric to Evaluate in pytorch examples
* Remove unused imports
2022-08-01 07:40:25 -04:00
Sylvain Gugger
986526a0e4
Replace as_target
context managers by direct calls ( #18325 )
...
* Preliminary work on tokenizers
* Quality + fix tests
* Treat processors
* Fix pad
* Remove all uses of in tests, docs and examples
* Replace all as_target_tokenizer
* Fix tests
* Fix quality
* Update examples/flax/image-captioning/run_image_captioning_flax.py
Co-authored-by: amyeroberts <amy@huggingface.co>
* Style
Co-authored-by: amyeroberts <amy@huggingface.co>
2022-07-29 08:09:09 -04:00
Lysandre
c89a592e87
Dev version
2022-07-27 17:13:57 +02:00
Sylvain Gugger
7c6ec195ad
v4.21.0.dev0
2022-06-16 12:20:53 -04:00
Sylvain Gugger
3cab90279f
Add examples telemetry ( #17552 )
...
* Add examples telemetry
* Alternative approach
* Add to all other examples
* Add to templates as well
* Put framework separately
* Same for TensorFlow
2022-06-07 11:57:52 -04:00
Patrick von Platen
a9eca74372
Wav2vec2 finetuning shared file system ( #17423 )
...
* fix_torch_device_generate_test
* remove @
* [Fix shared file system]
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2022-05-25 22:04:43 +02:00
Sylvain Gugger
afe5d42d8d
Black preview ( #17217 )
...
* Black preview
* Fixup too!
* Fix check copies
* Use the same version as the CI
* Bump black
2022-05-12 16:25:55 -04:00
Lysandre Debut
5294fa12ee
Dev version
2022-05-12 11:04:23 -04:00
Lysandre Debut
a180efe7fd
Dev version
2022-04-06 11:08:12 -04:00
Karim Foda
24a85cca61
Add use_auth to load_datasets for private datasets to PT and TF examples ( #16521 )
...
* fix formatting and remove use_auth
* Add use_auth_token to Flax examples
2022-04-04 10:27:45 -04:00
Lysandre Debut
eca77f4719
Updates the default branch from master to main ( #16326 )
...
* Updates the default branch from master to main
* Links from `master` to `main`
* Typo
* Update examples/flax/README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-03-23 03:46:59 -04:00
Sylvain Gugger
79d28e80b6
v4.18.0.dev.0
2022-03-03 10:19:58 -05:00
Anton Lozhkov
a459f7f97d
Add ASR CTC streaming example ( #15309 )
...
* Single-epoch run
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Infinite dataset
* Trainer fix + distributed benchmark
* Benchmark fix
* unused import
* interleaved splits
* interleaved splits
* has_length util
* Move to research projects
* Leftover Sized checks
* Bump min version
* Unused import
* Revert trainer changes
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-02-07 18:35:37 +03:00
François REMY
0094eba363
Fix additional DataTrainingArguments documentation ( #15408 )
...
(This is an editorial change only)
2022-01-31 07:45:11 -05:00
Lysandre
eab338104d
Docs for version v4.16.0
2022-01-27 13:11:51 -05:00
Lysandre
f87db5e412
Release: v4.16.0
2022-01-27 13:06:33 -05:00
François REMY
19732cc07a
Fix 'eval_split_name' described as defaulting to 'train' ( #15348 )
...
The default is correct (`test`) but the description is not.
2022-01-26 10:19:38 -05:00
Patrick von Platen
d72343d2b8
[Wav2Vec2 Speech Event] Add speech event v2 ( #15083 )
...
* up
* up
* up
* up
* up
* up
* improve
* up
* up
* Update src/transformers/trainer.py
* up
* up
* up
2022-01-10 10:46:21 +01:00
flozi00
b67f345d00
Update run_speech_recognition_seq2seq.py ( #14967 )
2022-01-06 19:26:45 +03:00
Patrick von Platen
600496fa50
[Wav2Vec2] Rename model's feature extractor to feature encoder ( #14959 )
...
* rename classes
* clean up more namings
* remove bogus file
* Apply suggestions from code review
* Apply suggestions from code review
* replace more names
* more regex replace
* make style
* correct
* correct more
* make style
* finish
* correct more in wav2vec2
* make style
* improve freeze_extractor
* add aliases
* add tf aliases
2021-12-28 20:33:23 +01:00
Patrick von Platen
f80775df2b
Update README.md ( #14965 )
2021-12-28 13:41:27 +01:00
Patrick von Platen
1c121916f3
Add Speech Seq2Seq Training script ( #14792 )
...
* start
* add gradient checkpointing and feature extractor freezing
* Apply suggestions from code review
* up
* up
* up
* correct
* up
* more changes
* up
* up
* up
* remove rst
2021-12-28 10:20:51 +01:00
Patrick von Platen
fa39ff9fc4
Docs for v4.16.0dev0
2021-12-22 20:39:44 +01:00
Patrick von Platen
05fa1a7ac1
Release: v4.15.0
2021-12-22 18:43:15 +01:00
Patrick von Platen
7ae6f07004
[ASR example] Improve example + add more examples ( #14848 )
...
* up
* load up
* up
2021-12-21 13:12:22 +01:00
Patrick von Platen
c4a96cecbc
Wav2Vec2 meets phonemes ( #14353 )
...
* up
* add tokenizer
* improve more
* finish tokenizer
* finish
* adapt speech recognition script
* adapt convert
* more fixes
* more fixes
* update phonemizer wav2vec2
* better naming
* fix more tests
* more fixes swedish
* correct tests
* finish
* improve script
* remove file
* up
* lets get those 100 model architectures until the end of the month
* make fix-copies
* correct more
* correct script
* more fixes
* more fixes
* add to docs
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* replace assert
* fix copies
* fix docs
* new try docs
* boom boom
* update
* add phonemizer to audio tests
* make fix-copies
* up
* upload models
* some changes
* Update tests/test_tokenization_wav2vec2_phoneme.py
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
* more fixes
* remove @
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
2021-12-17 19:56:44 +01:00
Lysandre
7c9c41f43c
Docs for v4.14.0
2021-12-15 18:29:53 +01:00
Lysandre
960d8cb41d
Release: v4.14.0
2021-12-15 18:20:35 +01:00
Lysandre
ab31b3e41b
Docs for v4.14.0dev0
2021-12-09 17:09:23 +01:00
Lysandre
4da3a696e4
Release: v4.13.0
2021-12-09 16:55:21 +01:00
Patrick von Platen
efea0f868b
[Speech Recognition] More examples
...
Add more XLS-R training runs to the official examples
2021-11-18 23:42:02 +01:00
Patrick von Platen
55f49c5f4b
[Wav2Vec2 Example] Improve fine-tuning script ( #14373 )
...
* improve some stuff
* finish
* correct last
2021-11-12 16:35:57 +01:00
Lysandre
b8fad022a0
v4.13.0.dev0
2021-10-28 12:56:46 -04:00
Lysandre
62bf536631
Release v4.12.0
2021-10-28 12:09:49 -04:00
Patrick von Platen
88cd82e801
Update README.md
2021-10-28 02:35:01 +02:00