111 Commits

Author SHA1 Message Date
Anton Lozhkov
dbaf49203e
[Examples] Use Audio feature in speech classification (#14052)
* Update SEW integration test tolerance

* Update audio classification

* Update test

* Remove torchaudio

* Add dataset revision

* Hub branch naming

* Revert dataset revisions

* Update datasets
2021-10-20 12:22:43 +03:00
Weizhe Yuan
7a3147e9b8
fix typo (#14049) 2021-10-18 18:03:11 -04:00
Patrick von Platen
bdf31d6e0a
[Speech] Move all examples to new audio feature (#14045)
* up

* up

* up

* finish
2021-10-18 12:52:40 +02:00
Patrick von Platen
37c5759cbe
[Speech Examples] Add new audio feature (#14027)
* finish

* up

* finish all

* up
2021-10-17 23:01:03 +02:00
Patrick von Platen
7fb2a8b3d9
up (#14008) 2021-10-14 15:46:22 +02:00
Sylvain Gugger
0ef61d392c Revert "Skip faulty test"
This reverts commit 5b6bd4e788.
2021-10-14 09:02:41 -04:00
Sylvain Gugger
5b6bd4e788 Skip faulty test 2021-10-13 22:04:40 -04:00
Patrick von Platen
d45fc7da3d
[Speech Examples] Add pytorch speech pretraining (#13877)
* adapt wav2vec2

* add example

* add files

* adapt

* remove bogus file

* Apply suggestions from code review

* adapt files more

* upload changes

* del old files

* up

* up

* up

* up

* up

* correct gradient checkpointing

* add readme

* finish

* finish

* up

* more fixes

* up

* up

* add demo run to readme

* up
2021-10-12 00:46:32 +02:00
Chungman Lee
46dfe99e44
Fix typo in README.md (#13883) 2021-10-08 14:25:32 -04:00
Dhananjay Shettigar
319beb64eb
#12789 Replace assert statements with exceptions (#13909)
* #12789 Replace assert statements with exceptions

* fix-copies: made copy changes to utils_qa.py in examples/pytorch/question-answering and examples/tensorflow/question-answering

* minor refactor for clarity
2021-10-07 09:09:01 -04:00
Akul Agrawal
dac7798144
Update run_qa.py (#13857) 2021-10-05 23:10:24 -04:00
Nathan Raw
cc0a415e2f
update image classification example (#13824)
* update image classification example

* 📌 update reqs
2021-10-04 11:49:51 -07:00
Anton Lozhkov
4213728067
[Examples] Add an official audio classification example (#13722)
* Restore broken merge

* Additional args, DDP, remove CommonLanguage

* Update examples for V100, add training results

* Style

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Remove custom datasets for simplicity, apply suggestions from code review

* Add the attention_mask flag, reorganize README

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-10-01 18:52:45 +02:00
Patrick von Platen
44eb8bdeea
map only on one process (#13810) 2021-09-30 18:52:53 +02:00
Stas Bekman
b90096fe14
[examples run_glue.py] missing requirements scipy, sklearn (#13768)
* missing requirement

* list both
2021-09-29 13:45:19 -07:00
Lysandre
11c69b8045 Docs for version v4.11.0 2021-09-27 14:19:38 -04:00
Lysandre
dc193c906d Release: v4.11.0 2021-09-27 14:14:09 -04:00
Sylvain Gugger
044eff5bf0
Update requirements for speech example (#13745) 2021-09-26 09:02:45 +02:00
Patrick von Platen
469b80d4e7
Update README.md 2021-09-24 18:53:58 +02:00
Patrick von Platen
493643fff8
up (#13733) 2021-09-24 18:32:35 +02:00
Gunjan Chhablani
38580455de
Add model card creation snippet to example scripts (#13730)
* Update run_glue.py

* Update run_glue.py

* Add model creation snippet to other scripts

* Fix style
2021-09-24 15:51:46 +02:00
Patrick von Platen
95f888fd6a
Update README.md 2021-09-24 09:53:37 +02:00
Patrick von Platen
4a320f6c9a
[ASR] Add official ASR CTC example to examples/pytorch/speech-recognition (#13620)
* up

* rename

* add asr example

* add auto feature extractor

* some more fixes

* correct layerdrop

* correct for multi-gpu dist

* clean up

* refactor

* refactor

* more fixes

* more fixes

* clean-up

* finish

* up

* Apply suggestions from code review

* fix isort

* update

* up

* add note

* apply surajs suggestions

* Apply suggestions from code review

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* isort

* small change

* Apply suggestions from code review

Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>

* Apply suggestions from code review

Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>

* add hubert

* Update examples/pytorch/speech-recognition/run_speech_recognition_ctc.py

Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
2021-09-24 07:01:11 +02:00
Sylvain Gugger
27d4639779
Make gradient_checkpointing a training argument (#13657)
* Make gradient_checkpointing a training argument

* Update src/transformers/modeling_utils.py

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Update src/transformers/configuration_utils.py

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Fix tests

* Style

* document Gradient Checkpointing as a performance feature

* Small rename

* PoC for not using the config

* Adapt BC to new PoC

* Forgot to save

* Rollout changes to all other models

* Fix typo

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>
2021-09-22 07:51:38 -04:00
Sylvain Gugger
b7d264be0d
Add push_to_hub to no_trainer examples (#13659)
* Add push_to_hub to no_trainer examples

* Quality

* Document integration

* Roll out to other examples
2021-09-21 13:13:30 -04:00
Suraj Patil
87d5057d86
fix typo (#13647) 2021-09-20 13:22:26 +05:30
Patrick von Platen
95f933ea85
[Pretrained Model] Add resize_position_embeddings (#13559)
* finish

* delete bogus file

* correct some stuff

* finish

* finish
2021-09-15 19:03:56 +02:00
Aleksander Smywiński-Pohl
008c2d0b7a
Fix typo in documentation (#13494)
* Fix typo in deepspeed documentation

* Add missing import in deepspeed configuration

* Fix path in translation examples
2021-09-09 08:00:05 -04:00
Nathan Raw
79815090ea
Fix img classification tests (#13456)
* Update image-classification example's tests

* 🔥 remove cats_and_dogs test samples

* 💄 fix flake8
2021-09-07 05:58:45 -04:00
Suraj Patil
2dd975b235
skip image classification test (#13451) 2021-09-06 21:46:25 +05:30
Suraj Patil
6b29bff852
add torchvision in example test requirements (#13438) 2021-09-06 15:17:54 +02:00
Nathan Raw
76c4d8bf26
Add PyTorch image classification example (#13134)
* add pytorch image classification example

* 🔥 remove utils.py

* 💄 fix flake8 style issues

* 🔥 remove unnecessary line

* limit dataset sizes

* 📌 update reqs

* 🎨 restructure - use datasets lib

* 🎨 import transforms directly

* 📝 add comments

* 💄 style

* 🔥 remove flag

* 📌 update requirement warning

* 📝 add vision README.md

* 📝 update README.md

* 📝 update README.md

* 🎨 add image-classification tag to model card

* 🚚 rename vision ➡️ image-classification

* 📝 update image-classification README.md
2021-09-02 13:29:42 -06:00
Lysandre
5ee67a4412 Docs for v4.10.0 2021-08-31 16:02:31 +02:00
Lysandre
d12bbe4942 Release: v4.10.0 2021-08-31 15:53:10 +02:00
Sylvain Gugger
c76de1053e
Add generate kwargs to Seq2SeqTrainingArguments (#13339)
* Add generate kwargs to Seq2SeqTrainingArguments

* typo

* Address review comments + doc

* Style
2021-08-31 08:42:00 -04:00
Sylvain Gugger
139e830158
Update label2id in the model config for run_glue (#13334) 2021-08-30 10:35:09 -04:00
Stefan Schweter
4046e66e40
examples: only use keep_linebreaks when reading TXT files (#13320)
* examples: only use keep_linebreaks when reading TXT files for all CLM examples

* examples: only use keep_linebreaks when reading TXT files for all CLM examples

* examples: only use keep_linebreaks when reading TXT files for all CLM examples
2021-08-28 16:22:29 +02:00
Stefan Schweter
319d840b46
examples: add keep_linebreaks option to CLM examples (#13150)
* examples: add keep_linebreaks option to text dataset loader for all CLM examples

* examples: introduce new keep_linebreaks option as data argument in CLM examples
2021-08-27 11:35:45 +02:00
Allan Lin
91ff480e26
Update namespaces inside torch.utils.data to the latest. (#13167)
* Update torch.utils.data namespaces to the latest.

* Format

* Update Dataloader.

* Style
2021-08-19 14:29:51 +02:00
Sylvain Gugger
7fcee113c1
Tpu tie weights (#13030)
* Fix tied weights on TPU

* Manually tie weights in no trainer examples

* Fix for test

* One last missing

* Getting owned by my scripts

* Address review comments

* Fix test

* Fix tests

* Fix reformer tests
2021-08-06 20:41:39 +02:00
Chungman Lee
75b8990d90
fix typo in example/text-classification README (#12974)
* fix typo in example/text-classification README

* add space to align the table
2021-08-02 12:58:43 +02:00
Sylvain Gugger
3ec851dc5e
Fix QA examples for roberta tokenizer (#12928) 2021-07-28 09:47:49 -04:00
Sylvain Gugger
fd85734e0e
Add option to set max_len in run_ner (#12929) 2021-07-28 09:38:12 -04:00
Sylvain Gugger
303989de0e
Add accelerate to examples requirements (#12888) 2021-07-26 09:57:34 -04:00
Lysandre
40de2d5a4f Docs for v4.10.0dev0 2021-07-22 12:52:25 +02:00
Lysandre
72aee83ced Release: v4.9.0 2021-07-22 12:11:55 +02:00
Maxwell Forbes
fcf83011df
Fix type of max_seq_length arg in run_swag.py (#12832) 2021-07-22 02:14:14 -04:00
Sylvain Gugger
6f1adc4334
Fix group_lengths for short datasets (#12558) 2021-07-08 07:23:41 -04:00
Souvic Chakraborty
1d6623c6a2
MLM training fails with no validation file (same as #12406 for pytorch now) (#12517)
* Validation split percentage to be used for custom data files also

Issue same as https://github.com/huggingface/transformers/issues/12406 fixed for pytorch branch run_mlm.py

* Validation split added in the right place

* Update run_clm.py

* validation split added for custom files

* Validation split added for custom files

* Update run_plm.py

* fixed validation split for custom files as input for pytorch examples in lm

* Update run_clm_no_trainer.py

* args modified
2021-07-07 09:05:44 -04:00
Bhadresh Savani
04dbea31a9
[Examples] Added context manager to datasets map (#12367)
* added context manager to datasets map

* fixed style and spaces

* fixed warning of deprecation

* changed desc
2021-06-28 09:14:00 -07:00