Commit Graph

177 Commits

Author SHA1 Message Date
Sylvain Gugger
45cac3fade
Fix labels stored in model config for token classification examples (#15482)
* Playing

* Properly set labels in model config for token classification example

* Port to run_ner_no_trainer

* Quality
2022-02-02 14:23:43 -05:00
Sylvain Gugger
d0b5ed110a
Harder check for IndexErrors in QA scripts (#15438)
* Harder check for IndexErrors in QA scripts

* Make test stronger
2022-02-01 15:49:13 -05:00
François REMY
0094eba363
Fix additional DataTrainingArguments documentation (#15408)
(This is an editorial change only)
2022-01-31 07:45:11 -05:00
Sylvain Gugger
c98a6ac211
Use argument for preprocessing workers in run_summairzation (#15394) 2022-01-28 18:34:10 -05:00
Lysandre
eab338104d Docs for version v4.16.0 2022-01-27 13:11:51 -05:00
Lysandre
f87db5e412 Release: v4.16.0 2022-01-27 13:06:33 -05:00
François REMY
19732cc07a
Fix 'eval_split_name' described as defaulting to 'train' (#15348)
The default is correct (`test`) but the description is not.
2022-01-26 10:19:38 -05:00
Patrick von Platen
457dd4392b
[Examples] Correct run ner label2id for fine-tuned models (#15017)
* up

* up

* make style

* apply sylvains suggestions

* apply changes to accelerate as well

* more changes

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-01-24 21:18:04 +01:00
Sylvain Gugger
4cff3fae11 Second failing test 2022-01-21 12:19:28 -05:00
Sylvain Gugger
f6253147df Skip failing test 2022-01-21 12:03:21 -05:00
NielsRogge
6c7b68d414
[ViTMAE] Add image pretraining script (#15242)
* Add script

* Improve script

* Fix data collator

* Update README

* Add label_names argument

* Apply suggestions from code review

* Add config parameters

* Update script

* Fix bug

* Improve README

* Improve README and add test

* Fix import

* Add image_column_name
2022-01-21 12:11:08 +01:00
Sylvain Gugger
531336bbfd
Fix deprecation warnings for int div (#15180)
* Fix deprecation warnings for int div

Co-authored-by: mgoldey <matthew.goldey@gmail.com>

* Fix import

* ensure that tensor output is python scalar

* make backward compatible

* make code more readable

* adapt test functions

Co-authored-by: mgoldey <matthew.goldey@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-01-18 07:28:53 -05:00
Sylvain Gugger
96881729ce Remove assert on optional arg 2022-01-13 17:34:41 -05:00
Edoardo Federici
9a94bb8e21
mBART support for run_summarization.py (#15125)
* Update run_summarization.py

* Fixed languages and added missing code

* fixed obj, docs, removed source_lang and target_lang

* make style, run_summarization.py reformatted
2022-01-12 16:39:33 -05:00
Patrick von Platen
d72343d2b8
[Wav2Vec2 Speech Event] Add speech event v2 (#15083)
* up

* up

* up

* up

* up

* up

* improve

* up

* up

* Update src/transformers/trainer.py

* up

* up

* up
2022-01-10 10:46:21 +01:00
flozi00
b67f345d00
Update run_speech_recognition_seq2seq.py (#14967) 2022-01-06 19:26:45 +03:00
flozi00
774ed4a027
Fix Code block (#14983) 2022-01-04 12:59:20 +01:00
Patrick von Platen
600496fa50
[Wav2Vec2] Rename model's feature extractor to feature encoder (#14959)
* rename classes

* clean up more namings

* remove bogus file

* Apply suggestions from code review

* Apply suggestions from code review

* replace more names

* more regex replace

* make style

* correct

* correct more

* make style

* finish

* correct more in wav2vec2

* make style

* improve freeze_extractor

* add aliases

* add tf aliases
2021-12-28 20:33:23 +01:00
Patrick von Platen
f80775df2b
Update README.md (#14965) 2021-12-28 13:41:27 +01:00
Patrick von Platen
1c121916f3
Add Speech Seq2Seq Training script (#14792)
* start

* add gradient checkpointing and feature extractor freezing

* Apply suggestions from code review

* up

* up

* up

* correct

* up

* more changes

* up

* up

* up

* remove rst
2021-12-28 10:20:51 +01:00
Patrick von Platen
fa39ff9fc4 Docs for v4.16.0dev0 2021-12-22 20:39:44 +01:00
Patrick von Platen
05fa1a7ac1 Release: v4.15.0 2021-12-22 18:43:15 +01:00
Mario Šaško
1045a36c1f
Fix pytorch image classification example (#14883)
* Update example

* Remove skip in tests
2021-12-22 14:42:19 +01:00
Sylvain Gugger
e51c7b5872 Skip failing test 2021-12-21 15:15:17 -05:00
Stas Bekman
033c3ed95a
[examples/summarization] deal with None in data records (#14816)
* [examples/summarization] deal with None in data records

* rewrite to use a simpler (slower) variant
2021-12-21 09:17:28 -08:00
Patrick von Platen
7ae6f07004
[ASR example] Improve example + add more examples (#14848)
* up

* load up

* up
2021-12-21 13:12:22 +01:00
Patrick von Platen
c4a96cecbc
Wav2Vec2 meets phonemes (#14353)
* up

* add tokenizer

* improve more

* finish tokenizer

* finish

* adapt speech recognition script

* adapt convert

* more fixes

* more fixes

* update phonemizer wav2vec2

* better naming

* fix more tests

* more fixes swedish

* correct tests

* finish

* improve script

* remove file

* up

* lets get those 100 model architectures until the end of the month

* make fix-copies

* correct more

* correct script

* more fixes

* more fixes

* add to docs

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* replace assert

* fix copies

* fix docs

* new try docs

* boom boom

* update

* add phonemizer to audio tests

* make fix-copies

* up

* upload models

* some changes

* Update tests/test_tokenization_wav2vec2_phoneme.py

Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>

* more fixes

* remove @

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
2021-12-17 19:56:44 +01:00
Lysandre
7c9c41f43c Docs for v4.14.0 2021-12-15 18:29:53 +01:00
Lysandre
960d8cb41d Release: v4.14.0 2021-12-15 18:20:35 +01:00
Josué Nascimento
971e36667a
Change how to load config of XLNetLMHeadModel (#14746) 2021-12-13 12:34:26 -05:00
Lysandre
ab31b3e41b Docs for v4.14.0dev0 2021-12-09 17:09:23 +01:00
Lysandre
4da3a696e4 Release: v4.13.0 2021-12-09 16:55:21 +01:00
Gaurang Tandon
4ea19de80c
fix: verify jsonlines file in run_translation (#14660) (#14661)
* fix: verify jsonl in run_translation (#14660)

* fix(run_translation.py): json/jsonl validation

Both json and jsonl are to be accepted as valid jsonlines file extension

* fix(run_translation.py): make black happy

* Ran make style
2021-12-08 13:25:30 -05:00
Julien Chaumond
6cdc3a7844
[urls to hub] Replace outdated model tags with their now-canonical pipeline types (#14617)
* Replace outdated model tags with their now-canonical pipeline types

* spam the CI till it's green
2021-12-06 04:35:01 -05:00
Kamal Raj
803a8cd18f
updated readme with proper arguments (#14624) 2021-12-05 22:12:51 -05:00
(Bill) Yuchen Lin
3977b58437
fix a typo (#14626) 2021-12-05 11:31:23 +05:30
Nicholas Broad
69e16abf98
Switch from using sum for flattening lists of lists in group_texts (#14472)
* remove sum for list flattening

* change to chain(*)

* make chain object a list

* delete empty lines

per sgugger's suggestions

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Nicholas Broad <nicholas@nmbroad.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-11-22 16:17:26 -05:00
Stas Bekman
11f65d4158
[test] add test for --config_overrides (#14466)
* add test for --config_overrides

* remove unneeded parts of the test
2021-11-22 11:33:43 -05:00
Patrick von Platen
efea0f868b
[Speech Recognition] More examples
Add more XLS-R training runs to the official examples
2021-11-18 23:42:02 +01:00
William Held
01f8e639d3
Recover Deleted XNLI Instructions (#14437) 2021-11-17 20:16:47 -05:00
Patrick von Platen
55f49c5f4b
[Wav2Vec2 Example] Improve fine-tuning script (#14373)
* improve some stuff

* finish

* correct last
2021-11-12 16:35:57 +01:00
karthikrangasai
4f24058c58
Update Seq2Seq QA example script to use SQuAD metric. (#14335)
* Update postporcessing accordingly to use SQuAD metric.

* Update assets accordingly based on SQuAD metrics.

* Fix function naming error.
2021-11-09 08:04:23 -05:00
Sylvain Gugger
08a5f57567
Add new LFS prune API (#14294) 2021-11-05 18:58:51 -04:00
NielsRogge
7396095af7
Update README of QA examples (#14172) 2021-11-01 12:52:22 +01:00
Patrick von Platen
ba71f1b57f
Update README.md 2021-10-28 19:43:05 +02:00
Lysandre
b8fad022a0 v4.13.0.dev0 2021-10-28 12:56:46 -04:00
Lysandre
62bf536631 Release v4.12.0 2021-10-28 12:09:49 -04:00
Anton Lozhkov
78b6a2ecbd
Add audio-classification benchmarking results (#14192) 2021-10-28 15:59:18 +03:00
Patrick von Platen
88cd82e801
Update README.md 2021-10-28 02:35:01 +02:00
Patrick von Platen
e118db15d6
Update README.md 2021-10-28 01:59:27 +02:00