Commit Graph

1573 Commits

Author SHA1 Message Date
Lysandre
9853c5dd58 Development on v4.6.0dev0 2021-04-06 12:53:25 -04:00
Lysandre
4906a29f7f Release v4.5.0 2021-04-06 12:37:47 -04:00
Hemil Desai
6ab7d1a429
Add Readme for language modeling scripts with accelerate (#11073) 2021-04-05 20:56:12 -04:00
Hemil Desai
b51b87c41d
Add examples/language_modeling/run_clm_no_trainer.py (#11026)
* Initial draft for clm no trainer

* Remove unwanted args

* Fix bug

* Update examples/language-modeling/run_clm_no_trainer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-04-05 12:27:52 -04:00
Stas Bekman
3d39226a51
s|Pretrained|PreTrained| (#11048) 2021-04-04 18:08:42 -07:00
versis
335c0ca35c
fixed typo: logging instead of logger (#11025) 2021-04-02 09:22:22 -04:00
Hemil Desai
838f83d84c
Add examples/language_modeling/run_mlm_no_trainer.py (#11001)
* Add initial script for finetuning MLM models with accelerate

* Add evaluation metric calculation

* Fix bugs

* Use no_grad on evaluation

* update script docstring

* Update examples/language-modeling/run_mlm_no_trainer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* PR feedback

* Fix CI failure

* Update examples/language-modeling/run_mlm_no_trainer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-31 18:49:45 -04:00
Sylvain Gugger
acc3bd9d2a
Enforce string-formatting with f-strings (#10980)
* First third

* Styling and fix mistake

* Quality

* All the rest

* Treat %s and %d

* typo

* Missing )

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-03-31 10:00:27 -04:00
WybeKoper
645f45c462
Fixed some typos and removed legacy url (#10989)
* Fixed typos

* Removed legacy colab notebook from readme

Co-authored-by: WybeKoper <WybeKoper@users.noreply.github.com>
2021-03-31 16:53:15 +05:30
Yih-Dar
e031162a6b
fix md file to avoid evaluation crash (#10962) 2021-03-30 21:26:22 +03:00
Philipp Schmid
3e09d813aa
[examples/s2s] added py7zr dep (#10971)
* added py7zr

* comment out check_min for sagemaker test

* added min version again
2021-03-30 23:17:12 +05:30
Stas Bekman
05c966f24b
[vulnerability] dep fix (#10954)
Fixes https://github.com/huggingface/transformers/security/dependabot/examples/research_projects/lxmert/requirements.txt/Pygments/open

@LysandreJik
2021-03-29 17:25:47 -04:00
Daniel Stancl
5057213bcc
Add examples/multiple-choice/run_swag_no_trainer.py (#10934)
* Initial commit

* Another bunch of updates

* make style quliaty + delete debug arg from bash script

* Use compue_metrics func

* Do a few fixes

* Add copyright

* Fix typos
2021-03-29 16:41:09 -04:00
Sylvain Gugger
4002f95eb6 Remove duplicate code 2021-03-29 15:27:12 -04:00
Daniel Stancl
d7b50ce469
Add examples/run_ner_no_trainer.py (#10902)
* Add NER example with accelerate library

* This commit contains the first (yet really unfinished)
version of a script for showing how to train HuggingFace model
with their new accelerate library.

* Fix metric calculation

* make style quality

* mv ner_no_trainer to token-classification dir

* Delete --debug flag from running script

* hf_datasets -> raw_datasets

* Make a few slight adjustments

* Add an informative comment + rewrite a help comment

* Change header

* Fix a few things

* Enforce to use fast tokenizers only

* DataCollatorWithPadding -> DataCollatorForTokenClassification

* Change bash script: python3 -> accelerate launch

* make style

* Add a few missing things (see below)

* Add a max-lenghth padding to predictions and labels to
enable accelerate gather functionality

* Add PyTorch no trainer example to the example README.md

* Remove --do-train from args as being redundant for now

* DataCollatorWithPadding -> DataCollatorForTokenClassification

* Remove some obsolete args.do_train conditions from the script

* Delete --do_train from bash running script

* Delete use_slow_tokenizer from args

* Add unintentionally removed flag --label_all_tokens

* Delete --debug flag from running script
2021-03-29 15:11:23 -04:00
WybeKoper
ddea8771c6
Updated colab links in readme of examples (#10932)
Co-authored-by: WybeKoper <WybeKoper@users.noreply.github.com>
2021-03-29 08:47:09 -04:00
Bhadresh Savani
4f21e1ddd6
fixed finename (#10939) 2021-03-28 09:48:12 -07:00
Stas Bekman
3c27d246e5
[vulnerability] fix dependency (#10914)
this PR fixes https://github.com/huggingface/transformers/security/dependabot/examples/research_projects/lxmert/requirements.txt/PyYAML/open
2021-03-26 09:06:11 -04:00
Jethro Kuan
5f1491d3b3
run_glue_no_trainer: datasets -> raw_datasets (#10898)
Use the correct variable (raw_datasets) instead of the module (datasets)
where appropriate.
2021-03-25 08:28:17 -04:00
Bhadresh Savani
7ef40120a0
[Examples] Added predict stage and Updated Example Template (#10868)
* added predict stage

* added test keyword in exception message

* removed example specific saving predictions

* fixed f-string error

* removed extra line

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2021-03-23 10:37:59 -07:00
Eliza Szczechla
9f8fa4e973
Use DataCollatorForSeq2Seq in run_summarization in all cases (#10856)
Co-authored-by: Eliza <eliza@habanero.tiger.com.pl>
2021-03-22 15:05:39 -04:00
Boris Dayma
125ccead71
feat(wandb): logging and configuration improvements (#10826)
* feat: ensure unique artifact id

* feat: allow manual init

* fix: simplify reinit logic

* fix: no dropped value + immediate commits

* fix: wandb use in sagemaker

* docs: improve documenation and formatting

* fix: typos

* docs: improve formatting
2021-03-22 10:45:17 -04:00
Stas Bekman
8fb4671811
[vulnerability] in example deps fix (#10817)
Takes care of:
https://github.com/huggingface/transformers/security/dependabot/examples/research_projects/lxmert/requirements.txt/jinja2/open

@LysandreJik

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-03-22 09:05:24 -04:00
dependabot[bot]
dbfe379514
Bump jinja2 from 2.11.2 to 2.11.3 in /examples/research_projects/lxmert (#10818)
Bumps [jinja2](https://github.com/pallets/jinja) from 2.11.2 to 2.11.3.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/master/CHANGES.rst)
- [Commits](https://github.com/pallets/jinja/compare/2.11.2...2.11.3)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-03-22 08:54:50 -04:00
Qiushi Pan
29904a967b
Update FINE_TUNE_XLSR_WAV2VEC2.md (#10849)
Fix typo.
2021-03-22 07:58:59 -04:00
Patrick von Platen
0f226f78ce
push (#10846) 2021-03-22 10:32:21 +03:00
Suraj Patil
82b8d8c7b0
Update FINE_TUNE_XLSR_WAV2VEC2.md 2021-03-21 22:47:09 +05:30
Patrick von Platen
af6125ffdb
Update FINE_TUNE_XLSR_WAV2VEC2.md 2021-03-21 12:31:33 +03:00
Patrick von Platen
5aaf6e1460
small improvements for wav2vec2 info script (#10829) 2021-03-21 11:41:44 +03:00
Suraj Patil
68b55885ed
add doc for Local machine (#10828) 2021-03-21 13:25:34 +05:30
Julien Chaumond
1438c487df
wav2vec doc tweaks (#10808)
* wording/typos tweaks

* Make model upload instructions simpler
2021-03-19 12:48:54 -04:00
Patrick von Platen
b9570a813c
Update FINE_TUNE_XLSR_WAV2VEC2.md 2021-03-19 19:45:28 +03:00
Sylvain Gugger
946400fb68
Expand a bit the presentation of examples (#10799)
* Expand a bit the presentation of examples

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Address review comments

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2021-03-19 10:06:08 -04:00
Bhadresh Savani
fd1d9f1ab8
[Example] Updating Question Answering examples for Predict Stage (#10792)
* added prediction stage and eval fix

* style correction

* removed extra lines
2021-03-19 09:42:17 -04:00
Patrick von Platen
e8968bd03a
[XLSR-Wav2Vec2 Info doc] Add a couple of lines (#10806)
* finish

* fix

* fix

* fix

* fix
2021-03-19 12:52:54 +03:00
Stas Bekman
427ea3fecb
addressing vulnerability report in research project deps (#10802)
Following up on a security alert:
https://github.com/huggingface/transformers/security/dependabot/examples/research_projects/lxmert/requirements.txt/Pillow/open
2021-03-18 22:02:10 -04:00
Patrick von Platen
2ae678229f
Update FINE_TUNE_XLSR_WAV2VEC2.md 2021-03-19 00:29:20 +03:00
Patrick von Platen
68a3215949
Update FINE_TUNE_XLSR_WAV2VEC2.md 2021-03-19 00:27:40 +03:00
Patrick von Platen
03df3fbcb4
Update FINE_TUNE_XLSR_WAV2VEC2.md 2021-03-19 00:26:49 +03:00
Patrick von Platen
e84adbed40
Add XLSR-Wav2Vec2 Fine-Tuning README.md (#10786)
* upload

* upload fine-tuning script

* improve

* adapt

* Apply suggestions from code review

* correct

* upload

* finalize

* remove @

* correct typos
2021-03-19 00:22:43 +03:00
Stas Bekman
9352b5151a
[examples/seq2seq/README.md] fix t5 examples (#10734)
* [examples/seq2seq] fix t5 examples

This PR:
* fixes T5 examples to include `--source_prefix` - it's **not** optional. If you give it a try you will see that you get 10x worse bleu scores w/o it. w/ `27.6849`, w/ `2.374`
* added a normal translation example w/o the peculiarities of MBart and T5
* reduces the default max samples to 50 so it's much faster to test quickly

summarization seems to be broken for t5 score-wise: https://github.com/huggingface/transformers/issues/10733

@sgugger

* specify explicitly the t5 models requiring the special handling

* one more

* update the t5 summarization example to use cnn_dailymail

* move max*samples into the top level README.md

* better wording

* better wording
2021-03-18 09:55:39 -07:00
Julien Chaumond
4f3e93cfaf
[file_utils] do not gobble certain kinds of requests.ConnectionError (#10235)
* do not gobble certain kinds of requests.ConnectionError

* Apply review comments

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2021-03-18 12:37:45 -04:00
Suraj Patil
5f19c07a70
add run_common_voice script (#10767)
* add initial script

* finish script

* add shell script example

* accept chars_to_ignor as cl arg

* align the script with other example scripts

* add torchaudio dep
2021-03-18 17:21:16 +05:30
Mohamed El-Geish
af8afdc88d
wav2vec2: support datasets other than LibriSpeech (#10581)
* wav2vec2: support datasets other than LibriSpeech

* Formatting run_asr.py to pass code quality test

* bundled orthography options and added verbose logs

* fixing a typo in timit fine-tuning script

* update comment for clarity

* resize_lm_head and load custom vocab from file

* adding a max_duration_in_seconds filter

* do not assign `duration_filter` lambda, use a def

* log untransliterated text as well

* fix base model for arabic

* fix duration filter when target_sr is not set

* drop duration_in_seconds when unneeded

* script for wav2vec2-large-lv60-timit-asr

* fix for "tha" in arabic corpus (huggingface#10581)

* adding more options to work with common_voice

* PR feedback (huggingface#10581)

* small README change
2021-03-18 10:20:26 +03:00
Stas Bekman
393739194e
[examples] document resuming (#10776)
* document resuming in examples

* fix

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* put trainer code last, adjust notes

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-17 12:48:35 -07:00
Stas Bekman
cd8c93f701
[DeepSpeed] improve checkpoint loading code plus tests (#10760)
* deepspeed checkpoint loading code plus tests

* style

* style
2021-03-17 10:22:58 -07:00
Cheng Li
c83fbc5f2d
[Deepspeed] Allow HF optimizer and scheduler to be passed to deepspeed (#10464)
* pass hf optimizer and scheduler to deepspeed if not specified in ds config

* pass hf optimizer and scheduler to deepspeed if not specified in ds config

* update

* make init_deepspeed support config dict

* fix docstring formatting

* clean up trainer's comments

* add new tests

* fix type

* composit argparse doesn't work

* style

* add a new test, rename others

* document new functionality

* complete tests, add docs

* style

* correct level

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* add new methods to the doc

* must tell DS we are using a non-native optimizer

* add protection against cpu_offload + HF optimizer combo

* fix the cli overrides

* sync docs + tests

* restore AdamW

* better docs

* need new version

* no longer needed

* remove outdate information

* refactor duplicated code

Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-16 15:51:09 -07:00
Lysandre
1b5ce1e63b Development on v4.5.0dev0 2021-03-16 11:41:15 -04:00
Lysandre
c988db5af2 Release v4.4.0 2021-03-16 11:33:35 -04:00
Russell Klopfer
87d685b8a9
independent training / eval with local files (#10710)
* independent training / eval with local files

* remove redundant assert
2021-03-15 19:35:26 -04:00
Sylvain Gugger
4c379daf64
Add minimum version check in examples (#10724)
* Add minimum version check in examples

* Style

* No need for new line maybe?

* Add helpful comment
2021-03-15 19:29:54 -04:00
Joe Davison
966ba081c9
zero-shot pipeline multi_class -> multi_label (#10727) 2021-03-15 16:02:46 -06:00
Théo Matussière
6f840990a7
split seq2seq script into summarization & translation (#10611)
* split seq2seq script, update docs

* needless diff

* fix readme

* remove test diff

* s/summarization/translation

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* cr

* fix arguments & better mbart/t5 refs

* copyright

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* reword readme

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* s/summarization/translation

* short script names

* fix tests

* fix isort, include mbart doc

* delete old script, update tests

* automate source prefix

* automate source prefix for translation

* s/translation/trans

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* fix script name (short version)

* typos

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* exact parameter

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* remove superfluous source_prefix calls in docs

* rename scripts & warn for source prefix

* black

* flake8

Co-authored-by: theo <theo@matussie.re>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2021-03-15 09:11:42 -04:00
Stas Bekman
4c32f9f26e
AdamW is now supported by default (#9624) 2021-03-12 13:40:07 -08:00
Lysandre Debut
9fbb4cdc80
Specify minimum version for sacrebleu (#10662) 2021-03-11 13:45:06 -05:00
ArvidYin
27d9e05ce2
Update README.md (#10647)
correct spell error: 'nether'
2021-03-11 08:58:04 -05:00
Sylvain Gugger
efb5c0a453
Add new GLUE example with no Trainer. (#10555)
* Add new GLUE example with no Trainer.

* Style

* Address review comments
2021-03-10 09:29:19 -05:00
Allen Wang
6f52fce673
Fixes an issue in text-classification where MNLI eval/test datasets are not being preprocessed. (#10621)
* Fix MNLI tests

* Linter fix
2021-03-09 22:13:45 -05:00
Sylvain Gugger
0d909f6bd8
Fairscale FSDP fix model save (#10596)
* Hotfix fairscale FSDP

* Evaluation works

* Save on process zero
2021-03-09 14:42:07 -05:00
Stas Bekman
f284089ec4
[examples tests on multigpu] resolving require_torch_non_multi_gpu_but_fix_me (#10561)
* batch 1

* this is tpu

* deebert attempt

* the rest
2021-03-08 11:11:40 -08:00
Bhadresh Savani
dfd16af832
Added max_sample_ arguments (#10551)
* reverted changes of logging and saving metrics

* added max_sample arguments

* fixed code

* white space diff

* reformetting code

* reformatted code
2021-03-08 13:57:10 -05:00
Stas Bekman
917f104502
[examples tests] various fixes (#10584)
* fix sharded ddp enum

* test fixes

* stronger validation + apex breaks other tests
2021-03-08 10:28:44 -08:00
Stas Bekman
e6ce636e02
fix nltk lookup (#10585) 2021-03-07 22:09:58 -08:00
Stas Bekman
88a951e3cc
offline mode for firewalled envs (#10407)
* offline mode start

* add specific values

* fix fallback

* add test

* better values check and range

* test that actually works

* document the offline mode

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* more strict check

* cleaner test

* pt-only test

* style

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-05 17:27:48 -08:00
Patrick von Platen
395ffcd757
fix run seq2seq (#10547) 2021-03-05 18:17:12 +03:00
Sylvain Gugger
a5bd40b75c
Not always consider a local model a checkpoint in run_glue (#10517) 2021-03-04 11:11:39 -05:00
Sylvain Gugger
745ea78dcc Revert "Not always consider a local model a checkpoint in run_glue"
This reverts commit f3660613bc.
2021-03-04 09:45:18 -05:00
Sylvain Gugger
f3660613bc Not always consider a local model a checkpoint in run_glue 2021-03-04 09:44:02 -05:00
Patrick von Platen
0234de8418
Add Fine-Tuning for Wav2Vec2 (#10145)
* add encode labels function to tokenizer

* start adding finetuning

* init dropout

* upload

* correct convert script

* apply changes

* fix second typo

* make first dummy training run

* adapt convert script

* push confg for comparison

* remove conf

* finish training

* adapt data collator

* add research folder

* update according to fairseq feedback

* some minor corrections

* refactor masking indices a bit

* some minor changes

* clean tokenizer

* finish clean-up

* remove previous logic

* update run script

* correct training

* finish changes

* finish model

* correct bug

* fix training a bit more

* add some tests

* finish gradient checkpointing

* finish example

* correct gradient checkpointing

* improve tokenization method

* revert changes in tokenizer

* revert general change

* adapt fine-tuning

* update

* save intermediate test

* Update README.md

* finish finetuning

* delete conversion script

* Update src/transformers/models/wav2vec2/configuration_wav2vec2.py

* Update src/transformers/models/wav2vec2/processing_wav2vec2.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* finish wav2vec2 script

* finish wav2vec2 fine-tuning

* finalize test

* correct test

* adapt tests

* finish

* remove test file

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-03-01 12:13:17 +03:00
Bhadresh Savani
aca6288ff4
updated logging and saving metrics (#10436)
* updated logging and saving metrics

* space removal
2021-02-27 09:53:44 -08:00
Stas Bekman
f52a15897b
[run_seq2seq.py] restore functionality: saving to test_generations.txt (#10428)
This PR restores the original functionality that for some reason was modified.

Fixes: https://github.com/huggingface/transformers/issues/10381

@sgugger
2021-02-27 08:21:50 -08:00
Stas Bekman
ee04b69822
[examples] better model example (#10427)
* refactors

* typo
2021-02-26 17:01:01 -08:00
Sylvain Gugger
17b6e0d474
Fix run_glue evaluation when model has a label correspondence (#10401) 2021-02-25 15:30:38 -05:00
Sylvain Gugger
9d14be5c20
Add support for ZeRO-2/3 and ZeRO-offload in fairscale (#10354)
* Ass support for ZeRO-2/3 and ZeRO-offload in fairscale

* Quality

* Rework from review comments

* Add doc

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Address review comments

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2021-02-25 11:07:53 -05:00
Patrick von Platen
cb38ffcc5e
[PretrainedFeatureExtractor] + Wav2Vec2FeatureExtractor, Wav2Vec2Processor, Wav2Vec2Tokenizer (#10324)
* push to show

* small improvement

* small improvement

* Update src/transformers/feature_extraction_utils.py

* Update src/transformers/feature_extraction_utils.py

* implement base

* add common tests

* make all tests pass for wav2vec2

* make padding work & add more tests

* finalize feature extractor utils

* add call method to feature extraction

* finalize feature processor

* finish tokenizer

* finish general processor design

* finish tests

* typo

* remove bogus file

* finish docstring

* add docs

* finish docs

* small fix

* correct docs

* save intermediate

* load changes

* apply changes

* apply changes to doc

* change tests

* apply surajs recommend

* final changes

* Apply suggestions from code review

* fix typo

* fix import

* correct docstring
2021-02-25 17:42:46 +03:00
Stas Bekman
3437d12134
[Trainer/Deepspeed] handle get_last_lr() before first step() (#10362)
* handle get_last_lr() before first step()

* abstract away the lr getting logic

* cleanup

* add test

* move to utils
2021-02-23 17:42:25 -08:00
Akmal
23e87c27be
Fix broken examples/seq2seq/README.md markdown (#10344) 2021-02-23 10:49:25 -05:00
Stas Bekman
622a8c5995
[trainer] add Trainer methods for metrics logging and saving (#10266)
* make logging and saving trainer built-in

* Update src/transformers/trainer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-22 13:02:53 -08:00
Stas Bekman
eab0afc19c
[Trainer] implement gradient_accumulation_steps support in DeepSpeed integration (#10310)
* implement gradient_accumulation_steps support in DeepSpeed integration

* typo

* cleanup

* cleanup
2021-02-22 11:15:59 -08:00
Stas Bekman
f991daed18
defensive programming + expand/correct README (#10295) 2021-02-22 10:58:50 -08:00
Julien Plu
536aee99bb
Move the TF NER example (#10276) 2021-02-19 16:06:13 -05:00
Joe Davison
cbadb5243c
Zero shot distillation script cuda patch (#10284) 2021-02-19 14:06:57 -05:00
Joe Davison
c6fe17557e
Script for distilling zero-shot classifier to more efficient student (#10244)
* add zero-shot distillation script

* readme wordsmithing

* clean up code

* add multi-gpu teacher inference
plus tidying up more code

* add use_fast_tokenizer arg

* update results in readme

* more readme wordsmithing

* style

* Add handle to readme

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* fix code block

* add error+docs about distributed & tpu

* add @sgugger format requests

* xla -> tpu

* support fp16 for teacher preds

* no checkpoint by default

* add demo colab link

* add model sharing prompt + model link

* correct resulting acc of example

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-02-18 17:08:45 -05:00
Stas Bekman
97e688bc22
[Trainer] memory tracker metrics (#10225)
* memory tracker metrics

* go back to eval for somewhat consistency

* handle no-gpu case

* deal with stackable eval calls

* restore callback order

* style

* simplify the API

* add test

* docs

* consistently use eval_ prefix

* improve docs

* Update src/transformers/trainer_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* rename method

* style

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-18 09:27:32 -08:00
Stas Bekman
d1eb88f42d
[CI] 2 fixes (#10248)
* fix invalid port

* missing requirements
2021-02-17 14:12:39 -08:00
Zhang Cheng
df1b0fb54d
set tgt_lang of MBart Tokenizer for summarization (#10205) 2021-02-16 09:39:37 -05:00
Suraj Patil
1c8c2d9ab3
[WIP][examples/seq2seq] move old s2s scripts to legacy (#10136)
* move old s2s scripts to legacy

* add the tests back

* proper rename

* restore

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-15 10:48:02 -08:00
Stas Bekman
0b1f552a24
fix run_seq2seq.py; porting trainer tests to it (#10162)
* fix run_seq2seq.py; porting DeepSpeed tests to it

* unrefactor

* defensive programming

* defensive programming 2

* port the rest of the trainer tests

* style

* a cleaner scripts dir finder

* cleanup
2021-02-15 09:12:17 -08:00
Suraj Patil
f51188cbe7
[examples/run_s2s] remove task_specific_params and update rouge computation (#10133)
* fix rouge metrics and task specific params

* fix typo

* round metrics

* typo

* remove task_specific_params
2021-02-12 17:18:21 +05:30
Stas Bekman
b54cb0bd82
[DeepSpeed in notebooks] Jupyter + Colab (#10130)
* init devices/setup explicitly

* docs + test

* simplify

* cleanup

* cleanup

* cleanup

* correct the required dist setup

* derive local_rank from env LOCAL_RANK
2021-02-11 14:02:05 -08:00
Qbiwan
8dcfaea08d
Update run_xnli.py to use Datasets library (#9829)
* remove xnli_compute_metrics, add load_dataset, load_metric, set_seed,metric.compute,load_metric

* fix

* fix

* fix

* push

* fix

* everything works

* fix init

* fix

* special treatment for sepconv1d

* style

* 🙏🏽

* add doc and cleanup


* fix doc

* fix doc again

* fix doc again

* Apply suggestions from code review

* make style

* Proposal that should work

* Remove needless code

* Fix test

* Apply suggestions from code review

* remove xnli_compute_metrics, add load_dataset, load_metric, set_seed,metric.compute,load_metric

* amend README

* removed data_args.task_name and replaced with task_name = "xnli"; use split function to load train and validation dataset separately; remove __post_init__; remove flag --task_name from README.

* removed dict task_to_keys, use str "xnli" instead of variable task_name, change preprocess_function to use examples["premise"], examples["hypothesis"] directly, remove sentence1_key and sentence2_key, change compute_metrics function to cater only to accuracy metric, add condition for train_langauge is None when using dataset.load_dataset()

* removed `torch.distributed.barrier()` and `import torch` as `from_pretrained` is able to do the work; amend README
2021-02-11 10:27:23 +05:30
Stas Bekman
77b862847b
[DeepSpeed] restore memory for evaluation (#10114)
* free up memory at the end of train

* rework tests

* consistent formatting

* correction
2021-02-10 09:09:48 -08:00
Lysandre Debut
0d8e554d42
Line endings should be LF across repo and not CRLF (#10119) 2021-02-10 10:50:00 -05:00
Boris Dayma
7c7962ba89
doc: update W&B related doc (#10086)
* doc: update W&B related doc

* doc(wandb): mention report_to

* doc(wandb): commit suggestion

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* doc(wandb): fix typo

* doc(wandb): remove WANDB_DISABLED

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-09 14:47:52 -05:00
Suraj Patil
63fddcf69c
[examples/s2s] add test set predictions (#10085)
* add do_predict, pass eval_beams durig eval

* update help

* apply suggestions from code review
2021-02-09 20:41:41 +05:30
Stas Bekman
781220acab
transition to new tests dir (#10080) 2021-02-08 12:41:52 -08:00
Stas Bekman
322037e842
[trainer] deepspeed bug fixes and tests (#10039)
* deepspeed bug fixes and tests

* manual wrap?
2021-02-08 09:44:02 -08:00
Olivier
ece6c51458
[s2s examples] Replace -100 token ids with the tokenizer pad_id for compute_metrics (#10046)
* replace -100 token ids with the tokenizer pad_id for compute_metrics

* fixed typo for label_ids
2021-02-08 10:08:16 -05:00
Sylvain Gugger
b01483faa0
Truncate max length if needed in all examples (#10034) 2021-02-08 05:03:55 -05:00
Stas Bekman
24db8cc329
Can't mix --fp16 and --device cpu (#10041) 2021-02-07 17:54:20 -08:00