Lysandre
9853c5dd58
Development on v4.6.0dev0
2021-04-06 12:53:25 -04:00
Lysandre
4906a29f7f
Release v4.5.0
2021-04-06 12:37:47 -04:00
Hemil Desai
6ab7d1a429
Add Readme for language modeling scripts with accelerate ( #11073 )
2021-04-05 20:56:12 -04:00
Hemil Desai
b51b87c41d
Add examples/language_modeling/run_clm_no_trainer.py ( #11026 )
...
* Initial draft for clm no trainer
* Remove unwanted args
* Fix bug
* Update examples/language-modeling/run_clm_no_trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-04-05 12:27:52 -04:00
Stas Bekman
3d39226a51
s|Pretrained|PreTrained| ( #11048 )
2021-04-04 18:08:42 -07:00
versis
335c0ca35c
fixed typo: logging instead of logger ( #11025 )
2021-04-02 09:22:22 -04:00
Hemil Desai
838f83d84c
Add examples/language_modeling/run_mlm_no_trainer.py ( #11001 )
...
* Add initial script for finetuning MLM models with accelerate
* Add evaluation metric calculation
* Fix bugs
* Use no_grad on evaluation
* update script docstring
* Update examples/language-modeling/run_mlm_no_trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* PR feedback
* Fix CI failure
* Update examples/language-modeling/run_mlm_no_trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-31 18:49:45 -04:00
Sylvain Gugger
acc3bd9d2a
Enforce string-formatting with f-strings ( #10980 )
...
* First third
* Styling and fix mistake
* Quality
* All the rest
* Treat %s and %d
* typo
* Missing )
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-03-31 10:00:27 -04:00
WybeKoper
645f45c462
Fixed some typos and removed legacy url ( #10989 )
...
* Fixed typos
* Removed legacy colab notebook from readme
Co-authored-by: WybeKoper <WybeKoper@users.noreply.github.com>
2021-03-31 16:53:15 +05:30
Yih-Dar
e031162a6b
fix md file to avoid evaluation crash ( #10962 )
2021-03-30 21:26:22 +03:00
Philipp Schmid
3e09d813aa
[examples/s2s] added py7zr dep ( #10971 )
...
* added py7zr
* comment out check_min for sagemaker test
* added min version again
2021-03-30 23:17:12 +05:30
Stas Bekman
05c966f24b
[vulnerability] dep fix ( #10954 )
...
Fixes https://github.com/huggingface/transformers/security/dependabot/examples/research_projects/lxmert/requirements.txt/Pygments/open
@LysandreJik
2021-03-29 17:25:47 -04:00
Daniel Stancl
5057213bcc
Add examples/multiple-choice/run_swag_no_trainer.py ( #10934 )
...
* Initial commit
* Another bunch of updates
* make style quality + delete debug arg from bash script
* Use compute_metrics func
* Do a few fixes
* Add copyright
* Fix typos
2021-03-29 16:41:09 -04:00
Sylvain Gugger
4002f95eb6
Remove duplicate code
2021-03-29 15:27:12 -04:00
Daniel Stancl
d7b50ce469
Add examples/run_ner_no_trainer.py ( #10902 )
...
* Add NER example with accelerate library
* This commit contains the first (yet really unfinished)
version of a script showing how to train a HuggingFace model
with their new accelerate library.
* Fix metric calculation
* make style quality
* mv ner_no_trainer to token-classification dir
* Delete --debug flag from running script
* hf_datasets -> raw_datasets
* Make a few slight adjustments
* Add an informative comment + rewrite a help comment
* Change header
* Fix a few things
* Enforce to use fast tokenizers only
* DataCollatorWithPadding -> DataCollatorForTokenClassification
* Change bash script: python3 -> accelerate launch
* make style
* Add a few missing things (see below)
* Add a max-length padding to predictions and labels to
enable accelerate gather functionality
* Add PyTorch no trainer example to the example README.md
* Remove --do-train from args as being redundant for now
* DataCollatorWithPadding -> DataCollatorForTokenClassification
* Remove some obsolete args.do_train conditions from the script
* Delete --do_train from bash running script
* Delete use_slow_tokenizer from args
* Add unintentionally removed flag --label_all_tokens
* Delete --debug flag from running script
2021-03-29 15:11:23 -04:00
WybeKoper
ddea8771c6
Updated colab links in readme of examples ( #10932 )
...
Co-authored-by: WybeKoper <WybeKoper@users.noreply.github.com>
2021-03-29 08:47:09 -04:00
Bhadresh Savani
4f21e1ddd6
fixed filename ( #10939 )
2021-03-28 09:48:12 -07:00
Stas Bekman
3c27d246e5
[vulnerability] fix dependency ( #10914 )
...
this PR fixes https://github.com/huggingface/transformers/security/dependabot/examples/research_projects/lxmert/requirements.txt/PyYAML/open
2021-03-26 09:06:11 -04:00
Jethro Kuan
5f1491d3b3
run_glue_no_trainer: datasets -> raw_datasets ( #10898 )
...
Use the correct variable (raw_datasets) instead of the module (datasets)
where appropriate.
2021-03-25 08:28:17 -04:00
Bhadresh Savani
7ef40120a0
[Examples] Added predict stage and Updated Example Template ( #10868 )
...
* added predict stage
* added test keyword in exception message
* removed example specific saving predictions
* fixed f-string error
* removed extra line
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2021-03-23 10:37:59 -07:00
Eliza Szczechla
9f8fa4e973
Use DataCollatorForSeq2Seq in run_summarization in all cases ( #10856 )
...
Co-authored-by: Eliza <eliza@habanero.tiger.com.pl>
2021-03-22 15:05:39 -04:00
Boris Dayma
125ccead71
feat(wandb): logging and configuration improvements ( #10826 )
...
* feat: ensure unique artifact id
* feat: allow manual init
* fix: simplify reinit logic
* fix: no dropped value + immediate commits
* fix: wandb use in sagemaker
* docs: improve documentation and formatting
* fix: typos
* docs: improve formatting
2021-03-22 10:45:17 -04:00
Stas Bekman
8fb4671811
[vulnerability] in example deps fix ( #10817 )
...
Takes care of:
https://github.com/huggingface/transformers/security/dependabot/examples/research_projects/lxmert/requirements.txt/jinja2/open
@LysandreJik
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-03-22 09:05:24 -04:00
dependabot[bot]
dbfe379514
Bump jinja2 from 2.11.2 to 2.11.3 in /examples/research_projects/lxmert ( #10818 )
...
Bumps [jinja2](https://github.com/pallets/jinja) from 2.11.2 to 2.11.3.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/master/CHANGES.rst)
- [Commits](https://github.com/pallets/jinja/compare/2.11.2...2.11.3)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-03-22 08:54:50 -04:00
Qiushi Pan
29904a967b
Update FINE_TUNE_XLSR_WAV2VEC2.md ( #10849 )
...
Fix typo.
2021-03-22 07:58:59 -04:00
Patrick von Platen
0f226f78ce
push ( #10846 )
2021-03-22 10:32:21 +03:00
Suraj Patil
82b8d8c7b0
Update FINE_TUNE_XLSR_WAV2VEC2.md
2021-03-21 22:47:09 +05:30
Patrick von Platen
af6125ffdb
Update FINE_TUNE_XLSR_WAV2VEC2.md
2021-03-21 12:31:33 +03:00
Patrick von Platen
5aaf6e1460
small improvements for wav2vec2 info script ( #10829 )
2021-03-21 11:41:44 +03:00
Suraj Patil
68b55885ed
add doc for Local machine ( #10828 )
2021-03-21 13:25:34 +05:30
Julien Chaumond
1438c487df
wav2vec doc tweaks ( #10808 )
...
* wording/typos tweaks
* Make model upload instructions simpler
2021-03-19 12:48:54 -04:00
Patrick von Platen
b9570a813c
Update FINE_TUNE_XLSR_WAV2VEC2.md
2021-03-19 19:45:28 +03:00
Sylvain Gugger
946400fb68
Expand a bit the presentation of examples ( #10799 )
...
* Expand a bit the presentation of examples
* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Address review comments
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2021-03-19 10:06:08 -04:00
Bhadresh Savani
fd1d9f1ab8
[Example] Updating Question Answering examples for Predict Stage ( #10792 )
...
* added prediction stage and eval fix
* style correction
* removed extra lines
2021-03-19 09:42:17 -04:00
Patrick von Platen
e8968bd03a
[XLSR-Wav2Vec2 Info doc] Add a couple of lines ( #10806 )
...
* finish
* fix
* fix
* fix
* fix
2021-03-19 12:52:54 +03:00
Stas Bekman
427ea3fecb
addressing vulnerability report in research project deps ( #10802 )
...
Following up on a security alert:
https://github.com/huggingface/transformers/security/dependabot/examples/research_projects/lxmert/requirements.txt/Pillow/open
2021-03-18 22:02:10 -04:00
Patrick von Platen
2ae678229f
Update FINE_TUNE_XLSR_WAV2VEC2.md
2021-03-19 00:29:20 +03:00
Patrick von Platen
68a3215949
Update FINE_TUNE_XLSR_WAV2VEC2.md
2021-03-19 00:27:40 +03:00
Patrick von Platen
03df3fbcb4
Update FINE_TUNE_XLSR_WAV2VEC2.md
2021-03-19 00:26:49 +03:00
Patrick von Platen
e84adbed40
Add XLSR-Wav2Vec2 Fine-Tuning README.md ( #10786 )
...
* upload
* upload fine-tuning script
* improve
* adapt
* Apply suggestions from code review
* correct
* upload
* finalize
* remove @
* correct typos
2021-03-19 00:22:43 +03:00
Stas Bekman
9352b5151a
[examples/seq2seq/README.md] fix t5 examples ( #10734 )
...
* [examples/seq2seq] fix t5 examples
This PR:
* fixes T5 examples to include `--source_prefix` - it's **not** optional. If you give it a try you will see that you get 10x worse bleu scores w/o it. w/ `27.6849`, w/o `2.374`
* added a normal translation example w/o the peculiarities of MBart and T5
* reduces the default max samples to 50 so it's much faster to test quickly
summarization seems to be broken for t5 score-wise: https://github.com/huggingface/transformers/issues/10733
@sgugger
* specify explicitly the t5 models requiring the special handling
* one more
* update the t5 summarization example to use cnn_dailymail
* move max*samples into the top level README.md
* better wording
* better wording
2021-03-18 09:55:39 -07:00
Julien Chaumond
4f3e93cfaf
[file_utils] do not gobble certain kinds of requests.ConnectionError ( #10235 )
...
* do not gobble certain kinds of requests.ConnectionError
* Apply review comments
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2021-03-18 12:37:45 -04:00
Suraj Patil
5f19c07a70
add run_common_voice script ( #10767 )
...
* add initial script
* finish script
* add shell script example
* accept chars_to_ignore as cl arg
* align the script with other example scripts
* add torchaudio dep
2021-03-18 17:21:16 +05:30
Mohamed El-Geish
af8afdc88d
wav2vec2: support datasets other than LibriSpeech ( #10581 )
...
* wav2vec2: support datasets other than LibriSpeech
* Formatting run_asr.py to pass code quality test
* bundled orthography options and added verbose logs
* fixing a typo in timit fine-tuning script
* update comment for clarity
* resize_lm_head and load custom vocab from file
* adding a max_duration_in_seconds filter
* do not assign `duration_filter` lambda, use a def
* log untransliterated text as well
* fix base model for arabic
* fix duration filter when target_sr is not set
* drop duration_in_seconds when unneeded
* script for wav2vec2-large-lv60-timit-asr
* fix for "tha" in arabic corpus (huggingface#10581)
* adding more options to work with common_voice
* PR feedback (huggingface#10581)
* small README change
2021-03-18 10:20:26 +03:00
Stas Bekman
393739194e
[examples] document resuming ( #10776 )
...
* document resuming in examples
* fix
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* put trainer code last, adjust notes
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-17 12:48:35 -07:00
Stas Bekman
cd8c93f701
[DeepSpeed] improve checkpoint loading code plus tests ( #10760 )
...
* deepspeed checkpoint loading code plus tests
* style
* style
2021-03-17 10:22:58 -07:00
Cheng Li
c83fbc5f2d
[Deepspeed] Allow HF optimizer and scheduler to be passed to deepspeed ( #10464 )
...
* pass hf optimizer and scheduler to deepspeed if not specified in ds config
* pass hf optimizer and scheduler to deepspeed if not specified in ds config
* update
* make init_deepspeed support config dict
* fix docstring formatting
* clean up trainer's comments
* add new tests
* fix type
* composite argparse doesn't work
* style
* add a new test, rename others
* document new functionality
* complete tests, add docs
* style
* correct level
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* add new methods to the doc
* must tell DS we are using a non-native optimizer
* add protection against cpu_offload + HF optimizer combo
* fix the cli overrides
* sync docs + tests
* restore AdamW
* better docs
* need new version
* no longer needed
* remove outdated information
* refactor duplicated code
Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-16 15:51:09 -07:00
Lysandre
1b5ce1e63b
Development on v4.5.0dev0
2021-03-16 11:41:15 -04:00
Lysandre
c988db5af2
Release v4.4.0
2021-03-16 11:33:35 -04:00
Russell Klopfer
87d685b8a9
independent training / eval with local files ( #10710 )
...
* independent training / eval with local files
* remove redundant assert
2021-03-15 19:35:26 -04:00
Sylvain Gugger
4c379daf64
Add minimum version check in examples ( #10724 )
...
* Add minimum version check in examples
* Style
* No need for new line maybe?
* Add helpful comment
2021-03-15 19:29:54 -04:00
Joe Davison
966ba081c9
zero-shot pipeline multi_class -> multi_label ( #10727 )
2021-03-15 16:02:46 -06:00
Théo Matussière
6f840990a7
split seq2seq script into summarization & translation ( #10611 )
...
* split seq2seq script, update docs
* needless diff
* fix readme
* remove test diff
* s/summarization/translation
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* cr
* fix arguments & better mbart/t5 refs
* copyright
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* reword readme
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* s/summarization/translation
* short script names
* fix tests
* fix isort, include mbart doc
* delete old script, update tests
* automate source prefix
* automate source prefix for translation
* s/translation/trans
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* fix script name (short version)
* typos
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* exact parameter
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* remove superfluous source_prefix calls in docs
* rename scripts & warn for source prefix
* black
* flake8
Co-authored-by: theo <theo@matussie.re>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2021-03-15 09:11:42 -04:00
Stas Bekman
4c32f9f26e
AdamW is now supported by default ( #9624 )
2021-03-12 13:40:07 -08:00
Lysandre Debut
9fbb4cdc80
Specify minimum version for sacrebleu ( #10662 )
2021-03-11 13:45:06 -05:00
ArvidYin
27d9e05ce2
Update README.md ( #10647 )
...
correct spell error: 'nether'
2021-03-11 08:58:04 -05:00
Sylvain Gugger
efb5c0a453
Add new GLUE example with no Trainer. ( #10555 )
...
* Add new GLUE example with no Trainer.
* Style
* Address review comments
2021-03-10 09:29:19 -05:00
Allen Wang
6f52fce673
Fixes an issue in text-classification where MNLI eval/test datasets are not being preprocessed. ( #10621 )
...
* Fix MNLI tests
* Linter fix
2021-03-09 22:13:45 -05:00
Sylvain Gugger
0d909f6bd8
Fairscale FSDP fix model save ( #10596 )
...
* Hotfix fairscale FSDP
* Evaluation works
* Save on process zero
2021-03-09 14:42:07 -05:00
Stas Bekman
f284089ec4
[examples tests on multigpu] resolving require_torch_non_multi_gpu_but_fix_me ( #10561 )
...
* batch 1
* this is tpu
* deebert attempt
* the rest
2021-03-08 11:11:40 -08:00
Bhadresh Savani
dfd16af832
Added max_sample_ arguments ( #10551 )
...
* reverted changes of logging and saving metrics
* added max_sample arguments
* fixed code
* white space diff
* reformatting code
* reformatted code
2021-03-08 13:57:10 -05:00
Stas Bekman
917f104502
[examples tests] various fixes ( #10584 )
...
* fix sharded ddp enum
* test fixes
* stronger validation + apex breaks other tests
2021-03-08 10:28:44 -08:00
Stas Bekman
e6ce636e02
fix nltk lookup ( #10585 )
2021-03-07 22:09:58 -08:00
Stas Bekman
88a951e3cc
offline mode for firewalled envs ( #10407 )
...
* offline mode start
* add specific values
* fix fallback
* add test
* better values check and range
* test that actually works
* document the offline mode
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* more strict check
* cleaner test
* pt-only test
* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-05 17:27:48 -08:00
Patrick von Platen
395ffcd757
fix run seq2seq ( #10547 )
2021-03-05 18:17:12 +03:00
Sylvain Gugger
a5bd40b75c
Not always consider a local model a checkpoint in run_glue ( #10517 )
2021-03-04 11:11:39 -05:00
Sylvain Gugger
745ea78dcc
Revert "Not always consider a local model a checkpoint in run_glue"
...
This reverts commit f3660613bc.
2021-03-04 09:45:18 -05:00
Sylvain Gugger
f3660613bc
Not always consider a local model a checkpoint in run_glue
2021-03-04 09:44:02 -05:00
Patrick von Platen
0234de8418
Add Fine-Tuning for Wav2Vec2 ( #10145 )
...
* add encode labels function to tokenizer
* start adding finetuning
* init dropout
* upload
* correct convert script
* apply changes
* fix second typo
* make first dummy training run
* adapt convert script
* push config for comparison
* remove conf
* finish training
* adapt data collator
* add research folder
* update according to fairseq feedback
* some minor corrections
* refactor masking indices a bit
* some minor changes
* clean tokenizer
* finish clean-up
* remove previous logic
* update run script
* correct training
* finish changes
* finish model
* correct bug
* fix training a bit more
* add some tests
* finish gradient checkpointing
* finish example
* correct gradient checkpointing
* improve tokenization method
* revert changes in tokenizer
* revert general change
* adapt fine-tuning
* update
* save intermediate test
* Update README.md
* finish finetuning
* delete conversion script
* Update src/transformers/models/wav2vec2/configuration_wav2vec2.py
* Update src/transformers/models/wav2vec2/processing_wav2vec2.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* finish wav2vec2 script
* finish wav2vec2 fine-tuning
* finalize test
* correct test
* adapt tests
* finish
* remove test file
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-03-01 12:13:17 +03:00
Bhadresh Savani
aca6288ff4
updated logging and saving metrics ( #10436 )
...
* updated logging and saving metrics
* space removal
2021-02-27 09:53:44 -08:00
Stas Bekman
f52a15897b
[run_seq2seq.py] restore functionality: saving to test_generations.txt ( #10428 )
...
This PR restores the original functionality that for some reason was modified.
Fixes: https://github.com/huggingface/transformers/issues/10381
@sgugger
2021-02-27 08:21:50 -08:00
Stas Bekman
ee04b69822
[examples] better model example ( #10427 )
...
* refactors
* typo
2021-02-26 17:01:01 -08:00
Sylvain Gugger
17b6e0d474
Fix run_glue evaluation when model has a label correspondence ( #10401 )
2021-02-25 15:30:38 -05:00
Sylvain Gugger
9d14be5c20
Add support for ZeRO-2/3 and ZeRO-offload in fairscale ( #10354 )
...
* Add support for ZeRO-2/3 and ZeRO-offload in fairscale
* Quality
* Rework from review comments
* Add doc
* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Address review comments
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2021-02-25 11:07:53 -05:00
Patrick von Platen
cb38ffcc5e
[PretrainedFeatureExtractor] + Wav2Vec2FeatureExtractor, Wav2Vec2Processor, Wav2Vec2Tokenizer ( #10324 )
...
* push to show
* small improvement
* small improvement
* Update src/transformers/feature_extraction_utils.py
* Update src/transformers/feature_extraction_utils.py
* implement base
* add common tests
* make all tests pass for wav2vec2
* make padding work & add more tests
* finalize feature extractor utils
* add call method to feature extraction
* finalize feature processor
* finish tokenizer
* finish general processor design
* finish tests
* typo
* remove bogus file
* finish docstring
* add docs
* finish docs
* small fix
* correct docs
* save intermediate
* load changes
* apply changes
* apply changes to doc
* change tests
* apply surajs recommend
* final changes
* Apply suggestions from code review
* fix typo
* fix import
* correct docstring
2021-02-25 17:42:46 +03:00
Stas Bekman
3437d12134
[Trainer/Deepspeed] handle get_last_lr() before first step() ( #10362 )
...
* handle get_last_lr() before first step()
* abstract away the lr getting logic
* cleanup
* add test
* move to utils
2021-02-23 17:42:25 -08:00
Akmal
23e87c27be
Fix broken examples/seq2seq/README.md markdown ( #10344 )
2021-02-23 10:49:25 -05:00
Stas Bekman
622a8c5995
[trainer] add Trainer methods for metrics logging and saving ( #10266 )
...
* make logging and saving trainer built-in
* Update src/transformers/trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-22 13:02:53 -08:00
Stas Bekman
eab0afc19c
[Trainer] implement gradient_accumulation_steps support in DeepSpeed integration ( #10310 )
...
* implement gradient_accumulation_steps support in DeepSpeed integration
* typo
* cleanup
* cleanup
2021-02-22 11:15:59 -08:00
Stas Bekman
f991daed18
defensive programming + expand/correct README ( #10295 )
2021-02-22 10:58:50 -08:00
Julien Plu
536aee99bb
Move the TF NER example ( #10276 )
2021-02-19 16:06:13 -05:00
Joe Davison
cbadb5243c
Zero shot distillation script cuda patch ( #10284 )
2021-02-19 14:06:57 -05:00
Joe Davison
c6fe17557e
Script for distilling zero-shot classifier to more efficient student ( #10244 )
...
* add zero-shot distillation script
* readme wordsmithing
* clean up code
* add multi-gpu teacher inference
plus tidying up more code
* add use_fast_tokenizer arg
* update results in readme
* more readme wordsmithing
* style
* Add handle to readme
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* fix code block
* add error+docs about distributed & tpu
* add @sgugger format requests
* xla -> tpu
* support fp16 for teacher preds
* no checkpoint by default
* add demo colab link
* add model sharing prompt + model link
* correct resulting acc of example
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-02-18 17:08:45 -05:00
Stas Bekman
97e688bc22
[Trainer] memory tracker metrics ( #10225 )
...
* memory tracker metrics
* go back to eval for somewhat consistency
* handle no-gpu case
* deal with stackable eval calls
* restore callback order
* style
* simplify the API
* add test
* docs
* consistently use eval_ prefix
* improve docs
* Update src/transformers/trainer_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* rename method
* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-18 09:27:32 -08:00
Stas Bekman
d1eb88f42d
[CI] 2 fixes ( #10248 )
...
* fix invalid port
* missing requirements
2021-02-17 14:12:39 -08:00
Zhang Cheng
df1b0fb54d
set tgt_lang of MBart Tokenizer for summarization ( #10205 )
2021-02-16 09:39:37 -05:00
Suraj Patil
1c8c2d9ab3
[WIP][examples/seq2seq] move old s2s scripts to legacy ( #10136 )
...
* move old s2s scripts to legacy
* add the tests back
* proper rename
* restore
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-15 10:48:02 -08:00
Stas Bekman
0b1f552a24
fix run_seq2seq.py; porting trainer tests to it ( #10162 )
...
* fix run_seq2seq.py; porting DeepSpeed tests to it
* unrefactor
* defensive programming
* defensive programming 2
* port the rest of the trainer tests
* style
* a cleaner scripts dir finder
* cleanup
2021-02-15 09:12:17 -08:00
Suraj Patil
f51188cbe7
[examples/run_s2s] remove task_specific_params and update rouge computation ( #10133 )
...
* fix rouge metrics and task specific params
* fix typo
* round metrics
* typo
* remove task_specific_params
2021-02-12 17:18:21 +05:30
Stas Bekman
b54cb0bd82
[DeepSpeed in notebooks] Jupyter + Colab ( #10130 )
...
* init devices/setup explicitly
* docs + test
* simplify
* cleanup
* cleanup
* cleanup
* correct the required dist setup
* derive local_rank from env LOCAL_RANK
2021-02-11 14:02:05 -08:00
Qbiwan
8dcfaea08d
Update run_xnli.py to use Datasets library ( #9829 )
...
* remove xnli_compute_metrics, add load_dataset, load_metric, set_seed, metric.compute, load_metric
* fix
* fix
* fix
* push
* fix
* everything works
* fix init
* fix
* special treatment for sepconv1d
* style
* 🙏🏽
* add doc and cleanup
* fix doc
* fix doc again
* fix doc again
* Apply suggestions from code review
* make style
* Proposal that should work
* Remove needless code
* Fix test
* Apply suggestions from code review
* remove xnli_compute_metrics, add load_dataset, load_metric, set_seed, metric.compute, load_metric
* amend README
* removed data_args.task_name and replaced with task_name = "xnli"; use split function to load train and validation dataset separately; remove __post_init__; remove flag --task_name from README.
* removed dict task_to_keys, use str "xnli" instead of variable task_name, change preprocess_function to use examples["premise"], examples["hypothesis"] directly, remove sentence1_key and sentence2_key, change compute_metrics function to cater only to accuracy metric, add condition for train_language is None when using dataset.load_dataset()
* removed `torch.distributed.barrier()` and `import torch` as `from_pretrained` is able to do the work; amend README
2021-02-11 10:27:23 +05:30
Stas Bekman
77b862847b
[DeepSpeed] restore memory for evaluation ( #10114 )
...
* free up memory at the end of train
* rework tests
* consistent formatting
* correction
2021-02-10 09:09:48 -08:00
Lysandre Debut
0d8e554d42
Line endings should be LF across repo and not CRLF ( #10119 )
2021-02-10 10:50:00 -05:00
Boris Dayma
7c7962ba89
doc: update W&B related doc ( #10086 )
...
* doc: update W&B related doc
* doc(wandb): mention report_to
* doc(wandb): commit suggestion
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* doc(wandb): fix typo
* doc(wandb): remove WANDB_DISABLED
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-09 14:47:52 -05:00
Suraj Patil
63fddcf69c
[examples/s2s] add test set predictions ( #10085 )
...
* add do_predict, pass eval_beams during eval
* update help
* apply suggestions from code review
2021-02-09 20:41:41 +05:30
Stas Bekman
781220acab
transition to new tests dir ( #10080 )
2021-02-08 12:41:52 -08:00
Stas Bekman
322037e842
[trainer] deepspeed bug fixes and tests ( #10039 )
...
* deepspeed bug fixes and tests
* manual wrap?
2021-02-08 09:44:02 -08:00
Olivier
ece6c51458
[s2s examples] Replace -100 token ids with the tokenizer pad_id for compute_metrics ( #10046 )
...
* replace -100 token ids with the tokenizer pad_id for compute_metrics
* fixed typo for label_ids
2021-02-08 10:08:16 -05:00
Sylvain Gugger
b01483faa0
Truncate max length if needed in all examples ( #10034 )
2021-02-08 05:03:55 -05:00
Stas Bekman
24db8cc329
Can't mix --fp16 and --device cpu ( #10041 )
2021-02-07 17:54:20 -08:00