Commit Graph

1460 Commits

Author SHA1 Message Date
Stas Bekman
88a951e3cc
offline mode for firewalled envs (#10407)
* offline mode start

* add specific values

* fix fallback

* add test

* better values check and range

* test that actually works

* document the offline mode

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* more strict check

* cleaner test

* pt-only test

* style

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-05 17:27:48 -08:00
Patrick von Platen
395ffcd757
fix run seq2seq (#10547) 2021-03-05 18:17:12 +03:00
Sylvain Gugger
a5bd40b75c
Not always consider a local model a checkpoint in run_glue (#10517) 2021-03-04 11:11:39 -05:00
Sylvain Gugger
745ea78dcc Revert "Not always consider a local model a checkpoint in run_glue"
This reverts commit f3660613bc.
2021-03-04 09:45:18 -05:00
Sylvain Gugger
f3660613bc Not always consider a local model a checkpoint in run_glue 2021-03-04 09:44:02 -05:00
Patrick von Platen
0234de8418
Add Fine-Tuning for Wav2Vec2 (#10145)
* add encode labels function to tokenizer

* start adding finetuning

* init dropout

* upload

* correct convert script

* apply changes

* fix second typo

* make first dummy training run

* adapt convert script

* push confg for comparison

* remove conf

* finish training

* adapt data collator

* add research folder

* update according to fairseq feedback

* some minor corrections

* refactor masking indices a bit

* some minor changes

* clean tokenizer

* finish clean-up

* remove previous logic

* update run script

* correct training

* finish changes

* finish model

* correct bug

* fix training a bit more

* add some tests

* finish gradient checkpointing

* finish example

* correct gradient checkpointing

* improve tokenization method

* revert changes in tokenizer

* revert general change

* adapt fine-tuning

* update

* save intermediate test

* Update README.md

* finish finetuning

* delete conversion script

* Update src/transformers/models/wav2vec2/configuration_wav2vec2.py

* Update src/transformers/models/wav2vec2/processing_wav2vec2.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* finish wav2vec2 script

* finish wav2vec2 fine-tuning

* finalize test

* correct test

* adapt tests

* finish

* remove test file

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-03-01 12:13:17 +03:00
Bhadresh Savani
aca6288ff4
updated logging and saving metrics (#10436)
* updated logging and saving metrics

* space removal
2021-02-27 09:53:44 -08:00
Stas Bekman
f52a15897b
[run_seq2seq.py] restore functionality: saving to test_generations.txt (#10428)
This PR restores the original functionality that for some reason was modified.

Fixes: https://github.com/huggingface/transformers/issues/10381

@sgugger
2021-02-27 08:21:50 -08:00
Stas Bekman
ee04b69822
[examples] better model example (#10427)
* refactors

* typo
2021-02-26 17:01:01 -08:00
Sylvain Gugger
17b6e0d474
Fix run_glue evaluation when model has a label correspondence (#10401) 2021-02-25 15:30:38 -05:00
Sylvain Gugger
9d14be5c20
Add support for ZeRO-2/3 and ZeRO-offload in fairscale (#10354)
* Ass support for ZeRO-2/3 and ZeRO-offload in fairscale

* Quality

* Rework from review comments

* Add doc

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Address review comments

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2021-02-25 11:07:53 -05:00
Patrick von Platen
cb38ffcc5e
[PretrainedFeatureExtractor] + Wav2Vec2FeatureExtractor, Wav2Vec2Processor, Wav2Vec2Tokenizer (#10324)
* push to show

* small improvement

* small improvement

* Update src/transformers/feature_extraction_utils.py

* Update src/transformers/feature_extraction_utils.py

* implement base

* add common tests

* make all tests pass for wav2vec2

* make padding work & add more tests

* finalize feature extractor utils

* add call method to feature extraction

* finalize feature processor

* finish tokenizer

* finish general processor design

* finish tests

* typo

* remove bogus file

* finish docstring

* add docs

* finish docs

* small fix

* correct docs

* save intermediate

* load changes

* apply changes

* apply changes to doc

* change tests

* apply surajs recommend

* final changes

* Apply suggestions from code review

* fix typo

* fix import

* correct docstring
2021-02-25 17:42:46 +03:00
Stas Bekman
3437d12134
[Trainer/Deepspeed] handle get_last_lr() before first step() (#10362)
* handle get_last_lr() before first step()

* abstract away the lr getting logic

* cleanup

* add test

* move to utils
2021-02-23 17:42:25 -08:00
Akmal
23e87c27be
Fix broken examples/seq2seq/README.md markdown (#10344) 2021-02-23 10:49:25 -05:00
Stas Bekman
622a8c5995
[trainer] add Trainer methods for metrics logging and saving (#10266)
* make logging and saving trainer built-in

* Update src/transformers/trainer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-22 13:02:53 -08:00
Stas Bekman
eab0afc19c
[Trainer] implement gradient_accumulation_steps support in DeepSpeed integration (#10310)
* implement gradient_accumulation_steps support in DeepSpeed integration

* typo

* cleanup

* cleanup
2021-02-22 11:15:59 -08:00
Stas Bekman
f991daed18
defensive programming + expand/correct README (#10295) 2021-02-22 10:58:50 -08:00
Julien Plu
536aee99bb
Move the TF NER example (#10276) 2021-02-19 16:06:13 -05:00
Joe Davison
cbadb5243c
Zero shot distillation script cuda patch (#10284) 2021-02-19 14:06:57 -05:00
Joe Davison
c6fe17557e
Script for distilling zero-shot classifier to more efficient student (#10244)
* add zero-shot distillation script

* readme wordsmithing

* clean up code

* add multi-gpu teacher inference
plus tidying up more code

* add use_fast_tokenizer arg

* update results in readme

* more readme wordsmithing

* style

* Add handle to readme

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* fix code block

* add error+docs about distributed & tpu

* add @sgugger format requests

* xla -> tpu

* support fp16 for teacher preds

* no checkpoint by default

* add demo colab link

* add model sharing prompt + model link

* correct resulting acc of example

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-02-18 17:08:45 -05:00
Stas Bekman
97e688bc22
[Trainer] memory tracker metrics (#10225)
* memory tracker metrics

* go back to eval for somewhat consistency

* handle no-gpu case

* deal with stackable eval calls

* restore callback order

* style

* simplify the API

* add test

* docs

* consistently use eval_ prefix

* improve docs

* Update src/transformers/trainer_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* rename method

* style

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-18 09:27:32 -08:00
Stas Bekman
d1eb88f42d
[CI] 2 fixes (#10248)
* fix invalid port

* missing requirements
2021-02-17 14:12:39 -08:00
Zhang Cheng
df1b0fb54d
set tgt_lang of MBart Tokenizer for summarization (#10205) 2021-02-16 09:39:37 -05:00
Suraj Patil
1c8c2d9ab3
[WIP][examples/seq2seq] move old s2s scripts to legacy (#10136)
* move old s2s scripts to legacy

* add the tests back

* proper rename

* restore

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-15 10:48:02 -08:00
Stas Bekman
0b1f552a24
fix run_seq2seq.py; porting trainer tests to it (#10162)
* fix run_seq2seq.py; porting DeepSpeed tests to it

* unrefactor

* defensive programming

* defensive programming 2

* port the rest of the trainer tests

* style

* a cleaner scripts dir finder

* cleanup
2021-02-15 09:12:17 -08:00
Suraj Patil
f51188cbe7
[examples/run_s2s] remove task_specific_params and update rouge computation (#10133)
* fix rouge metrics and task specific params

* fix typo

* round metrics

* typo

* remove task_specific_params
2021-02-12 17:18:21 +05:30
Stas Bekman
b54cb0bd82
[DeepSpeed in notebooks] Jupyter + Colab (#10130)
* init devices/setup explicitly

* docs + test

* simplify

* cleanup

* cleanup

* cleanup

* correct the required dist setup

* derive local_rank from env LOCAL_RANK
2021-02-11 14:02:05 -08:00
Qbiwan
8dcfaea08d
Update run_xnli.py to use Datasets library (#9829)
* remove xnli_compute_metrics, add load_dataset, load_metric, set_seed,metric.compute,load_metric

* fix

* fix

* fix

* push

* fix

* everything works

* fix init

* fix

* special treatment for sepconv1d

* style

* 🙏🏽

* add doc and cleanup


* fix doc

* fix doc again

* fix doc again

* Apply suggestions from code review

* make style

* Proposal that should work

* Remove needless code

* Fix test

* Apply suggestions from code review

* remove xnli_compute_metrics, add load_dataset, load_metric, set_seed,metric.compute,load_metric

* amend README

* removed data_args.task_name and replaced with task_name = "xnli"; use split function to load train and validation dataset separately; remove __post_init__; remove flag --task_name from README.

* removed dict task_to_keys, use str "xnli" instead of variable task_name, change preprocess_function to use examples["premise"], examples["hypothesis"] directly, remove sentence1_key and sentence2_key, change compute_metrics function to cater only to accuracy metric, add condition for train_langauge is None when using dataset.load_dataset()

* removed `torch.distributed.barrier()` and `import torch` as `from_pretrained` is able to do the work; amend README
2021-02-11 10:27:23 +05:30
Stas Bekman
77b862847b
[DeepSpeed] restore memory for evaluation (#10114)
* free up memory at the end of train

* rework tests

* consistent formatting

* correction
2021-02-10 09:09:48 -08:00
Lysandre Debut
0d8e554d42
Line endings should be LF across repo and not CRLF (#10119) 2021-02-10 10:50:00 -05:00
Boris Dayma
7c7962ba89
doc: update W&B related doc (#10086)
* doc: update W&B related doc

* doc(wandb): mention report_to

* doc(wandb): commit suggestion

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* doc(wandb): fix typo

* doc(wandb): remove WANDB_DISABLED

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-09 14:47:52 -05:00
Suraj Patil
63fddcf69c
[examples/s2s] add test set predictions (#10085)
* add do_predict, pass eval_beams durig eval

* update help

* apply suggestions from code review
2021-02-09 20:41:41 +05:30
Stas Bekman
781220acab
transition to new tests dir (#10080) 2021-02-08 12:41:52 -08:00
Stas Bekman
322037e842
[trainer] deepspeed bug fixes and tests (#10039)
* deepspeed bug fixes and tests

* manual wrap?
2021-02-08 09:44:02 -08:00
Olivier
ece6c51458
[s2s examples] Replace -100 token ids with the tokenizer pad_id for compute_metrics (#10046)
* replace -100 token ids with the tokenizer pad_id for compute_metrics

* fixed typo for label_ids
2021-02-08 10:08:16 -05:00
Sylvain Gugger
b01483faa0
Truncate max length if needed in all examples (#10034) 2021-02-08 05:03:55 -05:00
Stas Bekman
24db8cc329
Can't mix --fp16 and --device cpu (#10041) 2021-02-07 17:54:20 -08:00
Stas Bekman
769948fad2
json to jsonlines, and doc, and typo (#10043) 2021-02-07 17:51:34 -08:00
Stas Bekman
8ea412a86f
[examples] make run scripts executable (#10037)
* make executable

* make executable

* same for the template

* cleanup
2021-02-05 15:51:18 -08:00
Suraj Patil
1cd16512dc
[examples/seq2seq] support label smoothing (#9844)
* add prepare_decoder_input_ids_from_labels in s2s models

* support lbl smoothing and enc/emb freezing

* fix freezing

* use pad_token_id from config

* remove embed freezing and add warning

* prepare decoder_input_ids inside DataCollatorForSeq2Seq
2021-02-05 23:21:57 +05:30
Suraj Patil
bca0dd5ee3
[run_clm.py] fix getting extention 2021-02-03 20:14:42 +05:30
Stas Bekman
d55e10beab
[research proj] [lxmert] rm bleach dependency (#9970)
Looks like a vulnerability and it's not really used anywhere in the code, so just as well remove it completely from deps.
https://github.com/huggingface/transformers/security/dependabot/examples/research_projects/lxmert/requirements.txt/bleach/open
2021-02-03 05:24:40 -05:00
Patrick von Platen
538b3b4607
[Tokenizer Utils Base] Make pad function more flexible (#9928)
* change tokenizer requirement

* split line

* Correct typo from list to str

* improve style

* make other function pretty as well

* add comment

* correct typo

* add new test

* pass tests for tok without padding token

* Apply suggestions from code review
2021-02-02 10:35:27 +03:00
Sylvain Gugger
115d97dd2f
Remove subclass for sortish sampler (#9907)
* Remove subclass for sortish sampler

* Use old Seq2SeqTrainer in script

* Styling
2021-02-01 08:06:32 -05:00
wlhgtc
1682804ebd
Fit chinese wwm to new datasets (#9887)
* MOD: fit chinese wwm to new datasets

* MOD: move wwm to new folder

* MOD: formate code

* Styling

* MOD add param and recover trainer

Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
2021-02-01 03:37:59 -05:00
Stas Bekman
6bab83683b
fix logger format for non-main process (#9911) 2021-02-01 03:08:12 -05:00
Stas Bekman
6bf94bc0b6
correctly handle mt5 (#9879) 2021-01-29 08:11:22 -08:00
Sylvain Gugger
b4e559cfa1
Deprecate model_path in Trainer.train (#9854) 2021-01-28 08:32:46 -05:00
Sylvain Gugger
f2fabedbab
Setup logging with a stdout handler (#9816) 2021-01-27 03:39:11 -05:00
Yusuke Mori
059bb25817
Fix a bug in run_glue.py (#9812) (#9815) 2021-01-26 14:32:19 -05:00