78 Commits

Author SHA1 Message Date
Sam Shleifer
ce37be9d94
[s2s] warn if --fp16 for torch 1.6 (#6977) 2020-09-06 20:41:29 -04:00
Sam Shleifer
a4fc0c80b1
[s2s] run_eval.py parses generate_kwargs (#6948) 2020-09-04 14:19:31 -04:00
Sam Shleifer
6078b12098
[s2s] distill: --normalize_hidden --supervise_forward (#6834) 2020-09-04 14:05:56 -04:00
Sam Shleifer
e95d262f25
[s2s] support early stopping based on loss, rather than rouge (#6927) 2020-09-03 17:31:35 -04:00
Sam Shleifer
207ed8cb78
[s2s] use --eval_beams command line arg (#6926) 2020-09-03 12:42:09 -04:00
Sam Shleifer
39ed68d597
[s2s] allow task_specific_params=summarization_xsum (#6923) 2020-09-03 11:11:40 -04:00
Sam Shleifer
5a318f075a
[s2s]: script to convert pl checkpoints to hf checkpoints (#6911)
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-09-03 09:47:00 -04:00
brett koonce
b8e4906c97
tweak tar command in readme (#6919) 2020-09-03 09:29:01 -04:00
Sam Shleifer
b9772897ec
[s2s] command line args for faster val steps (#6833) 2020-08-31 16:16:10 -04:00
Sam Shleifer
61b7ba93f5
Marian distill scripts + integration test (#6799) 2020-08-31 13:48:26 -04:00
Sam Shleifer
dfa10a41ba
[s2s README] Add more dataset download instructions (#6737) 2020-08-30 16:29:24 -04:00
Sam Shleifer
0f58903bb6
Pegasus finetune script: add --adafactor (#6811) 2020-08-29 17:43:32 -04:00
Sam Shleifer
ac47458a02
[s2s] round runtime in run_eval (#6798) 2020-08-29 17:36:31 -04:00
Sam Shleifer
5ab21b072f
[s2s] Test hub configs in self-scheduled CI (#6809) 2020-08-28 17:05:52 -04:00
Sam Shleifer
9336086ab5
prepare_seq2seq_batch makes labels; decoder_input_ids are made later. (#6654)
* broken test

* batch parity

* tests pass

* boom boom

* boom boom

* split out bart tokenizer tests

* fix tests

* boom boom

* Fixed dataset bug

* Fix marian

* Undo extra

* Get marian working

* Fix t5 tok tests

* Test passing

* Cleanup

* better assert msg

* require torch

* Fix mbart tests

* undo extra decoder_attn_mask change

* Fix import

* pegasus tokenizer can ignore src_lang kwargs

* unused kwarg test cov

* boom boom

* add todo for pegasus issue

* cover one word translation edge case

* Cleanup

* doc
2020-08-28 11:15:17 -04:00
Sam Shleifer
fb78a90d6a
PL: --adafactor option (#6776) 2020-08-27 22:19:46 -04:00
Sam Shleifer
4bd7be9a42
s2s distillation uses AutoModelForSeqToSeqLM (#6761) 2020-08-26 23:25:11 -04:00
Sam Shleifer
61518e2df3
[s2s] run_eval.py QOL improvements and cleanup (#6746) 2020-08-26 18:59:20 -04:00
Lysandre
a75c64d80c
Black 20 release 2020-08-26 17:20:22 +02:00
Sam Shleifer
0344428f79
[s2s] round bleu, rouge to 4 digits (#6704) 2020-08-25 00:33:11 -04:00
Sylvain Gugger
a573777901
Update repo to isort v5 (#6686)
* Run new isort

* More changes

* Update CI, CONTRIBUTING and benchmarks
2020-08-24 11:03:01 -04:00
Sam Shleifer
d2da2cb232
allow spaces in bash args with "$@" (#6521) 2020-08-17 09:06:35 -04:00
Sam Shleifer
84c265ffcc
[lightning_base] fix s2s logging, only make train_loader once (#6404) 2020-08-16 22:49:41 -04:00
Sam Shleifer
72add6c98f
[s2s] docs, document desired filenames nicely (#6525) 2020-08-16 20:31:22 -04:00
Kyle Piira
2060181126
Fixes paths with spaces in seq2seq example (#6493) 2020-08-16 13:36:38 -04:00
Sam Shleifer
e92efcf728
Mult rouge by 100: standard units (#6359) 2020-08-13 12:15:54 -04:00
Sam Shleifer
f94a52cd79
[s2s] add BartTranslationDistiller for distilling mBART (#6363) 2020-08-12 11:41:04 -04:00
Stas Bekman
87b359439f
[test] replace capsys with the more refined CaptureStderr/CaptureStdout (#6422)
* replace capsys with the more refined CaptureStderr/CaptureStdout

* Update examples/seq2seq/test_seq2seq_examples.py

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-08-12 07:54:28 -04:00
Sam Shleifer
be1520d3a3
rename prepare_translation_batch -> prepare_seq2seq_batch (#6103) 2020-08-11 15:57:07 -04:00
Sam Shleifer
66fa8ceaea
PegasusForConditionalGeneration (torch version) (#6340)
Co-authored-by: Jingqing Zhang <jingqing.zhang15@imperial.ac.uk>
2020-08-11 14:31:23 -04:00
Stas Bekman
f6cb0f806e
[s2s] wmt download script use less ram (#6405) 2020-08-11 12:04:17 -04:00
Sam Shleifer
b9ecd92ee4
[s2s] Script to save wmt data to disk (#6403) 2020-08-10 22:49:39 -04:00
Stas Bekman
0830e79512
the test now works again (#6371) 2020-08-10 02:55:52 -04:00
Suraj Patil
9bed355449
[s2s] fix label_smoothed_nll_loss (#6344) 2020-08-08 04:21:12 -04:00
Sam Shleifer
99f73bcc71
[s2s] tiny QOL improvement: run_eval prints scores (#6341) 2020-08-08 02:45:55 -04:00
Stas Bekman
175cd45e13
fix the shuffle argument usage and the default (#6307) 2020-08-06 20:32:28 -04:00
Sam Shleifer
2804fff839
[s2s] Use prepare_translation_batch for Marian finetuning (#6293)
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-08-06 14:58:38 -04:00
Stas Bekman
376c02e9a9
[WIP] lightning_base: support --lr_scheduler with multiple possibilities (#6232)
* support --lr_scheduler with multiple possibilities

* correct the error message

* add a note about supported schedulers

* cleanup

* cleanup2

* needs the argument default

* style

* add another assert in the test

* implement requested changes

* cleanups

* fix relative import

* cleanup
2020-08-05 09:01:17 -04:00
Sam Shleifer
57eb1cb68d
[s2s] Document better mbart finetuning command (#6229)
* Document better MT command

* improve multigpu command
2020-08-03 18:22:31 -04:00
Sam Shleifer
b6b2f2270f
s2s: fix LR logging, remove some dead code. (#6205) 2020-08-03 10:36:26 -04:00
Stas Bekman
d8dbf3b75d
[s2s] clean up + doc (#6184)
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-08-01 14:51:07 -04:00
Sylvain Gugger
91cb95461e
Switch from return_tuple to return_dict (#6138)
* Switch from return_tuple to return_dict

* Fix test

* [WIP] Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleC… (#5614)

* Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleChoice} models and tests

* AutoModels


Tiny tweaks

* Style

* Final changes before merge

* Re-order for simpler review

* Final fixes

* Addressing @sgugger's comments

* Test MultipleChoice

* Rework TF trainer (#6038)

* Fully rework training/prediction loops

* fix method name

* Fix variable name

* Fix property name

* Fix scope

* Fix method name

* Fix tuple index

* Fix tuple index

* Fix indentation

* Fix variable name

* fix eval before log

* Add drop remainder for test dataset

* Fix step number + fix logging datetime

* fix eval loss value

* use global step instead of step + fix logging at step 0

* Fix logging datetime

* Fix global_step usage

* Fix breaking loop + logging datetime

* Fix step in prediction loop

* Fix step breaking

* Fix train/test loops

* Force TF at least 2.2 for the trainer

* Use assert_cardinality to facilitate the dataset size computation

* Log steps per epoch

* Make tfds compliant with TPU

* Make tfds compliant with TPU

* Use TF dataset enumerate instead of the Python one

* revert previous commit

* Fix data_dir

* Apply style

* rebase on master

* Address Sylvain's comments

* Address Sylvain's and Lysandre comments

* Trigger CI

* Remove unused import

* Switch from return_tuple to return_dict

* Fix test

* Add recent model

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Julien Plu <plu.julien@gmail.com>
2020-07-30 09:17:00 -04:00
Stas Bekman
3212b8850d
[s2s] add support for overriding config params (#6149) 2020-07-30 01:09:46 -04:00
Sam Shleifer
dafa296c95
[s2s] Delete useless method, log tokens_per_batch (#6081) 2020-07-28 11:24:23 -04:00
Stas Bekman
f0c70085c2
link to README.md (#6068)
* add a link to README.md

* Update README.md
2020-07-28 20:34:58 +08:00
Sam Shleifer
3c7fbf35a6
MBART: support summarization tasks where max_src_len > max_tgt_len (#6003)
* MBART: support summarization tasks

* fix test

* Style

* add tokenizer test
2020-07-28 08:18:11 -04:00
Sam Shleifer
7a68d40138
[s2s] Don't mention packed data in README (#6079) 2020-07-27 20:07:21 -04:00
Sam Shleifer
1e00ef681d
[s2s] don't document packing because it hurts performance (#6077) 2020-07-27 18:26:00 -04:00
Sam Shleifer
11792d7826
CL util to convert models to fp16 before upload (#5953) 2020-07-27 12:21:25 -04:00
Sam Shleifer
4302ace5bd
[pack_dataset] don't sort before packing, only pack train (#5954) 2020-07-27 12:14:23 -04:00