Sam Shleifer
ce37be9d94
[s2s] warn if --fp16 for torch 1.6 ( #6977 )
2020-09-06 20:41:29 -04:00
Sam Shleifer
a4fc0c80b1
[s2s] run_eval.py parses generate_kwargs ( #6948 )
2020-09-04 14:19:31 -04:00
Sam Shleifer
6078b12098
[s2s] distill: --normalize_hidden --supervise_forward ( #6834 )
2020-09-04 14:05:56 -04:00
Sam Shleifer
e95d262f25
[s2s] support early stopping based on loss, rather than rouge ( #6927 )
2020-09-03 17:31:35 -04:00
Sam Shleifer
207ed8cb78
[s2s] use --eval_beams command line arg ( #6926 )
2020-09-03 12:42:09 -04:00
Sam Shleifer
39ed68d597
[s2s] allow task_specific_params=summarization_xsum ( #6923 )
2020-09-03 11:11:40 -04:00
Sam Shleifer
5a318f075a
[s2s]: script to convert pl checkpoints to hf checkpoints ( #6911 )
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-09-03 09:47:00 -04:00
brett koonce
b8e4906c97
tweak tar command in readme ( #6919 )
2020-09-03 09:29:01 -04:00
Sam Shleifer
b9772897ec
[s2s] command line args for faster val steps ( #6833 )
2020-08-31 16:16:10 -04:00
Sam Shleifer
61b7ba93f5
Marian distill scripts + integration test ( #6799 )
2020-08-31 13:48:26 -04:00
Sam Shleifer
dfa10a41ba
[s2s README] Add more dataset download instructions ( #6737 )
2020-08-30 16:29:24 -04:00
Sam Shleifer
0f58903bb6
Pegasus finetune script: add --adafactor ( #6811 )
2020-08-29 17:43:32 -04:00
Sam Shleifer
ac47458a02
[s2s] round runtime in run_eval ( #6798 )
2020-08-29 17:36:31 -04:00
Sam Shleifer
5ab21b072f
[s2s] Test hub configs in self-scheduled CI ( #6809 )
2020-08-28 17:05:52 -04:00
Sam Shleifer
9336086ab5
prepare_seq2seq_batch makes labels; decoder_input_ids are made later ( #6654 )
* broken test
* batch parity
* tests pass
* boom boom
* boom boom
* split out bart tokenizer tests
* fix tests
* boom boom
* Fixed dataset bug
* Fix marian
* Undo extra
* Get marian working
* Fix t5 tok tests
* Test passing
* Cleanup
* better assert msg
* require torch
* Fix mbart tests
* undo extra decoder_attn_mask change
* Fix import
* pegasus tokenizer can ignore src_lang kwargs
* unused kwarg test cov
* boom boom
* add todo for pegasus issue
* cover one word translation edge case
* Cleanup
* doc
2020-08-28 11:15:17 -04:00
Sam Shleifer
fb78a90d6a
PL: --adafactor option ( #6776 )
2020-08-27 22:19:46 -04:00
Sam Shleifer
4bd7be9a42
s2s distillation uses AutoModelForSeqToSeqLM ( #6761 )
2020-08-26 23:25:11 -04:00
Sam Shleifer
61518e2df3
[s2s] run_eval.py QOL improvements and cleanup ( #6746 )
2020-08-26 18:59:20 -04:00
Lysandre
a75c64d80c
Black 20 release
2020-08-26 17:20:22 +02:00
Sam Shleifer
0344428f79
[s2s] round bleu, rouge to 4 digits ( #6704 )
2020-08-25 00:33:11 -04:00
Sylvain Gugger
a573777901
Update repo to isort v5 ( #6686 )
* Run new isort
* More changes
* Update CI, CONTRIBUTING and benchmarks
2020-08-24 11:03:01 -04:00
Sam Shleifer
d2da2cb232
allow spaces in bash args with "$@" ( #6521 )
2020-08-17 09:06:35 -04:00
Sam Shleifer
84c265ffcc
[lightning_base] fix s2s logging, only make train_loader once ( #6404 )
2020-08-16 22:49:41 -04:00
Sam Shleifer
72add6c98f
[s2s] docs, document desired filenames nicely ( #6525 )
2020-08-16 20:31:22 -04:00
Kyle Piira
2060181126
Fixes paths with spaces in seq2seq example ( #6493 )
2020-08-16 13:36:38 -04:00
Sam Shleifer
e92efcf728
Mult rouge by 100: standard units ( #6359 )
2020-08-13 12:15:54 -04:00
Sam Shleifer
f94a52cd79
[s2s] add BartTranslationDistiller for distilling mBART ( #6363 )
2020-08-12 11:41:04 -04:00
Stas Bekman
87b359439f
[test] replace capsys with the more refined CaptureStderr/CaptureStdout ( #6422 )
* replace capsys with the more refined CaptureStderr/CaptureStdout
* Update examples/seq2seq/test_seq2seq_examples.py
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-08-12 07:54:28 -04:00
Sam Shleifer
be1520d3a3
rename prepare_translation_batch -> prepare_seq2seq_batch ( #6103 )
2020-08-11 15:57:07 -04:00
Sam Shleifer
66fa8ceaea
PegasusForConditionalGeneration (torch version) ( #6340 )
Co-authored-by: Jingqing Zhang <jingqing.zhang15@imperial.ac.uk>
2020-08-11 14:31:23 -04:00
Stas Bekman
f6cb0f806e
[s2s] wmt download script use less ram ( #6405 )
2020-08-11 12:04:17 -04:00
Sam Shleifer
b9ecd92ee4
[s2s] Script to save wmt data to disk ( #6403 )
2020-08-10 22:49:39 -04:00
Stas Bekman
0830e79512
the test now works again ( #6371 )
2020-08-10 02:55:52 -04:00
Suraj Patil
9bed355449
[s2s] fix label_smoothed_nll_loss ( #6344 )
2020-08-08 04:21:12 -04:00
Sam Shleifer
99f73bcc71
[s2s] tiny QOL improvement: run_eval prints scores ( #6341 )
2020-08-08 02:45:55 -04:00
Stas Bekman
175cd45e13
fix the shuffle argument usage and the default ( #6307 )
2020-08-06 20:32:28 -04:00
Sam Shleifer
2804fff839
[s2s] Use prepare_translation_batch for Marian finetuning ( #6293 )
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-08-06 14:58:38 -04:00
Stas Bekman
376c02e9a9
[WIP] lightning_base: support --lr_scheduler with multiple possibilities ( #6232 )
* support --lr_scheduler with multiple possibilities
* correct the error message
* add a note about supported schedulers
* cleanup
* cleanup2
* needs the argument default
* style
* add another assert in the test
* implement requested changes
* cleanups
* fix relative import
* cleanup
2020-08-05 09:01:17 -04:00
Sam Shleifer
57eb1cb68d
[s2s] Document better mbart finetuning command ( #6229 )
* Document better MT command
* improve multigpu command
2020-08-03 18:22:31 -04:00
Sam Shleifer
b6b2f2270f
s2s: fix LR logging, remove some dead code. ( #6205 )
2020-08-03 10:36:26 -04:00
Stas Bekman
d8dbf3b75d
[s2s] clean up + doc ( #6184 )
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-08-01 14:51:07 -04:00
Sylvain Gugger
91cb95461e
Switch from return_tuple to return_dict ( #6138 )
* Switch from return_tuple to return_dict
* Fix test
* [WIP] Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleC… (#5614 )
* Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleChoice} models and tests
* AutoModels
Tiny tweaks
* Style
* Final changes before merge
* Re-order for simpler review
* Final fixes
* Addressing @sgugger's comments
* Test MultipleChoice
* Rework TF trainer (#6038 )
* Fully rework training/prediction loops
* fix method name
* Fix variable name
* Fix property name
* Fix scope
* Fix method name
* Fix tuple index
* Fix tuple index
* Fix indentation
* Fix variable name
* fix eval before log
* Add drop remainder for test dataset
* Fix step number + fix logging datetime
* fix eval loss value
* use global step instead of step + fix logging at step 0
* Fix logging datetime
* Fix global_step usage
* Fix breaking loop + logging datetime
* Fix step in prediction loop
* Fix step breaking
* Fix train/test loops
* Force TF at least 2.2 for the trainer
* Use assert_cardinality to facilitate the dataset size computation
* Log steps per epoch
* Make tfds compliant with TPU
* Make tfds compliant with TPU
* Use TF dataset enumerate instead of the Python one
* revert previous commit
* Fix data_dir
* Apply style
* rebase on master
* Address Sylvain's comments
* Address Sylvain's and Lysandre comments
* Trigger CI
* Remove unused import
* Switch from return_tuple to return_dict
* Fix test
* Add recent model
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Julien Plu <plu.julien@gmail.com>
2020-07-30 09:17:00 -04:00
Stas Bekman
3212b8850d
[s2s] add support for overriding config params ( #6149 )
2020-07-30 01:09:46 -04:00
Sam Shleifer
dafa296c95
[s2s] Delete useless method, log tokens_per_batch ( #6081 )
2020-07-28 11:24:23 -04:00
Stas Bekman
f0c70085c2
link to README.md ( #6068 )
* add a link to README.md
* Update README.md
2020-07-28 20:34:58 +08:00
Sam Shleifer
3c7fbf35a6
MBART: support summarization tasks where max_src_len > max_tgt_len ( #6003 )
* MBART: support summarization tasks
* fix test
* Style
* add tokenizer test
2020-07-28 08:18:11 -04:00
Sam Shleifer
7a68d40138
[s2s] Don't mention packed data in README ( #6079 )
2020-07-27 20:07:21 -04:00
Sam Shleifer
1e00ef681d
[s2s] don't document packing because it hurts performance ( #6077 )
2020-07-27 18:26:00 -04:00
Sam Shleifer
11792d7826
CL util to convert models to fp16 before upload ( #5953 )
2020-07-27 12:21:25 -04:00
Sam Shleifer
4302ace5bd
[pack_dataset] don't sort before packing, only pack train ( #5954 )
2020-07-27 12:14:23 -04:00