Stas Bekman
|
8b38173398
|
[seq2seq testing] multigpu test run via subprocess (#7281)
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
|
2020-10-21 17:20:53 -04:00 |
|
Sam Shleifer
|
827c519494
|
[examples] bump pl=0.9.0 (#7053)
|
2020-10-11 16:39:38 -04:00 |
|
Sam Shleifer
|
d5d2744aa7
|
Support T5 Distillation w/hidden state supervision (#7599)
|
2020-10-05 21:31:48 -04:00 |
|
Suraj Patil
|
eab5f59682
|
[s2s] add create student script (#7290)
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
|
2020-09-27 15:10:46 -04:00 |
|
Stas Bekman
|
eadd870b2f
|
[seq2seq] make it easier to run the scripts (#7274)
|
2020-09-24 15:23:48 -04:00 |
|
Stas Bekman
|
7cbf0f722d
|
examples/seq2seq/__init__.py mutates sys.path (#7194)
|
2020-09-20 16:54:42 -04:00 |
|
Sam Shleifer
|
0fab39695a
|
[s2s distill] allow pegasus-12-12 (#7104)
|
2020-09-14 00:03:59 -04:00 |
|
Sam Shleifer
|
6078b12098
|
[s2s] distill: --normalize_hidden --supervise_forward (#6834)
|
2020-09-04 14:05:56 -04:00 |
|
Sam Shleifer
|
5a318f075a
|
[s2s]: script to convert pl checkpoints to hf checkpoints (#6911)
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
|
2020-09-03 09:47:00 -04:00 |
|
Sam Shleifer
|
b9772897ec
|
[s2s] command line args for faster val steps (#6833)
|
2020-08-31 16:16:10 -04:00 |
|
Sam Shleifer
|
9336086ab5
|
prepare_seq2seq_batch makes labels; decoder_input_ids are made later (#6654)
* broken test
* batch parity
* tests pass
* boom boom
* boom boom
* split out bart tokenizer tests
* fix tests
* boom boom
* Fixed dataset bug
* Fix marian
* Undo extra
* Get marian working
* Fix t5 tok tests
* Test passing
* Cleanup
* better assert msg
* require torch
* Fix mbart tests
* undo extra decoder_attn_mask change
* Fix import
* pegasus tokenizer can ignore src_lang kwargs
* unused kwarg test cov
* boom boom
* add todo for pegasus issue
* cover one word translation edge case
* Cleanup
* doc
|
2020-08-28 11:15:17 -04:00 |
|
Sam Shleifer
|
4bd7be9a42
|
s2s distillation uses AutoModelForSeqToSeqLM (#6761)
|
2020-08-26 23:25:11 -04:00 |
|
Lysandre
|
a75c64d80c
|
Black 20 release
|
2020-08-26 17:20:22 +02:00 |
|
Sam Shleifer
|
0344428f79
|
[s2s] round bleu, rouge to 4 digits (#6704)
|
2020-08-25 00:33:11 -04:00 |
|
Sylvain Gugger
|
a573777901
|
Update repo to isort v5 (#6686)
* Run new isort
* More changes
* Update CI, CONTRIBUTING and benchmarks
|
2020-08-24 11:03:01 -04:00 |
|
Sam Shleifer
|
84c265ffcc
|
[lightning_base] fix s2s logging, only make train_loader once (#6404)
|
2020-08-16 22:49:41 -04:00 |
|
Sam Shleifer
|
f94a52cd79
|
[s2s] add BartTranslationDistiller for distilling mBART (#6363)
|
2020-08-12 11:41:04 -04:00 |
|
Sam Shleifer
|
66fa8ceaea
|
PegasusForConditionalGeneration (torch version) (#6340)
Co-authored-by: Jingqing Zhang <jingqing.zhang15@imperial.ac.uk>
|
2020-08-11 14:31:23 -04:00 |
|
Sam Shleifer
|
09a2f40684
|
Seq2SeqDataset uses linecache to save memory by @Pradhy729 (#5792)
Co-authored-by: Pradhy729 <49659913+Pradhy729@users.noreply.github.com>
|
2020-07-18 13:57:33 -04:00 |
|
Sam Shleifer
|
dad5e12e54
|
[seq2seq] distillation.py accepts trainer arguments (#5865)
|
2020-07-18 07:43:57 -04:00 |
|
Sam Shleifer
|
45e26125de
|
save_pretrained: mkdir(exist_ok=True) (#5258)
* all save_pretrained methods mkdir if not os.path.exists
|
2020-06-28 14:53:47 -04:00 |
|
Sam Shleifer
|
40457bcebb
|
examples/seq2seq supports translation (#5202)
|
2020-06-24 23:58:11 -04:00 |
|