Sam Shleifer
|
9336086ab5
|
prepare_seq2seq_batch makes labels/ decoder_input_ids made later. (#6654)
* broken test
* batch parity
* tests pass
* boom boom
* boom boom
* split out bart tokenizer tests
* fix tests
* boom boom
* Fixed dataset bug
* Fix marian
* Undo extra
* Get marian working
* Fix t5 tok tests
* Test passing
* Cleanup
* better assert msg
* require torch
* Fix mbart tests
* undo extra decoder_attn_mask change
* Fix import
* pegasus tokenizer can ignore src_lang kwargs
* unused kwarg test cov
* boom boom
* add todo for pegasus issue
* cover one word translation edge case
* Cleanup
* doc
|
2020-08-28 11:15:17 -04:00 |
|
Lysandre
|
a75c64d80c
|
Black 20 release
|
2020-08-26 17:20:22 +02:00 |
|
Sam Shleifer
|
be1520d3a3
|
rename prepare_translation_batch -> prepare_seq2seq_batch (#6103)
|
2020-08-11 15:57:07 -04:00 |
|
Sam Shleifer
|
5abe50381a
|
Fix #6096: MBartTokenizer's mask token (#6098)
|
2020-07-28 18:27:58 -04:00 |
|
Sam Shleifer
|
3c7fbf35a6
|
MBART: support summarization tasks where max_src_len > max_tgt_len (#6003)
* MBART: support summarization tasks
* fix test
* Style
* add tokenizer test
|
2020-07-28 08:18:11 -04:00 |
|
Sam Shleifer
|
9827d666eb
|
MbartTokenizer: do not hardcode vocab size (#5998)
|
2020-07-23 15:41:14 -04:00 |
|
Sam Shleifer
|
353b8f1e7a
|
Add mbart-large-cc25, support translation finetuning (#5129)
* improve unittests for finetuning, especially w.r.t testing frozen parameters
* fix freeze_embeds for T5
* add streamlit setup.cfg
|
2020-07-07 13:23:01 -04:00 |
|