Suraj Patil
0a8c17d53c
[T5Tokenizer] remove prefix_tokens ( #7078 )
2020-09-11 14:18:45 -04:00
Patrick von Platen
7fd1febf38
Add "Leveraging Pretrained Checkpoints for Generation" Seq2Seq models. ( #6594 )
* add conversion script
* improve conversion script
* make style
* add tryout files
* fix
* update
* add causal bert
* better names
* add tokenizer file as well
* finish causal_bert
* fix small bugs
* improve generate
* change naming
* renaming
* renaming
* renaming
* remove leftover files
* clean files
* add fix tokenizer
* finalize
* correct slow test
* update docs
* small fixes
* fix link
* adapt check repo
* apply sams and sylvains recommendations
* fix import
* implement Lysandres recommendations
* fix logger warn
2020-09-10 16:40:51 +02:00
Stas Bekman
563485bf95
[tests] fix typos in inputs ( #6818 )
2020-08-30 18:19:57 +08:00
Sam Shleifer
3cac867fac
t5 model should make decoder_attention_mask ( #6800 )
2020-08-28 15:22:33 -04:00
Sam Shleifer
9336086ab5
prepare_seq2seq_batch makes labels/ decoder_input_ids made later. ( #6654 )
* broken test
* batch parity
* tests pass
* boom boom
* boom boom
* split out bart tokenizer tests
* fix tests
* boom boom
* Fixed dataset bug
* Fix marian
* Undo extra
* Get marian working
* Fix t5 tok tests
* Test passing
* Cleanup
* better assert msg
* require torch
* Fix mbart tests
* undo extra decoder_attn_mask change
* Fix import
* pegasus tokenizer can ignore src_lang kwargs
* unused kwarg test cov
* boom boom
* add todo for pegasus issue
* cover one word translation edge case
* Cleanup
* doc
2020-08-28 11:15:17 -04:00
Lysandre
a75c64d80c
Black 20 release
2020-08-26 17:20:22 +02:00
Sam Shleifer
624495706c
T5Tokenizer adds EOS token if not already added ( #5866 )
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-08-25 14:56:08 -04:00
Suraj Patil
407da12ef1
[T5Tokenizer] add prepare_seq2seq_batch method ( #6122 )
* tests
2020-08-17 13:57:19 -04:00
Sam Shleifer
07dd7c2fd8
[cleanup] test_tokenization_common.py ( #4390 )
2020-05-19 10:46:55 -04:00
Julien Chaumond
83a41d39b3
💄 super
2020-01-15 18:33:50 -05:00
alberduris
81d6841b4b
GPU text generation: Moved the encoded_prompt to correct device
2020-01-06 15:11:12 +01:00
alberduris
dd4df80f0b
Moved the encoded_prompts to correct device
2020-01-06 15:11:12 +01:00
Aymeric Augustin
c824d15aa1
Remove __future__ imports.
2019-12-22 17:47:54 +01:00
Aymeric Augustin
00204f2b4c
Replace CommonTestCases for tokenizers with a mixin.
This is the same change as for (TF)CommonTestCases for modeling.
2019-12-22 15:35:25 +01:00
Aymeric Augustin
a3c5883f2c
Rename file for consistency.
2019-12-22 15:35:25 +01:00
Aymeric Augustin
7e98e211f0
Remove unittest.main() in test modules.
This construct isn't used anymore these days.
Running python tests/test_foo.py puts the tests/ directory on
PYTHONPATH, which isn't representative of how we run tests.
Use python -m unittest tests/test_foo.py instead.
2019-12-22 14:42:03 +01:00
Aymeric Augustin
ced0a94204
Switch test files to the standard test_*.py scheme.
2019-12-22 14:15:13 +01:00