Commit Graph

14 Commits

Author SHA1 Message Date
Ben Mann
74f7906db4 Fix #537 2019-04-30 19:48:22 -07:00
thomwolf
929579f3b5 fix #497 2019-04-17 12:35:08 +02:00
thomwolf
5afa497cbf fix GPT-2 tokenization to work also on python 3... 2019-04-17 11:04:41 +02:00
thomwolf
bc70779bf0 fixed GPT-2 tokenization on python 2 2019-04-17 10:56:15 +02:00
thomwolf
18a8a15f78 improving GPT2 tokenization and adding tests 2019-04-16 17:00:55 +02:00
thomwolf
e8568a3b17 fixing tests 2019-04-15 12:55:38 +02:00
thomwolf
870b734bfd added tokenizers serialization tests 2019-04-15 12:03:56 +02:00
thomwolf
3e65f255dc add serialization semantics to tokenizers - fix transfo-xl tokenizer 2019-04-15 11:47:25 +02:00
thomwolf
5c85fc3977 fix typo - logger info 2019-03-06 10:05:21 +01:00
Catalin Voss
4a49c22584 Warn instead of raising in BERT and GPT-2 tokenizers as well, to allow for pre-caching of tokens 2019-03-05 12:31:45 -08:00
thomwolf
ab7f5d2943 simple 2019-02-18 11:33:54 +01:00
thomwolf
b450a7faf2 clean up tokenization - fix python 2 tests 2019-02-18 11:27:18 +01:00
thomwolf
d44db1145c update readme 2019-02-18 11:12:09 +01:00
thomwolf
ffd623823d adding gpt2 2019-02-17 23:38:51 +01:00