Commit Graph

24 Commits

Author SHA1 Message Date
thomwolf
fecaed0ed4 add force_download option to from_pretrained methods 2019-08-20 10:56:12 +02:00
LysandreJik
572dcfd1db Doc 2019-08-14 14:56:14 -04:00
LysandreJik
3d87991f60 Fixed error with encoding 2019-08-13 12:00:24 -04:00
LysandreJik
22ac004a7c Added documentation and changed parameters for special_tokens_sentences_pair. 2019-08-12 15:13:53 -04:00
LysandreJik
14e970c271 Tokenization encode/decode class-based sequence handling 2019-08-09 15:01:38 -04:00
LysandreJik
6c41a8f5dc Encode and Decode are back in the superclass. They now handle sentence pairs special tokens. 2019-08-08 18:20:32 -04:00
Thomas Wolf
d43dc48b34 Merge branch 'master' into auto_models 2019-08-05 19:17:35 +02:00
thomwolf
328afb7097 cleaning up tokenizer tests structure (at last) - last remaining ppb refs 2019-08-05 14:08:56 +02:00
thomwolf
00132b7a7a updating docs - adding few tests to tokenizers 2019-08-04 22:42:55 +02:00
thomwolf
009273dbdd big doc update [WIP] 2019-08-04 12:14:57 +02:00
thomwolf
c717d38573 dictionnary => dictionary 2019-07-26 23:30:48 +02:00
thomwolf
7b6e474c9a fix #901 2019-07-26 21:26:44 +02:00
thomwolf
27b0f86d36 clean up pretrained 2019-07-26 17:09:21 +02:00
Joel Grus
ae152cec09 make save_pretrained work with added tokens 2019-07-24 16:54:48 -07:00
right now it's dumping the *decoder* when it should be dumping the *encoder*. this fixes that.
thomwolf
1849aa7d39 update readme and pretrained model weight files 2019-07-16 15:11:29 +02:00
thomwolf
1b35d05d4b update conversion scripts and __main__ 2019-07-16 09:41:55 +02:00
thomwolf
15d8b1266c update tokenizer - update squad example for xlnet 2019-07-15 17:30:42 +02:00
thomwolf
7d4b200e40 good quality generation example for GPT, GPT-2, Transfo-XL, XLNet 2019-07-13 15:25:03 +02:00
thomwolf
d5481cbe1b adding tests to examples - updating summary module - coverage update 2019-07-09 15:29:42 +02:00
thomwolf
c079d7ddff fix python 2 tests 2019-07-09 10:40:59 +02:00
thomwolf
b19786985d unified tokenizer api and serialization + tests 2019-07-09 10:25:18 +02:00
thomwolf
1113f97f33 clean up glue example 2019-07-05 16:31:13 +02:00
thomwolf
6dacc79d39 fix python2 tests 2019-07-05 15:11:59 +02:00
thomwolf
36bca545ff tokenization abstract class - tests for examples 2019-07-05 15:02:59 +02:00