thomwolf | 8678ff8df5 | adding 17 and 100 xlm models | 2019-08-30 16:26:04 +02:00
thomwolf | 82462c5cba | Added option to setup pretrained tokenizer arguments | 2019-08-30 15:30:41 +02:00
Shijie Wu | ca4baf8ca1 | Match order of casing in OSS XLM; Improve document; Clean up dependency | 2019-08-27 20:03:18 -04:00
Shijie Wu | e85123d398 | Add custom tokenizer for zh and ja | 2019-08-23 20:27:52 -04:00
Shijie Wu | 436ce07218 | Tokenization behaves the same as original XLM preprocessing for most languages except zh, ja and th; Change API to allow specifying language in tokenize | 2019-08-23 14:40:17 -04:00
Guillem García Subies | 388e3251fa | Update tokenization_xlm.py | 2019-08-20 14:19:39 +02:00
Guillem García Subies | bfd75056b0 | Update tokenization_xlm.py | 2019-08-20 14:06:17 +02:00
LysandreJik | 22ac004a7c | Added documentation and changed parameters for special_tokens_sentences_pair. | 2019-08-12 15:13:53 -04:00
LysandreJik | 14e970c271 | Tokenization encode/decode class-based sequence handling | 2019-08-09 15:01:38 -04:00
thomwolf | 1849aa7d39 | update readme and pretrained model weight files | 2019-07-16 15:11:29 +02:00
thomwolf | 15d8b1266c | update tokenizer - update squad example for xlnet | 2019-07-15 17:30:42 +02:00
LysandreJik | 7fdbc47822 | Added the two CLM XLM pretrained checkpoints. Fixed file extensions for config/vocab/merges of XLM models. | 2019-07-10 19:37:24 -04:00
LysandreJik | dee3e45b93 | Fixed XLM weights conversion script. Added 5 new checkpoints for XLM. | 2019-07-10 19:04:21 -04:00
LysandreJik | f773faa258 | Fixed all links. Removed TPU. Changed CLI to Converting TF models. Many minor formatting adjustments. Added "TODO Lysandre filled" where necessary. | 2019-07-10 14:45:56 -04:00
thomwolf | d5481cbe1b | adding tests to examples - updating summary module - coverage update | 2019-07-09 15:29:42 +02:00
thomwolf | b19786985d | unified tokenizer api and serialization + tests | 2019-07-09 10:25:18 +02:00
thomwolf | 36bca545ff | tokenization abstract class - tests for examples | 2019-07-05 15:02:59 +02:00
thomwolf | 0bab55d5d5 | [BIG] name change | 2019-07-05 11:55:36 +02:00