Thomas Wolf | 50e615f43d | Merge branch 'master' into improved_testing | 2019-08-30 13:40:35 +02:00
thomwolf | f8aace6bcd | update tokenizers to use self.XX_token_id instead of converting self.XX_token | 2019-08-30 13:39:52 +02:00
thomwolf | 4e6a3172ce | update roberta docstring as well | 2019-08-30 12:23:37 +02:00
thomwolf | 50e6daf83a | fix Roberta tokenizer __init__ | 2019-08-30 11:27:43 +02:00
thomwolf | 0517e7a1cb | Fix GPT2 and RoBERTa tokenizer to begin with a space - update Roberta tokenizer | 2019-08-30 11:23:49 +02:00
Thomas Wolf | 0ecfd17f49 | Merge pull request #987 from huggingface/generative-finetuning (Generative finetuning) | 2019-08-28 16:51:50 +02:00
LysandreJik | e08c01aa1a | fix #1102 | 2019-08-26 18:13:06 -04:00
thomwolf | 3bcbebd440 | max_len_single_sentence & max_len_sentences_pair as attributes so they can be modified | 2019-08-23 22:07:26 +02:00
thomwolf | 47d6853439 | adding max_lengths for single sentences and sentences pairs | 2019-08-23 17:31:11 +02:00
LysandreJik | 572dcfd1db | Doc | 2019-08-14 14:56:14 -04:00
LysandreJik | 39f426be65 | Added special tokens <pad> and <mask> to RoBERTa. | 2019-08-13 15:19:50 -04:00
LysandreJik | 22ac004a7c | Added documentation and changed parameters for special_tokens_sentences_pair. | 2019-08-12 15:13:53 -04:00
LysandreJik | 75d5f98fd2 | Roberta tokenization + fixed tests (py3 + py2). | 2019-08-09 15:02:13 -04:00
LysandreJik | 6c41a8f5dc | Encode and Decode are back in the superclass. They now handle sentence pairs special tokens. | 2019-08-08 18:20:32 -04:00
LysandreJik | 770043eea2 | Sentence-pair tasks handling. Using common tests on RoBERTa. Forced push to fix indentation. | 2019-08-07 12:53:19 -04:00
Julien Chaumond | cb9db101c7 | Python 2 must DIE | 2019-08-04 22:04:15 -04:00
Julien Chaumond | 05c083520a | [RoBERTa] model conversion, inference, tests 🔥 | 2019-08-04 21:39:21 -04:00