| VictorSanh | b247b0d880 | add train.py for distillation | 2019-08-28 02:12:47 +00:00 |
| VictorSanh | 780f183e55 | add requirements | 2019-08-28 01:39:52 +00:00 |
| VictorSanh | e424d2e45d | add README | 2019-08-28 01:10:10 +00:00 |
| VictorSanh | 1ae81e4aa1 | add dataset. distiller, utils | 2019-08-28 01:10:05 +00:00 |
| VictorSanh | 5d29f8e99b | fix bugs | 2019-08-28 00:57:16 +00:00 |
| VictorSanh | a8ad83040d | fix bugs | 2019-08-28 00:45:33 +00:00 |
| Shijie Wu | ca4baf8ca1 | Match order of casing in OSS XLM; Improve document; Clean up dependency | 2019-08-27 20:03:18 -04:00 |
| VictorSanh | 60c984da6c | fix bugs | 2019-08-27 22:25:55 +00:00 |
| VictorSanh | 42968138c8 | wip wouf | 2019-08-27 22:00:38 +00:00 |
| VictorSanh | 1d23240068 | wip | 2019-08-27 14:27:47 +00:00 |
| Thomas Wolf | d06c5a2a0a | Merge pull request #1120 from CrafterKolyan/patch-3: Change attention mask dtype to be bool. Fix #1119 | 2019-08-27 15:01:01 +02:00 |
| Thomas Wolf | edc5222fc3 | Merge pull request #1118 from CrafterKolyan/patch-2: Documentation fix #1117 | 2019-08-27 14:58:50 +02:00 |
| Thomas Wolf | 9cf298dfc1 | Merge pull request #1116 from CrafterKolyan/patch-1: Delete nonexistent parameter from documentation fix #1115 | 2019-08-27 14:56:43 +02:00 |
| thomwolf | 0d288727b8 | fix #1106 | 2019-08-27 14:50:22 +02:00 |
| thomwolf | 447afe9cdf | updating docstring for AutoModel | 2019-08-27 14:42:03 +02:00 |
| thomwolf | a175a9dc01 | add kwargs to base encode function | 2019-08-27 14:05:59 +02:00 |
| Nikolay Korolev | 53282b5bd0 | Change attention mask dtype to be bool. Fix #1119 | 2019-08-27 14:19:03 +03:00 |
| Nikolay Korolev | 26bda77225 | Fix documentation #1117: Rename parameter in documentation + Delete its second occurrence. | 2019-08-27 12:22:42 +03:00 |
| Nikolay Korolev | c8933bb2d9 | Delete nonexistent parameter from documentation: Changed documentation of GPT2Model, GPT2LMHeadModel and GPT2DoubleHeadsModel | 2019-08-27 12:10:36 +03:00 |
| LysandreJik | e08c01aa1a | fix #1102 | 2019-08-26 18:13:06 -04:00 |
| LysandreJik | 84a3a9689d | Pytorch Hub & AutoModels | 2019-08-26 16:08:43 -04:00 |
| LysandreJik | f68339639a | Tests for added AutoModels | 2019-08-26 16:02:23 -04:00 |
| LysandreJik | cb60ce59dd | Added multiple AutoModel classes: AutoModelWithLMHead, AutoModelForQuestionAnswering and AutoModelForSequenceClassification | 2019-08-26 15:44:30 -04:00 |
| LysandreJik | 529a16dec6 | Generic encoding implementation. | 2019-08-26 15:00:43 -04:00 |
| Shijie Wu | f1b018740c | Add use_lang_emb to config | 2019-08-23 20:33:01 -04:00 |
| Shijie Wu | e85123d398 | Add custom tokenizer for zh and ja | 2019-08-23 20:27:52 -04:00 |
| thomwolf | 06510ccb53 | typo | 2019-08-23 22:08:10 +02:00 |
| thomwolf | 3bcbebd440 | max_len_single_sentence & max_len_sentences_pair as attributes so they can be modified | 2019-08-23 22:07:26 +02:00 |
| Shijie Wu | 436ce07218 | Tokenization behave the same as original XLM proprocessing for most languages except zh, ja and th; Change API to allow specifying language in tokenize | 2019-08-23 14:40:17 -04:00 |
| thomwolf | ab7bd5ef98 | fixing tokenization and training | 2019-08-23 17:31:21 +02:00 |
| thomwolf | 47d6853439 | adding max_lengths for single sentences and sentences pairs | 2019-08-23 17:31:11 +02:00 |
| Thomas Wolf | df9d6effae | Merge pull request #1081 from huggingface/fix_distributed_barrier_hang: Fix distributed barrier hang | 2019-08-23 16:53:53 +02:00 |
| Thomas Wolf | 3f20dd7186 | Merge pull request #1075 from abhishekraok/modeling_utils_config_None: reraise EnvironmentError in modeling_utils.py | 2019-08-23 12:42:39 +02:00 |
| David Pollack | e13465fb8b | change layernorm code to pytorch's native layer norm | 2019-08-23 12:12:12 +02:00 |
| Abhishek Rao | c603d099aa | reraise EnvironmentError in from_pretrained functions of Model and Tokenizer | 2019-08-22 15:25:40 -07:00 |
| LysandreJik | 2ba1a14fb0 | Decode now calls private property instead of public method | 2019-08-22 17:25:55 -04:00 |
| Thomas Wolf | 90dcd8c05d | Merge branch 'master' into generative-finetuning | 2019-08-22 10:43:30 +02:00 |
| VictorSanh | 57272d5ddf | fix for glue | 2019-08-22 00:25:49 -04:00 |
| VictorSanh | b006a7a12f | fix for squad | 2019-08-22 00:25:42 -04:00 |
| Abhishek Rao | 14eef67eb2 | Fix at config rather than model | 2019-08-21 15:48:43 -07:00 |
| Abhishek Rao | 296df2b18c | reraise exception | 2019-08-21 15:29:30 -07:00 |
| Lysandre | 55f69a11b6 | OpenAI GPT tests now extend CommonTests | 2019-08-21 18:09:25 -04:00 |
| Lysandre | 47267ba556 | OpenAI GPT-2 now depends on CommonTests. | 2019-08-21 17:50:16 -04:00 |
| Lysandre | 034aa0c2d7 | Fixed GPT2DoubleHeadsModel example and weight tying | 2019-08-21 17:27:38 -04:00 |
| thomwolf | e00b4ff1de | fix #1017 | 2019-08-21 22:22:17 +02:00 |
| Lysandre | 814a3f4e01 | Removed attention_mask from GPT-2 and GPT documentation. Corrected multiple_choice_labels to actual name mc_labels | 2019-08-21 14:11:14 -04:00 |
| Lysandre | 2f9397139d | Added GPT-2 LARGE to Pre-trained Models documentation | 2019-08-21 11:29:37 -04:00 |
| Lysandre | d6bbcbc4cf | Added finetuning example to documentation | 2019-08-21 11:22:05 -04:00 |
| VictorSanh | 6f877d9daf | Update dev results on GLUE (bert-base-uncased) w/ median on 5 runs | 2019-08-21 03:43:29 +00:00 |
| Thomas Wolf | 07681b6b58 | Merge pull request #1064 from huggingface/gpt-2-large: Adding gpt-2 large (774M parameters) model | 2019-08-21 03:05:56 +02:00 |