Author | Commit | Message | Date
Stefan Schweter | a1c34bd286 | distillation: fix ModuleNotFoundError in token counts script | 2019-08-31 12:21:38 +02:00
Thomas Wolf | 51e980ce36 | Merge pull request #1155 from anhnt170489/apex_fp16: Update apex fp16 implementation | 2019-08-30 23:29:11 +02:00
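The apex fp16 work merged here replaces the older FP16_Optimizer pattern with NVIDIA apex's amp API. A minimal sketch of that pattern, with a toy model and hyperparameters standing in for the real training script (not the PR's actual code):

```python
import torch
from torch import nn
from apex import amp  # NVIDIA apex, from the pre-native-AMP era

model = nn.Linear(10, 2).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

# opt_level "O1": patch ops to run in fp16 where safe, keep fp32 master weights
model, optimizer = amp.initialize(model, optimizer, opt_level="O1")

for step in range(10):
    inputs = torch.randn(8, 10, device="cuda")
    targets = torch.randint(0, 2, (8,), device="cuda")
    loss = nn.functional.cross_entropy(model(inputs), targets)
    # Scale the loss so fp16 gradients do not underflow, then backprop
    with amp.scale_loss(loss, optimizer) as scaled_loss:
        scaled_loss.backward()
    # Clip the fp32 master gradients, not the raw fp16 copies
    torch.nn.utils.clip_grad_norm_(amp.master_params(optimizer), 1.0)
    optimizer.step()
    optimizer.zero_grad()
```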
VictorSanh | 282c276e09 | typos + file name coherence in distillation README | 2019-08-30 12:02:29 -04:00
VictorSanh | 803c1cc4ea | fix relative import bug, cf. Issue #1140 | 2019-08-30 12:01:27 -04:00
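As background (a hypothetical illustration, not the actual diff in 803c1cc4ea): relative-import errors like the one in Issue #1140 typically appear when an example script is executed directly, making it `__main__`, so package-relative imports cannot resolve:

```python
# Hypothetical sketch of the failure mode behind Issue #1140, not this commit's diff.
# Running "python examples/distillation/train.py" makes the file __main__, so:
#   from .utils import init_gpu_params   # fails: no parent package when run directly
# The usual fix is a plain import of the sibling module, which resolves via the
# script's own directory on sys.path:
from utils import init_gpu_params  # hypothetical sibling module and function
```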
Thomas Wolf | 0a2fecdf90 | Merge branch 'master' into master | 2019-08-30 16:30:08 +02:00
Rabeeh KARIMI | 39eb31e11e | remove tokenizer reloading during training; add it to the evaluation part | 2019-08-30 15:44:41 +02:00
Rabeeh KARIMI | 350bb6bffa | updated tokenizer loading to address reproducibility issues | 2019-08-30 15:34:28 +02:00
Thomas Wolf | 01ad55f8cf | Merge pull request #1026 from rabeehk/master: loads the tokenizer for each checkpoint, to solve the reproducability… | 2019-08-30 14:15:36 +02:00
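The thrust of #1026: when evaluating a series of saved checkpoints, reload the tokenizer from each checkpoint directory rather than reusing the one built at training start. A minimal sketch under the pytorch-transformers API of the era (paths are placeholders):

```python
from pytorch_transformers import BertForSequenceClassification, BertTokenizer

# Placeholder checkpoint paths; the example scripts enumerate these with glob.
checkpoints = ["output/checkpoint-500", "output/checkpoint-1000"]
for checkpoint in checkpoints:
    # Reload model AND tokenizer from the same directory, so evaluation uses
    # exactly the vocabulary/casing state saved with that checkpoint.
    model = BertForSequenceClassification.from_pretrained(checkpoint)
    tokenizer = BertTokenizer.from_pretrained(checkpoint)
    # ... evaluate(model, tokenizer) on the dev set here ...
```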
jamin | 2fb9a934b4 | re-format | 2019-08-30 14:05:28 +09:00
jamin | c8731b9583 | update apex fp16 implementation | 2019-08-30 13:54:00 +09:00
LysandreJik | caf1d116a6 | Closing bracket in DistilBERT's token count. | 2019-08-29 15:30:10 -04:00
Luis | fe8fb10b44 | Small modification of comment in the run_glue.py example: add RoBERTa to the comment, as it was not explicit that RoBERTa doesn't use token_type_ids. | 2019-08-29 14:43:30 +02:00
LysandreJik | bf3dc778b8 | Changed learning rate for run_squad test | 2019-08-28 18:24:43 -04:00
Andreas Daiminger | 1d15a7f278 | swap order of optimizer.step() and scheduler.step() | 2019-08-28 19:18:27 +02:00
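The ordering this commit (and the later "swap optimizer.step and scheduler.step" commits) enforces is the one PyTorch >= 1.1 expects: optimizer.step() first, then scheduler.step(); the reverse order skips the first learning-rate value and raises a UserWarning. A minimal sketch:

```python
import torch
from torch.optim.lr_scheduler import LambdaLR

model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = LambdaLR(optimizer, lr_lambda=lambda step: 0.95 ** step)

for step in range(100):
    loss = model(torch.randn(2, 4)).pow(2).mean()
    loss.backward()
    optimizer.step()    # update weights first (PyTorch >= 1.1 contract)
    scheduler.step()    # then advance the learning-rate schedule
    optimizer.zero_grad()
```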
Thomas Wolf | 0ecfd17f49 | Merge pull request #987 from huggingface/generative-finetuning: Generative finetuning | 2019-08-28 16:51:50 +02:00
thomwolf | b5eb283aaa | update credits | 2019-08-28 16:36:55 +02:00
thomwolf | 912a377e90 | dilbert -> distilbert | 2019-08-28 13:59:42 +02:00
thomwolf | 4ce5f36f78 | update readmes | 2019-08-28 12:14:31 +02:00
VictorSanh | 93e82ab424 | Write README for DilBERT | 2019-08-28 06:26:09 +00:00
VictorSanh | fea921d382 | add licensing | 2019-08-28 04:45:39 +00:00
VictorSanh | da1e4e53fc | some fixes in train.py for loading previous checkpoint | 2019-08-28 04:01:03 +00:00
VictorSanh | 0d8f8848d5 | add scripts/extract_for_distil.py | 2019-08-28 04:00:19 +00:00
VictorSanh | 7f2c384c80 | add scripts/token_counts.py | 2019-08-28 04:00:03 +00:00
VictorSanh | 4d16b279e5 | add scripts/binarized_data.py | 2019-08-28 03:59:48 +00:00
VictorSanh | b247b0d880 | add train.py for distillation | 2019-08-28 02:12:47 +00:00
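As background for what a distillation train.py optimizes (this is the standard Hinton-style objective, not necessarily the exact loss mix in this commit): the student is trained to match temperature-softened teacher logits via a KL term, combined with the ordinary hard-label loss.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    # Soft targets: KL between temperature-softened distributions.
    # The T**2 factor keeps gradient magnitudes comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    # Hard targets: ordinary cross-entropy against the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```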
VictorSanh | 780f183e55 | add requirements | 2019-08-28 01:39:52 +00:00
VictorSanh | e424d2e45d | add README | 2019-08-28 01:10:10 +00:00
VictorSanh | 1ae81e4aa1 | add dataset, distiller, utils | 2019-08-28 01:10:05 +00:00
thomwolf | 06510ccb53 | typo | 2019-08-23 22:08:10 +02:00
thomwolf | ab7bd5ef98 | fixing tokenization and training | 2019-08-23 17:31:21 +02:00
Thomas Wolf | 90dcd8c05d | Merge branch 'master' into generative-finetuning | 2019-08-22 10:43:30 +02:00
VictorSanh | 57272d5ddf | fix for glue | 2019-08-22 00:25:49 -04:00
VictorSanh | b006a7a12f | fix for squad | 2019-08-22 00:25:42 -04:00
Thomas Wolf | 9beaa85b07 | Merge pull request #1055 from qipeng/run_squad_fix: Fix #1015 (tokenizer defaults to use_lower_case=True when loading from trained models) | 2019-08-21 01:20:46 +02:00
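What #1055 guards against, sketched under the era's pytorch-transformers API (paths and flag values are placeholders, not the PR's diff): when a tokenizer is reloaded from a fine-tuned model directory, the casing flag must be passed through again rather than left to default.

```python
from pytorch_transformers import BertTokenizer

# Buggy pattern: reloading from a trained-model directory can silently fall
# back to do_lower_case=True even if training used a cased model.
tokenizer = BertTokenizer.from_pretrained("output_dir")

# Safer pattern: re-assert the casing choice that was used during training.
tokenizer = BertTokenizer.from_pretrained("output_dir", do_lower_case=False)
```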
Lysandre | 2d042274ac | Sequence special token handling for BERT and RoBERTa | 2019-08-20 14:15:28 -04:00
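For reference, the single- and pair-sequence layouts being standardized here, shown via today's transformers API as an assumption (the 2019 code used add_special_tokens_* helpers):

```python
from transformers import BertTokenizer, RobertaTokenizer  # current package name, assumed

bert = BertTokenizer.from_pretrained("bert-base-uncased")
roberta = RobertaTokenizer.from_pretrained("roberta-base")

for tok in (bert, roberta):
    a = tok.convert_tokens_to_ids(tok.tokenize("first sentence"))
    b = tok.convert_tokens_to_ids(tok.tokenize("second sentence"))
    # BERT yields    [CLS] A [SEP] B [SEP]
    # RoBERTa yields <s> A </s></s> B </s>   (note the doubled separator)
    print(tok.build_inputs_with_special_tokens(a, b))
```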
Peng Qi | 3bffd2e8e5 | more fixes | 2019-08-20 10:59:28 -07:00
Thomas Wolf | 3b56427a1e | Merge pull request #1040 from FeiWang96/multi_gpu: Fix bug of multi-gpu training in lm finetuning | 2019-08-20 17:13:44 +02:00
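The multi-GPU bug here (and in the "Fix: save model/model.module" commits below) is the classic one: torch.nn.DataParallel wraps the model, so saving the wrapper's state_dict produces "module."-prefixed keys that a plain model cannot load. The usual repair, as a sketch with a toy model:

```python
import torch
from torch import nn

model = nn.Linear(4, 2)
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)  # wraps the real model under .module

# Unwrap before saving so the checkpoint has no "module." key prefix.
model_to_save = model.module if hasattr(model, "module") else model
torch.save(model_to_save.state_dict(), "pytorch_model.bin")
```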
thomwolf | a690edab17 | various fixes and clean-up on run_lm_finetuning | 2019-08-20 15:52:12 +02:00
Duzeyao | d86b49ac86 | swap optimizer.step and scheduler.step | 2019-08-20 16:46:34 +08:00
Duzeyao | 45ab8bf60e | Revert "Update finetune_on_pregenerated.py" (reverts commit a1359b970c) | 2019-08-20 16:40:39 +08:00
Zeyao Du | a1359b970c | Update finetune_on_pregenerated.py | 2019-08-20 16:00:07 +08:00
Zeyao Du | 28f7ca1f80 | swap optimizer.step and scheduler.step | 2019-08-20 15:58:42 +08:00
Peng Qi | a368b87791 | Fix #1015 | 2019-08-19 13:07:00 -07:00
Lysandre | f94f1c6016 | Distributed training + tokenizer-agnostic mask token | 2019-08-19 14:58:50 -04:00
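"Tokenizer-agnostic mask token" means looking the mask token up on the tokenizer instead of hardcoding BERT's literal "[MASK]", which is wrong for RoBERTa's "<mask>". A sketch, again using today's package name as an assumption:

```python
from transformers import RobertaTokenizer  # current package name, assumed

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")

# Hardcoded and wrong for non-BERT vocabularies:
#   mask_id = tokenizer.convert_tokens_to_ids("[MASK]")
# Tokenizer-agnostic:
mask_id = tokenizer.convert_tokens_to_ids(tokenizer.mask_token)
print(tokenizer.mask_token, mask_id)  # "<mask>" and its vocabulary id
```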
Thomas Wolf | 5a49b793d9 | Merge pull request #1023 from tuvuumass/patch-1: fix issue #824 | 2019-08-19 15:31:46 +02:00
Chi-Liang Liu | 40acf6b52a | don't save model without training | 2019-08-18 05:02:25 -04:00
wangfei | 856a63da4d | Fix: save model/model.module | 2019-08-18 11:03:47 +08:00
wangfei | 1ef41b8337 | Revert "Fix: save model/model.module" (reverts commit 00e9c4cc96) | 2019-08-18 11:03:12 +08:00
wangfei | 00e9c4cc96 | Fix: save model/model.module | 2019-08-18 11:02:02 +08:00
Jason Phang | d8923270e6 | Correct truncation for RoBERTa in 2-input GLUE | 2019-08-16 16:30:38 -04:00
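The truncation issue behind d8923270e6, sketched: a BERT sentence pair adds 3 special tokens while a RoBERTa pair adds 4, so the budget left for the two text segments differs by one token. A hedged illustration (the helper name is hypothetical; later transformers releases expose tokenizer.num_special_tokens_to_add(pair=True) for this count):

```python
def pair_truncation_budget(max_seq_length, num_added_special_tokens):
    """Tokens available for the two text segments combined."""
    return max_seq_length - num_added_special_tokens

# BERT pair:    [CLS] A [SEP] B [SEP]   -> 3 special tokens
# RoBERTa pair: <s> A </s></s> B </s>   -> 4 special tokens
print(pair_truncation_budget(128, 3))  # 125 for BERT
print(pair_truncation_budget(128, 4))  # 124 for RoBERTa
```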