LysandreJik
75635072e1
Updated GLUE script to add DistilBERT. Cleaned up unused args in the utils file.
2019-09-19 10:55:06 +02:00
LysandreJik
59057abe52
typo
2019-09-19 10:55:06 +02:00
LysandreJik
bac332fec0
Updated the GLUE data processor. Corrections to RoBERTa and XLNet.
2019-09-19 10:55:06 +02:00
Erik Chan
f0340eccf9
Typo
2019-09-18 13:42:11 -07:00
erenup
8960988f35
fixed to find best dev acc
2019-09-19 01:10:05 +08:00
erenup
46ffc28329
Merge branch 'master' into run_multiple_choice_merge
2019-09-18 21:43:46 +08:00
erenup
15143fbad6
move run_multiple_choice.py and utils_multiple_choice.py to examples
2019-09-18 21:18:46 +08:00
erenup
3cd6289758
Merge remote-tracking branch 'huggingface/master' into run_multiple_choice_merge
...
# Conflicts:
# examples/contrib/run_swag.py
2019-09-18 21:16:59 +08:00
erenup
36362cf086
move schedule.step after optimizer.step
2019-09-18 21:13:40 +08:00
thomwolf
e768f2322a
update run_openai_gpt to fix #1264
2019-09-18 10:07:47 +02:00
thomwolf
8334993915
clean up examples - updated to new keyword inputs - #1246
2019-09-18 10:01:27 +02:00
erenup
5882c442e5
add example usage
2019-09-16 22:38:08 +08:00
erenup
982f181aa7
Merge remote-tracking branch 'origin/master' into run_multiple_choice_add_doc
2019-09-16 19:12:00 +08:00
erenup
84b9d1c423
Merge remote-tracking branch 'huggingface/master'
...
# Conflicts:
# pytorch_transformers/__init__.py
2019-09-16 19:06:12 +08:00
erenup
603b470a3d
add warning info
2019-09-16 18:53:37 +08:00
erenup
4812a5a767
add doc string
2019-09-16 11:50:18 +08:00
VictorSanh
32e1332acf
[distil] fix once for all general logger for scripts
2019-09-11 14:19:07 +00:00
VictorSanh
364920e216
fix small bug/typo
2019-09-10 21:45:01 +00:00
Thomas Wolf
23c23f5399
Merge pull request #1229 from SKRohit/master
...
changes in evaluate function in run_lm_finetuning.py
2019-09-10 22:16:45 +02:00
searchivarius
eab980fd68
Fix to prevent crashing on assert len(tokens_b)>=1
2019-09-09 19:58:08 -04:00
VictorSanh
a95ced6260
[Distillation] save last chkpt as pytorch_model.bin
2019-09-09 19:53:35 +00:00
Rohit Kumar Singh
e5df36397b
changes in return statement of evaluate function
...
changed `results` to `result` and removed `results` dict defined previously
2019-09-09 19:55:57 +05:30
LysandreJik
3f91338be9
Patched a few outdated parameters
2019-09-06 17:48:06 -04:00
LysandreJik
f47f9a5874
Updated outdated examples
2019-09-06 17:10:33 -04:00
LysandreJik
5e151f5e77
Table of contents
2019-09-06 12:08:36 -04:00
LysandreJik
593c070435
Better examples
2019-09-06 12:00:12 -04:00
VictorSanh
dddd6b9927
Update DistilBERT training code
2019-09-05 18:26:14 +00:00
Stefan Schweter
a1c34bd286
distillation: fix ModuleNotFoundError error in token counts script
2019-08-31 12:21:38 +02:00
Thomas Wolf
51e980ce36
Merge pull request #1155 from anhnt170489/apex_fp16
...
Update apex fp16 implementation
2019-08-30 23:29:11 +02:00
VictorSanh
282c276e09
typos + file name coherence in distillation README
2019-08-30 12:02:29 -04:00
VictorSanh
803c1cc4ea
fix relative import bug cf Issue #1140
2019-08-30 12:01:27 -04:00
Thomas Wolf
0a2fecdf90
Merge branch 'master' into master
2019-08-30 16:30:08 +02:00
Rabeeh KARIMI
39eb31e11e
remove reloading tokenizer in the training, adding it to the evaluation part
2019-08-30 15:44:41 +02:00
Rabeeh KARIMI
350bb6bffa
updated tokenizer loading for addressing reproducibility issues
2019-08-30 15:34:28 +02:00
Thomas Wolf
01ad55f8cf
Merge pull request #1026 from rabeehk/master
...
loads the tokenizer for each checkpoint, to solve the reproducibility…
2019-08-30 14:15:36 +02:00
erenup
6e1ac34e2b
Merge remote-tracking branch 'huggingface/master'
2019-08-30 15:50:11 +08:00
jamin
2fb9a934b4
re-format
2019-08-30 14:05:28 +09:00
jamin
c8731b9583
update apex fp16 implementation
2019-08-30 13:54:00 +09:00
LysandreJik
caf1d116a6
Closing bracket in DistilBERT's token count.
2019-08-29 15:30:10 -04:00
Luis
fe8fb10b44
Small modification of comment in the run_glue.py example
...
Add RoBERTa to the comment, as it was not explicit that RoBERTa doesn't use token_type_ids.
2019-08-29 14:43:30 +02:00
erenup
942d3f4b20
modify code of arc label insurance
2019-08-29 10:21:17 +08:00
LysandreJik
bf3dc778b8
Changed learning rate for run_squad test
2019-08-28 18:24:43 -04:00
Andreas Daiminger
1d15a7f278
swap order of optimizer.step() and scheduler.step()
2019-08-28 19:18:27 +02:00
Thomas Wolf
0ecfd17f49
Merge pull request #987 from huggingface/generative-finetuning
...
Generative finetuning
2019-08-28 16:51:50 +02:00
thomwolf
b5eb283aaa
update credits
2019-08-28 16:36:55 +02:00
thomwolf
912a377e90
dilbert -> distilbert
2019-08-28 13:59:42 +02:00
thomwolf
4ce5f36f78
update readmes
2019-08-28 12:14:31 +02:00
erenup
ec4b1c659f
logging truth error
2019-08-28 16:50:40 +08:00
erenup
df52abe373
add sep_token between question and choice
2019-08-28 16:36:21 +08:00
erenup
43c243254a
avoid invalid labels of truth
2019-08-28 16:03:17 +08:00
erenup
3c7e676f8b
add test related code: test the best dev acc model when model is training
2019-08-28 15:57:29 +08:00
VictorSanh
93e82ab424
Write README for DilBERT
2019-08-28 06:26:09 +00:00
VictorSanh
fea921d382
add licensing
2019-08-28 04:45:39 +00:00
VictorSanh
da1e4e53fc
some fixes in train.py
for loading previous checkpoint
2019-08-28 04:01:03 +00:00
VictorSanh
0d8f8848d5
add scripts/extract_for_distil.py
2019-08-28 04:00:19 +00:00
VictorSanh
7f2c384c80
add scripts/token_counts.py
2019-08-28 04:00:03 +00:00
VictorSanh
4d16b279e5
add scripts/binarized_data.py
2019-08-28 03:59:48 +00:00
VictorSanh
b247b0d880
add train.py
for distillation
2019-08-28 02:12:47 +00:00
VictorSanh
780f183e55
add requirements
2019-08-28 01:39:52 +00:00
VictorSanh
e424d2e45d
add README
2019-08-28 01:10:10 +00:00
VictorSanh
1ae81e4aa1
add dataset. distiller, utils
2019-08-28 01:10:05 +00:00
thomwolf
06510ccb53
typo
2019-08-23 22:08:10 +02:00
thomwolf
ab7bd5ef98
fixing tokenization and training
2019-08-23 17:31:21 +02:00
Thomas Wolf
90dcd8c05d
Merge branch 'master' into generative-finetuning
2019-08-22 10:43:30 +02:00
VictorSanh
57272d5ddf
fix for glue
2019-08-22 00:25:49 -04:00
VictorSanh
b006a7a12f
fix for squad
2019-08-22 00:25:42 -04:00
Thomas Wolf
9beaa85b07
Merge pull request #1055 from qipeng/run_squad_fix
...
Fix #1015 (tokenizer defaults to use_lower_case=True when loading from trained models)
2019-08-21 01:20:46 +02:00
Lysandre
2d042274ac
Sequence special token handling for BERT and RoBERTa
2019-08-20 14:15:28 -04:00
Peng Qi
3bffd2e8e5
more fixes
2019-08-20 10:59:28 -07:00
Thomas Wolf
3b56427a1e
Merge pull request #1040 from FeiWang96/multi_gpu
...
Fix bug of multi-gpu training in lm finetuning
2019-08-20 17:13:44 +02:00
thomwolf
a690edab17
various fix and clean up on run_lm_finetuning
2019-08-20 15:52:12 +02:00
erenup
fc74132598
add best steps to train
2019-08-20 19:06:41 +08:00
Duzeyao
d86b49ac86
swap optimizer.step and scheduler.step
2019-08-20 16:46:34 +08:00
Duzeyao
45ab8bf60e
Revert "Update finetune_on_pregenerated.py"
...
This reverts commit a1359b970c.
2019-08-20 16:40:39 +08:00
erenup
97c30b73d5
add test related code
2019-08-20 16:31:04 +08:00
erenup
d5e60e5b7a
add test related code
2019-08-20 16:25:50 +08:00
Zeyao Du
a1359b970c
Update finetune_on_pregenerated.py
2019-08-20 16:00:07 +08:00
Zeyao Du
28f7ca1f80
swap optimizer.step and scheduler.step
2019-08-20 15:58:42 +08:00
Peng Qi
a368b87791
Fix #1015
2019-08-19 13:07:00 -07:00
Lysandre
f94f1c6016
Distributed training + tokenizer agnostic mask token
2019-08-19 14:58:50 -04:00
Thomas Wolf
5a49b793d9
Merge pull request #1023 from tuvuumass/patch-1
...
fix issue #824
2019-08-19 15:31:46 +02:00
erenup
4270d3da1b
fix a bug of evaluating
2019-08-19 16:38:52 +08:00
Chi-Liang Liu
40acf6b52a
don't save model without training
2019-08-18 05:02:25 -04:00
erenup
47e9aea0fe
add args info to evaluate_result.txt
2019-08-18 17:00:53 +08:00
erenup
5582bc4b23
add multiple choice to roberta and xlnet, test on swag, roberta=0.82.28, xlnet=0.80
2019-08-18 16:01:48 +08:00
wangfei
856a63da4d
Fix: save model/model.module
2019-08-18 11:03:47 +08:00
wangfei
1ef41b8337
Revert "Fix: save model/model.module"
...
This reverts commit 00e9c4cc96.
2019-08-18 11:03:12 +08:00
wangfei
00e9c4cc96
Fix: save model/model.module
2019-08-18 11:02:02 +08:00
erenup
e384ae2b9d
Merge remote-tracking branch 'huggingface/master'
...
merge huggingface/master to update
2019-08-17 12:05:57 +08:00
Jason Phang
d8923270e6
Correct truncation for RoBERTa in 2-input GLUE
2019-08-16 16:30:38 -04:00
Lysandre
5652f54ac2
Simplified data generator + better perplexity calculator
...
GPT-2 now obtains ~20 perplexity on WikiText-2
2019-08-16 13:49:56 -04:00
LysandreJik
7e7fc53da5
Fixing run_glue example with RoBERTa
2019-08-16 11:53:10 -04:00
LysandreJik
715534800a
BERT + RoBERTa masking tokens handling + GPU device update.
2019-08-16 10:10:21 -04:00
LysandreJik
339e556feb
CLM for BERT, beginning of CLM for RoBERTa; still needs a better masking token mechanism.
2019-08-16 10:10:20 -04:00
LysandreJik
5c18825a18
Removed dataset limit
2019-08-16 10:10:20 -04:00
LysandreJik
3e3e145497
Added GPT to the generative fine-tuning.
2019-08-16 10:10:20 -04:00
LysandreJik
47975ed53e
Language Modeling fine-tuning using GPT-2.
2019-08-16 10:10:20 -04:00
wangfei
b8ff56896c
Fix bug of multi-gpu training in lm finetuning
2019-08-16 12:11:05 +08:00
Rabeeh KARIMI
3d47a7f8ab
loads the tokenizer for each checkpoint, to solve the reproducibility issue
2019-08-14 10:58:26 +02:00
LysandreJik
39f426be65
Added special tokens <pad> and <mask> to RoBERTa.
2019-08-13 15:19:50 -04:00