erenup | 3cd6289758 | Merge remote-tracking branch 'huggingface/master' into run_multiple_choice_merge (conflicts: examples/contrib/run_swag.py) | 2019-09-18 21:16:59 +08:00
erenup | 36362cf086 | move schedule.step after optimizer.step | 2019-09-18 21:13:40 +08:00
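Commit 36362cf086 (and 1d15a7f278, d86b49ac86, 28f7ca1f80 further down) settles on calling optimizer.step() before scheduler.step(), the order PyTorch 1.1+ expects. A minimal sketch of that loop, with a toy model and schedule standing in for the examples' real AdamW/warmup setup:

    import torch

    # Toy model, optimizer and schedule purely for illustration; the examples
    # use the library's AdamW and warmup schedules instead.
    model = torch.nn.Linear(8, 2)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.9)

    for _ in range(3):
        loss = model(torch.randn(4, 8)).sum()
        loss.backward()
        optimizer.step()       # update parameters first...
        scheduler.step()       # ...then advance the learning-rate schedule
        optimizer.zero_grad()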
thomwolf | e768f2322a | update run_openai_gpt to fix #1264 | 2019-09-18 10:07:47 +02:00
thomwolf | 8334993915 | clean up examples - updated to new keyword inputs - #1246 | 2019-09-18 10:01:27 +02:00
erenup | 5882c442e5 | add example usage | 2019-09-16 22:38:08 +08:00
erenup | 982f181aa7 | Merge remote-tracking branch 'origin/master' into run_multiple_choice_add_doc | 2019-09-16 19:12:00 +08:00
erenup | 84b9d1c423 | Merge remote-tracking branch 'huggingface/master' (conflicts: pytorch_transformers/__init__.py) | 2019-09-16 19:06:12 +08:00
erenup | 603b470a3d | add warning info | 2019-09-16 18:53:37 +08:00
erenup | 4812a5a767 | add doc string | 2019-09-16 11:50:18 +08:00
VictorSanh | 32e1332acf | [distil] fix once for all general logger for scripts | 2019-09-11 14:19:07 +00:00
VictorSanh | 364920e216 | fix small bug/typo | 2019-09-10 21:45:01 +00:00
Thomas Wolf | 23c23f5399 | Merge pull request #1229 from SKRohit/master: changes in evaluate function in run_lm_finetuning.py | 2019-09-10 22:16:45 +02:00
searchivarius | eab980fd68 | Fix to prevent crashing on assert len(tokens_b)>=1 | 2019-09-09 19:58:08 -04:00
VictorSanh | a95ced6260 | [Distillation] save last chkpt as pytorch_model.bin | 2019-09-09 19:53:35 +00:00
Rohit Kumar Singh | e5df36397b | changes in return statement of evaluate function: changed `results` to `result` and removed `results` dict defined previously | 2019-09-09 19:55:57 +05:30
LysandreJik | 3f91338be9 | Patched a few outdated parameters | 2019-09-06 17:48:06 -04:00
LysandreJik | f47f9a5874 | Updated outdated examples | 2019-09-06 17:10:33 -04:00
LysandreJik | 5e151f5e77 | Table of contents | 2019-09-06 12:08:36 -04:00
LysandreJik | 593c070435 | Better examples | 2019-09-06 12:00:12 -04:00
VictorSanh | dddd6b9927 | Update DistilBERT training code | 2019-09-05 18:26:14 +00:00
Stefan Schweter | a1c34bd286 | distillation: fix ModuleNotFoundError error in token counts script | 2019-08-31 12:21:38 +02:00
Thomas Wolf | 51e980ce36 | Merge pull request #1155 from anhnt170489/apex_fp16: Update apex fp16 implementation | 2019-08-30 23:29:11 +02:00
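PR #1155 above (and the c8731b9583 / 2fb9a934b4 commits further down) updates the examples' apex fp16 path. A rough sketch of the usual apex amp pattern, assuming NVIDIA apex is installed and using a placeholder model; this is illustrative, not the repository's exact code:

    import torch
    from apex import amp  # requires NVIDIA apex

    model = torch.nn.Linear(8, 2).cuda()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

    # Wrap model and optimizer once; "O1" is the common mixed-precision level.
    model, optimizer = amp.initialize(model, optimizer, opt_level="O1")

    loss = model(torch.randn(4, 8).cuda()).sum()
    with amp.scale_loss(loss, optimizer) as scaled_loss:  # loss scaling for fp16
        scaled_loss.backward()
    optimizer.step()
    optimizer.zero_grad()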
VictorSanh | 282c276e09 | typos + file name coherence in distillation README | 2019-08-30 12:02:29 -04:00
VictorSanh | 803c1cc4ea | fix relative import bug cf Issue #1140 | 2019-08-30 12:01:27 -04:00
Thomas Wolf | 0a2fecdf90 | Merge branch 'master' into master | 2019-08-30 16:30:08 +02:00
Rabeeh KARIMI | 39eb31e11e | remove reloading tokenizer in the training, adding it to the evaluation part | 2019-08-30 15:44:41 +02:00
Rabeeh KARIMI | 350bb6bffa | updated tokenizer loading for addressing reproducibility issues | 2019-08-30 15:34:28 +02:00
Thomas Wolf | 01ad55f8cf | Merge pull request #1026 from rabeehk/master: loads the tokenizer for each checkpoint, to solve the reproducibility… | 2019-08-30 14:15:36 +02:00
erenup | 6e1ac34e2b | Merge remote-tracking branch 'huggingface/master' | 2019-08-30 15:50:11 +08:00
jamin | 2fb9a934b4 | re-format | 2019-08-30 14:05:28 +09:00
jamin | c8731b9583 | update apex fp16 implementation | 2019-08-30 13:54:00 +09:00
LysandreJik | caf1d116a6 | Closing bracket in DistilBERT's token count. | 2019-08-29 15:30:10 -04:00
Luis | fe8fb10b44 | Small modification of comment in the run_glue.py example: add RoBERTa to the comment, as it was not explicit that RoBERTa doesn't use token_type_ids. | 2019-08-29 14:43:30 +02:00
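For context on fe8fb10b44: RoBERTa, like DistilBERT, does not use token_type_ids, so run_glue.py only passes segment ids to model types that expect them. A hypothetical sketch of that input-building pattern (the function name and batch layout are invented here for illustration):

    from typing import Dict, Sequence

    def build_glue_inputs(batch: Sequence, model_type: str) -> Dict:
        """Assemble model inputs, passing token_type_ids only where they are used."""
        inputs = {"input_ids": batch[0], "attention_mask": batch[1], "labels": batch[3]}
        if model_type in ("bert", "xlnet"):  # RoBERTa and DistilBERT are left out
            inputs["token_type_ids"] = batch[2]
        return inputs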
erenup | 942d3f4b20 | modify code of arc label insurance | 2019-08-29 10:21:17 +08:00
LysandreJik | bf3dc778b8 | Changed learning rate for run_squad test | 2019-08-28 18:24:43 -04:00
Andreas Daiminger | 1d15a7f278 | swap order of optimizer.step() and scheduler.step() | 2019-08-28 19:18:27 +02:00
Thomas Wolf | 0ecfd17f49 | Merge pull request #987 from huggingface/generative-finetuning: Generative finetuning | 2019-08-28 16:51:50 +02:00
thomwolf | b5eb283aaa | update credits | 2019-08-28 16:36:55 +02:00
thomwolf | 912a377e90 | dilbert -> distilbert | 2019-08-28 13:59:42 +02:00
thomwolf | 4ce5f36f78 | update readmes | 2019-08-28 12:14:31 +02:00
erenup | ec4b1c659f | logging truth error | 2019-08-28 16:50:40 +08:00
erenup | df52abe373 | add sep_token between question and choice | 2019-08-28 16:36:21 +08:00
erenup | 43c243254a | avoid invalid labels of truth | 2019-08-28 16:03:17 +08:00
erenup | 3c7e676f8b | add test related code: test the best dev acc model when model is training | 2019-08-28 15:57:29 +08:00
VictorSanh | 93e82ab424 | Write README for DilBERT | 2019-08-28 06:26:09 +00:00
VictorSanh | fea921d382 | add licensing | 2019-08-28 04:45:39 +00:00
VictorSanh | da1e4e53fc | some fixes in train.py for loading previous checkpoint | 2019-08-28 04:01:03 +00:00
VictorSanh | 0d8f8848d5 | add scripts/extract_for_distil.py | 2019-08-28 04:00:19 +00:00
VictorSanh | 7f2c384c80 | add scripts/token_counts.py | 2019-08-28 04:00:03 +00:00
VictorSanh | 4d16b279e5 | add scripts/binarized_data.py | 2019-08-28 03:59:48 +00:00
VictorSanh | b247b0d880 | add train.py for distillation | 2019-08-28 02:12:47 +00:00
VictorSanh | 780f183e55 | add requirements | 2019-08-28 01:39:52 +00:00
VictorSanh | e424d2e45d | add README | 2019-08-28 01:10:10 +00:00
VictorSanh | 1ae81e4aa1 | add dataset, distiller, utils | 2019-08-28 01:10:05 +00:00
thomwolf | 06510ccb53 | typo | 2019-08-23 22:08:10 +02:00
thomwolf | ab7bd5ef98 | fixing tokenization and training | 2019-08-23 17:31:21 +02:00
Thomas Wolf | 90dcd8c05d | Merge branch 'master' into generative-finetuning | 2019-08-22 10:43:30 +02:00
VictorSanh | 57272d5ddf | fix for glue | 2019-08-22 00:25:49 -04:00
VictorSanh | b006a7a12f | fix for squad | 2019-08-22 00:25:42 -04:00
Thomas Wolf | 9beaa85b07 | Merge pull request #1055 from qipeng/run_squad_fix: Fix #1015 (tokenizer defaults to use_lower_case=True when loading from trained models) | 2019-08-21 01:20:46 +02:00
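On #1055 / #1015 above: a tokenizer reloaded from a fine-tuned checkpoint defaulted to lower-casing regardless of the user's setting. A hedged sketch of the straightforward guard of passing do_lower_case explicitly when reloading for evaluation (the helper name and path are placeholders):

    from pytorch_transformers import BertTokenizer  # library name at the time

    def load_eval_tokenizer(checkpoint_dir: str, do_lower_case: bool) -> BertTokenizer:
        # Pass the casing flag explicitly instead of relying on whatever the
        # saved checkpoint implies.
        return BertTokenizer.from_pretrained(checkpoint_dir, do_lower_case=do_lower_case)

    # e.g. load_eval_tokenizer("output/checkpoint-1000", do_lower_case=True)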
Lysandre | 2d042274ac | Sequence special token handling for BERT and RoBERTa | 2019-08-20 14:15:28 -04:00
Peng Qi | 3bffd2e8e5 | more fixes | 2019-08-20 10:59:28 -07:00
Thomas Wolf | 3b56427a1e | Merge pull request #1040 from FeiWang96/multi_gpu: Fix bug of multi-gpu training in lm finetuning | 2019-08-20 17:13:44 +02:00
thomwolf | a690edab17 | various fix and clean up on run_lm_finetuning | 2019-08-20 15:52:12 +02:00
erenup | fc74132598 | add best steps to train | 2019-08-20 19:06:41 +08:00
Duzeyao | d86b49ac86 | swap optimizer.step and scheduler.step | 2019-08-20 16:46:34 +08:00
Duzeyao | 45ab8bf60e | Revert "Update finetune_on_pregenerated.py" (reverts commit a1359b970c) | 2019-08-20 16:40:39 +08:00
erenup | 97c30b73d5 | add test related code | 2019-08-20 16:31:04 +08:00
erenup | d5e60e5b7a | add test related code | 2019-08-20 16:25:50 +08:00
Zeyao Du | a1359b970c | Update finetune_on_pregenerated.py | 2019-08-20 16:00:07 +08:00
Zeyao Du | 28f7ca1f80 | swap optimizer.step and scheduler.step | 2019-08-20 15:58:42 +08:00
Peng Qi | a368b87791 | Fix #1015 | 2019-08-19 13:07:00 -07:00
Lysandre | f94f1c6016 | Distributed training + tokenizer agnostic mask token | 2019-08-19 14:58:50 -04:00
Thomas Wolf | 5a49b793d9 | Merge pull request #1023 from tuvuumass/patch-1: fix issue #824 | 2019-08-19 15:31:46 +02:00
erenup | 4270d3da1b | fix a bug of evaluating | 2019-08-19 16:38:52 +08:00
Chi-Liang Liu | 40acf6b52a | don't save model without training | 2019-08-18 05:02:25 -04:00
erenup | 47e9aea0fe | add args info to evaluate_result.txt | 2019-08-18 17:00:53 +08:00
erenup | 5582bc4b23 | add multiple choice to roberta and xlnet, test on swag, roberta=0.82.28, xlnet=0.80 | 2019-08-18 16:01:48 +08:00
wangfei | 856a63da4d | Fix: save model/model.module | 2019-08-18 11:03:47 +08:00
wangfei | 1ef41b8337 | Revert "Fix: save model/model.module" (reverts commit 00e9c4cc96) | 2019-08-18 11:03:12 +08:00
wangfei | 00e9c4cc96 | Fix: save model/model.module | 2019-08-18 11:02:02 +08:00
erenup | e384ae2b9d | Merge remote-tracking branch 'huggingface/master'; merge huggingface/master to update | 2019-08-17 12:05:57 +08:00
Jason Phang | d8923270e6 | Correct truncation for RoBERTa in 2-input GLUE | 2019-08-16 16:30:38 -04:00
Lysandre | 5652f54ac2 | Simplified data generator + better perplexity calculator; GPT-2 now obtains ~20 perplexity on WikiText-2 | 2019-08-16 13:49:56 -04:00
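The perplexity figure quoted for 5652f54ac2 follows the usual convention of exponentiating the mean token-level cross-entropy over the evaluation set; a generic sketch, not the repository's exact evaluation code:

    import math
    from typing import Sequence

    def perplexity(token_losses: Sequence[float]) -> float:
        """Perplexity = exp(mean cross-entropy per token)."""
        return math.exp(sum(token_losses) / len(token_losses))

    # a mean loss of about 3.0 nats per token corresponds to perplexity ~20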
LysandreJik | 7e7fc53da5 | Fixing run_glue example with RoBERTa | 2019-08-16 11:53:10 -04:00
LysandreJik | 715534800a | BERT + RoBERTa masking tokens handling + GPU device update. | 2019-08-16 10:10:21 -04:00
LysandreJik | 339e556feb | CLM for BERT, beginning of CLM for RoBERTa; still needs a better masking token mechanism. | 2019-08-16 10:10:20 -04:00
LysandreJik | 5c18825a18 | Removed dataset limit | 2019-08-16 10:10:20 -04:00
LysandreJik | 3e3e145497 | Added GPT to the generative fine-tuning. | 2019-08-16 10:10:20 -04:00
LysandreJik | 47975ed53e | Language Modeling fine-tuning using GPT-2. | 2019-08-16 10:10:20 -04:00
wangfei | b8ff56896c | Fix bug of multi-gpu training in lm finetuning | 2019-08-16 12:11:05 +08:00
Rabeeh KARIMI | 3d47a7f8ab | loads the tokenizer for each checkpoint, to solve the reproducibility issue | 2019-08-14 10:58:26 +02:00
LysandreJik | 39f426be65 | Added special tokens <pad> and <mask> to RoBERTa. | 2019-08-13 15:19:50 -04:00
Julien Chaumond | baf08ca1d4 | [RoBERTa] run_glue: correct pad_token + reorder labels | 2019-08-13 12:51:15 -04:00
tuvuumass | ba4bce2581 | fix issue #824 | 2019-08-13 11:26:27 -04:00
Julien Chaumond | 912fdff899 | [RoBERTa] Update run_glue for RoBERTa | 2019-08-12 13:49:50 -04:00
erenup | b219029c45 | refactoring old run_swag; this script is mainly refactored from run_squad in pytorch_transformers | 2019-08-11 15:20:37 +08:00
Thomas Wolf | b4f9464f90 | Merge pull request #960 from ethanjperez/patch-1: Fixing unused weight_decay argument | 2019-08-07 10:09:55 +02:00
Thomas Wolf | d43dc48b34 | Merge branch 'master' into auto_models | 2019-08-05 19:17:35 +02:00
thomwolf | 70c10caa06 | add option mentioned in #940 | 2019-08-05 17:09:37 +02:00