Commit Graph

19226 Commits

Each entry below lists the author, the commit SHA1, the commit message, and the commit date.
thomwolf
5622d8320f allow loading a small number of examples 2018-11-05 13:21:24 +01:00
thomwolf
a725db4f6c fixing BertForQuestionAnswering loss computation 2018-11-05 13:21:11 +01:00
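For context, a SQuAD-style question-answering head computes cross-entropy over start and end logits and averages the two terms. A minimal sketch of such a loss (illustrative only, not the repository's exact code; the clamping mirrors the usual guard against out-of-range span labels):

```python
import torch
from torch.nn import CrossEntropyLoss

def qa_loss(start_logits, end_logits, start_positions, end_positions):
    # start_logits/end_logits: (batch, seq_len); positions: (batch,)
    seq_len = start_logits.size(1)
    # Clamp labels so positions outside the sequence cannot crash the loss.
    start_positions = start_positions.clamp(0, seq_len - 1)
    end_positions = end_positions.clamp(0, seq_len - 1)
    loss_fct = CrossEntropyLoss()
    start_loss = loss_fct(start_logits, start_positions)
    end_loss = loss_fct(end_logits, end_positions)
    return (start_loss + end_loss) / 2  # average over the two span endpoints

loss = qa_loss(torch.randn(2, 16), torch.randn(2, 16),
               torch.tensor([3, 5]), torch.tensor([4, 9]))
```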
thomwolf
bb5ce67a14 adding back tf code + adding models comparison on SQuAD 2018-11-05 12:11:32 +01:00
VictorSanh
290633b882 Fix args.gradient_accumulation_steps used before assignment. 2018-11-04 17:31:50 -05:00
VictorSanh
649e9774cd Fix bug: train_batch_size not an int.
Division makes args.train_batch_size become a float.
cc @thomwolf
2018-11-04 17:19:40 -05:00
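These two fixes are classic argparse pitfalls: reading an option before parse_args() has run, and Python 3's true division turning an int into a float. A minimal sketch of the int fix (argument names taken from the commit messages; defaults are assumptions):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--train_batch_size", type=int, default=32)
parser.add_argument("--gradient_accumulation_steps", type=int, default=1)
args = parser.parse_args()  # options must be parsed before first use

# "/" returns a float in Python 3; "//" (or an int() cast) keeps the
# effective per-step batch size an int, as DataLoader expects.
args.train_batch_size = args.train_batch_size // args.gradient_accumulation_steps
```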
VictorSanh
d55c3ae83f Small logger bug (multi-gpu, distribution) in training 2018-11-04 16:28:10 -05:00
thomwolf
3d291dea4a clean up tests 2018-11-04 21:27:19 +01:00
thomwolf
87da161c2a finishing model test 2018-11-04 21:27:10 +01:00
thomwolf
d69b0b0e90 fixes + clean up + mask is long 2018-11-04 21:26:54 +01:00
thomwolf
3ddff783c1 clean up + mask is long 2018-11-04 21:26:44 +01:00
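"mask is long" presumably refers to the attention mask's dtype. A sketch of the cast (assumed from the message):

```python
import torch

attention_mask = torch.ones(2, 128)     # float32 by default
attention_mask = attention_mask.long()  # cast to int64 ("long")
print(attention_mask.dtype)             # torch.int64
```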
thomwolf
88c1037991 update requirements 2018-11-04 21:26:18 +01:00
thomwolf
d0cb9fa2a7 clean up model 2018-11-04 21:26:11 +01:00
thomwolf
6cc651778a update readme 2018-11-04 21:26:03 +01:00
thomwolf
efb44a8310 distributed in extract features 2018-11-04 21:25:55 +01:00
thomwolf
d9d7d1a462 update float() 2018-11-04 21:25:36 +01:00
thomwolf
c6207d85b6 remove old methods 2018-11-04 15:34:00 +01:00
thomwolf
965b2565a0 add distributed training 2018-11-04 15:32:04 +01:00
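A minimal sketch of the kind of setup "add distributed training" implies, using torch.distributed with DistributedDataParallel (assumed; the commit itself may do this differently). It expects one process per GPU, launched with RANK/WORLD_SIZE/MASTER_ADDR/MASTER_PORT set (e.g. via torch.distributed.launch):

```python
import torch
import torch.distributed as dist

def setup(model, local_rank):
    torch.cuda.set_device(local_rank)
    dist.init_process_group(backend="nccl")  # NCCL backend for CUDA tensors
    model = model.cuda(local_rank)
    # Each process holds one replica; gradients are all-reduced in backward().
    return torch.nn.parallel.DistributedDataParallel(
        model, device_ids=[local_rank], output_device=local_rank)
```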
thomwolf
1ceac85e23 add gradient accumulation 2018-11-04 15:26:14 +01:00
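Gradient accumulation emulates a large batch by summing gradients over several small batches before stepping. A runnable sketch (toy model and data; names are illustrative):

```python
import torch

model = torch.nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
data = [(torch.randn(8, 10), torch.randn(8, 1)) for _ in range(8)]
accumulation_steps = 4  # effective batch size = 8 * 4 = 32

model.zero_grad()
for step, (x, y) in enumerate(data):
    loss = torch.nn.functional.mse_loss(model(x), y)
    # Scale so the accumulated gradient matches a full-batch average.
    (loss / accumulation_steps).backward()
    if (step + 1) % accumulation_steps == 0:
        optimizer.step()   # update once per accumulation window
        model.zero_grad()
```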
thomwolf
6b0da96b4b clean up 2018-11-04 15:17:55 +01:00
thomwolf
834b485b2e logging + update copyright 2018-11-04 12:07:38 +01:00
thomwolf
1701291ef9 multi-gpu cleanup 2018-11-04 11:54:57 +01:00
thomwolf
5ee171689c what's in loss again 2018-11-04 11:45:44 +01:00
thomwolf
0b7a20c651 add tqdm, clean up logging 2018-11-04 11:07:34 +01:00
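Adding tqdm is a one-line change: wrap the training iterator so each epoch shows a progress bar instead of raw log lines. Sketch:

```python
from tqdm import tqdm

for step in tqdm(range(1000), desc="Iteration"):
    pass  # training step goes here
```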
thomwolf
d4e3cf3520 add numpy import 2018-11-04 10:54:16 +01:00
thomwolf
cf366417d5 remove run_squad_pytorch 2018-11-04 09:56:00 +01:00
thomwolf
26bdef4321 fixing verbose_argument 2018-11-04 09:53:29 +01:00
thomwolf
d6418c5ef3 tweaking the readme 2018-11-03 23:52:35 +01:00
thomwolf
3b70b270e0 update readme 2018-11-03 23:39:55 +01:00
thomwolf
eaa6db92f1 Merge branch 'master' of https://github.com/huggingface/pytorch-pretrained-BERT 2018-11-03 23:35:16 +01:00
thomwolf
f8276008df update readme, file names, removing TF code, moving tests 2018-11-03 23:35:14 +01:00
Ubuntu
f18ae210e1 fix typo 2018-11-03 22:34:37 +00:00
VictorSanh
3c24e4bef1 Multi-Gpu loss - Cleaning 2018-11-03 18:03:17 -04:00
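Background for the multi-GPU loss commits here: nn.DataParallel gathers each replica's scalar loss into a vector of shape (n_gpu,), so it must be reduced before backward() or logging. A sketch with a stand-in tensor:

```python
import torch

loss = torch.tensor([0.7, 0.9])  # stand-in: one scalar per GPU replica

if loss.dim() > 0:       # only needed when DataParallel returned a vector
    loss = loss.mean()   # average across replicas
print(loss.item())       # ~0.8
```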
Tim Rault
5de1517d6b WIP modeling_test_pytorch.py 2018-11-03 22:40:50 +01:00
VictorSanh
1ba5b58c20 fix typo 2018-11-03 17:10:23 -04:00
VictorSanh
5858e8e4dd Fix both loss and eval metrics -> more coherence on the loss (eval vs train and tf vs pt) 2018-11-03 16:48:24 -04:00
VictorSanh
cd09cd5b40 Fix import on initialization 2018-11-03 15:38:30 -04:00
Tim Rault
ec66841afa WIP 2018-11-03 19:12:20 +01:00
thomwolf
139873f6e3 Merge branch 'master' of https://github.com/huggingface/pytorch-pretrained-BERT 2018-11-03 19:06:17 +01:00
thomwolf
04287a4d68 special edition script 2018-11-03 19:06:15 +01:00
VictorSanh
a1af5247e1 Add seed in initialization 2018-11-03 14:00:36 -04:00
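Seeding every RNG source is the standard reproducibility step; a sketch of what "Add seed in initialization" likely covers (the helper name is ours):

```python
import random
import numpy as np
import torch

def set_seed(seed=42):
    random.seed(seed)                 # Python's RNG
    np.random.seed(seed)              # NumPy's RNG
    torch.manual_seed(seed)           # CPU RNG
    torch.cuda.manual_seed_all(seed)  # every GPU's RNG

set_seed(42)
```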
Ubuntu
4faeb38b51 Fix loss logging for multi-gpu compatibility 2018-11-03 17:52:51 +00:00
thomwolf
25f73add07 update optimizer run_squad 2018-11-03 17:56:34 +01:00
thomwolf
f514cbbf30 update run_squad with tqdm 2018-11-03 17:52:44 +01:00
thomwolf
cb76c1ddd3 add model.zero_grad() 2018-11-03 17:40:12 +01:00
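model.zero_grad() matters because .backward() accumulates into .grad rather than overwriting it; without clearing, each step's gradient includes all previous steps'. Runnable sketch:

```python
import torch

model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for x, y in [(torch.randn(2, 4), torch.randn(2, 1))] * 3:
    model.zero_grad()  # otherwise gradients accumulate across steps
    loss = torch.nn.functional.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()
```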
thomwolf
a4086c5de5 Merge branch 'master' of https://github.com/huggingface/pytorch-pretrained-BERT 2018-11-03 17:38:17 +01:00
thomwolf
088ad45888 fixing optimization 2018-11-03 17:38:15 +01:00
VictorSanh
8bd6b235b7 typo on tokenization 2018-11-03 10:27:59 -04:00
VictorSanh
2c55568c40 scatter_ and scatter 2018-11-03 10:27:38 -04:00
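The trailing underscore is PyTorch's in-place convention: scatter_ mutates its tensor while scatter returns a new one, and the message suggests the two were mixed up. Sketch of the difference (one-hot example):

```python
import torch

labels = torch.tensor([[1], [3]])

one_hot = torch.zeros(2, 4)
one_hot.scatter_(1, labels, 1.0)  # in-place: one_hot itself is modified

fresh = torch.zeros(2, 4).scatter(1, labels, 1.0)  # out-of-place: new tensor
```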
VictorSanh
a6efe1235f
Merge pull request #1 from huggingface/multi-gpu-support
Create DataParallel model if several GPUs
2018-11-03 10:10:34 -04:00
VictorSanh
5f432480c0 Create DataParallel model if several GPUs 2018-11-03 10:10:01 -04:00
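A minimal sketch of the wrapping this PR describes: replicate the model with torch.nn.DataParallel whenever more than one GPU is visible (toy model here; the real code wraps BERT):

```python
import torch

model = torch.nn.Linear(10, 2)
if torch.cuda.is_available():
    model = model.cuda()
    if torch.cuda.device_count() > 1:
        # Inputs are split along the batch dimension across GPUs;
        # outputs (and per-replica losses) are gathered on device 0.
        model = torch.nn.DataParallel(model)
```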