Commit Graph

15053 Commits

Author SHA1 Message Date
thomwolf
a99b971738 bump up version minor 2018-11-17 10:43:39 +01:00
thomwolf
4e46affc34 updating examples 2018-11-17 10:30:54 +01:00
thomwolf
d0673c7dbd fix links 2018-11-17 08:59:29 +01:00
thomwolf
68b937aa40 sub section overviews 2018-11-17 08:55:56 +01:00
thomwolf
c54d8b1847 fixing links in readme 2018-11-17 08:46:17 +01:00
thomwolf
f920eff8c3 update readme 2018-11-17 08:42:45 +01:00
thomwolf
886cb49792 updating readme and notebooks 2018-11-16 14:31:15 +01:00
thomwolf
fd647e8c87 comparison masked LM ok 2018-11-16 11:04:31 +01:00
thomwolf
02173a1a0a fixing error in isnan test for optimizer_on_cpu & fp16 2018-11-15 21:49:12 +01:00
thomwolf
cba85a67b9 fix nan in optimizer_on_cpu 2018-11-15 21:47:41 +01:00
thomwolf
1de35b624b preparing for first release 2018-11-15 20:56:10 +01:00
Thomas Wolf
8513741b57 Merge pull request #17 from lukovnikov/master: activation function in BERTIntermediate 2018-11-13 17:00:09 +01:00
lukovnikov
470076e419 Merge remote-tracking branch 'origin/master' 2018-11-13 16:49:26 +01:00
lukovnikov
9f3cd27187 clean up pr 2018-11-13 16:48:59 +01:00
Denis
3d4c7a6f5d Delete __init__.py 2018-11-13 16:48:43 +01:00
lukovnikov
d64db6dfb9 clean up pr 2018-11-13 16:41:01 +01:00
lukovnikov
7ba83730c4 clean up pr 2018-11-13 16:31:20 +01:00
lukovnikov
fa0c5a2ea1 clean up pr 2018-11-13 16:24:53 +01:00
lukovnikov
f4d79f44c9 Merge remote-tracking branch 'upstream/master' 2018-11-13 16:22:23 +01:00
Thomas Wolf
5cd8d7ad27 Merge pull request #16 from donatasrep/master: Excluding AdamWeightDecayOptimizer internal variables from restoring 2018-11-13 16:19:28 +01:00
Donatas Repecka
20d07b3a7f Excluding AdamWeightDecayOptimizer internal variables from restoring 2018-11-13 16:56:25 +02:00
Thomas Wolf
278fd28a32 added results for 16-bit fine-tuning in readme 2018-11-13 09:34:49 +01:00
thomwolf
d940eeda54 typo 2018-11-12 15:26:46 +01:00
thomwolf
1cf0a16c67 cleaning up readme 2018-11-12 15:24:47 +01:00
thomwolf
66b0090877 add fp16 training 2018-11-12 15:15:02 +01:00
Thomas Wolf
5dfd19060a fix typo in readme 2018-11-12 12:39:57 +01:00
Thomas Wolf
fa1aa81f26 fix typo in readme bach examples 2018-11-12 08:37:43 +01:00
Thomas Wolf
6d6b916f48 update to BERT-large results 2018-11-11 17:00:49 +01:00
Thomas Wolf
c4bfc646f5 Add results of fine-tuning BERT-large on GPUs 2018-11-11 16:59:35 +01:00
Thomas Wolf
48930a4cff Merge pull request #2 from elyase/patch-1: Port tokenization for the multilingual model 2018-11-10 22:27:45 +01:00
thomwolf
a81a1ef8e9 fixing learning rate schedule when using gradient_accumulation_steps 2018-11-10 16:11:14 +01:00
thomwolf
ea85cca8ab adding optimize_on_cpu explanation in readme 2018-11-09 11:42:37 +01:00
thomwolf
5f04aa00ed option to perform optimization and keep the optimizer averages on CPU 2018-11-09 11:28:14 +01:00
thomwolf
9e95cd8cd6 clean up optimizer from unused functions 2018-11-09 11:23:55 +01:00
thomwolf
34a1a01091 update code comment 2018-11-09 09:31:20 +01:00
thomwolf
34bdc8b54f remove duplicate accumulate gradient step arguments 2018-11-09 09:19:45 +01:00
Thomas Wolf
0c24db9d5f update results for SQuAD 2018-11-09 09:11:59 +01:00
thomwolf
2c5d993ba4 update readme - fix SQuAD model on multi-GPU 2018-11-08 21:22:22 +01:00
Gopal Krishna
4850ec5888 fixed small typos in the README.md (#8) 2018-11-08 15:00:02 -05:00
Thomas Wolf
3bfbc21376 updating pytest command 2018-11-08 00:44:17 +01:00
Thomas Wolf
0ed7696191 Updated MRPC results 2018-11-08 00:39:42 +01:00
thomwolf
48d4a5317c typo fix in output tuple 2018-11-07 23:51:12 +01:00
Thomas Wolf
d92a7f7721 Removing note on run_squad.py example 2018-11-07 23:37:55 +01:00
Thomas Wolf
5c0838d846 Merge pull request #7 from huggingface/develop: Develop 2018-11-07 23:36:46 +01:00
Thomas Wolf
efeb6b1a0d Merge branch 'master' into develop 2018-11-07 23:35:42 +01:00
thomwolf
dbc318a4c6 cleaning up - speeding up a bit multi-gpu 2018-11-07 22:22:55 +01:00
thomwolf
6bb7510a50 fixing pre-processing bug - averaging loss for gradient accumulation - no_grad on evaluation 2018-11-07 22:12:41 +01:00
lukovnikov
bd91ae654f moved bert to qelos-util 2018-11-06 18:21:44 +01:00
lukovnikov
4e52188433 bert weight loading from tf 2018-11-06 17:47:03 +01:00
thomwolf
a1126237a9 clean up logits extraction logic 2018-11-06 17:31:15 +01:00