Commit Graph

829 Commits

Thomas Wolf
35e6baab37
Merge branch 'master' into attention 2019-06-14 16:41:56 +02:00
thomwolf
5e1207b8ad add attention to all bert models and add test 2019-06-14 16:28:25 +02:00
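The attention-output commits above add an `output_attentions` flag to the BERT models. Below is a minimal usage sketch, assuming the flag is accepted as a constructor keyword forwarded by `from_pretrained`, and that the per-layer attention weights are returned ahead of the usual outputs; both assumptions come from the commit descriptions, not the exact code.

```python
import torch
from pytorch_pretrained_bert import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
# Assumption: `output_attentions` is forwarded to the model constructor.
model = BertModel.from_pretrained('bert-base-uncased', output_attentions=True)
model.eval()

tokens = tokenizer.tokenize("[CLS] hello world [SEP]")
input_ids = torch.tensor([tokenizer.convert_tokens_to_ids(tokens)])
with torch.no_grad():
    # Assumption: attentions are returned before the usual outputs.
    attentions, encoded_layers, pooled = model(input_ids)

# One tensor per layer, each (batch, num_heads, seq_len, seq_len).
print(len(attentions), attentions[0].shape)
```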
thomwolf
bcc9e93e6f fix test 2019-06-14 15:38:20 +02:00
Thomas Wolf
f9cde97b31
Merge pull request #675 from meetshah1995/patch-1
[hotfix] Fix frozen pooler parameters in SWAG example.
2019-06-12 10:01:21 +02:00
Meet Pragnesh Shah
e02ce4dc79
[hotfix] Fix frozen pooler parameters in SWAG example. 2019-06-11 15:13:53 -07:00
Thomas Wolf
784c0ed89a
Merge pull request #668 from jeonsworld/patch-2
apply Whole Word Masking technique
2019-06-11 11:29:10 +02:00
jeonsworld
a3a604cefb
Update pregenerate_training_data.py
apply Whole Word Masking technique.
references [create_pretraining_data.py](https://github.com/google-research/bert/blob/master/create_pretraining_data.py)
2019-06-10 12:17:23 +09:00
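PR #668 changes `pregenerate_training_data.py` so that when any WordPiece of a word is selected for masking, all of its sub-tokens are masked together. A self-contained illustration of the idea (not the script's exact code; WordPiece continuations start with `##`):

```python
import random

def whole_word_mask(tokens, mask_prob=0.15):
    # Group sub-token indices into words: a '##' piece joins the previous word.
    words = []
    for i, tok in enumerate(tokens):
        if tok in ("[CLS]", "[SEP]"):
            continue
        if words and tok.startswith("##"):
            words[-1].append(i)
        else:
            words.append([i])

    random.shuffle(words)
    num_to_mask = max(1, round(len(tokens) * mask_prob))
    output, masked = list(tokens), 0
    for word in words:
        if masked + len(word) > num_to_mask:
            continue
        for i in word:                 # mask every piece of the chosen word
            output[i] = "[MASK]"
        masked += len(word)
    return output

print(whole_word_mask("[CLS] un ##aff ##able weather today [SEP]".split()))
```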
VictorSanh
ee0308f79d fix typo 2019-06-06 17:30:49 +02:00
VictorSanh
2d07f945ad fix error with torch.no_grad and loss computation 2019-06-06 17:10:24 +02:00
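The `torch.no_grad` fix above points at a common pitfall worth illustrating: a loss computed inside a `no_grad()` block carries no autograd graph, so calling `backward()` on it fails. A small standalone example of the pitfall (not the repository's actual code):

```python
import torch
import torch.nn.functional as F

model = torch.nn.Linear(4, 1)
x, y = torch.randn(8, 4), torch.randn(8, 1)

# Wrong for training: under no_grad the loss has no grad_fn.
with torch.no_grad():
    eval_loss = F.mse_loss(model(x), y)
print(eval_loss.requires_grad)  # False -> eval_loss.backward() would raise

# Right: keep the training forward pass outside no_grad.
train_loss = F.mse_loss(model(x), y)
train_loss.backward()
```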
VictorSanh
6b8d227092 some cleaning 2019-06-06 17:07:03 +02:00
VictorSanh
122d5c52ac distinguish what is not trained 2019-06-06 17:02:51 +02:00
VictorSanh
2647ac3294 forgot BertForPreTraining 2019-06-06 16:57:40 +02:00
VictorSanh
cf44d98392 Add more examples to BERT models for torchhub 2019-06-06 16:36:02 +02:00
thomwolf
a3274ac40b adding attention outputs in bert 2019-06-03 16:11:45 -05:00
VictorSanh
826496580b Revert "add output_attentions for BertModel"
This reverts commit de5e5682a1.
2019-06-03 17:10:25 -04:00
VictorSanh
de5e5682a1 add output_attentions for BertModel 2019-06-03 17:05:24 -04:00
Thomas Wolf
2a329c6186
Merge pull request #651 from huggingface/gpt_torchhub
Add GPT* compatibility to torchhub
2019-05-31 14:44:52 +02:00
VictorSanh
45d21502f0 update doc 2019-05-31 01:04:16 -04:00
VictorSanh
98f5c7864f decorrelate dependencies + fix bug 2019-05-31 01:00:29 -04:00
VictorSanh
c8bd026ef6 move dependencies list to hubconf 2019-05-31 00:36:58 -04:00
VictorSanh
19ef2b0a66 Fix typo in hubconf 2019-05-31 00:33:33 -04:00
VictorSanh
d0f591051c gpt_hubconf 2019-05-31 00:28:10 -04:00
VictorSanh
4a210c9fc6 Move bert_hubconf to hubconfs 2019-05-31 00:28:00 -04:00
VictorSanh
0c5a4fe9c9 modify from_pretrained for OpenAIGPT 2019-05-31 00:27:18 -04:00
VictorSanh
372a5c1cee Hubconf doc - Special case loading 2019-05-30 16:06:21 -04:00
Victor SANH
96592b544b
default in __init__s for classification BERT models (#650) 2019-05-30 15:53:13 -04:00
VictorSanh
4cda86b08f Update hubconf for torchhub: paths+examples+doc 2019-05-30 18:38:00 +00:00
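The torchhub series above adds `hubconf.py` entry points so the models load through `torch.hub`. A minimal sketch of the pattern (the entry-point name and dependency list here are illustrative, not copied from the repository):

```python
# hubconf.py -- torch.hub entry-point pattern (illustrative)
dependencies = ['torch', 'tqdm', 'boto3', 'requests', 'regex']

def bertModel(*args, **kwargs):
    """Load a pretrained BertModel, e.g. bertModel('bert-base-uncased')."""
    from pytorch_pretrained_bert import BertModel
    return BertModel.from_pretrained(*args, **kwargs)
```

With such a file on the default branch, `torch.hub.load('huggingface/pytorch-pretrained-BERT', 'bertModel', 'bert-base-uncased')` fetches the repo and builds the model.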
Thomas Wolf
3fc63f126d
Merge pull request #598 from burcturkoglu/master
Updating learning rate with special warm up in examples
2019-05-10 13:48:12 +02:00
burcturkoglu
00c7fd2b79 Division of global_step by num_train_optimizer in lr_this_step is removed. 2019-05-09 10:57:03 +03:00
burcturkoglu
fa37b4da77 Merge branch 'master' of https://github.com/huggingface/pytorch-pretrained-BERT 2019-05-09 10:55:24 +03:00
burcturkoglu
5289b4b9e0 Division of global_step by num_train_optimizer in lr_this_step is removed. 2019-05-09 10:51:38 +03:00
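The PR #598 commits above fix how the example scripts compute the per-step learning rate: the linear-warmup schedule should receive the training-progress fraction once, without a redundant division of `global_step`. A sketch of the package-style schedule and the intended call (the variable names in the comment are assumptions inferred from the commit message):

```python
def warmup_linear(x, warmup=0.002):
    """pytorch-pretrained-bert-style schedule: linear warmup, then linear decay."""
    if x < warmup:
        return x / warmup
    return 1.0 - x

# Intended per-step update in the examples (sketch):
# lr_this_step = args.learning_rate * warmup_linear(
#     global_step / num_train_optimization_steps, args.warmup_proportion)
```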
thomwolf
275179a003 output attentions in GPT-2 2019-05-08 22:24:42 +02:00
thomwolf
366a3b0285 clean up in tokenization 2019-05-08 21:43:51 +02:00
Thomas Wolf
701bd59b8b
Merge pull request #585 from huntzhan/master
Make the epsilon of LayerNorm configurable.
2019-05-08 16:56:38 +02:00
Thomas Wolf
303b5e2b92
Merge pull request #545 from ailzhang/cache_dir
move pytorch_pretrained_bert cache folder under same path as torch
2019-05-08 16:55:27 +02:00
Thomas Wolf
0198399d84
Merge pull request #570 from MottoX/fix-1
Create optimizer only when args.do_train is True
2019-05-08 16:07:50 +02:00
Thomas Wolf
50fa92c026
Merge pull request #571 from MottoX/patch-1
Fix documentation typo
2019-05-08 16:06:13 +02:00
thomwolf
0efc4ab632 adding dropout to GPT-2 and embedding dropout to GPT 2019-05-08 10:41:35 +02:00
thomwolf
ea9dbea9d5 update GPT2 loss computation for more flexibility 2019-05-07 23:27:18 +02:00
thomwolf
ce86336545 add predict_special_tokens option to GPT also 2019-05-07 16:47:22 +02:00
thomwolf
d1b6979aa5 GPT-2 option to avoid predicting special tokens 2019-05-07 16:25:53 +02:00
huntzhan
101ab4dd8e Make the epsilon of LayerNorm configurable. 2019-05-06 00:26:21 +08:00
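PR #585 makes the variance epsilon of BERT's LayerNorm a parameter instead of a hard-coded constant. A sketch of the module with the configurable `eps`, following the package's TF-style LayerNorm (the config plumbing that feeds `eps` is omitted):

```python
import torch

class BertLayerNorm(torch.nn.Module):
    """LayerNorm with a configurable variance epsilon."""
    def __init__(self, hidden_size, eps=1e-12):
        super(BertLayerNorm, self).__init__()
        self.weight = torch.nn.Parameter(torch.ones(hidden_size))
        self.bias = torch.nn.Parameter(torch.zeros(hidden_size))
        self.variance_epsilon = eps  # previously fixed, now configurable

    def forward(self, x):
        u = x.mean(-1, keepdim=True)
        s = (x - u).pow(2).mean(-1, keepdim=True)
        x = (x - u) / torch.sqrt(s + self.variance_epsilon)
        return self.weight * x + self.bias
```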
thomwolf
e211785ada extract attention weights from GPT 2019-05-02 18:31:26 +02:00
MottoX
18c8aef9d3 Fix documentation typo 2019-05-02 19:23:36 +08:00
MottoX
74dbba64bc Prepare optimizer only when args.do_train is True 2019-05-02 19:09:29 +08:00
thomwolf
db98a4a48b gpt-2 tokenizer 2019-05-01 11:40:48 +02:00
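The GPT-2 tokenizer commit brings byte-level BPE into the package: any UTF-8 string tokenizes without unknown tokens and round-trips exactly. A hedged usage sketch, assuming the package exposes a `GPT2Tokenizer` with `from_pretrained` / `encode` / `decode`:

```python
from pytorch_pretrained_bert import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
ids = tokenizer.encode("Hello world!")   # byte-level BPE ids
print(ids)
print(tokenizer.decode(ids))             # round-trips the original text
```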
Thomas Wolf
3ae8c8be1e
Merge pull request #562 from apappu97/roc_stories_lmlabels_fix
Small fix to remove shifting of lm labels during preprocessing of RocStories.
2019-05-01 11:20:17 +02:00
Thomas Wolf
e89520175d
Merge pull request #564 from 8enmann/patch-2
Fix #537
2019-05-01 11:18:46 +02:00
Ben Mann
74f7906db4
Fix #537 2019-04-30 19:48:22 -07:00
Aneesh Pappu
365fb34c6c small fix to remove shifting of lm labels during preprocessing of RocStories, as this shifting happens internally in the model 2019-04-30 13:53:04 -07:00
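The point of PR #562 is that the RocStories double-heads model already shifts logits and labels internally when computing the LM loss, so preprocessing must pass labels aligned with the inputs. A minimal sketch of that internal-shift convention (an illustration, not the model's exact code):

```python
import torch
import torch.nn.functional as F

def lm_loss_with_internal_shift(lm_logits, lm_labels, ignore_index=-1):
    """Next-token loss: the model shifts, so callers pass unshifted labels."""
    shift_logits = lm_logits[:, :-1, :].contiguous()  # predict token t+1 from t
    shift_labels = lm_labels[:, 1:].contiguous()
    return F.cross_entropy(shift_logits.view(-1, shift_logits.size(-1)),
                           shift_labels.view(-1), ignore_index=ignore_index)

logits = torch.randn(2, 10, 100)           # (batch, seq_len, vocab)
labels = torch.randint(0, 100, (2, 10))    # same length as inputs, unshifted
print(lm_loss_with_internal_shift(logits, labels))
```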