Commit Graph

265 Commits

Author SHA1 Message Date
thomwolf
868de8d1d7 updating weights loading 2019-06-18 10:58:20 +02:00
thomwolf
64e0adda81 better error message 2019-06-18 10:51:31 +02:00
thomwolf
382e2d1e50 spliting config and weight files for bert also 2019-06-18 10:37:16 +02:00
thomwolf
33d3db5c43 updating head masking, readme and docstrings 2019-06-17 15:51:28 +02:00
thomwolf
965f172de6 output all hidden layers states in GPT/GPT-2 2019-06-17 14:34:12 +02:00
thomwolf
f12007e421 add head masking and pruning to openai GPT 2019-06-17 14:19:40 +02:00
thomwolf
b860e47cf5 add head masking and pruning to gpt-2 2019-06-17 14:12:10 +02:00
thomwolf
7220d47a1c adding head pruning and tests 2019-06-17 13:20:45 +02:00
thomwolf
8415a38b23 better error messages 2019-06-17 13:03:48 +02:00
thomwolf
96c4d3d988 add head masking tests 2019-06-17 12:17:26 +02:00
thomwolf
34858ae1d9 adding bert whole words, bertgerman and gpt-2 medium models, head masking 2019-06-17 11:02:39 +02:00
Thomas Wolf
80684f6f86 Merge pull request #690 from shashwath94/projadpsftmax_fix
Transformer XL ProjectedAdaptiveLogSoftmax output fix
2019-06-15 23:14:10 +02:00
Thomas Wolf
9e363703d6 Merge pull request #688 from deepset-ai/german_bert
Add German Bert model to code, update readme
2019-06-15 23:13:41 +02:00
vanche
8289646d4e import class "GPT2MultipleChoiceHead" 2019-06-15 22:19:30 +09:00
Shashwath H A
5076a5daa7 Fix proj adp softmax output return when n_clusters=0 2019-06-14 22:03:21 -04:00
timoeller
16af9ff7b0 Add German Bert model to code, update readme 2019-06-14 17:42:46 +02:00
thomwolf
44e9ddd7fe fix num_special_tokens in GPT 2 test 2019-06-14 17:17:43 +02:00
Thomas Wolf
ff276fc00c Merge branch 'master' into finish_torchhub_interfaces 2019-06-14 16:59:07 +02:00
Thomas Wolf
35e6baab37 Merge branch 'master' into attention 2019-06-14 16:41:56 +02:00
thomwolf
5e1207b8ad add attention to all bert models and add test 2019-06-14 16:28:25 +02:00
thomwolf
a3274ac40b adding attention outputs in bert 2019-06-03 16:11:45 -05:00
VictorSanh
826496580b Revert "add output_attentions for BertModel"
This reverts commit de5e5682a1.
2019-06-03 17:10:25 -04:00
VictorSanh
de5e5682a1 add output_attentions for BertModel 2019-06-03 17:05:24 -04:00
VictorSanh
8f97f6c57f fix typo
cc @thomwolf
2019-06-01 17:29:07 -04:00
VictorSanh
c0c7ff5751 add transformer xl compatibility for torchhub 2019-06-01 16:08:24 -04:00
VictorSanh
a92b6dc3c1 add GPT2 torchhub compatibility 2019-06-01 15:27:43 -04:00
VictorSanh
0c5a4fe9c9 modify from_pretrained for OpenAIGPT 2019-05-31 00:27:18 -04:00
Victor SANH
96592b544b default in __init__s for classification BERT models (#650) 2019-05-30 15:53:13 -04:00
thomwolf
275179a003 output attentions in GPT-2 2019-05-08 22:24:42 +02:00
thomwolf
366a3b0285 clean up in tokenization 2019-05-08 21:43:51 +02:00
Thomas Wolf
701bd59b8b Merge pull request #585 from huntzhan/master
Make the epsilon of LayerNorm configurable.
2019-05-08 16:56:38 +02:00
Thomas Wolf
303b5e2b92 Merge pull request #545 from ailzhang/cache_dir
move pytroch_pretrained_bert cache folder under same path as torch
2019-05-08 16:55:27 +02:00
thomwolf
0efc4ab632 adding dropout to GPT-2 and embedding dropout to GPT 2019-05-08 10:41:35 +02:00
thomwolf
ea9dbea9d5 update GPT2 loss computation for more flexbility 2019-05-07 23:27:18 +02:00
thomwolf
ce86336545 add predict_special_tokens option to GPT also 2019-05-07 16:47:22 +02:00
thomwolf
d1b6979aa5 GPT-2 option to avoid predicting special tokens 2019-05-07 16:25:53 +02:00
huntzhan
101ab4dd8e Make the epsilon of LayerNorm configurable. 2019-05-06 00:26:21 +08:00
thomwolf
e211785ada extract attention weights from GPT 2019-05-02 18:31:26 +02:00
thomwolf
db98a4a48b gpt-2 tokenizer 2019-05-01 11:40:48 +02:00
Ben Mann
74f7906db4 Fix #537 2019-04-30 19:48:22 -07:00
thomwolf
80f53f7380 gpt-2 from_pretrained can use special tokens 2019-04-30 11:10:22 +02:00
thomwolf
e79ceb1533 gpt-2 special tokens 2019-04-30 11:05:54 +02:00
thomwolf
c30139a013 add special tokens to gpt-2 2019-04-30 10:45:26 +02:00
Ailing Zhang
3963d57c89 move pytroch_pretrained_bert cache folder under same path as torch 2019-04-27 11:09:11 -07:00
thomwolf
b832d5bb8a Release: 0.6.2 2019-04-25 21:37:47 +02:00
Thomas Wolf
e6cf62d499 Merge pull request #488 from dhpollack/fix_multichoice
fixed BertForMultipleChoice model init and forward pass
2019-04-25 21:04:16 +02:00
lukovnikov
704037ad51 - updated docs for new LR API
- added some images for illustration
- updated comments in optimization
2019-04-25 15:59:39 +02:00
Thomas Wolf
d76a57b0ba Merge pull request #506 from ailzhang/hubconf
Hubconf
2019-04-24 20:59:21 +02:00
thomwolf
80f995a141 revert BertForMultipleChoice linear classifier 2019-04-24 16:51:54 +02:00
lukovnikov
69850b4011 python 2 compat 2019-04-21 14:02:38 +02:00