thomwolf
44e9ddd7fe
fix num_special_tokens in GPT 2 test
2019-06-14 17:17:43 +02:00
Thomas Wolf
ff276fc00c
Merge branch 'master' into finish_torchhub_interfaces
2019-06-14 16:59:07 +02:00
Thomas Wolf
35e6baab37
Merge branch 'master' into attention
2019-06-14 16:41:56 +02:00
thomwolf
5e1207b8ad
add attention to all bert models and add test
2019-06-14 16:28:25 +02:00
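The attention commits above (a3274ac40b, 5e1207b8ad) expose per-layer attention probabilities alongside the hidden states. A minimal sketch of the general pattern, not the repository's exact API (the `output_attentions` flag and return layout here are illustrative assumptions):

```python
import torch
import torch.nn as nn

class TinySelfAttention(nn.Module):
    """Single-head self-attention that can also return its attention probabilities."""
    def __init__(self, hidden_size):
        super().__init__()
        self.query = nn.Linear(hidden_size, hidden_size)
        self.key = nn.Linear(hidden_size, hidden_size)
        self.value = nn.Linear(hidden_size, hidden_size)
        self.scale = hidden_size ** -0.5

    def forward(self, x, output_attentions=False):
        q, k, v = self.query(x), self.key(x), self.value(x)
        probs = torch.softmax(q @ k.transpose(-1, -2) * self.scale, dim=-1)
        out = probs @ v
        return out, (probs if output_attentions else None)

class TinyEncoder(nn.Module):
    """Stack of layers; optionally collects one attention map per layer."""
    def __init__(self, hidden_size=16, num_layers=2):
        super().__init__()
        self.layers = nn.ModuleList(TinySelfAttention(hidden_size) for _ in range(num_layers))

    def forward(self, x, output_attentions=False):
        all_attentions = []
        for layer in self.layers:
            x, probs = layer(x, output_attentions)
            if output_attentions:
                all_attentions.append(probs)  # one (batch, seq, seq) map per layer
        return (all_attentions, x) if output_attentions else x

encoder = TinyEncoder()
hidden = encoder(torch.randn(1, 5, 16))                                  # hidden states only
attns, hidden = encoder(torch.randn(1, 5, 16), output_attentions=True)   # plus attention maps
```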
thomwolf
a3274ac40b
adding attention outputs in bert
2019-06-03 16:11:45 -05:00
VictorSanh
826496580b
Revert "add output_attentions for BertModel"
This reverts commit de5e5682a1.
2019-06-03 17:10:25 -04:00
VictorSanh
de5e5682a1
add output_attentions for BertModel
2019-06-03 17:05:24 -04:00
VictorSanh
8f97f6c57f
fix typo
cc @thomwolf
2019-06-01 17:29:07 -04:00
VictorSanh
c0c7ff5751
add transformer xl compatibility for torchhub
2019-06-01 16:08:24 -04:00
VictorSanh
a92b6dc3c1
add GPT2 torchhub compatibility
2019-06-01 15:27:43 -04:00
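The two torchhub commits above (c0c7ff5751, a92b6dc3c1) add hub entry points so the models can be loaded without installing the package from source. A hedged usage sketch; the repository string and entry-point names below are assumptions rather than verified contents of the project's hubconf.py:

```python
import torch

# Entry-point names ('gpt2Tokenizer', 'gpt2Model') are illustrative assumptions.
tokenizer = torch.hub.load('huggingface/pytorch-pretrained-BERT', 'gpt2Tokenizer', 'gpt2')
model = torch.hub.load('huggingface/pytorch-pretrained-BERT', 'gpt2Model', 'gpt2')
model.eval()

tokens = torch.tensor([tokenizer.encode("Who was Jim Henson? Jim Henson was a")])
with torch.no_grad():
    # Output layout may differ; GPT-2 models of this era returned hidden states plus cached key/values.
    hidden_states, presents = model(tokens)
```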
VictorSanh
0c5a4fe9c9
modify from_pretrained for OpenAIGPT
2019-05-31 00:27:18 -04:00
Victor SANH
96592b544b
default in __init__s for classification BERT models (#650)
2019-05-30 15:53:13 -04:00
thomwolf
275179a003
output attentions in GPT-2
2019-05-08 22:24:42 +02:00
thomwolf
366a3b0285
clean up in tokenization
2019-05-08 21:43:51 +02:00
Thomas Wolf
701bd59b8b
Merge pull request #585 from huntzhan/master
Make the epsilon of LayerNorm configurable.
2019-05-08 16:56:38 +02:00
Thomas Wolf
303b5e2b92
Merge pull request #545 from ailzhang/cache_dir
move pytorch_pretrained_bert cache folder under same path as torch
2019-05-08 16:55:27 +02:00
thomwolf
0efc4ab632
adding dropout to GPT-2 and embedding dropout to GPT
2019-05-08 10:41:35 +02:00
thomwolf
ea9dbea9d5
update GPT2 loss computation for more flexibility
2019-05-07 23:27:18 +02:00
thomwolf
ce86336545
add predict_special_tokens option to GPT also
2019-05-07 16:47:22 +02:00
thomwolf
d1b6979aa5
GPT-2 option to avoid predicting special tokens
2019-05-07 16:25:53 +02:00
huntzhan
101ab4dd8e
Make the epsilon of LayerNorm configurable.
2019-05-06 00:26:21 +08:00
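Commit 101ab4dd8e (merged above as #585) makes the LayerNorm epsilon a configuration value instead of a hard-coded constant. A minimal sketch of the idea, with illustrative names (`layer_norm_eps` is an assumption, not necessarily the field used in the repository):

```python
import torch
import torch.nn as nn

class TinyConfig:
    def __init__(self, hidden_size=8, layer_norm_eps=1e-12):
        self.hidden_size = hidden_size
        self.layer_norm_eps = layer_norm_eps  # configurable instead of hard-coded

class TinyBlock(nn.Module):
    def __init__(self, config):
        super().__init__()
        # epsilon flows from the config into the normalization layer
        self.layer_norm = nn.LayerNorm(config.hidden_size, eps=config.layer_norm_eps)

    def forward(self, x):
        return self.layer_norm(x)

block = TinyBlock(TinyConfig(hidden_size=8, layer_norm_eps=1e-5))
print(block(torch.randn(2, 4, 8)).shape)  # torch.Size([2, 4, 8])
```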
thomwolf
e211785ada
extract attention weights from GPT
2019-05-02 18:31:26 +02:00
thomwolf
db98a4a48b
gpt-2 tokenizer
2019-05-01 11:40:48 +02:00
Ben Mann
74f7906db4
Fix #537
2019-04-30 19:48:22 -07:00
thomwolf
80f53f7380
gpt-2 from_pretrained can use special tokens
2019-04-30 11:10:22 +02:00
thomwolf
e79ceb1533
gpt-2 special tokens
2019-04-30 11:05:54 +02:00
thomwolf
c30139a013
add special tokens to gpt-2
2019-04-30 10:45:26 +02:00
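The three commits above add user-defined special tokens to the GPT-2 tokenizer and let from_pretrained account for them in the model. A hedged usage sketch; the keyword argument names mirror the pre-existing OpenAI GPT interface and are assumptions here:

```python
from pytorch_pretrained_bert import GPT2Tokenizer, GPT2LMHeadModel

SPECIAL_TOKENS = ["<bos>", "<speaker1>", "<speaker2>", "<eos>"]

# The tokenizer assigns ids to the new symbols and the model grows its embedding
# matrix to cover them (keyword argument names are illustrative assumptions).
tokenizer = GPT2Tokenizer.from_pretrained("gpt2", special_tokens=SPECIAL_TOKENS)
model = GPT2LMHeadModel.from_pretrained("gpt2", num_special_tokens=len(SPECIAL_TOKENS))
```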
Ailing Zhang
3963d57c89
move pytorch_pretrained_bert cache folder under same path as torch
2019-04-27 11:09:11 -07:00
thomwolf
b832d5bb8a
Release: 0.6.2
2019-04-25 21:37:47 +02:00
Thomas Wolf
e6cf62d499
Merge pull request #488 from dhpollack/fix_multichoice
fixed BertForMultipleChoice model init and forward pass
2019-04-25 21:04:16 +02:00
lukovnikov
704037ad51
- updated docs for new LR API
- added some images for illustration
- updated comments in optimization
2019-04-25 15:59:39 +02:00
Thomas Wolf
d76a57b0ba
Merge pull request #506 from ailzhang/hubconf
Hubconf
2019-04-24 20:59:21 +02:00
thomwolf
80f995a141
revert BertForMultipleChoice linear classifier
2019-04-24 16:51:54 +02:00
lukovnikov
69850b4011
python 2 compat
2019-04-21 14:02:38 +02:00
lukovnikov
bb7557d3ab
- removed __all__ in optimization
- removed unused plotting code
- using ABC for LRSchedule
- added some schedule object init tests
2019-04-21 13:48:33 +02:00
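Commit bb7557d3ab reorganizes the learning-rate schedules around an abstract base class. A minimal sketch of that shape (class and method names are illustrative assumptions, not the repository's exact optimization module):

```python
from abc import ABC, abstractmethod

class LRSchedule(ABC):
    """Maps training progress in [0, 1] to a learning-rate multiplier, with linear warmup."""
    def __init__(self, warmup=0.002, t_total=1000):
        self.warmup = warmup
        self.t_total = t_total

    def get_lr(self, step):
        progress = step / self.t_total
        if progress < self.warmup:
            return progress / self.warmup   # linear warmup phase
        return self.get_lr_(progress)       # schedule-specific decay

    @abstractmethod
    def get_lr_(self, progress):
        ...

class WarmupLinearSchedule(LRSchedule):
    def get_lr_(self, progress):
        return max(0.0, (1.0 - progress) / (1.0 - self.warmup))

schedule = WarmupLinearSchedule(warmup=0.1, t_total=100)
print([round(schedule.get_lr(s), 2) for s in (0, 5, 10, 50, 100)])  # ramps up, then decays to 0
```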
lukovnikov
34ccc8ebf4
Merge remote-tracking branch 'upstream/master'
2019-04-21 13:16:15 +02:00
Ailing Zhang
bfd6f6b257
fix from_pretrained positional args
2019-04-17 16:31:40 -07:00
thomwolf
23d4554ec0
is python 2 happy now
2019-04-17 14:48:34 +02:00
thomwolf
265550ec34
relax network connection requirements
2019-04-17 14:22:35 +02:00
thomwolf
fa76520240
fix file_utils on python 2
2019-04-17 13:32:22 +02:00
thomwolf
bcde2c61cb
fix #497
2019-04-17 12:35:38 +02:00
Thomas Wolf
2e153930cf
Merge pull request #495 from SudoSharma/patch-2
Fix gradient overflow issue during attention mask
2019-04-17 11:10:36 +02:00
thomwolf
5afa497cbf
fix GPT-2 tokenization to work also on python 3...
2019-04-17 11:04:41 +02:00
thomwolf
bc70779bf0
fixed GPT-2 tokenization on python 2
2019-04-17 10:56:15 +02:00
Abhi Sharma
9e666aaa29
Fix gradient overflow issue during attention mask
This fix is in reference to issue #382. GPT2 can now be trained in mixed precision, which I've confirmed with testing. I also tested unconditional generation on multiple seeds before and after changing 1e10 to 1e4 and there was no difference. Please let me know if there is anything else I can do to make this pull request better. Thanks for all your work!
2019-04-16 11:42:34 -07:00
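The commit body above explains the fix: the constant used to mask out future positions in GPT-2's attention is reduced from 1e10 to 1e4 so the scores stay representable in float16. An illustrative sketch of why that matters (not the repository's exact attention code):

```python
import torch

scores = torch.randn(4, 4)
causal = torch.tril(torch.ones(4, 4))           # lower-triangular causal bitmask

# Masked (future) positions get a large negative score before the softmax.
masked = scores * causal - 1e4 * (1 - causal)   # fp16-safe fill value
probs = torch.softmax(masked, dim=-1)           # upper-triangle probabilities ~0

# float16 tops out around 65504, so the old 1e10 constant overflowed in mixed precision.
print(torch.tensor(1e10).half())                # tensor(inf)   -> overflow, NaN risk
print(torch.tensor(1e4).half())                 # tensor(10000.)
```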
thomwolf
bdaba1897c
updating GPT tokenization
2019-04-16 17:44:06 +02:00
thomwolf
18a8a15f78
improving GPT2 tokenization and adding tests
2019-04-16 17:00:55 +02:00
Thomas Wolf
3d78e226e6
Merge pull request #489 from huggingface/tokenization_serialization
Better serialization for Tokenizers and Configuration classes - Also fix #466
2019-04-16 08:49:54 +02:00
Thomas Wolf
64b6ef4db0
Merge pull request #490 from huggingface/better_finetuning_GPT_GPT-2
Clean up GPT and GPT-2 losses computation
2019-04-15 16:14:50 +02:00
thomwolf
d616022455
fix openai special tokens loading
2019-04-15 16:07:45 +02:00