Commit Graph

44 Commits

Author SHA1 Message Date
thomwolf
99ae5ab883 update config tests and circle-ci 2019-07-02 12:40:39 +02:00
thomwolf
1484d67de9 [LARGE] updating all tests and API 2019-07-02 12:13:17 +02:00
thomwolf
4d47f4985d slight refactoring, add abstract class for model loading 2019-06-26 12:52:44 +02:00
thomwolf
7e3070ae4f add from_pretrained method to all configuration classes 2019-06-26 11:12:00 +02:00
thomwolf
45709d7532 model running with simple inputs 2019-06-21 00:28:42 +02:00
thomwolf
7f00a36e27 pruning should keep on device 2019-06-19 22:23:12 +02:00
thomwolf
34d706a0e1 pruning in bertology 2019-06-19 15:25:49 +02:00
thomwolf
64e0adda81 better error message 2019-06-18 10:51:31 +02:00
thomwolf
382e2d1e50 spliting config and weight files for bert also 2019-06-18 10:37:16 +02:00
thomwolf
33d3db5c43 updating head masking, readme and docstrings 2019-06-17 15:51:28 +02:00
thomwolf
965f172de6 output all hidden layers states in GPT/GPT-2 2019-06-17 14:34:12 +02:00
thomwolf
b860e47cf5 add head masking and pruning to gpt-2 2019-06-17 14:12:10 +02:00
thomwolf
8415a38b23 better error messages 2019-06-17 13:03:48 +02:00
thomwolf
44e9ddd7fe fix num_special_tokens in GPT 2 test 2019-06-14 17:17:43 +02:00
Thomas Wolf
ff276fc00c Merge branch 'master' into finish_torchhub_interfaces 2019-06-14 16:59:07 +02:00
VictorSanh
8f97f6c57f fix typo 2019-06-01 17:29:07 -04:00
cc @thomwolf
VictorSanh
a92b6dc3c1 add GPT2 torchhub compatibility 2019-06-01 15:27:43 -04:00
thomwolf
275179a003 output attentions in GPT-2 2019-05-08 22:24:42 +02:00
thomwolf
366a3b0285 clean up in tokenization 2019-05-08 21:43:51 +02:00
thomwolf
0efc4ab632 adding dropout to GPT-2 and embedding dropout to GPT 2019-05-08 10:41:35 +02:00
thomwolf
ea9dbea9d5 update GPT2 loss computation for more flexbility 2019-05-07 23:27:18 +02:00
thomwolf
d1b6979aa5 GPT-2 option to avoid predicting special tokens 2019-05-07 16:25:53 +02:00
thomwolf
80f53f7380 gpt-2 from_pretrained can use special tokens 2019-04-30 11:10:22 +02:00
thomwolf
e79ceb1533 gpt-2 special tokens 2019-04-30 11:05:54 +02:00
thomwolf
c30139a013 add special tokens to gpt-2 2019-04-30 10:45:26 +02:00
Abhi Sharma
9e666aaa29 Fix gradient overflow issue during attention mask 2019-04-16 11:42:34 -07:00
This fix is in reference to issue #382. GPT2 can now be trained in mixed precision, which I've confirmed with testing. I also tested unconditional generation on multiple seeds before and after changing 1e10 to 1e4 and there was no difference. Please let me know if there is anything else I can do to make this pull request better. Thanks for all your work!
thomwolf
df5d9c3551 load all models on cpu 2019-04-15 15:43:01 +02:00
thomwolf
60ea6c59d2 added best practices for serialization in README and examples 2019-04-15 15:00:33 +02:00
thomwolf
9761aa4845 add to_json_file method to configuration classes 2019-04-15 14:12:08 +02:00
Catalin Voss
01520d5412 Remove my unhelpful comments :) 2019-03-27 10:45:28 -07:00
Catalin Voss
fda2f62395 Fix test failures due to old torch issue with non-contiguous view 2019-03-24 14:37:13 -07:00
Catalin Voss
0dd796e359 Also fix loss function issue with the double head models 2019-03-24 14:35:55 -07:00
Catalin Voss
472857c47f Fix typo syntax err (sorry, c/p from my repo) 2019-03-24 14:14:49 -07:00
Catalin Voss
5938f31fa7 Fix c/p typo from my experiment code 2019-03-24 14:14:40 -07:00
Catalin Voss
7797d21b8d Fix GPT2 language modeling loss computation 2019-03-24 14:14:35 -07:00
thomwolf
e5f2d9122c adding absolute imports to gpt2, openai and transfo-xl 2019-03-14 09:55:01 +01:00
thomwolf
5c85fc3977 fix typo - logger info 2019-03-06 10:05:21 +01:00
Joel Grus
8722e9eb3b finish updating docstrings 2019-02-23 06:31:59 -08:00
Joel Grus
33aa7a80ca update documentation 2019-02-22 15:37:59 -08:00
thomwolf
690a0dbf36 fix example - masking 2019-02-18 10:50:30 +01:00
thomwolf
fbb248a2e4 examples testing 2019-02-18 01:28:18 +01:00
thomwolf
5ff0c60505 language update 2019-02-18 00:55:47 +01:00
thomwolf
009ee86a19 fix tests - bump up version 2019-02-17 23:57:23 +01:00
thomwolf
ffd623823d adding gpt2 2019-02-17 23:38:51 +01:00
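
The note on commit 9e666aaa29 mentions changing the attention-mask constant from 1e10 to 1e4 so that GPT2 can be trained in mixed precision. The sketch below illustrates one likely reason, assuming the constant ends up stored in IEEE 754 half precision, whose largest finite value is 65504. It uses only the Python standard library, and `fits_in_fp16` and `masked_score` are hypothetical helpers written for this illustration, not functions from the repository.

```python
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def fits_in_fp16(value: float) -> bool:
    """Return True if `value` can be stored as a finite half-precision float."""
    try:
        struct.pack("<e", value)  # "e" is the struct format code for binary16
        return True
    except OverflowError:
        return False


def masked_score(score: float, keep: int, mask_value: float) -> float:
    """Hypothetical masking rule: keep the score where keep == 1,
    otherwise push it to a large negative number before softmax."""
    return score * keep - mask_value * (1 - keep)


# Assumed values: a masked-out position (keep == 0) with each constant.
old_masked = masked_score(0.5, 0, 1e10)
new_masked = masked_score(0.5, 0, 1e4)

print(fits_in_fp16(old_masked))  # 1e10 is far above FP16_MAX
print(fits_in_fp16(new_masked))  # 1e4 is below FP16_MAX
```

A masked score of -1e4 is still negative enough that its softmax weight is effectively zero, which is consistent with the commit note reporting no difference in generation after the change.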