Commit Graph

119 Commits

Author SHA1 Message Date
thomwolf
e0855e8929 forgot to add regex to requirements :( 2019-02-18 11:54:51 +01:00
thomwolf
ab7f5d2943 simple 2019-02-18 11:33:54 +01:00
thomwolf
b450a7faf2 clean up tokenization - fix python 2 tests 2019-02-18 11:27:18 +01:00
thomwolf
d44db1145c update readme 2019-02-18 11:12:09 +01:00
thomwolf
690a0dbf36 fix example - masking 2019-02-18 10:50:30 +01:00
thomwolf
fbb248a2e4 examples testing 2019-02-18 01:28:18 +01:00
thomwolf
5ff0c60505 language update 2019-02-18 00:55:47 +01:00
thomwolf
210d407245 updating init 2019-02-18 00:55:39 +01:00
thomwolf
009ee86a19 fix tests - bump up version 2019-02-17 23:57:23 +01:00
thomwolf
ffd623823d adding gpt2 2019-02-17 23:38:51 +01:00
Dan Hendrycks
434d15da8e Update activation function docstring 2019-02-16 12:17:52 -08:00
thomwolf
321d70a7a9 bump up to 0.5.1 2019-02-13 10:11:20 +01:00
thomwolf
c6bea08448 OpenAI GPT Tokenizer can fallback on using BERT BasicTokenizer 2019-02-13 10:11:00 +01:00
thomwolf
e7cfc46fc1 fix TransfoXLModel loading 2019-02-13 09:32:46 +01:00
thomwolf
e8fe6b7140 adapting transfo tokenizer to transposed inputs 2019-02-11 13:30:04 +01:00
thomwolf
884ca81d87 transposing the inputs of Transformer-XL to have a unified interface 2019-02-11 13:19:59 +01:00
thomwolf
2071a9b86e fix python 2.7 imports 2019-02-11 10:35:36 +01:00
thomwolf
b514a60c36 added tests for OpenAI GPT and Transformer-XL tokenizers 2019-02-11 10:17:16 +01:00
thomwolf
f0bf81e141 back compatibility with Path inputs in fle_utils 2019-02-09 17:05:23 +01:00
thomwolf
6cd769957e update transfo xl example 2019-02-09 16:59:17 +01:00
thomwolf
1320e4ec0c mc_token_mask => mc_token_ids 2019-02-09 16:58:53 +01:00
thomwolf
cfcb95417c fix hasattr 2019-02-08 23:08:53 +01:00
thomwolf
1756b5e956 fix loading from Transfo-XL LM model 2019-02-08 22:32:17 +01:00
thomwolf
dadd0c1b13 updating __main__ 2019-02-08 22:31:57 +01:00
thomwolf
102c6b238c adding file cache to __init__ 2019-02-08 22:31:46 +01:00
thomwolf
80607874c1 fix layer norm epsilon in OpenAI GPT 2019-02-08 21:49:05 +01:00
thomwolf
5ee4f17234 adding option to load on cpu 2019-02-08 10:37:40 +01:00
thomwolf
777459b471 run openai example running 2019-02-08 10:33:14 +01:00
thomwolf
edcb56fd96 more explicit variable name 2019-02-08 09:54:49 +01:00
thomwolf
eb8fda51f4 update docstrings 2019-02-07 23:15:20 +01:00
thomwolf
f99f2fb661 docstrings 2019-02-07 17:07:22 +01:00
thomwolf
438db43d46 update adaptive softmax head 2019-02-07 17:07:15 +01:00
thomwolf
c306869ea2 add two transformer xl models 2019-02-07 17:07:03 +01:00
thomwolf
9c3c24800b split saved model in config & weights 2019-02-07 17:06:17 +01:00
thomwolf
ed47cb6cba fixing transfo eval script 2019-02-06 16:22:17 +01:00
thomwolf
973926431e fix differencies with tensorflow version (mem cells and adaptive sofmax clusters) 2019-02-06 15:42:29 +01:00
Thomas Wolf
848aae49e1 Merge branch 'master' into python_2 2019-02-06 00:13:20 +01:00
thomwolf
448937c00d python 2 compatibility 2019-02-06 00:07:46 +01:00
thomwolf
822915142b fix docstring 2019-02-05 16:34:32 +01:00
Thibault Fevry
f3bda2352a Only keep the active part mof the loss for token classification 2019-02-04 11:46:36 -05:00
thomwolf
6179f537a3 clean up tokenization spaces 2019-02-04 17:41:22 +01:00
thomwolf
850da1cc36 strip decoded outputs 2019-02-04 17:35:05 +01:00
thomwolf
01a3966bc6 more options on special tokens 2019-02-04 17:26:25 +01:00
thomwolf
05f961840b logging 2019-02-04 13:06:19 +01:00
thomwolf
3a848111e6 update config, docstrings and readme to switch to seperated tokens and position embeddings 2019-01-29 11:00:11 +01:00
thomwolf
98c96fb1a7 splitting position and tokens embeddings in OpenAI GPT - updating tf imports - tests 2019-01-29 10:31:42 +01:00
thomwolf
5456d82311 more versatile model loading 2019-01-29 09:54:18 +01:00
thomwolf
9b2540b5a7 update __init__ 2019-01-29 09:54:08 +01:00
thomwolf
bd3b3aee9c update 2019-01-28 17:47:29 +01:00
thomwolf
b12616fd8e updating code organization to fix imports 2019-01-28 17:03:39 +01:00