thomwolf
5c85fc3977
fix typo - logger info
2019-03-06 10:05:21 +01:00
Thomas Wolf
21c88a07b7
Merge pull request #341 from potatochip/patch-1
...
catch exception if pathlib is not installed
2019-03-06 09:48:01 +01:00
Thomas Wolf
477ec4b6cc
Merge pull request #337 from CatalinVoss/patch-2
...
Allow tokenization of sequences > 512 for caching
2019-03-06 09:45:49 +01:00
Thomas Wolf
7b9e5a54b5
Merge pull request #327 from lukovnikov/master
...
Issue#324: warmup linear fixes
2019-03-06 09:44:56 +01:00
Catalin Voss
4a49c22584
Warn instead of raising in BERT and GPT-2 tokenizers as well, to allow for pre-caching of tokens
2019-03-05 12:31:45 -08:00
Aaron Mangum
0c970caa4a
catch exception if pathlib is not installed
2019-03-04 14:30:19 -08:00
Catalin Voss
9775b2eb27
Allow tokenization of sequences > 512 for caching
...
For many applications requiring randomized data access, it's easier to cache the tokenized representations than the words. So why not turn this into a warning?
2019-03-02 16:30:21 -08:00
John Hewitt
4d1ad83236
update docstring of BERT tokenizer to reflect do_wordpiece_only
2019-02-27 14:50:41 -08:00
lukovnikov
35410da758
added warning
2019-02-27 17:11:42 +01:00
lukovnikov
4d79e0d386
added warning
2019-02-27 16:50:05 +01:00
lukovnikov
66a84b63b0
added warning
2019-02-27 16:38:00 +01:00
lukovnikov
070f3b21d8
added warning
2019-02-27 16:26:45 +01:00
lukovnikov
46ef646016
added warning
2019-02-27 16:22:27 +01:00
lukovnikov
9bc3773c84
added warning
2019-02-27 16:10:31 +01:00
lukovnikov
60a372387f
added warning
2019-02-27 15:54:09 +01:00
John Hewitt
e14c6b52e3
add BertTokenizer flag to skip basic tokenization
2019-02-26 20:11:24 -08:00
lukovnikov
da2d8ca265
fix for negative learning rate with warmup_linear in BertAdam (happens when t_total is specified incorrectly)
...
+ copied BERT optimization warmup functions to OpenAI optimization file + added comments
2019-02-26 17:16:06 +01:00
lukovnikov
e04bab59e1
fix for negative learning rate with warmup_linear in BertAdam (happens when t_total is specified incorrectly)
...
+ copied BERT optimization warmup functions to OpenAI optimization file + added comments
2019-02-26 16:22:52 +01:00
Joel Grus
8722e9eb3b
finish updating docstrings
2019-02-23 06:31:59 -08:00
Joel Grus
33aa7a80ca
update documentation
2019-02-22 15:37:59 -08:00
Yongbo Wang
2fdab323d1
typo
2019-02-20 21:11:06 +08:00
Yongbo Wang
813e4d18ba
typo
2019-02-20 21:10:07 +08:00
thomwolf
e0855e8929
forgot to add regex to requirements :(
2019-02-18 11:54:51 +01:00
thomwolf
ab7f5d2943
simple
2019-02-18 11:33:54 +01:00
thomwolf
b450a7faf2
clean up tokenization - fix python 2 tests
2019-02-18 11:27:18 +01:00
thomwolf
d44db1145c
update readme
2019-02-18 11:12:09 +01:00
thomwolf
690a0dbf36
fix example - masking
2019-02-18 10:50:30 +01:00
thomwolf
fbb248a2e4
examples testing
2019-02-18 01:28:18 +01:00
thomwolf
5ff0c60505
language update
2019-02-18 00:55:47 +01:00
thomwolf
210d407245
updating init
2019-02-18 00:55:39 +01:00
thomwolf
009ee86a19
fix tests - bump up version
2019-02-17 23:57:23 +01:00
thomwolf
ffd623823d
adding gpt2
2019-02-17 23:38:51 +01:00
Dan Hendrycks
434d15da8e
Update activation function docstring
2019-02-16 12:17:52 -08:00
thomwolf
321d70a7a9
bump up to 0.5.1
2019-02-13 10:11:20 +01:00
thomwolf
c6bea08448
OpenAI GPT Tokenizer can fallback on using BERT BasicTokenizer
2019-02-13 10:11:00 +01:00
thomwolf
e7cfc46fc1
fix TransfoXLModel loading
2019-02-13 09:32:46 +01:00
thomwolf
e8fe6b7140
adapting transfo tokenizer to transposed inputs
2019-02-11 13:30:04 +01:00
thomwolf
884ca81d87
transposing the inputs of Transformer-XL to have a unified interface
2019-02-11 13:19:59 +01:00
thomwolf
2071a9b86e
fix python 2.7 imports
2019-02-11 10:35:36 +01:00
thomwolf
b514a60c36
added tests for OpenAI GPT and Transformer-XL tokenizers
2019-02-11 10:17:16 +01:00
thomwolf
f0bf81e141
backward compatibility with Path inputs in file_utils
2019-02-09 17:05:23 +01:00
thomwolf
6cd769957e
update transfo xl example
2019-02-09 16:59:17 +01:00
thomwolf
1320e4ec0c
mc_token_mask => mc_token_ids
2019-02-09 16:58:53 +01:00
thomwolf
cfcb95417c
fix hasattr
2019-02-08 23:08:53 +01:00
thomwolf
1756b5e956
fix loading from Transfo-XL LM model
2019-02-08 22:32:17 +01:00
thomwolf
dadd0c1b13
updating __main__
2019-02-08 22:31:57 +01:00
thomwolf
102c6b238c
adding file cache to __init__
2019-02-08 22:31:46 +01:00
thomwolf
80607874c1
fix layer norm epsilon in OpenAI GPT
2019-02-08 21:49:05 +01:00
thomwolf
5ee4f17234
adding option to load on cpu
2019-02-08 10:37:40 +01:00
thomwolf
777459b471
run openai example running
2019-02-08 10:33:14 +01:00