Commit Graph

285 Commits

Author SHA1 Message Date
thomwolf
4a82f4f856 update special token addition 2019-04-11 13:11:22 +02:00
thomwolf
991b8e65f4 Merge branch 'master' of https://github.com/huggingface/pytorch-pretrained-BERT 2019-04-11 11:43:15 +02:00
thomwolf
e99b2014cc fixes #471 2019-04-11 11:43:13 +02:00
lukovnikov
fc7693adc3 schedule fix 2019-04-03 18:16:47 +02:00
lukovnikov
20686b78fc schedule fix 2019-04-03 18:13:52 +02:00
lukovnikov
5fed5bb3d6 schedule fix 2019-04-03 17:20:29 +02:00
lukovnikov
91a073f804 schedule fix 2019-04-03 17:10:08 +02:00
lukovnikov
1758c8fc72 - updated docs for optimization 2019-04-03 16:08:34 +02:00
lukovnikov
725a56329d Merge remote-tracking branch 'upstream/master' into optim
# Conflicts:
#	pytorch_pretrained_bert/optimization.py

- updated docs for optimization
2019-04-03 16:07:50 +02:00
Thomas Wolf
94980b529f
Merge pull request #404 from CatalinVoss/fix_lm_loss
Fix Language Modeling Loss
2019-04-03 11:35:30 +02:00
Thomas Wolf
db4dccd1b5
Merge pull request #389 from lukovnikov/master
Fix cosine schedule
2019-04-03 11:21:43 +02:00
thomwolf
19666dcb3b Should fix #438 2019-04-03 11:01:01 +02:00
thomwolf
1d8c232324 Fix #436 2019-04-03 10:51:03 +02:00
Mike Arpaia
8b5c63e4de Fixes to the TensorFlow conversion tool 2019-04-01 13:17:54 -06:00
Catalin Voss
01520d5412 Remove my unhelpful comments :) 2019-03-27 10:45:28 -07:00
Ikuya Yamada
0401317b23 Remove padding_idx from position_embeddings and token_type_embeddings 2019-03-26 21:56:35 +09:00
Catalin Voss
fda2f62395 Fix test failures due to old torch issue with non-contiguous view 2019-03-24 14:37:13 -07:00
Catalin Voss
0dd796e359 Also fix loss function issue with the double head models 2019-03-24 14:35:55 -07:00
Catalin Voss
472857c47f Fix typo syntax err (sorry, c/p from my repo) 2019-03-24 14:14:49 -07:00
Catalin Voss
2e6f5ffb96 Fix GPT language model loss here as well 2019-03-24 14:14:44 -07:00
Catalin Voss
5938f31fa7 Fix c/p typo from my experiment code 2019-03-24 14:14:40 -07:00
Catalin Voss
7797d21b8d Fix GPT2 language modeling loss computation 2019-03-24 14:14:35 -07:00
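These four commits (merged above in #404, "Fix Language Modeling Loss") correct the next-token shift in the GPT/GPT-2 language-modeling loss. A minimal sketch of the shifted-loss pattern the fix applies; tensor names and sizes here are illustrative:

```python
import torch
from torch.nn import CrossEntropyLoss

def causal_lm_loss(lm_logits, labels):
    # Position i must predict token i+1, so drop the last logit and the
    # first label before flattening. The .contiguous() calls matter on
    # older torch versions where .view() rejects non-contiguous slices
    # (the test failure addressed in fda2f62395 below).
    shift_logits = lm_logits[..., :-1, :].contiguous()
    shift_labels = labels[..., 1:].contiguous()
    loss_fct = CrossEntropyLoss()
    return loss_fct(shift_logits.view(-1, shift_logits.size(-1)),
                    shift_labels.view(-1))

# Toy check: batch of 1, sequence of 5, vocab of 10 (illustrative sizes).
loss = causal_lm_loss(torch.randn(1, 5, 10), torch.randint(0, 10, (1, 5)))
```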
lukovnikov
262a9992d7 class weights 2019-03-18 18:29:12 +01:00
lukovnikov
19cc2c084e same 2019-03-18 15:13:35 +01:00
lukovnikov
2283dcca5e import revert 2019-03-18 13:40:12 +01:00
lukovnikov
b6c1cae67b branches, optim cosine fix 2019-03-18 13:32:04 +01:00
lukovnikov
ef28b2c747 branches, optim cosine fix 2019-03-18 13:18:07 +01:00
lukovnikov
90430ae7ec Merge remote-tracking branch 'origin/master'
# Conflicts:
#	pytorch_pretrained_bert/optimization.py
2019-03-18 13:15:29 +01:00
lukovnikov
bed6408dcc branches, optim cosine fix 2019-03-18 13:09:55 +01:00
thomwolf
e5f2d9122c adding absolute imports to gpt2, openai and transfo-xl 2019-03-14 09:55:01 +01:00
lukovnikov
20e652209c relation classification: replacing entity mention with mask token 2019-03-13 16:13:37 +01:00
lukovnikov
eac039d21f changing docker 2019-03-12 13:45:12 +01:00
lukovnikov
471daf1b6c changing docker 2019-03-12 13:32:42 +01:00
lukovnikov
9024613337 changing docker 2019-03-12 13:23:58 +01:00
lukovnikov
baf66d1419 restart cosine lr schedule 2019-03-12 13:22:23 +01:00
Thomas Wolf
9b03d67b83
Merge pull request #362 from Bharat123rox/patch-1
Make the hyperlink of NVIDIA Apex clickable
2019-03-11 09:08:51 +01:00
Thomas Wolf
13aa13dbc0
Merge pull request #358 from cdjhz/patch-1
add 'padding_idx=0' for BertEmbeddings
2019-03-11 09:06:55 +01:00
Bharat Raghunathan
f91ce0b803
Make the hyperlink of NVIDIA Apex clickable 2019-03-09 20:05:39 +05:30
lukovnikov
51efde54a9 cos fix 2019-03-09 02:45:25 +01:00
lukovnikov
f113a2dfdc readme de 2019-03-09 02:29:57 +01:00
lukovnikov
90a41dbe14 BertAdam schedule objects 2019-03-09 02:23:20 +01:00
lukovnikov
88874f6cf0 BertAdam schedule objects 2019-03-08 19:08:30 +01:00
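These schedule commits (leading to the #389 merge above, "Fix cosine schedule") rework the LR warmup functions into schedule objects. A rough sketch of the warmup-then-cosine-decay multiplier being fixed; the function name and warmup default mirror the library's style, but the final schedule-object API is not shown here:

```python
import math

def warmup_cosine(progress, warmup=0.002):
    # progress = current_step / t_total, in [0, 1];
    # warmup = fraction of training spent on linear warmup.
    if progress < warmup:
        return progress / warmup
    # Rescale post-warmup progress to [0, 1] so the cosine decay starts
    # at 1.0 exactly where warmup ends and reaches 0.0 at the end.
    progress = (progress - warmup) / (1.0 - warmup)
    return 0.5 * (1.0 + math.cos(math.pi * progress))
```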
Haozhe Ji
72fa8d03a7
add 'padding_idx=0' for BertEmbeddings 2019-03-07 20:02:55 +08:00
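The one-line change in #358: pass padding_idx=0 so the [PAD] embedding (token id 0 in BERT's vocabulary) stays zero and receives no gradient updates. A small demonstration; the sizes are bert-base's, and the snippet itself is illustrative:

```python
import torch
import torch.nn as nn

# padding_idx=0 zero-initializes the row for token id 0 and keeps its
# gradient at zero during training.
word_embeddings = nn.Embedding(30522, 768, padding_idx=0)

ids = torch.tensor([[101, 2023, 102, 0, 0]])  # sequence padded with id 0
print(word_embeddings(ids)[0, 3].abs().sum())  # tensor(0.)
```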
Philipp Glock
6190e8ce4c Fix: use dropout layer 2019-03-07 10:12:45 +01:00
thomwolf
5c85fc3977 fix typo - logger info 2019-03-06 10:05:21 +01:00
Thomas Wolf
21c88a07b7
Merge pull request #341 from potatochip/patch-1
catch exception if pathlib not installed
2019-03-06 09:48:01 +01:00
Thomas Wolf
477ec4b6cc
Merge pull request #337 from CatalinVoss/patch-2
Allow tokenization of sequences > 512 for caching
2019-03-06 09:45:49 +01:00
Thomas Wolf
7b9e5a54b5
Merge pull request #327 from lukovnikov/master
Issue#324: warmup linear fixes
2019-03-06 09:44:56 +01:00
Catalin Voss
4a49c22584 Warn instead of raising in BERT and GPT-2 tokenizers as well, to allow for pre-caching of tokens 2019-03-05 12:31:45 -08:00
Aaron Mangum
0c970caa4a
catch exception if pathlib not installed 2019-03-04 14:30:19 -08:00
Catalin Voss
9775b2eb27
Allow tokenization of sequences > 512 for caching
For many applications requiring randomized data access, it's easier to cache the tokenized representations than the words. So why not turn this into a warning?
2019-03-02 16:30:21 -08:00
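The change described here: over-long inputs now log a warning instead of raising, so token ids can still be produced for caching. A sketch of the pattern, assuming a plain dict vocab:

```python
import logging

logger = logging.getLogger(__name__)

def convert_tokens_to_ids(vocab, tokens, max_len=512):
    # Produce ids even past the model's maximum sequence length; callers
    # caching tokenized text can window the ids later, so only warn.
    ids = [vocab[token] for token in tokens]
    if len(ids) > max_len:
        logger.warning(
            "Token indices sequence length is longer than the specified "
            "maximum sequence length for this model (%d > %d). Running this "
            "sequence through the model will result in indexing errors",
            len(ids), max_len,
        )
    return ids
```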
John Hewitt
4d1ad83236 update docstring of BERT tokenizer to reflect do_wordpiece_only 2019-02-27 14:50:41 -08:00
lukovnikov
35410da758 added warning 2019-02-27 17:11:42 +01:00
lukovnikov
4d79e0d386 added warning 2019-02-27 16:50:05 +01:00
lukovnikov
66a84b63b0 added warning 2019-02-27 16:38:00 +01:00
lukovnikov
070f3b21d8 added warning 2019-02-27 16:26:45 +01:00
lukovnikov
46ef646016 added warning 2019-02-27 16:22:27 +01:00
lukovnikov
9bc3773c84 added warning 2019-02-27 16:10:31 +01:00
lukovnikov
60a372387f added warning 2019-02-27 15:54:09 +01:00
John Hewitt
e14c6b52e3 add BertTokenizer flag to skip basic tokenization 2019-02-26 20:11:24 -08:00
lukovnikov
da2d8ca265 fix for negative learning rate with warmup_linear in BertAdam (happens when t_total is specified incorrectly)
+ copied BERT optimization warmup functions to OpenAI optimization file + added comments
2019-02-26 17:16:06 +01:00
lukovnikov
e04bab59e1 fix for negative learning rate with warmup_linear in BertAdam (happens when t_total is specified incorrectly)
+ copied BERT optimization warmup functions to OpenAI optimization file + added comments
2019-02-26 16:22:52 +01:00
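The bug these two commits describe: after warmup, the linear schedule's multiplier keeps decreasing and turns negative once step/t_total exceeds 1.0, which happens whenever t_total is underestimated. A sketch of the guarded version:

```python
def warmup_linear(x, warmup=0.002):
    # x = current_step / t_total. Linear ramp up during warmup, then a
    # linear decay to zero; the max(..., 0.0) floor is the fix, keeping
    # the learning rate non-negative when x overshoots 1.0.
    if x < warmup:
        return x / warmup
    return max((1.0 - x) / (1.0 - warmup), 0.0)
```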
Joel Grus
8722e9eb3b finish updating docstrings 2019-02-23 06:31:59 -08:00
Joel Grus
33aa7a80ca update documentation 2019-02-22 15:37:59 -08:00
Yongbo Wang
2fdab323d1
typo 2019-02-20 21:11:06 +08:00
Yongbo Wang
813e4d18ba
typo 2019-02-20 21:10:07 +08:00
thomwolf
e0855e8929 forgot to add regex to requirements :( 2019-02-18 11:54:51 +01:00
thomwolf
ab7f5d2943 simple 2019-02-18 11:33:54 +01:00
thomwolf
b450a7faf2 clean up tokenization - fix python 2 tests 2019-02-18 11:27:18 +01:00
thomwolf
d44db1145c update readme 2019-02-18 11:12:09 +01:00
thomwolf
690a0dbf36 fix example - masking 2019-02-18 10:50:30 +01:00
thomwolf
fbb248a2e4 examples testing 2019-02-18 01:28:18 +01:00
thomwolf
5ff0c60505 language update 2019-02-18 00:55:47 +01:00
thomwolf
210d407245 updating init 2019-02-18 00:55:39 +01:00
thomwolf
009ee86a19 fix tests - bump up version 2019-02-17 23:57:23 +01:00
thomwolf
ffd623823d adding gpt2 2019-02-17 23:38:51 +01:00
Dan Hendrycks
434d15da8e
Update activation function docstring 2019-02-16 12:17:52 -08:00
thomwolf
321d70a7a9 bump up to 0.5.1 2019-02-13 10:11:20 +01:00
thomwolf
c6bea08448 OpenAI GPT Tokenizer can fallback on using BERT BasicTokenizer 2019-02-13 10:11:00 +01:00
thomwolf
e7cfc46fc1 fix TransfoXLModel loading 2019-02-13 09:32:46 +01:00
thomwolf
e8fe6b7140 adapting transfo tokenizer to transposed inputs 2019-02-11 13:30:04 +01:00
thomwolf
884ca81d87 transposing the inputs of Transformer-XL to have a unified interface 2019-02-11 13:19:59 +01:00
thomwolf
2071a9b86e fix python 2.7 imports 2019-02-11 10:35:36 +01:00
thomwolf
b514a60c36 added tests for OpenAI GPT and Transformer-XL tokenizers 2019-02-11 10:17:16 +01:00
thomwolf
f0bf81e141 back compatibility with Path inputs in fle_utils 2019-02-09 17:05:23 +01:00
thomwolf
6cd769957e update transfo xl example 2019-02-09 16:59:17 +01:00
thomwolf
1320e4ec0c mc_token_mask => mc_token_ids 2019-02-09 16:58:53 +01:00
thomwolf
cfcb95417c fix hasattr 2019-02-08 23:08:53 +01:00
thomwolf
1756b5e956 fix loading from Transfo-XL LM model 2019-02-08 22:32:17 +01:00
thomwolf
dadd0c1b13 updating __main__ 2019-02-08 22:31:57 +01:00
thomwolf
102c6b238c adding file cache to __init__ 2019-02-08 22:31:46 +01:00
thomwolf
80607874c1 fix layer norm epsilon in OpenAI GPT 2019-02-08 21:49:05 +01:00
thomwolf
5ee4f17234 adding option to load on cpu 2019-02-08 10:37:40 +01:00
thomwolf
777459b471 run openai example running 2019-02-08 10:33:14 +01:00
thomwolf
edcb56fd96 more explicit variable name 2019-02-08 09:54:49 +01:00
thomwolf
eb8fda51f4 update docstrings 2019-02-07 23:15:20 +01:00
thomwolf
f99f2fb661 docstrings 2019-02-07 17:07:22 +01:00
thomwolf
438db43d46 update adaptive softmax head 2019-02-07 17:07:15 +01:00
thomwolf
c306869ea2 add two transformer xl models 2019-02-07 17:07:03 +01:00
thomwolf
9c3c24800b split saved model in config & weights 2019-02-07 17:06:17 +01:00
thomwolf
ed47cb6cba fixing transfo eval script 2019-02-06 16:22:17 +01:00
thomwolf
973926431e fix differences with tensorflow version (mem cells and adaptive softmax clusters) 2019-02-06 15:42:29 +01:00
Thomas Wolf
848aae49e1
Merge branch 'master' into python_2 2019-02-06 00:13:20 +01:00
thomwolf
448937c00d python 2 compatibility 2019-02-06 00:07:46 +01:00
thomwolf
822915142b fix docstring 2019-02-05 16:34:32 +01:00
Thibault Fevry
f3bda2352a Only keep the active part of the loss for token classification 2019-02-04 11:46:36 -05:00
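This commit masks the token-classification loss down to attended positions so padding labels don't dilute it. A sketch of the masking, with illustrative tensor names:

```python
import torch
from torch.nn import CrossEntropyLoss

def token_classification_loss(logits, labels, attention_mask, num_labels):
    # Keep only the "active" positions (attention_mask == 1); padding
    # positions carry meaningless labels and are dropped before the CE.
    loss_fct = CrossEntropyLoss()
    active = attention_mask.view(-1) == 1
    active_logits = logits.view(-1, num_labels)[active]
    active_labels = labels.view(-1)[active]
    return loss_fct(active_logits, active_labels)
```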
thomwolf
6179f537a3 clean up tokenization spaces 2019-02-04 17:41:22 +01:00
thomwolf
850da1cc36 strip decoded outputs 2019-02-04 17:35:05 +01:00
thomwolf
01a3966bc6 more options on special tokens 2019-02-04 17:26:25 +01:00
thomwolf
05f961840b logging 2019-02-04 13:06:19 +01:00
thomwolf
3a848111e6 update config, docstrings and readme to switch to separated tokens and position embeddings 2019-01-29 11:00:11 +01:00
thomwolf
98c96fb1a7 splitting position and tokens embeddings in OpenAI GPT - updating tf imports - tests 2019-01-29 10:31:42 +01:00
thomwolf
5456d82311 more versatile model loading 2019-01-29 09:54:18 +01:00
thomwolf
9b2540b5a7 update __init__ 2019-01-29 09:54:08 +01:00
thomwolf
bd3b3aee9c update 2019-01-28 17:47:29 +01:00
thomwolf
b12616fd8e updating code organization to fix imports 2019-01-28 17:03:39 +01:00
thomwolf
d77dd62ff8 directly load from TF checkpoints + code cleanup 2019-01-28 16:50:23 +01:00
thomwolf
9c35c132fa apex LayerNorm 2019-01-17 09:19:19 +01:00
thomwolf
b9c77b98d5 fix transposition in model conversion and memory initialization 2019-01-17 00:33:21 +01:00
thomwolf
009101de12 fix loading bug and check full conversion of model 2019-01-16 12:16:20 +01:00
thomwolf
fea15cc9f5 update model conversion 2019-01-16 11:54:54 +01:00
thomwolf
c03c12687f fix __main__ entry script 2019-01-16 10:55:22 +01:00
thomwolf
8831c68803 fixing various parts of model conversion, loading and weights sharing 2019-01-16 10:31:16 +01:00
thomwolf
a69ec2c722 improved corpus and tokenization conversion - added evaluation script 2019-01-15 23:17:46 +01:00
thomwolf
7d03c53718 conversion working 2019-01-15 16:07:25 +01:00
thomwolf
3a9c88377f adding Transformer XL 2019-01-15 12:59:38 +01:00
nhatchan
cd30565aed Fix importing unofficial TF models
Importing unofficial TF models seems to be working well, at least for me.
This PR resolves #50.
2019-01-14 13:35:40 +09:00
thomwolf
e5c78c6684 update readme and few typos 2019-01-10 01:40:00 +01:00
thomwolf
ab90d4cddd adding docs and example for OpenAI GPT 2019-01-09 00:12:43 +01:00
thomwolf
dc5df92fa8 added LM head for OpenAI 2019-01-08 17:18:47 +01:00
thomwolf
3cf12b235a added tests + fixed losses 2019-01-08 16:24:23 +01:00
thomwolf
eed51c5bdf add OpenAI GPT 2019-01-08 12:26:58 +01:00
WrRan
3f60a60eed text in never_split should not be lowercased 2019-01-08 13:33:57 +08:00
WrRan
751beb9e73 never split some text 2019-01-08 10:54:51 +08:00
thomwolf
793dcd236b Merge branch 'master' of https://github.com/huggingface/pytorch-pretrained-BERT into fifth-release 2019-01-07 13:37:55 +01:00
thomwolf
93f563b8a8 adding OpenAI GPT 2019-01-07 12:55:36 +01:00
Thomas Wolf
e048c7f1c8
Merge pull request #171 from donglixp/patch-1
LayerNorm initialization
2019-01-07 12:44:46 +01:00
Thomas Wolf
bcd607542c
Merge pull request #145 from wlhgtc/master
Correct the wrong note
2019-01-07 12:23:05 +01:00
Li Dong
d0d9b384f2
LayerNorm initialization
The LayerNorm gamma and beta should be initialized by .fill_(1.0) and .zero_().

reference links:

989e78c412/tensorflow/contrib/layers/python/layers/layers.py (L2298)

989e78c412/tensorflow/contrib/layers/python/layers/layers.py (L2308)
2019-01-07 15:51:33 +08:00
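The initialization this commit describes, in a self-contained LayerNorm matching the TF reference it links: weight (gamma) starts at 1.0, bias (beta) at 0.0. The eps value is BERT's; the class name is illustrative:

```python
import torch
import torch.nn as nn

class BertLayerNorm(nn.Module):
    def __init__(self, hidden_size, eps=1e-12):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(hidden_size))  # gamma -> 1.0
        self.bias = nn.Parameter(torch.zeros(hidden_size))   # beta  -> 0.0
        self.variance_epsilon = eps

    def forward(self, x):
        # Normalize over the last (hidden) dimension, then scale and shift.
        mean = x.mean(-1, keepdim=True)
        var = (x - mean).pow(2).mean(-1, keepdim=True)
        x = (x - mean) / torch.sqrt(var + self.variance_epsilon)
        return self.weight * x + self.bias
```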
wlhgtc
e626eecc25
Update modeling.py 2018-12-22 20:26:05 +08:00
Grégory Châtel
7176674849 Fixing various class documentations. 2018-12-20 13:11:17 +01:00
Thomas Wolf
7fb94ab934
Merge pull request #127 from patrick-s-h-lewis/tokenizer-error-on-long-seqs
raises value error for bert tokenizer for long sequences
2018-12-19 10:29:17 +01:00
Patrick Sodré
87c1244c7d Convert scripts into entry_points
The recommended approach to create launch scripts is to use entry_points
and console_scripts.

xref: https://packaging.python.org/guides/distributing-packages-using-setuptools/#scripts
2018-12-19 02:26:08 +00:00
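The packaging change in this commit: declare commands via entry_points/console_scripts rather than shipping standalone script files. A minimal setup.py sketch; the command name and target module below are illustrative, not necessarily the ones the commit adds:

```python
# setup.py (sketch)
from setuptools import setup, find_packages

setup(
    name="pytorch_pretrained_bert",
    version="0.0.0",  # placeholder
    packages=find_packages(),
    entry_points={
        "console_scripts": [
            # Installs a `pytorch_pretrained_bert` command wired to the
            # package's __main__.main() -- illustrative target.
            "pytorch_pretrained_bert=pytorch_pretrained_bert.__main__:main",
        ],
    },
)
```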
Julien Chaumond
d57763f582 Fix typos 2018-12-18 19:23:22 -05:00
Patrick Lewis
78cf7b4ab4 added code to raise value error for bert tokenizer for covert_tokens_to_indices 2018-12-18 14:41:30 +00:00
thomwolf
4a4b0e5783 remove logging. basicConfig from library code 2018-12-14 14:46:25 +01:00
thomwolf
ae88eb88a4 set encoding to 'utf-8' in calls to open 2018-12-14 13:48:58 +01:00
thomwolf
52c53f39d0 clean up apex integration 2018-12-13 13:02:17 +01:00
thomwolf
d23eed85bb model loading apex modification 2018-12-13 12:53:17 +01:00
thomwolf
1cbb32a542 include version number + comment in setup.py 2018-12-13 12:50:44 +01:00