thomwolf
d9e60f4f0d
Merge branch 'master' into pr/1383
2019-10-09 17:25:08 +02:00
Lysandre Debut
e84470ef81
Merge pull request #1384 from huggingface/encoding-qol
...
Quality of life enhancements in encoding + patch MLM masking
2019-10-09 11:18:24 -04:00
thomwolf
07d055f849
higher tolerance
2019-10-09 17:10:04 +02:00
thomwolf
48b438ff2a
doc and conversion
2019-10-09 17:06:30 +02:00
jinoobaek-qz
69629c4f0f
Improve naming and only do regex when necessary
2019-10-09 08:48:40 -04:00
jinoobaek-qz
bf34a252b8
Golden path
2019-10-09 08:48:40 -04:00
jinoobaek-qz
528d3f327b
Improve readability and make fewer assumptions about checkpoint format
2019-10-09 08:48:40 -04:00
jinoobaek-qz
56301bd9e8
Extract method
2019-10-09 08:48:40 -04:00
jinoobaek-qz
d6c5469712
Delete older checkpoint after saving new checkpoint
2019-10-09 08:48:40 -04:00
jinoobaek-qz
54a31f50fb
Add save_total_limit
2019-10-09 08:48:40 -04:00
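The two commits above add `save_total_limit`: after each new checkpoint is saved, older ones are deleted so that only the most recent N remain. The exact implementation lives in the training script; the sketch below is a hypothetical standalone version (the `rotate_checkpoints` name and `checkpoint-<step>` directory convention are assumptions, not the repository's exact API).

```python
import os
import re
import shutil

def rotate_checkpoints(output_dir, save_total_limit, prefix="checkpoint"):
    """Keep only the `save_total_limit` most recent checkpoint directories."""
    if save_total_limit is None or save_total_limit <= 0:
        return
    # Collect directories named like "checkpoint-<step>" and sort by step.
    checkpoints = []
    for name in os.listdir(output_dir):
        match = re.match(rf"{prefix}-(\d+)$", name)
        path = os.path.join(output_dir, name)
        if match and os.path.isdir(path):
            checkpoints.append((int(match.group(1)), path))
    checkpoints.sort()
    # Delete the oldest checkpoints beyond the limit.
    for _, path in checkpoints[: max(0, len(checkpoints) - save_total_limit)]:
        shutil.rmtree(path)
```

Sorting by the numeric step (rather than lexicographically) matters: `checkpoint-9` would otherwise sort after `checkpoint-100`.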
thomwolf
c19b8e4ae0
fixing CTRL tests and OpenAI GPT tests
2019-10-09 13:51:05 +02:00
thomwolf
6dce6dda1b
fixing TF 2.0 model - adding a stricter test of PyTorch/TF equivalence
2019-10-09 11:57:55 +02:00
thomwolf
c56d921dda
adding TF 2.0 model
2019-10-09 11:07:43 +02:00
thomwolf
1c5079952f
simpler distilbert mask - fix tf tests
2019-10-09 04:26:20 +02:00
Thomas Wolf
58b302caf3
Merge pull request #1398 from dveselov/patch-1
...
Fixed typo in docs README
2019-10-09 03:52:42 +02:00
Thomas Wolf
439fac723a
Merge pull request #1409 from brian41005/master
...
Evaluation result.txt path changing #1286
2019-10-09 03:14:34 +02:00
thomwolf
23b7138ab4
fix #1378 and #1453
2019-10-09 01:54:44 +02:00
Bilal Khan
5ce8d29abe
Change tensorboard imports to use built-in tensorboard if available
2019-10-08 16:29:43 -05:00
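The commit above prefers the TensorBoard writer bundled with PyTorch (available since torch 1.1) and falls back to the external `tensorboardX` package. A minimal sketch of that fallback pattern, wrapped in a helper so it degrades gracefully when neither package is installed (the `get_summary_writer_cls` helper is hypothetical, not the commit's exact code):

```python
import importlib

def get_summary_writer_cls():
    """Return a TensorBoard SummaryWriter class, preferring the one
    bundled with PyTorch and falling back to tensorboardX."""
    for module_name in ("torch.utils.tensorboard", "tensorboardX"):
        try:
            return importlib.import_module(module_name).SummaryWriter
        except ImportError:
            continue
    return None  # neither package is installed; TensorBoard logging is disabled
```

Both classes expose the same `SummaryWriter` interface, which is what makes the drop-in substitution safe.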
Julien Chaumond
d688af19e5
Update link to swift-coreml-transformers
...
cc @lysandrejik
2019-10-08 16:37:52 -04:00
thomwolf
45dc04f33d
tf model [WIP]
2019-10-08 17:37:17 +02:00
Rémi Louf
770b15b58c
rename class in __init__
2019-10-08 17:32:28 +02:00
thomwolf
248314772f
fix tokenization
2019-10-08 17:19:28 +02:00
thomwolf
03c2c762a6
update tokenizer
2019-10-08 17:12:03 +02:00
thomwolf
3edfa1d6aa
update model to use past
2019-10-08 17:11:58 +02:00
Rémi Louf
f4d41fe33e
Merge pull request #1448 from huggingface/contributing
...
add contribution guidelines
2019-10-08 16:55:34 +02:00
Rémi Louf
61ed889005
remove old seq2seq file
2019-10-08 16:30:58 +02:00
Rémi Louf
8abfee9ec3
rename Bert2Bert -> Bert2Rnd
2019-10-08 16:30:58 +02:00
Rémi Louf
82628b0fc9
add a placeholder test
2019-10-08 16:30:58 +02:00
Rémi Louf
0700983090
Add BertDecoderModel and Bert2Bert classes
...
I am not sure what happens when the class is initialized with
pretrained weights.
2019-10-08 16:30:58 +02:00
Rémi Louf
75feacf172
add general structure for Bert2Bert class
2019-10-08 16:30:58 +02:00
Rémi Louf
15a2fc88a6
add General attention classes
...
The modifications I introduced in a previous commit broke Bert's
internal API. I reverted those changes and added more general classes
to handle the encoder-decoder attention case.
There may be a more elegant way to preserve backward compatibility (I
am not comfortable with the current state of the code), but I cannot
see it right now.
2019-10-08 16:30:58 +02:00
Rémi Louf
cd6a59d5c1
add a decoder layer for Bert
2019-10-08 16:30:58 +02:00
Rémi Louf
45de313a9e
add bullet point on modifying an existing PR
2019-10-08 11:54:10 +02:00
Rémi Louf
ade05b6cef
add code contribution
2019-10-07 23:20:25 +02:00
Rémi Louf
e9c09052a4
add issues and requests guidelines
2019-10-07 22:30:55 +02:00
LysandreJik
8fcc6507ce
Multilingual
2019-10-07 15:02:42 -04:00
Rémi Louf
6e3e1c959e
Merge pull request #1447 from huggingface/dev-requirements
...
Provide requirements.txt for development dependencies
2019-10-07 18:49:26 +02:00
VictorSanh
7ce83b4931
update weights for distilgpt2
2019-10-07 12:30:27 -04:00
VictorSanh
9f81f1cba8
fix convert pt_to_tf2 for custom weights
2019-10-07 12:30:19 -04:00
Rémi Louf
7afd00a661
freeze dev requirements
2019-10-07 17:58:13 +02:00
Rémi Louf
a0dcefa382
generalize BertSelfAttention to take separate query, key, value
...
There is currently no way to specify the query, key and value separately
in the Attention module. However, the decoder's "encoder-decoder
attention" layers take the decoder's last output as the query and the
encoder's states as the key and value. We therefore modify the existing
code so that query, key and value can be passed separately.
This obviously strains some naming conventions; `BertSelfAttention` is
no longer strictly a self-attention module, the way the residual is
forwarded is now awkward, etc. We will need to do some refactoring once
the decoder is fully implemented.
2019-10-07 17:53:58 +02:00
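The commit body above describes the key idea: once query, key and value are separate arguments, the same attention code serves both self-attention (all three from the same sequence) and encoder-decoder attention (query from the decoder, key/value from the encoder). A framework-free sketch of scaled dot-product attention with separate inputs, illustrating the mechanism only (the actual change is inside `BertSelfAttention` and uses torch tensors):

```python
import math

def attention(query, key, value):
    """Scaled dot-product attention over lists of vectors, with query,
    key and value passed separately. In encoder-decoder attention the
    queries come from the decoder, the keys/values from the encoder."""
    d_k = len(key[0])
    output = []
    for q in query:
        # Score each key against this query, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in key]
        # Numerically stable softmax over the scores.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of the value vectors.
        output.append([sum(w * v[j] for w, v in zip(weights, value))
                       for j in range(len(value[0]))])
    return output
```

Calling `attention(x, x, x)` recovers plain self-attention, which is why a single generalized module can replace the old one.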
Rémi Louf
31adbb247c
add class wireframes for Bert decoder
2019-10-07 16:43:21 +02:00
Rémi Louf
dda1adad6d
rename BertLayer to BertEncoderLayer
2019-10-07 16:31:46 +02:00
Rémi Louf
0053c0e052
do some (light) housekeeping
...
Several packages were imported but never used, and the indentation and
line spacing did not follow PEP 8.
2019-10-07 16:29:15 +02:00
thomwolf
bd5363cc83
update CTRL configuration
2019-10-07 15:37:30 +02:00
thomwolf
dc89441167
update CTRL pytorch model
2019-10-07 15:37:25 +02:00
thomwolf
320b7a7e01
fix #1416
2019-10-07 14:26:59 +02:00
Rémi Louf
386e86e222
raise an exception when the class is instantiated directly via __init__
2019-10-07 13:00:06 +02:00
Rémi Louf
4446c02b8a
add wireframe for seq2seq model
2019-10-07 12:04:05 +02:00
Thomas Wolf
1615360c71
Merge pull request #1438 from SeanBE/master
...
fix pytorch-transformers migration description in README
2019-10-07 05:02:23 -04:00