Rémi Louf
770b15b58c
rename class in __init__
2019-10-08 17:32:28 +02:00
thomwolf
248314772f
fix tokenization
2019-10-08 17:19:28 +02:00
thomwolf
03c2c762a6
update tokenizer
2019-10-08 17:12:03 +02:00
thomwolf
3edfa1d6aa
update model to use past
2019-10-08 17:11:58 +02:00
Rémi Louf
f4d41fe33e
Merge pull request #1448 from huggingface/contributing
...
add contribution guidelines
2019-10-08 16:55:34 +02:00
Rémi Louf
61ed889005
remove old seq2seq file
2019-10-08 16:30:58 +02:00
Rémi Louf
8abfee9ec3
rename Bert2Bert -> Bert2Rnd
2019-10-08 16:30:58 +02:00
Rémi Louf
82628b0fc9
add a placeholder test
2019-10-08 16:30:58 +02:00
Rémi Louf
0700983090
Add BertDecoderModel and Bert2Bert classes
...
I am not sure what happens when the class is initialized with the
pretrained weights.
2019-10-08 16:30:58 +02:00
Rémi Louf
75feacf172
add general structure for Bert2Bert class
2019-10-08 16:30:58 +02:00
Rémi Louf
15a2fc88a6
add General attention classes
...
The modifications that I introduced in a previous commit broke
Bert's internal API. I reverted these changes and added more general
classes to handle the encoder-decoder attention case.
There may be a more elegant way to deal with backward compatibility (I am
not comfortable with the current state of the code), but I cannot see it
right now.
2019-10-08 16:30:58 +02:00
Rémi Louf
cd6a59d5c1
add a decoder layer for Bert
2019-10-08 16:30:58 +02:00
Rémi Louf
45de313a9e
add bullet point on modifying an existing PR
2019-10-08 11:54:10 +02:00
Rémi Louf
ade05b6cef
add code contribution
2019-10-07 23:20:25 +02:00
Rémi Louf
e9c09052a4
add issues and requests guidelines
2019-10-07 22:30:55 +02:00
LysandreJik
8fcc6507ce
Multilingual
2019-10-07 15:02:42 -04:00
Rémi Louf
6e3e1c959e
Merge pull request #1447 from huggingface/dev-requirements
...
Provide requirements.txt for development dependencies
2019-10-07 18:49:26 +02:00
VictorSanh
7ce83b4931
update weights for distilgpt2
2019-10-07 12:30:27 -04:00
VictorSanh
9f81f1cba8
fix convert pt_to_tf2 for custom weights
2019-10-07 12:30:19 -04:00
Rémi Louf
7afd00a661
freeze dev requirements
2019-10-07 17:58:13 +02:00
Rémi Louf
a0dcefa382
generalize BertSelfAttention to take separate query, key, value
...
There is currently no way to specify the query, key and value separately
in the Attention module. However, the decoder's "encoder-decoder
attention" layers take the decoder's last output as the query and the
encoder's states as key and value. We thus modify the existing code so
query, key and value can be passed separately.
This obviously raises some naming issues; `BertSelfAttention` is not
a self-attention module anymore. The way the residual is forwarded is
now awkward, etc. We will need to do some refactoring once the decoder is
fully implemented.
2019-10-07 17:53:58 +02:00
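The idea described in commit a0dcefa382 can be illustrated with a minimal sketch. This is not the code from that commit: it is a single-head, pure-Python scaled dot-product attention that leaves out the learned projections, masking, dropout and residual handling of the real module. The function names `attention`, `self_attention` and `encoder_decoder_attention` are hypothetical and chosen only for illustration. It shows that self-attention is the special case where query, key and value are the same sequence, while encoder-decoder attention takes the query from the decoder and the key and value from the encoder.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, key, value):
    """Single-head scaled dot-product attention on lists of vectors.

    query: n_q vectors of size d
    key:   n_kv vectors of size d
    value: n_kv vectors of size d_v
    Returns one output vector of size d_v per query position.
    """
    d = len(query[0])
    outputs = []
    for q in query:
        # Scaled similarity of this query with every key position.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in key
        ]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, value))
            for j in range(len(value[0]))
        ]
        outputs.append(out)
    return outputs


def self_attention(hidden_states):
    # Hypothetical helper: query, key and value all come from one sequence.
    return attention(hidden_states, hidden_states, hidden_states)


def encoder_decoder_attention(decoder_states, encoder_states):
    # Hypothetical helper: the decoder provides the query,
    # the encoder provides both key and value.
    return attention(decoder_states, encoder_states, encoder_states)


decoder_states = [[1.0, 0.0], [0.0, 1.0]]
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
cross = encoder_decoder_attention(decoder_states, encoder_states)
same = self_attention(decoder_states)
```

Note that the output has one vector per query position, so `cross` has the decoder's length even though the encoder sequence is longer.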
Rémi Louf
31adbb247c
add class wireframes for Bert decoder
2019-10-07 16:43:21 +02:00
Rémi Louf
dda1adad6d
rename BertLayer to BertEncoderLayer
2019-10-07 16:31:46 +02:00
Rémi Louf
0053c0e052
do some (light) housekeeping
...
Several packages were imported but never used; indentation and line
spacing did not follow PEP8.
2019-10-07 16:29:15 +02:00
thomwolf
bd5363cc83
update CTRL configuration
2019-10-07 15:37:30 +02:00
thomwolf
dc89441167
update CTRL pytorch model
2019-10-07 15:37:25 +02:00
thomwolf
320b7a7e01
fix #1416
2019-10-07 14:26:59 +02:00
Rémi Louf
386e86e222
raise exception when class initialized with __init__
2019-10-07 13:00:06 +02:00
Rémi Louf
4446c02b8a
add wireframe for seq2seq model
2019-10-07 12:04:05 +02:00
Thomas Wolf
1615360c71
Merge pull request #1438 from SeanBE/master
...
fix pytorch-transformers migration description in README
2019-10-07 05:02:23 -04:00
seanBE
6dc6c716c5
fix pytorch-transformers migration description in README
2019-10-07 09:59:54 +01:00
Christopher Goh
904158ac4d
Rephrase forward method to reduce ambiguity
2019-10-06 23:40:52 -04:00
Christopher Goh
0f65d8cbbe
Fix some typos in README
2019-10-06 23:40:52 -04:00
Santiago Castro
1dea291a02
Remove unnecessary use of FusedLayerNorm in XLNet
2019-10-06 13:35:01 -04:00
LysandreJik
f3e0218fbb
Correct device assignment in run_generation
2019-10-05 21:05:16 -04:00
thomwolf
78ef1a9930
fixes
2019-10-04 17:59:44 -04:00
thomwolf
6c1d0bc066
update encode_plus - add truncation strategies
2019-10-04 17:38:38 -04:00
VictorSanh
0820bb0555
unnecessary carriage return
2019-10-04 17:23:15 -04:00
VictorSanh
f5891c3821
run_squad --> run_squad_w_distillation
2019-10-04 17:23:15 -04:00
VictorSanh
764a7923ec
add distillation+finetuning option in run_squad
2019-10-04 17:23:15 -04:00
Lysandre Debut
bb464289ce
New model addition issue template
2019-10-04 16:41:26 -04:00
thomwolf
92c0f2fb90
Merge remote-tracking branch 'origin/julien_multiple-choice' into encoding-qol
2019-10-04 15:48:06 -04:00
Julien Chaumond
9e136ff57c
Honor args.overwrite_cache (h/t @erenup)
2019-10-04 15:00:56 -04:00
LysandreJik
7bddb45a6f
Decode documentation
2019-10-04 14:27:38 -04:00
keskarnitish
dbed1c5d94
Adding CTRL (squashed commit)
...
adding conversion script
adding first draft of modeling & tokenization
adding placeholder for test files
bunch of changes
registering the tokenizer/model/etc
tests
change link; something is very VERY wrong here
weird end-of-word thingy going on
i think the tokenization works now ; wrote the unit tests
overall structure works;load w next
the monster is alive!
works after some cleanup as well
adding emacs autosave to gitignore
currently only supporting the 48 layer one; seems to infer fine on my macbook
cleanup
fixing some documentation
fixing some documentation
tests passing?
now works on CUDA also
adding greedy?
adding greedy sampling
works well
2019-10-03 22:29:03 -07:00
Thomas Wolf
b3cfd97946
Merge pull request #1373 from TimYagan/fix-css
...
Fixed critical css font-family issues
2019-10-03 19:04:02 -04:00
Lysandre Debut
81a1e12469
Merge pull request #1313 from enzoampil/master
...
Add option to use a 'stop token'
2019-10-03 22:43:57 +00:00
Lysandre Debut
d3f24dfad7
Merge branch 'master' into master
2019-10-03 22:43:09 +00:00
LysandreJik
ecc4f1bdfa
XLM use_lang_embedding flag in run_generation
2019-10-03 17:42:16 -04:00
LysandreJik
c2c2ca0fdb
Added XLM to run_generation, with prompt language selection.
2019-10-03 17:18:48 -04:00