Timothy Liu
86f23a1944
Minor enhancements to run_tf_glue.py
2019-10-13 10:21:35 +00:00
Emrah Budur
5a8c6e771a
Fixed the sample code in the 'Quick tour' section.
2019-10-12 14:17:17 +03:00
jeffxtang
e76d71521c
Add working example code using BertForQuestionAnswering to get an answer from a text and a question
2019-10-11 17:04:02 -07:00
VictorSanh
d844db4005
Add citation bibtex
2019-10-11 16:55:42 -04:00
Lysandre
a701c9b321
CTRL to tf automodels
2019-10-11 16:05:30 -04:00
Rémi Louf
b3261e7ace
read parameters from CLI, load model & tokenizer
2019-10-11 18:40:38 +02:00
Rémi Louf
d889e0b71b
add base for seq2seq finetuning
2019-10-11 17:36:12 +02:00
Rémi Louf
f8e98d6779
load pretrained embeddings in Bert decoder
...
In Rothe et al.'s "Leveraging Pre-trained Checkpoints for Sequence
Generation Tasks", Bert2Bert is initialized with pre-trained weights for
the encoder, and only pre-trained embeddings for the decoder. The
current version of the code completely randomizes the weights of the
decoder.
We write a custom function to initialize the weights of the decoder: we
first initialize the decoder with the pre-trained weights and then
randomize everything but the embeddings.
2019-10-11 16:48:11 +02:00
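The initialization strategy described in the commit body can be sketched as follows. This is a hypothetical illustration, not the code from the commit; `init_decoder_weights` and the name filter are invented for the sketch.

```python
import torch
import torch.nn as nn

def init_decoder_weights(decoder, pretrained_state_dict):
    """Sketch: load pre-trained weights into the decoder, then
    re-randomize every parameter except the embeddings."""
    # First copy all pre-trained weights into the decoder.
    decoder.load_state_dict(pretrained_state_dict, strict=False)
    # Then re-initialize every parameter whose name does not
    # mention an embedding, leaving only embeddings pre-trained.
    for name, param in decoder.named_parameters():
        if "embedding" not in name:
            nn.init.normal_(param, mean=0.0, std=0.02)
```

Applied to a Bert-style decoder, this keeps only the embedding matrices pre-trained, matching the Rothe et al. setup described above.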
Lysandre
3ddce1d74c
Release: 2.1.1
2019-10-11 06:37:49 -04:00
Thomas Wolf
4428aefc63
Merge pull request #1488 from huggingface/pytorch-tpu
...
GLUE on TPU
2019-10-11 16:33:00 +02:00
Thomas Wolf
3b43b01872
Merge pull request #1482 from huggingface/tf2_integration_tests
...
Integration of TF 2.0 models with other Keras modules
2019-10-11 16:25:43 +02:00
thomwolf
4b8f3e8f32
adding citation
2019-10-11 16:18:16 +02:00
thomwolf
18a3cef7d5
no nans
2019-10-11 16:09:42 +02:00
thomwolf
1f5d9513d8
fix test
2019-10-11 15:55:01 +02:00
thomwolf
0f9fc4fbde
adding option to deactivate past/memory outputs
2019-10-11 15:47:08 +02:00
Thomas Wolf
700331b5ec
Merge pull request #1492 from stefan-it/bert-german-dbmdz-models
...
Add new BERT models for German (cased and uncased)
2019-10-11 13:01:52 +02:00
Thomas Wolf
573dde9b44
Merge pull request #1405 from slayton58/xlnet_layer_reorder
...
Re-order XLNet attention head outputs for better perf
2019-10-11 12:10:58 +02:00
Stefan Schweter
5f25a5f367
model: add support for new German BERT models (cased and uncased) from @dbmdz
2019-10-11 10:20:33 +02:00
Luran He
f382a8decd
convert int to str before adding to a str
2019-10-10 19:20:39 -04:00
Lysandre
639f4b7190
Don't save/load when on TPU
2019-10-10 19:17:25 +00:00
Lysandre
d4e7934ac3
GLUE on TPU
2019-10-10 19:03:06 +00:00
Rémi Louf
1e68c28670
add test for initialization of Bert2Rnd
2019-10-10 18:07:11 +02:00
thomwolf
2a4fef837a
move Circle-CI from TF2-rc0 to official TF2
2019-10-10 15:57:35 +02:00
thomwolf
751e246087
using tf.print in roberta
2019-10-10 15:47:20 +02:00
Rémi Louf
fa218e648a
fix syntax errors
2019-10-10 15:16:07 +02:00
thomwolf
c9e8c51946
fixing SequenceSummary head in TF 2.0
2019-10-10 15:16:05 +02:00
thomwolf
da26bae61b
adding more tests on TF and pytorch serialization - updating configuration for better serialization
2019-10-10 14:30:48 +02:00
Rémi Louf
3e1cd8241e
fix stupid (re)naming issue
2019-10-10 14:18:20 +02:00
Rémi Louf
81ee29ee8d
remove the staticmethod used to load the config
2019-10-10 14:13:37 +02:00
thomwolf
bb04edb45b
Add tests that TF 2.0 model can be integrated with other Keras modules
2019-10-10 13:08:24 +02:00
Rémi Louf
d7092d592c
rename the attributes in the Bert Layer
...
Since the preloading of weights relies on the names of the class's
attributes, changing the namespace breaks loading pretrained weights on
Bert and all related models. I reverted `self_attention` to `attention`
and use `crossattention` for the decoder instead.
2019-10-10 12:51:14 +02:00
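The naming constraint mentioned in the commit can be seen with a minimal sketch (the toy classes below are invented for illustration): PyTorch state-dict keys are derived from attribute names, so renaming an attribute changes the keys that pretrained checkpoints are matched against.

```python
import torch.nn as nn

class Original(nn.Module):
    def __init__(self):
        super().__init__()
        self.attention = nn.Linear(4, 4)       # name stored in checkpoints

class Renamed(nn.Module):
    def __init__(self):
        super().__init__()
        self.self_attention = nn.Linear(4, 4)  # renamed attribute

# state_dict keys mirror attribute names, so a checkpoint's
# 'attention.weight' no longer matches 'self_attention.weight'.
old_keys = set(Original().state_dict())
new_keys = set(Renamed().state_dict())
```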
Rémi Louf
51261167b4
prune both attention and self-attention heads
2019-10-10 12:17:22 +02:00
Rémi Louf
17177e7379
add is_decoder as an attribute to Config class
2019-10-10 12:03:58 +02:00
Thomas Wolf
6596e3d566
Merge pull request #1454 from bkkaggle/pytorch-built-in-tensorboard
...
Change tensorboard imports to use built-in tensorboard if available
2019-10-10 11:56:55 +02:00
Thomas Wolf
4bc4601192
Merge pull request #1480 from huggingface/fix_ctrl_tokenizer
...
Fixing CTRL tokenizer - Update error messages - XLM-MLM in run_generation
2019-10-10 11:56:20 +02:00
thomwolf
177a721205
move back to simple space splitting
2019-10-10 11:45:47 +02:00
Rémi Louf
df85a0ff0b
replace double quotes with single quotes
2019-10-10 11:38:26 +02:00
Rémi Louf
9ca788b2e8
merge the two Bert layers classes
2019-10-10 11:33:28 +02:00
thomwolf
a5997dd81a
better error messages
2019-10-10 11:31:01 +02:00
Rémi Louf
edfc8f8225
Remove and do the branching in
2019-10-10 10:17:27 +02:00
Rémi Louf
09cfd12235
remove and do the branching in
2019-10-10 10:15:27 +02:00
thomwolf
43a237f15e
switching to Moses tokenizer
2019-10-10 10:11:16 +02:00
Rémi Louf
877ef2c6ca
override from_pretrained in Bert2Rnd
...
In the seq2seq model we need to both load pretrained weights in the
encoder and initialize the decoder randomly. Because the
`from_pretrained` method defined in the base class relies on module
names to assign weights, it would also initialize the decoder with
pretrained weights. To avoid this we override the method to only
initialize the encoder with pretrained weights.
2019-10-10 10:02:18 +02:00
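A minimal sketch of that override, using toy stand-ins (`ToyEncoder` and `Seq2Seq` are invented here; the real commit overrides `from_pretrained` on the Bert2Rnd class with actual Bert modules):

```python
import torch.nn as nn

class ToyEncoder(nn.Module):
    """Stand-in for a pre-trainable encoder."""
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(4, 4)

class Seq2Seq(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = ToyEncoder()
        self.decoder = nn.Linear(4, 4)

    @classmethod
    def from_pretrained(cls, encoder_state):
        # Load pre-trained weights into the encoder only; the decoder
        # keeps the fresh random initialization from __init__.
        model = cls()
        model.encoder.load_state_dict(encoder_state)
        return model
```

The override sidesteps the base-class behavior of assigning pretrained weights to every module whose name matches, which would otherwise also touch the decoder.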
Rémi Louf
851ef592c5
add comment on recursive weights loading
2019-10-10 10:02:03 +02:00
LysandreJik
036483fae5
Temporary CTRL tokenizer fix
2019-10-09 16:33:15 -04:00
LysandreJik
9c2e0a4acf
Release: 2.1.0
2019-10-09 12:14:03 -04:00
LysandreJik
7fe98d8c18
Update CTRL documentation
2019-10-09 12:12:36 -04:00
LysandreJik
89f86f9661
CTRL added to the documentation
2019-10-09 12:04:06 -04:00
LysandreJik
e17ea08e24
Pycharm folder added to gitignore
2019-10-09 11:32:21 -04:00
Lysandre Debut
2431fea98a
Merge pull request #1383 from keskarnitish/master
...
Adding CTRL
2019-10-09 11:31:05 -04:00