Commit Graph

2332 Commits

Author SHA1 Message Date
Ethan Perez
96e83506d1 Always use SequentialSampler during evaluation
When evaluating, shouldn't we always use the SequentialSampler instead of the DistributedSampler? Evaluation only runs on 1 GPU no matter what, so if you use the DistributedSampler with N GPUs, I think you'll only evaluate on 1/N of the evaluation set. That's at least what I'm finding when I run an older/modified version of this repo.
2019-12-03 10:15:39 -05:00
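A minimal sketch of the sampler choice this commit argues for, assuming a standard PyTorch setup (build_eval_dataloader, eval_dataset, and the batch size are placeholders, not names from the repo):

    from torch.utils.data import DataLoader, SequentialSampler

    def build_eval_dataloader(eval_dataset, batch_size=8):
        # DistributedSampler shards the dataset across N ranks, so a
        # single-GPU evaluation loop would silently see only 1/N of it.
        # SequentialSampler visits every example exactly once, in order.
        sampler = SequentialSampler(eval_dataset)
        return DataLoader(eval_dataset, sampler=sampler, batch_size=batch_size)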
Julien Chaumond
3b48806f75 [pplm] README: add setup + tweaks 2019-12-03 10:14:02 -05:00
Julien Chaumond
0cb2c90890 readme
Co-Authored-By: Rosanne Liu <mimosavvy@gmail.com>
2019-12-03 10:14:02 -05:00
Julien Chaumond
1efb2ae7fc [pplm] move scripts under examples/pplm/ 2019-12-03 10:14:02 -05:00
Piero Molino
a59fdd1627 generate_text_pplm now works with batch_size > 1 2019-12-03 10:14:02 -05:00
w4nderlust
893d0d64fe Changed order of some parameters to be more consistent. Identical results. 2019-12-03 10:14:02 -05:00
w4nderlust
f42816e7fc Added additional check for URL and path in discriminator model params 2019-12-03 10:14:02 -05:00
w4nderlust
f10b925015 Improvements: model_path renamed to pretrained_model; tokenizer loaded from pretrained_model; pretrained_model set to the discriminator's when discrim is specified; sample = False by default, but a CLI parameter was introduced. To obtain identical samples, call the CLI with --sample 2019-12-03 10:14:02 -05:00
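Based only on the options this message names (pretrained_model, discrim, --sample), a hypothetical invocation could look like the following; the exact flag set, the gpt2-medium default, and the sentiment discriminator name are assumptions about run_pplm.py, not taken from the diff:

    # greedy decoding by default; pass --sample to draw samples
    python run_pplm.py --pretrained_model gpt2-medium --sample
    # when --discrim is given, pretrained_model falls back to the discriminator's
    python run_pplm.py --discrim sentiment --sample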
w4nderlust
75904dae66 Removed global variable device 2019-12-03 10:14:02 -05:00
piero
7fd54b55a3 Added support for generic discriminators 2019-12-03 10:14:02 -05:00
piero
b0eaff36e6 Added a +1 to epoch when saving weights 2019-12-03 10:14:02 -05:00
piero
611961ade7 Added tqdm to preprocessing 2019-12-03 10:14:02 -05:00
piero
afc7dcd94d Now run_pplm works on CPU. Identical output as before (when using GPU). 2019-12-03 10:14:02 -05:00
piero
61399e5afe Cleaned perturb_past. Identical output as before. 2019-12-03 10:14:02 -05:00
piero
ffc2935405 Fix for making unconditioned generation work. Identical output as before. 2019-12-03 10:14:02 -05:00
piero
9f693a0c48 Cleaned generate_text_pplm. Identical output as before. 2019-12-03 10:14:02 -05:00
piero
61a12f790d Renamed SmallConst to SMALL_CONST and introduced BIG_CONST. Identical output as before. 2019-12-03 10:14:02 -05:00
piero
ef47b2c03a Removed commented code. Identical output as before. 2019-12-03 10:14:02 -05:00
piero
7ea12db3f5 Removed commented code. Identical output as before. 2019-12-03 10:14:02 -05:00
piero
08c6e456a3 Cleaned full_text_generation. Identical output as before. 2019-12-03 10:14:02 -05:00
piero
6c9c131780 More cleanup for run_model. Identical output as before. 2019-12-03 10:14:02 -05:00
piero
7ffe47c888 Improved device specification 2019-12-03 10:14:02 -05:00
piero
4f2164e40e First cleanup step, changing function names and passing parameters all the way through without using args. Identical output as before. 2019-12-03 10:14:02 -05:00
piero
821de121e8 Minor changes 2019-12-03 10:14:02 -05:00
w4nderlust
7469d03b1c Fixed minor bug when running training on CUDA 2019-12-03 10:14:02 -05:00
piero
0b51fba20b Added script for training a discriminator for pplm to use 2019-12-03 10:14:02 -05:00
Piero Molino
34a83faabe Let's make PPLM great again 2019-12-03 10:14:02 -05:00
Julien Chaumond
d5faa74cd6 tokenizer white space: revert to previous behavior 2019-12-03 10:14:02 -05:00
Julien Chaumond
0b77d66a6d rm extraneous import 2019-12-03 10:14:02 -05:00
Rosanne Liu
83b1e6ac9e fix the loss backward issue
(cherry picked from commit 566468cc984c6ec7e10dfc62b5b4191781a99cd2)
2019-12-03 10:14:02 -05:00
Julien Chaumond
572c24cfa2 PPLM (squashed)
Co-authored-by: piero <piero@uber.com>
Co-authored-by: Rosanne Liu <mimosavvy@gmail.com>
2019-12-03 10:14:02 -05:00
Thomas Wolf
f19a78a634
Merge pull request #1903 from valohai/master
Valohai integration
2019-12-03 16:13:01 +01:00
Thomas Wolf
d100ad99c0
Merge pull request #2014 from aaugustin/mark-tf-auto-model-test-as-slow
Mark tests in TFAutoModelTest as slow.
2019-12-03 16:03:48 +01:00
Juha Kiili
66fc8d25a5 Change ref to original GLUE downloader script 2019-12-03 10:49:50 +02:00
LysandreJik
fbaf05bd92 Remove annoying tokenization message 2019-12-02 18:23:00 -05:00
Lysandre
e85855f2c4 Fix ALBERT exports with pretraining + sp classifier; Fix naming for ALBERT TF models 2019-12-02 18:00:19 -05:00
Lysandre
b3d834ae11 Reorganize ALBERT conversion script 2019-12-02 15:01:52 -05:00
Aymeric Augustin
5ab93083e4 Mark tests in TFAutoModelTest as slow.
Each test forces downloading the same 536MB file, which is slow
even with a decent internet connection.
2019-12-01 18:25:15 +01:00
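A common way to implement such a slow-test marker is an environment-gated skip; this is a generic sketch (the RUN_SLOW variable, the decorator name, and the test method name follow common practice and are not necessarily the repo's exact helpers):

    import os
    import unittest

    def slow(test_case):
        # Skip unless RUN_SLOW=1, so routine CI runs avoid the 536MB download.
        if os.environ.get("RUN_SLOW", "0") != "1":
            return unittest.skip("test is slow")(test_case)
        return test_case

    class TFAutoModelTest(unittest.TestCase):
        @slow
        def test_model_from_pretrained(self):
            ...  # would download a large pretrained checkpoint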
maxvidal
b0ee7c7df3 Added CamemBERT to available models 2019-11-29 14:17:02 -05:00
Elad Segal
ecf15ebf3b Add ALBERT to AutoClasses 2019-11-29 11:25:37 -05:00
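With ALBERT registered in the Auto classes, loading by checkpoint name should dispatch to the ALBERT implementations; a hedged sketch (albert-base-v1 is an assumed checkpoint identifier, not taken from this commit):

    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("albert-base-v1")
    model = AutoModel.from_pretrained("albert-base-v1")  # resolves to an ALBERT model
    input_ids = tokenizer.encode("Hello, ALBERT!", return_tensors="pt")
    outputs = model(input_ids)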
thomwolf
4a666885b5 reducing my level of enthusiasm 2019-11-29 09:40:50 -05:00
thomwolf
adb5c79ff2 update all tf.shape and tensor.shape to shape_list 2019-11-29 09:40:50 -05:00
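The point of a shape_list helper: in TF2, tensor.shape reports None for dynamic axes, while tf.shape(x) discards the statically known ones, so the helper combines both. A sketch of the idea (not necessarily the repo's exact code):

    import tensorflow as tf

    def shape_list(x):
        # Prefer the static dimension where the graph knows it;
        # fall back to the dynamic tf.shape() entry where it is None.
        static = x.shape.as_list()
        dynamic = tf.shape(x)
        return [dynamic[i] if dim is None else dim for i, dim in enumerate(static)]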
Juha Kiili
2421e54f8c Add link to original source and license to download_glue_data.py 2019-11-29 15:39:28 +02:00
Juha Kiili
41aa0e8003 Refactor logs and fix loss bug 2019-11-29 15:33:25 +02:00
Thomas Wolf
1ab8dc44b3
Merge pull request #1876 from huggingface/mean-fix
Mean does not exist in TF2
2019-11-29 09:26:33 +01:00
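The likely shape of this fix: TF2 tensors have no .mean() method (that is NumPy/PyTorch API), so reductions go through tf.reduce_mean. A hedged illustration, not the actual diff:

    import tensorflow as tf

    x = tf.constant([[1.0, 2.0], [3.0, 4.0]])
    # x.mean()                # AttributeError in TF2: Tensor has no .mean()
    loss = tf.reduce_mean(x)  # mean over all axes -> scalar 2.5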
Thomas Wolf
f0d22b6363
Merge pull request #1873 from stefan-it/distilbert-german
German DistilBERT
2019-11-29 09:25:47 +01:00
Thomas Wolf
d49c43ff78
Merge pull request #1778 from eukaryote31/patch-2
from_pretrained: convert DialoGPT format
2019-11-28 16:08:37 +01:00
Thomas Wolf
91caf2462c
Merge pull request #1770 from huggingface/initi-encoder-mask
Only init encoder_attention_mask if stack is decoder
2019-11-28 16:06:55 +01:00
Thomas Wolf
49a69d5b78
Merge pull request #1753 from digantamisra98/patch-1
Added Mish Activation Function
2019-11-28 15:24:08 +01:00
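Mish has a closed form, x * tanh(softplus(x)); a minimal PyTorch sketch of the activation this PR adds (the standalone function here is illustrative, not the PR's exact code):

    import torch
    import torch.nn.functional as F

    def mish(x):
        # Mish (Misra, 2019): x * tanh(ln(1 + exp(x)))
        return x * torch.tanh(F.softplus(x))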
Thomas Wolf
96e7ee7238
Merge pull request #1740 from huggingface/fix-ctrl-past
Fix CTRL past
2019-11-27 23:28:30 +01:00