LysandreJik
de276de1c1
Working evaluation
2019-12-03 17:15:51 -05:00
Julien Chaumond
7edb51f3a5
[pplm] split classif head into its own file
2019-12-03 22:07:25 +00:00
LysandreJik
c835bc85c2
Compute predictions
2019-12-03 15:28:16 -05:00
LysandreJik
285b1241e3
Added SquadResult
2019-12-03 15:00:49 -05:00
LysandreJik
8101924a68
Patch: v2.2.1
2019-12-03 11:20:26 -05:00
VictorSanh
48cbf267c9
Use full dataset for eval (SequentialSampler in Distributed setting)
2019-12-03 11:01:37 -05:00
Julien Chaumond
f434bfc623
[pplm] Update S3 links
...
Co-Authored-By: Piero Molino <w4nderlust@gmail.com>
2019-12-03 10:53:02 -05:00
Ethan Perez
96e83506d1
Always use SequentialSampler during evaluation
...
When evaluating, shouldn't we always use the SequentialSampler instead of DistributedSampler? Evaluation only runs on 1 GPU no matter what, so if you use the DistributedSampler with N GPUs, I think you'll only evaluate on 1/N of the evaluation set. That's at least what I'm finding when I run an older/modified version of this repo.
2019-12-03 10:15:39 -05:00
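The concern in the commit message above can be illustrated with a small pure-Python sketch. It is a simplified model of per-rank index sharding, not PyTorch's actual sampler code: the helper names `sequential_indices` and `sharded_indices`, the padding scheme, and the sizes used are all assumptions made for illustration.

```python
import math


def sequential_indices(n):
    """Indices a sequential sampler would yield: every example, in order."""
    return list(range(n))


def sharded_indices(n, num_replicas, rank):
    """Simplified model of distributed sharding (an assumption, not library code).

    The index list is padded by wrapping around until its length is a
    multiple of num_replicas, then each rank takes every num_replicas-th
    index starting at its own rank.
    """
    per_rank = math.ceil(n / num_replicas)
    total = per_rank * num_replicas
    indices = list(range(n))
    indices += indices[: total - n]  # pad so every rank gets the same count
    return indices[rank:total:num_replicas]


eval_set_size = 10  # hypothetical evaluation set size
num_gpus = 4        # hypothetical number of replicas

seen_sequential = sequential_indices(eval_set_size)
seen_rank0 = sharded_indices(eval_set_size, num_gpus, rank=0)

# If only one process evaluates, it sees its own shard only.
print(len(seen_sequential), "examples with sequential sampling")
print(len(seen_rank0), "examples on rank 0 with sharded sampling")
```

Under these assumptions, a single evaluating process that uses the sharded indices covers roughly 1/N of the evaluation set, while the sequential indices cover all of it.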
Julien Chaumond
3b48806f75
[pplm] README: add setup + tweaks
2019-12-03 10:14:02 -05:00
Julien Chaumond
0cb2c90890
readme
...
Co-Authored-By: Rosanne Liu <mimosavvy@gmail.com>
2019-12-03 10:14:02 -05:00
Julien Chaumond
1efb2ae7fc
[pplm] move scripts under examples/pplm/
2019-12-03 10:14:02 -05:00
Piero Molino
a59fdd1627
generate_text_pplm now works with batch_size > 1
2019-12-03 10:14:02 -05:00
w4nderlust
893d0d64fe
Changed order of some parameters to be more consistent. Identical results.
2019-12-03 10:14:02 -05:00
w4nderlust
f42816e7fc
Added additional check for url and path in discriminator model params
2019-12-03 10:14:02 -05:00
w4nderlust
f10b925015
Improvements: model_path renamed to pretrained_model; tokenizer loaded from pretrained_model; pretrained_model set to the discriminator's when discrim is specified; sample = False by default, but a CLI parameter was introduced. To obtain identical samples, call the CLI with --sample
2019-12-03 10:14:02 -05:00
w4nderlust
75904dae66
Removed global variable device
2019-12-03 10:14:02 -05:00
piero
7fd54b55a3
Added support for generic discriminators
2019-12-03 10:14:02 -05:00
piero
b0eaff36e6
Added a +1 to epoch when saving weights
2019-12-03 10:14:02 -05:00
piero
611961ade7
Added tqdm to preprocessing
2019-12-03 10:14:02 -05:00
piero
afc7dcd94d
Now run_pplm works on cpu. Identical output as before (when using gpu).
2019-12-03 10:14:02 -05:00
piero
61399e5afe
Cleaned perturb_past. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
ffc2935405
Fix for making unconditioned generation work. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
9f693a0c48
Cleaned generate_text_pplm. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
61a12f790d
Renamed SmallConst to SMALL_CONST and introduced BIG_CONST. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
ef47b2c03a
Removed commented code. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
7ea12db3f5
Removed commented code. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
08c6e456a3
Cleaned full_text_generation. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
6c9c131780
More cleanup for run_model. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
7ffe47c888
Improved device specification
2019-12-03 10:14:02 -05:00
piero
4f2164e40e
First cleanup step, changing function names and passing parameters all the way through without using args. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
821de121e8
Minor changes
2019-12-03 10:14:02 -05:00
w4nderlust
7469d03b1c
Fixed minor bug when running training on cuda
2019-12-03 10:14:02 -05:00
piero
0b51fba20b
Added script for training a discriminator for pplm to use
2019-12-03 10:14:02 -05:00
Piero Molino
34a83faabe
Let's make PPLM great again
2019-12-03 10:14:02 -05:00
Julien Chaumond
d5faa74cd6
tokenizer white space: revert to previous behavior
2019-12-03 10:14:02 -05:00
Julien Chaumond
0b77d66a6d
rm extraneous import
2019-12-03 10:14:02 -05:00
Rosanne Liu
83b1e6ac9e
fix the loss backward issue
...
(cherry picked from commit 566468cc984c6ec7e10dfc62b5b4191781a99cd2)
2019-12-03 10:14:02 -05:00
Julien Chaumond
572c24cfa2
PPLM (squashed)
...
Co-authored-by: piero <piero@uber.com>
Co-authored-by: Rosanne Liu <mimosavvy@gmail.com>
2019-12-03 10:14:02 -05:00
Thomas Wolf
f19a78a634
Merge pull request #1903 from valohai/master
...
Valohai integration
2019-12-03 16:13:01 +01:00
Thomas Wolf
d100ad99c0
Merge pull request #2014 from aaugustin/mark-tf-auto-model-test-as-slow
...
Mark tests in TFAutoModelTest as slow.
2019-12-03 16:03:48 +01:00
Juha Kiili
66fc8d25a5
Change ref to original GLUE downloader script
2019-12-03 10:49:50 +02:00
LysandreJik
fbaf05bd92
Remove annoying tokenization message
2019-12-02 18:23:00 -05:00
Lysandre
e85855f2c4
Fix ALBERT exports with pretraining + sp classifier; Fix naming for ALBERT TF models
2019-12-02 18:00:19 -05:00
Lysandre
b3d834ae11
Reorganize ALBERT conversion script
2019-12-02 15:01:52 -05:00
thomwolf
f3776df0f3
WIP debugging
2019-12-02 15:47:00 +01:00
Aymeric Augustin
5ab93083e4
Mark tests in TFAutoModelTest as slow.
...
Each test forces downloading the same 536MB file, which is slow
even with a decent internet connection.
2019-12-01 18:25:15 +01:00
Aditya Soni
c356290c8d
typo fix as per PyTorch v1.1+
2019-12-01 14:08:14 +05:30
Rostislav Nedelchev
76c0bc06d5
[XLNet] Changed post-processing of attention w.r.t. target_mapping
...
Whenever target_mapping is provided to the input, XLNet outputs two different attention streams.
Based on that, the attention output is one of the two:
- a list of tensors (usual case for most transformers)
- a list of 2-tuples of tensors, one tensor for each of the attention streams
Docs and unit-tests have been updated
2019-11-30 21:01:04 +01:00
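A caller that has to cope with both output shapes described in the commit message above could normalize them along these lines. This is a minimal sketch, not code from the repository: `split_attention_streams` is a hypothetical helper, the strings stand in for real tensors, and the commit message does not say which stream comes first in each 2-tuple, so the two are simply called "first" and "second".

```python
def split_attention_streams(attentions):
    """Normalize per-layer attention outputs into two parallel lists.

    Each element of `attentions` is assumed to be either a single tensor
    (one stream) or a 2-tuple of tensors (two streams). For single-stream
    layers the second list holds None.
    """
    first, second = [], []
    for layer_attention in attentions:
        if isinstance(layer_attention, tuple) and len(layer_attention) == 2:
            stream_a, stream_b = layer_attention
            first.append(stream_a)
            second.append(stream_b)
        else:
            first.append(layer_attention)
            second.append(None)
    return first, second


# Placeholder values stand in for tensors in this illustration.
single_stream = ["layer0", "layer1"]
two_stream = [("layer0_a", "layer0_b"), ("layer1_a", "layer1_b")]

first_only, missing = split_attention_streams(single_stream)
first_stream, second_stream = split_attention_streams(two_stream)
```

Returning two lists of equal length in both cases lets downstream code index by layer without checking which case it is in.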
Rostislav Nedelchev
b90791e950
fixed XLNet attention output for both attention streams
2019-11-30 15:57:51 +01:00
maxvidal
b0ee7c7df3
Added Camembert to available models
2019-11-29 14:17:02 -05:00