Lysandre
dc4e9e5cb3
DataParallel for SQuAD + fix XLM
2019-12-10 19:21:20 +00:00
Rémi Louf
07bc8efbc3
add greedy decoding and sampling
2019-12-10 17:27:50 +01:00
Rémi Louf
4b82c485de
remove misplaced summarization documentation
2019-12-10 09:13:33 -05:00
Thomas Wolf
e57d00ee10
Merge pull request #1984 from huggingface/squad-refactor
...
[WIP] Squad refactor
2019-12-10 11:07:26 +01:00
Suvrat Bhooshan
df3961121f
Add MMBT Model to Transformers Repo
2019-12-09 18:36:48 -08:00
Julien Chaumond
1d18930462
Harmonize no_cuda flag with other scripts
2019-12-09 20:37:55 -05:00
Rémi Louf
f7eba09007
clean for release
2019-12-09 20:37:55 -05:00
Rémi Louf
2a64107e44
improve device usage
2019-12-09 20:37:55 -05:00
Rémi Louf
c0707a85d2
add README
2019-12-09 20:37:55 -05:00
Rémi Louf
ade3cdf5ad
integrate ROUGE
2019-12-09 20:37:55 -05:00
Rémi Louf
076602bdc4
prevent BERT weights from being downloaded twice
2019-12-09 20:37:55 -05:00
Rémi Louf
a1994a71ee
simplified model and configuration
2019-12-09 20:37:55 -05:00
Rémi Louf
3a9a9f7861
default output dir to documents dir
2019-12-09 20:37:55 -05:00
Rémi Louf
693606a75c
update the docs
2019-12-09 20:37:55 -05:00
Rémi Louf
2403a66598
give transformers API to BertAbs
2019-12-09 20:37:55 -05:00
Rémi Louf
ba089c780b
share pretrained embeddings
2019-12-09 20:37:55 -05:00
Rémi Louf
9660ba1cbd
Add beam search
2019-12-09 20:37:55 -05:00
Rémi Louf
1c71ecc880
load the pretrained weights for encoder-decoder
...
We currently save the pretrained_weights of the encoder and decoder in
two separate directories `encoder` and `decoder`. However, for the
`from_pretrained` function to operate with automodels we need to
specify the type of model in the path to the weights.
The path to the encoder/decoder weights is handled by the
`PreTrainedEncoderDecoder` class in the `save_pretrained` function. Since
there is no easy way to infer the type of model that was initialized for
the encoder and decoder we add a parameter `model_type` to the function.
This is not an ideal solution as it is error prone, and the model type
should be carried by the Model classes somehow.
This is a temporary fix that should be changed before merging.
2019-12-09 20:37:55 -05:00
Rémi Louf
07f4cd73f6
update function to add special tokens
...
Since I started my PR the `add_special_token_single_sequence` function
has been deprecated for another; I replaced it with the new function.
2019-12-09 20:37:55 -05:00
Bilal Khan
79526f82f5
Remove unnecessary epoch variable
2019-12-09 16:24:35 -05:00
Bilal Khan
9626e0458c
Add functionality to continue training from last saved global_step
2019-12-09 16:24:35 -05:00
Bilal Khan
2d73591a18
Stop saving current epoch
2019-12-09 16:24:35 -05:00
Bilal Khan
0eb973b0d9
Use saved optimizer and scheduler states if available
2019-12-09 16:24:35 -05:00
Bilal Khan
a03fcf570d
Save tokenizer after each epoch to be able to resume training from a checkpoint
2019-12-09 16:24:35 -05:00
Bilal Khan
f71b1bb05a
Save optimizer state, scheduler state and current epoch
2019-12-09 16:24:35 -05:00
LysandreJik
2a4ef098d6
Add ALBERT and XLM to SQuAD script
2019-12-09 10:46:47 -05:00
Lysandre Debut
00c4e39581
Merge branch 'master' into squad-refactor
2019-12-09 10:41:15 -05:00
Thomas Wolf
5482822a2b
Merge pull request #2046 from jplu/tf2-ner-example
...
Add NER TF2 example.
2019-12-06 12:12:22 +01:00
LysandreJik
e9217da5ff
Cleanup
...
Improve global visibility on the run_squad script, remove unused files and fixes related to XLNet.
2019-12-05 16:01:51 -05:00
LysandreJik
9ecd83dace
Patch evaluation for impossible values + cleanup
2019-12-05 14:44:57 -05:00
VictorSanh
35ff345fc9
update requirements
2019-12-05 12:07:04 -05:00
VictorSanh
552c44a9b1
release distilm-bert
2019-12-05 10:14:58 -05:00
Rosanne Liu
ee53de7aac
Pr for pplm (#2060)
...
* license
* changes
* ok
* Update paper link and commands to run
* pointer to uber repo
2019-12-05 09:20:07 -05:00
Julien Plu
9200a759d7
Add a few tests on the TF optimization file with some info in the documentation. Complete the README.
2019-12-05 12:56:43 +01:00
thomwolf
75a97af6bc
fix #1450 - add doc
2019-12-05 11:26:55 +01:00
LysandreJik
f7e4a7cdfa
Cleanup
2019-12-04 16:24:15 -05:00
LysandreJik
cca75e7884
Kill the demon spawn
2019-12-04 15:42:29 -05:00
LysandreJik
9ddc3f1a12
Naming update + XLNet/XLM evaluation
2019-12-04 10:37:00 -05:00
thomwolf
5bfcd0485e
fix #1991
2019-12-04 14:53:11 +01:00
Julien Plu
ecb923da9c
Create a NER example similar to the PyTorch one. It takes the same options, and can be run the same way.
2019-12-04 09:43:15 +01:00
LysandreJik
de276de1c1
Working evaluation
2019-12-03 17:15:51 -05:00
Julien Chaumond
7edb51f3a5
[pplm] split classif head into its own file
2019-12-03 22:07:25 +00:00
VictorSanh
48cbf267c9
Use full dataset for eval (SequentialSampler in Distributed setting)
2019-12-03 11:01:37 -05:00
Julien Chaumond
f434bfc623
[pplm] Update S3 links
...
Co-Authored-By: Piero Molino <w4nderlust@gmail.com>
2019-12-03 10:53:02 -05:00
Ethan Perez
96e83506d1
Always use SequentialSampler during evaluation
...
When evaluating, shouldn't we always use the SequentialSampler instead of DistributedSampler? Evaluation only runs on 1 GPU no matter what, so if you use the DistributedSampler with N GPUs, I think you'll only evaluate on 1/N of the evaluation set. That's at least what I'm finding when I run an older/modified version of this repo.
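The 1/N effect described above can be illustrated with a small pure-Python sketch. It is a simplified model of strided sharding, not the actual `DistributedSampler` code; the helper names `shard_indices` and `sequential_indices` are hypothetical and exist only for this illustration.

```python
import math


def shard_indices(dataset_size, num_replicas, rank):
    """Simplified model of strided sharding across replicas (assumption:
    no shuffling, wrap-around padding so every rank gets the same count)."""
    per_rank = math.ceil(dataset_size / num_replicas)
    total = per_rank * num_replicas
    indices = list(range(dataset_size))
    # Pad by wrapping around so the index list divides evenly among ranks.
    indices += indices[: total - dataset_size]
    # Each rank takes every num_replicas-th index, starting at its own rank.
    return indices[rank:total:num_replicas]


def sequential_indices(dataset_size):
    """A sequential pass visits every example exactly once, in order."""
    return list(range(dataset_size))


dataset_size, num_gpus = 10, 4
rank0 = shard_indices(dataset_size, num_gpus, rank=0)
print(rank0)                                  # [0, 4, 8]
print(len(rank0), "of", dataset_size)         # 3 of 10
print(len(sequential_indices(dataset_size)))  # 10
```

If evaluation runs on a single process only, that process sees just its own shard (3 of 10 examples here), whereas a sequential pass covers the full evaluation set.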
2019-12-03 10:15:39 -05:00
Julien Chaumond
3b48806f75
[pplm] README: add setup + tweaks
2019-12-03 10:14:02 -05:00
Julien Chaumond
0cb2c90890
readme
...
Co-Authored-By: Rosanne Liu <mimosavvy@gmail.com>
2019-12-03 10:14:02 -05:00
Julien Chaumond
1efb2ae7fc
[pplm] move scripts under examples/pplm/
2019-12-03 10:14:02 -05:00
Piero Molino
a59fdd1627
generate_text_pplm now works with batch_size > 1
2019-12-03 10:14:02 -05:00
w4nderlust
893d0d64fe
Changed order of some parameters to be more consistent. Identical results.
2019-12-03 10:14:02 -05:00
w4nderlust
f42816e7fc
Added additional check for url and path in discriminator model params
2019-12-03 10:14:02 -05:00
w4nderlust
f10b925015
Improvements: model_path renamed pretrained_model, tokenizer loaded from pretrained_model, pretrained_model set to discriminator's when discrim is specified, sample = False by default but cli parameter introduced. To obtain identical samples call the cli with --sample
2019-12-03 10:14:02 -05:00
w4nderlust
75904dae66
Removed global variable device
2019-12-03 10:14:02 -05:00
piero
7fd54b55a3
Added support for generic discriminators
2019-12-03 10:14:02 -05:00
piero
b0eaff36e6
Added a +1 to epoch when saving weights
2019-12-03 10:14:02 -05:00
piero
611961ade7
Added tqdm to preprocessing
2019-12-03 10:14:02 -05:00
piero
afc7dcd94d
Now run_pplm works on cpu. Identical output as before (when using gpu).
2019-12-03 10:14:02 -05:00
piero
61399e5afe
Cleaned perturb_past. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
ffc2935405
Fix for making unconditioned generation work. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
9f693a0c48
Cleaned generate_text_pplm. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
61a12f790d
Renamed SmallConst to SMALL_CONST and introduced BIG_CONST. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
ef47b2c03a
Removed commented code. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
7ea12db3f5
Removed commented code. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
08c6e456a3
Cleaned full_text_generation. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
6c9c131780
More cleanup for run_model. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
7ffe47c888
Improved device specification
2019-12-03 10:14:02 -05:00
piero
4f2164e40e
First cleanup step, changing function names and passing parameters all the way through without using args. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
821de121e8
Minor changes
2019-12-03 10:14:02 -05:00
w4nderlust
7469d03b1c
Fixed minor bug when running training on cuda
2019-12-03 10:14:02 -05:00
piero
0b51fba20b
Added script for training a discriminator for pplm to use
2019-12-03 10:14:02 -05:00
Piero Molino
34a83faabe
Let's make PPLM great again
2019-12-03 10:14:02 -05:00
Julien Chaumond
d5faa74cd6
tokenizer white space: revert to previous behavior
2019-12-03 10:14:02 -05:00
Julien Chaumond
0b77d66a6d
rm extraneous import
2019-12-03 10:14:02 -05:00
Rosanne Liu
83b1e6ac9e
fix the loss backward issue
...
(cherry picked from commit 566468cc984c6ec7e10dfc62b5b4191781a99cd2)
2019-12-03 10:14:02 -05:00
Julien Chaumond
572c24cfa2
PPLM (squashed)
...
Co-authored-by: piero <piero@uber.com>
Co-authored-by: Rosanne Liu <mimosavvy@gmail.com>
2019-12-03 10:14:02 -05:00
Thomas Wolf
f19a78a634
Merge pull request #1903 from valohai/master
...
Valohai integration
2019-12-03 16:13:01 +01:00
maxvidal
b0ee7c7df3
Added Camembert to available models
2019-11-29 14:17:02 -05:00
Juha Kiili
41aa0e8003
Refactor logs and fix loss bug
2019-11-29 15:33:25 +02:00
Lysandre
bd41e8292a
Cleanup & Evaluation now works
2019-11-28 16:03:56 -05:00
Stefan Schweter
8c276b9c92
Merge branch 'master' into distilbert-german
2019-11-27 18:11:49 +01:00
VictorSanh
d5478b939d
add distilbert + update run_xnli wrt run_glue
2019-11-27 11:07:22 -05:00
VictorSanh
73fe2e7385
remove fstrings
2019-11-27 11:07:22 -05:00
VictorSanh
3e7656f7ac
update readme
2019-11-27 11:07:22 -05:00
VictorSanh
abd397e954
uniformize w/ the cache_dir update
2019-11-27 11:07:22 -05:00
VictorSanh
d5910b312f
move xnli processor (and utils) to transformers/data/processors
2019-11-27 11:07:22 -05:00
VictorSanh
289cf4d2b7
change default for XNLI: dev --> test
2019-11-27 11:07:22 -05:00
VictorSanh
84a0b522cf
mbert reproducibility results
2019-11-27 11:07:22 -05:00
VictorSanh
c4336ecbbd
xnli - output_mode consistency
2019-11-27 11:07:22 -05:00
VictorSanh
d52e98ff9a
add xnli examples/README.md
2019-11-27 11:07:22 -05:00
VictorSanh
71f71ddb3e
run_xnli + utils_xnli
2019-11-27 11:07:22 -05:00
Julien Chaumond
b5d884d25c
Uniformize #1952
2019-11-27 11:05:55 -05:00
Lysandre
4374eaea78
ALBERT for SQuAD
2019-11-26 13:08:12 -05:00
Lysandre
c110c41fdb
Run GLUE and remove LAMB
2019-11-26 13:08:12 -05:00
manansanghi
5d3b8daad2
Minor bug fixes on run_ner.py
2019-11-25 16:48:03 -05:00
İbrahim Ethem Demirci
aa92a184d2
resize model when special tokenizer present
2019-11-25 15:06:32 -05:00
Lysandre
7485caefb0
fix #1894
2019-11-25 09:33:39 -05:00
Julien Chaumond
176cd1ce1b
[doc] homogenize instructions slightly
2019-11-23 11:18:54 -05:00
Lysandre
c3ba645237
Works for XLNet
2019-11-22 16:27:37 -05:00
Lysandre
72e506b22e
wip
2019-11-22 16:26:00 -05:00
Rémi Louf
26db31e0c0
update the documentation
2019-11-21 14:41:19 -05:00