w4nderlust
f42816e7fc
Added additional check for url and path in discriminator model params
2019-12-03 10:14:02 -05:00
w4nderlust
f10b925015
Improvements: model_path renamed to pretrained_model, tokenizer loaded from pretrained_model, pretrained_model set to the discriminator's when discrim is specified, sample=False by default but a CLI parameter introduced. To obtain identical samples, call the CLI with --sample
2019-12-03 10:14:02 -05:00
w4nderlust
75904dae66
Removed global variable device
2019-12-03 10:14:02 -05:00
piero
7fd54b55a3
Added support for generic discriminators
2019-12-03 10:14:02 -05:00
piero
b0eaff36e6
Added a +1 to epoch when saving weights
2019-12-03 10:14:02 -05:00
piero
611961ade7
Added tqdm to preprocessing
2019-12-03 10:14:02 -05:00
piero
afc7dcd94d
Now run_pplm works on cpu. Identical output as before (when using gpu).
2019-12-03 10:14:02 -05:00
piero
61399e5afe
Cleaned perturb_past. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
ffc2935405
Fix for making unconditioned generation work. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
9f693a0c48
Cleaned generate_text_pplm. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
61a12f790d
Renamed SmallConst to SMALL_CONST and introduced BIG_CONST. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
ef47b2c03a
Removed commented code. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
7ea12db3f5
Removed commented code. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
08c6e456a3
Cleaned full_text_generation. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
6c9c131780
More cleanup for run_model. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
7ffe47c888
Improved device specification
2019-12-03 10:14:02 -05:00
piero
4f2164e40e
First cleanup step, changing function names and passing parameters all the way through without using args. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
821de121e8
Minor changes
2019-12-03 10:14:02 -05:00
w4nderlust
7469d03b1c
Fixed minor bug when running training on cuda
2019-12-03 10:14:02 -05:00
piero
0b51fba20b
Added script for training a discriminator for pplm to use
2019-12-03 10:14:02 -05:00
Piero Molino
34a83faabe
Let's make PPLM great again
2019-12-03 10:14:02 -05:00
Julien Chaumond
d5faa74cd6
tokenizer white space: revert to previous behavior
2019-12-03 10:14:02 -05:00
Julien Chaumond
0b77d66a6d
rm extraneous import
2019-12-03 10:14:02 -05:00
Rosanne Liu
83b1e6ac9e
fix the loss backward issue
...
(cherry picked from commit 566468cc984c6ec7e10dfc62b5b4191781a99cd2)
2019-12-03 10:14:02 -05:00
Julien Chaumond
572c24cfa2
PPLM (squashed)
...
Co-authored-by: piero <piero@uber.com>
Co-authored-by: Rosanne Liu <mimosavvy@gmail.com>
2019-12-03 10:14:02 -05:00
Thomas Wolf
f19a78a634
Merge pull request #1903 from valohai/master
...
Valohai integration
2019-12-03 16:13:01 +01:00
maxvidal
b0ee7c7df3
Added Camembert to available models
2019-11-29 14:17:02 -05:00
Juha Kiili
41aa0e8003
Refactor logs and fix loss bug
2019-11-29 15:33:25 +02:00
Lysandre
bd41e8292a
Cleanup & Evaluation now works
2019-11-28 16:03:56 -05:00
Stefan Schweter
8c276b9c92
Merge branch 'master' into distilbert-german
2019-11-27 18:11:49 +01:00
VictorSanh
d5478b939d
add distilbert + update run_xnli wrt run_glue
2019-11-27 11:07:22 -05:00
VictorSanh
73fe2e7385
remove fstrings
2019-11-27 11:07:22 -05:00
VictorSanh
3e7656f7ac
update readme
2019-11-27 11:07:22 -05:00
VictorSanh
abd397e954
uniformize w/ the cache_dir update
2019-11-27 11:07:22 -05:00
VictorSanh
d5910b312f
move xnli processor (and utils) to transformers/data/processors
2019-11-27 11:07:22 -05:00
VictorSanh
289cf4d2b7
change default for XNLI: dev --> test
2019-11-27 11:07:22 -05:00
VictorSanh
84a0b522cf
mbert reproducibility results
2019-11-27 11:07:22 -05:00
VictorSanh
c4336ecbbd
xnli - output_mode consistency
2019-11-27 11:07:22 -05:00
VictorSanh
d52e98ff9a
add xnli examples/README.md
2019-11-27 11:07:22 -05:00
VictorSanh
71f71ddb3e
run_xnli + utils_xnli
2019-11-27 11:07:22 -05:00
Julien Chaumond
b5d884d25c
Uniformize #1952
2019-11-27 11:05:55 -05:00
Lysandre
4374eaea78
ALBERT for SQuAD
2019-11-26 13:08:12 -05:00
Lysandre
c110c41fdb
Run GLUE and remove LAMB
2019-11-26 13:08:12 -05:00
manansanghi
5d3b8daad2
Minor bug fixes on run_ner.py
2019-11-25 16:48:03 -05:00
İbrahim Ethem Demirci
aa92a184d2
resize model when special tokenizer present
2019-11-25 15:06:32 -05:00
Lysandre
7485caefb0
fix #1894
2019-11-25 09:33:39 -05:00
Julien Chaumond
176cd1ce1b
[doc] homogenize instructions slightly
2019-11-23 11:18:54 -05:00
Lysandre
c3ba645237
Works for XLNet
2019-11-22 16:27:37 -05:00
Lysandre
72e506b22e
wip
2019-11-22 16:26:00 -05:00
Rémi Louf
26db31e0c0
update the documentation
2019-11-21 14:41:19 -05:00
Juha Kiili
2cf3447e0a
Glue: log in Valohai-compatible JSON format too
2019-11-21 12:35:25 +02:00
Thomas Wolf
0cdfcca24b
Merge pull request #1860 from stefan-it/camembert-for-token-classification
...
[WIP] Add support for CamembertForTokenClassification
2019-11-21 10:56:07 +01:00
Jin Young Sohn
e70cdf083d
Cleanup TPU bits from run_glue.py
...
TPU runner is currently implemented in:
https://github.com/pytorch-tpu/transformers/blob/tpu/examples/run_glue_tpu.py .
We plan to upstream this directly into `huggingface/transformers`
(either `master` or `tpu`) branch once it's been more thoroughly tested.
2019-11-20 17:54:34 -05:00
Lysandre
454455c695
fix #1879
2019-11-20 09:42:48 -05:00
Stefan Schweter
e7cf2ccd15
distillation: add German distilbert model
2019-11-19 19:55:19 +01:00
Kazutoshi Shinoda
f3386d9383
typo "deay" -> "decay"
2019-11-18 11:50:06 -05:00
Stefan Schweter
56c84863a1
camembert: add support for CamemBERT in run_ner example
2019-11-18 17:06:57 +01:00
Julien Chaumond
26858f27cb
[camembert] Upload to s3 + rename script
2019-11-16 00:11:07 -05:00
Louis MARTIN
3e20c2e871
Update demo_camembert.py with new classes
2019-11-16 00:11:07 -05:00
Louis MARTIN
f12e4d8da7
Move demo_camembert.py to examples/contrib
2019-11-16 00:11:07 -05:00
Louis MARTIN
6e72fd094c
Add demo_camembert.py
2019-11-16 00:11:07 -05:00
Xu Hongshen
ca99a2d500
Update example readme
2019-11-15 14:55:26 +08:00
Xu Hongshen
7da3ef24cd
add is_impossible tensor to model inputs when fine-tuning XLNet on SQuAD 2.0
2019-11-15 14:18:53 +08:00
Thomas Wolf
74ce8de7d8
Merge pull request #1792 from stefan-it/distilbert-for-token-classification
...
DistilBERT for token classification
2019-11-14 22:47:53 +01:00
Thomas Wolf
05db5bc1af
added small comparison between BERT, RoBERTa and DistilBERT
2019-11-14 22:40:22 +01:00
Thomas Wolf
9629e2c676
Merge pull request #1804 from ronakice/master
...
fix multi-gpu eval in torch examples
2019-11-14 22:24:05 +01:00
Thomas Wolf
df99f8c5a1
Merge pull request #1832 from huggingface/memory-leak-schedulers
...
replace LambdaLR scheduler wrappers by function
2019-11-14 22:10:31 +01:00
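
A function-based linear warmup/decay schedule looks roughly like this (a sketch; the helper's actual name in the library may differ):

    from torch.optim.lr_scheduler import LambdaLR

    def linear_schedule_with_warmup(optimizer, num_warmup_steps, num_training_steps):
        # Return a plain LambdaLR built from a closure instead of subclassing
        # LambdaLR in a wrapper class.
        def lr_lambda(current_step):
            if current_step < num_warmup_steps:
                return float(current_step) / float(max(1, num_warmup_steps))
            return max(
                0.0,
                float(num_training_steps - current_step)
                / float(max(1, num_training_steps - num_warmup_steps)),
            )
        return LambdaLR(optimizer, lr_lambda)
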
Rémi Louf
2276bf69b7
update the examples, docs and template
2019-11-14 20:38:02 +01:00
Lysandre
d7929899da
Specify checkpoint in saved file for run_lm_finetuning.py
2019-11-14 10:49:00 -05:00
ronakice
2e31176557
fix multi-gpu eval
2019-11-12 05:55:11 -05:00
Stefan Schweter
2b07b9e5ee
examples: add DistilBert support for NER fine-tuning
2019-11-11 16:19:34 +01:00
Adrian Bauer
7a9aae1044
Fix run_bertology.py
...
Make imports and args.overwrite_cache match run_glue.py
2019-11-08 16:28:40 -05:00
Julien Chaumond
f88c104d8f
[run_tf_glue] Add comment for context
2019-11-05 19:56:43 -05:00
Julien Chaumond
30968d70af
misc doc
2019-11-05 19:06:12 -05:00
Thomas Wolf
e99071f105
Merge pull request #1734 from orena1/patch-1
...
add progress bar to convert_examples_to_features
2019-11-05 11:34:20 +01:00
Thomas Wolf
ba973342e3
Merge pull request #1553 from WilliamTambellini/timeSquadInference
...
Add speed log to examples/run_squad.py
2019-11-05 11:13:12 +01:00
Thomas Wolf
237fad339c
Merge pull request #1709 from oneraghavan/master
...
Fixing mode in evaluate during training
2019-11-05 10:55:33 +01:00
Oren Amsalem
d7906165a3
add progress bar for convert_examples_to_features
...
It takes a considerable amount of time (~10 min) to parse the examples into features, so it is good to have a progress bar to track this
2019-11-05 10:34:27 +02:00
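
A minimal sketch of the progress-bar idea, assuming a convert_examples_to_features-style loop (names and loop body here are illustrative):

    from tqdm import tqdm

    def convert_examples_to_features(examples, tokenizer):
        features = []
        # Wrapping the long-running conversion loop in tqdm gives a visible
        # progress bar without changing the conversion itself.
        for example in tqdm(examples, desc="convert examples to features"):
            features.append(tokenizer.encode(example, add_special_tokens=True))
        return features
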
thomwolf
89d6272898
Fix #1623
2019-11-04 16:21:12 +01:00
Thomas Wolf
9a3b173cd3
Merge branch 'master' into master
2019-11-04 11:41:26 +01:00
thomwolf
ad90868627
Update example readme
2019-11-04 11:27:22 +01:00
Raghavan
e5b1048bae
Fixing mode in evaluate during training
2019-11-03 16:14:46 +05:30
Lysandre
1a2b40cb53
run_tf_glue MRPC evaluation only for MRPC
2019-10-31 18:00:51 -04:00
Timothy Liu
be36cf92fb
Added mixed precision support to benchmarks.py
2019-10-31 17:24:37 -04:00
Julien Chaumond
f96ce1c241
[run_generation] Fix generation with batch_size>1
2019-10-31 18:27:11 +00:00
Julien Chaumond
3c1b6f594e
Merge branch 'master' into fix_top_k_top_p_filtering
2019-10-31 13:53:51 -04:00
Victor SANH
fa735208c9
update readme - fix example command distil*
2019-10-30 14:27:28 -04:00
Thomas Wolf
c7058d8224
Merge pull request #1608 from focox/master
...
Error raised by "tmp_eval_loss += tmp_eval_loss.item()" when using multi-gpu
2019-10-30 17:14:07 +01:00
Thomas Wolf
04c69db399
Merge pull request #1628 from huggingface/tfglue
...
run_tf_glue works with all tasks
2019-10-30 17:04:03 +01:00
Thomas Wolf
3df4367244
Merge pull request #1601 from huggingface/clean-roberta
...
Clean roberta model & all tokenizers now add special tokens by default (breaking change)
2019-10-30 17:00:40 +01:00
Thomas Wolf
36174696cc
Merge branch 'master' into clean-roberta
2019-10-30 16:51:06 +01:00
Thomas Wolf
228cdd6a6e
Merge branch 'master' into conditional-generation
2019-10-30 16:40:35 +01:00
Rémi Louf
070507df1f
format utils for summarization
2019-10-30 11:24:12 +01:00
Rémi Louf
da10de8466
fix bug with padding mask + add corresponding test
2019-10-30 11:19:58 +01:00
Rémi Louf
3b0d2fa30e
rename seq2seq to encoder_decoder
2019-10-30 10:54:46 +01:00
Rémi Louf
9c1bdb5b61
revert renaming of lm_labels to ltr_lm_labels
2019-10-30 10:43:13 +01:00
Rémi Louf
098a89f312
update docstrings; rename lm_labels to more explicit ltr_lm_labels
2019-10-29 20:08:03 +01:00
Rémi Louf
dfce409691
resolve PR comments
2019-10-29 17:10:20 +01:00
altsoph
079bfb32fb
Evaluation fixed.
2019-10-28 10:18:58 -04:00
altsoph
438f2730a0
Evaluation code fixed.
2019-10-28 10:18:58 -04:00
Rémi Louf
4c3ac4a7d8
here's one big commit
2019-10-28 10:49:50 +01:00
Rémi Louf
932543f77e
fix test of truncation function
2019-10-28 10:49:49 +01:00
Rémi Louf
a67413ccc8
extend works in-place
2019-10-28 10:49:49 +01:00
Rémi Louf
b915ba9dfe
pad sequence with 0, mask with -1
2019-10-28 10:49:49 +01:00
Lysandre
bab6ad01aa
run_tf_glue works with all tasks
2019-10-24 21:41:45 +00:00
Matt Maybeno
ae1d03fc51
Add roberta to doc
2019-10-24 14:32:48 -04:00
Matt Maybeno
4e5f88b74f
Add Roberta to run_ner.py
2019-10-24 14:32:48 -04:00
VictorSanh
5b6cafb11b
[release] fix table weirdness
2019-10-23 10:35:16 -04:00
VictorSanh
8ad5c591cd
[RELEASE] DistilRoBERTa
2019-10-23 10:29:47 -04:00
focox@qq.com
bd847ce7d7
fixed the bug raised by "tmp_eval_loss += tmp_eval_loss.item()" when running in parallel on multiple GPUs.
2019-10-23 20:27:13 +08:00
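
The underlying issue: under torch.nn.DataParallel the per-batch loss comes back as a vector with one entry per GPU, so it has to be reduced before calling .item(). A minimal sketch (the helper name is hypothetical):

    import torch

    def reduce_eval_loss(tmp_eval_loss: torch.Tensor, n_gpu: int) -> float:
        # .item() raises on a multi-element tensor; average the per-GPU
        # losses first when more than one GPU is used.
        if n_gpu > 1:
            tmp_eval_loss = tmp_eval_loss.mean()
        return tmp_eval_loss.item()
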
Julien Chaumond
ef1b8b2ae5
[CTRL] warn if generation prompt does not start with a control code
...
see also https://github.com/salesforce/ctrl/pull/50
2019-10-22 21:30:32 +00:00
Lysandre
7d709e55ed
Remove
2019-10-22 14:12:33 -04:00
Lysandre
1cfd974868
Option to benchmark only one of the two libraries
2019-10-22 13:32:23 -04:00
Pasquale Minervini
abd7110e21
gradient norm clipping should be done right before calling the optimiser - fixing run_glue and run_ner as well
2019-10-21 19:56:52 +01:00
Pasquale Minervini
3775550c4b
gradient norm clipping should be done right before calling the optimiser
2019-10-20 22:33:56 +01:00
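
A sketch of the ordering this describes, in a run_glue-style training step (function and argument names are illustrative):

    import torch

    def training_step(model, batch, optimizer, scheduler, max_grad_norm=1.0):
        loss = model(**batch)[0]
        loss.backward()
        # Clip the gradient norm immediately before the optimiser consumes
        # the gradients, not earlier in the loop.
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
        optimizer.step()
        scheduler.step()
        model.zero_grad()
        return loss.item()
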
LysandreJik
7dd29ed2f1
Benchmarks example script
2019-10-18 10:53:04 -04:00
William Tambellini
0919389d9a
Add speed log to examples/run_squad.py
...
Add a speed estimate log (time per example)
for evaluation to examples/run_squad.py
2019-10-17 14:41:04 -07:00
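
The estimate amounts to wall-clock evaluation time divided by the number of examples; a rough sketch (names are illustrative, not the script's own):

    import logging
    import time

    logger = logging.getLogger(__name__)

    def timed_evaluate(model, eval_dataloader, num_examples):
        start_time = time.time()
        for batch in eval_dataloader:
            pass  # forward pass and prediction collection go here
        elapsed = time.time() - start_time
        logger.info("Evaluation done in %f secs (%f sec per example)",
                    elapsed, elapsed / num_examples)
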
leo-du
ecd15667f3
fix repetition penalty
2019-10-17 14:47:14 -04:00
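
The usual CTRL-style penalty pushes previously generated tokens down regardless of the sign of their logit; a sketch assuming a 1-D logits tensor for a single sequence (not the script's exact code):

    import torch

    def apply_repetition_penalty(logits: torch.Tensor, generated_ids, penalty: float = 1.2):
        for token_id in set(generated_ids):
            # Dividing a negative logit by the penalty would *raise* its
            # probability, so multiply instead when the logit is negative.
            if logits[token_id] < 0:
                logits[token_id] *= penalty
            else:
                logits[token_id] /= penalty
        return logits
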
thomwolf
8cd56e3036
fix data processing in script
2019-10-17 16:33:26 +02:00
Rémi Louf
578d23e061
add training pipeline (formatting temporary)
2019-10-17 14:02:27 +02:00
Rémi Louf
47a06d88a0
use two different tokenizers for story and summary
2019-10-17 13:04:26 +02:00
Rémi Louf
bfb9b540d4
add Model2Model to __init__
2019-10-17 12:59:51 +02:00
Rémi Louf
c1bc709c35
correct the truncation and padding of dataset
2019-10-17 10:41:53 +02:00
Rémi Louf
e4e0ee14bd
add separator between data import and train
2019-10-16 20:05:32 +02:00
Rémi Louf
0d81fc853e
specify in readme that both datasets are required
2019-10-15 15:26:33 +02:00
Rémi Louf
1aec940587
test the full story processing
2019-10-15 15:18:07 +02:00
Rémi Louf
22e1af6859
truncation function is fully tested
2019-10-15 14:43:50 +02:00
Rémi Louf
260ac7d9a8
wip commit, switching computers
2019-10-15 12:24:35 +02:00
thomwolf
be916cb3fb
Merge branch 'master' of https://github.com/huggingface/transformers
2019-10-15 10:37:13 +02:00
thomwolf
5875aaf762
install tensorboard
2019-10-15 10:36:46 +02:00
Thomas Wolf
40f14ff545
Merge pull request #1513 from slayton58/amp_fp16_einsum
...
Force einsum to run in fp16
2019-10-15 10:25:00 +02:00
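
One common way to do this with NVIDIA apex is to put torch.einsum on the half-precision cast list before initialisation; a sketch under that assumption (not necessarily the PR's exact change):

    import torch
    from apex import amp

    # Cast torch.einsum inputs to fp16 under mixed precision so the call can
    # use tensor cores; must be registered before amp.initialize.
    amp.register_half_function(torch, "einsum")

    # model, optimizer = amp.initialize(model, optimizer, opt_level="O1")
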
Thomas Wolf
d147671c6c
Merge pull request #1508 from tlkh/master
...
Added performance enhancements (XLA, AMP) to examples
2019-10-15 09:57:18 +02:00
thomwolf
2c1d5564ad
add readme information
2019-10-15 09:56:52 +02:00
thomwolf
c55badcee0
Add NER finetuning details by @stefan-it in example readme
2019-10-15 09:33:52 +02:00
Julien Chaumond
788e632622
[ner] Honor args.overwrite_cache
2019-10-15 09:17:31 +02:00
thomwolf
0f9ebb0b43
add seqeval as requirement for examples
2019-10-15 09:17:31 +02:00
thomwolf
66adb71734
update to transformers
2019-10-15 09:17:31 +02:00
Marianne Stecklina
5ff9cd158a
Add option to predict on test set
2019-10-15 09:17:31 +02:00
Marianne Stecklina
7f5367e0b1
Add cli argument for configuring labels
2019-10-15 09:17:31 +02:00
Marianne Stecklina
e1d4179b64
Make file reading more robust
2019-10-15 09:17:31 +02:00
Marianne Stecklina
383ef96747
Implement fine-tuning BERT on CoNLL-2003 named entity recognition task
2019-10-15 09:17:31 +02:00
Marianne Stecklina
5adb39e757
Add option to predict on test set
2019-10-15 09:14:53 +02:00
Marianne Stecklina
99b189df6d
Add cli argument for configuring labels
2019-10-15 09:14:53 +02:00
Marianne Stecklina
3e9420add1
Make file reading more robust
2019-10-15 09:14:53 +02:00
Marianne Stecklina
cde42c4354
Implement fine-tuning BERT on CoNLL-2003 named entity recognition task
2019-10-15 09:14:53 +02:00
hlums
74c5035808
Fix token order in xlnet preprocessing.
2019-10-14 21:27:11 +00:00
Rémi Louf
fe25eefc15
add instructions to fetch the dataset
2019-10-14 20:45:39 +02:00
Rémi Louf
412793275d
delegate the padding with special tokens to the tokenizer
2019-10-14 20:45:16 +02:00
Rémi Louf
447fffb21f
process the raw CNN/Daily Mail dataset
...
the data provided by Li Dong et al. were already tokenized, which means
that they are not compatible with all the models in the library. We
thus process the raw data directly and tokenize them using the models'
tokenizers.
2019-10-14 18:12:20 +02:00
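
A minimal sketch of encoding one raw .story file with a model's own tokenizer (the path and variable names are hypothetical; highlights in the raw files are delimited by "@highlight" lines):

    from transformers import BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

    with open("cnn/stories/example.story", encoding="utf-8") as f:  # hypothetical path
        raw = f.read()

    # Everything before the first "@highlight" is the article; the rest are
    # the reference highlights (summary sentences).
    story_text, _, highlights = raw.partition("@highlight")
    story_ids = tokenizer.encode(story_text.strip(), add_special_tokens=True)
    summary_ids = tokenizer.encode(highlights.replace("@highlight", " ").strip(),
                                   add_special_tokens=True)
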
Simon Layton
4e6a55751a
Force einsum to fp16
2019-10-14 11:12:41 -04:00