VictorSanh
5b6cafb11b
[release] fix table weirdness
2019-10-23 10:35:16 -04:00
VictorSanh
8ad5c591cd
[RELEASE] DistilRoBERTa
2019-10-23 10:29:47 -04:00
focox@qq.com
bd847ce7d7
fixed the bug raised by "tmp_eval_loss += tmp_eval_loss.item()" when running in parallel on multiple GPUs.
2019-10-23 20:27:13 +08:00
Julien Chaumond
ef1b8b2ae5
[CTRL] warn if generation prompt does not start with a control code
...
see also https://github.com/salesforce/ctrl/pull/50
2019-10-22 21:30:32 +00:00
Lysandre
7d709e55ed
Remove
2019-10-22 14:12:33 -04:00
Lysandre
1cfd974868
Option to benchmark only one of the two libraries
2019-10-22 13:32:23 -04:00
Pasquale Minervini
abd7110e21
gradient norm clipping should be done right before calling the optimiser - fixing run_glue and run_ner as well
2019-10-21 19:56:52 +01:00
Pasquale Minervini
3775550c4b
gradient norm clipping should be done right before calling the optimiser
2019-10-20 22:33:56 +01:00
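The two commits above pin down where gradient norm clipping belongs in the training loop. A minimal sketch of that ordering (a toy model and random data stand in for the actual scripts; this is not the repo's code):

```python
import torch

# Clip the gradient norm after backward() and immediately before
# optimizer.step(); the linear model and data here are placeholders.
model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for _ in range(3):
    loss = model(torch.randn(8, 4)).pow(2).mean()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # right before the step
    optimizer.step()
    optimizer.zero_grad()
```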
LysandreJik
7dd29ed2f1
Benchmarks example script
2019-10-18 10:53:04 -04:00
William Tambellini
0919389d9a
Add speed log to examples/run_squad.py
...
Add a speed estimate log (time per example)
for evaluation to examples/run_squad.py
2019-10-17 14:41:04 -07:00
leo-du
ecd15667f3
fix repetition penalty
2019-10-17 14:47:14 -04:00
thomwolf
8cd56e3036
fix data processing in script
2019-10-17 16:33:26 +02:00
Rémi Louf
578d23e061
add training pipeline (formatting temporary)
2019-10-17 14:02:27 +02:00
Rémi Louf
47a06d88a0
use two different tokenizers for story and summary
2019-10-17 13:04:26 +02:00
Rémi Louf
bfb9b540d4
add Model2Model to __init__
2019-10-17 12:59:51 +02:00
Rémi Louf
c1bc709c35
correct the truncation and padding of dataset
2019-10-17 10:41:53 +02:00
Rémi Louf
e4e0ee14bd
add separator between data import and train
2019-10-16 20:05:32 +02:00
Rémi Louf
0d81fc853e
specify in readme that both datasets are required
2019-10-15 15:26:33 +02:00
Rémi Louf
1aec940587
test the full story processing
2019-10-15 15:18:07 +02:00
Rémi Louf
22e1af6859
truncation function is fully tested
2019-10-15 14:43:50 +02:00
Rémi Louf
260ac7d9a8
wip commit, switching computers
2019-10-15 12:24:35 +02:00
thomwolf
be916cb3fb
Merge branch 'master' of https://github.com/huggingface/transformers
2019-10-15 10:37:13 +02:00
thomwolf
5875aaf762
install tensorboard
2019-10-15 10:36:46 +02:00
Thomas Wolf
40f14ff545
Merge pull request #1513 from slayton58/amp_fp16_einsum
...
Force einsum to run in fp16
2019-10-15 10:25:00 +02:00
Thomas Wolf
d147671c6c
Merge pull request #1508 from tlkh/master
...
Added performance enhancements (XLA, AMP) to examples
2019-10-15 09:57:18 +02:00
thomwolf
2c1d5564ad
add readme information
2019-10-15 09:56:52 +02:00
thomwolf
c55badcee0
Add NER finetuning details by @stefan-it in example readme
2019-10-15 09:33:52 +02:00
Julien Chaumond
788e632622
[ner] Honor args.overwrite_cache
2019-10-15 09:17:31 +02:00
thomwolf
0f9ebb0b43
add seqeval as requirement for examples
2019-10-15 09:17:31 +02:00
thomwolf
66adb71734
update to transformers
2019-10-15 09:17:31 +02:00
Marianne Stecklina
5ff9cd158a
Add option to predict on test set
2019-10-15 09:17:31 +02:00
Marianne Stecklina
7f5367e0b1
Add cli argument for configuring labels
2019-10-15 09:17:31 +02:00
Marianne Stecklina
e1d4179b64
Make file reading more robust
2019-10-15 09:17:31 +02:00
Marianne Stecklina
383ef96747
Implement fine-tuning BERT on CoNLL-2003 named entity recognition task
2019-10-15 09:17:31 +02:00
Marianne Stecklina
5adb39e757
Add option to predict on test set
2019-10-15 09:14:53 +02:00
Marianne Stecklina
99b189df6d
Add cli argument for configuring labels
2019-10-15 09:14:53 +02:00
Marianne Stecklina
3e9420add1
Make file reading more robust
2019-10-15 09:14:53 +02:00
Marianne Stecklina
cde42c4354
Implement fine-tuning BERT on CoNLL-2003 named entity recognition task
2019-10-15 09:14:53 +02:00
hlums
74c5035808
Fix token order in xlnet preprocessing.
2019-10-14 21:27:11 +00:00
Rémi Louf
fe25eefc15
add instructions to fetch the dataset
2019-10-14 20:45:39 +02:00
Rémi Louf
412793275d
delegate the padding with special tokens to the tokenizer
2019-10-14 20:45:16 +02:00
Rémi Louf
447fffb21f
process the raw CNN/Daily Mail dataset
...
the data provided by Li Dong et al. were already tokenized, which means
that they are not compatible with all the models in the library. We
thus process the raw data directly and tokenize them using the models'
tokenizers.
2019-10-14 18:12:20 +02:00
Simon Layton
4e6a55751a
Force einsum to fp16
2019-10-14 11:12:41 -04:00
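A sketch of the idea behind "Force einsum to fp16", assuming the NVIDIA apex API of the time (apex must be installed; this is an illustration, not necessarily the commit's exact code):

```python
import torch
from apex import amp  # requires NVIDIA apex

# Register torch.einsum as a half-precision function so amp casts its
# inputs to fp16 under mixed precision instead of falling back to fp32.
amp.register_half_function(torch, "einsum")
```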
Rémi Louf
67d10960ae
load and prepare CNN/Daily Mail data
...
We write a function to load and preprocess the CNN/Daily Mail dataset as
provided by Li Dong et al. The issue is that this dataset has already
been tokenized by the authors, so we actually need to find the original,
plain-text dataset if we want to apply it to all models.
2019-10-14 14:11:20 +02:00
Timothy Liu
376e65a674
Added automatic mixed precision and XLA options to run_tf_glue.py
2019-10-13 13:19:06 +00:00
Timothy Liu
86f23a1944
Minor enhancements to run_tf_glue.py
2019-10-13 10:21:35 +00:00
VictorSanh
d844db4005
Add citation bibtex
2019-10-11 16:55:42 -04:00
Rémi Louf
b3261e7ace
read parameters from CLI, load model & tokenizer
2019-10-11 18:40:38 +02:00
Rémi Louf
d889e0b71b
add base for seq2seq finetuning
2019-10-11 17:36:12 +02:00
Thomas Wolf
4428aefc63
Merge pull request #1488 from huggingface/pytorch-tpu
...
GLUE on TPU
2019-10-11 16:33:00 +02:00
Luran He
f382a8decd
convert int to str before adding to a str
2019-10-10 19:20:39 -04:00
Lysandre
639f4b7190
Don't save/load when on TPU
2019-10-10 19:17:25 +00:00
Lysandre
d4e7934ac3
GLUE on TPU
2019-10-10 19:03:06 +00:00
Rémi Louf
1e68c28670
add test for initialization of Bert2Rnd
2019-10-10 18:07:11 +02:00
Thomas Wolf
6596e3d566
Merge pull request #1454 from bkkaggle/pytorch-built-in-tensorboard
...
Change tensorboard imports to use built-in tensorboard if available
2019-10-10 11:56:55 +02:00
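A sketch of the import fallback this PR describes: prefer the tensorboard support built into PyTorch (1.1+), fall back to tensorboardX otherwise. The log directory and scalar are placeholders:

```python
try:
    from torch.utils.tensorboard import SummaryWriter
except ImportError:
    from tensorboardX import SummaryWriter

writer = SummaryWriter(log_dir="runs/example")  # placeholder path
writer.add_scalar("loss", 0.5, global_step=1)
writer.close()
```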
thomwolf
177a721205
move back to simple space splitting
2019-10-10 11:45:47 +02:00
thomwolf
a5997dd81a
better error messages
2019-10-10 11:31:01 +02:00
Lysandre Debut
2431fea98a
Merge pull request #1383 from keskarnitish/master
...
Adding CTRL
2019-10-09 11:31:05 -04:00
thomwolf
d9e60f4f0d
Merge branch 'master' into pr/1383
2019-10-09 17:25:08 +02:00
Lysandre Debut
e84470ef81
Merge pull request #1384 from huggingface/encoding-qol
...
Quality of life enhancements in encoding + patch MLM masking
2019-10-09 11:18:24 -04:00
jinoobaek-qz
69629c4f0f
Improve naming and only do regex when necessary
2019-10-09 08:48:40 -04:00
jinoobaek-qz
bf34a252b8
Golden path
2019-10-09 08:48:40 -04:00
jinoobaek-qz
528d3f327b
Improve readability and make fewer assumptions about checkpoint format
2019-10-09 08:48:40 -04:00
jinoobaek-qz
56301bd9e8
Extract method
2019-10-09 08:48:40 -04:00
jinoobaek-qz
d6c5469712
Delete older checkpoint after saving new checkpoint
2019-10-09 08:48:40 -04:00
jinoobaek-qz
54a31f50fb
Add save_total_limit
2019-10-09 08:48:40 -04:00
Thomas Wolf
439fac723a
Merge pull request #1409 from brian41005/master
...
Evaluation result.txt path changing #1286
2019-10-09 03:14:34 +02:00
Bilal Khan
5ce8d29abe
Change tensorboard imports to use built-in tensorboard if available
2019-10-08 16:29:43 -05:00
VictorSanh
7ce83b4931
update weights for distilgpt2
2019-10-07 12:30:27 -04:00
LysandreJik
f3e0218fbb
Correct device assignment in run_generation
2019-10-05 21:05:16 -04:00
thomwolf
78ef1a9930
fixes
2019-10-04 17:59:44 -04:00
thomwolf
6c1d0bc066
update encode_plus - add truncation strategies
2019-10-04 17:38:38 -04:00
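A hedged usage sketch of the truncation strategies this commit adds to encode_plus; the parameter name and value follow the 2019-era API and should be treated as assumptions:

```python
from transformers import BertTokenizer

# Pair encoding with a length cap; "longest_first" trims the longer of the
# two sequences first until max_length is satisfied.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
encoded = tokenizer.encode_plus(
    "a long question",
    "a much longer passage paired with it",
    max_length=16,
    truncation_strategy="longest_first",
)
print(len(encoded["input_ids"]))  # at most 16
```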
VictorSanh
0820bb0555
unnecessary carriage return
2019-10-04 17:23:15 -04:00
VictorSanh
f5891c3821
run_squad --> run_squad_w_distillation
2019-10-04 17:23:15 -04:00
VictorSanh
764a7923ec
add distillation+finetuning option in run_squad
2019-10-04 17:23:15 -04:00
thomwolf
92c0f2fb90
Merge remote-tracking branch 'origin/julien_multiple-choice' into encoding-qol
2019-10-04 15:48:06 -04:00
Julien Chaumond
9e136ff57c
Honor args.overwrite_cache (h/t @erenup)
2019-10-04 15:00:56 -04:00
keskarnitish
dbed1c5d94
Adding CTRL (squashed commit)
...
adding conversion script
adding first draft of modeling & tokenization
adding placeholder for test files
bunch of changes
registering the tokenizer/model/etc
tests
change link; something is very VERY wrong here
weird end-of-word thingy going on
i think the tokenization works now; wrote the unit tests
overall structure works; load w next
the monster is alive!
works after some cleanup as well
adding emacs autosave to gitignore
currently only supporting the 48 layer one; seems to infer fine on my macbook
cleanup
fixing some documentation
fixing some documentation
tests passing?
now works on CUDA also
adding greedy?
adding greedy sampling
works well
2019-10-03 22:29:03 -07:00
Lysandre Debut
d3f24dfad7
Merge branch 'master' into master
2019-10-03 22:43:09 +00:00
LysandreJik
ecc4f1bdfa
XLM use_lang_embedding flag in run_generation
2019-10-03 17:42:16 -04:00
LysandreJik
c2c2ca0fdb
Added XLM to run_generation, with prompt language selection.
2019-10-03 17:18:48 -04:00
LysandreJik
aebd83230f
Update naming + remove f string in run_lm_finetuning example
2019-10-03 11:31:36 -04:00
LysandreJik
5ed50a93fb
LM finetuning won't mask special tokens anymore
2019-10-03 11:31:36 -04:00
Brian Ma
7af0777910
Update run_glue.py
...
add DistilBert model shortcut into ALL_MODELS
2019-10-03 15:31:11 +00:00
VictorSanh
5f07d8f11a
prepare release
2019-10-03 10:27:11 -04:00
VictorSanh
35071007cb
incoming release 🔥 update links to arxiv preprint
2019-10-03 10:27:11 -04:00
VictorSanh
2a91f6071f
update README - TODO update link to paper
2019-10-03 10:27:11 -04:00
VictorSanh
c51e533a5f
update train.py
2019-10-03 10:27:11 -04:00
VictorSanh
a76c3f9cb0
update requirements
2019-10-03 10:27:11 -04:00
VictorSanh
bb9c5ead54
update distiller
2019-10-03 10:27:11 -04:00
VictorSanh
a12ab0a8db
update binarized_data
2019-10-03 10:27:11 -04:00
VictorSanh
4d6dfbd376
update extract
2019-10-03 10:27:11 -04:00
VictorSanh
23edebc079
update extract_distilbert
2019-10-03 10:27:11 -04:00
VictorSanh
cbfcfce205
update token_counts
2019-10-03 10:27:11 -04:00
VictorSanh
19e4ebbe3f
grouped_batch_sampler
2019-10-03 10:27:11 -04:00
VictorSanh
594202a934
lm_seqs_dataset
2019-10-03 10:27:11 -04:00
VictorSanh
38084507c4
add distillation_configs
2019-10-03 10:27:11 -04:00
Brian Ma
2195c0d5f9
Evaluation result.txt path changing #1286
2019-10-03 12:49:12 +08:00
Thomas Wolf
963529e29b
Merge pull request #1288 from echan00/master
...
Typo with LM Fine tuning script
2019-10-01 18:46:07 -04:00
thomwolf
f7978f70ec
use format instead of f-strings
2019-10-01 18:45:38 -04:00
Julien Chaumond
b350662955
overflowing_tokens do not really make sense here, let's just return a number
...
Co-Authored-By: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
2019-09-30 16:37:09 -04:00
Julien Chaumond
f5bcde0b2f
[multiple-choice] Simplify and use tokenizer.encode_plus
2019-09-30 16:04:55 -04:00
Denny
9478590630
Update run_lm_finetuning.py
...
The previous method, just as phrased, did not exist in the class.
2019-09-27 15:18:42 -03:00
Thomas Wolf
d83d295763
Merge pull request #1337 from mgrankin/fastdataset
...
faster dataset building
2019-09-27 10:35:12 +02:00
thomwolf
da2e47ad15
clean up a little run_tf_glue
2019-09-27 09:41:15 +02:00
thomwolf
528c288fa9
clean up run_tf_glue
2019-09-27 09:40:29 +02:00
VictorSanh
702f589848
fix input in run_glue for distilbert
2019-09-27 00:20:14 -04:00
mgrankin
f71a4577b8
faster dataset building
2019-09-26 16:53:13 +03:00
thomwolf
481d9c4fb5
Merge branch 'master' into tf2
2019-09-26 12:02:54 +02:00
thomwolf
31c23bd5ee
[BIG] pytorch-transformers => transformers
2019-09-26 10:15:53 +02:00
thomwolf
5705333441
add initialization for everybody
2019-09-26 10:06:20 +02:00
thomwolf
7c9f8f93f9
fix tests
2019-09-26 01:59:53 +02:00
thomwolf
d6dde438ea
add batch dimension in encode
2019-09-26 01:45:55 +02:00
thomwolf
4a21c4d88d
add warning if neither pt nor tf are found
2019-09-26 01:30:06 +02:00
thomwolf
3b7fb48c3b
fix loading from tf/pt
2019-09-25 17:46:16 +02:00
thomwolf
a049c8043b
push fix to training
2019-09-25 17:33:16 +02:00
mataney
a9f24a16bc
[FIX] fix run_generation.py to work with batch_size > 1
2019-09-25 15:53:29 +03:00
thomwolf
5def3302f4
update run_glue
2019-09-25 12:38:08 +02:00
thomwolf
f71758f7a4
update internal glue processors
2019-09-25 12:00:50 +02:00
thomwolf
b5ec526f85
updated data processor and metrics
2019-09-24 17:10:50 +02:00
LysandreJik
f09e5ecef0
[Proposal] GLUE processors included in library
2019-09-24 09:47:34 -04:00
LysandreJik
c832f43a4d
output_token_type -> token_type_ids
2019-09-24 07:21:38 -04:00
LysandreJik
3927d7756c
Updated the GLUE pre-processing method
2019-09-24 07:15:11 -04:00
LysandreJik
9d44236f70
Updated DistilBERT
2019-09-24 07:03:24 -04:00
Lorenzo Ampil
4b543c3007
Add option to use a 'stop token', used to truncate the output text to everything right before the 'stop token'
2019-09-22 21:38:38 +08:00
VictorSanh
9f995b99d4
minor fixes
2019-09-19 21:36:06 +00:00
VictorSanh
3fe5c8e8a8
update bert-base-uncased rslts
2019-09-19 19:34:22 +00:00
VictorSanh
354944e607
[distillation] big update w/ new weights
2019-09-19 19:25:21 +00:00
LysandreJik
60414f31a9
GLUE updated with new methods
2019-09-19 10:55:06 +02:00
LysandreJik
bf503158c5
Sentence -> Sequence. Removed output_mask from the special token addition methods.
2019-09-19 10:55:06 +02:00
LysandreJik
de8e14b6c0
Added DistilBERT to run_squad script
2019-09-19 10:55:06 +02:00
LysandreJik
88368c2a16
Added DistilBERT to run_lm_finetuning
2019-09-19 10:55:06 +02:00
LysandreJik
75635072e1
Updated GLUE script to add DistilBERT. Cleaned up unused args in the utils file.
2019-09-19 10:55:06 +02:00
LysandreJik
59057abe52
typo
2019-09-19 10:55:06 +02:00
LysandreJik
bac332fec0
Updated the GLUE data processor. Corrections to RoBERTa and XLNet.
2019-09-19 10:55:06 +02:00
Erik Chan
f0340eccf9
Typo
...
Typo
2019-09-18 13:42:11 -07:00
erenup
8960988f35
fixed to find best dev acc
2019-09-19 01:10:05 +08:00
erenup
46ffc28329
Merge branch 'master' into run_multiple_choice_merge
2019-09-18 21:43:46 +08:00
erenup
15143fbad6
move run_multiple_choice.py and utils_multiple_choice.py to examples
2019-09-18 21:18:46 +08:00
erenup
3cd6289758
Merge remote-tracking branch 'huggingface/master' into run_multiple_choice_merge
...
# Conflicts:
# examples/contrib/run_swag.py
2019-09-18 21:16:59 +08:00
erenup
36362cf086
move schedule.step after optimizer.step
2019-09-18 21:13:40 +08:00
thomwolf
e768f2322a
update run_openai_gpt to fix #1264
2019-09-18 10:07:47 +02:00
thomwolf
8334993915
clean up examples - updated to new keyword inputs - #1246
2019-09-18 10:01:27 +02:00
erenup
5882c442e5
add example usage
2019-09-16 22:38:08 +08:00
erenup
982f181aa7
Merge remote-tracking branch 'origin/master' into run_multiple_choice_add_doc
2019-09-16 19:12:00 +08:00
erenup
84b9d1c423
Merge remote-tracking branch 'huggingface/master'
...
# Conflicts:
# pytorch_transformers/__init__.py
2019-09-16 19:06:12 +08:00
erenup
603b470a3d
add warning info
2019-09-16 18:53:37 +08:00
erenup
4812a5a767
add doc string
2019-09-16 11:50:18 +08:00
VictorSanh
32e1332acf
[distil] fix once for all general logger for scripts
2019-09-11 14:19:07 +00:00
VictorSanh
364920e216
fix small bug/typo
2019-09-10 21:45:01 +00:00
Thomas Wolf
23c23f5399
Merge pull request #1229 from SKRohit/master
...
changes in evaluate function in run_lm_finetuning.py
2019-09-10 22:16:45 +02:00
searchivarius
eab980fd68
Fix to prevent crashing on assert len(tokens_b)>=1
2019-09-09 19:58:08 -04:00
VictorSanh
a95ced6260
[Distillation] save last chkpt as pytorch_model.bin
2019-09-09 19:53:35 +00:00
Rohit Kumar Singh
e5df36397b
changes in return statement of evaluate function
...
changed `results` to `result` and removed `results` dict defined previously
2019-09-09 19:55:57 +05:30
LysandreJik
3f91338be9
Patched a few outdated parameters
2019-09-06 17:48:06 -04:00
LysandreJik
f47f9a5874
Updated outdated examples
2019-09-06 17:10:33 -04:00
LysandreJik
5e151f5e77
Table of contents
2019-09-06 12:08:36 -04:00
LysandreJik
593c070435
Better examples
2019-09-06 12:00:12 -04:00
VictorSanh
dddd6b9927
Update DistilBERT training code
2019-09-05 18:26:14 +00:00
Stefan Schweter
a1c34bd286
distillation: fix ModuleNotFoundError error in token counts script
2019-08-31 12:21:38 +02:00
Thomas Wolf
51e980ce36
Merge pull request #1155 from anhnt170489/apex_fp16
...
Update apex fp16 implementation
2019-08-30 23:29:11 +02:00
VictorSanh
282c276e09
typos + file name coherence in distillation README
2019-08-30 12:02:29 -04:00
VictorSanh
803c1cc4ea
fix relative import bug cf Issue #1140
2019-08-30 12:01:27 -04:00
Thomas Wolf
0a2fecdf90
Merge branch 'master' into master
2019-08-30 16:30:08 +02:00
Rabeeh KARIMI
39eb31e11e
remove reloading tokenizer in the training, adding it to the evaluation part
2019-08-30 15:44:41 +02:00
Rabeeh KARIMI
350bb6bffa
updated tokenizer loading for addressing reproducibility issues
2019-08-30 15:34:28 +02:00
Thomas Wolf
01ad55f8cf
Merge pull request #1026 from rabeehk/master
...
loads the tokenizer for each checkpoint, to solve the reproducibility…
2019-08-30 14:15:36 +02:00
erenup
6e1ac34e2b
Merge remote-tracking branch 'huggingface/master'
2019-08-30 15:50:11 +08:00
jamin
2fb9a934b4
re-format
2019-08-30 14:05:28 +09:00
jamin
c8731b9583
update apex fp16 implementation
2019-08-30 13:54:00 +09:00
LysandreJik
caf1d116a6
Closing bracket in DistilBERT's token count.
2019-08-29 15:30:10 -04:00
Luis
fe8fb10b44
Small modification of comment in the run_glue.py example
...
Add RoBERTa to the comment as it was not explicit that RoBERTa don't use token_type_ids.
2019-08-29 14:43:30 +02:00
erenup
942d3f4b20
modify code of arc label insurance
2019-08-29 10:21:17 +08:00
LysandreJik
bf3dc778b8
Changed learning rate for run_squad test
2019-08-28 18:24:43 -04:00
Andreas Daiminger
1d15a7f278
swap order of optimizer.step() and scheduler.step()
2019-08-28 19:18:27 +02:00
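This commit (and several sibling commits later in this log) applies the ordering required since PyTorch 1.1: optimizer.step() before scheduler.step(). A minimal sketch with a toy model and schedule as placeholders:

```python
import torch

model = torch.nn.Linear(2, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lambda step: 0.95 ** step)

for _ in range(3):
    loss = model(torch.randn(4, 2)).sum()
    loss.backward()
    optimizer.step()   # update the weights first...
    scheduler.step()   # ...then advance the learning-rate schedule
    optimizer.zero_grad()
```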
Thomas Wolf
0ecfd17f49
Merge pull request #987 from huggingface/generative-finetuning
...
Generative finetuning
2019-08-28 16:51:50 +02:00
thomwolf
b5eb283aaa
update credits
2019-08-28 16:36:55 +02:00
thomwolf
912a377e90
dilbert -> distilbert
2019-08-28 13:59:42 +02:00
thomwolf
4ce5f36f78
update readmes
2019-08-28 12:14:31 +02:00
erenup
ec4b1c659f
logging truth error
2019-08-28 16:50:40 +08:00
erenup
df52abe373
add sep_token between question and choice
2019-08-28 16:36:21 +08:00
erenup
43c243254a
avoid invalid labels of truth
2019-08-28 16:03:17 +08:00
erenup
3c7e676f8b
add test related code: test the best dev acc model when model is training
2019-08-28 15:57:29 +08:00
VictorSanh
93e82ab424
Write README for DilBERT
2019-08-28 06:26:09 +00:00
VictorSanh
fea921d382
add licensing
2019-08-28 04:45:39 +00:00
VictorSanh
da1e4e53fc
some fixes in train.py for loading previous checkpoint
2019-08-28 04:01:03 +00:00
VictorSanh
0d8f8848d5
add scripts/extract_for_distil.py
2019-08-28 04:00:19 +00:00
VictorSanh
7f2c384c80
add scripts/token_counts.py
2019-08-28 04:00:03 +00:00
VictorSanh
4d16b279e5
add scripts/binarized_data.py
2019-08-28 03:59:48 +00:00
VictorSanh
b247b0d880
add train.py for distillation
2019-08-28 02:12:47 +00:00
VictorSanh
780f183e55
add requirements
2019-08-28 01:39:52 +00:00
VictorSanh
e424d2e45d
add README
2019-08-28 01:10:10 +00:00
VictorSanh
1ae81e4aa1
add dataset, distiller, utils
2019-08-28 01:10:05 +00:00
thomwolf
06510ccb53
typo
2019-08-23 22:08:10 +02:00
thomwolf
ab7bd5ef98
fixing tokenization and training
2019-08-23 17:31:21 +02:00
Thomas Wolf
90dcd8c05d
Merge branch 'master' into generative-finetuning
2019-08-22 10:43:30 +02:00
VictorSanh
57272d5ddf
fix for glue
2019-08-22 00:25:49 -04:00
VictorSanh
b006a7a12f
fix for squad
2019-08-22 00:25:42 -04:00
Thomas Wolf
9beaa85b07
Merge pull request #1055 from qipeng/run_squad_fix
...
Fix #1015 (tokenizer defaults to use_lower_case=True when loading from trained models)
2019-08-21 01:20:46 +02:00
Lysandre
2d042274ac
Sequence special token handling for BERT and RoBERTa
2019-08-20 14:15:28 -04:00
Peng Qi
3bffd2e8e5
more fixes
2019-08-20 10:59:28 -07:00
Thomas Wolf
3b56427a1e
Merge pull request #1040 from FeiWang96/multi_gpu
...
Fix bug of multi-gpu training in lm finetuning
2019-08-20 17:13:44 +02:00
thomwolf
a690edab17
various fixes and cleanup on run_lm_finetuning
2019-08-20 15:52:12 +02:00
erenup
fc74132598
add best steps to train
2019-08-20 19:06:41 +08:00
Duzeyao
d86b49ac86
swap optimizer.step and scheduler.step
2019-08-20 16:46:34 +08:00
Duzeyao
45ab8bf60e
Revert "Update finetune_on_pregenerated.py"
...
This reverts commit a1359b970c.
2019-08-20 16:40:39 +08:00
erenup
97c30b73d5
add test related code
2019-08-20 16:31:04 +08:00
erenup
d5e60e5b7a
add test related code
2019-08-20 16:25:50 +08:00
Zeyao Du
a1359b970c
Update finetune_on_pregenerated.py
2019-08-20 16:00:07 +08:00
Zeyao Du
28f7ca1f80
swap optimizer.step and scheduler.step
2019-08-20 15:58:42 +08:00
Peng Qi
a368b87791
Fix #1015
2019-08-19 13:07:00 -07:00
Lysandre
f94f1c6016
Distributed training + tokenizer agnostic mask token
2019-08-19 14:58:50 -04:00
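A sketch of a tokenizer-agnostic mask token lookup, as opposed to hard-coding "[MASK]" (which only holds for BERT-style vocabularies). The post-rename package name is used here for illustration:

```python
from transformers import BertTokenizer

# Ask the tokenizer for its own mask token and id instead of assuming one.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
mask_id = tokenizer.convert_tokens_to_ids(tokenizer.mask_token)
print(tokenizer.mask_token, mask_id)
```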
Thomas Wolf
5a49b793d9
Merge pull request #1023 from tuvuumass/patch-1
...
fix issue #824
2019-08-19 15:31:46 +02:00
erenup
4270d3da1b
fix a bug of evaluating
2019-08-19 16:38:52 +08:00
Chi-Liang Liu
40acf6b52a
don't save model without training
2019-08-18 05:02:25 -04:00
erenup
47e9aea0fe
add args info to evaluate_result.txt
2019-08-18 17:00:53 +08:00
erenup
5582bc4b23
add multiple choice to roberta and xlnet, test on swag, roberta=0.82.28, xlnet=0.80
2019-08-18 16:01:48 +08:00
wangfei
856a63da4d
Fix: save model/model.module
2019-08-18 11:03:47 +08:00
wangfei
1ef41b8337
Revert "Fix: save model/model.module"
...
This reverts commit 00e9c4cc96.
2019-08-18 11:03:12 +08:00
wangfei
00e9c4cc96
Fix: save model/model.module
2019-08-18 11:02:02 +08:00
erenup
e384ae2b9d
Merge remote-tracking branch 'huggingface/master'
...
merge huggingface/master to update
2019-08-17 12:05:57 +08:00
Jason Phang
d8923270e6
Correct truncation for RoBERTa in 2-input GLUE
2019-08-16 16:30:38 -04:00
Lysandre
5652f54ac2
Simplified data generator + better perplexity calculator
...
GPT-2 now obtains ~20 perplexity on WikiText-2
2019-08-16 13:49:56 -04:00
LysandreJik
7e7fc53da5
Fixing run_glue example with RoBERTa
2019-08-16 11:53:10 -04:00
LysandreJik
715534800a
BERT + RoBERTa masking tokens handling + GPU device update.
2019-08-16 10:10:21 -04:00
LysandreJik
339e556feb
CLM for BERT, beginning of CLM for RoBERTa; still needs a better masking token mechanism.
2019-08-16 10:10:20 -04:00
LysandreJik
5c18825a18
Removed dataset limit
2019-08-16 10:10:20 -04:00
LysandreJik
3e3e145497
Added GPT to the generative fine-tuning.
2019-08-16 10:10:20 -04:00
LysandreJik
47975ed53e
Language Modeling fine-tuning using GPT-2.
2019-08-16 10:10:20 -04:00
wangfei
b8ff56896c
Fix bug of multi-gpu training in lm finetuning
2019-08-16 12:11:05 +08:00
Rabeeh KARIMI
3d47a7f8ab
loads the tokenizer for each checkpoint, to solve the reproducibility issue
2019-08-14 10:58:26 +02:00
LysandreJik
39f426be65
Added special tokens <pad> and <mask> to RoBERTa.
2019-08-13 15:19:50 -04:00
Julien Chaumond
baf08ca1d4
[RoBERTa] run_glue: correct pad_token + reorder labels
2019-08-13 12:51:15 -04:00
tuvuumass
ba4bce2581
fix issue #824
2019-08-13 11:26:27 -04:00
Julien Chaumond
912fdff899
[RoBERTa] Update run_glue for RoBERTa
2019-08-12 13:49:50 -04:00
erenup
b219029c45
refactoring old run_swag. This script is mainly refactored from run_squad in pytorch_transformers
2019-08-11 15:20:37 +08:00
Thomas Wolf
b4f9464f90
Merge pull request #960 from ethanjperez/patch-1
...
Fixing unused weight_decay argument
2019-08-07 10:09:55 +02:00
Thomas Wolf
d43dc48b34
Merge branch 'master' into auto_models
2019-08-05 19:17:35 +02:00
thomwolf
70c10caa06
add option mentioned in #940
2019-08-05 17:09:37 +02:00
thomwolf
b90e29d52c
working on automodels
2019-08-05 16:06:34 +02:00
Ethan Perez
28ba345ecc
Fixing unused weight_decay argument
...
Currently the L2 regularization is hard-coded to "0.01", even though there is a --weight_decay flag implemented (that is unused). I'm making this flag control the weight decay used for fine-tuning in this script.
2019-08-04 12:31:46 -04:00
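A sketch of the fix described above: route the --weight_decay flag into the optimizer's parameter groups instead of a hard-coded 0.01. The toy model and the no_decay name list are illustrative:

```python
import argparse

import torch

parser = argparse.ArgumentParser()
parser.add_argument("--weight_decay", type=float, default=0.01)
args = parser.parse_args([])  # empty list: defaults, for demonstration

model = torch.nn.Linear(4, 2)
no_decay = ["bias"]
grouped_parameters = [
    {"params": [p for n, p in model.named_parameters()
                if not any(nd in n for nd in no_decay)],
     "weight_decay": args.weight_decay},  # now controlled by the flag
    {"params": [p for n, p in model.named_parameters()
                if any(nd in n for nd in no_decay)],
     "weight_decay": 0.0},
]
optimizer = torch.optim.SGD(grouped_parameters, lr=1e-3)
```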
Thomas Wolf
c054b5ee64
Merge pull request #896 from zijunsun/master
...
fix multi-gpu training bug when using fp16
2019-07-26 19:31:02 +02:00
zijunsun
f0aeb7a814
multi-gpu training also should be after apex fp16 (squad)
2019-07-26 15:23:29 +08:00
zijunsun
adb3ef6368
multi-gpu training also should be after apex fp16
2019-07-25 13:09:10 +08:00
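A sketch of the ordering these two commits enforce, assuming the apex API of the time (apex and a CUDA device required; model and optimizer are placeholders):

```python
import torch
from apex import amp  # requires NVIDIA apex

# Initialize apex mixed precision first, then wrap for multi-GPU.
model = torch.nn.Linear(4, 2).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
model, optimizer = amp.initialize(model, optimizer, opt_level="O1")
if torch.cuda.device_count() > 1:
    model = torch.nn.DataParallel(model)  # only after amp.initialize()
```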
Chi-Liang Liu
a7fce6d917
fix squad v1 error (na_prob_file should be None)
2019-07-24 16:11:36 +08:00
thomwolf
6070b55443
fix #868
2019-07-23 17:46:01 +02:00
thomwolf
2c9a3115b7
fix #858
2019-07-23 16:45:55 +02:00
Thomas Wolf
268c6cc160
Merge pull request #845 from rabeehk/master
...
fixed version issues in run_openai_gpt
2019-07-23 15:29:31 +02:00
Peiqin Lin
76be189b08
typos
2019-07-21 20:39:42 +08:00
Rabeeh KARIMI
f63ff536ad
fixed version issues in run_openai_gpt
2019-07-20 12:43:07 +02:00
Thomas Wolf
a615499076
Merge pull request #797 from yzy5630/fix-examples
...
fix some errors for distributed lm_finetuning
2019-07-18 23:32:33 +02:00
yzy5630
a1fe4ba9c9
use new API for save and load
2019-07-18 15:45:23 +08:00
yzy5630
a7ba27b1b4
add parser for adam
2019-07-18 08:52:51 +08:00
yzy5630
d6522e2873
change loss and optimizer to new API
2019-07-17 21:22:34 +08:00
thomwolf
71d597dad0
fix #800
2019-07-17 13:51:09 +02:00
yzy5630
123da5a2fa
fix errors for lm_finetuning examples
2019-07-17 09:56:07 +08:00
yzy5630
60a1bdcdac
fix some errors for distributed lm_finetuning
2019-07-17 09:16:20 +08:00
thomwolf
e848b54730
fix #792
2019-07-16 21:22:19 +02:00
thomwolf
1849aa7d39
update readme and pretrained model weight files
2019-07-16 15:11:29 +02:00
thomwolf
f31154cb9d
Merge branch 'xlnet'
2019-07-16 11:51:13 +02:00
thomwolf
76da9765b6
fix run_generation test
2019-07-15 17:52:35 +02:00
thomwolf
e691fc0963
update QA models tests + run_generation
2019-07-15 17:45:24 +02:00
thomwolf
15d8b1266c
update tokenizer - update squad example for xlnet
2019-07-15 17:30:42 +02:00
thomwolf
3b469cb422
updating squad for compatibility with XLNet
2019-07-15 15:28:37 +02:00
thomwolf
0e9825e252
small fix to run_glue
2019-07-14 23:43:28 +02:00
thomwolf
2397f958f9
updating examples and doc
2019-07-14 23:20:10 +02:00
thomwolf
c490f5ce87
added generation examples in tests
2019-07-13 15:26:58 +02:00
thomwolf
7d4b200e40
good quality generation example for GPT, GPT-2, Transfo-XL, XLNet
2019-07-13 15:25:03 +02:00
thomwolf
7322c314a6
remove python2 testing for examples
2019-07-12 14:24:08 +02:00
thomwolf
936e813c84
clean up examples - added squad example and test
2019-07-12 14:16:06 +02:00
thomwolf
762ded9b1c
wip examples
2019-07-12 11:28:52 +02:00
LysandreJik
3821ecbf4a
Byte order mark management in TSV glue reading.
2019-07-11 20:16:28 -04:00
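One plausible way to neutralize the byte order mark this commit targets (the commit's exact mechanism is not shown here): the "utf-8-sig" codec strips a leading BOM if present. The file path is a placeholder:

```python
import csv

# Read a GLUE-style TSV; "utf-8-sig" silently drops a leading BOM.
with open("train.tsv", encoding="utf-8-sig") as f:
    rows = list(csv.reader(f, delimiter="\t"))
```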
thomwolf
c6bf1a400d
fix test examples and pretrained model
2019-07-11 22:29:08 +02:00
thomwolf
92a782b108
fix run_glue test
2019-07-11 22:20:10 +02:00
thomwolf
ccb6947dc1
optimization tests
2019-07-11 17:39:47 +02:00
thomwolf
b21d84b027
update examples
2019-07-11 15:37:34 +02:00
thomwolf
ec07cf5a66
revamp optimization
2019-07-11 14:48:22 +02:00
thomwolf
4fef5919a5
updating examples
2019-07-11 12:03:08 +02:00
thomwolf
50b7e52a7f
WIP examples
2019-07-10 15:33:34 +02:00
thomwolf
ed6c8d37f4
fix merge
2019-07-09 17:14:52 +02:00
thomwolf
4ce237c880
update run_glue
2019-07-09 17:00:32 +02:00
thomwolf
3b7cb7bf44
small update to run_glue
2019-07-09 16:12:15 +02:00
thomwolf
d0efbd3cd1
update sequencesummary module
2019-07-09 15:46:43 +02:00
thomwolf
d5481cbe1b
adding tests to examples - updating summary module - coverage update
2019-07-09 15:29:42 +02:00
thomwolf
b19786985d
unified tokenizer api and serialization + tests
2019-07-09 10:25:18 +02:00
thomwolf
3d5f291386
updates to run_glue
2019-07-05 17:22:15 +02:00
thomwolf
99b90edab1
cleaning up run_glue example
2019-07-05 17:09:35 +02:00
thomwolf
1113f97f33
clean up glue example
2019-07-05 16:31:13 +02:00
thomwolf
162ba383b0
fix model loading
2019-07-05 15:57:14 +02:00
thomwolf
36bca545ff
tokenization abstract class - tests for examples
2019-07-05 15:02:59 +02:00
Thomas Wolf
78462aad61
Merge pull request #733 from ceremonious/parallel-generation
...
Added option to use multiple workers to create training data
2019-07-05 12:04:30 +02:00
thomwolf
0bab55d5d5
[BIG] name change
2019-07-05 11:55:36 +02:00
thomwolf
c41f2bad69
WIP XLM + refactoring
2019-07-03 22:54:39 +02:00
Lei Mao
64b2a828c0
fix evaluation bug
2019-07-01 14:56:24 -07:00
thomwolf
2b56e98892
standardizing API across models - XLNetForSeqClass working
2019-06-28 16:35:09 +02:00
thomwolf
3a00674cbf
fix imports
2019-06-27 17:18:46 +02:00
Mayhul Arora
08ff056c43
Added option to use multiple workers to create training data for lm fine tuning
2019-06-26 16:16:12 -07:00
thomwolf
59cefd4f98
fix #726 - get_lr in examples
2019-06-26 11:28:27 +02:00
thomwolf
092dacfd62
changing is_regression to unified API
2019-06-26 09:54:05 +02:00
thomwolf
e55d4c4ede
various updates to conversion, models and examples
2019-06-26 00:57:53 +02:00
thomwolf
7334bf6c21
pad on left for xlnet
2019-06-24 15:05:11 +02:00
thomwolf
c888663f18
overwrite output directories if needed
2019-06-24 14:38:24 +02:00
thomwolf
62d78aa37e
updating GLUE utils for compatibility with XLNet
2019-06-24 14:36:11 +02:00
thomwolf
24ed0b9346
updating run_xlnet_classifier
2019-06-24 12:00:09 +02:00
thomwolf
f6081f2255
add xlnetforsequence classif and run_classifier example for xlnet
2019-06-24 10:01:07 +02:00
Rocketknight1
c7b2808ed7
Update LM finetuning README to include a literature reference
2019-06-22 15:04:01 +01:00
thomwolf
181075635d
updating model loading and adding special tokens ids
2019-06-21 23:23:37 +02:00
thomwolf
ebd2cb8d74
update from_pretrained to load XLNetModel as well
2019-06-21 21:08:44 +02:00
thomwolf
edfe91c36e
first version bertology ok
2019-06-19 23:43:04 +02:00
thomwolf
7766ce66dd
update bertology
2019-06-19 22:29:51 +02:00
thomwolf
e4b46d86ce
update head pruning
2019-06-19 22:16:30 +02:00
thomwolf
0f40e8d6a6
debugger
2019-06-19 15:38:46 +02:00
thomwolf
0e1e8128bf
more logging
2019-06-19 15:35:49 +02:00
thomwolf
909d4f1af2
cuda again
2019-06-19 15:32:10 +02:00
thomwolf
14f0e8e557
fix cuda
2019-06-19 15:29:28 +02:00
thomwolf
34d706a0e1
pruning in bertology
2019-06-19 15:25:49 +02:00
thomwolf
dc8e0019b7
updating examples
2019-06-19 13:23:20 +02:00
thomwolf
68ab9599ce
small fix and updates to readme
2019-06-19 09:38:38 +02:00
thomwolf
f7e2ac01ea
update barrier
2019-06-18 22:43:35 +02:00
thomwolf
4d8c4337ae
test barrier in distrib training
2019-06-18 22:41:28 +02:00
thomwolf
3359955622
updating run_classif
2019-06-18 22:23:10 +02:00
thomwolf
29b7b30eaa
updating evaluation on a single gpu
2019-06-18 22:20:21 +02:00
thomwolf
7d2001aa44
overwrite_output_dir
2019-06-18 22:13:30 +02:00
thomwolf
16a1f338c4
fixing
2019-06-18 17:06:31 +02:00
thomwolf
92e0ad5aba
no numpy
2019-06-18 17:00:52 +02:00
thomwolf
4e6edc3274
hop
2019-06-18 16:57:15 +02:00
thomwolf
f55b60b9ee
fixing again
2019-06-18 16:56:52 +02:00
thomwolf
8bd9118294
quick fix
2019-06-18 16:54:41 +02:00
thomwolf
3e847449ad
fix out_label_ids
2019-06-18 16:53:31 +02:00
thomwolf
aad3a54e9c
fix paths
2019-06-18 16:48:04 +02:00
thomwolf
40dbda6871
updating classification example
2019-06-18 16:45:52 +02:00
thomwolf
7388c83b60
update run_classifier for distributed eval
2019-06-18 16:32:49 +02:00
thomwolf
9727723243
fix pickle
2019-06-18 16:02:42 +02:00
thomwolf
9710b68dbc
fix pickles
2019-06-18 16:01:15 +02:00
thomwolf
15ebd67d4e
cache in run_classifier + various fixes to the examples
2019-06-18 15:58:22 +02:00
thomwolf
e6e5f19257
fix
2019-06-18 14:45:14 +02:00
thomwolf
a432b3d466
distributed training t_total
2019-06-18 14:39:09 +02:00
thomwolf
c5407f343f
split squad example in two
2019-06-18 14:29:03 +02:00
thomwolf
335f57baf8
only on main process
2019-06-18 14:03:46 +02:00
thomwolf
326944d627
add tensorboard to run_squad
2019-06-18 14:02:42 +02:00
thomwolf
d82e5deeb1
set find_unused_parameters=True in DDP
2019-06-18 12:13:14 +02:00
thomwolf
a59abedfb5
DDP update
2019-06-18 12:06:26 +02:00
thomwolf
2ef5e0de87
switch to pytorch DistributedDataParallel
2019-06-18 12:03:13 +02:00
thomwolf
9ce37af99b
oops
2019-06-18 11:47:54 +02:00
thomwolf
a40955f071
no need to duplicate models anymore
2019-06-18 11:46:14 +02:00
thomwolf
382e2d1e50
splitting config and weight files for bert also
2019-06-18 10:37:16 +02:00
Thomas Wolf
cad88e19de
Merge pull request #672 from oliverguhr/master
...
Add vocabulary and model config to the finetune output
2019-06-14 17:02:47 +02:00
Thomas Wolf
460d9afd45
Merge pull request #640 from Barqawiz/master
...
Support latest multi language bert fine tune
2019-06-14 16:57:02 +02:00
Thomas Wolf
277c77f1c5
Merge pull request #630 from tguens/master
...
Update run_squad.py
2019-06-14 16:56:26 +02:00
Thomas Wolf
659af2cbd0
Merge pull request #604 from samuelbroscheit/master
...
Fixing issue "Training beyond specified 't_total' steps with schedule 'warmup_linear'" reported in #556
2019-06-14 16:49:24 +02:00
Meet Pragnesh Shah
e02ce4dc79
[hotfix] Fix frozen pooler parameters in SWAG example.
2019-06-11 15:13:53 -07:00
Oliver Guhr
5c08c8c273
adds the tokenizer + model config to the output
2019-06-11 13:46:33 +02:00
jeonsworld
a3a604cefb
Update pregenerate_training_data.py
...
apply the Whole Word Masking technique, referring to [create_pretraining_data.py](https://github.com/google-research/bert/blob/master/create_pretraining_data.py)
2019-06-10 12:17:23 +09:00
Ahmad Barqawi
c4fe56dcc0
support latest multi language bert fine tune
...
fix issue of bert-base-multilingual and add support for uncased multilingual
2019-05-27 11:27:41 +02:00
tguens
9e7bc51b95
Update run_squad.py
...
Indentation change so that the output "nbest_predictions.json" is not empty.
2019-05-22 17:27:59 +08:00
samuelbroscheit
94247ad6cb
Make num_train_optimization_steps int
2019-05-13 12:38:22 +02:00
samuel.broscheit
49a77ac16f
Clean up a little bit
2019-05-12 00:31:10 +02:00
samuel.broscheit
3bf3f9596f
Fixing the issues reported in https://github.com/huggingface/pytorch-pretrained-BERT/issues/556
...
Reason for the issue was that optimization steps were computed from the example count, which differs from the actual size of the dataloader when an example is chunked into multiple instances.
Solution in this pull request is to compute num_optimization_steps directly from len(data_loader).
2019-05-12 00:13:45 +02:00
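A sketch of the fix described in the body above: derive the optimization step count from len(data_loader), which reflects chunked instances, rather than from the raw example count. Dataset, batch size, and hyperparameters are placeholders:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.zeros(100, 4))
data_loader = DataLoader(dataset, batch_size=8)
gradient_accumulation_steps, num_train_epochs = 2, 3
num_train_optimization_steps = (
    len(data_loader) // gradient_accumulation_steps * num_train_epochs
)
print(num_train_optimization_steps)  # 13 // 2 * 3 = 18
```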
burcturkoglu
00c7fd2b79
Division of global_step by num_train_optimizer in lr_this_step is removed.
2019-05-09 10:57:03 +03:00
burcturkoglu
fa37b4da77
Merge branch 'master' of https://github.com/huggingface/pytorch-pretrained-BERT
2019-05-09 10:55:24 +03:00
burcturkoglu
5289b4b9e0
Division of global_step by num_train_optimizer in lr_this_step is removed.
2019-05-09 10:51:38 +03:00
Thomas Wolf
0198399d84
Merge pull request #570 from MottoX/fix-1
...
Create optimizer only when args.do_train is True
2019-05-08 16:07:50 +02:00
MottoX
18c8aef9d3
Fix documentation typo
2019-05-02 19:23:36 +08:00
MottoX
74dbba64bc
Prepare optimizer only when args.do_train is True
2019-05-02 19:09:29 +08:00
Aneesh Pappu
365fb34c6c
small fix to remove shifting of lm labels during preprocessing of ROC Stories, as this shifting happens internally in the model
2019-04-30 13:53:04 -07:00
Thomas Wolf
2dee86319d
Merge pull request #527 from Mathieu-Prouveur/fix_value_training_loss
...
Update example files so that tr_loss is not affected by args.gradient…
2019-04-30 11:12:55 +02:00
Mathieu Prouveur
87b9ec3843
Fix tr_loss rescaling factor using global_step
2019-04-29 12:58:29 +02:00
Mathieu Prouveur
ed8fad7390
Update example files so that tr_loss is not affected by args.gradient_accumulation_step
2019-04-24 14:07:00 +02:00
thomwolf
d94c6b0144
fix training schedules in examples to match new API
2019-04-23 11:17:06 +02:00
Thomas Wolf
c36cca075a
Merge pull request #515 from Rocketknight1/master
...
Fix --reduce_memory in finetune_on_pregenerated
2019-04-23 10:30:23 +02:00
Matthew Carrigan
b8e2a9c584
Made --reduce_memory actually do something in finetune_on_pregenerated
2019-04-22 14:01:48 +01:00
Sangwhan Moon
14b1f719f4
Fix indentation weirdness in GPT-2 example.
2019-04-22 02:20:22 +09:00
Thomas Wolf
8407429d74
Merge pull request #494 from SudoSharma/patch-1
...
Fix indentation for unconditional generation
2019-04-17 11:11:36 +02:00
Ben Mann
87677fcc4d
[run_gpt2.py] temperature should be a float, not int
2019-04-16 15:23:21 -07:00
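A sketch of the point of this commit: declared as an int, a fractional temperature like 0.7 is rejected at argument-parsing time; as a float it parses correctly:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--temperature", type=float, default=1.0)
print(parser.parse_args(["--temperature", "0.7"]).temperature)  # 0.7
# With type=int, the same invocation would raise an argparse error.
```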
Abhi Sharma
07154dadb4
Fix indentation for unconditional generation
2019-04-16 11:11:49 -07:00
Thomas Wolf
3d78e226e6
Merge pull request #489 from huggingface/tokenization_serialization
...
Better serialization for Tokenizers and Configuration classes - Also fix #466
2019-04-16 08:49:54 +02:00
thomwolf
3571187ef6
fix saving models in distributed setting examples
2019-04-15 16:43:56 +02:00
thomwolf
2499b0a5fc
add ptvsd to run_squad
2019-04-15 15:33:04 +02:00
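A sketch of the "distant debugging" hook this commit adds, assuming the ptvsd 4.x API; host and port are placeholders:

```python
import ptvsd

# Open a debug server and block until a remote debugger attaches.
ptvsd.enable_attach(address=("localhost", 5678), redirect_output=True)
ptvsd.wait_for_attach()
```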
thomwolf
7816f7921f
clean up distributed training logging in run_squad example
2019-04-15 15:27:10 +02:00
thomwolf
1135f2384a
clean up logger in examples for distributed case
2019-04-15 15:22:40 +02:00
thomwolf
60ea6c59d2
added best practices for serialization in README and examples
2019-04-15 15:00:33 +02:00
thomwolf
179a2c2ff6
update example to work with new serialization semantic
2019-04-15 14:33:23 +02:00
thomwolf
3e65f255dc
add serialization semantics to tokenizers - fix transfo-xl tokenizer
2019-04-15 11:47:25 +02:00
Thomas Wolf
aff44f0c08
Merge branch 'master' into master
2019-04-15 10:58:34 +02:00
Thomas Wolf
bb61b747df
Merge pull request #474 from jiesutd/master
...
Fix tsv read error in Windows
2019-04-15 10:56:48 +02:00
Matthew Carrigan
dbbd6c7500
Replaced some randints with cleaner randranges, and added a helpful
...
error for users whose corpus is just one giant document.
2019-04-12 15:07:58 +01:00
Thomas Wolf
616743330e
Merge pull request #462 from 8enmann/master
...
fix run_gpt2.py
2019-04-11 21:54:46 +02:00
Thomas Wolf
2cdfb8b254
Merge pull request #467 from yaroslavvb/patch-2
...
Update README.md
2019-04-11 21:53:23 +02:00
Jie Yang
c49ce3c722
fix tsv read error in Windows
2019-04-11 15:40:19 -04:00
thomwolf
4bc4c69af9
finetuning any BERT model - fixes #455
2019-04-11 16:57:59 +02:00
Yaroslav Bulatov
8fffba5f47
Update README.md
...
Fix for
```
04/09/2019 21:39:38 - INFO - __main__ - device: cuda n_gpu: 1, distributed training: False, 16-bits training: False
Traceback (most recent call last):
  File "/home/ubuntu/pytorch-pretrained-BERT/examples/lm_finetuning/simple_lm_finetuning.py", line 642, in <module>
    main()
  File "/home/ubuntu/pytorch-pretrained-BERT/examples/lm_finetuning/simple_lm_finetuning.py", line 502, in main
    raise ValueError("Training is currently the only implemented execution option. Please set `do_train`.")
ValueError: Training is currently the only implemented execution option. Please set `do_train`.
```
2019-04-09 14:45:47 -07:00
Benjamin Mann
fd8a3556f0
fix run_gpt2.py
2019-04-08 17:20:35 -07:00
Dhanajit Brahma
6c4c7be282
Merge remote-tracking branch 'upstream/master'
2019-04-07 16:59:36 +05:30
Dhanajit Brahma
4d3cf0d602
removing some redundant lines
2019-04-07 16:59:07 +05:30
Thomas Wolf
9ca25ce828
Merge pull request #427 from jeonsworld/patch-1
...
fix sample_doc
2019-04-03 11:26:58 +02:00
thomwolf
846b1fd6f8
Fix #419
2019-04-03 10:50:38 +02:00
Thomas Wolf
2f80dbbc0d
Merge pull request #430 from MottoX/master
...
Fix typo in example code
2019-04-02 10:41:56 +02:00
Mike Arpaia
8b5c63e4de
Fixes to the TensorFlow conversion tool
2019-04-01 13:17:54 -06:00
Weixin Wang
d07db28f52
Fix typo in example code
...
Modify 'unambigiously' to 'unambiguously'
2019-03-31 01:20:18 +08:00
jeonsworld
60005f464d
Update pregenerate_training_data.py
...
If randint returns the value of rand_end, searchsorted returns a sampled_doc_index equal to current_idx.
example:
cumsum_max = {int64} 30
doc_cumsum = {ndarray} [ 5 7 11 19 30]
doc_lengths = {list} <class 'list'>: [5, 2, 4, 8, 11]
if current_idx = 1,
rand_start = 7
rand_end = 35
sentence_index = randint(7, 35) % cumsum_max
if randint return 35, sentence_index becomes 5.
if sentence_index is 5, np.searchsorted returns 1 equal to current_index.
2019-03-30 14:50:17 +09:00
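A reproduction sketch of the wrap-around walked through above: because randint's upper bound is inclusive, the modulo can map back to current_idx. The numbers match the commit body; the surrounding code is illustrative:

```python
import numpy as np

doc_lengths = [5, 2, 4, 8, 11]
doc_cumsum = np.cumsum(doc_lengths)   # [ 5  7 11 19 30]
cumsum_max = int(doc_cumsum[-1])      # 30
current_idx = 1
rand_start = int(doc_cumsum[current_idx])                      # 7
rand_end = rand_start + cumsum_max - doc_lengths[current_idx]  # 35

sentence_index = rand_end % cumsum_max  # randint(7, 35) can return 35 -> 5
sampled_doc_index = np.searchsorted(doc_cumsum, sentence_index, side="right")
print(sampled_doc_index == current_idx)  # True: "sampled" the same document
```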
dhanajitb
f872eb98c2
making unconditional generation work
...
The unconditional generation works now but if the seed is fixed, the sample is the same every time.
n_samples > 1 will give different samples though.
I am giving the start token as '<|endoftext|>' for the unconditional generation.
2019-03-28 22:46:15 +05:30
Thomas Wolf
694e2117f3
Merge pull request #388 from ananyahjha93/master
...
Added remaining GLUE tasks to 'run_classifier.py'
2019-03-28 09:06:53 +01:00
Thomas Wolf
cc8c2d2332
Merge pull request #396 from IndexFziQ/IndexFziQ
...
add tqdm to the process of eval in examples/run_swag.py
2019-03-27 12:03:26 +01:00
thomwolf
361aff6de5
typos
2019-03-27 11:54:59 +01:00
thomwolf
cea8ba1d59
adjusted formatting and some wording in the readme
2019-03-27 11:53:44 +01:00
Matthew Carrigan
24e67fbf75
Minor README update
2019-03-25 12:33:30 +00:00
Matthew Carrigan
8d1d1ffde2
Corrected the displayed loss when gradient_accumulation_steps > 1
2019-03-25 12:15:19 +00:00
Matthew Carrigan
abb7d1ff6d
Added proper context management to ensure cleanup happens in the right
...
order.
2019-03-21 17:50:03 +00:00
Matthew Carrigan
06a30cfdf3
Added a --reduce_memory option to the training script to keep training
...
data on disc as a memmap rather than in memory
2019-03-21 17:04:12 +00:00
Matthew Carrigan
7d1ae644ef
Added a --reduce_memory option to the training script to keep training
...
data on disc as a memmap rather than in memory
2019-03-21 17:02:18 +00:00
Matthew Carrigan
2bba7f810e
Added a --reduce_memory option to shelve docs to disc instead of keeping them in memory.
2019-03-21 16:50:16 +00:00
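A sketch of the --reduce_memory idea from the commits above: keep documents in an on-disk shelf (or memmap) instead of an in-memory list. Filename and key scheme are illustrative:

```python
import shelve

# Store pregenerated documents on disk; only the accessed entry is loaded.
with shelve.open("training_docs.shelf") as docs:
    docs["doc_0"] = ["first sentence", "second sentence"]
    print(docs["doc_0"])
```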
Matthew Carrigan
8733ffcb5e
Removing a couple of other old unnecessary comments
2019-03-21 14:09:57 +00:00
Matthew Carrigan
8a861048dd
Fixed up the notes on a possible future low-memory path
2019-03-21 14:08:39 +00:00
Matthew Carrigan
a8a577ba93
Reduced memory usage for pregenerating the data a lot by writing it
...
out on the fly without shuffling - the Sampler in the finetuning script
will shuffle for us.
2019-03-21 14:05:52 +00:00
Matthew Carrigan
0ae59e662d
Reduced memory usage for pregenerating the data a lot by writing it
...
out on the fly without shuffling - the Sampler in the finetuning script
will shuffle for us.
2019-03-21 14:04:17 +00:00
Matthew Carrigan
6a9038ba53
Removed an old irrelevant comment
2019-03-21 13:36:41 +00:00
Yuqiang Xie
77944d1b31
add tqdm to the process of eval
...
Maybe better.
2019-03-21 20:59:33 +08:00
Matthew Carrigan
29a392fbcf
Small README changes
2019-03-20 17:35:17 +00:00
Matthew Carrigan
832b2b0058
Adding README
2019-03-20 17:31:49 +00:00
Matthew Carrigan
934d3f4d2f
Syncing up argument names between the scripts
2019-03-20 17:23:23 +00:00
Matthew Carrigan
f19ba35b2b
Move old finetuning script into the new folder
2019-03-20 16:47:06 +00:00
Matthew Carrigan
7de5c6aa5e
PEP8 and formatting cleanups
2019-03-20 16:44:04 +00:00
Matthew Carrigan
1798e98e5a
Added final TODOs
2019-03-20 16:42:37 +00:00
Matthew Carrigan
c64c2fc4c2
Fixed embarrassing indentation problem
2019-03-20 15:42:57 +00:00
Matthew Carrigan
0540d360f2
Fixed logging
2019-03-20 15:36:51 +00:00
Matthew Carrigan
976554a472
First commit of the new LM finetuning
2019-03-20 14:23:51 +00:00
Ananya Harsh Jha
e5b63fb542
Merge branch 'master' of https://github.com/ananyahjha93/pytorch-pretrained-BERT
...
pull current master to local
2019-03-17 08:30:13 -04:00
Ananya Harsh Jha
8a4e90ff40
corrected folder creation error for MNLI-MM, verified GLUE results
2019-03-17 08:16:50 -04:00
Ananya Harsh Jha
e0bf01d9a9
added hack for mismatched MNLI
2019-03-16 14:10:48 -04:00
Ananya Harsh Jha
4c721c6b6a
added eval time metrics for GLUE tasks
2019-03-15 23:21:24 -04:00
tseretelitornike
83857ffeaa
Added missing imports.
2019-03-15 12:45:48 +01:00
Yongbo Wang
d1e4fa98a9
typo in annotation
...
modify `heruistic` to `heuristic` in line 660, `charcter` to `character` in line 661.
2019-03-14 17:32:15 +08:00
Yongbo Wang
3d6452163d
typo
...
modify `mull` to `null` in line 474 annotation.
2019-03-14 17:03:38 +08:00
thomwolf
a98dfe4ced
fixing #377 (empty nbest_predictions.json)
2019-03-14 09:57:06 +01:00
Ananya Harsh Jha
043c8781ef
added code for all glue task processors
2019-03-14 04:24:04 -04:00
Yongbo Wang
22a465a91f
Simplify code, delete redundant line
...
delete redundant line `if args.train`, simplify code.
2019-03-13 09:42:06 +08:00
Elon Musk
66d8206809
Update run_gpt2.py
2019-03-08 11:59:08 -05:00
thomwolf
7cc35c3104
fix openai gpt example and update readme
2019-03-06 11:43:21 +01:00
thomwolf
994d86609b
fixing PYTORCH_PRETRAINED_BERT_CACHE use in examples
2019-03-06 10:21:24 +01:00
thomwolf
5c85fc3977
fix typo - logger info
2019-03-06 10:05:21 +01:00
Thomas Wolf
8e36da7acb
Merge pull request #347 from jplehmann/feature/sst2-processor
...
Processor for SST-2 task
2019-03-06 09:48:27 +01:00
Thomas Wolf
3c01dfb775
Merge pull request #338 from CatalinVoss/patch-3
...
Fix top k generation for k != 0
2019-03-06 09:47:33 +01:00
John Lehmann
0f96d4b1f7
Run classifier processor for SST-2.
2019-03-05 13:38:28 -06:00
Catalin Voss
4b4b079272
Fix top k generation for k != 0
2019-03-02 21:54:44 -08:00
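A minimal top-k filtering sketch in the spirit of this fix; treating k == 0 as "disabled" follows the example script's convention, and the masking logic here is illustrative rather than the commit's exact code:

```python
import torch

def top_k_logits(logits: torch.Tensor, k: int) -> torch.Tensor:
    if k == 0:
        return logits  # k == 0 means no filtering
    values, _ = torch.topk(logits, k)
    min_values = values[..., -1, None]  # k-th largest logit
    return torch.where(
        logits < min_values, torch.full_like(logits, float("-inf")), logits
    )

print(top_k_logits(torch.tensor([[1.0, 3.0, 2.0, 0.5]]), k=2))  # keeps 3.0 and 2.0
```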
Catalin Voss
c0cf0a04d5
Fix typo
2019-02-27 18:01:06 -08:00
Ben Johnson
8607233679
Update run_openai_gpt.py
2019-02-20 13:58:54 -05:00
thomwolf
0202da0271
remove unnecessary example
2019-02-18 13:51:42 +01:00
thomwolf
690a0dbf36
fix example - masking
2019-02-18 10:50:30 +01:00
thomwolf
fbb248a2e4
examples testing
2019-02-18 01:28:18 +01:00
thomwolf
b65f07d8c0
adding examples
2019-02-18 00:55:33 +01:00
wlhgtc
8efaf8f176
fix 'best_non_null_entry' is None error
2019-02-15 15:57:25 +08:00
Davide Fiocco
65df0d78ed
--do_lower_case is duplicated in parser args
...
Deleting one repetition (please review!)
2019-02-13 15:30:05 +01:00
Thomas Wolf
03cdb2a390
Merge pull request #254 from huggingface/python_2
...
Adding OpenAI GPT and Transformer-XL models, compatibility with Python 2
2019-02-11 14:19:26 +01:00
thomwolf
d38caba169
typo in run_squad
2019-02-11 14:10:27 +01:00
thomwolf
af62cc5f20
fix run_squad example
2019-02-11 14:06:32 +01:00
thomwolf
eebc8abbe2
clarify and unify model saving logic in examples
2019-02-11 14:04:19 +01:00
thomwolf
32fea876bb
add distant debugging to run_transfo_xl
2019-02-11 12:53:32 +01:00
thomwolf
b31ba23913
cuda on in the examples by default
2019-02-11 12:15:43 +01:00
thomwolf
6cd769957e
update transfo xl example
2019-02-09 16:59:17 +01:00
thomwolf
1320e4ec0c
mc_token_mask => mc_token_ids
2019-02-09 16:58:53 +01:00
thomwolf
f4a07a392c
mems not split
2019-02-09 16:14:31 +01:00
thomwolf
43b9af0cac
mems initialized to None in run_transfo
2019-02-09 16:12:19 +01:00
thomwolf
b80684b23f
fixing run openai gpt example
2019-02-08 22:31:32 +01:00
thomwolf
7b4b0cf966
logging
2019-02-08 11:16:29 +01:00
thomwolf
4bbb9f2d68
log loss - helpers
2019-02-08 11:14:29 +01:00
thomwolf
5d7e845712
fix model on cuda
2019-02-08 11:08:43 +01:00
thomwolf
eccb2f0163
hot fix
2019-02-08 11:05:20 +01:00
thomwolf
5adc20723b
add distant debugging
2019-02-08 11:03:59 +01:00
thomwolf
777459b471
run openai example running
2019-02-08 10:33:14 +01:00
thomwolf
6bc082da0a
updating examples
2019-02-08 00:02:26 +01:00
thomwolf
e77721e4fe
renamed examples
2019-02-07 23:15:15 +01:00
thomwolf
d482e3d79d
adding examples for openai and transformer-xl
2019-02-07 17:06:41 +01:00
tholor
9aebc711c9
adjust error message related to args.do_eval
2019-02-07 11:49:38 +01:00
tholor
4a450b25d5
removing unused argument eval_batch_size from LM finetuning #256
2019-02-07 10:06:38 +01:00
Baoyang Song
7ac3311e48
Fix the undefined variable in squad example
2019-02-06 19:36:08 +01:00
thomwolf
ed47cb6cba
fixing transfo eval script
2019-02-06 16:22:17 +01:00
Thomas Wolf
848aae49e1
Merge branch 'master' into python_2
2019-02-06 00:13:20 +01:00
thomwolf
448937c00d
python 2 compatibility
2019-02-06 00:07:46 +01:00
thomwolf
d609ba24cb
resolving merge conflicts
2019-02-05 16:14:25 +01:00
Thomas Wolf
64ce900974
Merge pull request #248 from JoeDumoulin/squad1.1-fix
...
fix prediction on run-squad.py example
2019-02-05 16:00:51 +01:00
Thomas Wolf
e9e77cd3c4
Merge pull request #218 from matej-svejda/master
...
Fix learning rate problems in run_classifier.py
2019-02-05 15:40:44 +01:00
thomwolf
1579c53635
more explicit notation: num_train_step => num_train_optimization_steps
2019-02-05 15:36:33 +01:00
joe dumoulin
aa90e0c36a
fix prediction on run-squad.py example
2019-02-01 10:15:44 -08:00
Thomas Wolf
8f8bbd4a4c
Merge pull request #244 from deepset-ai/prettify_lm_masking
...
Avoid confusion of inplace LM masking
2019-02-01 12:17:50 +01:00
tholor
ce75b169bd
avoid confusion of inplace masking of tokens_a / tokens_b
2019-01-31 11:42:06 +01:00
Surya Kasturi
9bf528877e
Update run_squad.py
2019-01-30 15:09:31 -05:00
Surya Kasturi
af2b78601b
Update run_squad2.py
2019-01-30 15:08:56 -05:00
Matej Svejda
5169069997
make examples consistent, revert error in num_train_steps calculation
2019-01-30 11:47:25 +01:00
Matej Svejda
9c6a48c8c3
fix learning rate/fp16 and warmup problem for all examples
2019-01-27 14:07:24 +01:00
Matej Svejda
01ff4f82ba
learning rate problems in run_classifier.py
2019-01-22 23:40:06 +01:00
liangtaiwan
be9fa192f0
don't save if do not train
2019-01-18 00:41:55 +08:00
thomwolf
a28dfc8659
fix eval for wt103
2019-01-16 11:18:19 +01:00
thomwolf
8831c68803
fixing various parts of model conversion, loading and weights sharing
2019-01-16 10:31:16 +01:00
thomwolf
bcd4aa8fe0
update evaluation example
2019-01-15 23:32:34 +01:00
thomwolf
a69ec2c722
improved corpus and tokenization conversion - added evaluation script
2019-01-15 23:17:46 +01:00
Thomas Wolf
4e0cba1053
Merge pull request #191 from nhatchan/20190113_py35_finetune
...
lm_finetuning compatibility with Python 3.5
2019-01-14 09:40:07 +01:00
nhatchan
6c65cb2492
lm_finetuning compatibility with Python 3.5
...
dicts are not ordered in Python 3.5 or prior, which is a cause of #175.
This PR replaces one with a list, to keep its order.
2019-01-13 21:09:13 +09:00
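A sketch of the issue this PR works around: on Python 3.5 and earlier, plain dicts do not preserve insertion order, so order-sensitive bookkeeping should use a list of pairs. The keys below are illustrative:

```python
unordered = {"doc_0": 3, "doc_1": 5, "doc_2": 2}      # order arbitrary on 3.5
ordered = [("doc_0", 3), ("doc_1", 5), ("doc_2", 2)]  # order guaranteed
for key, value in ordered:
    print(key, value)
```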
Li Dong
a2da2b4109
[bug fix] args.do_lower_case is always True
...
The "default=True" makes args.do_lower_case always True.
```python
parser.add_argument("--do_lower_case",
                    default=True,
                    action='store_true')
```
2019-01-13 19:51:11 +08:00
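A sketch of the corrected flag: with action='store_true' the implicit default is already False, so the explicit default=True quoted above made the option permanently True; dropping it restores opt-in behavior:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--do_lower_case", action="store_true")
print(parser.parse_args([]).do_lower_case)                   # False
print(parser.parse_args(["--do_lower_case"]).do_lower_case)  # True
```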
tholor
506e5bb0c8
add do_lower_case arg and adjust model saving for lm finetuning.
2019-01-11 08:32:46 +01:00
Thomas Wolf
e485829a41
Merge pull request #174 from abeljim/master
...
Added Squad 2.0
2019-01-10 23:40:45 +01:00
Sang-Kil Park
64326dccfb
Fix it to run properly even without the --do_train param.
...
It was modified similarly to `run_classifier.py`, and fixed to run properly even without the `--do_train` param.
2019-01-10 21:51:39 +09:00
thomwolf
e5c78c6684
update readme and few typos
2019-01-10 01:40:00 +01:00
thomwolf
fa5222c296
update readme
2019-01-10 01:25:28 +01:00
Unknown
b3628f117e
Added Squad 2.0
2019-01-08 15:13:13 -08:00
thomwolf
ab90d4cddd
adding docs and example for OpenAI GPT
2019-01-09 00:12:43 +01:00
thomwolf
2e4db64cab
add do_lower_case tokenizer loading option in run_squad and fine_tuning examples
2019-01-07 13:06:42 +01:00
thomwolf
c9fd350567
remove default when action is store_true in arguments
2019-01-07 13:01:54 +01:00
Thomas Wolf
d3d56f9a0b
Merge pull request #166 from likejazz/patch-1
...
Fix error when `bert_model` param is path or url.
2019-01-07 12:40:55 +01:00
Thomas Wolf
766c6b2ce3
Merge pull request #159 from jaderabbit/master
...
Allow do_eval to be used without do_train and to use the pretrained model in the output folder
2019-01-07 12:31:06 +01:00
Thomas Wolf
77966a43a4
Merge pull request #156 from rodgzilla/cl_args_doc
...
Adding new pretrained model to the help of the `bert_model` argument.
2019-01-07 12:27:16 +01:00
Thomas Wolf
2e8c5c00ec
Merge pull request #141 from SinghJasdeep/patch-1
...
loading saved model when n_classes != 2
2019-01-07 12:21:13 +01:00
Sang-Kil Park
ca4e7aaa72
Fix error when bert_model param is path or url.
...
An error occurs when the `bert_model` param is a path or url. Therefore, if it is a path, use only the last path component to prevent the error.
2019-01-05 11:42:54 +09:00
Jade Abbott
193e2df8ba
Remove rogue comment
2019-01-03 13:13:06 +02:00
Jade Abbott
c64de50ea4
nb_tr_steps is not initialized
2019-01-03 12:34:57 +02:00
Jade Abbott
b96149a19b
Training loss is not initialized if only do_eval is specified
2019-01-03 10:32:10 +02:00
Jade Abbott
be3b9bcf4d
Allow one to use the pretrained model in evaluation when do_train is not selected
2019-01-03 09:02:33 +02:00
Grégory Châtel
186f75342e
Adding new pretrained model to the help of the bert_model argument.
2019-01-02 14:00:59 +01:00
Jasdeep Singh
99709ee61d
loading saved model when n_classes != 2
...
Required for: Assertion `t >= 0 && t < n_classes` failed, if your default number of classes is not 2.
2018-12-20 13:55:47 -08:00
tholor
e5fc98c542
add exemplary training data. update to nvidia apex. refactor 'item -> line in doc' mapping. add warning for unknown word.
2018-12-20 18:30:52 +01:00
deepset
a58361f197
Add example for fine tuning BERT language model ( #1 )
...
Adds an example for loading a pre-trained BERT model and fine-tuning it as a language model (masked tokens & nextSentence) on your target corpus.
2018-12-18 10:32:25 +01:00
thomwolf
ae88eb88a4
set encoding to 'utf-8' in calls to open
2018-12-14 13:48:58 +01:00
thomwolf
e1eab59aac
no fp16 on evaluation
2018-12-13 14:54:02 +01:00
thomwolf
087798b7fa
fix reloading model for evaluation in examples
2018-12-13 14:48:12 +01:00
thomwolf
0f544625f4
fix swag example for work with apex
2018-12-13 13:35:59 +01:00
thomwolf
0cf88ff084
make examples work without apex
2018-12-13 13:28:00 +01:00
thomwolf
d3fcec1a3e
add saving and loading model in examples
2018-12-13 12:50:44 +01:00
thomwolf
b3caec5a56
adding save checkpoint and loading in examples
2018-12-13 12:48:13 +01:00
Thomas Wolf
91aab2a6d3
Merge pull request #116 from FDecaYed/deyuf/fp16_with_apex
...
Change to use apex for better fp16 and multi-gpu support
2018-12-13 12:32:37 +01:00
Thomas Wolf
ffe9075f48
Merge pull request #96 from rodgzilla/multiple-choice-code
...
BertForMultipleChoice and Swag dataset example.
2018-12-13 12:05:11 +01:00
Deyu Fu
c8ea286048
change to apex for better fp16 and multi-gpu support
2018-12-11 17:13:58 -08:00
Thomas Wolf
e622790a93
Merge pull request #91 from rodgzilla/convert-examples-code-improvement
...
run_classifier.py improvements
2018-12-11 05:12:04 -05:00
Grégory Châtel
df34f22854
Removing the dependency on pandas and using the csv module to load data.
2018-12-10 17:45:23 +01:00
Grégory Châtel
d429c15f25
Removing old code from copy-paste.
2018-12-06 19:19:21 +01:00
Grégory Châtel
63c45056aa
Finishing the code for the Swag task.
2018-12-06 18:53:05 +01:00
Grégory Châtel
c45d8ac554
Storing the feature of each choice as a dict for readability.
2018-12-06 16:01:28 +01:00
Grégory Châtel
0812aee2c3
Fixing problems in convert_examples_to_features.
2018-12-06 15:53:07 +01:00
Grégory Châtel
f2b873e995
convert_examples_to_features code and small improvements.
2018-12-06 15:40:47 +01:00
Grégory Châtel
83fdbd6043
Adding read_swag_examples to load the dataset.
2018-12-06 14:02:46 +01:00
Grégory Châtel
7183cded4e
SwagExample class.
2018-12-06 13:39:44 +01:00
Grégory Châtel
fa7daa247d
Fixing the commentary of the SquadExample class.
2018-12-06 13:14:33 +01:00
Grégory Châtel
a994bf4076
Fix related to issue #83.
2018-12-05 18:16:30 +01:00
Grégory Châtel
c6d9d5394e
Simplifying code for easier understanding.
2018-12-05 17:53:09 +01:00
Grégory Châtel
793262e8ec
Removing trailing whitespaces.
2018-12-05 17:52:39 +01:00
Davide Fiocco
e60e8a6068
Correct assignement for logits in classifier example
...
I tried to address https://github.com/huggingface/pytorch-pretrained-BERT/issues/76
should be correct, but there's likely a more efficient way.
2018-12-02 12:38:26 +01:00
Davide Fiocco
dc13e276ee
Point typo fix
2018-12-01 01:02:16 +01:00
thomwolf
89d47230d7
clean up classification model output
2018-11-30 22:54:53 +01:00
thomwolf
c588453a0f
fix run_squad
2018-11-30 14:22:40 +01:00
thomwolf
0541442558
add do_lower_case in examples
2018-11-30 13:47:33 +01:00
Li Li
0aaedcc02f
Bug fix in examples; correct t_total for distributed training; run prediction for full dataset
2018-11-27 01:08:37 -08:00
thomwolf
32167cdf4b
remove convert_to_unicode and printable_text from examples
2018-11-26 23:33:22 +01:00
thomwolf
05053d163c
update cache_dir in readme and examples
2018-11-26 10:45:13 +01:00
thomwolf
6b2136a8a9
fixing weight decay in run_squad example
2018-11-20 10:12:44 +01:00
Thomas Wolf
061eeca84a
Merge pull request #32 from xiaoda99/master
...
Fix ineffective no_decay bug when using BERTAdam
2018-11-20 10:11:46 +01:00
thomwolf
2f21497d3e
fixing param.grad is None in fp16 examples
2018-11-20 10:01:21 +01:00
xiaoda99
6c4789e4e8
Fix ineffective no_decay bug
2018-11-18 16:16:21 +08:00
thomwolf
27ee0fff3c
add no_cuda args in extract_features
2018-11-17 23:04:44 +01:00
thomwolf
aa50fd196f
remove unused arguments in example scripts
2018-11-17 23:01:05 +01:00
thomwolf
47a7d4ec14
update examples from master
2018-11-17 12:21:35 +01:00
thomwolf
c8cba67742
clean up readme and examples
2018-11-17 12:19:16 +01:00
thomwolf
757750d6f6
fix tests
2018-11-17 11:58:14 +01:00
thomwolf
4e46affc34
updating examples
2018-11-17 10:30:54 +01:00
thomwolf
cba85a67b9
fix nan in optimizer_on_cpu
2018-11-15 21:47:41 +01:00
thomwolf
1de35b624b
preparing for first release
2018-11-15 20:56:10 +01:00