Thomas Wolf
4428aefc63
Merge pull request #1488 from huggingface/pytorch-tpu
...
GLUE on TPU
2019-10-11 16:33:00 +02:00
Luran He
f382a8decd
convert int to str before adding to a str
2019-10-10 19:20:39 -04:00
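The fix above addresses the usual Python TypeError raised when an int is concatenated to a str; a minimal illustration (the `global_step` name is hypothetical, chosen for illustration):

```python
global_step = 500

# "checkpoint-" + global_step would raise:
#   TypeError: can only concatenate str (not "int") to str
checkpoint_name = "checkpoint-" + str(global_step)  # explicit conversion fixes it
print(checkpoint_name)  # checkpoint-500
```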
Lysandre
639f4b7190
Don't save/load when on TPU
2019-10-10 19:17:25 +00:00
Lysandre
d4e7934ac3
GLUE on TPU
2019-10-10 19:03:06 +00:00
Rémi Louf
1e68c28670
add test for initialization of Bert2Rnd
2019-10-10 18:07:11 +02:00
Thomas Wolf
6596e3d566
Merge pull request #1454 from bkkaggle/pytorch-built-in-tensorboard
...
Change tensorboard imports to use built-in tensorboard if available
2019-10-10 11:56:55 +02:00
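The tensorboard import change above amounts to a try/except fallback; a hedged sketch (the final `None` fallback is added here for illustration and is not necessarily part of the PR):

```python
# Prefer PyTorch's built-in TensorBoard support (available since torch 1.1),
# fall back to the standalone tensorboardX package otherwise.
try:
    from torch.utils.tensorboard import SummaryWriter
except ImportError:
    try:
        from tensorboardX import SummaryWriter
    except ImportError:
        SummaryWriter = None  # logging disabled if neither is installed
```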
thomwolf
177a721205
move back to simple space splitting
2019-10-10 11:45:47 +02:00
thomwolf
a5997dd81a
better error messages
2019-10-10 11:31:01 +02:00
Lysandre Debut
2431fea98a
Merge pull request #1383 from keskarnitish/master
...
Adding CTRL
2019-10-09 11:31:05 -04:00
thomwolf
d9e60f4f0d
Merge branch 'master' into pr/1383
2019-10-09 17:25:08 +02:00
Lysandre Debut
e84470ef81
Merge pull request #1384 from huggingface/encoding-qol
...
Quality of life enhancements in encoding + patch MLM masking
2019-10-09 11:18:24 -04:00
jinoobaek-qz
69629c4f0f
Improve naming and only do regex when necessary
2019-10-09 08:48:40 -04:00
jinoobaek-qz
bf34a252b8
Golden path
2019-10-09 08:48:40 -04:00
jinoobaek-qz
528d3f327b
Improve readability and make fewer assumptions about checkpoint format
2019-10-09 08:48:40 -04:00
jinoobaek-qz
56301bd9e8
Extract method
2019-10-09 08:48:40 -04:00
jinoobaek-qz
d6c5469712
Delete older checkpoint after saving new checkpoint
2019-10-09 08:48:40 -04:00
jinoobaek-qz
54a31f50fb
Add save_total_limit
2019-10-09 08:48:40 -04:00
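The checkpoint-rotation commits above (save_total_limit, deleting the older checkpoint after saving a new one, regex only when necessary) boil down to sorting saved checkpoints by step and dropping the oldest past the limit; a pure-Python sketch with hypothetical names:

```python
import re

def checkpoints_to_delete(checkpoint_dirs, save_total_limit):
    """Return the oldest checkpoint dirs once more than save_total_limit exist."""
    if save_total_limit is None or save_total_limit <= 0:
        return []

    def step(name):
        # directory names look like "checkpoint-500"; regex only when needed
        m = re.search(r"checkpoint-(\d+)", name)
        return int(m.group(1)) if m else -1

    ordered = sorted(checkpoint_dirs, key=step)  # oldest (lowest step) first
    n_excess = max(0, len(ordered) - save_total_limit)
    return ordered[:n_excess]

print(checkpoints_to_delete(
    ["checkpoint-500", "checkpoint-1500", "checkpoint-1000"], 2))
# -> ['checkpoint-500']
```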
Thomas Wolf
439fac723a
Merge pull request #1409 from brian41005/master
...
Evaluation result.txt path changing #1286
2019-10-09 03:14:34 +02:00
Bilal Khan
5ce8d29abe
Change tensorboard imports to use built-in tensorboard if available
2019-10-08 16:29:43 -05:00
VictorSanh
7ce83b4931
update weights for distilgpt2
2019-10-07 12:30:27 -04:00
LysandreJik
f3e0218fbb
Correct device assignment in run_generation
2019-10-05 21:05:16 -04:00
thomwolf
78ef1a9930
fixes
2019-10-04 17:59:44 -04:00
thomwolf
6c1d0bc066
update encode_plus - add truncation strategies
2019-10-04 17:38:38 -04:00
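The truncation strategies added to encode_plus decide how a sentence pair is shortened to fit a maximum length; a pure-Python sketch of the longest_first idea (an illustrative reimplementation, not the library code):

```python
def truncate_longest_first(tokens_a, tokens_b, max_length):
    """Pop one token at a time from whichever sequence is currently longer."""
    while len(tokens_a) + len(tokens_b) > max_length:
        if len(tokens_a) > len(tokens_b):
            tokens_a.pop()
        else:
            tokens_b.pop()
    return tokens_a, tokens_b

a, b = truncate_longest_first(list("abcdef"), list("xy"), 5)
print(a, b)  # ['a', 'b', 'c'] ['x', 'y']
```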
VictorSanh
0820bb0555
unnecessary carriage return
2019-10-04 17:23:15 -04:00
VictorSanh
f5891c3821
run_squad --> run_squad_w_distillation
2019-10-04 17:23:15 -04:00
VictorSanh
764a7923ec
add distillation+finetuning option in run_squad
2019-10-04 17:23:15 -04:00
thomwolf
92c0f2fb90
Merge remote-tracking branch 'origin/julien_multiple-choice' into encoding-qol
2019-10-04 15:48:06 -04:00
Julien Chaumond
9e136ff57c
Honor args.overwrite_cache (h/t @erenup)
2019-10-04 15:00:56 -04:00
keskarnitish
dbed1c5d94
Adding CTRL (squashed commit)
...
adding conversion script
adding first draft of modeling & tokenization
adding placeholder for test files
bunch of changes
registering the tokenizer/model/etc
tests
change link; something is very VERY wrong here
weird end-of-word thingy going on
i think the tokenization works now ; wrote the unit tests
overall structure works;load w next
the monster is alive!
works after some cleanup as well
adding emacs autosave to gitignore
currently only supporting the 48 layer one; seems to infer fine on my macbook
cleanup
fixing some documentation
fixing some documentation
tests passing?
now works on CUDA also
adding greedy?
adding greedy sampling
works well
2019-10-03 22:29:03 -07:00
Lysandre Debut
d3f24dfad7
Merge branch 'master' into master
2019-10-03 22:43:09 +00:00
LysandreJik
ecc4f1bdfa
XLM use_lang_embedding flag in run_generation
2019-10-03 17:42:16 -04:00
LysandreJik
c2c2ca0fdb
Added XLM to run_generation, with prompt language selection.
2019-10-03 17:18:48 -04:00
LysandreJik
aebd83230f
Update naming + remove f string in run_lm_finetuning example
2019-10-03 11:31:36 -04:00
LysandreJik
5ed50a93fb
LM finetuning won't mask special tokens anymore
2019-10-03 11:31:36 -04:00
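The fix above excludes special tokens ([CLS], [SEP], etc.) from the MLM masking candidates; a pure-Python sketch of the candidate selection (the real script works on tensors via the tokenizer's special-tokens mask; this stand-in is illustrative):

```python
def maskable_positions(token_ids, special_ids):
    """Positions eligible for MLM masking: everything except special tokens."""
    return [i for i, t in enumerate(token_ids) if t not in special_ids]

# e.g. 101 = [CLS], 102 = [SEP] in BERT's vocabulary
print(maskable_positions([101, 2023, 2003, 102], {101, 102}))  # [1, 2]
```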
Brian Ma
7af0777910
Update run_glue.py
...
add DistilBert model shortcut into ALL_MODELS
2019-10-03 15:31:11 +00:00
VictorSanh
5f07d8f11a
prepare release
2019-10-03 10:27:11 -04:00
VictorSanh
35071007cb
incoming release 🔥 update links to arxiv preprint
2019-10-03 10:27:11 -04:00
VictorSanh
2a91f6071f
update README - TODO update link to paper
2019-10-03 10:27:11 -04:00
VictorSanh
c51e533a5f
update train.py
2019-10-03 10:27:11 -04:00
VictorSanh
a76c3f9cb0
update requirements
2019-10-03 10:27:11 -04:00
VictorSanh
bb9c5ead54
update distiller
2019-10-03 10:27:11 -04:00
VictorSanh
a12ab0a8db
update binarized_data
2019-10-03 10:27:11 -04:00
VictorSanh
4d6dfbd376
update extract
2019-10-03 10:27:11 -04:00
VictorSanh
23edebc079
update extract_distilbert
2019-10-03 10:27:11 -04:00
VictorSanh
cbfcfce205
update token_counts
2019-10-03 10:27:11 -04:00
VictorSanh
19e4ebbe3f
grouped_batch_sampler
2019-10-03 10:27:11 -04:00
VictorSanh
594202a934
lm_seqs_dataset
2019-10-03 10:27:11 -04:00
VictorSanh
38084507c4
add distillation_configs
2019-10-03 10:27:11 -04:00
Brian Ma
2195c0d5f9
Evaluation result.txt path changing #1286
2019-10-03 12:49:12 +08:00
Thomas Wolf
963529e29b
Merge pull request #1288 from echan00/master
...
Typo with LM Fine tuning script
2019-10-01 18:46:07 -04:00
thomwolf
f7978f70ec
use format instead of f-strings
2019-10-01 18:45:38 -04:00
Julien Chaumond
b350662955
overflowing_tokens do not really make sense here, let's just return a number
...
Co-Authored-By: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
2019-09-30 16:37:09 -04:00
Julien Chaumond
f5bcde0b2f
[multiple-choice] Simplify and use tokenizer.encode_plus
2019-09-30 16:04:55 -04:00
Denny
9478590630
Update run_lm_finetuning.py
...
The previous method, just as phrased, did not exist in the class.
2019-09-27 15:18:42 -03:00
Thomas Wolf
d83d295763
Merge pull request #1337 from mgrankin/fastdataset
...
faster dataset building
2019-09-27 10:35:12 +02:00
thomwolf
da2e47ad15
clean up a little run_tf_glue
2019-09-27 09:41:15 +02:00
thomwolf
528c288fa9
clean up run_tf_glue
2019-09-27 09:40:29 +02:00
VictorSanh
702f589848
fix input in run_glue for distilbert
2019-09-27 00:20:14 -04:00
mgrankin
f71a4577b8
faster dataset building
2019-09-26 16:53:13 +03:00
thomwolf
481d9c4fb5
Merge branch 'master' into tf2
2019-09-26 12:02:54 +02:00
thomwolf
31c23bd5ee
[BIG] pytorch-transformers => transformers
2019-09-26 10:15:53 +02:00
thomwolf
5705333441
add initialization for everybody
2019-09-26 10:06:20 +02:00
thomwolf
7c9f8f93f9
fix tests
2019-09-26 01:59:53 +02:00
thomwolf
d6dde438ea
add batch dimension in encode
2019-09-26 01:45:55 +02:00
thomwolf
4a21c4d88d
add warning if neither pt nor tf are found
2019-09-26 01:30:06 +02:00
thomwolf
3b7fb48c3b
fix loading from tf/pt
2019-09-25 17:46:16 +02:00
thomwolf
a049c8043b
push fix to training
2019-09-25 17:33:16 +02:00
mataney
a9f24a16bc
[FIX] fix run_generation.py to work with batch_size > 1
2019-09-25 15:53:29 +03:00
thomwolf
5def3302f4
update run_glue
2019-09-25 12:38:08 +02:00
thomwolf
f71758f7a4
update internal glue processors
2019-09-25 12:00:50 +02:00
thomwolf
b5ec526f85
updated data processor and metrics
2019-09-24 17:10:50 +02:00
LysandreJik
f09e5ecef0
[Proposal] GLUE processors included in library
2019-09-24 09:47:34 -04:00
LysandreJik
c832f43a4d
output_token_type -> token_type_ids
2019-09-24 07:21:38 -04:00
LysandreJik
3927d7756c
Updated the GLUE pre-processing method
2019-09-24 07:15:11 -04:00
LysandreJik
9d44236f70
Updated DistilBERT
2019-09-24 07:03:24 -04:00
Lorenzo Ampil
4b543c3007
Add option to use a 'stop token' which will be used to truncate the output text to everything till right before the 'stop token'
2019-09-22 21:38:38 +08:00
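The stop-token option above truncates generated text at the first occurrence of the chosen token; a minimal sketch (function name is hypothetical):

```python
def truncate_at_stop_token(text, stop_token=None):
    """Keep everything up to (but not including) the first stop_token."""
    if stop_token and stop_token in text:
        return text[: text.index(stop_token)]
    return text

print(truncate_at_stop_token("Once upon a time.<|endoftext|>garbage", "<|endoftext|>"))
# Once upon a time.
```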
VictorSanh
9f995b99d4
minor fixes
2019-09-19 21:36:06 +00:00
VictorSanh
3fe5c8e8a8
update bert-base-uncased results
2019-09-19 19:34:22 +00:00
VictorSanh
354944e607
[distillation] big update w/ new weights
2019-09-19 19:25:21 +00:00
LysandreJik
60414f31a9
GLUE updated with new methods
2019-09-19 10:55:06 +02:00
LysandreJik
bf503158c5
Sentence -> Sequence. Removed output_mask from the special token addition methods.
2019-09-19 10:55:06 +02:00
LysandreJik
de8e14b6c0
Added DistilBERT to run_squad script
2019-09-19 10:55:06 +02:00
LysandreJik
88368c2a16
Added DistilBERT to run_lm_finetuning
2019-09-19 10:55:06 +02:00
LysandreJik
75635072e1
Updated GLUE script to add DistilBERT. Cleaned up unused args in the utils file.
2019-09-19 10:55:06 +02:00
LysandreJik
59057abe52
typo
2019-09-19 10:55:06 +02:00
LysandreJik
bac332fec0
Updated the GLUE data processor. Corrections to RoBERTa and XLNet.
2019-09-19 10:55:06 +02:00
Erik Chan
f0340eccf9
Typo
...
Typo
2019-09-18 13:42:11 -07:00
erenup
8960988f35
fixed to find best dev acc
2019-09-19 01:10:05 +08:00
erenup
46ffc28329
Merge branch 'master' into run_multiple_choice_merge
2019-09-18 21:43:46 +08:00
erenup
15143fbad6
move run_multiple_choice.py and utils_multiple_choice.py to examples
2019-09-18 21:18:46 +08:00
erenup
3cd6289758
Merge remote-tracking branch 'huggingface/master' into run_multiple_choice_merge
...
# Conflicts:
# examples/contrib/run_swag.py
2019-09-18 21:16:59 +08:00
erenup
36362cf086
move schedule.step after optimizer.step
2019-09-18 21:13:40 +08:00
thomwolf
e768f2322a
update run_openai_gpt to fix #1264
2019-09-18 10:07:47 +02:00
thomwolf
8334993915
clean up examples - updated to new keyword inputs - #1246
2019-09-18 10:01:27 +02:00
erenup
5882c442e5
add example usage
2019-09-16 22:38:08 +08:00
erenup
982f181aa7
Merge remote-tracking branch 'origin/master' into run_multiple_choice_add_doc
2019-09-16 19:12:00 +08:00
erenup
84b9d1c423
Merge remote-tracking branch 'huggingface/master'
...
# Conflicts:
# pytorch_transformers/__init__.py
2019-09-16 19:06:12 +08:00
erenup
603b470a3d
add warning info
2019-09-16 18:53:37 +08:00
erenup
4812a5a767
add doc string
2019-09-16 11:50:18 +08:00
VictorSanh
32e1332acf
[distil] fix once for all general logger for scripts
2019-09-11 14:19:07 +00:00
VictorSanh
364920e216
fix small bug/typo
2019-09-10 21:45:01 +00:00
Thomas Wolf
23c23f5399
Merge pull request #1229 from SKRohit/master
...
changes in evaluate function in run_lm_finetuning.py
2019-09-10 22:16:45 +02:00
searchivarius
eab980fd68
Fix to prevent crashing on assert len(tokens_b)>=1
2019-09-09 19:58:08 -04:00
VictorSanh
a95ced6260
[Distillation] save last chkpt as pytorch_model.bin
2019-09-09 19:53:35 +00:00
Rohit Kumar Singh
e5df36397b
changes in return statement of evaluate function
...
changed `results` to `result` and removed `results` dict defined previously
2019-09-09 19:55:57 +05:30
LysandreJik
3f91338be9
Patched a few outdated parameters
2019-09-06 17:48:06 -04:00
LysandreJik
f47f9a5874
Updated outdated examples
2019-09-06 17:10:33 -04:00
LysandreJik
5e151f5e77
Table of contents
2019-09-06 12:08:36 -04:00
LysandreJik
593c070435
Better examples
2019-09-06 12:00:12 -04:00
VictorSanh
dddd6b9927
Update DistilBERT training code
2019-09-05 18:26:14 +00:00
Stefan Schweter
a1c34bd286
distillation: fix ModuleNotFoundError error in token counts script
2019-08-31 12:21:38 +02:00
Thomas Wolf
51e980ce36
Merge pull request #1155 from anhnt170489/apex_fp16
...
Update apex fp16 implementation
2019-08-30 23:29:11 +02:00
VictorSanh
282c276e09
typos + file name coherence in distillation README
2019-08-30 12:02:29 -04:00
VictorSanh
803c1cc4ea
fix relative import bug cf Issue #1140
2019-08-30 12:01:27 -04:00
Thomas Wolf
0a2fecdf90
Merge branch 'master' into master
2019-08-30 16:30:08 +02:00
Rabeeh KARIMI
39eb31e11e
remove reloading tokenizer in the training, adding it to the evaluation part
2019-08-30 15:44:41 +02:00
Rabeeh KARIMI
350bb6bffa
updated tokenizer loading for addressing reproducibility issues
2019-08-30 15:34:28 +02:00
Thomas Wolf
01ad55f8cf
Merge pull request #1026 from rabeehk/master
...
loads the tokenizer for each checkpoint, to solve the reproducibility…
2019-08-30 14:15:36 +02:00
erenup
6e1ac34e2b
Merge remote-tracking branch 'huggingface/master'
2019-08-30 15:50:11 +08:00
jamin
2fb9a934b4
re-format
2019-08-30 14:05:28 +09:00
jamin
c8731b9583
update apex fp16 implementation
2019-08-30 13:54:00 +09:00
LysandreJik
caf1d116a6
Closing bracket in DistilBERT's token count.
2019-08-29 15:30:10 -04:00
Luis
fe8fb10b44
Small modification of comment in the run_glue.py example
...
Add RoBERTa to the comment as it was not explicit that RoBERTa doesn't use token_type_ids.
2019-08-29 14:43:30 +02:00
erenup
942d3f4b20
modify code of arc label insurance
2019-08-29 10:21:17 +08:00
LysandreJik
bf3dc778b8
Changed learning rate for run_squad test
2019-08-28 18:24:43 -04:00
Andreas Daiminger
1d15a7f278
swap order of optimizer.step() and scheduler.step()
2019-08-28 19:18:27 +02:00
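The swap above reflects the PyTorch 1.1.0+ contract: optimizer.step() must precede scheduler.step(), otherwise the very first learning-rate value is skipped. A pure-Python illustration with a toy per-step scheduler (toy classes, not the torch API):

```python
class ToyScheduler:
    """Halves the learning rate on every step (stand-in for a torch scheduler)."""
    def __init__(self, lr):
        self.lr = lr
        self.used_lrs = []  # lr actually in effect when the optimizer stepped

    def optimizer_step(self):
        self.used_lrs.append(self.lr)

    def scheduler_step(self):
        self.lr /= 2

# Wrong order: scheduler first -> the initial lr of 1.0 is never used.
wrong = ToyScheduler(1.0)
for _ in range(3):
    wrong.scheduler_step()
    wrong.optimizer_step()

# Right order: optimizer first -> training starts at the intended lr.
right = ToyScheduler(1.0)
for _ in range(3):
    right.optimizer_step()
    right.scheduler_step()

print(wrong.used_lrs)  # [0.5, 0.25, 0.125]
print(right.used_lrs)  # [1.0, 0.5, 0.25]
```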
Thomas Wolf
0ecfd17f49
Merge pull request #987 from huggingface/generative-finetuning
...
Generative finetuning
2019-08-28 16:51:50 +02:00
thomwolf
b5eb283aaa
update credits
2019-08-28 16:36:55 +02:00
thomwolf
912a377e90
dilbert -> distilbert
2019-08-28 13:59:42 +02:00
thomwolf
4ce5f36f78
update readmes
2019-08-28 12:14:31 +02:00
erenup
ec4b1c659f
logging truth error
2019-08-28 16:50:40 +08:00
erenup
df52abe373
add sep_token between question and choice
2019-08-28 16:36:21 +08:00
erenup
43c243254a
avoid invalid labels of truth
2019-08-28 16:03:17 +08:00
erenup
3c7e676f8b
add test related code: test the best dev acc model when model is training
2019-08-28 15:57:29 +08:00
VictorSanh
93e82ab424
Write README for DilBERT
2019-08-28 06:26:09 +00:00
VictorSanh
fea921d382
add licensing
2019-08-28 04:45:39 +00:00
VictorSanh
da1e4e53fc
some fixes in train.py for loading previous checkpoint
2019-08-28 04:01:03 +00:00
VictorSanh
0d8f8848d5
add scripts/extract_for_distil.py
2019-08-28 04:00:19 +00:00
VictorSanh
7f2c384c80
add scripts/token_counts.py
2019-08-28 04:00:03 +00:00
VictorSanh
4d16b279e5
add scripts/binarized_data.py
2019-08-28 03:59:48 +00:00
VictorSanh
b247b0d880
add train.py for distillation
2019-08-28 02:12:47 +00:00
VictorSanh
780f183e55
add requirements
2019-08-28 01:39:52 +00:00
VictorSanh
e424d2e45d
add README
2019-08-28 01:10:10 +00:00
VictorSanh
1ae81e4aa1
add dataset, distiller, utils
2019-08-28 01:10:05 +00:00
thomwolf
06510ccb53
typo
2019-08-23 22:08:10 +02:00
thomwolf
ab7bd5ef98
fixing tokenization and training
2019-08-23 17:31:21 +02:00
Thomas Wolf
90dcd8c05d
Merge branch 'master' into generative-finetuning
2019-08-22 10:43:30 +02:00
VictorSanh
57272d5ddf
fix for glue
2019-08-22 00:25:49 -04:00
VictorSanh
b006a7a12f
fix for squad
2019-08-22 00:25:42 -04:00
Thomas Wolf
9beaa85b07
Merge pull request #1055 from qipeng/run_squad_fix
...
Fix #1015 (tokenizer defaults to use_lower_case=True when loading from trained models)
2019-08-21 01:20:46 +02:00
Lysandre
2d042274ac
Sequence special token handling for BERT and RoBERTa
2019-08-20 14:15:28 -04:00
Peng Qi
3bffd2e8e5
more fixes
2019-08-20 10:59:28 -07:00
Thomas Wolf
3b56427a1e
Merge pull request #1040 from FeiWang96/multi_gpu
...
Fix bug of multi-gpu training in lm finetuning
2019-08-20 17:13:44 +02:00
thomwolf
a690edab17
various fix and clean up on run_lm_finetuning
2019-08-20 15:52:12 +02:00
erenup
fc74132598
add best steps to train
2019-08-20 19:06:41 +08:00
Duzeyao
d86b49ac86
swap optimizer.step and scheduler.step
2019-08-20 16:46:34 +08:00
Duzeyao
45ab8bf60e
Revert "Update finetune_on_pregenerated.py"
...
This reverts commit a1359b970c.
2019-08-20 16:40:39 +08:00
erenup
97c30b73d5
add test related code
2019-08-20 16:31:04 +08:00
erenup
d5e60e5b7a
add test related code
2019-08-20 16:25:50 +08:00
Zeyao Du
a1359b970c
Update finetune_on_pregenerated.py
2019-08-20 16:00:07 +08:00
Zeyao Du
28f7ca1f80
swap optimizer.step and scheduler.step
2019-08-20 15:58:42 +08:00
Peng Qi
a368b87791
Fix #1015
2019-08-19 13:07:00 -07:00
Lysandre
f94f1c6016
Distributed training + tokenizer agnostic mask token
2019-08-19 14:58:50 -04:00
Thomas Wolf
5a49b793d9
Merge pull request #1023 from tuvuumass/patch-1
...
fix issue #824
2019-08-19 15:31:46 +02:00
erenup
4270d3da1b
fix a bug of evaluating
2019-08-19 16:38:52 +08:00
Chi-Liang Liu
40acf6b52a
don't save model without training
2019-08-18 05:02:25 -04:00
erenup
47e9aea0fe
add args info to evaluate_result.txt
2019-08-18 17:00:53 +08:00
erenup
5582bc4b23
add multiple choice to roberta and xlnet, test on swag, roberta=0.82.28, xlnet=0.80
2019-08-18 16:01:48 +08:00
wangfei
856a63da4d
Fix: save model/model.module
2019-08-18 11:03:47 +08:00
wangfei
1ef41b8337
Revert "Fix: save model/model.module"
...
This reverts commit 00e9c4cc96.
2019-08-18 11:03:12 +08:00
wangfei
00e9c4cc96
Fix: save model/model.module
2019-08-18 11:02:02 +08:00
erenup
e384ae2b9d
Merge remote-tracking branch 'huggingface/master'
...
merge huggingface/master to update
2019-08-17 12:05:57 +08:00
Jason Phang
d8923270e6
Correct truncation for RoBERTa in 2-input GLUE
2019-08-16 16:30:38 -04:00
Lysandre
5652f54ac2
Simplified data generator + better perplexity calculator
...
GPT-2 now obtains ~20 perplexity on WikiText-2
2019-08-16 13:49:56 -04:00
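Perplexity, as reported above for GPT-2 on WikiText-2, is the exponential of the mean per-token negative log-likelihood over the evaluation set; a minimal sketch of the calculation (illustrative, not the script's exact code):

```python
import math

def perplexity(token_nlls):
    """exp of the average negative log-likelihood (natural log) per token."""
    return math.exp(sum(token_nlls) / len(token_nlls))

# A model assigning every token probability 1/10 has perplexity 10.
print(round(perplexity([math.log(10)] * 100)))  # 10
```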
LysandreJik
7e7fc53da5
Fixing run_glue example with RoBERTa
2019-08-16 11:53:10 -04:00
LysandreJik
715534800a
BERT + RoBERTa masking tokens handling + GPU device update.
2019-08-16 10:10:21 -04:00
LysandreJik
339e556feb
CLM for BERT, beginning of CLM for RoBERTa; still needs a better masking token mechanism.
2019-08-16 10:10:20 -04:00
LysandreJik
5c18825a18
Removed dataset limit
2019-08-16 10:10:20 -04:00
LysandreJik
3e3e145497
Added GPT to the generative fine-tuning.
2019-08-16 10:10:20 -04:00
LysandreJik
47975ed53e
Language Modeling fine-tuning using GPT-2.
2019-08-16 10:10:20 -04:00
wangfei
b8ff56896c
Fix bug of multi-gpu training in lm finetuning
2019-08-16 12:11:05 +08:00
Rabeeh KARIMI
3d47a7f8ab
loads the tokenizer for each checkpoint, to solve the reproducibility issue
2019-08-14 10:58:26 +02:00
LysandreJik
39f426be65
Added special tokens <pad> and <mask> to RoBERTa.
2019-08-13 15:19:50 -04:00
Julien Chaumond
baf08ca1d4
[RoBERTa] run_glue: correct pad_token + reorder labels
2019-08-13 12:51:15 -04:00
tuvuumass
ba4bce2581
fix issue #824
2019-08-13 11:26:27 -04:00
Julien Chaumond
912fdff899
[RoBERTa] Update run_glue for RoBERTa
2019-08-12 13:49:50 -04:00
erenup
b219029c45
refactoring old run_swag. This script is mainly refactored from run_squad in pytorch_transformers
2019-08-11 15:20:37 +08:00
Thomas Wolf
b4f9464f90
Merge pull request #960 from ethanjperez/patch-1
...
Fixing unused weight_decay argument
2019-08-07 10:09:55 +02:00
Thomas Wolf
d43dc48b34
Merge branch 'master' into auto_models
2019-08-05 19:17:35 +02:00
thomwolf
70c10caa06
add option mentioned in #940
2019-08-05 17:09:37 +02:00
thomwolf
b90e29d52c
working on automodels
2019-08-05 16:06:34 +02:00
Ethan Perez
28ba345ecc
Fixing unused weight_decay argument
...
Currently the L2 regularization is hard-coded to "0.01", even though there is a --weight_decay flag implemented (that is unused). I'm making this flag control the weight decay used for fine-tuning in this script.
2019-08-04 12:31:46 -04:00
Thomas Wolf
c054b5ee64
Merge pull request #896 from zijunsun/master
...
fix multi-gpu training bug when using fp16
2019-07-26 19:31:02 +02:00
zijunsun
f0aeb7a814
multi-gpu training also should be after apex fp16(squad)
2019-07-26 15:23:29 +08:00
zijunsun
adb3ef6368
multi-gpu training also should be after apex fp16
2019-07-25 13:09:10 +08:00
Chi-Liang Liu
a7fce6d917
fix squad v1 error (na_prob_file should be None)
2019-07-24 16:11:36 +08:00
thomwolf
6070b55443
fix #868
2019-07-23 17:46:01 +02:00
thomwolf
2c9a3115b7
fix #858
2019-07-23 16:45:55 +02:00
Thomas Wolf
268c6cc160
Merge pull request #845 from rabeehk/master
...
fixed version issues in run_openai_gpt
2019-07-23 15:29:31 +02:00
Peiqin Lin
76be189b08
typos
2019-07-21 20:39:42 +08:00
Rabeeh KARIMI
f63ff536ad
fixed version issues in run_openai_gpt
2019-07-20 12:43:07 +02:00
Thomas Wolf
a615499076
Merge pull request #797 from yzy5630/fix-examples
...
fix some errors for distributed lm_finetuning
2019-07-18 23:32:33 +02:00
yzy5630
a1fe4ba9c9
use new API for save and load
2019-07-18 15:45:23 +08:00
yzy5630
a7ba27b1b4
add parser for adam
2019-07-18 08:52:51 +08:00
yzy5630
d6522e2873
change loss and optimizer to new API
2019-07-17 21:22:34 +08:00
thomwolf
71d597dad0
fix #800
2019-07-17 13:51:09 +02:00
yzy5630
123da5a2fa
fix errors for lm_finetuning examples
2019-07-17 09:56:07 +08:00
yzy5630
60a1bdcdac
fix some errors for distributed lm_finetuning
2019-07-17 09:16:20 +08:00
thomwolf
e848b54730
fix #792
2019-07-16 21:22:19 +02:00
thomwolf
1849aa7d39
update readme and pretrained model weight files
2019-07-16 15:11:29 +02:00
thomwolf
f31154cb9d
Merge branch 'xlnet'
2019-07-16 11:51:13 +02:00
thomwolf
76da9765b6
fix run_generation test
2019-07-15 17:52:35 +02:00
thomwolf
e691fc0963
update QA models tests + run_generation
2019-07-15 17:45:24 +02:00
thomwolf
15d8b1266c
update tokenizer - update squad example for xlnet
2019-07-15 17:30:42 +02:00
thomwolf
3b469cb422
updating squad for compatibility with XLNet
2019-07-15 15:28:37 +02:00
thomwolf
0e9825e252
small fix to run_glue
2019-07-14 23:43:28 +02:00
thomwolf
2397f958f9
updating examples and doc
2019-07-14 23:20:10 +02:00
thomwolf
c490f5ce87
added generation examples in tests
2019-07-13 15:26:58 +02:00
thomwolf
7d4b200e40
good quality generation example for GPT, GPT-2, Transfo-XL, XLNet
2019-07-13 15:25:03 +02:00
thomwolf
7322c314a6
remove python2 testing for examples
2019-07-12 14:24:08 +02:00
thomwolf
936e813c84
clean up examples - added squad example and test
2019-07-12 14:16:06 +02:00
thomwolf
762ded9b1c
wip examples
2019-07-12 11:28:52 +02:00
LysandreJik
3821ecbf4a
Byte order mark management in TSV glue reading.
2019-07-11 20:16:28 -04:00
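The byte-order-mark fix above matters because Excel-exported TSVs often begin with a UTF-8 BOM, which otherwise ends up glued to the first column header; decoding with utf-8-sig strips it. A minimal illustration:

```python
raw = b"\xef\xbb\xbflabel\tsentence\n"  # file content starting with a UTF-8 BOM

plain = raw.decode("utf-8")      # BOM survives as \ufeff
clean = raw.decode("utf-8-sig")  # BOM stripped

print(repr(plain.split("\t")[0]))  # '\ufefflabel' — corrupted header
print(repr(clean.split("\t")[0]))  # 'label'
```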
thomwolf
c6bf1a400d
fix test examples and pretrained model
2019-07-11 22:29:08 +02:00
thomwolf
92a782b108
fix run_glue test
2019-07-11 22:20:10 +02:00
thomwolf
ccb6947dc1
optimization tests
2019-07-11 17:39:47 +02:00
thomwolf
b21d84b027
update examples
2019-07-11 15:37:34 +02:00
thomwolf
ec07cf5a66
revamp optimization
2019-07-11 14:48:22 +02:00
thomwolf
4fef5919a5
updating examples
2019-07-11 12:03:08 +02:00
thomwolf
50b7e52a7f
WIP examples
2019-07-10 15:33:34 +02:00
thomwolf
ed6c8d37f4
fix merge
2019-07-09 17:14:52 +02:00
thomwolf
4ce237c880
update run_glue
2019-07-09 17:00:32 +02:00
thomwolf
3b7cb7bf44
small update to run_glue
2019-07-09 16:12:15 +02:00
thomwolf
d0efbd3cd1
update SequenceSummary module
2019-07-09 15:46:43 +02:00
thomwolf
d5481cbe1b
adding tests to examples - updating summary module - coverage update
2019-07-09 15:29:42 +02:00
thomwolf
b19786985d
unified tokenizer api and serialization + tests
2019-07-09 10:25:18 +02:00
thomwolf
3d5f291386
updates to run_glue
2019-07-05 17:22:15 +02:00
thomwolf
99b90edab1
cleaning up run_glue example
2019-07-05 17:09:35 +02:00
thomwolf
1113f97f33
clean up glue example
2019-07-05 16:31:13 +02:00
thomwolf
162ba383b0
fix model loading
2019-07-05 15:57:14 +02:00
thomwolf
36bca545ff
tokenization abstract class - tests for examples
2019-07-05 15:02:59 +02:00
Thomas Wolf
78462aad61
Merge pull request #733 from ceremonious/parallel-generation
...
Added option to use multiple workers to create training data
2019-07-05 12:04:30 +02:00
thomwolf
0bab55d5d5
[BIG] name change
2019-07-05 11:55:36 +02:00
thomwolf
c41f2bad69
WIP XLM + refactoring
2019-07-03 22:54:39 +02:00
Lei Mao
64b2a828c0
fix evaluation bug
2019-07-01 14:56:24 -07:00
thomwolf
2b56e98892
standardizing API across models - XLNetForSeqClass working
2019-06-28 16:35:09 +02:00
thomwolf
3a00674cbf
fix imports
2019-06-27 17:18:46 +02:00
Mayhul Arora
08ff056c43
Added option to use multiple workers to create training data for lm fine tuning
2019-06-26 16:16:12 -07:00
thomwolf
59cefd4f98
fix #726 - get_lr in examples
2019-06-26 11:28:27 +02:00
thomwolf
092dacfd62
changing is_regression to unified API
2019-06-26 09:54:05 +02:00