LysandreJik
ab984a8b72
Python 2 compatibility
2019-09-19 15:01:33 +02:00
LysandreJik
3df208c93a
Tokenizer accepts token list as well as string
2019-09-19 14:47:52 +02:00
LysandreJik
66ea76b8a9
prepare_for_model and prepare_pair_for_model methods. Added an option to select which sequence will be truncated.
2019-09-19 13:50:51 +02:00
LysandreJik
60414f31a9
GLUE updated with new methods
2019-09-19 10:55:06 +02:00
LysandreJik
baa74326ab
Stride + tests + small fixes
2019-09-19 10:55:06 +02:00
LysandreJik
c10c7d59e7
Mask computing in standalone method. Tests.
2019-09-19 10:55:06 +02:00
LysandreJik
bf503158c5
Sentence -> Sequence. Removed output_mask from the special token addition methods.
2019-09-19 10:55:06 +02:00
LysandreJik
8cba057260
Doc + remove artefacts
2019-09-19 10:55:06 +02:00
LysandreJik
6393261e41
encode + encode_plus tests modified
2019-09-19 10:55:06 +02:00
LysandreJik
dcc9bb3252
Modified encode to return only lists. Added a more complete encode_plus method
2019-09-19 10:55:06 +02:00
LysandreJik
af23b626c8
Max encoding length + corresponding tests
2019-09-19 10:55:06 +02:00
LysandreJik
c4d4f3ec8c
Updated DistilBERT test to reflect the sequence encoding
2019-09-19 10:55:06 +02:00
LysandreJik
d572d7027b
Number of added tokens calculator
2019-09-19 10:55:06 +02:00
LysandreJik
de8e14b6c0
Added DistilBERT to run_squad script
2019-09-19 10:55:06 +02:00
LysandreJik
88368c2a16
Added DistilBERT to run_lm_finetuning
2019-09-19 10:55:06 +02:00
LysandreJik
2d8ec5a684
Changed warning to be more explicit
...
Co-authored by: julien_c <chaumond@gmail.com>
2019-09-19 10:55:06 +02:00
LysandreJik
75635072e1
Updated GLUE script to add DistilBERT. Cleaned up unused args in the utils file.
2019-09-19 10:55:06 +02:00
LysandreJik
92a9976e91
Distilbert sequence builder w/ mask
2019-09-19 10:55:06 +02:00
LysandreJik
59057abe52
typo
2019-09-19 10:55:06 +02:00
LysandreJik
bac332fec0
Updated the GLUE data processor. Corrections to RoBERTa and XLNet.
2019-09-19 10:55:06 +02:00
LysandreJik
c3df2136e1
Added binary masking tests
2019-09-19 10:55:06 +02:00
LysandreJik
e391d4735e
Tokenizers' encode function can output binary masks
2019-09-19 10:55:06 +02:00
sshleifer
119610b5c5
Merge branch 'master' into delete-n-special-doc
2019-09-19 01:35:01 -07:00
sshleifer
08e4ad5eea
Remove documentation for unused kwarg
2019-09-18 16:35:01 -07:00
Erik Chan
f0340eccf9
Typo
2019-09-18 13:42:11 -07:00
Thomas Wolf
0d1dad6d53
Merge pull request #1004 from erenup/master
...
Refactoring old run_swag.py
2019-09-18 21:42:51 +02:00
erenup
8960988f35
fixed to find best dev acc
2019-09-19 01:10:05 +08:00
erenup
b57bfb5fa0
Merge pull request #3 from erenup/run_multiple_choice_merge
...
Run multiple choice merge
2019-09-18 21:45:04 +08:00
erenup
46ffc28329
Merge branch 'master' into run_multiple_choice_merge
2019-09-18 21:43:46 +08:00
Simon Layton
ec94f4e0f8
Fix fp16 masking in PoolerEndLogits
...
Necessary to run xlnet (at least in squad) with `--fp16 --fp16_opt_level="O2"`, otherwise loss is immediately `NaN` and fine-tuning cannot proceed.
2019-09-18 09:30:58 -04:00
erenup
15143fbad6
move run_multiple_choice.py and utils_multiple_choice.py to examples
2019-09-18 21:18:46 +08:00
erenup
3cd6289758
Merge remote-tracking branch 'huggingface/master' into run_multiple_choice_merge
...
# Conflicts:
# examples/contrib/run_swag.py
2019-09-18 21:16:59 +08:00
erenup
36362cf086
move schedule.step after optimizer.step
2019-09-18 21:13:40 +08:00
thomwolf
3a527fa820
OpenAI GPT tests ok
2019-09-18 14:15:48 +02:00
thomwolf
556442afb3
hot fix
2019-09-18 14:12:41 +02:00
thomwolf
160b5d6080
fix xlm lang_embeddings loading
2019-09-18 14:10:20 +02:00
thomwolf
26497d1199
fix tests
2019-09-18 12:17:21 +02:00
thomwolf
6a083fd447
update pt-tf conversion script
2019-09-18 12:11:32 +02:00
thomwolf
f6969cc12b
upgrade max model difference to 2e-2 (for transfo-xl adaptive softmax + inputs)
2019-09-18 11:12:02 +02:00
thomwolf
e768f2322a
update run_openai_gpt to fix #1264
2019-09-18 10:07:47 +02:00
thomwolf
8334993915
clean up examples - updated to new keyword inputs - #1246
2019-09-18 10:01:27 +02:00
Julien Chaumond
62760baf46
tiny fixes
2019-09-17 18:29:15 -04:00
thomwolf
45de034bf8
fix #1223
2019-09-17 10:25:06 +02:00
erenup
5a81e79e25
Merge pull request #2 from erenup/run_multiple_choice_add_doc
...
Run multiple choice add doc
2019-09-16 22:39:54 +08:00
erenup
5882c442e5
add example usage
2019-09-16 22:38:08 +08:00
erenup
a9debaca3d
fixed init_weight
2019-09-16 19:55:24 +08:00
thomwolf
c88f05163d
fix typo in XLM models
2019-09-16 13:42:20 +02:00
erenup
982f181aa7
Merge remote-tracking branch 'origin/master' into run_multiple_choice_add_doc
2019-09-16 19:12:00 +08:00
erenup
84b9d1c423
Merge remote-tracking branch 'huggingface/master'
...
# Conflicts:
# pytorch_transformers/__init__.py
2019-09-16 19:06:12 +08:00
erenup
603b470a3d
add warning info
2019-09-16 18:53:37 +08:00