Commit Graph

714 Commits

Author SHA1 Message Date
Stefan Schweter
e7cf2ccd15 distillation: add German distilbert model 2019-11-19 19:55:19 +01:00
Kazutoshi Shinoda
f3386d9383 typo "deay" -> "decay" 2019-11-18 11:50:06 -05:00
Stefan Schweter
56c84863a1 camembert: add support for CamemBERT in run_ner example 2019-11-18 17:06:57 +01:00
Julien Chaumond
26858f27cb [camembert] Upload to s3 + rename script 2019-11-16 00:11:07 -05:00
Louis MARTIN
3e20c2e871 Update demo_camembert.py with new classes 2019-11-16 00:11:07 -05:00
Louis MARTIN
f12e4d8da7 Move demo_camembert.py to examples/contrib 2019-11-16 00:11:07 -05:00
Louis MARTIN
6e72fd094c Add demo_camembert.py 2019-11-16 00:11:07 -05:00
Thomas Wolf
74ce8de7d8
Merge pull request #1792 from stefan-it/distilbert-for-token-classification
DistilBERT for token classification
2019-11-14 22:47:53 +01:00
Thomas Wolf
05db5bc1af
added small comparison between BERT, RoBERTa and DistilBERT 2019-11-14 22:40:22 +01:00
Thomas Wolf
9629e2c676
Merge pull request #1804 from ronakice/master
fix multi-gpu eval in torch examples
2019-11-14 22:24:05 +01:00
Thomas Wolf
df99f8c5a1
Merge pull request #1832 from huggingface/memory-leak-schedulers
replace LambdaLR scheduler wrappers by function
2019-11-14 22:10:31 +01:00
Rémi Louf
2276bf69b7 update the examples, docs and template 2019-11-14 20:38:02 +01:00
Lysandre
d7929899da Specify checkpoint in saved file for run_lm_finetuning.py 2019-11-14 10:49:00 -05:00
ronakice
2e31176557 fix multi-gpu eval 2019-11-12 05:55:11 -05:00
Stefan Schweter
2b07b9e5ee examples: add DistilBert support for NER fine-tuning 2019-11-11 16:19:34 +01:00
Adrian Bauer
7a9aae1044 Fix run_bertology.py
Make imports and args.overwrite_cache match run_glue.py
2019-11-08 16:28:40 -05:00
Julien Chaumond
f88c104d8f [run_tf_glue] Add comment for context 2019-11-05 19:56:43 -05:00
Julien Chaumond
30968d70af misc doc 2019-11-05 19:06:12 -05:00
Thomas Wolf
e99071f105
Merge pull request #1734 from orena1/patch-1
add progress bar to convert_examples_to_features
2019-11-05 11:34:20 +01:00
Thomas Wolf
ba973342e3
Merge pull request #1553 from WilliamTambellini/timeSquadInference
Add speed log to examples/run_squad.py
2019-11-05 11:13:12 +01:00
Thomas Wolf
237fad339c
Merge pull request #1709 from oneraghavan/master
Fixing mode in evaluate during training
2019-11-05 10:55:33 +01:00
Oren Amsalem
d7906165a3
add progress bar for convert_examples_to_features
It takes a considerable amount of time (~10 min) to parse the examples to features, so it is good to have a progress bar to track this
2019-11-05 10:34:27 +02:00
thomwolf
89d6272898 Fix #1623 2019-11-04 16:21:12 +01:00
Thomas Wolf
9a3b173cd3
Merge branch 'master' into master 2019-11-04 11:41:26 +01:00
thomwolf
ad90868627 Update example readme 2019-11-04 11:27:22 +01:00
Raghavan
e5b1048bae
Fixing mode in evaluate during training 2019-11-03 16:14:46 +05:30
Lysandre
1a2b40cb53 run_tf_glue MRPC evaluation only for MRPC 2019-10-31 18:00:51 -04:00
Timothy Liu
be36cf92fb Added mixed precision support to benchmarks.py 2019-10-31 17:24:37 -04:00
Julien Chaumond
f96ce1c241 [run_generation] Fix generation with batch_size>1 2019-10-31 18:27:11 +00:00
Julien Chaumond
3c1b6f594e
Merge branch 'master' into fix_top_k_top_p_filtering 2019-10-31 13:53:51 -04:00
Victor SANH
fa735208c9
update readme - fix example command distil* 2019-10-30 14:27:28 -04:00
Thomas Wolf
c7058d8224
Merge pull request #1608 from focox/master
Error raised by "tmp_eval_loss += tmp_eval_loss.item()" when using multi-gpu
2019-10-30 17:14:07 +01:00
Thomas Wolf
04c69db399
Merge pull request #1628 from huggingface/tfglue
run_tf_glue works with all tasks
2019-10-30 17:04:03 +01:00
Thomas Wolf
3df4367244
Merge pull request #1601 from huggingface/clean-roberta
Clean roberta model & all tokenizers now add special tokens by default (breaking change)
2019-10-30 17:00:40 +01:00
Thomas Wolf
36174696cc
Merge branch 'master' into clean-roberta 2019-10-30 16:51:06 +01:00
Thomas Wolf
228cdd6a6e
Merge branch 'master' into conditional-generation 2019-10-30 16:40:35 +01:00
Rémi Louf
070507df1f format utils for summarization 2019-10-30 11:24:12 +01:00
Rémi Louf
da10de8466 fix bug with padding mask + add corresponding test 2019-10-30 11:19:58 +01:00
Rémi Louf
3b0d2fa30e rename seq2seq to encoder_decoder 2019-10-30 10:54:46 +01:00
Rémi Louf
9c1bdb5b61 revert renaming of lm_labels to ltr_lm_labels 2019-10-30 10:43:13 +01:00
Rémi Louf
098a89f312 update docstrings; rename lm_labels to more explicit ltr_lm_labels 2019-10-29 20:08:03 +01:00
Rémi Louf
dfce409691 resolve PR comments 2019-10-29 17:10:20 +01:00
altsoph
079bfb32fb Evaluation fixed. 2019-10-28 10:18:58 -04:00
altsoph
438f2730a0 Evaluation code fixed. 2019-10-28 10:18:58 -04:00
Rémi Louf
4c3ac4a7d8 here's one big commit 2019-10-28 10:49:50 +01:00
Rémi Louf
932543f77e fix test of truncation function 2019-10-28 10:49:49 +01:00
Rémi Louf
a67413ccc8 extend works in-place 2019-10-28 10:49:49 +01:00
Rémi Louf
b915ba9dfe pad sequence with 0, mask with -1 2019-10-28 10:49:49 +01:00
Lysandre
bab6ad01aa run_tf_glue works with all tasks 2019-10-24 21:41:45 +00:00
Matt Maybeno
ae1d03fc51 Add roberta to doc 2019-10-24 14:32:48 -04:00
Matt Maybeno
4e5f88b74f Add Roberta to run_ner.py 2019-10-24 14:32:48 -04:00
VictorSanh
5b6cafb11b [release] fix table weirdness 2019-10-23 10:35:16 -04:00
VictorSanh
8ad5c591cd [RELEASE] DistilRoBERTa 2019-10-23 10:29:47 -04:00
focox@qq.com
bd847ce7d7 fixed the bug raised by "tmp_eval_loss += tmp_eval_loss.item()" when parallelly using multi-gpu. 2019-10-23 20:27:13 +08:00
Julien Chaumond
ef1b8b2ae5 [CTRL] warn if generation prompt does not start with a control code
see also https://github.com/salesforce/ctrl/pull/50
2019-10-22 21:30:32 +00:00
Lysandre
7d709e55ed Remove 2019-10-22 14:12:33 -04:00
Lysandre
1cfd974868 Option to benchmark only one of the two libraries 2019-10-22 13:32:23 -04:00
Pasquale Minervini
abd7110e21 gradient norm clipping should be done right before calling the optimiser - fixing run_glue and run_ner as well 2019-10-21 19:56:52 +01:00
Pasquale Minervini
3775550c4b gradient norm clipping should be done right before calling the optimiser 2019-10-20 22:33:56 +01:00
LysandreJik
7dd29ed2f1 Benchmarks example script 2019-10-18 10:53:04 -04:00
William Tambellini
0919389d9a Add speed log to examples/run_squad.py
Add a speed estimate log (time per example)
for evaluation to examples/run_squad.py
2019-10-17 14:41:04 -07:00
leo-du
ecd15667f3 fix repetition penalty 2019-10-17 14:47:14 -04:00
thomwolf
8cd56e3036 fix data processing in script 2019-10-17 16:33:26 +02:00
Rémi Louf
578d23e061 add training pipeline (formatting temporary) 2019-10-17 14:02:27 +02:00
Rémi Louf
47a06d88a0 use two different tokenizers for story and summary 2019-10-17 13:04:26 +02:00
Rémi Louf
bfb9b540d4 add Model2Model to __init__ 2019-10-17 12:59:51 +02:00
Rémi Louf
c1bc709c35 correct the truncation and padding of dataset 2019-10-17 10:41:53 +02:00
Rémi Louf
e4e0ee14bd add separator between data import and train 2019-10-16 20:05:32 +02:00
Rémi Louf
0d81fc853e specify in readme that both datasets are required 2019-10-15 15:26:33 +02:00
Rémi Louf
1aec940587 test the full story processing 2019-10-15 15:18:07 +02:00
Rémi Louf
22e1af6859 truncation function is fully tested 2019-10-15 14:43:50 +02:00
Rémi Louf
260ac7d9a8 wip commit, switching computers 2019-10-15 12:24:35 +02:00
thomwolf
be916cb3fb Merge branch 'master' of https://github.com/huggingface/transformers 2019-10-15 10:37:13 +02:00
thomwolf
5875aaf762 install tensorboard 2019-10-15 10:36:46 +02:00
Thomas Wolf
40f14ff545
Merge pull request #1513 from slayton58/amp_fp16_einsum
Force einsum to run in fp16
2019-10-15 10:25:00 +02:00
Thomas Wolf
d147671c6c
Merge pull request #1508 from tlkh/master
Added performance enhancements (XLA, AMP) to examples
2019-10-15 09:57:18 +02:00
thomwolf
2c1d5564ad add readme information 2019-10-15 09:56:52 +02:00
thomwolf
c55badcee0 Add NER finetuning details by @stefan-it in example readme 2019-10-15 09:33:52 +02:00
Julien Chaumond
788e632622 [ner] Honor args.overwrite_cache 2019-10-15 09:17:31 +02:00
thomwolf
0f9ebb0b43 add seqeval as requirement for examples 2019-10-15 09:17:31 +02:00
thomwolf
66adb71734 update to transformers 2019-10-15 09:17:31 +02:00
Marianne Stecklina
5ff9cd158a Add option to predict on test set 2019-10-15 09:17:31 +02:00
Marianne Stecklina
7f5367e0b1 Add cli argument for configuring labels 2019-10-15 09:17:31 +02:00
Marianne Stecklina
e1d4179b64 Make file reading more robust 2019-10-15 09:17:31 +02:00
Marianne Stecklina
383ef96747 Implement fine-tuning BERT on CoNLL-2003 named entity recognition task 2019-10-15 09:17:31 +02:00
Marianne Stecklina
5adb39e757 Add option to predict on test set 2019-10-15 09:14:53 +02:00
Marianne Stecklina
99b189df6d Add cli argument for configuring labels 2019-10-15 09:14:53 +02:00
Marianne Stecklina
3e9420add1 Make file reading more robust 2019-10-15 09:14:53 +02:00
Marianne Stecklina
cde42c4354 Implement fine-tuning BERT on CoNLL-2003 named entity recognition task 2019-10-15 09:14:53 +02:00
hlums
74c5035808 Fix token order in xlnet preprocessing. 2019-10-14 21:27:11 +00:00
Rémi Louf
fe25eefc15 add instructions to fetch the dataset 2019-10-14 20:45:39 +02:00
Rémi Louf
412793275d delegate the padding with special tokens to the tokenizer 2019-10-14 20:45:16 +02:00
Rémi Louf
447fffb21f process the raw CNN/Daily Mail dataset
the data provided by Li Dong et al. were already tokenized, which means
that they are not compatible with all the models in the library. We
thus process the raw data directly and tokenize them using the models'
tokenizers.
2019-10-14 18:12:20 +02:00
Simon Layton
4e6a55751a Force einsum to fp16 2019-10-14 11:12:41 -04:00
Rémi Louf
67d10960ae load and prepare CNN/Daily Mail data
We write a function to load and preprocess the CNN/Daily Mail dataset as
provided by Li Dong et al. The issue is that this dataset has already
been tokenized by the authors, so we actually need to find the original,
plain-text dataset if we want to apply it to all models.
2019-10-14 14:11:20 +02:00
Timothy Liu
376e65a674 Added automatic mixed precision and XLA options to run_tf_glue.py 2019-10-13 13:19:06 +00:00
Timothy Liu
86f23a1944 Minor enhancements to run_tf_glue.py 2019-10-13 10:21:35 +00:00
VictorSanh
d844db4005 Add citation bibtex 2019-10-11 16:55:42 -04:00
Rémi Louf
b3261e7ace read parameters from CLI, load model & tokenizer 2019-10-11 18:40:38 +02:00
Rémi Louf
d889e0b71b add base for seq2seq finetuning 2019-10-11 17:36:12 +02:00
Thomas Wolf
4428aefc63
Merge pull request #1488 from huggingface/pytorch-tpu
GLUE on TPU
2019-10-11 16:33:00 +02:00
Luran He
f382a8decd convert int to str before adding to a str 2019-10-10 19:20:39 -04:00
Lysandre
639f4b7190 Don't save/load when on TPU 2019-10-10 19:17:25 +00:00
Lysandre
d4e7934ac3 GLUE on TPU 2019-10-10 19:03:06 +00:00
Rémi Louf
1e68c28670 add test for initialization of Bert2Rnd 2019-10-10 18:07:11 +02:00
Thomas Wolf
6596e3d566
Merge pull request #1454 from bkkaggle/pytorch-built-in-tensorboard
Change tensorboard imports to use built-in tensorboard if available
2019-10-10 11:56:55 +02:00
thomwolf
177a721205 move back to simple space splitting 2019-10-10 11:45:47 +02:00
thomwolf
a5997dd81a better error messages 2019-10-10 11:31:01 +02:00
Lysandre Debut
2431fea98a
Merge pull request #1383 from keskarnitish/master
Adding CTRL
2019-10-09 11:31:05 -04:00
thomwolf
d9e60f4f0d Merge branch 'master' into pr/1383 2019-10-09 17:25:08 +02:00
Lysandre Debut
e84470ef81
Merge pull request #1384 from huggingface/encoding-qol
Quality of life enhancements in encoding + patch MLM masking
2019-10-09 11:18:24 -04:00
jinoobaek-qz
69629c4f0f Improve naming and only do regex when necessary 2019-10-09 08:48:40 -04:00
jinoobaek-qz
bf34a252b8 Golden path 2019-10-09 08:48:40 -04:00
jinoobaek-qz
528d3f327b Improve readability and make fewer assumptions about checkpoint format 2019-10-09 08:48:40 -04:00
jinoobaek-qz
56301bd9e8 Extract method 2019-10-09 08:48:40 -04:00
jinoobaek-qz
d6c5469712 Delete older checkpoint after saving new checkpoint 2019-10-09 08:48:40 -04:00
jinoobaek-qz
54a31f50fb Add save_total_limit 2019-10-09 08:48:40 -04:00
Thomas Wolf
439fac723a
Merge pull request #1409 from brian41005/master
Evaluation result.txt path changing #1286
2019-10-09 03:14:34 +02:00
Bilal Khan
5ce8d29abe Change tensorboard imports to use built-in tensorboard if available 2019-10-08 16:29:43 -05:00
VictorSanh
7ce83b4931 update weights for distilgpt2 2019-10-07 12:30:27 -04:00
LysandreJik
f3e0218fbb Correct device assignment in run_generation 2019-10-05 21:05:16 -04:00
thomwolf
78ef1a9930 fixes 2019-10-04 17:59:44 -04:00
thomwolf
6c1d0bc066 update encode_plus - add truncation strategies 2019-10-04 17:38:38 -04:00
VictorSanh
0820bb0555 unnecessary carriage return 2019-10-04 17:23:15 -04:00
VictorSanh
f5891c3821 run_squad --> run_squad_w_distillation 2019-10-04 17:23:15 -04:00
VictorSanh
764a7923ec add distillation+finetuning option in run_squad 2019-10-04 17:23:15 -04:00
thomwolf
92c0f2fb90 Merge remote-tracking branch 'origin/julien_multiple-choice' into encoding-qol 2019-10-04 15:48:06 -04:00
Julien Chaumond
9e136ff57c Honor args.overwrite_cache (h/t @erenup) 2019-10-04 15:00:56 -04:00
keskarnitish
dbed1c5d94 Adding CTRL (squashed commit)
adding conversion script

adding first draft of modeling & tokenization

adding placeholder for test files

bunch of changes

registering the tokenizer/model/etc

tests

change link; something is very VERY wrong here

weird end-of-word thingy going on

i think the tokenization works now ; wrote the unit tests

overall structure works;load w next

the monster is alive!

works after some cleanup as well

adding emacs autosave to gitignore

currently only supporting the 48 layer one; seems to infer fine on my macbook

cleanup

fixing some documentation

fixing some documentation

tests passing?

now works on CUDA also

adding greedy?

adding greedy sampling

works well
2019-10-03 22:29:03 -07:00
Lysandre Debut
d3f24dfad7
Merge branch 'master' into master 2019-10-03 22:43:09 +00:00
LysandreJik
ecc4f1bdfa XLM use_lang_embedding flag in run_generation 2019-10-03 17:42:16 -04:00
LysandreJik
c2c2ca0fdb Added XLM to run_generation, with prompt language selection. 2019-10-03 17:18:48 -04:00
LysandreJik
aebd83230f Update naming + remove f string in run_lm_finetuning example 2019-10-03 11:31:36 -04:00
LysandreJik
5ed50a93fb LM finetuning won't mask special tokens anymore 2019-10-03 11:31:36 -04:00
Brian Ma
7af0777910 Update run_glue.py
add DistilBert model shortcut into ALL_MODELS
2019-10-03 15:31:11 +00:00
VictorSanh
5f07d8f11a prepare release 2019-10-03 10:27:11 -04:00
VictorSanh
35071007cb incoming release 🔥 update links to arxiv preprint 2019-10-03 10:27:11 -04:00
VictorSanh
2a91f6071f update README - TODO update link to paper 2019-10-03 10:27:11 -04:00
VictorSanh
c51e533a5f update train.py 2019-10-03 10:27:11 -04:00
VictorSanh
a76c3f9cb0 update requirements 2019-10-03 10:27:11 -04:00
VictorSanh
bb9c5ead54 update distiller 2019-10-03 10:27:11 -04:00
VictorSanh
a12ab0a8db update binarized_data 2019-10-03 10:27:11 -04:00
VictorSanh
4d6dfbd376 update extract 2019-10-03 10:27:11 -04:00
VictorSanh
23edebc079 update extract_distilbert 2019-10-03 10:27:11 -04:00
VictorSanh
cbfcfce205 update token_counts 2019-10-03 10:27:11 -04:00
VictorSanh
19e4ebbe3f grouped_batch_sampler 2019-10-03 10:27:11 -04:00
VictorSanh
594202a934 lm_seqs_dataset 2019-10-03 10:27:11 -04:00
VictorSanh
38084507c4 add distillation_configs 2019-10-03 10:27:11 -04:00
Brian Ma
2195c0d5f9 Evaluation result.txt path changing #1286 2019-10-03 12:49:12 +08:00
Thomas Wolf
963529e29b
Merge pull request #1288 from echan00/master
Typo with LM Fine tuning script
2019-10-01 18:46:07 -04:00