Stefan Schweter
e7cf2ccd15
distillation: add German distilbert model
2019-11-19 19:55:19 +01:00
Kazutoshi Shinoda
f3386d9383
typo "deay" -> "decay"
2019-11-18 11:50:06 -05:00
Stefan Schweter
56c84863a1
camembert: add support for CamemBERT in run_ner example
2019-11-18 17:06:57 +01:00
Julien Chaumond
26858f27cb
[camembert] Upload to s3 + rename script
2019-11-16 00:11:07 -05:00
Louis MARTIN
3e20c2e871
Update demo_camembert.py with new classes
2019-11-16 00:11:07 -05:00
Louis MARTIN
f12e4d8da7
Move demo_camembert.py to examples/contrib
2019-11-16 00:11:07 -05:00
Louis MARTIN
6e72fd094c
Add demo_camembert.py
2019-11-16 00:11:07 -05:00
Thomas Wolf
74ce8de7d8
Merge pull request #1792 from stefan-it/distilbert-for-token-classification
...
DistilBERT for token classification
2019-11-14 22:47:53 +01:00
Thomas Wolf
05db5bc1af
added small comparison between BERT, RoBERTa and DistilBERT
2019-11-14 22:40:22 +01:00
Thomas Wolf
9629e2c676
Merge pull request #1804 from ronakice/master
...
fix multi-gpu eval in torch examples
2019-11-14 22:24:05 +01:00
Thomas Wolf
df99f8c5a1
Merge pull request #1832 from huggingface/memory-leak-schedulers
...
replace LambdaLR scheduler wrappers by function
2019-11-14 22:10:31 +01:00
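For context on the scheduler change above: the class-based LambdaLR wrappers were replaced by plain functions that build and return a configured LambdaLR. A minimal sketch of that pattern, with illustrative names rather than the library's exact API:

```python
# Sketch of the function-based scheduler pattern described above: instead of a
# custom LambdaLR subclass, a plain function builds and returns the scheduler.
# Names here are illustrative, not the library's exact API.
import torch
from torch.optim.lr_scheduler import LambdaLR

def linear_schedule_with_warmup(optimizer, num_warmup_steps, num_training_steps):
    def lr_lambda(current_step):
        if current_step < num_warmup_steps:
            return current_step / max(1, num_warmup_steps)
        return max(
            0.0,
            (num_training_steps - current_step) / max(1, num_training_steps - num_warmup_steps),
        )
    return LambdaLR(optimizer, lr_lambda)

model = torch.nn.Linear(4, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
scheduler = linear_schedule_with_warmup(optimizer, num_warmup_steps=100, num_training_steps=1000)
# in a training loop: optimizer.step() then scheduler.step()
```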
Rémi Louf
2276bf69b7
update the examples, docs and template
2019-11-14 20:38:02 +01:00
Lysandre
d7929899da
Specify checkpoint in saved file for run_lm_finetuning.py
2019-11-14 10:49:00 -05:00
ronakice
2e31176557
fix multi-gpu eval
2019-11-12 05:55:11 -05:00
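For context on the multi-GPU evaluation fix above: under torch.nn.DataParallel the forward pass returns one loss per device, so calling .item() on it fails; the usual remedy is to average before accumulating. A small self-contained sketch of the pattern (names are illustrative):

```python
import torch

# Sketch of the multi-GPU evaluation fix described above: under
# torch.nn.DataParallel the forward pass returns one loss per device,
# so the loss must be reduced to a scalar before .item() is called.
def accumulate_eval_loss(eval_loss, tmp_eval_loss, n_gpu):
    if n_gpu > 1:
        tmp_eval_loss = tmp_eval_loss.mean()  # average the per-GPU losses
    return eval_loss + tmp_eval_loss.item()

# single GPU / CPU: loss is already a scalar tensor
print(accumulate_eval_loss(0.0, torch.tensor(0.7), n_gpu=1))
# multi GPU: DataParallel gathers a vector with one loss per device
print(accumulate_eval_loss(0.0, torch.tensor([0.7, 0.9]), n_gpu=2))
```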
Stefan Schweter
2b07b9e5ee
examples: add DistilBert support for NER fine-tuning
2019-11-11 16:19:34 +01:00
Adrian Bauer
7a9aae1044
Fix run_bertology.py
...
Make imports and args.overwrite_cache match run_glue.py
2019-11-08 16:28:40 -05:00
Julien Chaumond
f88c104d8f
[run_tf_glue] Add comment for context
2019-11-05 19:56:43 -05:00
Julien Chaumond
30968d70af
misc doc
2019-11-05 19:06:12 -05:00
Thomas Wolf
e99071f105
Merge pull request #1734 from orena1/patch-1
...
add progress bar to convert_examples_to_features
2019-11-05 11:34:20 +01:00
Thomas Wolf
ba973342e3
Merge pull request #1553 from WilliamTambellini/timeSquadInference
...
Add speed log to examples/run_squad.py
2019-11-05 11:13:12 +01:00
Thomas Wolf
237fad339c
Merge pull request #1709 from oneraghavan/master
...
Fixing mode in evaluate during training
2019-11-05 10:55:33 +01:00
Oren Amsalem
d7906165a3
add progress bar for convert_examples_to_features
...
It takes a considerable amount of time (~10 min) to parse the examples into features, so it is good to have a progress bar to track this
2019-11-05 10:34:27 +02:00
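A progress bar for a loop like this is typically just a tqdm wrapper around the example iterator; a minimal sketch, not the script's exact code:

```python
from tqdm import tqdm

# Illustrative sketch of adding a progress bar to a long-running
# example-to-feature conversion loop, as described above.
def convert_examples_to_features(examples, tokenize):
    features = []
    for example in tqdm(examples, desc="Converting examples to features"):
        features.append(tokenize(example))
    return features

features = convert_examples_to_features(["a b", "c d e"], str.split)
```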
thomwolf
89d6272898
Fix #1623
2019-11-04 16:21:12 +01:00
Thomas Wolf
9a3b173cd3
Merge branch 'master' into master
2019-11-04 11:41:26 +01:00
thomwolf
ad90868627
Update example readme
2019-11-04 11:27:22 +01:00
Raghavan
e5b1048bae
Fixing mode in evaluate during training
2019-11-03 16:14:46 +05:30
Lysandre
1a2b40cb53
run_tf_glue MRPC evaluation only for MRPC
2019-10-31 18:00:51 -04:00
Timothy Liu
be36cf92fb
Added mixed precision support to benchmarks.py
2019-10-31 17:24:37 -04:00
Julien Chaumond
f96ce1c241
[run_generation] Fix generation with batch_size>1
2019-10-31 18:27:11 +00:00
Julien Chaumond
3c1b6f594e
Merge branch 'master' into fix_top_k_top_p_filtering
2019-10-31 13:53:51 -04:00
Victor SANH
fa735208c9
update readme - fix example command distil*
2019-10-30 14:27:28 -04:00
Thomas Wolf
c7058d8224
Merge pull request #1608 from focox/master
...
Error raised by "tmp_eval_loss += tmp_eval_loss.item()" when using multi-gpu
2019-10-30 17:14:07 +01:00
Thomas Wolf
04c69db399
Merge pull request #1628 from huggingface/tfglue
...
run_tf_glue works with all tasks
2019-10-30 17:04:03 +01:00
Thomas Wolf
3df4367244
Merge pull request #1601 from huggingface/clean-roberta
...
Clean roberta model & all tokenizers now add special tokens by default (breaking change)
2019-10-30 17:00:40 +01:00
Thomas Wolf
36174696cc
Merge branch 'master' into clean-roberta
2019-10-30 16:51:06 +01:00
Thomas Wolf
228cdd6a6e
Merge branch 'master' into conditional-generation
2019-10-30 16:40:35 +01:00
Rémi Louf
070507df1f
format utils for summarization
2019-10-30 11:24:12 +01:00
Rémi Louf
da10de8466
fix bug with padding mask + add corresponding test
2019-10-30 11:19:58 +01:00
Rémi Louf
3b0d2fa30e
rename seq2seq to encoder_decoder
2019-10-30 10:54:46 +01:00
Rémi Louf
9c1bdb5b61
revert renaming of lm_labels to ltr_lm_labels
2019-10-30 10:43:13 +01:00
Rémi Louf
098a89f312
update docstrings; rename lm_labels to more explicit ltr_lm_labels
2019-10-29 20:08:03 +01:00
Rémi Louf
dfce409691
resolve PR comments
2019-10-29 17:10:20 +01:00
altsoph
079bfb32fb
Evaluation fixed.
2019-10-28 10:18:58 -04:00
altsoph
438f2730a0
Evaluation code fixed.
2019-10-28 10:18:58 -04:00
Rémi Louf
4c3ac4a7d8
here's one big commit
2019-10-28 10:49:50 +01:00
Rémi Louf
932543f77e
fix test of truncation function
2019-10-28 10:49:49 +01:00
Rémi Louf
a67413ccc8
extend works in-place
2019-10-28 10:49:49 +01:00
Rémi Louf
b915ba9dfe
pad sequence with 0, mask with -1
2019-10-28 10:49:49 +01:00
Lysandre
bab6ad01aa
run_tf_glue works with all tasks
2019-10-24 21:41:45 +00:00
Matt Maybeno
ae1d03fc51
Add roberta to doc
2019-10-24 14:32:48 -04:00
Matt Maybeno
4e5f88b74f
Add Roberta to run_ner.py
2019-10-24 14:32:48 -04:00
VictorSanh
5b6cafb11b
[release] fix table weirdness
2019-10-23 10:35:16 -04:00
VictorSanh
8ad5c591cd
[RELEASE] DistilRoBERTa
2019-10-23 10:29:47 -04:00
focox@qq.com
bd847ce7d7
fixed the bug raised by "tmp_eval_loss += tmp_eval_loss.item()" when running on multiple GPUs in parallel.
2019-10-23 20:27:13 +08:00
Julien Chaumond
ef1b8b2ae5
[CTRL] warn if generation prompt does not start with a control code
...
see also https://github.com/salesforce/ctrl/pull/50
2019-10-22 21:30:32 +00:00
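The warning amounts to checking whether the prompt begins with one of CTRL's control codes; a sketch with a hypothetical, abridged code list:

```python
import logging

logger = logging.getLogger(__name__)

# Hypothetical, abridged list of CTRL control codes, for illustration only.
CONTROL_CODES = {"Links", "Wikipedia", "Books", "Reviews", "News", "Questions"}

def check_ctrl_prompt(prompt):
    # Warn if the prompt does not start with a known control code,
    # since CTRL was trained with such codes prepended to the text.
    if not any(prompt.startswith(code) for code in CONTROL_CODES):
        logger.warning(
            "The prompt does not start with a CTRL control code; "
            "generation quality may suffer."
        )
    return prompt

check_ctrl_prompt("Links My neighbor is a banker")
```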
Lysandre
7d709e55ed
Remove
2019-10-22 14:12:33 -04:00
Lysandre
1cfd974868
Option to benchmark only one of the two libraries
2019-10-22 13:32:23 -04:00
Pasquale Minervini
abd7110e21
gradient norm clipping should be done right before calling the optimiser - fixing run_glue and run_ner as well
2019-10-21 19:56:52 +01:00
Pasquale Minervini
3775550c4b
gradient norm clipping should be done right before calling the optimiser
2019-10-20 22:33:56 +01:00
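For reference, the ordering this fix enforces in PyTorch is clip_grad_norm_ after backward() and immediately before optimizer.step(); a small sketch:

```python
import torch

# Sketch of the gradient-clipping ordering described above: clip the norm
# right after backward() and right before optimizer.step(), once all the
# gradients for the step have been accumulated.
model = torch.nn.Linear(8, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
max_grad_norm = 1.0

for _ in range(3):
    inputs = torch.randn(4, 8)
    loss = model(inputs).pow(2).mean()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
    optimizer.step()
    optimizer.zero_grad()
```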
LysandreJik
7dd29ed2f1
Benchmarks example script
2019-10-18 10:53:04 -04:00
William Tambellini
0919389d9a
Add speed log to examples/run_squad.py
...
Add a speed estimate log (time per example)
for evaluation to examples/run_squad.py
2019-10-17 14:41:04 -07:00
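The speed estimate is simply total evaluation wall time divided by the number of examples; an illustrative sketch:

```python
import time

# Illustrative sketch of the per-example speed log described above.
def evaluate_with_timing(batches, eval_step, num_examples):
    start = time.time()
    for batch in batches:
        eval_step(batch)
    elapsed = time.time() - start
    print("Evaluation done in %.1f s (%.6f s per example)" % (elapsed, elapsed / num_examples))

evaluate_with_timing(range(10), lambda b: time.sleep(0.01), num_examples=80)
```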
leo-du
ecd15667f3
fix repetition penalty
2019-10-17 14:47:14 -04:00
thomwolf
8cd56e3036
fix data processing in script
2019-10-17 16:33:26 +02:00
Rémi Louf
578d23e061
add training pipeline (formatting temporary)
2019-10-17 14:02:27 +02:00
Rémi Louf
47a06d88a0
use two different tokenizers for story and summary
2019-10-17 13:04:26 +02:00
Rémi Louf
bfb9b540d4
add Model2Model to __init__
2019-10-17 12:59:51 +02:00
Rémi Louf
c1bc709c35
correct the truncation and padding of dataset
2019-10-17 10:41:53 +02:00
Rémi Louf
e4e0ee14bd
add separator between data import and train
2019-10-16 20:05:32 +02:00
Rémi Louf
0d81fc853e
specify in readme that both datasets are required
2019-10-15 15:26:33 +02:00
Rémi Louf
1aec940587
test the full story processing
2019-10-15 15:18:07 +02:00
Rémi Louf
22e1af6859
truncation function is fully tested
2019-10-15 14:43:50 +02:00
Rémi Louf
260ac7d9a8
wip commit, switching computers
2019-10-15 12:24:35 +02:00
thomwolf
be916cb3fb
Merge branch 'master' of https://github.com/huggingface/transformers
2019-10-15 10:37:13 +02:00
thomwolf
5875aaf762
install tensorboard
2019-10-15 10:36:46 +02:00
Thomas Wolf
40f14ff545
Merge pull request #1513 from slayton58/amp_fp16_einsum
...
Force einsum to run in fp16
2019-10-15 10:25:00 +02:00
Thomas Wolf
d147671c6c
Merge pull request #1508 from tlkh/master
...
Added performance enhancements (XLA, AMP) to examples
2019-10-15 09:57:18 +02:00
thomwolf
2c1d5564ad
add readme information
2019-10-15 09:56:52 +02:00
thomwolf
c55badcee0
Add NER finetuning details by @stefan-it in example readme
2019-10-15 09:33:52 +02:00
Julien Chaumond
788e632622
[ner] Honor args.overwrite_cache
2019-10-15 09:17:31 +02:00
thomwolf
0f9ebb0b43
add seqeval as requirement for examples
2019-10-15 09:17:31 +02:00
thomwolf
66adb71734
update to transformers
2019-10-15 09:17:31 +02:00
Marianne Stecklina
5ff9cd158a
Add option to predict on test set
2019-10-15 09:17:31 +02:00
Marianne Stecklina
7f5367e0b1
Add cli argument for configuring labels
2019-10-15 09:17:31 +02:00
Marianne Stecklina
e1d4179b64
Make file reading more robust
2019-10-15 09:17:31 +02:00
Marianne Stecklina
383ef96747
Implement fine-tuning BERT on CoNLL-2003 named entity recognition task
2019-10-15 09:17:31 +02:00
Marianne Stecklina
5adb39e757
Add option to predict on test set
2019-10-15 09:14:53 +02:00
Marianne Stecklina
99b189df6d
Add cli argument for configuring labels
2019-10-15 09:14:53 +02:00
Marianne Stecklina
3e9420add1
Make file reading more robust
2019-10-15 09:14:53 +02:00
Marianne Stecklina
cde42c4354
Implement fine-tuning BERT on CoNLL-2003 named entity recognition task
2019-10-15 09:14:53 +02:00
hlums
74c5035808
Fix token order in xlnet preprocessing.
2019-10-14 21:27:11 +00:00
Rémi Louf
fe25eefc15
add instructions to fetch the dataset
2019-10-14 20:45:39 +02:00
Rémi Louf
412793275d
delegate the padding with special tokens to the tokenizer
2019-10-14 20:45:16 +02:00
Rémi Louf
447fffb21f
process the raw CNN/Daily Mail dataset
...
the data provided by Li Dong et al. were already tokenized, which means
that they are not compatible with all the models in the library. We
thus process the raw data directly and tokenize them using the models'
tokenizers.
2019-10-14 18:12:20 +02:00
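In the raw CNN/DailyMail release each .story file holds the article text followed by "@highlight" blocks containing the summary sentences; a hedged sketch of splitting one file into article and highlights (helper name is illustrative):

```python
# Illustrative sketch of splitting a raw CNN/DailyMail ".story" file into the
# article text and its highlight (summary) sentences, as described above.
# The "@highlight" marker is part of the raw dataset format.
def split_story(raw_text):
    parts = raw_text.split("@highlight")
    article = parts[0].strip()
    highlights = [part.strip() for part in parts[1:] if part.strip()]
    return article, highlights

raw = "Some article text.\n\n@highlight\n\nFirst summary point\n\n@highlight\n\nSecond summary point"
article, highlights = split_story(raw)
print(article)
print(highlights)
```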
Simon Layton
4e6a55751a
Force einsum to fp16
2019-10-14 11:12:41 -04:00
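If this relies on NVIDIA apex's AMP, the usual way to force a single op into half precision is to register it with amp; the exact call below is an assumption on my part and requires apex to be installed:

```python
# Sketch of forcing torch.einsum to run in fp16 under NVIDIA apex AMP.
# amp.register_half_function is the apex call this presumably relies on;
# treat the exact API as an assumption.
import torch

try:
    from apex import amp
    amp.register_half_function(torch, "einsum")  # run einsum in half precision
except ImportError:
    print("apex is not installed; skipping the fp16 einsum registration")
```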
Rémi Louf
67d10960ae
load and prepare CNN/Daily Mail data
...
We write a function to load and preprocess the CNN/Daily Mail dataset as
provided by Li Dong et al. The issue is that this dataset has already
been tokenized by the authors, so we actually need to find the original,
plain-text dataset if we want to apply it to all models.
2019-10-14 14:11:20 +02:00
Timothy Liu
376e65a674
Added automatic mixed precision and XLA options to run_tf_glue.py
2019-10-13 13:19:06 +00:00
Timothy Liu
86f23a1944
Minor enhancements to run_tf_glue.py
2019-10-13 10:21:35 +00:00
VictorSanh
d844db4005
Add citation bibtex
2019-10-11 16:55:42 -04:00
Rémi Louf
b3261e7ace
read parameters from CLI, load model & tokenizer
2019-10-11 18:40:38 +02:00
Rémi Louf
d889e0b71b
add base for seq2seq finetuning
2019-10-11 17:36:12 +02:00
Thomas Wolf
4428aefc63
Merge pull request #1488 from huggingface/pytorch-tpu
...
GLUE on TPU
2019-10-11 16:33:00 +02:00
Luran He
f382a8decd
convert int to str before adding to a str
2019-10-10 19:20:39 -04:00
Lysandre
639f4b7190
Don't save/load when on TPU
2019-10-10 19:17:25 +00:00
Lysandre
d4e7934ac3
GLUE on TPU
2019-10-10 19:03:06 +00:00
Rémi Louf
1e68c28670
add test for initialization of Bert2Rnd
2019-10-10 18:07:11 +02:00
Thomas Wolf
6596e3d566
Merge pull request #1454 from bkkaggle/pytorch-built-in-tensorboard
...
Change tensorboard imports to use built-in tensorboard if available
2019-10-10 11:56:55 +02:00
thomwolf
177a721205
move back to simple space splitting
2019-10-10 11:45:47 +02:00
thomwolf
a5997dd81a
better error messages
2019-10-10 11:31:01 +02:00
Lysandre Debut
2431fea98a
Merge pull request #1383 from keskarnitish/master
...
Adding CTRL
2019-10-09 11:31:05 -04:00
thomwolf
d9e60f4f0d
Merge branch 'master' into pr/1383
2019-10-09 17:25:08 +02:00
Lysandre Debut
e84470ef81
Merge pull request #1384 from huggingface/encoding-qol
...
Quality of life enhancements in encoding + patch MLM masking
2019-10-09 11:18:24 -04:00
jinoobaek-qz
69629c4f0f
Improve naming and only do regex when necessary
2019-10-09 08:48:40 -04:00
jinoobaek-qz
bf34a252b8
Golden path
2019-10-09 08:48:40 -04:00
jinoobaek-qz
528d3f327b
Improve readability and make fewer assumptions about checkpoint format
2019-10-09 08:48:40 -04:00
jinoobaek-qz
56301bd9e8
Extract method
2019-10-09 08:48:40 -04:00
jinoobaek-qz
d6c5469712
Delete older checkpoint after saving new checkpoint
2019-10-09 08:48:40 -04:00
jinoobaek-qz
54a31f50fb
Add save_total_limit
2019-10-09 08:48:40 -04:00
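A save_total_limit amounts to sorting the checkpoint directories by step and deleting the oldest ones; a hedged sketch of the rotation (the "checkpoint-<step>" directory naming is an assumption for the example):

```python
import os
import re
import shutil

# Illustrative sketch of a save_total_limit-style checkpoint rotation:
# keep the newest `save_total_limit` "checkpoint-<step>" directories and
# delete the rest.
def rotate_checkpoints(output_dir, save_total_limit):
    if save_total_limit is None or save_total_limit <= 0:
        return
    checkpoints = []
    for name in os.listdir(output_dir):
        match = re.match(r"checkpoint-(\d+)$", name)
        if match and os.path.isdir(os.path.join(output_dir, name)):
            checkpoints.append((int(match.group(1)), name))
    checkpoints.sort()  # oldest (smallest step) first
    for _, name in checkpoints[: max(0, len(checkpoints) - save_total_limit)]:
        shutil.rmtree(os.path.join(output_dir, name))
```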
Thomas Wolf
439fac723a
Merge pull request #1409 from brian41005/master
...
Evaluation result.txt path changing #1286
2019-10-09 03:14:34 +02:00
Bilal Khan
5ce8d29abe
Change tensorboard imports to use built-in tensorboard if available
2019-10-08 16:29:43 -05:00
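The import change prefers PyTorch's bundled SummaryWriter and falls back to tensorboardX when it is unavailable; a sketch of that fallback:

```python
# Sketch of the import fallback described above: prefer PyTorch's built-in
# TensorBoard writer, fall back to tensorboardX if it is not available.
try:
    from torch.utils.tensorboard import SummaryWriter
except ImportError:
    from tensorboardX import SummaryWriter

writer = SummaryWriter(log_dir="runs/example")
writer.add_scalar("loss", 0.5, global_step=1)
writer.close()
```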
VictorSanh
7ce83b4931
update weights for distilgpt2
2019-10-07 12:30:27 -04:00
LysandreJik
f3e0218fbb
Correct device assignment in run_generation
2019-10-05 21:05:16 -04:00
thomwolf
78ef1a9930
fixes
2019-10-04 17:59:44 -04:00
thomwolf
6c1d0bc066
update encode_plus - add truncation strategies
2019-10-04 17:38:38 -04:00
VictorSanh
0820bb0555
unnecessary carriage return
2019-10-04 17:23:15 -04:00
VictorSanh
f5891c3821
run_squad --> run_squad_w_distillation
2019-10-04 17:23:15 -04:00
VictorSanh
764a7923ec
add distillation+finetuning option in run_squad
2019-10-04 17:23:15 -04:00
thomwolf
92c0f2fb90
Merge remote-tracking branch 'origin/julien_multiple-choice' into encoding-qol
2019-10-04 15:48:06 -04:00
Julien Chaumond
9e136ff57c
Honor args.overwrite_cache (h/t @erenup)
2019-10-04 15:00:56 -04:00
keskarnitish
dbed1c5d94
Adding CTRL (squashed commit)
...
adding conversion script
adding first draft of modeling & tokenization
adding placeholder for test files
bunch of changes
registering the tokenizer/model/etc
tests
change link; something is very VERY wrong here
weird end-of-word thingy going on
i think the tokenization works now; wrote the unit tests
overall structure works; load w next
the monster is alive!
works after some cleanup as well
adding emacs autosave to gitignore
currently only supporting the 48 layer one; seems to infer fine on my macbook
cleanup
fixing some documentation
fixing some documentation
tests passing?
now works on CUDA also
adding greedy?
adding greedy sampling
works well
2019-10-03 22:29:03 -07:00
Lysandre Debut
d3f24dfad7
Merge branch 'master' into master
2019-10-03 22:43:09 +00:00
LysandreJik
ecc4f1bdfa
XLM use_lang_embedding flag in run_generation
2019-10-03 17:42:16 -04:00
LysandreJik
c2c2ca0fdb
Added XLM to run_generation, with prompt language selection.
2019-10-03 17:18:48 -04:00
LysandreJik
aebd83230f
Update naming + remove f string in run_lm_finetuning example
2019-10-03 11:31:36 -04:00
LysandreJik
5ed50a93fb
LM finetuning won't mask special tokens anymore
2019-10-03 11:31:36 -04:00
Brian Ma
7af0777910
Update run_glue.py
...
add DistilBert model shortcut into ALL_MODELS
2019-10-03 15:31:11 +00:00
VictorSanh
5f07d8f11a
prepare release
2019-10-03 10:27:11 -04:00
VictorSanh
35071007cb
incoming release 🔥 update links to arxiv preprint
2019-10-03 10:27:11 -04:00
VictorSanh
2a91f6071f
update README - TODO update link to paper
2019-10-03 10:27:11 -04:00
VictorSanh
c51e533a5f
update train.py
2019-10-03 10:27:11 -04:00
VictorSanh
a76c3f9cb0
update requirements
2019-10-03 10:27:11 -04:00
VictorSanh
bb9c5ead54
update distiller
2019-10-03 10:27:11 -04:00
VictorSanh
a12ab0a8db
update binarized_data
2019-10-03 10:27:11 -04:00
VictorSanh
4d6dfbd376
update extract
2019-10-03 10:27:11 -04:00
VictorSanh
23edebc079
update extract_distilbert
2019-10-03 10:27:11 -04:00
VictorSanh
cbfcfce205
update token_counts
2019-10-03 10:27:11 -04:00
VictorSanh
19e4ebbe3f
grouped_batch_sampler
2019-10-03 10:27:11 -04:00
VictorSanh
594202a934
lm_seqs_dataset
2019-10-03 10:27:11 -04:00
VictorSanh
38084507c4
add distillation_configs
2019-10-03 10:27:11 -04:00
Brian Ma
2195c0d5f9
Evaluation result.txt path changing #1286
2019-10-03 12:49:12 +08:00
Thomas Wolf
963529e29b
Merge pull request #1288 from echan00/master
...
Typo with LM Fine tuning script
2019-10-01 18:46:07 -04:00