Commit Graph

15053 Commits

Author SHA1 Message Date
Lorenzo Ampil
ec276d6aba Add special tokens to documentation for the tensorflow model examples #1561 2019-10-27 14:00:40 +08:00
Lorenzo Ampil
6e011690a9 Add special tokens to documentation for the rest of pytorch model examples #1561 2019-10-27 13:59:14 +08:00
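The #1561 commits above document the special-tokens flag in the example scripts. As an illustration only, here is a toy sketch of what BERT-style special-token handling amounts to when encoding a sequence (the `encode` helper below is hypothetical, not the library's tokenizer):

```python
def encode(tokens, add_special_tokens=True):
    # BERT-style formatting: wrap the token sequence in [CLS] ... [SEP]
    # when special tokens are requested, mirroring what the documented
    # add_special_tokens flag controls in the example scripts.
    if add_special_tokens:
        return ["[CLS]"] + list(tokens) + ["[SEP]"]
    return list(tokens)
```

Without the flag the raw tokens pass through unchanged, which is why the examples needed updating once the models expected the special tokens at train and inference time.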
Lysandre
beaf66b1f3 Remove break 2019-10-24 21:43:28 +00:00
Lysandre
bab6ad01aa run_tf_glue works with all tasks 2019-10-24 21:41:45 +00:00
Matt Maybeno
ae1d03fc51 Add roberta to doc 2019-10-24 14:32:48 -04:00
Matt Maybeno
4e5f88b74f Add Roberta to run_ner.py 2019-10-24 14:32:48 -04:00
Matt Maybeno
b92d68421d Use roberta model and update doc strings 2019-10-24 14:32:48 -04:00
Matt Maybeno
66085a1321 RoBERTa token classification
[WIP] copy paste bert token classification for roberta
2019-10-24 14:32:48 -04:00
Lysandre
b82bfbd0c3 Updated README to show all available documentation 2019-10-24 15:55:31 +00:00
VictorSanh
5b6cafb11b [release] fix table weirdness 2019-10-23 10:35:16 -04:00
VictorSanh
8ad5c591cd [RELEASE] DistilRoBERTa 2019-10-23 10:29:47 -04:00
focox@qq.com
bd847ce7d7 fixed the bug raised by "tmp_eval_loss += tmp_eval_loss.item()" when running on multiple GPUs in parallel. 2019-10-23 20:27:13 +08:00
Lysandre Debut
6e85bccafc Fixed typo 2019-10-22 18:07:01 -04:00
Lysandre
fbcc5ff9fb Change branch to master 2019-10-22 18:01:10 -04:00
Lysandre
69eba0ab19 Edit script path 2019-10-22 17:53:52 -04:00
Lysandre
bc3e57d551 Multi version doc deployment 2019-10-22 17:51:30 -04:00
Julien Chaumond
ef1b8b2ae5 [CTRL] warn if generation prompt does not start with a control code
see also https://github.com/salesforce/ctrl/pull/50
2019-10-22 21:30:32 +00:00
Julián Peller (dataista)
e16d46843a Fix architectures count 2019-10-22 15:13:47 -04:00
Lysandre
7d709e55ed Remove 2019-10-22 14:12:33 -04:00
Lysandre
44286b94d3 RoBERTa doesn't print a warning when no special tokens are passed. 2019-10-22 13:46:48 -04:00
Lysandre
1cfd974868 Option to benchmark only one of the two libraries 2019-10-22 13:32:23 -04:00
Lysandre
777faa8ae7 Fix #1597 2019-10-22 11:26:42 -04:00
Thomas Wolf
b8c9ea0010
Merge pull request #1580 from pminervini/master
Gradient norm clipping should be done right before calling the optimiser
2019-10-22 13:59:20 +02:00
Pasquale Minervini
abd7110e21 gradient norm clipping should be done right before calling the optimiser - fixing run_glue and run_ner as well 2019-10-21 19:56:52 +01:00
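The gradient-clipping commits from #1580 reorder the clipping call so it runs after backpropagation and immediately before the optimizer step. A minimal pure-Python sketch of that ordering, with toy `clip_grad_norm_` and `sgd_step` helpers standing in for the actual PyTorch calls:

```python
import math

def clip_grad_norm_(grads, max_norm):
    # Scale gradients in place so their global L2 norm is at most max_norm,
    # in the spirit of torch.nn.utils.clip_grad_norm_ (toy version).
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm > max_norm:
        scale = max_norm / (total_norm + 1e-6)
        grads[:] = [g * scale for g in grads]
    return total_norm

def sgd_step(params, grads, lr):
    # The optimizer step: applied only after clipping.
    params[:] = [p - lr * g for p, g in zip(params, grads)]

params = [1.0, -2.0]
grads = [3.0, 4.0]           # global L2 norm is 5.0
clip_grad_norm_(grads, 1.0)  # clip right before the optimizer step
sgd_step(params, grads, 0.1)
```

Clipping after `step()` (or before `backward()`) would act on stale or empty gradients, which is the ordering bug the fix addresses in run_glue and run_ner.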
thomwolf
4d456542e9 Fix citation 2019-10-21 16:34:14 +02:00
Thomas Wolf
0e64fec1ab
Merge pull request #1568 from daemon/patch-1
Fix hanging when loading pretrained models
2019-10-21 14:31:57 +02:00
Lorenzo Ampil
3a52b65795 Add special tokens to documentation for bert examples to resolve issue: #1561 2019-10-21 12:55:51 +08:00
erenup
86a630702d Merge branch 'huggingface/master' 2019-10-21 12:06:09 +08:00
Pasquale Minervini
3775550c4b gradient norm clipping should be done right before calling the optimiser 2019-10-20 22:33:56 +01:00
Pasquale Minervini
bf2c36a920
Merge pull request #1 from huggingface/master
update
2019-10-20 23:30:45 +02:00
Ralph Tang
a2c8c8ef00
Fix hanging when loading pretrained models
- Fix hanging when loading pretrained models from the cache without having internet access. This is a widespread issue on supercomputers whose internal compute nodes are firewalled.
2019-10-19 16:19:20 -04:00
LysandreJik
82f6abd98a Benchmark section added to the documentation 2019-10-18 17:27:10 -04:00
LysandreJik
7dd29ed2f1 Benchmarks example script 2019-10-18 10:53:04 -04:00
Lysandre Debut
8efc0ec91a Add Benchmarks to issue templates 2019-10-18 10:45:44 -04:00
William Tambellini
0919389d9a Add speed log to examples/run_squad.py
Add a speed estimate log (time per example)
for evaluation to examples/run_squad.py
2019-10-17 14:41:04 -07:00
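Commit 0919389d9a adds a time-per-example estimate to evaluation in examples/run_squad.py. A hedged sketch of that kind of speed logging (the `evaluate` and `predict` names here are hypothetical, not the script's actual functions):

```python
import time

def evaluate(examples, predict):
    # Run prediction over all examples and log a speed estimate
    # (seconds per example), as the run_squad.py commit describes.
    start = time.time()
    results = [predict(ex) for ex in examples]
    elapsed = time.time() - start
    print(f"Evaluation done in {elapsed:.3f}s "
          f"({elapsed / len(examples):.6f} s/example)")
    return results
```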
VictorSanh
fd97761c5a soft launch distilroberta 2019-10-17 15:28:58 -04:00
leo-du
ecd15667f3 fix repetition penalty 2019-10-17 14:47:14 -04:00
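The repetition-penalty fix concerns the penalty applied to already-generated tokens during sampling. Illustratively, a sign-aware CTRL-style penalty looks like the toy function below (`apply_repetition_penalty` is a sketch for exposition, not the library's implementation):

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.2):
    # Discourage tokens that were already generated: positive logits are
    # divided by the penalty and negative logits are multiplied by it,
    # so the score decreases in both cases. Naively dividing a negative
    # logit would *increase* its probability, which is the pitfall the
    # sign-aware form avoids.
    out = list(logits)
    for idx in set(generated_ids):
        out[idx] = out[idx] / penalty if out[idx] > 0 else out[idx] * penalty
    return out
```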
thomwolf
56e2ee4ead fix model2model 2019-10-17 16:33:31 +02:00
thomwolf
8cd56e3036 fix data processing in script 2019-10-17 16:33:26 +02:00
Rémi Louf
578d23e061 add training pipeline (formatting temporary) 2019-10-17 14:02:27 +02:00
Rémi Louf
47a06d88a0 use two different tokenizers for story and summary 2019-10-17 13:04:26 +02:00
Rémi Louf
bfb9b540d4 add Model2Model to __init__ 2019-10-17 12:59:51 +02:00
Rémi Louf
c1bc709c35 correct the truncation and padding of dataset 2019-10-17 10:41:53 +02:00
Rémi Louf
87d60b6e19 reword explanation of encoder_attention_mask 2019-10-17 10:18:19 +02:00
Rémi Louf
638fe7f5a4 correct composition of padding and causal masks 2019-10-17 10:13:07 +02:00
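The mask-composition commit corrects how the causal (look-ahead) mask combines with the padding mask in the decoder. As a sketch of the intended composition (a toy list-of-lists `combined_mask`, not the actual tensor code): position `i` may attend to position `j` only when `j <= i` *and* `j` is not a padding position.

```python
def combined_mask(seq_len, pad_positions):
    # 1 = may attend, 0 = masked. A position is visible only if it is
    # at or before the query position (causal) and is not padding.
    return [[1 if j <= i and j not in pad_positions else 0
             for j in range(seq_len)]
            for i in range(seq_len)]
```

Composing the two masks with a logical AND (often realized as elementwise multiplication of 0/1 masks, or addition of large negative biases) keeps both constraints active at once.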
Rémi Louf
4e0f24348f document the MLM modification + raise exception on MLM training with encoder-decoder 2019-10-17 09:41:53 +02:00
Rémi Louf
624a5644cc revert black formatting to conform with lib style 2019-10-17 09:27:56 +02:00
Rémi Louf
9b71fc9a18 tying weights is going to be a clusterfuck 2019-10-16 21:31:38 +02:00
Rémi Louf
95ec1d08be separate inputs into encoder & decoder inputs 2019-10-16 20:55:42 +02:00
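The input-separation commit splits a single set of keyword arguments into encoder inputs and decoder inputs. One plausible sketch of such routing, assuming a `decoder_` prefix convention (the helper below is hypothetical, not the Model2Model code itself):

```python
def split_model_kwargs(**kwargs):
    # Route arguments prefixed with "decoder_" to the decoder (with the
    # prefix stripped) and everything else to the encoder.
    prefix = "decoder_"
    encoder_kwargs = {k: v for k, v in kwargs.items()
                      if not k.startswith(prefix)}
    decoder_kwargs = {k[len(prefix):]: v for k, v in kwargs.items()
                      if k.startswith(prefix)}
    return encoder_kwargs, decoder_kwargs
```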
Rémi Louf
e4e0ee14bd add separator between data import and train 2019-10-16 20:05:32 +02:00