Commit Graph

15053 Commits

Author SHA1 Message Date
thomwolf
b509bf7655 updating loss computation 2019-04-12 12:12:33 +02:00
thomwolf
1d203a34c0 back to simple indexing 2019-04-11 23:51:03 +02:00
Thomas Wolf
616743330e
Merge pull request #462 from 8enmann/master
fix run_gpt2.py
2019-04-11 21:54:46 +02:00
Thomas Wolf
2cdfb8b254
Merge pull request #467 from yaroslavvb/patch-2
Update README.md
2019-04-11 21:53:23 +02:00
Jie Yang
c49ce3c722 fix tsv read error in Windows 2019-04-11 15:40:19 -04:00
thomwolf
074c869bbe fix OpenAIGPTMultipleChoiceHead 2019-04-11 20:53:50 +02:00
thomwolf
724eb45cef add stale bot 2019-04-11 17:12:00 +02:00
thomwolf
4bc4c69af9 finetuning any BERT model - fixes #455 2019-04-11 16:57:59 +02:00
thomwolf
a05fad8dce fix typo 2019-04-11 13:16:17 +02:00
thomwolf
4a82f4f856 update special token addition 2019-04-11 13:11:22 +02:00
thomwolf
991b8e65f4 Merge branch 'master' of https://github.com/huggingface/pytorch-pretrained-BERT 2019-04-11 11:43:15 +02:00
thomwolf
e99b2014cc fixes #471 2019-04-11 11:43:13 +02:00
Yaroslav Bulatov
8fffba5f47
Update README.md
Fix for

```
04/09/2019 21:39:38 - INFO - __main__ -   device: cuda n_gpu: 1, distributed training: False, 16-bits training: False
Traceback (most recent call last):
  File "/home/ubuntu/pytorch-pretrained-BERT/examples/lm_finetuning/simple_lm_finetuning.py", line 642, in <module>
    main()
  File "/home/ubuntu/pytorch-pretrained-BERT/examples/lm_finetuning/simple_lm_finetuning.py", line 502, in main
    raise ValueError("Training is currently the only implemented execution option. Please set `do_train`.")
ValueError: Training is currently the only implemented execution option. Please set `do_train`.
```
2019-04-09 14:45:47 -07:00
Benjamin Mann
fd8a3556f0 fix run_gpt2.py 2019-04-08 17:20:35 -07:00
Dhanajit Brahma
f4fc9c6152 Merge branch 'master' of https://github.com/dhanajitb/pytorch-pretrained-BERT 2019-04-07 17:52:35 +05:30
Dhanajit Brahma
6c4c7be282 Merge remote-tracking branch 'upstream/master' 2019-04-07 16:59:36 +05:30
Dhanajit Brahma
4d3cf0d602 removing some redundant lines 2019-04-07 16:59:07 +05:30
dhanajitb
0d6a882f63
Cleaned some redundant lines
```
while not args.unconditional:
   if not args.unconditional:
```
These lines have been updated
2019-04-07 16:54:38 +05:30
lukovnikov
fc7693adc3 schedule fix 2019-04-03 18:16:47 +02:00
lukovnikov
20686b78fc schedule fix 2019-04-03 18:13:52 +02:00
lukovnikov
1b4ce76c38 schedule fix 2019-04-03 17:40:12 +02:00
lukovnikov
5fed5bb3d6 schedule fix 2019-04-03 17:20:29 +02:00
lukovnikov
23bd2eebf5 schedule fix 2019-04-03 17:10:34 +02:00
lukovnikov
91a073f804 schedule fix 2019-04-03 17:10:08 +02:00
lukovnikov
b64cc63a77 optimization schedule test update 2019-04-03 16:42:40 +02:00
lukovnikov
d164867d90 - updated docs for optimization 2019-04-03 16:13:51 +02:00
lukovnikov
1758c8fc72 - updated docs for optimization 2019-04-03 16:08:34 +02:00
lukovnikov
725a56329d Merge remote-tracking branch 'upstream/master' into optim
# Conflicts:
#	pytorch_pretrained_bert/optimization.py

- updated docs for optimization
2019-04-03 16:07:50 +02:00
Thomas Wolf
94980b529f
Merge pull request #404 from CatalinVoss/fix_lm_loss
Fix Language Modeling Loss
2019-04-03 11:35:30 +02:00
Thomas Wolf
9ca25ce828
Merge pull request #427 from jeonsworld/patch-1
fix sample_doc
2019-04-03 11:26:58 +02:00
Thomas Wolf
db4dccd1b5
Merge pull request #389 from lukovnikov/master
Fix cosine schedule
2019-04-03 11:21:43 +02:00
thomwolf
19666dcb3b Should fix #438 2019-04-03 11:01:01 +02:00
thomwolf
1d8c232324 Fix #436 2019-04-03 10:51:03 +02:00
thomwolf
846b1fd6f8 Fix #419 2019-04-03 10:50:38 +02:00
Thomas Wolf
404adcdabf
Merge pull request #437 from MottoX/fix-link
Fix links in README
2019-04-02 11:40:46 +02:00
Weixin Wang
f26ce6992e
Fix links in README 2019-04-02 17:20:32 +08:00
Thomas Wolf
2f80dbbc0d
Merge pull request #430 from MottoX/master
Fix typo in example code
2019-04-02 10:41:56 +02:00
Thomas Wolf
94adad6be3
Merge pull request #435 from marpaia/training-fixes
Fixes to the TensorFlow conversion tool
2019-04-02 10:41:40 +02:00
Mike Arpaia
8b5c63e4de Fixes to the TensorFlow conversion tool 2019-04-01 13:17:54 -06:00
Weixin Wang
d07db28f52
Fix typo in example code
Modify 'unambigiously' to 'unambiguously'
2019-03-31 01:20:18 +08:00
jeonsworld
60005f464d
Update pregenerate_training_data.py
If randint returns rand_end, searchsorted returns a sampled_doc_index equal to current_idx.

example:
cumsum_max = {int64} 30
doc_cumsum = {ndarray} [ 5  7 11 19 30]
doc_lengths = {list} <class 'list'>: [5, 2, 4, 8, 11]
If current_idx = 1:
rand_start = 7
rand_end = 35
sentence_index = randint(7, 35) % cumsum_max
If randint returns 35, sentence_index becomes 5.
If sentence_index is 5, np.searchsorted returns 1, which equals current_idx.
2019-03-30 14:50:17 +09:00
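The arithmetic in the commit message above can be reproduced with a short sketch. It is not the code from pregenerate_training_data.py: the helper `sampled_doc`, the formula used to derive `rand_end`, and the `side="right"` argument are assumptions, chosen because they reproduce the values quoted in the message (rand_end = 35, and searchsorted returning 1 for sentence_index 5). The sketch enumerates every value an inclusive draw could return and checks which ones map back to current_idx.

```python
import numpy as np

# Values taken from the example in the commit message.
doc_lengths = [5, 2, 4, 8, 11]
doc_cumsum = np.cumsum(doc_lengths)   # [ 5  7 11 19 30]
cumsum_max = int(doc_cumsum[-1])      # 30
current_idx = 1

rand_start = int(doc_cumsum[current_idx])  # 7
# Assumed formula; it yields the rand_end = 35 quoted in the message.
rand_end = rand_start + cumsum_max - doc_lengths[current_idx]


def sampled_doc(draw):
    """Map a raw draw to a document index (hypothetical helper)."""
    sentence_index = draw % cumsum_max
    # side="right" is assumed: it is the setting under which 5 maps to index 1.
    return int(np.searchsorted(doc_cumsum, sentence_index, side="right"))


# random.randint(a, b) includes b, so 35 is a possible draw.
inclusive_collisions = [
    d for d in range(rand_start, rand_end + 1) if sampled_doc(d) == current_idx
]

# A half-open range (the semantics of random.randrange) never draws 35.
exclusive_collisions = [
    d for d in range(rand_start, rand_end) if sampled_doc(d) == current_idx
]

print(inclusive_collisions)  # [35]
print(exclusive_collisions)  # []
```

Under these assumptions the only draw that lands back on current_idx is the upper bound itself, so any sampler that excludes the upper bound avoids the collision in this example.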
Dhanajit Brahma
4d3721f9bc Just updating
Merge remote-tracking branch 'upstream/master'
2019-03-29 21:56:47 +05:30
Thomas Wolf
ec5c1d6134
Merge pull request #425 from Separius/patch-1
fix lm_finetuning's link
2019-03-29 09:14:11 +01:00
Sepehr Sameni
b588ff362a
fix lm_finetuning's link 2019-03-29 12:39:24 +04:30
dhanajitb
f872eb98c2
making unconditional generation work
The unconditional generation works now but if the seed is fixed, the sample is the same every time.
n_samples > 1 will give different samples though.
I am giving the start token as '<|endoftext|>' for the unconditional generation.
2019-03-28 22:46:15 +05:30
Thomas Wolf
694e2117f3
Merge pull request #388 from ananyahjha93/master
Added remaining GLUE tasks to 'run_classifier.py'
2019-03-28 09:06:53 +01:00
Catalin Voss
01520d5412 Remove my unhelpful comments :) 2019-03-27 10:45:28 -07:00
Thomas Wolf
f7c9dc8c99
Merge pull request #409 from ikuyamada/master
Remove padding_idx from position_embeddings and token_type_embeddings
2019-03-27 12:30:03 +01:00
Thomas Wolf
cc8c2d2332
Merge pull request #396 from IndexFziQ/IndexFziQ
add tqdm to the process of eval in examples/run_swag.py
2019-03-27 12:03:26 +01:00
Thomas Wolf
bbff03fbfc
Merge pull request #394 from desireevl/master
Minor change in README
2019-03-27 12:03:00 +01:00