lukovnikov
1b4ce76c38
schedule fix
2019-04-03 17:40:12 +02:00
lukovnikov
5fed5bb3d6
schedule fix
2019-04-03 17:20:29 +02:00
lukovnikov
23bd2eebf5
schedule fix
2019-04-03 17:10:34 +02:00
lukovnikov
91a073f804
schedule fix
2019-04-03 17:10:08 +02:00
lukovnikov
b64cc63a77
optimization schedule test update
2019-04-03 16:42:40 +02:00
lukovnikov
d164867d90
- updated docs for optimization
2019-04-03 16:13:51 +02:00
lukovnikov
1758c8fc72
- updated docs for optimization
2019-04-03 16:08:34 +02:00
lukovnikov
725a56329d
Merge remote-tracking branch 'upstream/master' into optim
...
# Conflicts:
# pytorch_pretrained_bert/optimization.py
- updated docs for optimization
2019-04-03 16:07:50 +02:00
Thomas Wolf
94980b529f
Merge pull request #404 from CatalinVoss/fix_lm_loss
...
Fix Language Modeling Loss
2019-04-03 11:35:30 +02:00
Thomas Wolf
9ca25ce828
Merge pull request #427 from jeonsworld/patch-1
...
fix sample_doc
2019-04-03 11:26:58 +02:00
Thomas Wolf
db4dccd1b5
Merge pull request #389 from lukovnikov/master
...
Fix cosine schedule
2019-04-03 11:21:43 +02:00
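The schedule fixes in the commits above and below concern the warmup-then-cosine learning-rate multiplier; a generic sketch of that shape (illustrative only, not the repository's exact optimization.py code):

```python
import math

def warmup_cosine_lr(step, total_steps, warmup_frac=0.1):
    # Multiplier in [0, 1]: linear warmup, then cosine decay to zero.
    progress = step / total_steps
    if progress < warmup_frac:
        return progress / warmup_frac
    progress = (progress - warmup_frac) / (1.0 - warmup_frac)
    return 0.5 * (1.0 + math.cos(math.pi * progress))
```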
thomwolf
19666dcb3b
Should fix #438
2019-04-03 11:01:01 +02:00
thomwolf
1d8c232324
Fix #436
2019-04-03 10:51:03 +02:00
thomwolf
846b1fd6f8
Fix #419
2019-04-03 10:50:38 +02:00
Thomas Wolf
404adcdabf
Merge pull request #437 from MottoX/fix-link
...
Fix links in README
2019-04-02 11:40:46 +02:00
Weixin Wang
f26ce6992e
Fix links in README
2019-04-02 17:20:32 +08:00
Thomas Wolf
2f80dbbc0d
Merge pull request #430 from MottoX/master
...
Fix typo in example code
2019-04-02 10:41:56 +02:00
Thomas Wolf
94adad6be3
Merge pull request #435 from marpaia/training-fixes
...
Fixes to the TensorFlow conversion tool
2019-04-02 10:41:40 +02:00
Mike Arpaia
8b5c63e4de
Fixes to the TensorFlow conversion tool
2019-04-01 13:17:54 -06:00
Weixin Wang
d07db28f52
Fix typo in example code
...
Modify 'unambigiously' to 'unambiguously'
2019-03-31 01:20:18 +08:00
jeonsworld
60005f464d
Update pregenerate_training_data.py
...
Because randint's upper bound is inclusive, it can return rand_end itself, in which case searchsorted returns a sampled_doc_index equal to current_idx, i.e. the current document gets sampled as the "random" document.
example:
cumsum_max = 30
doc_cumsum = [ 5  7 11 19 30]
doc_lengths = [5, 2, 4, 8, 11]
if current_idx = 1:
rand_start = 7
rand_end = 35
sentence_index = randint(7, 35) % cumsum_max
If randint returns 35, sentence_index becomes 5, and np.searchsorted returns 1, which equals current_idx.
2019-03-30 14:50:17 +09:00
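A minimal sketch of the off-by-one described above and the usual fix (an exclusive upper bound via randrange); the values are taken from the commit message, and the side='right' lookup is an assumption about how the script maps a sentence index back to a document:

```python
import numpy as np
from random import randrange

# Values from the commit message above.
doc_lengths = [5, 2, 4, 8, 11]
doc_cumsum = np.cumsum(doc_lengths)   # array([ 5,  7, 11, 19, 30])
cumsum_max = int(doc_cumsum[-1])      # 30
current_idx = 1

rand_start = int(doc_cumsum[current_idx])                      # 7
rand_end = rand_start + cumsum_max - doc_lengths[current_idx]  # 35

# Buggy: random.randint includes rand_end, so it can return 35, and
# 35 % 30 == 5 maps straight back into the current document.
sentence_index = 35 % cumsum_max  # 5
assert np.searchsorted(doc_cumsum, sentence_index, side='right') == current_idx

# Fixed: randrange excludes the upper bound, so the wrap-around cannot
# reach the current document's range.
sentence_index = randrange(rand_start, rand_end) % cumsum_max
assert np.searchsorted(doc_cumsum, sentence_index, side='right') != current_idx
```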
Dhanajit Brahma
4d3721f9bc
Just updating
...
Merge remote-tracking branch 'upstream/master'
2019-03-29 21:56:47 +05:30
Thomas Wolf
ec5c1d6134
Merge pull request #425 from Separius/patch-1
...
fix lm_finetuning's link
2019-03-29 09:14:11 +01:00
Sepehr Sameni
b588ff362a
fix lm_finetuning's link
2019-03-29 12:39:24 +04:30
dhanajitb
f872eb98c2
making unconditional generation work
...
Unconditional generation works now, but with a fixed seed the sample is the same every time; n_samples > 1 still gives different samples.
The start token for unconditional generation is '<|endoftext|>'.
2019-03-28 22:46:15 +05:30
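A minimal sketch of the two behaviors described above, assuming a GPT-2 tokenizer object named enc (hypothetical here); seeding is what makes a fixed-seed run reproduce the same sample:

```python
import numpy as np
import torch

def set_seed(seed):
    # With a fixed seed the RNG state is identical on every run, so the
    # first unconditional sample is identical too; drawing n_samples > 1
    # in one run still differs because the RNG advances between samples.
    np.random.seed(seed)
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)

set_seed(42)
# Unconditional generation starts from the special end-of-text token
# rather than a user prompt (enc is an assumed GPT-2 tokenizer):
# context_tokens = [enc.encoder['<|endoftext|>']]
```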
Thomas Wolf
694e2117f3
Merge pull request #388 from ananyahjha93/master
...
Added remaining GLUE tasks to 'run_classifier.py'
2019-03-28 09:06:53 +01:00
Catalin Voss
01520d5412
Remove my unhelpful comments :)
2019-03-27 10:45:28 -07:00
Thomas Wolf
f7c9dc8c99
Merge pull request #409 from ikuyamada/master
...
Remove padding_idx from position_embeddings and token_type_embeddings
2019-03-27 12:30:03 +01:00
Thomas Wolf
cc8c2d2332
Merge pull request #396 from IndexFziQ/IndexFziQ
...
add tqdm to the process of eval in examples/run_swag.py
2019-03-27 12:03:26 +01:00
Thomas Wolf
bbff03fbfc
Merge pull request #394 from desireevl/master
...
Minor change in README
2019-03-27 12:03:00 +01:00
Thomas Wolf
2fb8ddeeff
Merge pull request #392 from Rocketknight1/master
...
Add full language model fine-tuning
2019-03-27 12:02:36 +01:00
thomwolf
34561e61a5
update main readme also
2019-03-27 12:00:04 +01:00
thomwolf
361aff6de5
typos
2019-03-27 11:54:59 +01:00
thomwolf
cea8ba1d59
adjusted formatting and some wording in the readme
2019-03-27 11:53:44 +01:00
Ikuya Yamada
0401317b23
Remove padding_idx from position_embeddings and token_type_embeddings
2019-03-26 21:56:35 +09:00
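Context for the change above: padding_idx pins an embedding row to zeros and blocks its gradient, which is wrong for position and token-type embeddings, where index 0 is a real position (the first token) or segment (sentence A). A minimal illustration (sizes are assumptions):

```python
import torch
import torch.nn as nn

# With padding_idx=0, row 0 is initialized to zeros and never receives
# gradient, yet index 0 here means "first position" / "segment A".
emb = nn.Embedding(512, 768, padding_idx=0)
assert torch.all(emb.weight[0] == 0)

# After the fix: a plain embedding, so index 0 trains like any other row.
position_embeddings = nn.Embedding(512, 768)
```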
Matthew Carrigan
24e67fbf75
Minor README update
2019-03-25 12:33:30 +00:00
Matthew Carrigan
8d1d1ffde2
Corrected the displayed loss when gradient_accumulation_steps > 1
2019-03-25 12:15:19 +00:00
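The arithmetic behind the fix above: when the per-batch loss is divided by gradient_accumulation_steps before backward(), a running sum of those scaled losses understates the true loss by the same factor, so the displayed mean must be rescaled. A small self-contained illustration (variable names assumed):

```python
import torch

gradient_accumulation_steps = 4
tr_loss, nb_tr_steps = 0.0, 0

for _ in range(8):  # pretend every batch has a true loss of 2.0
    raw_loss = torch.tensor(2.0)
    loss = raw_loss / gradient_accumulation_steps  # scaled for backward()
    tr_loss += loss.item()
    nb_tr_steps += 1

# The unscaled mean would read 0.5; rescaling restores the true 2.0.
mean_loss = tr_loss * gradient_accumulation_steps / nb_tr_steps
print(mean_loss)  # 2.0
```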
Catalin Voss
fda2f62395
Fix test failures due to old torch issue with non-contiguous view
2019-03-24 14:37:13 -07:00
Catalin Voss
0dd796e359
Also fix loss function issue with the double head models
2019-03-24 14:35:55 -07:00
Catalin Voss
472857c47f
Fix typo syntax err (sorry, c/p from my repo)
2019-03-24 14:14:49 -07:00
Catalin Voss
2e6f5ffb96
Fix GPT language model loss here as well
2019-03-24 14:14:44 -07:00
Catalin Voss
5938f31fa7
Fix c/p typo from my experiment code
2019-03-24 14:14:40 -07:00
Catalin Voss
7797d21b8d
Fix GPT2 language modeling loss computation
2019-03-24 14:14:35 -07:00
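The series of loss fixes above implement the standard next-token shift for language-model loss: logits at position t are scored against the token at position t+1, and both tensors are made contiguous before view (which old torch versions require, per the test fix above). A minimal sketch with illustrative shapes:

```python
import torch
import torch.nn.functional as F

batch, seq_len, vocab = 2, 8, 50257
lm_logits = torch.randn(batch, seq_len, vocab)
labels = torch.randint(vocab, (batch, seq_len))

# Drop the last logit and the first label so position t predicts t + 1;
# .contiguous() lets .view() work on the sliced (non-contiguous) tensors.
shift_logits = lm_logits[..., :-1, :].contiguous()
shift_labels = labels[..., 1:].contiguous()
loss = F.cross_entropy(shift_logits.view(-1, vocab), shift_labels.view(-1))
```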
Ananya Harsh Jha
f471979167
added GLUE dev set results and details on how to run GLUE tasks
2019-03-21 15:38:30 -04:00
Matthew Carrigan
abb7d1ff6d
Added proper context management to ensure cleanup happens in the right order.
2019-03-21 17:50:03 +00:00
Matthew Carrigan
06a30cfdf3
Added a --reduce_memory option to the training script to keep training data on disc as a memmap rather than in memory.
2019-03-21 17:04:12 +00:00
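A minimal sketch of the memmap approach described above (file name, dtype, and shape are assumptions): the array is backed by a file, so the pregenerated examples page through the OS rather than residing in process memory:

```python
import numpy as np

shape = (100_000, 128)  # (num_examples, seq_len), illustrative
token_ids = np.memmap('train_tokens.memmap', dtype=np.int32,
                      mode='w+', shape=shape)
token_ids[0, :] = 0   # writes go to the file-backed buffer
token_ids.flush()     # persist to disc
```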
Matthew Carrigan
7d1ae644ef
Added a --reduce_memory option to the training script to keep training data on disc as a memmap rather than in memory.
2019-03-21 17:02:18 +00:00
Matthew Carrigan
2bba7f810e
Added a --reduce_memory option to shelve docs to disc instead of keeping them in memory.
2019-03-21 16:50:16 +00:00
Matthew Carrigan
8733ffcb5e
Removing a couple of other old unnecessary comments
2019-03-21 14:09:57 +00:00
Matthew Carrigan
8a861048dd
Fixed up the notes on a possible future low-memory path
2019-03-21 14:08:39 +00:00