thomwolf
29b7b30eaa
updating evaluation on a single gpu
2019-06-18 22:20:21 +02:00
thomwolf
7d2001aa44
overwrite_output_dir
2019-06-18 22:13:30 +02:00
thomwolf
16a1f338c4
fixing
2019-06-18 17:06:31 +02:00
thomwolf
92e0ad5aba
no numpy
2019-06-18 17:00:52 +02:00
thomwolf
4e6edc3274
hop
2019-06-18 16:57:15 +02:00
thomwolf
f55b60b9ee
fixing again
2019-06-18 16:56:52 +02:00
thomwolf
8bd9118294
quick fix
2019-06-18 16:54:41 +02:00
thomwolf
3e847449ad
fix out_label_ids
2019-06-18 16:53:31 +02:00
thomwolf
aad3a54e9c
fix paths
2019-06-18 16:48:04 +02:00
thomwolf
40dbda6871
updating classification example
2019-06-18 16:45:52 +02:00
thomwolf
7388c83b60
update run_classifier for distributed eval
2019-06-18 16:32:49 +02:00
thomwolf
9727723243
fix pickle
2019-06-18 16:02:42 +02:00
thomwolf
9710b68dbc
fix pickles
2019-06-18 16:01:15 +02:00
thomwolf
15ebd67d4e
cache in run_classifier + various fixes to the examples
2019-06-18 15:58:22 +02:00
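For context on the caching commit above: the idea is to serialize the preprocessed features once and reload them on later runs instead of re-tokenizing every time. A minimal sketch, where the cache path, the `build_fn` callable and the `overwrite_cache`/`is_main_process` flags are illustrative assumptions rather than the script's exact interface:

```python
import os
import torch

def load_or_build_features(cache_file, build_fn, overwrite_cache=False, is_main_process=True):
    """Reuse preprocessed features from disk when possible (illustrative sketch)."""
    if os.path.exists(cache_file) and not overwrite_cache:
        return torch.load(cache_file)   # skip the expensive tokenization step
    features = build_fn()               # e.g. convert examples to input features
    if is_main_process:                 # only the main process writes the cache
        torch.save(features, cache_file)
    return features
```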
thomwolf
e6e5f19257
fix
2019-06-18 14:45:14 +02:00
thomwolf
a432b3d466
distributed training t_total
2019-06-18 14:39:09 +02:00
thomwolf
c5407f343f
split squad example in two
2019-06-18 14:29:03 +02:00
thomwolf
335f57baf8
only on main process
2019-06-18 14:03:46 +02:00
thomwolf
326944d627
add tensorboard to run_squad
2019-06-18 14:02:42 +02:00
thomwolf
d82e5deeb1
set find_unused_parameters=True in DDP
2019-06-18 12:13:14 +02:00
thomwolf
a59abedfb5
DDP update
2019-06-18 12:06:26 +02:00
thomwolf
2ef5e0de87
switch to pytorch DistributedDataParallel
2019-06-18 12:03:13 +02:00
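The three commits above move the examples onto torch's native DistributedDataParallel, with find_unused_parameters=True so parameters that receive no gradient on a given forward pass (for instance an unused pooling head) do not stall the reducer. A minimal sketch of that wrapping, assuming the process group has already been initialized; the helper name and launch details are illustrative:

```python
import torch
from torch.nn.parallel import DistributedDataParallel as DDP

def wrap_model_for_ddp(model, local_rank):
    """Wrap a model for multi-GPU training with torch's native DDP.

    Assumes torch.distributed.init_process_group(...) has already run,
    e.g. under torch.distributed.launch."""
    device = torch.device("cuda", local_rank)
    model.to(device)
    return DDP(model,
               device_ids=[local_rank],
               output_device=local_rank,
               # tolerate parameters that get no gradient on some passes
               find_unused_parameters=True)
```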
thomwolf
9ce37af99b
oops
2019-06-18 11:47:54 +02:00
thomwolf
a40955f071
no need to duplicate models anymore
2019-06-18 11:46:14 +02:00
thomwolf
382e2d1e50
splitting config and weight files for bert also
2019-06-18 10:37:16 +02:00
Thomas Wolf
cad88e19de
Merge pull request #672 from oliverguhr/master
...
Add vocabulary and model config to the finetune output
2019-06-14 17:02:47 +02:00
Thomas Wolf
460d9afd45
Merge pull request #640 from Barqawiz/master
...
Support latest multi language bert fine tune
2019-06-14 16:57:02 +02:00
Thomas Wolf
277c77f1c5
Merge pull request #630 from tguens/master
...
Update run_squad.py
2019-06-14 16:56:26 +02:00
Thomas Wolf
659af2cbd0
Merge pull request #604 from samuelbroscheit/master
...
Fixing issue "Training beyond specified 't_total' steps with schedule 'warmup_linear'" reported in #556
2019-06-14 16:49:24 +02:00
Meet Pragnesh Shah
e02ce4dc79
[hotfix] Fix frozen pooler parameters in SWAG example.
2019-06-11 15:13:53 -07:00
Oliver Guhr
5c08c8c273
adds the tokenizer + model config to the output
2019-06-11 13:46:33 +02:00
jeonsworld
a3a604cefb
Update pregenerate_training_data.py
...
Apply the Whole Word Masking technique, referring to [create_pretraining_data.py](https://github.com/google-research/bert/blob/master/create_pretraining_data.py).
2019-06-10 12:17:23 +09:00
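The commit above follows Google's create_pretraining_data.py; the sketch below only illustrates the core idea of Whole Word Masking: every WordPiece of a selected word is masked together. The selection logic is simplified and the helper name is an assumption:

```python
import random

def whole_word_mask(tokens, mask_prob=0.15, mask_token="[MASK]"):
    """Group '##' continuation pieces with their head piece and mask whole words."""
    words = []
    for i, token in enumerate(tokens):
        if token in ("[CLS]", "[SEP]"):
            continue
        if words and token.startswith("##"):
            words[-1].append(i)          # same word as the previous piece
        else:
            words.append([i])            # start of a new word
    random.shuffle(words)
    budget = max(1, int(round(len(tokens) * mask_prob)))
    output, masked = list(tokens), 0
    for piece_indices in words:
        if masked + len(piece_indices) > budget:
            continue
        for i in piece_indices:          # mask every piece of the chosen word
            output[i] = mask_token
        masked += len(piece_indices)
    return output

print(whole_word_mask(["[CLS]", "un", "##believ", "##able", "story", "[SEP]"]))
```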
Ahmad Barqawi
c4fe56dcc0
support latest multi language bert fine tune
...
fix issue of bert-base-multilingual and add support for uncased multilingual
2019-05-27 11:27:41 +02:00
tguens
9e7bc51b95
Update run_squad.py
...
Indentation change so that the output "nbest_predictions.json" is not empty.
2019-05-22 17:27:59 +08:00
samuelbroscheit
94247ad6cb
Make num_train_optimization_steps int
2019-05-13 12:38:22 +02:00
samuel.broscheit
49a77ac16f
Clean up a little bit
2019-05-12 00:31:10 +02:00
samuel.broscheit
3bf3f9596f
Fixing the issues reported in https://github.com/huggingface/pytorch-pretrained-BERT/issues/556
...
The reason for the issue was that optimization steps were computed from the example count, which differs from the actual size of the dataloader when an example is chunked into multiple instances.
The solution in this pull request is to compute num_optimization_steps directly from len(data_loader).
2019-05-12 00:13:45 +02:00
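A minimal sketch of the fix described above: derive the optimization-step count from the dataloader (the actual number of batches) rather than from the raw example count, which diverges once one example is chunked into several training instances. The division by world_size for the distributed case is an assumption mirroring the examples' convention:

```python
def num_train_optimization_steps(train_dataloader, num_train_epochs,
                                 gradient_accumulation_steps, world_size=1):
    """Compute t_total from len(dataloader), i.e. the real number of batches."""
    steps_per_epoch = len(train_dataloader) // gradient_accumulation_steps
    return int(steps_per_epoch * num_train_epochs) // world_size
```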
burcturkoglu
00c7fd2b79
The division of global_step by num_train_optimization_steps in lr_this_step is removed.
2019-05-09 10:57:03 +03:00
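For background on this lr_this_step change (it appears twice in this range): in the manual learning-rate update these example scripts use when the optimizer does not schedule the rate itself, the raw step should be converted to a training-progress fraction exactly once, so an extra division of global_step by the total step count scales the rate down twice. A minimal sketch with `warmup_linear_lr` as a local stand-in, not the repository's schedule object:

```python
def warmup_linear_lr(base_lr, global_step, t_total, warmup_proportion=0.1):
    """Stand-in warmup-linear schedule: linear warmup, then linear decay."""
    progress = global_step / t_total          # the single step -> progress conversion
    if progress < warmup_proportion:
        return base_lr * progress / warmup_proportion
    return base_lr * max(1.0 - progress, 0.0)

# Illustrative use inside a training loop:
# for param_group in optimizer.param_groups:
#     param_group["lr"] = warmup_linear_lr(args.learning_rate, global_step, t_total)
```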
burcturkoglu
fa37b4da77
Merge branch 'master' of https://github.com/huggingface/pytorch-pretrained-BERT
2019-05-09 10:55:24 +03:00
burcturkoglu
5289b4b9e0
The division of global_step by num_train_optimization_steps in lr_this_step is removed.
2019-05-09 10:51:38 +03:00
Thomas Wolf
0198399d84
Merge pull request #570 from MottoX/fix-1
...
Create optimizer only when args.do_train is True
2019-05-08 16:07:50 +02:00
MottoX
18c8aef9d3
Fix documentation typo
2019-05-02 19:23:36 +08:00
MottoX
74dbba64bc
Prepare optimizer only when args.do_train is True
2019-05-02 19:09:29 +08:00
Aneesh Pappu
365fb34c6c
Small fix to remove shifting of lm labels during preprocessing of ROC Stories, as this shifting happens internally in the model
2019-04-30 13:53:04 -07:00
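For background on the commit above: the GPT LM head applies the next-token shift when it computes its loss, so the ROC Stories preprocessing can pass labels aligned one-to-one with the input tokens. A sketch of that internal shift, with shapes and the -1 ignore index chosen only for illustration:

```python
import torch
import torch.nn.functional as F

def lm_loss_with_internal_shift(lm_logits, lm_labels, ignore_index=-1):
    """Align position t's logits with the token at position t+1 inside the loss,
    so callers do not pre-shift labels during preprocessing."""
    shift_logits = lm_logits[..., :-1, :].contiguous()
    shift_labels = lm_labels[..., 1:].contiguous()
    return F.cross_entropy(shift_logits.view(-1, shift_logits.size(-1)),
                           shift_labels.view(-1), ignore_index=ignore_index)

# Labels are just the (masked) input ids, with no manual off-by-one shift applied.
logits = torch.randn(2, 6, 100)
labels = torch.randint(0, 100, (2, 6))
print(lm_loss_with_internal_shift(logits, labels))
```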
Thomas Wolf
2dee86319d
Merge pull request #527 from Mathieu-Prouveur/fix_value_training_loss
...
Update example files so that tr_loss is not affected by args.gradient…
2019-04-30 11:12:55 +02:00
Mathieu Prouveur
87b9ec3843
Fix tr_loss rescaling factor using global_step
2019-04-29 12:58:29 +02:00
Mathieu Prouveur
ed8fad7390
Update example files so that tr_loss is not affected by args.gradient_accumulation_step
2019-04-24 14:07:00 +02:00
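A small numeric sketch of the reporting fix in the two commits above: each micro-batch contributes loss / gradient_accumulation_steps to tr_loss, so averaging over optimizer steps (global_step) rather than micro-batches keeps the displayed value comparable across accumulation settings. The helper name is illustrative:

```python
def displayed_training_loss(batch_losses, gradient_accumulation_steps):
    """Accumulate scaled losses and average over optimizer steps."""
    tr_loss, global_step = 0.0, 0
    for i, batch_loss in enumerate(batch_losses):
        tr_loss += batch_loss / gradient_accumulation_steps   # scaled before backward()
        if (i + 1) % gradient_accumulation_steps == 0:
            global_step += 1                                   # one optimizer update
    return tr_loss / max(global_step, 1)

# Same batches, different accumulation settings, same displayed loss:
print(displayed_training_loss([2.0, 4.0, 6.0, 8.0], 1))   # 5.0
print(displayed_training_loss([2.0, 4.0, 6.0, 8.0], 2))   # 5.0
```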
thomwolf
d94c6b0144
fix training schedules in examples to match new API
2019-04-23 11:17:06 +02:00
Thomas Wolf
c36cca075a
Merge pull request #515 from Rocketknight1/master
...
Fix --reduce_memory in finetune_on_pregenerated
2019-04-23 10:30:23 +02:00
Matthew Carrigan
b8e2a9c584
Made --reduce_memory actually do something in finetune_on_pregenerated
2019-04-22 14:01:48 +01:00
Sangwhan Moon
14b1f719f4
Fix indentation weirdness in GPT-2 example.
2019-04-22 02:20:22 +09:00
Thomas Wolf
8407429d74
Merge pull request #494 from SudoSharma/patch-1
...
Fix indentation for unconditional generation
2019-04-17 11:11:36 +02:00
Ben Mann
87677fcc4d
[run_gpt2.py] temperature should be a float, not int
2019-04-16 15:23:21 -07:00
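The fix above concerns the command-line type of --temperature: declared as an int, a value like 0.7 is rejected before it ever reaches the sampling code. A minimal sketch of the corrected flag and of how the temperature enters sampling; the fake logits are for illustration only:

```python
import argparse
import torch

parser = argparse.ArgumentParser()
parser.add_argument("--temperature", type=float, default=1.0)   # float, not int
args = parser.parse_args(["--temperature", "0.7"])

logits = torch.randn(1, 50257)                       # stand-in next-token logits
probs = torch.softmax(logits / args.temperature, dim=-1)
next_token = torch.multinomial(probs, num_samples=1)
print(next_token.item())
```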
Abhi Sharma
07154dadb4
Fix indentation for unconditional generation
2019-04-16 11:11:49 -07:00
Thomas Wolf
3d78e226e6
Merge pull request #489 from huggingface/tokenization_serialization
...
Better serialization for Tokenizers and Configuration classes - Also fix #466
2019-04-16 08:49:54 +02:00
thomwolf
3571187ef6
fix saving models in distributed setting examples
2019-04-15 16:43:56 +02:00
thomwolf
2499b0a5fc
add ptvsd to run_squad
2019-04-15 15:33:04 +02:00
thomwolf
7816f7921f
clean up distributed training logging in run_squad example
2019-04-15 15:27:10 +02:00
thomwolf
1135f2384a
clean up logger in examples for distributed case
2019-04-15 15:22:40 +02:00
thomwolf
60ea6c59d2
added best practices for serialization in README and examples
2019-04-15 15:00:33 +02:00
thomwolf
179a2c2ff6
update example to work with new serialization semantic
2019-04-15 14:33:23 +02:00
thomwolf
3e65f255dc
add serialization semantics to tokenizers - fix transfo-xl tokenizer
2019-04-15 11:47:25 +02:00
Thomas Wolf
aff44f0c08
Merge branch 'master' into master
2019-04-15 10:58:34 +02:00
Thomas Wolf
bb61b747df
Merge pull request #474 from jiesutd/master
...
Fix tsv read error in Windows
2019-04-15 10:56:48 +02:00
Matthew Carrigan
dbbd6c7500
Replaced some randints with cleaner randranges, and added a helpful
...
error for users whose corpus is just one giant document.
2019-04-12 15:07:58 +01:00
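On the randint-to-randrange cleanup above: randint's upper bound is inclusive, so excluding it needs an explicit `- 1`, while randrange uses the usual half-open interval. A small sketch of the equivalence:

```python
import random

rng = random.Random(0)
n = 10
i_inclusive = rng.randint(0, n - 1)   # inclusive upper bound needs the explicit - 1
i_half_open = rng.randrange(n)        # half-open interval, same range of values
assert 0 <= i_inclusive < n
assert 0 <= i_half_open < n
```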
Thomas Wolf
616743330e
Merge pull request #462 from 8enmann/master
...
fix run_gpt2.py
2019-04-11 21:54:46 +02:00
Thomas Wolf
2cdfb8b254
Merge pull request #467 from yaroslavvb/patch-2
...
Update README.md
2019-04-11 21:53:23 +02:00
Jie Yang
c49ce3c722
fix tsv read error in Windows
2019-04-11 15:40:19 -04:00
thomwolf
4bc4c69af9
finetuning any BERT model - fixes #455
2019-04-11 16:57:59 +02:00
Yaroslav Bulatov
8fffba5f47
Update README.md
...
Fix for
```
04/09/2019 21:39:38 - INFO - __main__ - device: cuda n_gpu: 1, distributed training: False, 16-bits training: False
Traceback (most recent call last):
File "/home/ubuntu/pytorch-pretrained-BERT/examples/lm_finetuning/simple_lm_finetuning.py", line 642, in <module>
main()
File "/home/ubuntu/pytorch-pretrained-BERT/examples/lm_finetuning/simple_lm_finetuning.py", line 502, in main
raise ValueError("Training is currently the only implemented execution option. Please set `do_train`.")
ValueError: Training is currently the only implemented execution option. Please set `do_train`.
```
2019-04-09 14:45:47 -07:00
Benjamin Mann
fd8a3556f0
fix run_gpt2.py
2019-04-08 17:20:35 -07:00
Dhanajit Brahma
6c4c7be282
Merge remote-tracking branch 'upstream/master'
2019-04-07 16:59:36 +05:30
Dhanajit Brahma
4d3cf0d602
removing some redundant lines
2019-04-07 16:59:07 +05:30
Thomas Wolf
9ca25ce828
Merge pull request #427 from jeonsworld/patch-1
...
fix sample_doc
2019-04-03 11:26:58 +02:00
thomwolf
846b1fd6f8
Fix #419
2019-04-03 10:50:38 +02:00
Thomas Wolf
2f80dbbc0d
Merge pull request #430 from MottoX/master
...
Fix typo in example code
2019-04-02 10:41:56 +02:00
Mike Arpaia
8b5c63e4de
Fixes to the TensorFlow conversion tool
2019-04-01 13:17:54 -06:00
Weixin Wang
d07db28f52
Fix typo in example code
...
Modify 'unambigiously' to 'unambiguously'
2019-03-31 01:20:18 +08:00
jeonsworld
60005f464d
Update pregenerate_training_data.py
...
If randint returns the value rand_end itself, searchsorted returns a sampled_doc_index equal to current_idx.
Example:
cumsum_max = 30
doc_cumsum = [5, 7, 11, 19, 30]
doc_lengths = [5, 2, 4, 8, 11]
if current_idx = 1, then
rand_start = 7
rand_end = 35
sentence_index = randint(7, 35) % cumsum_max
If randint returns 35, sentence_index becomes 5.
If sentence_index is 5, np.searchsorted returns 1, which equals current_idx.
2019-03-30 14:50:17 +09:00
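A sketch of the off-by-one the commit message above walks through, using its own numbers: when the inclusive upper bound rand_end can itself be drawn, the modulo wraps the sample back onto the current document, and excluding that endpoint avoids it. The `side="right"` searchsorted call is inferred from the message, and the surrounding names are illustrative rather than the exact merged code:

```python
import numpy as np
from random import randint

doc_lengths = [5, 2, 4, 8, 11]               # values from the commit message
doc_cumsum = np.cumsum(doc_lengths)          # [ 5  7 11 19 30]
cumsum_max = int(doc_cumsum[-1])             # 30
current_idx = 1

rand_start = int(doc_cumsum[current_idx])                         # 7
rand_end = rand_start + cumsum_max - doc_lengths[current_idx]     # 35
# Excluding the upper endpoint keeps the modulo from landing back inside
# the current document (35 % 30 == 5 would map to current_idx).
sentence_index = randint(rand_start, rand_end - 1) % cumsum_max
sampled_doc_index = int(np.searchsorted(doc_cumsum, sentence_index, side="right"))
assert sampled_doc_index != current_idx
```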
dhanajitb
f872eb98c2
making unconditional generation work
...
The unconditional generation works now but if the seed is fixed, the sample is the same every time.
n_samples > 1 will give different samples though.
I am giving the start token as '<|endoftext|>' for the unconditional generation.
2019-03-28 22:46:15 +05:30
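A minimal sketch of unconditional sampling as described above, seeding the context with the '<|endoftext|>' token. It uses the old pytorch_pretrained_bert GPT-2 classes; the sample length and temperature are arbitrary choices for the example:

```python
import torch
from pytorch_pretrained_bert import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

# Start from the end-of-text token instead of user-provided text.
context = torch.tensor([[tokenizer.encoder["<|endoftext|>"]]])
with torch.no_grad():
    for _ in range(40):
        logits, _ = model(context)                       # (lm_logits, presents)
        probs = torch.softmax(logits[:, -1, :] / 0.9, dim=-1)
        next_token = torch.multinomial(probs, num_samples=1)
        context = torch.cat([context, next_token], dim=1)
print(tokenizer.decode(context[0].tolist()))
```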
Thomas Wolf
694e2117f3
Merge pull request #388 from ananyahjha93/master
...
Added remaining GLUE tasks to 'run_classifier.py'
2019-03-28 09:06:53 +01:00
Thomas Wolf
cc8c2d2332
Merge pull request #396 from IndexFziQ/IndexFziQ
...
add tqdm to the process of eval in examples/run_swag.py
2019-03-27 12:03:26 +01:00
thomwolf
361aff6de5
typos
2019-03-27 11:54:59 +01:00
thomwolf
cea8ba1d59
adjusted formatting and some wording in the readme
2019-03-27 11:53:44 +01:00
Matthew Carrigan
24e67fbf75
Minor README update
2019-03-25 12:33:30 +00:00
Matthew Carrigan
8d1d1ffde2
Corrected the displayed loss when gradient_accumulation_steps > 1
2019-03-25 12:15:19 +00:00
Matthew Carrigan
abb7d1ff6d
Added proper context management to ensure cleanup happens in the right
...
order.
2019-03-21 17:50:03 +00:00
Matthew Carrigan
06a30cfdf3
Added a --reduce_memory option to the training script to keep training
...
data on disc as a memmap rather than in memory
2019-03-21 17:04:12 +00:00
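A minimal sketch of the --reduce_memory idea in this group of commits: keep the pregenerated token ids in an on-disk numpy memmap instead of a Python list in RAM. File name, dtype and shapes are assumptions for the example:

```python
import numpy as np

num_instances, seq_len = 100_000, 128
input_ids = np.memmap("train_input_ids.memmap", mode="w+",
                      dtype=np.int32, shape=(num_instances, seq_len))
input_ids[0, :5] = [101, 7592, 2088, 102, 0]   # write one encoded instance
input_ids.flush()                               # persist to disk

# The finetuning script can later reopen it read-only, paging from disk as needed.
reopened = np.memmap("train_input_ids.memmap", mode="r",
                     dtype=np.int32, shape=(num_instances, seq_len))
print(reopened[0, :5])
```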
Matthew Carrigan
7d1ae644ef
Added a --reduce_memory option to the training script to keep training
...
data on disc as a memmap rather than in memory
2019-03-21 17:02:18 +00:00
Matthew Carrigan
2bba7f810e
Added a --reduce_memory option to shelve docs to disc instead of keeping them in memory.
2019-03-21 16:50:16 +00:00
Matthew Carrigan
8733ffcb5e
Removing a couple of other old unnecessary comments
2019-03-21 14:09:57 +00:00
Matthew Carrigan
8a861048dd
Fixed up the notes on a possible future low-memory path
2019-03-21 14:08:39 +00:00
Matthew Carrigan
a8a577ba93
Reduced memory usage for pregenerating the data a lot by writing it
...
out on the fly without shuffling - the Sampler in the finetuning script
will shuffle for us.
2019-03-21 14:05:52 +00:00
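On the commit above (and its duplicate below): instances can be written out in document order because shuffling happens at load time. A small sketch of relying on the DataLoader's RandomSampler, with a toy dataset standing in for the real features:

```python
import torch
from torch.utils.data import DataLoader, RandomSampler, TensorDataset

dataset = TensorDataset(torch.arange(8).unsqueeze(1))    # stand-in for real features
loader = DataLoader(dataset, sampler=RandomSampler(dataset), batch_size=4)
for (batch,) in loader:
    print(batch.squeeze(1).tolist())                     # batches arrive shuffled
```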
Matthew Carrigan
0ae59e662d
Reduced memory usage for pregenerating the data a lot by writing it
...
out on the fly without shuffling - the Sampler in the finetuning script
will shuffle for us.
2019-03-21 14:04:17 +00:00
Matthew Carrigan
6a9038ba53
Removed an old irrelevant comment
2019-03-21 13:36:41 +00:00
Yuqiang Xie
77944d1b31
add tqdm to the process of eval
...
Maybe better.
2019-03-21 20:59:33 +08:00
Matthew Carrigan
29a392fbcf
Small README changes
2019-03-20 17:35:17 +00:00
Matthew Carrigan
832b2b0058
Adding README
2019-03-20 17:31:49 +00:00
Matthew Carrigan
934d3f4d2f
Syncing up argument names between the scripts
2019-03-20 17:23:23 +00:00
Matthew Carrigan
f19ba35b2b
Move old finetuning script into the new folder
2019-03-20 16:47:06 +00:00