Commit Graph

319 Commits

Author SHA1 Message Date
thomwolf
a40955f071 no need to duplicate models anymore 2019-06-18 11:46:14 +02:00
thomwolf
382e2d1e50 splitting config and weight files for bert also 2019-06-18 10:37:16 +02:00
Thomas Wolf
cad88e19de
Merge pull request #672 from oliverguhr/master
Add vocabulary and model config to the finetune output
2019-06-14 17:02:47 +02:00
Thomas Wolf
460d9afd45
Merge pull request #640 from Barqawiz/master
Support latest multi language bert fine tune
2019-06-14 16:57:02 +02:00
Thomas Wolf
277c77f1c5
Merge pull request #630 from tguens/master
Update run_squad.py
2019-06-14 16:56:26 +02:00
Thomas Wolf
659af2cbd0
Merge pull request #604 from samuelbroscheit/master
Fixing issue "Training beyond specified 't_total' steps with schedule 'warmup_linear'" reported in #556
2019-06-14 16:49:24 +02:00
Meet Pragnesh Shah
e02ce4dc79
[hotfix] Fix frozen pooler parameters in SWAG example. 2019-06-11 15:13:53 -07:00
Oliver Guhr
5c08c8c273 adds the tokenizer + model config to the output 2019-06-11 13:46:33 +02:00
jeonsworld
a3a604cefb
Update pregenerate_training_data.py
apply Whole Word Masking technique.
referred to [create_pretraining_data.py](https://github.com/google-research/bert/blob/master/create_pretraining_data.py)
2019-06-10 12:17:23 +09:00
Ahmad Barqawi
c4fe56dcc0 support latest multi language bert fine tune
fix issue of bert-base-multilingual and add support for uncased multilingual
2019-05-27 11:27:41 +02:00
tguens
9e7bc51b95
Update run_squad.py
Indentation change so that the output "nbest_predictions.json" is not empty.
2019-05-22 17:27:59 +08:00
samuelbroscheit
94247ad6cb Make num_train_optimization_steps int 2019-05-13 12:38:22 +02:00
samuel.broscheit
49a77ac16f Clean up a little bit 2019-05-12 00:31:10 +02:00
samuel.broscheit
3bf3f9596f Fixing the issues reported in https://github.com/huggingface/pytorch-pretrained-BERT/issues/556
The issue arose because the optimization steps were computed from the number of examples, which differs from the actual size of the dataloader when an example is chunked into multiple instances.

Solution in this pull request is to compute num_optimization_steps directly from len(data_loader).
2019-05-12 00:13:45 +02:00
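The mismatch described in the commit message above can be illustrated with a small sketch. All numbers and variable names here are hypothetical and are not taken from the repository; the ceiling division stands in for `len(data_loader)` under the assumption that the last partial batch is kept.

```python
import math

# Hypothetical data: how many instances each example is chunked into.
chunks_per_example = [1, 3, 2, 1, 4, 1]
batch_size = 2
gradient_accumulation_steps = 1
num_epochs = 3

num_examples = len(chunks_per_example)    # what the old computation counted
num_instances = sum(chunks_per_example)   # what the dataloader actually iterates over

# Step count derived from the number of examples (underestimates when examples are chunked).
steps_from_examples = num_examples // batch_size // gradient_accumulation_steps * num_epochs

# Step count derived from the number of batches, i.e. what len(data_loader) would report.
batches_per_epoch = math.ceil(num_instances / batch_size)
steps_from_loader = batches_per_epoch // gradient_accumulation_steps * num_epochs

print(steps_from_examples, steps_from_loader)
```

With these made-up values the example-based count is half the loader-based count, so a schedule sized from the former would reach its total step count while training is still running.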
burcturkoglu
00c7fd2b79 Division to num_train_optimizer of global_step in lr_this_step is removed. 2019-05-09 10:57:03 +03:00
burcturkoglu
fa37b4da77 Merge branch 'master' of https://github.com/huggingface/pytorch-pretrained-BERT 2019-05-09 10:55:24 +03:00
burcturkoglu
5289b4b9e0 Division to num_train_optimizer of global_step in lr_this_step is removed. 2019-05-09 10:51:38 +03:00
Thomas Wolf
0198399d84
Merge pull request #570 from MottoX/fix-1
Create optimizer only when args.do_train is True
2019-05-08 16:07:50 +02:00
MottoX
18c8aef9d3 Fix documentation typo 2019-05-02 19:23:36 +08:00
MottoX
74dbba64bc Prepare optimizer only when args.do_train is True 2019-05-02 19:09:29 +08:00
Aneesh Pappu
365fb34c6c small fix to remove shifting of lm labels during pre process of roc stories, as this shifting happens internally in the model 2019-04-30 13:53:04 -07:00
Thomas Wolf
2dee86319d
Merge pull request #527 from Mathieu-Prouveur/fix_value_training_loss
Update example files so that tr_loss is not affected by args.gradient…
2019-04-30 11:12:55 +02:00
Mathieu Prouveur
87b9ec3843 Fix tr_loss rescaling factor using global_step 2019-04-29 12:58:29 +02:00
Mathieu Prouveur
ed8fad7390 Update example files so that tr_loss is not affected by args.gradient_accumulation_step 2019-04-24 14:07:00 +02:00
thomwolf
d94c6b0144 fix training schedules in examples to match new API 2019-04-23 11:17:06 +02:00
Thomas Wolf
c36cca075a
Merge pull request #515 from Rocketknight1/master
Fix --reduce_memory in finetune_on_pregenerated
2019-04-23 10:30:23 +02:00
Matthew Carrigan
b8e2a9c584 Made --reduce_memory actually do something in finetune_on_pregenerated 2019-04-22 14:01:48 +01:00
Sangwhan Moon
14b1f719f4 Fix indentation weirdness in GPT-2 example. 2019-04-22 02:20:22 +09:00
Thomas Wolf
8407429d74
Merge pull request #494 from SudoSharma/patch-1
Fix indentation for unconditional generation
2019-04-17 11:11:36 +02:00
Ben Mann
87677fcc4d
[run_gpt2.py] temperature should be a float, not int 2019-04-16 15:23:21 -07:00
Abhi Sharma
07154dadb4
Fix indentation for unconditional generation 2019-04-16 11:11:49 -07:00
Thomas Wolf
3d78e226e6
Merge pull request #489 from huggingface/tokenization_serialization
Better serialization for Tokenizers and Configuration classes - Also fix #466
2019-04-16 08:49:54 +02:00
thomwolf
3571187ef6 fix saving models in distributed setting examples 2019-04-15 16:43:56 +02:00
thomwolf
2499b0a5fc add ptvsd to run_squad 2019-04-15 15:33:04 +02:00
thomwolf
7816f7921f clean up distributed training logging in run_squad example 2019-04-15 15:27:10 +02:00
thomwolf
1135f2384a clean up logger in examples for distributed case 2019-04-15 15:22:40 +02:00
thomwolf
60ea6c59d2 added best practices for serialization in README and examples 2019-04-15 15:00:33 +02:00
thomwolf
179a2c2ff6 update example to work with new serialization semantic 2019-04-15 14:33:23 +02:00
thomwolf
3e65f255dc add serialization semantics to tokenizers - fix transfo-xl tokenizer 2019-04-15 11:47:25 +02:00
Thomas Wolf
aff44f0c08
Merge branch 'master' into master 2019-04-15 10:58:34 +02:00
Thomas Wolf
bb61b747df
Merge pull request #474 from jiesutd/master
Fix tsv read error in Windows
2019-04-15 10:56:48 +02:00
Matthew Carrigan
dbbd6c7500 Replaced some randints with cleaner randranges, and added a helpful
error for users whose corpus is just one giant document.
2019-04-12 15:07:58 +01:00
Thomas Wolf
616743330e
Merge pull request #462 from 8enmann/master
fix run_gpt2.py
2019-04-11 21:54:46 +02:00
Thomas Wolf
2cdfb8b254
Merge pull request #467 from yaroslavvb/patch-2
Update README.md
2019-04-11 21:53:23 +02:00
Jie Yang
c49ce3c722 fix tsv read error in Windows 2019-04-11 15:40:19 -04:00
thomwolf
4bc4c69af9 finetuning any BERT model - fixes #455 2019-04-11 16:57:59 +02:00
Yaroslav Bulatov
8fffba5f47
Update README.md
Fix for

```
04/09/2019 21:39:38 - INFO - __main__ -   device: cuda n_gpu: 1, distributed training: False, 16-bits training: False
Traceback (most recent call last):
  File "/home/ubuntu/pytorch-pretrained-BERT/examples/lm_finetuning/simple_lm_finetuning.py", line 642, in <module>
    main()
  File "/home/ubuntu/pytorch-pretrained-BERT/examples/lm_finetuning/simple_lm_finetuning.py", line 502, in main
    raise ValueError("Training is currently the only implemented execution option. Please set `do_train`.")
ValueError: Training is currently the only implemented execution option. Please set `do_train`.
```
2019-04-09 14:45:47 -07:00
Benjamin Mann
fd8a3556f0 fix run_gpt2.py 2019-04-08 17:20:35 -07:00
Dhanajit Brahma
6c4c7be282 Merge remote-tracking branch 'upstream/master' 2019-04-07 16:59:36 +05:30
Dhanajit Brahma
4d3cf0d602 removing some redundant lines 2019-04-07 16:59:07 +05:30
Thomas Wolf
9ca25ce828
Merge pull request #427 from jeonsworld/patch-1
fix sample_doc
2019-04-03 11:26:58 +02:00
thomwolf
846b1fd6f8 Fix #419 2019-04-03 10:50:38 +02:00
Thomas Wolf
2f80dbbc0d
Merge pull request #430 from MottoX/master
Fix typo in example code
2019-04-02 10:41:56 +02:00
Mike Arpaia
8b5c63e4de Fixes to the TensorFlow conversion tool 2019-04-01 13:17:54 -06:00
Weixin Wang
d07db28f52
Fix typo in example code
Modify 'unambigiously' to 'unambiguously'
2019-03-31 01:20:18 +08:00
jeonsworld
60005f464d
Update pregenerate_training_data.py
If the value of rand_end is returned from the randint function, the value of sampled_doc_index that matches current_idx is returned from searchsorted.

example:
cumsum_max = {int64} 30
doc_cumsum = {ndarray} [ 5  7 11 19 30]
doc_lengths = {list} <class 'list'>: [5, 2, 4, 8, 11]
if current_idx  = 1,
rand_start = 7
rand_end = 35
sentence_index = randint(7, 35) % cumsum_max
if randint return 35, sentence_index becomes 5.
if sentence_index is 5, np.searchsorted returns 1 equal to current_index.
2019-03-30 14:50:17 +09:00
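The off-by-one in the commit message above can be replayed with a minimal sketch. It reuses the numbers from the message; `bisect_right` from the standard library stands in for `np.searchsorted`, on the assumption that the script searches with `side='right'` (the setting consistent with the result quoted in the message), and the helper name `sampled_doc` is hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate

# Values taken from the commit message.
doc_lengths = [5, 2, 4, 8, 11]
doc_cumsum = list(accumulate(doc_lengths))  # [5, 7, 11, 19, 30]
cumsum_max = doc_cumsum[-1]                 # 30
current_idx = 1
rand_start = 7
rand_end = 35


def sampled_doc(draw):
    """Map a random draw to a document index (hypothetical helper)."""
    sentence_index = draw % cumsum_max
    # Assumption: equivalent to np.searchsorted(doc_cumsum, sentence_index, side='right').
    return bisect_right(doc_cumsum, sentence_index)


# randint(a, b) includes b, so the draw can be any value in [rand_start, rand_end].
inclusive_hits = [d for d in range(rand_start, rand_end + 1)
                  if sampled_doc(d) == current_idx]

# A half-open range [rand_start, rand_end) never includes rand_end.
exclusive_hits = [d for d in range(rand_start, rand_end)
                  if sampled_doc(d) == current_idx]

print(inclusive_hits, exclusive_hits)
```

Only the inclusive upper bound maps back to the current document. Drawing from a half-open range, as `random.randrange` does, excludes that value.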
dhanajitb
f872eb98c2
making unconditional generation work
The unconditional generation works now but if the seed is fixed, the sample is the same every time.
n_samples > 1 will give different samples though.
I am giving the start token as '<|endoftext|>' for the unconditional generation.
2019-03-28 22:46:15 +05:30
Thomas Wolf
694e2117f3
Merge pull request #388 from ananyahjha93/master
Added remaining GLUE tasks to 'run_classifier.py'
2019-03-28 09:06:53 +01:00
Thomas Wolf
cc8c2d2332
Merge pull request #396 from IndexFziQ/IndexFziQ
add tqdm to the process of eval in examples/run_swag.py
2019-03-27 12:03:26 +01:00
thomwolf
361aff6de5 typos 2019-03-27 11:54:59 +01:00
thomwolf
cea8ba1d59 adjusted formatting and some wording in the readme 2019-03-27 11:53:44 +01:00
Matthew Carrigan
24e67fbf75 Minor README update 2019-03-25 12:33:30 +00:00
Matthew Carrigan
8d1d1ffde2 Corrected the displayed loss when gradient_accumulation_steps > 1 2019-03-25 12:15:19 +00:00
Matthew Carrigan
abb7d1ff6d Added proper context management to ensure cleanup happens in the right
order.
2019-03-21 17:50:03 +00:00
Matthew Carrigan
06a30cfdf3 Added a --reduce_memory option to the training script to keep training
data on disc as a memmap rather than in memory
2019-03-21 17:04:12 +00:00
Matthew Carrigan
7d1ae644ef Added a --reduce_memory option to the training script to keep training
data on disc as a memmap rather than in memory
2019-03-21 17:02:18 +00:00
Matthew Carrigan
2bba7f810e Added a --reduce_memory option to shelve docs to disc instead of keeping them in memory. 2019-03-21 16:50:16 +00:00
Matthew Carrigan
8733ffcb5e Removing a couple of other old unnecessary comments 2019-03-21 14:09:57 +00:00
Matthew Carrigan
8a861048dd Fixed up the notes on a possible future low-memory path 2019-03-21 14:08:39 +00:00
Matthew Carrigan
a8a577ba93 Reduced memory usage for pregenerating the data a lot by writing it
out on the fly without shuffling - the Sampler in the finetuning script
will shuffle for us.
2019-03-21 14:05:52 +00:00
Matthew Carrigan
0ae59e662d Reduced memory usage for pregenerating the data a lot by writing it
out on the fly without shuffling - the Sampler in the finetuning script
will shuffle for us.
2019-03-21 14:04:17 +00:00
Matthew Carrigan
6a9038ba53 Removed an old irrelevant comment 2019-03-21 13:36:41 +00:00
Yuqiang Xie
77944d1b31
add tqdm to the process of eval
Maybe better.
2019-03-21 20:59:33 +08:00
Matthew Carrigan
29a392fbcf Small README changes 2019-03-20 17:35:17 +00:00
Matthew Carrigan
832b2b0058 Adding README 2019-03-20 17:31:49 +00:00
Matthew Carrigan
934d3f4d2f Syncing up argument names between the scripts 2019-03-20 17:23:23 +00:00
Matthew Carrigan
f19ba35b2b Move old finetuning script into the new folder 2019-03-20 16:47:06 +00:00
Matthew Carrigan
7de5c6aa5e PEP8 and formatting cleanups 2019-03-20 16:44:04 +00:00
Matthew Carrigan
1798e98e5a Added final TODOs 2019-03-20 16:42:37 +00:00
Matthew Carrigan
c64c2fc4c2 Fixed embarrassing indentation problem 2019-03-20 15:42:57 +00:00
Matthew Carrigan
0540d360f2 Fixed logging 2019-03-20 15:36:51 +00:00
Matthew Carrigan
976554a472 First commit of the new LM finetuning 2019-03-20 14:23:51 +00:00
Ananya Harsh Jha
e5b63fb542 Merge branch 'master' of https://github.com/ananyahjha93/pytorch-pretrained-BERT
pull current master to local
2019-03-17 08:30:13 -04:00
Ananya Harsh Jha
8a4e90ff40 corrected folder creation error for MNLI-MM, verified GLUE results 2019-03-17 08:16:50 -04:00
Ananya Harsh Jha
e0bf01d9a9 added hack for mismatched MNLI 2019-03-16 14:10:48 -04:00
Ananya Harsh Jha
4c721c6b6a added eval time metrics for GLUE tasks 2019-03-15 23:21:24 -04:00
tseretelitornike
83857ffeaa
Added missing imports. 2019-03-15 12:45:48 +01:00
Yongbo Wang
d1e4fa98a9
typo in annotation
modify `heruistic` to `heuristic` in line 660, `charcter` to `character` in line 661.
2019-03-14 17:32:15 +08:00
Yongbo Wang
3d6452163d
typo
modify `mull` to `null` in line 474 annotation.
2019-03-14 17:03:38 +08:00
thomwolf
a98dfe4ced fixing #377 (empty nbest_predictions.json) 2019-03-14 09:57:06 +01:00
Ananya Harsh Jha
043c8781ef added code for all glue task processors 2019-03-14 04:24:04 -04:00
Yongbo Wang
22a465a91f
Simplify code, delete redundancy line
delete redundancy line `if args.train`, simplify code.
2019-03-13 09:42:06 +08:00
Elon Musk
66d8206809
Update run_gpt2.py 2019-03-08 11:59:08 -05:00
thomwolf
7cc35c3104 fix openai gpt example and updating readme 2019-03-06 11:43:21 +01:00
thomwolf
994d86609b fixing PYTORCH_PRETRAINED_BERT_CACHE use in examples 2019-03-06 10:21:24 +01:00
thomwolf
5c85fc3977 fix typo - logger info 2019-03-06 10:05:21 +01:00
Thomas Wolf
8e36da7acb
Merge pull request #347 from jplehmann/feature/sst2-processor
Processor for SST-2 task
2019-03-06 09:48:27 +01:00
Thomas Wolf
3c01dfb775
Merge pull request #338 from CatalinVoss/patch-3
Fix top k generation for k != 0
2019-03-06 09:47:33 +01:00
John Lehmann
0f96d4b1f7 Run classifier processor for SST-2. 2019-03-05 13:38:28 -06:00
Catalin Voss
4b4b079272
Fix top k generation for k != 0 2019-03-02 21:54:44 -08:00
Catalin Voss
c0cf0a04d5
Fix typo 2019-02-27 18:01:06 -08:00
Ben Johnson
8607233679
Update run_openai_gpt.py 2019-02-20 13:58:54 -05:00
thomwolf
0202da0271 remove unnecessary example 2019-02-18 13:51:42 +01:00
thomwolf
690a0dbf36 fix example - masking 2019-02-18 10:50:30 +01:00
thomwolf
fbb248a2e4 examples testing 2019-02-18 01:28:18 +01:00
thomwolf
b65f07d8c0 adding examples 2019-02-18 00:55:33 +01:00
wlhgtc
8efaf8f176
fix 'best_non_null_entry' is None error 2019-02-15 15:57:25 +08:00
Davide Fiocco
65df0d78ed
--do_lower_case is duplicated in parser args
Deleting one repetition (please review!)
2019-02-13 15:30:05 +01:00
Thomas Wolf
03cdb2a390
Merge pull request #254 from huggingface/python_2
Adding OpenAI GPT and Transformer-XL models, compatibility with Python 2
2019-02-11 14:19:26 +01:00
thomwolf
d38caba169 typo in run_squad 2019-02-11 14:10:27 +01:00
thomwolf
af62cc5f20 fix run_squad example 2019-02-11 14:06:32 +01:00
thomwolf
eebc8abbe2 clarify and unify model saving logic in examples 2019-02-11 14:04:19 +01:00
thomwolf
32fea876bb add distant debugging to run_transfo_xl 2019-02-11 12:53:32 +01:00
thomwolf
b31ba23913 cuda on in the examples by default 2019-02-11 12:15:43 +01:00
thomwolf
6cd769957e update transfo xl example 2019-02-09 16:59:17 +01:00
thomwolf
1320e4ec0c mc_token_mask => mc_token_ids 2019-02-09 16:58:53 +01:00
thomwolf
f4a07a392c mems not splitted 2019-02-09 16:14:31 +01:00
thomwolf
43b9af0cac mems initialized to None in run_transfo 2019-02-09 16:12:19 +01:00
thomwolf
b80684b23f fixing run openai gpt example 2019-02-08 22:31:32 +01:00
thomwolf
7b4b0cf966 logging 2019-02-08 11:16:29 +01:00
thomwolf
4bbb9f2d68 log loss - helpers 2019-02-08 11:14:29 +01:00
thomwolf
5d7e845712 fix model on cuda 2019-02-08 11:08:43 +01:00
thomwolf
eccb2f0163 hot fix 2019-02-08 11:05:20 +01:00
thomwolf
5adc20723b add distant debugging 2019-02-08 11:03:59 +01:00
thomwolf
777459b471 run openai example running 2019-02-08 10:33:14 +01:00
thomwolf
6bc082da0a updating examples 2019-02-08 00:02:26 +01:00
thomwolf
e77721e4fe renamed examples 2019-02-07 23:15:15 +01:00
thomwolf
d482e3d79d adding examples for openai and transformer-xl 2019-02-07 17:06:41 +01:00
tholor
9aebc711c9 adjust error message related to args.do_eval 2019-02-07 11:49:38 +01:00
tholor
4a450b25d5 removing unused argument eval_batch_size from LM finetuning #256 2019-02-07 10:06:38 +01:00
Baoyang Song
7ac3311e48
Fix the undefined variable in squad example 2019-02-06 19:36:08 +01:00
thomwolf
ed47cb6cba fixing transfo eval script 2019-02-06 16:22:17 +01:00
Thomas Wolf
848aae49e1
Merge branch 'master' into python_2 2019-02-06 00:13:20 +01:00
thomwolf
448937c00d python 2 compatibility 2019-02-06 00:07:46 +01:00
thomwolf
d609ba24cb resolving merge conflicts 2019-02-05 16:14:25 +01:00
Thomas Wolf
64ce900974
Merge pull request #248 from JoeDumoulin/squad1.1-fix
fix prediction on run-squad.py example
2019-02-05 16:00:51 +01:00
Thomas Wolf
e9e77cd3c4
Merge pull request #218 from matej-svejda/master
Fix learning rate problems in run_classifier.py
2019-02-05 15:40:44 +01:00
thomwolf
1579c53635 more explicit notation: num_train_step => num_train_optimization_steps 2019-02-05 15:36:33 +01:00
joe dumoulin
aa90e0c36a fix prediction on run-squad.py example 2019-02-01 10:15:44 -08:00
Thomas Wolf
8f8bbd4a4c
Merge pull request #244 from deepset-ai/prettify_lm_masking
Avoid confusion of inplace LM masking
2019-02-01 12:17:50 +01:00
tholor
ce75b169bd avoid confusion of inplace masking of tokens_a / tokens_b 2019-01-31 11:42:06 +01:00
Surya Kasturi
9bf528877e
Update run_squad.py 2019-01-30 15:09:31 -05:00
Surya Kasturi
af2b78601b
Update run_squad2.py 2019-01-30 15:08:56 -05:00
Matej Svejda
5169069997 make examples consistent, revert error in num_train_steps calculation 2019-01-30 11:47:25 +01:00
Matej Svejda
9c6a48c8c3 fix learning rate/fp16 and warmup problem for all examples 2019-01-27 14:07:24 +01:00
Matej Svejda
01ff4f82ba learning rate problems in run_classifier.py 2019-01-22 23:40:06 +01:00
liangtaiwan
be9fa192f0 don't save if do not train 2019-01-18 00:41:55 +08:00
thomwolf
a28dfc8659 fix eval for wt103 2019-01-16 11:18:19 +01:00
thomwolf
8831c68803 fixing various parts of model conversion, loading and weights sharing 2019-01-16 10:31:16 +01:00
thomwolf
bcd4aa8fe0 update evaluation example 2019-01-15 23:32:34 +01:00