Commit Graph

211 Commits

Author SHA1 Message Date
Thomas Wolf
659af2cbd0
Merge pull request #604 from samuelbroscheit/master
Fixing issue "Training beyond specified 't_total' steps with schedule 'warmup_linear'" reported in #556
2019-06-14 16:49:24 +02:00
Meet Pragnesh Shah
e02ce4dc79
[hotfix] Fix frozen pooler parameters in SWAG example. 2019-06-11 15:13:53 -07:00
jeonsworld
a3a604cefb
Update pregenerate_training_data.py
Apply the Whole Word Masking technique, referring to [create_pretraining_data.py](https://github.com/google-research/bert/blob/master/create_pretraining_data.py).
2019-06-10 12:17:23 +09:00
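For context, a minimal sketch of the Whole Word Masking idea this commit refers to: WordPiece continuation tokens (prefixed with "##") are grouped with the word they start from, and the whole group is masked together. The function names and masking rate here are illustrative, not the script's actual code.

```python
import random

def whole_word_candidates(tokens):
    """Group WordPiece tokens into whole-word spans (illustrative sketch)."""
    cand_indexes = []
    for i, token in enumerate(tokens):
        # "##" marks a WordPiece continuation, so extend the previous
        # word's span instead of starting a new candidate.
        if cand_indexes and token.startswith("##"):
            cand_indexes[-1].append(i)
        else:
            cand_indexes.append([i])
    return cand_indexes

def mask_whole_words(tokens, mask_prob=0.15):
    tokens = list(tokens)
    for word in whole_word_candidates(tokens):
        if random.random() < mask_prob:
            for i in word:  # mask every sub-token of the word together
                tokens[i] = "[MASK]"
    return tokens

print(mask_whole_words(["the", "un", "##breakable", "vase"], mask_prob=1.0))
```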
samuelbroscheit
94247ad6cb Make num_train_optimization_steps int 2019-05-13 12:38:22 +02:00
samuel.broscheit
49a77ac16f Clean up a little bit 2019-05-12 00:31:10 +02:00
samuel.broscheit
3bf3f9596f Fixing the issues reported in https://github.com/huggingface/pytorch-pretrained-BERT/issues/556
The reason for the issue was that optimization steps were computed from the example size, which differs from the actual size of the dataloader when an example is chunked into multiple instances.

The solution in this pull request is to compute num_optimization_steps directly from len(data_loader).
2019-05-12 00:13:45 +02:00
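A minimal, runnable sketch of the fix described above, with a dummy dataset standing in for the chunked pre-training instances; the point is that the optimization-step count is derived from the dataloader length rather than from the raw example count, so the warmup_linear schedule's t_total matches the steps actually taken:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Dummy data standing in for the chunked pre-training instances.
train_data = TensorDataset(torch.zeros(1000, 128, dtype=torch.long))
train_dataloader = DataLoader(train_data, batch_size=32)

gradient_accumulation_steps = 2
num_train_epochs = 3

# Derive the step count from len(train_dataloader), not the example count.
num_train_optimization_steps = (
    len(train_dataloader) // gradient_accumulation_steps * num_train_epochs
)

# In the example scripts this value is passed to BertAdam as t_total, so the
# warmup_linear schedule ends exactly when training ends (the symptom in #556):
#   BertAdam(params, lr=..., warmup=..., t_total=num_train_optimization_steps)
print(num_train_optimization_steps)
```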
burcturkoglu
00c7fd2b79 Division of global_step by num_train_optimizer in lr_this_step is removed. 2019-05-09 10:57:03 +03:00
burcturkoglu
fa37b4da77 Merge branch 'master' of https://github.com/huggingface/pytorch-pretrained-BERT 2019-05-09 10:55:24 +03:00
burcturkoglu
5289b4b9e0 Division of global_step by num_train_optimizer in lr_this_step is removed. 2019-05-09 10:51:38 +03:00
Thomas Wolf
0198399d84
Merge pull request #570 from MottoX/fix-1
Create optimizer only when args.do_train is True
2019-05-08 16:07:50 +02:00
MottoX
18c8aef9d3 Fix documentation typo 2019-05-02 19:23:36 +08:00
MottoX
74dbba64bc Prepare optimizer only when args.do_train is True 2019-05-02 19:09:29 +08:00
Aneesh Pappu
365fb34c6c small fix to remove shifting of lm labels during preprocessing of ROC Stories, as this shifting happens internally in the model 2019-04-30 13:53:04 -07:00
Thomas Wolf
2dee86319d
Merge pull request #527 from Mathieu-Prouveur/fix_value_training_loss
Update example files so that tr_loss is not affected by args.gradient…
2019-04-30 11:12:55 +02:00
Mathieu Prouveur
87b9ec3843 Fix tr_loss rescaling factor using global_step 2019-04-29 12:58:29 +02:00
Mathieu Prouveur
ed8fad7390 Update example files so that tr_loss is not affected by args.gradient_accumulation_step 2019-04-24 14:07:00 +02:00
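A small, self-contained sketch of the bookkeeping these two commits adjust: under gradient accumulation the loss is scaled only for backward(), while the logged tr_loss stays on its original scale, so the reported training loss does not depend on args.gradient_accumulation_steps. The stand-in model and names are illustrative, not the example scripts' actual code.

```python
import torch

# Toy stand-ins so the fragment runs; in the real scripts these come from the
# BERT example setup (model, optimizer, dataloader, args).
gradient_accumulation_steps = 4
batches = [torch.randn(8, requires_grad=True) for _ in range(8)]
forward = lambda x: (x ** 2).mean()  # stands in for model(...)

tr_loss, nb_tr_steps, global_step = 0.0, 0, 0
for step, batch in enumerate(batches):
    loss = forward(batch)
    tr_loss += loss.item()  # log the *unscaled* loss
    nb_tr_steps += 1
    if gradient_accumulation_steps > 1:
        loss = loss / gradient_accumulation_steps  # scale only for backward()
    loss.backward()
    if (step + 1) % gradient_accumulation_steps == 0:
        global_step += 1  # optimizer.step() / zero_grad() here in the real scripts

print("mean training loss:", tr_loss / nb_tr_steps)  # unaffected by accumulation
```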
thomwolf
d94c6b0144 fix training schedules in examples to match new API 2019-04-23 11:17:06 +02:00
Thomas Wolf
c36cca075a
Merge pull request #515 from Rocketknight1/master
Fix --reduce_memory in finetune_on_pregenerated
2019-04-23 10:30:23 +02:00
Matthew Carrigan
b8e2a9c584 Made --reduce_memory actually do something in finetune_on_pregenerated 2019-04-22 14:01:48 +01:00
Sangwhan Moon
14b1f719f4 Fix indentation weirdness in GPT-2 example. 2019-04-22 02:20:22 +09:00
Thomas Wolf
8407429d74
Merge pull request #494 from SudoSharma/patch-1
Fix indentation for unconditional generation
2019-04-17 11:11:36 +02:00
Ben Mann
87677fcc4d
[run_gpt2.py] temperature should be a float, not int 2019-04-16 15:23:21 -07:00
Abhi Sharma
07154dadb4
Fix indentation for unconditional generation 2019-04-16 11:11:49 -07:00
Thomas Wolf
3d78e226e6
Merge pull request #489 from huggingface/tokenization_serialization
Better serialization for Tokenizers and Configuration classes - Also fix #466
2019-04-16 08:49:54 +02:00
thomwolf
3571187ef6 fix saving models in distributed setting examples 2019-04-15 16:43:56 +02:00
thomwolf
2499b0a5fc add ptvsd to run_squad 2019-04-15 15:33:04 +02:00
thomwolf
7816f7921f clean up distributed training logging in run_squad example 2019-04-15 15:27:10 +02:00
thomwolf
1135f2384a clean up logger in examples for distributed case 2019-04-15 15:22:40 +02:00
thomwolf
60ea6c59d2 added best practices for serialization in README and examples 2019-04-15 15:00:33 +02:00
thomwolf
179a2c2ff6 update example to work with new serialization semantic 2019-04-15 14:33:23 +02:00
thomwolf
3e65f255dc add serialization semantics to tokenizers - fix transfo-xl tokenizer 2019-04-15 11:47:25 +02:00
Thomas Wolf
aff44f0c08
Merge branch 'master' into master 2019-04-15 10:58:34 +02:00
Thomas Wolf
bb61b747df
Merge pull request #474 from jiesutd/master
Fix tsv read error in Windows
2019-04-15 10:56:48 +02:00
Matthew Carrigan
dbbd6c7500 Replaced some randints with cleaner randranges, and added a helpful
error for users whose corpus is just one giant document.
2019-04-12 15:07:58 +01:00
Thomas Wolf
616743330e
Merge pull request #462 from 8enmann/master
fix run_gpt2.py
2019-04-11 21:54:46 +02:00
Thomas Wolf
2cdfb8b254
Merge pull request #467 from yaroslavvb/patch-2
Update README.md
2019-04-11 21:53:23 +02:00
Jie Yang
c49ce3c722 fix tsv read error in Windows 2019-04-11 15:40:19 -04:00
thomwolf
4bc4c69af9 finetuning any BERT model - fixes #455 2019-04-11 16:57:59 +02:00
Yaroslav Bulatov
8fffba5f47
Update README.md
Fix for

```
04/09/2019 21:39:38 - INFO - __main__ -   device: cuda n_gpu: 1, distributed training: False, 16-bits training: False
Traceback (most recent call last):
  File "/home/ubuntu/pytorch-pretrained-BERT/examples/lm_finetuning/simple_lm_finetuning.py", line 642, in <module>
    main()
  File "/home/ubuntu/pytorch-pretrained-BERT/examples/lm_finetuning/simple_lm_finetuning.py", line 502, in main
    raise ValueError("Training is currently the only implemented execution option. Please set `do_train`.")
ValueError: Training is currently the only implemented execution option. Please set `do_train`.
```
2019-04-09 14:45:47 -07:00
Benjamin Mann
fd8a3556f0 fix run_gpt2.py 2019-04-08 17:20:35 -07:00
Dhanajit Brahma
6c4c7be282 Merge remote-tracking branch 'upstream/master' 2019-04-07 16:59:36 +05:30
Dhanajit Brahma
4d3cf0d602 removing some redundant lines 2019-04-07 16:59:07 +05:30
Thomas Wolf
9ca25ce828
Merge pull request #427 from jeonsworld/patch-1
fix sample_doc
2019-04-03 11:26:58 +02:00
thomwolf
846b1fd6f8 Fix #419 2019-04-03 10:50:38 +02:00
Thomas Wolf
2f80dbbc0d
Merge pull request #430 from MottoX/master
Fix typo in example code
2019-04-02 10:41:56 +02:00
Mike Arpaia
8b5c63e4de Fixes to the TensorFlow conversion tool 2019-04-01 13:17:54 -06:00
Weixin Wang
d07db28f52
Fix typo in example code
Modify 'unambigiously' to 'unambiguously'
2019-03-31 01:20:18 +08:00
jeonsworld
60005f464d
Update pregenerate_training_data.py
If the randint function returns the value rand_end itself, searchsorted returns a sampled_doc_index equal to current_idx.

example:
cumsum_max = 30
doc_cumsum = [ 5  7 11 19 30]
doc_lengths = [5, 2, 4, 8, 11]
if current_idx = 1,
rand_start = 7
rand_end = 35
sentence_index = randint(7, 35) % cumsum_max
if randint returns 35, sentence_index becomes 5.
if sentence_index is 5, np.searchsorted returns 1, which equals current_idx.
2019-03-30 14:50:17 +09:00
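A small reproduction of the arithmetic above, assuming the fix is to make the upper bound exclusive (randrange rather than randint), in line with the later "Replaced some randints with cleaner randranges" commit; the numbers come from the commit message:

```python
import numpy as np
from random import randrange

# Values from the commit message above.
doc_lengths = [5, 2, 4, 8, 11]
doc_cumsum = np.cumsum(doc_lengths)   # [ 5  7 11 19 30]
cumsum_max = int(doc_cumsum[-1])      # 30
current_idx = 1

# Sample a sentence index outside the current document. Using randrange
# (upper bound exclusive) instead of randint avoids ever drawing rand_end
# itself, which would wrap back onto current_idx as described above.
rand_start = int(doc_cumsum[current_idx])                          # 7
rand_end = rand_start + cumsum_max - doc_lengths[current_idx]      # 35
sentence_index = randrange(rand_start, rand_end) % cumsum_max
sampled_doc_index = int(np.searchsorted(doc_cumsum, sentence_index, side='right'))
assert sampled_doc_index != current_idx
```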
dhanajitb
f872eb98c2
making unconditional generation work
Unconditional generation works now, but if the seed is fixed, the sample is the same every time.
n_samples > 1 will give different samples, though.
I am giving the start token as '<|endoftext|>' for unconditional generation.
2019-03-28 22:46:15 +05:30
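A minimal sketch of unconditional sampling seeded with '<|endoftext|>', along the lines of run_gpt2.py; top-k and temperature handling are omitted, and this loop is illustrative rather than the script's exact code:

```python
import torch
from pytorch_pretrained_bert import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')
model.eval()

# Seed the context with the '<|endoftext|>' token instead of user text.
prev = torch.tensor([[tokenizer.encoder['<|endoftext|>']]])
generated = prev
past = None
with torch.no_grad():
    for _ in range(40):
        logits, past = model(prev, past=past)              # reuse cached keys/values
        probs = torch.softmax(logits[:, -1, :], dim=-1)
        prev = torch.multinomial(probs, num_samples=1)     # sample the next token
        generated = torch.cat([generated, prev], dim=1)

print(tokenizer.decode(generated[0].tolist()))
```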
Thomas Wolf
694e2117f3
Merge pull request #388 from ananyahjha93/master
Added remaining GLUE tasks to 'run_classifier.py'
2019-03-28 09:06:53 +01:00