Thomas Wolf
3ae8c8be1e
Merge pull request #562 from apappu97/roc_stories_lmlabels_fix
...
Small fix to remove shifting of lm labels during preprocessing of RocStories.
2019-05-01 11:20:17 +02:00
Thomas Wolf
e89520175d
Merge pull request #564 from 8enmann/patch-2
...
Fix #537
2019-05-01 11:18:46 +02:00
Ben Mann
74f7906db4
Fix #537
2019-04-30 19:48:22 -07:00
Aneesh Pappu
365fb34c6c
small fix to remove shifting of lm labels during preprocessing of roc stories, as this shifting happens internally in the model
2019-04-30 13:53:04 -07:00
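The shift in question happens inside the model's LM head, so the ROCStories preprocessing should pass unshifted lm labels. A minimal sketch of that internal next-token shift, paraphrased rather than copied from the library code:

```python
import torch
from torch.nn import CrossEntropyLoss

def lm_loss(lm_logits, lm_labels):
    # The model aligns the logits at position i with the label at position i+1,
    # so callers pass labels that are NOT pre-shifted.
    shift_logits = lm_logits[..., :-1, :].contiguous()
    shift_labels = lm_labels[..., 1:].contiguous()
    loss_fct = CrossEntropyLoss(ignore_index=-1)
    return loss_fct(shift_logits.view(-1, shift_logits.size(-1)),
                    shift_labels.view(-1))
```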
thomwolf
cd110835a0
coverage in circle-ci
2019-04-30 11:35:40 +02:00
Thomas Wolf
2dee86319d
Merge pull request #527 from Mathieu-Prouveur/fix_value_training_loss
...
Update example files so that tr_loss is not affected by args.gradient…
2019-04-30 11:12:55 +02:00
thomwolf
80f53f7380
gpt-2 from_pretrained can use special tokens
2019-04-30 11:10:22 +02:00
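Together with the two "special tokens" commits below, this lets a GPT-2 checkpoint be loaded with an extended vocabulary. A hedged usage sketch; the special_tokens / num_special_tokens keyword names are assumed to mirror the existing OpenAI GPT API and should be checked against the release notes:

```python
from pytorch_pretrained_bert import GPT2Tokenizer, GPT2LMHeadModel

SPECIAL_TOKENS = ['<bos>', '<del>', '<eos>']  # illustrative strings, not required names

# Assumed keyword arguments, mirroring OpenAIGPTTokenizer / OpenAIGPTLMHeadModel:
tokenizer = GPT2Tokenizer.from_pretrained('gpt2', special_tokens=SPECIAL_TOKENS)
model = GPT2LMHeadModel.from_pretrained('gpt2', num_special_tokens=len(SPECIAL_TOKENS))
```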
thomwolf
e79ceb1533
gpt-2 special tokens
2019-04-30 11:05:54 +02:00
thomwolf
1f5fc95b68
add code coverage
2019-04-30 11:05:26 +02:00
thomwolf
c30139a013
add special tokens to gpt-2
2019-04-30 10:45:26 +02:00
Mathieu Prouveur
87b9ec3843
Fix tr_loss rescaling factor using global_step
2019-04-29 12:58:29 +02:00
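With gradient accumulation, each micro-batch loss is divided by gradient_accumulation_steps before backward(), so the running tr_loss has to be rescaled and averaged over optimizer steps (global_step) when it is reported. A hedged sketch of that pattern, using the variable names from the examples; the exact expression may differ from the merged code:

```python
def train_epoch(model, optimizer, dataloader, gradient_accumulation_steps):
    tr_loss, global_step = 0.0, 0
    for step, (input_ids, labels) in enumerate(dataloader):
        loss = model(input_ids, lm_labels=labels)   # assumes a loss-returning forward
        loss = loss / gradient_accumulation_steps   # scale each micro-batch
        loss.backward()
        tr_loss += loss.item()
        if (step + 1) % gradient_accumulation_steps == 0:
            optimizer.step()
            optimizer.zero_grad()
            global_step += 1
    # Undo the accumulation scaling and average per optimizer step when reporting:
    return tr_loss * gradient_accumulation_steps / max(global_step, 1)
```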
Ailing Zhang
3963d57c89
move pytorch_pretrained_bert cache folder under same path as torch
2019-04-27 11:09:11 -07:00
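This moves the download cache under the same root directory torch uses, respecting TORCH_HOME / XDG_CACHE_HOME. A hedged sketch of the path resolution described by the commit:

```python
import os

# Mirrors the usual TORCH_HOME / XDG_CACHE_HOME conventions; the exact
# expression in file_utils.py may differ slightly.
torch_cache_home = os.path.expanduser(
    os.getenv('TORCH_HOME',
              os.path.join(os.getenv('XDG_CACHE_HOME', '~/.cache'), 'torch')))
default_cache_path = os.path.join(torch_cache_home, 'pytorch_pretrained_bert')
print(default_cache_path)  # e.g. ~/.cache/torch/pytorch_pretrained_bert
```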
thomwolf
b832d5bb8a
Release: 0.6.2
2019-04-25 21:37:47 +02:00
Thomas Wolf
e6cf62d499
Merge pull request #488 from dhpollack/fix_multichoice
...
fixed BertForMultipleChoice model init and forward pass
2019-04-25 21:04:16 +02:00
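The multiple-choice head this fix (and the later revert at 80f995a141) revolves around scores each candidate with a single-output linear layer and then reshapes to (batch, num_choices). A hedged sketch of that pattern, not the exact library code:

```python
import torch
import torch.nn as nn

class MultipleChoiceHead(nn.Module):
    def __init__(self, hidden_size):
        super().__init__()
        self.classifier = nn.Linear(hidden_size, 1)  # one score per choice

    def forward(self, pooled_output, num_choices):
        # pooled_output: (batch * num_choices, hidden_size)
        logits = self.classifier(pooled_output)      # (batch * num_choices, 1)
        return logits.view(-1, num_choices)          # (batch, num_choices)
```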
Thomas Wolf
1cc1c3c344
Merge pull request #533 from lukovnikov/master
...
Docs for new learning rate code
2019-04-25 21:02:35 +02:00
Thomas Wolf
dee8af4e46
Merge pull request #518 from huggingface/schedules_in_examples
...
Fix training schedules in examples to match new API
2019-04-25 21:01:04 +02:00
lukovnikov
56a47ce2b7
- replaced OpenAIGPTAdam with OpenAIAdam in docs
2019-04-25 16:05:28 +02:00
lukovnikov
331a46ff04
- replaced OpenAIGPTAdam with OpenAIAdam in docs
2019-04-25 16:04:37 +02:00
lukovnikov
704037ad51
- updated docs for new LR API
...
- added some images for illustration
- updated comments in optimization
2019-04-25 15:59:39 +02:00
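The docs update covers the reworked schedule handling in optimization. A hedged usage sketch: OpenAIAdam and the 'warmup_linear' schedule name exist in this code base, while passing a schedule object directly is the new option this work adds (exact class names should be checked in optimization.py):

```python
from pytorch_pretrained_bert import OpenAIAdam

def build_optimizer(model, num_train_optimization_steps):
    return OpenAIAdam(model.parameters(),
                      lr=6.25e-5,
                      warmup=0.002,   # fraction of t_total spent warming up
                      t_total=num_train_optimization_steps,
                      schedule='warmup_linear')  # string form; schedule objects also accepted after this change
```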
Thomas Wolf
d76a57b0ba
Merge pull request #506 from ailzhang/hubconf
...
Hubconf
2019-04-24 20:59:21 +02:00
thomwolf
80f995a141
revert BertForMultipleChoice linear classifier
2019-04-24 16:51:54 +02:00
Mathieu Prouveur
ed8fad7390
Update example files so that tr_loss is not affected by args.gradient_accumulation_steps
2019-04-24 14:07:00 +02:00
thomwolf
d94c6b0144
fix training schedules in examples to match new API
2019-04-23 11:17:06 +02:00
Thomas Wolf
c36cca075a
Merge pull request #515 from Rocketknight1/master
...
Fix --reduce_memory in finetune_on_pregenerated
2019-04-23 10:30:23 +02:00
Thomas Wolf
99e02c3415
Merge pull request #512 from cynthia/master
...
Fix indentation weirdness in GPT-2 example.
2019-04-23 10:29:01 +02:00
Thomas Wolf
98cb7b2c51
Merge pull request #445 from lukovnikov/master
...
Learning rate schedules improvement + extension
2019-04-23 10:27:38 +02:00
Matthew Carrigan
b8e2a9c584
Made --reduce_memory actually do something in finetune_on_pregenerated
2019-04-22 14:01:48 +01:00
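--reduce_memory is meant to keep the pregenerated training examples on disk instead of holding them all in RAM. A hedged illustration of the memory-mapping idea; the file name, dtype, and shape are illustrative, not the exact finetune_on_pregenerated.py implementation:

```python
import numpy as np

num_samples, seq_len = 100_000, 128
input_ids = np.memmap('input_ids.memmap', mode='w+',
                      dtype=np.int32, shape=(num_samples, seq_len))
input_ids[0, :5] = [101, 2023, 2003, 1037, 102]  # write one tokenized example
input_ids.flush()                                # data lives on disk, not in RAM
```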
Matt
af8a0384fc
Merge pull request #1 from huggingface/master
...
Pulling commits from main repo
2019-04-22 13:56:47 +01:00
Sangwhan Moon
14b1f719f4
Fix indentation weirdness in GPT-2 example.
2019-04-22 02:20:22 +09:00
lukovnikov
69850b4011
python 2 compat
2019-04-21 14:02:38 +02:00
lukovnikov
bb7557d3ab
- removed __all__ in optimization
...
- removed unused plotting code
- using ABC for LRSchedule
- added some schedule object init tests
2019-04-21 13:48:33 +02:00
lukovnikov
34ccc8ebf4
Merge remote-tracking branch 'upstream/master'
2019-04-21 13:16:15 +02:00
Ailing Zhang
bfd6f6b257
fix from_pretrained positional args
2019-04-17 16:31:40 -07:00
Ailing Zhang
ae4c9fee73
add hubconf
2019-04-17 13:34:34 -07:00
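A hubconf.py lets users pull the models through torch.hub without installing the package. A hedged usage sketch; the entrypoint name 'bertModel' and its arguments are assumptions, check hubconf.py for what is actually exposed:

```python
import torch

# torch.hub.load(github, entrypoint, *args) is the standard call;
# 'bertModel' / 'bert-base-uncased' below are assumed names for illustration.
model = torch.hub.load('huggingface/pytorch-pretrained-BERT',
                       'bertModel', 'bert-base-uncased')
```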
Thomas Wolf
68a889ee43
Merge pull request #500 from huggingface/network
...
Updating network handling
2019-04-17 15:22:14 +02:00
thomwolf
34ae5bf838
small clean up in tests
2019-04-17 14:52:12 +02:00
thomwolf
23d4554ec0
is python 2 happy now
2019-04-17 14:48:34 +02:00
thomwolf
265550ec34
relax network connection requirements
2019-04-17 14:22:35 +02:00
thomwolf
fa76520240
fix file_utils on python 2
2019-04-17 13:32:22 +02:00
thomwolf
bcde2c61cb
fix #497
2019-04-17 12:35:38 +02:00
thomwolf
929579f3b5
fix #497
2019-04-17 12:35:08 +02:00
thomwolf
31d387604c
adding s3 model tests with --runslow
2019-04-17 11:58:27 +02:00
Thomas Wolf
8407429d74
Merge pull request #494 from SudoSharma/patch-1
...
Fix indentation for unconditional generation
2019-04-17 11:11:36 +02:00
Thomas Wolf
2e153930cf
Merge pull request #495 from SudoSharma/patch-2
...
Fix gradient overflow issue during attention mask
2019-04-17 11:10:36 +02:00
Thomas Wolf
46078e1b46
Merge pull request #496 from 8enmann/patch-1
...
[run_gpt2.py] temperature should be a float, not int
2019-04-17 11:08:54 +02:00
Thomas Wolf
b8686130ca
Merge pull request #498 from huggingface/GPT2_tokenization
...
Gpt2 tokenization
2019-04-17 11:06:41 +02:00
thomwolf
5afa497cbf
fix GPT-2 tokenization to also work on python 3...
2019-04-17 11:04:41 +02:00
thomwolf
bc70779bf0
fixed GPT-2 tokenization on python 2
2019-04-17 10:56:15 +02:00
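The two tokenization commits above come down to how raw UTF-8 bytes are iterated: Python 3 yields ints from a bytes object while Python 2 yields 1-character strings that need ord(). A hedged compatibility sketch of that difference, not the exact tokenizer code:

```python
import sys

def text_to_byte_ints(text):
    # Byte-level BPE needs each UTF-8 byte as an integer.
    encoded = text.encode('utf-8')
    if sys.version_info[0] == 2:
        return [ord(b) for b in encoded]   # py2: iterating bytes gives 1-char strs
    return list(encoded)                   # py3: iterating bytes gives ints
```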
Ben Mann
87677fcc4d
[run_gpt2.py] temperature should be a float, not int
2019-04-16 15:23:21 -07:00
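The temperature value is used to divide the logits before sampling, so the flag should be parsed as a float rather than an int so fractional values like 0.7 behave as expected. A minimal sketch of the corrected argument definition:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--temperature', type=float, default=1.0)  # float, not int
args = parser.parse_args(['--temperature', '0.7'])
print(args.temperature)  # 0.7; logits are divided by this value before sampling
```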
Abhi Sharma
9e666aaa29
Fix gradient overflow issue during attention mask
...
This fix is in reference to issue #382. GPT2 can now be trained in mixed precision, which I've confirmed with testing. I also tested unconditional generation on multiple seeds before and after changing 1e10 to 1e4 and there was no difference. Please let me know if there is anything else I can do to make this pull request better. Thanks for all your work!
2019-04-16 11:42:34 -07:00
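The overflow comes from half precision: fp16 tops out around 65504, so a mask fill value of -1e10 is not representable, while -1e4 still drives masked positions to near zero after the softmax. A hedged sketch of the masking pattern this fix targets, paraphrased from the GPT-2 attention rather than quoted from it:

```python
import torch

print(torch.tensor(1e10, dtype=torch.half))  # inf     -> overflows in fp16
print(torch.tensor(1e4, dtype=torch.half))   # 10000.0 -> still representable

scores = torch.randn(1, 1, 4, 4)                         # attention scores (float for the demo)
causal = torch.tril(torch.ones(4, 4)).view(1, 1, 4, 4)   # lower-triangular causal mask
masked = scores * causal - 1e4 * (1 - causal)            # -1e10 here would overflow under fp16
probs = torch.softmax(masked, dim=-1)
```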