Commit Graph

774 Commits

Author SHA1 Message Date
Aneesh Pappu
365fb34c6c small fix to remove shifting of lm labels during pre process of roc stories, as this shifting happens internally in the model 2019-04-30 13:53:04 -07:00
Thomas Wolf
2dee86319d
Merge pull request #527 from Mathieu-Prouveur/fix_value_training_loss
Update example files so that tr_loss is not affected by args.gradient…
2019-04-30 11:12:55 +02:00
Mathieu Prouveur
87b9ec3843 Fix tr_loss rescaling factor using global_step 2019-04-29 12:58:29 +02:00
thomwolf
b832d5bb8a Release: 0.6.2 2019-04-25 21:37:47 +02:00
Thomas Wolf
e6cf62d499
Merge pull request #488 from dhpollack/fix_multichoice
fixed BertForMultipleChoice model init and forward pass
2019-04-25 21:04:16 +02:00
Thomas Wolf
1cc1c3c344
Merge pull request #533 from lukovnikov/master
Docs for new learning rate code
2019-04-25 21:02:35 +02:00
Thomas Wolf
dee8af4e46
Merge pull request #518 from huggingface/schedules_in_examples
Fix training schedules in examples to match new API
2019-04-25 21:01:04 +02:00
lukovnikov
56a47ce2b7 - replaced OpenAIGPTAdam with OpenAIAdam in docs 2019-04-25 16:05:28 +02:00
lukovnikov
331a46ff04 - replaced OpenAIGPTAdam with OpenAIAdam in docs 2019-04-25 16:04:37 +02:00
lukovnikov
704037ad51 - updated docs for new LR API
- added some images for illustration
- updated comments in optimization
2019-04-25 15:59:39 +02:00
Thomas Wolf
d76a57b0ba
Merge pull request #506 from ailzhang/hubconf
Hubconf
2019-04-24 20:59:21 +02:00
thomwolf
80f995a141 revert BertForMultipleChoice linear classifier 2019-04-24 16:51:54 +02:00
Mathieu Prouveur
ed8fad7390 Update example files so that tr_loss is not affected by args.gradient_accumulation_step 2019-04-24 14:07:00 +02:00
thomwolf
d94c6b0144 fix training schedules in examples to match new API 2019-04-23 11:17:06 +02:00
Thomas Wolf
c36cca075a
Merge pull request #515 from Rocketknight1/master
Fix --reduce_memory in finetune_on_pregenerated
2019-04-23 10:30:23 +02:00
Thomas Wolf
99e02c3415
Merge pull request #512 from cynthia/master
Fix indentation weirdness in GPT-2 example.
2019-04-23 10:29:01 +02:00
Thomas Wolf
98cb7b2c51
Merge pull request #445 from lukovnikov/master
Learning rate schedules improvement + extension
2019-04-23 10:27:38 +02:00
Matthew Carrigan
b8e2a9c584 Made --reduce_memory actually do something in finetune_on_pregenerated 2019-04-22 14:01:48 +01:00
Matt
af8a0384fc
Merge pull request #1 from huggingface/master
Pulling commits from main repo
2019-04-22 13:56:47 +01:00
Sangwhan Moon
14b1f719f4 Fix indentation weirdness in GPT-2 example. 2019-04-22 02:20:22 +09:00
lukovnikov
69850b4011 python 2 compat 2019-04-21 14:02:38 +02:00
lukovnikov
bb7557d3ab - removed __all__ in optimization
- removed unused plotting code
- using ABC for LRSchedule
- added some schedule object init tests
2019-04-21 13:48:33 +02:00
lukovnikov
34ccc8ebf4 Merge remote-tracking branch 'upstream/master' 2019-04-21 13:16:15 +02:00
Ailing Zhang
bfd6f6b257 fix from_pretrained positional args 2019-04-17 16:31:40 -07:00
Ailing Zhang
ae4c9fee73 add hubconf 2019-04-17 13:34:34 -07:00
Thomas Wolf
68a889ee43
Merge pull request #500 from huggingface/network
Updating network handling
2019-04-17 15:22:14 +02:00
thomwolf
34ae5bf838 small clean up in tests 2019-04-17 14:52:12 +02:00
thomwolf
23d4554ec0 is python 2 happy now 2019-04-17 14:48:34 +02:00
thomwolf
265550ec34 relax network connection requirements 2019-04-17 14:22:35 +02:00
thomwolf
fa76520240 fix file_utils on python 2 2019-04-17 13:32:22 +02:00
thomwolf
bcde2c61cb fix #497 2019-04-17 12:35:38 +02:00
thomwolf
929579f3b5 fix #497 2019-04-17 12:35:08 +02:00
thomwolf
31d387604c adding s3 model tests with --runslow 2019-04-17 11:58:27 +02:00
Thomas Wolf
8407429d74
Merge pull request #494 from SudoSharma/patch-1
Fix indentation for unconditional generation
2019-04-17 11:11:36 +02:00
Thomas Wolf
2e153930cf
Merge pull request #495 from SudoSharma/patch-2
Fix gradient overflow issue during attention mask
2019-04-17 11:10:36 +02:00
Thomas Wolf
46078e1b46
Merge pull request #496 from 8enmann/patch-1
[run_gpt2.py] temperature should be a float, not int
2019-04-17 11:08:54 +02:00
Thomas Wolf
b8686130ca
Merge pull request #498 from huggingface/GPT2_tokenization
Gpt2 tokenization
2019-04-17 11:06:41 +02:00
thomwolf
5afa497cbf fix GPT-2 tokenization to work also on python 3... 2019-04-17 11:04:41 +02:00
thomwolf
bc70779bf0 fixed GPT-2 tokenization on python 2 2019-04-17 10:56:15 +02:00
Ben Mann
87677fcc4d [run_gpt2.py] temperature should be a float, not int 2019-04-16 15:23:21 -07:00
Abhi Sharma
9e666aaa29
Fix gradient overflow issue during attention mask
This fix is in reference to issue #382. GPT2 can now be trained in mixed precision, which I've confirmed with testing. I also tested unconditional generation on multiple seeds before and after changing 1e10 to 1e4 and there was no difference. Please let me know if there is anything else I can do to make this pull request better. Thanks for all your work!
2019-04-16 11:42:34 -07:00
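Note on 9e666aaa29: the message above says the mask constant was changed from 1e10 to 1e4 so that GPT2 can be trained in mixed precision. A minimal sketch of why that plausibly helps follows. It is an illustration under assumptions, not the repository's code: the helper names `fits_in_fp16` and `masked_softmax` are hypothetical, the masking formula is a simplified stand-in, and the arithmetic runs in ordinary Python floats, with the half-precision range checked separately through the standard library's `struct` module.

```python
import math
import struct


def fits_in_fp16(value):
    """Return True if `value` can be packed as an IEEE 754 half-precision float."""
    try:
        struct.pack("<e", value)  # "e" is the struct format code for binary16
        return True
    except OverflowError:
        return False


def masked_softmax(scores, mask, mask_value):
    """Softmax over `scores`; positions where mask == 0 are replaced by -mask_value.

    Hypothetical helper for illustration only, computed in double precision.
    """
    shifted = [s * m - mask_value * (1 - m) for s, m in zip(scores, mask)]
    peak = max(shifted)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in shifted]
    total = sum(exps)
    return [e / total for e in exps]


scores = [0.5, 1.5, 2.5]
mask = [1, 1, 0]  # the last position is masked out

large_ok = fits_in_fp16(1e10)  # constant used before the fix
small_ok = fits_in_fp16(1e4)   # constant used after the fix

probs_large = masked_softmax(scores, mask, 1e10)
probs_small = masked_softmax(scores, mask, 1e4)

print("1e10 fits in fp16:", large_ok)
print("1e4 fits in fp16:", small_ok)
print("same probabilities:", probs_large == probs_small)
```

In this sketch 1e10 is outside the half-precision range while 1e4 is inside it, and the masked position receives zero probability with either constant, which is consistent with the author's report that generation was unchanged.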
Abhi Sharma
07154dadb4 Fix indentation for unconditional generation 2019-04-16 11:11:49 -07:00
thomwolf
bdaba1897c updating GPT tokenization 2019-04-16 17:44:06 +02:00
thomwolf
18a8a15f78 improving GPT2 tokenization and adding tests 2019-04-16 17:00:55 +02:00
Thomas Wolf
3d78e226e6
Merge pull request #489 from huggingface/tokenization_serialization
Better serialization for Tokenizers and Configuration classes - Also fix #466
2019-04-16 08:49:54 +02:00
thomwolf
3571187ef6 fix saving models in distributed setting examples 2019-04-15 16:43:56 +02:00
Thomas Wolf
64b6ef4db0
Merge pull request #490 from huggingface/better_finetuning_GPT_GPT-2
Clean up GPT and GPT-2 losses computation
2019-04-15 16:14:50 +02:00
thomwolf
d616022455 fix openai special tokens loading 2019-04-15 16:07:45 +02:00
thomwolf
df5d9c3551 load all models on cpu 2019-04-15 15:43:01 +02:00
thomwolf
2499b0a5fc add ptvsd to run_squad 2019-04-15 15:33:04 +02:00