Commit Graph

19383 Commits

Author SHA1 Message Date
Abhi Sharma
07154dadb4
Fix indentation for unconditional generation 2019-04-16 11:11:49 -07:00
thomwolf
bdaba1897c updating GPT tokenization 2019-04-16 17:44:06 +02:00
thomwolf
18a8a15f78 improving GPT2 tokenization and adding tests 2019-04-16 17:00:55 +02:00
Thomas Wolf
3d78e226e6
Merge pull request #489 from huggingface/tokenization_serialization
Better serialization for Tokenizers and Configuration classes - Also fix #466
2019-04-16 08:49:54 +02:00
thomwolf
3571187ef6 fix saving models in distributed setting examples 2019-04-15 16:43:56 +02:00
Thomas Wolf
64b6ef4db0
Merge pull request #490 from huggingface/better_finetuning_GPT_GPT-2
Clean up GPT and GPT-2 losses computation
2019-04-15 16:14:50 +02:00
thomwolf
d616022455 fix openai special tokens loading 2019-04-15 16:07:45 +02:00
thomwolf
df5d9c3551 load all models on cpu 2019-04-15 15:43:01 +02:00
thomwolf
2499b0a5fc add ptvsd to run_squad 2019-04-15 15:33:04 +02:00
thomwolf
7816f7921f clean up distributed training logging in run_squad example 2019-04-15 15:27:10 +02:00
thomwolf
1135f2384a clean up logger in examples for distributed case 2019-04-15 15:22:40 +02:00
thomwolf
cc43307023 update readme 2019-04-15 15:06:10 +02:00
thomwolf
60ea6c59d2 added best practices for serialization in README and examples 2019-04-15 15:00:33 +02:00
thomwolf
179a2c2ff6 update example to work with new serialization semantic 2019-04-15 14:33:23 +02:00
thomwolf
b3c6ee0ac1 tokenization updates 2019-04-15 14:24:52 +02:00
thomwolf
20577d8a7c add configuration serialization to readme 2019-04-15 14:21:41 +02:00
thomwolf
9761aa4845 add to_json_file method to configuration classes 2019-04-15 14:12:08 +02:00
thomwolf
b17963d82f update readme 2019-04-15 13:44:30 +02:00
thomwolf
e8568a3b17 fixing tests 2019-04-15 12:55:38 +02:00
thomwolf
870b734bfd added tokenizers serialization tests 2019-04-15 12:03:56 +02:00
thomwolf
3e65f255dc add serialization semantics to tokenizers - fix transfo-xl tokenizer 2019-04-15 11:47:25 +02:00
Thomas Wolf
6b35cfd28f
Merge pull request #423 from dhanajitb/master
making unconditional generation work
2019-04-15 11:01:53 +02:00
Thomas Wolf
aff44f0c08
Merge branch 'master' into master 2019-04-15 10:58:34 +02:00
Thomas Wolf
7e7e4753c8
Merge pull request #480 from mboyanov/docs/cls_token_info
Extend the BertForSequenceClassification docs to mention the special CLS token.
2019-04-15 10:57:25 +02:00
Thomas Wolf
bb61b747df
Merge pull request #474 from jiesutd/master
Fix tsv read error in Windows
2019-04-15 10:56:48 +02:00
Thomas Wolf
7873d76464
Merge pull request #478 from Rocketknight1/master
Added a helpful error for users with single-document corpuses - fixes # 452
2019-04-15 10:55:57 +02:00
David Pollack
38ba7b439b fixed BertForMultipleChoice model init and forward pass 2019-04-15 10:38:01 +02:00
thomwolf
fe2756ff41 update double head model 2019-04-15 10:04:05 +02:00
Martin Boyanov
34cf67fd6c Extend the BertForSequenceClassification docs to mention the special CLS token. 2019-04-12 21:30:28 +03:00
Matthew Carrigan
dbbd6c7500 Replaced some randints with cleaner randranges, and added a helpful
error for users whose corpus is just one giant document.
2019-04-12 15:07:58 +01:00
thomwolf
b509bf7655 updating loss computation 2019-04-12 12:12:33 +02:00
thomwolf
1d203a34c0 back to simple indexing 2019-04-11 23:51:03 +02:00
Thomas Wolf
616743330e
Merge pull request #462 from 8enmann/master
fix run_gpt2.py
2019-04-11 21:54:46 +02:00
Thomas Wolf
2cdfb8b254
Merge pull request #467 from yaroslavvb/patch-2
Update README.md
2019-04-11 21:53:23 +02:00
Jie Yang
c49ce3c722 fix tsv read error in Windows 2019-04-11 15:40:19 -04:00
thomwolf
074c869bbe fix OpenAIGPTMultipleChoiceHead 2019-04-11 20:53:50 +02:00
thomwolf
724eb45cef add stale bot 2019-04-11 17:12:00 +02:00
thomwolf
4bc4c69af9 finetuning any BERT model - fixes #455 2019-04-11 16:57:59 +02:00
thomwolf
a05fad8dce fix typo 2019-04-11 13:16:17 +02:00
thomwolf
4a82f4f856 update special token addition 2019-04-11 13:11:22 +02:00
thomwolf
991b8e65f4 Merge branch 'master' of https://github.com/huggingface/pytorch-pretrained-BERT 2019-04-11 11:43:15 +02:00
thomwolf
e99b2014cc fixes #471 2019-04-11 11:43:13 +02:00
Yaroslav Bulatov
8fffba5f47
Update README.md
Fix for

```
04/09/2019 21:39:38 - INFO - __main__ -   device: cuda n_gpu: 1, distributed training: False, 16-bits training: False
Traceback (most recent call last):
  File "/home/ubuntu/pytorch-pretrained-BERT/examples/lm_finetuning/simple_lm_finetuning.py", line 642, in <module>
    main()
  File "/home/ubuntu/pytorch-pretrained-BERT/examples/lm_finetuning/simple_lm_finetuning.py", line 502, in main
    raise ValueError("Training is currently the only implemented execution option. Please set `do_train`.")
ValueError: Training is currently the only implemented execution option. Please set `do_train`.
```
2019-04-09 14:45:47 -07:00
Benjamin Mann
fd8a3556f0 fix run_gpt2.py 2019-04-08 17:20:35 -07:00
Dhanajit Brahma
f4fc9c6152 Merge branch 'master' of https://github.com/dhanajitb/pytorch-pretrained-BERT 2019-04-07 17:52:35 +05:30
Dhanajit Brahma
6c4c7be282 Merge remote-tracking branch 'upstream/master' 2019-04-07 16:59:36 +05:30
Dhanajit Brahma
4d3cf0d602 removing some redundant lines 2019-04-07 16:59:07 +05:30
dhanajitb
0d6a882f63
Cleaned some redundant lines
```
while not args.unconditional:
   if not args.unconditional:
```
These lines have been updated
2019-04-07 16:54:38 +05:30
lukovnikov
fc7693adc3 schedule fix 2019-04-03 18:16:47 +02:00
lukovnikov
20686b78fc schedule fix 2019-04-03 18:13:52 +02:00