Commit Graph

19383 Commits

nhatchan
cd30565aed Fix importing unofficial TF models
Importing unofficial TF models seems to be working well, at least for me.
This PR resolves #50.
2019-01-14 13:35:40 +09:00
nhatchan
8edc898f63 Fix documentation (missing backslashes)
This PR adds missing backslashes in LM Fine-tuning subsection in README.md.
2019-01-13 21:23:19 +09:00
nhatchan
6c65cb2492 lm_finetuning compatibility with Python 3.5
dicts are not ordered in Python 3.5 and earlier, which is one cause of #175.
This PR replaces the dict with a list to preserve its order.
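A self-contained sketch of the difference (names are illustrative, not the ones used in `lm_finetuning`):

```python
# On Python 3.5 and earlier, plain dicts do not preserve insertion order,
# so iterating over this mapping can yield items in arbitrary order.
samples_by_doc = {"doc_a": [1, 2], "doc_b": [3], "doc_c": [4, 5]}

# Order-preserving alternative: keep (key, value) pairs in a list.
samples = [("doc_a", [1, 2]), ("doc_b", [3]), ("doc_c", [4, 5])]
for doc_id, token_ids in samples:
    print(doc_id, token_ids)  # deterministic order on every Python version
```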
2019-01-13 21:09:13 +09:00
Li Dong
a2da2b4109
[bug fix] args.do_lower_case is always True
The "default=True" makes args.do_lower_case always True.

```python
parser.add_argument("--do_lower_case",
                    default=True,
                    action='store_true')
```
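A sketch of the fix, assuming standard argparse semantics: drop the explicit `default=True` so `store_true` supplies its implicit `False` default.

```python
import argparse

parser = argparse.ArgumentParser()
# A store_true flag already defaults to False when absent; an explicit
# default=True overrides that and pins the flag to True.
parser.add_argument("--do_lower_case", action='store_true')

args = parser.parse_args([])          # no flag passed on the command line
assert args.do_lower_case is False    # False unless --do_lower_case is given
```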
2019-01-13 19:51:11 +08:00
Thomas Wolf
35becc6d84
Merge pull request #182 from deepset-ai/fix_lowercase_and_saving
add do_lower_case arg and adjust model saving for lm finetuning.
2019-01-11 08:50:13 +01:00
tholor
506e5bb0c8 add do_lower_case arg and adjust model saving for lm finetuning. 2019-01-11 08:32:46 +01:00
Thomas Wolf
e485829a41
Merge pull request #174 from abeljim/master
Added Squad 2.0
2019-01-10 23:40:45 +01:00
Thomas Wolf
7e60205bd3
Merge pull request #179 from likejazz/patch-2
Fix it to run properly even without the `--do_train` param.
2019-01-10 23:39:10 +01:00
Sang-Kil Park
64326dccfb
Fix it to run properly even without the --do_train param.
It was modified to be similar to `run_classifier.py`, and fixed to run properly even without the `--do_train` param.
2019-01-10 21:51:39 +09:00
thomwolf
e5c78c6684 update readme and fix a few typos 2019-01-10 01:40:00 +01:00
thomwolf
fa5222c296 update readme 2019-01-10 01:25:28 +01:00
Thomas Wolf
0dd5f55ac8
Merge pull request #172 from WrRan/never_split
Never split some texts.
2019-01-09 13:44:09 +01:00
Unknown
b3628f117e Added Squad 2.0 2019-01-08 15:13:13 -08:00
thomwolf
ab90d4cddd adding docs and example for OpenAI GPT 2019-01-09 00:12:43 +01:00
thomwolf
dc5df92fa8 added LM head for OpenAI 2019-01-08 17:18:47 +01:00
thomwolf
3cf12b235a added tests + fixed losses 2019-01-08 16:24:23 +01:00
thomwolf
eed51c5bdf add OpenAI GPT 2019-01-08 12:26:58 +01:00
WrRan
3f60a60eed text in never_split should not be lowercased 2019-01-08 13:33:57 +08:00
WrRan
751beb9e73 never split some text 2019-01-08 10:54:51 +08:00
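A simplified sketch of the behavior these two commits establish (a standalone stand-in, not the tokenizer's actual code): tokens listed in `never_split` pass through untouched, including their case.

```python
NEVER_SPLIT = ("[UNK]", "[SEP]", "[PAD]", "[CLS]", "[MASK]")

def lower_except_never_split(tokens, never_split=NEVER_SPLIT):
    """Lowercase tokens, but leave never_split entries exactly as written."""
    return [t if t in never_split else t.lower() for t in tokens]

print(lower_except_never_split(["Hello", "[SEP]", "World"]))
# -> ['hello', '[SEP]', 'world']
```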
thomwolf
793dcd236b Merge branch 'master' of https://github.com/huggingface/pytorch-pretrained-BERT into fifth-release 2019-01-07 13:37:55 +01:00
thomwolf
2e4db64cab add do_lower_case tokenizer loading option in run_squad and fine_tuning examples 2019-01-07 13:06:42 +01:00
thomwolf
c9fd350567 remove default when action is store_true in arguments 2019-01-07 13:01:54 +01:00
thomwolf
93f563b8a8 adding OpenAI GPT 2019-01-07 12:55:36 +01:00
Thomas Wolf
e048c7f1c8
Merge pull request #171 from donglixp/patch-1
LayerNorm initialization
2019-01-07 12:44:46 +01:00
Thomas Wolf
d3d56f9a0b
Merge pull request #166 from likejazz/patch-1
Fix error when `bert_model` param is a path or URL.
2019-01-07 12:40:55 +01:00
Thomas Wolf
766c6b2ce3
Merge pull request #159 from jaderabbit/master
Allow do_eval to be used without do_train and to use the pretrained model in the output folder
2019-01-07 12:31:06 +01:00
Thomas Wolf
77966a43a4
Merge pull request #156 from rodgzilla/cl_args_doc
Adding new pretrained model to the help of the `bert_model` argument.
2019-01-07 12:27:16 +01:00
Thomas Wolf
bcd607542c
Merge pull request #145 from wlhgtc/master
Correct the wrong note
2019-01-07 12:23:05 +01:00
Thomas Wolf
2e8c5c00ec
Merge pull request #141 from SinghJasdeep/patch-1
loading saved model when n_classes != 2
2019-01-07 12:21:13 +01:00
Thomas Wolf
2860377021
Merge pull request #134 from rodgzilla/update_doc_pretrained_models
Fixing various class documentations.
2019-01-07 12:06:06 +01:00
Thomas Wolf
c18bdb4433
Merge pull request #124 from deepset-ai/master
Add example for fine tuning BERT language model
2019-01-07 12:03:51 +01:00
Li Dong
d0d9b384f2
LayerNorm initialization
The LayerNorm gamma and beta should be initialized with `.fill_(1.0)` and `.zero_()`, respectively.

reference links:

989e78c412/tensorflow/contrib/layers/python/layers/layers.py (L2298)

989e78c412/tensorflow/contrib/layers/python/layers/layers.py (L2308)
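A minimal PyTorch sketch of that initialization, using the built-in `nn.LayerNorm` (the repo's custom LayerNorm module may use different attribute names):

```python
import torch.nn as nn

# The hidden size is illustrative; the point is the gamma/beta init.
layer_norm = nn.LayerNorm(768)
layer_norm.weight.data.fill_(1.0)  # gamma -> 1.0, matching the TF reference
layer_norm.bias.data.zero_()       # beta  -> 0.0
```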
2019-01-07 15:51:33 +08:00
Sang-Kil Park
ca4e7aaa72
Fix error when bert_model param is a path or URL.
An error occurs when the `bert_model` param is a path or URL. Therefore, if it is a path, keep only the last path component to prevent the error.
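A hedged sketch of the idea (the helper name is hypothetical):

```python
def resolve_model_name(bert_model):
    # Shortcut names like "bert-base-uncased" pass through unchanged;
    # for a filesystem path or URL, keep only the final component.
    return bert_model.rstrip("/").split("/")[-1]

print(resolve_model_name("bert-base-uncased"))           # bert-base-uncased
print(resolve_model_name("/models/bert-base-uncased/"))  # bert-base-uncased
```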
2019-01-05 11:42:54 +09:00
Jade Abbott
193e2df8ba Remove rogue comment 2019-01-03 13:13:06 +02:00
Jade Abbott
c64de50ea4 nb_tr_steps is not initialized 2019-01-03 12:34:57 +02:00
Jade Abbott
b96149a19b Training loss is not initialized if only do_eval is specified 2019-01-03 10:32:10 +02:00
Jade Abbott
be3b9bcf4d Allow one to use the pretrained model in evaluation when do_train is not selected 2019-01-03 09:02:33 +02:00
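Taken together, these commits make the eval-only path safe. A self-contained sketch of the pattern (variable names follow the repo's examples; everything else is illustrative):

```python
do_train = False  # mimics running with --do_eval only

# Initialize up front so the eval-only path can still report these values;
# previously they were first assigned inside the training loop.
tr_loss, nb_tr_steps = 0.0, 0

if do_train:
    for step_loss in [0.5, 0.4, 0.3]:  # stand-in for per-batch losses
        tr_loss += step_loss
        nb_tr_steps += 1

# No NameError (or ZeroDivisionError) even when training was skipped.
avg_loss = tr_loss / nb_tr_steps if nb_tr_steps else None
print({"train_loss": avg_loss})
```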
Grégory Châtel
186f75342e Adding new pretrained model to the help of the bert_model argument. 2019-01-02 14:00:59 +01:00
wlhgtc
e626eecc25
Update modeling.py 2018-12-22 20:26:05 +08:00
Jasdeep Singh
99709ee61d
loading saved model when n_classes != 2
Required to fix: `Assertion 't >= 0 && t < n_classes' failed`, which occurs if your number of classes is not 2.
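A generic PyTorch sketch of the failure mode (dimensions and names are illustrative): the loaded classification head must be built with the same number of classes it was trained with; otherwise labels >= 2 trip this assertion inside the cross-entropy loss.

```python
import torch
import torch.nn as nn

n_classes = 4  # must match the head the checkpoint was saved with
classifier = nn.Linear(768, n_classes)

logits = classifier(torch.randn(2, 768))
targets = torch.tensor([3, 1])  # label 3 is only valid because n_classes > 2
loss = nn.CrossEntropyLoss()(logits, targets)
print(loss.item())
```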
2018-12-20 13:55:47 -08:00
Julien Chaumond
8da280ebbe Setup CI 2018-12-20 16:33:39 -05:00
tholor
e5fc98c542 add example training data. update to nvidia apex. refactor 'item -> line in doc' mapping. add warning for unknown word. 2018-12-20 18:30:52 +01:00
Grégory Châtel
7176674849 Fixing various class documentations. 2018-12-20 13:11:17 +01:00
Thomas Wolf
7fb94ab934
Merge pull request #127 from patrick-s-h-lewis/tokenizer-error-on-long-seqs
raises value error for bert tokenizer for long sequences
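A simplified, standalone sketch of such a check (not the tokenizer's actual code; 512 is BERT's usual position-embedding limit):

```python
def check_seq_length(token_ids, max_len=512):
    # BERT's position embeddings cover only max_len positions, so longer
    # inputs must be rejected (or truncated) before reaching the model.
    if len(token_ids) > max_len:
        raise ValueError(
            "Token sequence of length %d exceeds the model maximum of %d"
            % (len(token_ids), max_len)
        )
    return token_ids

try:
    check_seq_length(list(range(600)))
except ValueError as e:
    print(e)  # Token sequence of length 600 exceeds the model maximum of 512
```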
2018-12-19 10:29:17 +01:00
Thomas Wolf
2feb29c0ff
Merge pull request #130 from sodre/use-entry-points
Use entry-points instead of scripts
2018-12-19 10:18:24 +01:00
Thomas Wolf
2c9991496b
Merge pull request #128 from sodre/add-license
Add license to source distribution
2018-12-19 10:15:53 +01:00
tholor
17595ef2de Merge branch 'master' of https://github.com/deepset-ai/pytorch-pretrained-BERT 2018-12-19 09:22:53 +01:00
tholor
67f4dd56a3 update readme for run_lm_finetuning 2018-12-19 09:22:37 +01:00
Patrick Sodré
ecf3ea197e Remove original script 2018-12-19 02:26:08 +00:00
Patrick Sodré
87c1244c7d Convert scripts into entry_points
The recommended approach to creating launch scripts is to use entry_points
and console_scripts.

xref: https://packaging.python.org/guides/distributing-packages-using-setuptools/#scripts
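A minimal sketch of that pattern (package, script, and module names are illustrative, not the repo's actual entries):

```python
# setup.py
from setuptools import setup, find_packages

setup(
    name="example_package",
    version="0.1.0",
    packages=find_packages(),
    entry_points={
        # console_scripts generates platform-appropriate launchers at
        # install time, replacing hand-written files under scripts/.
        "console_scripts": [
            "example-cli = example_package.cli:main",
        ],
    },
)
```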
2018-12-19 02:26:08 +00:00