Commit Graph

15053 Commits

Author SHA1 Message Date
thomwolf
9727723243 fix pickle 2019-06-18 16:02:42 +02:00
thomwolf
9710b68dbc fix pickles 2019-06-18 16:01:15 +02:00
thomwolf
15ebd67d4e cache in run_classifier + various fixes to the examples 2019-06-18 15:58:22 +02:00
thomwolf
e6e5f19257 fix 2019-06-18 14:45:14 +02:00
thomwolf
a432b3d466 distributed traing t_total 2019-06-18 14:39:09 +02:00
thomwolf
c5407f343f split squad example in two 2019-06-18 14:29:03 +02:00
thomwolf
335f57baf8 only on main process 2019-06-18 14:03:46 +02:00
thomwolf
326944d627 add tensorboard to run_squad 2019-06-18 14:02:42 +02:00
thomwolf
d82e5deeb1 set find_unused_parameters=True in DDP 2019-06-18 12:13:14 +02:00
thomwolf
a59abedfb5 DDP update 2019-06-18 12:06:26 +02:00
thomwolf
2ef5e0de87 switch to pytorch DistributedDataParallel 2019-06-18 12:03:13 +02:00
thomwolf
9ce37af99b oups 2019-06-18 11:47:54 +02:00
thomwolf
a40955f071 no need to duplicate models anymore 2019-06-18 11:46:14 +02:00
Thomas Wolf
3763f8944d
Merge pull request #696 from huggingface/split_config_weights
Split config weights
2019-06-18 11:42:57 +02:00
thomwolf
f964753090 explanation on the current location of the caching folder 2019-06-18 11:36:28 +02:00
thomwolf
868de8d1d7 updating weights loading 2019-06-18 10:58:20 +02:00
thomwolf
64e0adda81 better error message 2019-06-18 10:51:31 +02:00
thomwolf
382e2d1e50 spliting config and weight files for bert also 2019-06-18 10:37:16 +02:00
Thomas Wolf
a6f2511811
Merge pull request #694 from huggingface/release_0.6.3
Release 0.6.3
2019-06-17 16:27:25 +02:00
thomwolf
4447f270b2 updating hub 2019-06-17 16:21:28 +02:00
thomwolf
33d3db5c43 updating head masking, readme and docstrings 2019-06-17 15:51:28 +02:00
thomwolf
965f172de6 output all hidden layers states in GPT/GPT-2 2019-06-17 14:34:12 +02:00
thomwolf
f12007e421 add head masking and pruning to openai GPT 2019-06-17 14:19:40 +02:00
thomwolf
b860e47cf5 add head masking and pruning to gpt-2 2019-06-17 14:12:10 +02:00
thomwolf
7220d47a1c adding head pruning and tests 2019-06-17 13:20:45 +02:00
thomwolf
8415a38b23 better error messages 2019-06-17 13:03:48 +02:00
thomwolf
96c4d3d988 add head masking tests 2019-06-17 12:17:26 +02:00
thomwolf
34858ae1d9 adding bert whole words, bertgerman and gpt-2 medium models, head masking 2019-06-17 11:02:39 +02:00
Thomas Wolf
80684f6f86
Merge pull request #690 from shashwath94/projadpsftmax_fix
Transformer XL ProjectedAdaptiveLogSoftmax output fix
2019-06-15 23:14:10 +02:00
Thomas Wolf
9e363703d6
Merge pull request #688 from deepset-ai/german_bert
Add German Bert model to code, update readme
2019-06-15 23:13:41 +02:00
Thomas Wolf
cc6cd430f7
Merge pull request #691 from vanche/master
import class "GPT2MultipleChoiceHead"
2019-06-15 23:12:55 +02:00
vanche
8289646d4e
import class "GPT2MultipleChoiceHead" 2019-06-15 22:19:30 +09:00
Shashwath H A
5076a5daa7 Fix proj adp softmax output return when n_clusters=0 2019-06-14 22:03:21 -04:00
timoeller
16af9ff7b0 Add German Bert model to code, update readme 2019-06-14 17:42:46 +02:00
Thomas Wolf
b3f9e9451b
Merge pull request #687 from huggingface/tests_and_doc
Updating tests and doc
2019-06-14 17:23:45 +02:00
thomwolf
44e9ddd7fe fix num_special_tokens in GPT 2 test 2019-06-14 17:17:43 +02:00
Thomas Wolf
cad88e19de
Merge pull request #672 from oliverguhr/master
Add vocabulary and model config to the finetune output
2019-06-14 17:02:47 +02:00
Thomas Wolf
c6de625229
Merge pull request #655 from huggingface/finish_torchhub_interfaces
Finish torchhub interfaces
2019-06-14 17:02:08 +02:00
Thomas Wolf
ff276fc00c
Merge branch 'master' into finish_torchhub_interfaces 2019-06-14 16:59:07 +02:00
Thomas Wolf
a64736dc23
Merge pull request #646 from Colanim/patch-1
Fix link in README
2019-06-14 16:57:45 +02:00
Thomas Wolf
460d9afd45
Merge pull request #640 from Barqawiz/master
Support latest multi language bert fine tune
2019-06-14 16:57:02 +02:00
Thomas Wolf
277c77f1c5
Merge pull request #630 from tguens/master
Update run_squad.py
2019-06-14 16:56:26 +02:00
Thomas Wolf
659af2cbd0
Merge pull request #604 from samuelbroscheit/master
Fixing issue "Training beyond specified 't_total' steps with schedule 'warmup_linear'" reported in #556
2019-06-14 16:49:24 +02:00
Thomas Wolf
2d6a53490d
Merge pull request #597 from huggingface/attention
GPT-2 (medium size model, special_tokens, fine-tuning, attention) + repo code coverage metric
2019-06-14 16:47:32 +02:00
Thomas Wolf
35e6baab37
Merge branch 'master' into attention 2019-06-14 16:41:56 +02:00
thomwolf
5e1207b8ad add attention to all bert models and add test 2019-06-14 16:28:25 +02:00
thomwolf
bcc9e93e6f fix test 2019-06-14 15:38:20 +02:00
Thomas Wolf
f9cde97b31
Merge pull request #675 from meetshah1995/patch-1
[hotfix] Fix frozen pooler parameters in SWAG example.
2019-06-12 10:01:21 +02:00
Meet Pragnesh Shah
e02ce4dc79
[hotfix] Fix frozen pooler parameters in SWAG example. 2019-06-11 15:13:53 -07:00
Oliver Guhr
5c08c8c273 adds the tokenizer + model config to the output 2019-06-11 13:46:33 +02:00