transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Thomas Wolf	277c77f1c5	Merge pull request #630 from tguens/master Update run_squad.py	2019-06-14 16:56:26 +02:00
Thomas Wolf	659af2cbd0	Merge pull request #604 from samuelbroscheit/master Fixing issue "Training beyond specified 't_total' steps with schedule 'warmup_linear'" reported in #556	2019-06-14 16:49:24 +02:00
Thomas Wolf	2d6a53490d	Merge pull request #597 from huggingface/attention GPT-2 (medium size model, special_tokens, fine-tuning, attention) + repo code coverage metric	2019-06-14 16:47:32 +02:00
Thomas Wolf	35e6baab37	Merge branch 'master' into attention	2019-06-14 16:41:56 +02:00
thomwolf	5e1207b8ad	add attention to all bert models and add test	2019-06-14 16:28:25 +02:00
thomwolf	bcc9e93e6f	fix test	2019-06-14 15:38:20 +02:00
Thomas Wolf	f9cde97b31	Merge pull request #675 from meetshah1995/patch-1 [hotfix] Fix frozen pooler parameters in SWAG example.	2019-06-12 10:01:21 +02:00
Meet Pragnesh Shah	e02ce4dc79	[hotfix] Fix frozen pooler parameters in SWAG example.	2019-06-11 15:13:53 -07:00
Thomas Wolf	784c0ed89a	Merge pull request #668 from jeonsworld/patch-2 apply Whole Word Masking technique	2019-06-11 11:29:10 +02:00
jeonsworld	a3a604cefb	Update pregenerate_training_data.py apply Whole Word Masking technique. referred to [create_pretraining_data.py](https://github.com/google-research/bert/blob/master/create_pretraining_data.py)	2019-06-10 12:17:23 +09:00
VictorSanh	ee0308f79d	fix typo	2019-06-06 17:30:49 +02:00
VictorSanh	2d07f945ad	fix error with torch.no_grad and loss computation	2019-06-06 17:10:24 +02:00
VictorSanh	6b8d227092	some cleaning	2019-06-06 17:07:03 +02:00
VictorSanh	122d5c52ac	distinguish what is not trained	2019-06-06 17:02:51 +02:00
VictorSanh	2647ac3294	forgot bertForPreTraining	2019-06-06 16:57:40 +02:00
VictorSanh	cf44d98392	Add more examples to BERT models for torchhub	2019-06-06 16:36:02 +02:00
thomwolf	a3274ac40b	adding attention outputs in bert	2019-06-03 16:11:45 -05:00
VictorSanh	826496580b	Revert "add output_attentions for BertModel" This reverts commit `de5e5682a1`.	2019-06-03 17:10:25 -04:00
VictorSanh	de5e5682a1	add output_attentions for BertModel	2019-06-03 17:05:24 -04:00
Thomas Wolf	2a329c6186	Merge pull request #651 from huggingface/gpt_torchhub Add GPT* compatibility to torchhub	2019-05-31 14:44:52 +02:00
VictorSanh	45d21502f0	update doc	2019-05-31 01:04:16 -04:00
VictorSanh	98f5c7864f	decorrelate dependencies + fix bug	2019-05-31 01:00:29 -04:00
VictorSanh	c8bd026ef6	move dependencies list to hubconf	2019-05-31 00:36:58 -04:00
VictorSanh	19ef2b0a66	Fix typo in hubconf	2019-05-31 00:33:33 -04:00
VictorSanh	d0f591051c	gpt_hubconf	2019-05-31 00:28:10 -04:00
VictorSanh	4a210c9fc6	Move bert_hubconf to hubconfs	2019-05-31 00:28:00 -04:00
VictorSanh	0c5a4fe9c9	modify from_pretrained for OpenAIGPT	2019-05-31 00:27:18 -04:00
VictorSanh	372a5c1cee	Hubconf doc - Special case loading	2019-05-30 16:06:21 -04:00
Victor SANH	96592b544b	default in __init__s for classification BERT models (#650 )	2019-05-30 15:53:13 -04:00
VictorSanh	4cda86b08f	Update hubconf for torchhub: paths+examples+doc	2019-05-30 18:38:00 +00:00
tguens	9e7bc51b95	Update run_squad.py Indentation change so that the output "nbest_predictions.json" is not empty.	2019-05-22 17:27:59 +08:00
samuelbroscheit	94247ad6cb	Make num_train_optimization_steps int	2019-05-13 12:38:22 +02:00
samuel.broscheit	49a77ac16f	Clean up a little bit	2019-05-12 00:31:10 +02:00
samuel.broscheit	3bf3f9596f	Fixing the issues reported in https://github.com/huggingface/pytorch-pretrained-BERT/issues/556 Reason for issue was that optimization steps were computed from example size, which is different from actual size of dataloader when an example is chunked into multiple instances. Solution in this pull request is to compute num_optimization_steps directly from len(data_loader).	2019-05-12 00:13:45 +02:00
Thomas Wolf	3fc63f126d	Merge pull request #598 from burcturkoglu/master Updating learning rate with special warm up in examples	2019-05-10 13:48:12 +02:00
burcturkoglu	00c7fd2b79	Division to num_train_optimizer of global_step in lr_this_step is removed.	2019-05-09 10:57:03 +03:00
burcturkoglu	fa37b4da77	Merge branch 'master' of https://github.com/huggingface/pytorch-pretrained-BERT	2019-05-09 10:55:24 +03:00
burcturkoglu	5289b4b9e0	Division to num_train_optimizer of global_step in lr_this_step is removed.	2019-05-09 10:51:38 +03:00
thomwolf	275179a003	output attentions in GPT-2	2019-05-08 22:24:42 +02:00
thomwolf	366a3b0285	clean up in tokenization	2019-05-08 21:43:51 +02:00
Thomas Wolf	701bd59b8b	Merge pull request #585 from huntzhan/master Make the epsilon of LayerNorm configurable.	2019-05-08 16:56:38 +02:00
Thomas Wolf	303b5e2b92	Merge pull request #545 from ailzhang/cache_dir move pytorch_pretrained_bert cache folder under same path as torch	2019-05-08 16:55:27 +02:00
Thomas Wolf	0198399d84	Merge pull request #570 from MottoX/fix-1 Create optimizer only when args.do_train is True	2019-05-08 16:07:50 +02:00
Thomas Wolf	50fa92c026	Merge pull request #571 from MottoX/patch-1 Fix documentation typo	2019-05-08 16:06:13 +02:00
thomwolf	0efc4ab632	adding dropout to GPT-2 and embedding dropout to GPT	2019-05-08 10:41:35 +02:00
thomwolf	ea9dbea9d5	update GPT2 loss computation for more flexibility	2019-05-07 23:27:18 +02:00
thomwolf	ce86336545	add predict_special_tokens option to GPT also	2019-05-07 16:47:22 +02:00
thomwolf	d1b6979aa5	GPT-2 option to avoid predicting special tokens	2019-05-07 16:25:53 +02:00
huntzhan	101ab4dd8e	Make the epsilon of LayerNorm configurable.	2019-05-06 00:26:21 +08:00
thomwolf	e211785ada	extract attention weights from GPT	2019-05-02 18:31:26 +02:00

836 Commits