Commit Graph

829 Commits

Thomas Wolf
35e6baab37
Merge branch 'master' into attention 2019-06-14 16:41:56 +02:00
thomwolf
5e1207b8ad add attention to all bert models and add test 2019-06-14 16:28:25 +02:00
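The attention-output commits above add an `output_attentions` flag to the BERT models. Below is a minimal usage sketch, assuming the flag is accepted as a constructor keyword forwarded by `from_pretrained`, and that the per-layer attention weights are returned ahead of the usual outputs; both assumptions come from the commit descriptions, not the exact code.

```python
import torch
from pytorch_pretrained_bert import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
# Assumption: `output_attentions` is forwarded to the model constructor.
model = BertModel.from_pretrained('bert-base-uncased', output_attentions=True)
model.eval()

tokens = tokenizer.tokenize("[CLS] hello world [SEP]")
input_ids = torch.tensor([tokenizer.convert_tokens_to_ids(tokens)])
with torch.no_grad():
    # Assumption: attentions are returned before the usual outputs.
    attentions, encoded_layers, pooled = model(input_ids)

# One tensor per layer, each (batch, num_heads, seq_len, seq_len).
print(len(attentions), attentions[0].shape)
```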
thomwolf
bcc9e93e6f fix test 2019-06-14 15:38:20 +02:00
Thomas Wolf
f9cde97b31
Merge pull request #675 from meetshah1995/patch-1
[hotfix] Fix frozen pooler parameters in SWAG example.
2019-06-12 10:01:21 +02:00
Meet Pragnesh Shah
e02ce4dc79
[hotfix] Fix frozen pooler parameters in SWAG example. 2019-06-11 15:13:53 -07:00
Thomas Wolf
784c0ed89a
Merge pull request #668 from jeonsworld/patch-2
apply Whole Word Masking technique
2019-06-11 11:29:10 +02:00
jeonsworld
a3a604cefb
Update pregenerate_training_data.py
apply Whole Word Masking technique.
references [create_pretraining_data.py](https://github.com/google-research/bert/blob/master/create_pretraining_data.py)
2019-06-10 12:17:23 +09:00
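PR #668 changes `pregenerate_training_data.py` so that when any WordPiece of a word is selected for masking, all of its sub-tokens are masked together. A self-contained illustration of the idea (not the script's exact code; WordPiece continuations start with `##`):

```python
import random

def whole_word_mask(tokens, mask_prob=0.15):
    # Group sub-token indices into words: a '##' piece joins the previous word.
    words = []
    for i, tok in enumerate(tokens):
        if tok in ("[CLS]", "[SEP]"):
            continue
        if words and tok.startswith("##"):
            words[-1].append(i)
        else:
            words.append([i])

    random.shuffle(words)
    num_to_mask = max(1, round(len(tokens) * mask_prob))
    output, masked = list(tokens), 0
    for word in words:
        if masked + len(word) > num_to_mask:
            continue
        for i in word:                 # mask every piece of the chosen word
            output[i] = "[MASK]"
        masked += len(word)
    return output

print(whole_word_mask("[CLS] un ##aff ##able weather today [SEP]".split()))
```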
VictorSanh
ee0308f79d fix typo 2019-06-06 17:30:49 +02:00
VictorSanh
2d07f945ad fix error with torch.no_grad and loss computation 2019-06-06 17:10:24 +02:00
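The `torch.no_grad` fix above points at a common pitfall worth illustrating: a loss computed inside a `no_grad()` block carries no autograd graph, so calling `backward()` on it fails. A small standalone example of the pitfall (not the repository's actual code):

```python
import torch
import torch.nn.functional as F

model = torch.nn.Linear(4, 1)
x, y = torch.randn(8, 4), torch.randn(8, 1)

# Wrong for training: under no_grad the loss has no grad_fn.
with torch.no_grad():
    eval_loss = F.mse_loss(model(x), y)
print(eval_loss.requires_grad)  # False -> eval_loss.backward() would raise

# Right: keep the training forward pass outside no_grad.
train_loss = F.mse_loss(model(x), y)
train_loss.backward()
```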
VictorSanh
6b8d227092 some cleaning 2019-06-06 17:07:03 +02:00
VictorSanh
122d5c52ac distinguish what is not trained 2019-06-06 17:02:51 +02:00
VictorSanh
2647ac3294 forgot BertForPreTraining 2019-06-06 16:57:40 +02:00
VictorSanh
cf44d98392 Add more examples to BERT models for torchhub 2019-06-06 16:36:02 +02:00
thomwolf
a3274ac40b adding attention outputs in bert 2019-06-03 16:11:45 -05:00
VictorSanh
826496580b Revert "add output_attentions for BertModel"
This reverts commit de5e5682a1.
2019-06-03 17:10:25 -04:00
VictorSanh
de5e5682a1 add output_attentions for BertModel 2019-06-03 17:05:24 -04:00
Thomas Wolf
2a329c6186
Merge pull request #651 from huggingface/gpt_torchhub
Add GPT* compatibility to torchhub
2019-05-31 14:44:52 +02:00
VictorSanh
45d21502f0 update doc 2019-05-31 01:04:16 -04:00
VictorSanh
98f5c7864f decorrelate dependencies + fix bug 2019-05-31 01:00:29 -04:00
VictorSanh
c8bd026ef6 move dependencies list to hubconf 2019-05-31 00:36:58 -04:00
VictorSanh
19ef2b0a66 Fix typo in hubconf 2019-05-31 00:33:33 -04:00
VictorSanh
d0f591051c gpt_hubconf 2019-05-31 00:28:10 -04:00
VictorSanh
4a210c9fc6 Move bert_hubconf to hubconfs 2019-05-31 00:28:00 -04:00
VictorSanh
0c5a4fe9c9 modify from_pretrained for OpenAIGPT 2019-05-31 00:27:18 -04:00
VictorSanh
372a5c1cee Hubconf doc - Special case loading 2019-05-30 16:06:21 -04:00
Victor SANH
96592b544b
default in __init__s for classification BERT models (#650) 2019-05-30 15:53:13 -04:00
VictorSanh
4cda86b08f Update hubconf for torchhub: paths+examples+doc 2019-05-30 18:38:00 +00:00
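The torchhub series above adds `hubconf.py` entry points so the models load through `torch.hub`. A minimal sketch of the pattern (the entry-point name and dependency list here are illustrative, not copied from the repository):

```python
# hubconf.py -- torch.hub entry-point pattern (illustrative)
dependencies = ['torch', 'tqdm', 'boto3', 'requests', 'regex']

def bertModel(*args, **kwargs):
    """Load a pretrained BertModel, e.g. bertModel('bert-base-uncased')."""
    from pytorch_pretrained_bert import BertModel
    return BertModel.from_pretrained(*args, **kwargs)
```

With such a file on the default branch, `torch.hub.load('huggingface/pytorch-pretrained-BERT', 'bertModel', 'bert-base-uncased')` fetches the repo and builds the model.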
Thomas Wolf
3fc63f126d
Merge pull request #598 from burcturkoglu/master
Updating learning rate with special warm up in examples
2019-05-10 13:48:12 +02:00
burcturkoglu
00c7fd2b79 Division of global_step by num_train_optimizer in lr_this_step is removed. 2019-05-09 10:57:03 +03:00
burcturkoglu
fa37b4da77 Merge branch 'master' of https://github.com/huggingface/pytorch-pretrained-BERT 2019-05-09 10:55:24 +03:00
burcturkoglu
5289b4b9e0 Division of global_step by num_train_optimizer in lr_this_step is removed. 2019-05-09 10:51:38 +03:00
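The PR #598 commits above fix how the example scripts compute the per-step learning rate: the linear-warmup schedule should receive the training-progress fraction once, without a redundant division of `global_step`. A sketch of the package-style schedule and the intended call (the variable names in the comment are assumptions inferred from the commit message):

```python
def warmup_linear(x, warmup=0.002):
    """pytorch-pretrained-bert-style schedule: linear warmup, then linear decay."""
    if x < warmup:
        return x / warmup
    return 1.0 - x

# Intended per-step update in the examples (sketch):
# lr_this_step = args.learning_rate * warmup_linear(
#     global_step / num_train_optimization_steps, args.warmup_proportion)
```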
thomwolf
275179a003 output attentions in GPT-2 2019-05-08 22:24:42 +02:00
thomwolf
366a3b0285 clean up in tokenization 2019-05-08 21:43:51 +02:00
Thomas Wolf
701bd59b8b
Merge pull request #585 from huntzhan/master
Make the epsilon of LayerNorm configurable.
2019-05-08 16:56:38 +02:00
Thomas Wolf
303b5e2b92
Merge pull request #545 from ailzhang/cache_dir
move pytorch_pretrained_bert cache folder under same path as torch
2019-05-08 16:55:27 +02:00
Thomas Wolf
0198399d84
Merge pull request #570 from MottoX/fix-1
Create optimizer only when args.do_train is True
2019-05-08 16:07:50 +02:00
Thomas Wolf
50fa92c026
Merge pull request #571 from MottoX/patch-1
Fix documentation typo
2019-05-08 16:06:13 +02:00
thomwolf
0efc4ab632 adding dropout to GPT-2 and embedding dropout to GPT 2019-05-08 10:41:35 +02:00
thomwolf
ea9dbea9d5 update GPT2 loss computation for more flexibility 2019-05-07 23:27:18 +02:00
thomwolf
ce86336545 add predict_special_tokens option to GPT also 2019-05-07 16:47:22 +02:00
thomwolf
d1b6979aa5 GPT-2 option to avoid predicting special tokens 2019-05-07 16:25:53 +02:00
huntzhan
101ab4dd8e Make the epsilon of LayerNorm configurable. 2019-05-06 00:26:21 +08:00
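PR #585 makes the variance epsilon of BERT's LayerNorm a parameter instead of a hard-coded constant. A sketch of the module with the configurable `eps`, following the package's TF-style LayerNorm (the config plumbing that feeds `eps` is omitted):

```python
import torch

class BertLayerNorm(torch.nn.Module):
    """LayerNorm with a configurable variance epsilon."""
    def __init__(self, hidden_size, eps=1e-12):
        super(BertLayerNorm, self).__init__()
        self.weight = torch.nn.Parameter(torch.ones(hidden_size))
        self.bias = torch.nn.Parameter(torch.zeros(hidden_size))
        self.variance_epsilon = eps  # previously fixed, now configurable

    def forward(self, x):
        u = x.mean(-1, keepdim=True)
        s = (x - u).pow(2).mean(-1, keepdim=True)
        x = (x - u) / torch.sqrt(s + self.variance_epsilon)
        return self.weight * x + self.bias
```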
thomwolf
e211785ada extract attention weights from GPT 2019-05-02 18:31:26 +02:00
MottoX
18c8aef9d3 Fix documentation typo 2019-05-02 19:23:36 +08:00
MottoX
74dbba64bc Prepare optimizer only when args.do_train is True 2019-05-02 19:09:29 +08:00
thomwolf
db98a4a48b gpt-2 tokenizer 2019-05-01 11:40:48 +02:00
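The GPT-2 tokenizer commit brings byte-level BPE into the package: any UTF-8 string tokenizes without unknown tokens and round-trips exactly. A hedged usage sketch, assuming the package exposes a `GPT2Tokenizer` with `from_pretrained` / `encode` / `decode`:

```python
from pytorch_pretrained_bert import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
ids = tokenizer.encode("Hello world!")   # byte-level BPE ids
print(ids)
print(tokenizer.decode(ids))             # round-trips the original text
```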
Thomas Wolf
3ae8c8be1e
Merge pull request #562 from apappu97/roc_stories_lmlabels_fix
Small fix to remove shifting of lm labels during preprocessing of RocStories.
2019-05-01 11:20:17 +02:00
Thomas Wolf
e89520175d
Merge pull request #564 from 8enmann/patch-2
Fix #537
2019-05-01 11:18:46 +02:00
Ben Mann
74f7906db4
Fix #537 2019-04-30 19:48:22 -07:00
Aneesh Pappu
365fb34c6c small fix to remove shifting of lm labels during preprocessing of RocStories, as this shifting happens internally in the model 2019-04-30 13:53:04 -07:00
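The point of PR #562 is that the RocStories double-heads model already shifts logits and labels internally when computing the LM loss, so preprocessing must pass labels aligned with the inputs. A minimal sketch of that internal-shift convention (an illustration, not the model's exact code):

```python
import torch
import torch.nn.functional as F

def lm_loss_with_internal_shift(lm_logits, lm_labels, ignore_index=-1):
    """Next-token loss: the model shifts, so callers pass unshifted labels."""
    shift_logits = lm_logits[:, :-1, :].contiguous()  # predict token t+1 from t
    shift_labels = lm_labels[:, 1:].contiguous()
    return F.cross_entropy(shift_logits.view(-1, shift_logits.size(-1)),
                           shift_labels.view(-1), ignore_index=ignore_index)

logits = torch.randn(2, 10, 100)           # (batch, seq_len, vocab)
labels = torch.randint(0, 100, (2, 10))    # same length as inputs, unshifted
print(lm_loss_with_internal_shift(logits, labels))
```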