samuelbroscheit
94247ad6cb
Make num_train_optimization_steps int
2019-05-13 12:38:22 +02:00
samuel.broscheit
49a77ac16f
Clean up a little bit
2019-05-12 00:31:10 +02:00
samuel.broscheit
3bf3f9596f
Fixing the issues reported in https://github.com/huggingface/pytorch-pretrained-BERT/issues/556
Reason for the issue: the number of optimization steps was computed from the number of examples, which differs from the actual length of the dataloader when an example is chunked into multiple instances.
The solution in this pull request is to compute num_optimization_steps directly from len(data_loader).
2019-05-12 00:13:45 +02:00
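A minimal sketch of the fix described in this commit; the function and argument names are illustrative, not the exact code from the example scripts:

```python
import math

def num_train_optimization_steps(train_dataloader, num_train_epochs,
                                 gradient_accumulation_steps):
    # Count optimizer steps from the dataloader itself rather than from the
    # raw example count: chunking one example into several instances changes
    # len(train_dataloader) but not the number of examples.
    steps_per_epoch = math.ceil(len(train_dataloader) / gradient_accumulation_steps)
    return int(steps_per_epoch * num_train_epochs)  # int, per the follow-up commit
```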
Thomas Wolf
3fc63f126d
Merge pull request #598 from burcturkoglu/master
Updating learning rate with special warm up in examples
2019-05-10 13:48:12 +02:00
burcturkoglu
00c7fd2b79
The division of global_step by num_train_optimizer in lr_this_step is removed.
2019-05-09 10:57:03 +03:00
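For context, a self-contained sketch of the linear warmup these examples apply manually; warmup_linear_lr is a stand-in for the library schedule, and the point of the change is that the call site passes the raw global_step instead of pre-dividing it by the total step count:

```python
def warmup_linear_lr(base_lr, global_step, t_total, warmup_proportion):
    # Linear warmup to base_lr over the first warmup_proportion * t_total
    # steps, then linear decay to zero; the schedule owns t_total, so the
    # caller no longer divides global_step by the total step count itself.
    progress = global_step / max(1, t_total)
    if progress < warmup_proportion:
        return base_lr * progress / warmup_proportion
    return base_lr * max(0.0, (1.0 - progress) / (1.0 - warmup_proportion))
```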
burcturkoglu
fa37b4da77
Merge branch 'master' of https://github.com/huggingface/pytorch-pretrained-BERT
2019-05-09 10:55:24 +03:00
burcturkoglu
5289b4b9e0
The division of global_step by num_train_optimizer in lr_this_step is removed.
2019-05-09 10:51:38 +03:00
thomwolf
275179a003
output attentions in GPT-2
2019-05-08 22:24:42 +02:00
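The pattern behind exposing attention weights, as a hedged sketch (not the actual GPT-2 module code): the attention function returns the softmaxed weights alongside the attended values, so each layer can surface its attention maps to the caller.

```python
import torch
import torch.nn.functional as F

def attention_with_weights(q, k, v):
    # Scaled dot-product attention that also returns the attention
    # probabilities, so callers can collect one map per layer.
    w = torch.matmul(q, k.transpose(-1, -2)) / (k.size(-1) ** 0.5)
    w = F.softmax(w, dim=-1)
    return torch.matmul(w, v), w
```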
thomwolf
366a3b0285
clean up in tokenization
2019-05-08 21:43:51 +02:00
Thomas Wolf
701bd59b8b
Merge pull request #585 from huntzhan/master
Make the epsilon of LayerNorm configurable.
2019-05-08 16:56:38 +02:00
Thomas Wolf
303b5e2b92
Merge pull request #545 from ailzhang/cache_dir
move pytorch_pretrained_bert cache folder under same path as torch
2019-05-08 16:55:27 +02:00
Thomas Wolf
0198399d84
Merge pull request #570 from MottoX/fix-1
Create optimizer only when args.do_train is True
2019-05-08 16:07:50 +02:00
Thomas Wolf
50fa92c026
Merge pull request #571 from MottoX/patch-1
Fix documentation typo
2019-05-08 16:06:13 +02:00
thomwolf
0efc4ab632
adding dropout to GPT-2 and embedding dropout to GPT
2019-05-08 10:41:35 +02:00
thomwolf
ea9dbea9d5
update GPT2 loss computation for more flexibility
2019-05-07 23:27:18 +02:00
thomwolf
ce86336545
add predict_special_tokens option to GPT also
2019-05-07 16:47:22 +02:00
thomwolf
d1b6979aa5
GPT-2 option to avoid predicting special tokens
2019-05-07 16:25:53 +02:00
huntzhan
101ab4dd8e
Make the epsilon of LayerNorm configurable.
2019-05-06 00:26:21 +08:00
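A sketch close to the repository's own LayerNorm, showing what "configurable epsilon" means here: eps comes from the constructor (and hence the config) instead of being hard-coded, so checkpoints trained with a different eps (e.g. 1e-12 vs 1e-5) can be reproduced exactly.

```python
import torch
from torch import nn

class LayerNorm(nn.Module):
    def __init__(self, hidden_size, eps=1e-12):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(hidden_size))
        self.bias = nn.Parameter(torch.zeros(hidden_size))
        self.variance_epsilon = eps  # configurable, no longer a literal

    def forward(self, x):
        u = x.mean(-1, keepdim=True)
        s = (x - u).pow(2).mean(-1, keepdim=True)
        x = (x - u) / torch.sqrt(s + self.variance_epsilon)
        return self.weight * x + self.bias
```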
Chris
41089bc7d3
added file to convert pytorch->tf
2019-05-02 13:26:22 -04:00
Chris
0a8b4d65be
added file to convert pytorch->tf
2019-05-02 13:20:59 -04:00
Chris
968c1b44cb
added file to convert pytorch->tf
2019-05-02 13:19:56 -04:00
Chris
96c2b77f0f
added file to convert pytorch->tf
2019-05-02 13:14:25 -04:00
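A rough sketch of what such a conversion file does, with a hypothetical name-mapping rule; the real script defines its own mapping and handles more cases.

```python
import numpy as np

def to_tf_var_name(name):
    # Hypothetical mapping: PyTorch dotted keys become TF variable scopes.
    return 'bert/' + name.replace('.', '/')

def convert_state_dict(state_dict):
    tf_weights = {}
    for name, tensor in state_dict.items():
        array = tensor.cpu().numpy()
        # nn.Linear stores (out, in) while TF dense kernels are (in, out);
        # embedding matrices are also 2-D and would need to be excluded
        # from this transpose (omitted here for brevity).
        if name.endswith('.weight') and array.ndim == 2:
            array = np.ascontiguousarray(array.T)
        tf_weights[to_tf_var_name(name)] = array
    return tf_weights
```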
thomwolf
e211785ada
extract attention weights from GPT
2019-05-02 18:31:26 +02:00
MottoX
18c8aef9d3
Fix documentation typo
2019-05-02 19:23:36 +08:00
MottoX
74dbba64bc
Prepare optimizer only when args.do_train is True
2019-05-02 19:09:29 +08:00
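The shape of the guard, sketched with the grouped-parameter setup the BERT examples use; args, model, and num_train_optimization_steps are assumed to come from the surrounding script.

```python
from pytorch_pretrained_bert.optimization import BertAdam

optimizer = None
if args.do_train:
    # Build the optimizer (and its weight-decay groups) only for training;
    # evaluation-only runs skip this entirely.
    no_decay = ['bias', 'LayerNorm.bias', 'LayerNorm.weight']
    grouped = [
        {'params': [p for n, p in model.named_parameters()
                    if not any(nd in n for nd in no_decay)], 'weight_decay': 0.01},
        {'params': [p for n, p in model.named_parameters()
                    if any(nd in n for nd in no_decay)], 'weight_decay': 0.0},
    ]
    optimizer = BertAdam(grouped, lr=args.learning_rate,
                         warmup=args.warmup_proportion,
                         t_total=num_train_optimization_steps)
```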
thomwolf
db98a4a48b
gpt-2 tokenizer
2019-05-01 11:40:48 +02:00
Thomas Wolf
3ae8c8be1e
Merge pull request #562 from apappu97/roc_stories_lmlabels_fix
Small fix to remove shifting of lm labels during preprocessing of RocStories.
2019-05-01 11:20:17 +02:00
Thomas Wolf
e89520175d
Merge pull request #564 from 8enmann/patch-2
Fix #537
2019-05-01 11:18:46 +02:00
Ben Mann
74f7906db4
Fix #537
2019-04-30 19:48:22 -07:00
Aneesh Pappu
365fb34c6c
small fix to remove shifting of lm labels during preprocessing of RocStories, as this shifting happens internally in the model
2019-04-30 13:53:04 -07:00
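Why the preprocessing shift was redundant, sketched: the LM head aligns logits and labels itself, scoring position i against token i+1, so labels must arrive unshifted.

```python
import torch
from torch import nn

def lm_loss(lm_logits, lm_labels, ignore_index=-1):
    # The shift happens here, inside the model: drop the last logit and the
    # first label so each position predicts the next token.
    shift_logits = lm_logits[..., :-1, :].contiguous()
    shift_labels = lm_labels[..., 1:].contiguous()
    loss_fct = nn.CrossEntropyLoss(ignore_index=ignore_index)
    return loss_fct(shift_logits.view(-1, shift_logits.size(-1)),
                    shift_labels.view(-1))
```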
thomwolf
cd110835a0
coverage in CircleCI
2019-04-30 11:35:40 +02:00
Thomas Wolf
2dee86319d
Merge pull request #527 from Mathieu-Prouveur/fix_value_training_loss
Update example files so that tr_loss is not affected by args.gradient…
2019-04-30 11:12:55 +02:00
thomwolf
80f53f7380
gpt-2 from_pretrained can use special tokens
2019-04-30 11:10:22 +02:00
thomwolf
e79ceb1533
gpt-2 special tokens
2019-04-30 11:05:54 +02:00
thomwolf
1f5fc95b68
add code coverage
2019-04-30 11:05:26 +02:00
thomwolf
c30139a013
add special tokens to gpt-2
2019-04-30 10:45:26 +02:00
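What adding special tokens entails mechanically, as a sketch: the token embedding gets extra rows, with the pretrained rows copied over; these commits wire that through GPT-2 and its from_pretrained.

```python
from torch import nn

def add_special_token_rows(old_embeddings, num_special_tokens):
    # Grow the embedding matrix by num_special_tokens rows; pretrained rows
    # are copied, so only the new special-token rows start untrained.
    old_num, dim = old_embeddings.weight.shape
    new_embeddings = nn.Embedding(old_num + num_special_tokens, dim)
    new_embeddings.weight.data[:old_num] = old_embeddings.weight.data
    return new_embeddings
```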
Mathieu Prouveur
87b9ec3843
Fix tr_loss rescaling factor using global_step
2019-04-29 12:58:29 +02:00
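A sketch of the accounting, with illustrative names: each micro-batch loss is scaled down by gradient_accumulation_steps before backward(), so the running sum is per optimizer step and the mean must divide by global_step, not by the micro-batch count.

```python
tr_loss, global_step = 0.0, 0
for step, batch in enumerate(train_dataloader):
    loss = model(*batch)  # assumed to return the training loss
    if args.gradient_accumulation_steps > 1:
        loss = loss / args.gradient_accumulation_steps
    loss.backward()
    tr_loss += loss.item()
    if (step + 1) % args.gradient_accumulation_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
        global_step += 1

mean_loss = tr_loss / max(1, global_step)  # unaffected by the accumulation factor
```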
Ailing Zhang
3963d57c89
move pytorch_pretrained_bert cache folder under same path as torch
2019-04-27 11:09:11 -07:00
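The resolution order this commit introduces, sketched close to the repository's file_utils of that era: reuse torch's cache root, keeping the env-var override.

```python
import os

# Cache lands under the same root torch uses (~/.cache/torch by default),
# still overridable via PYTORCH_PRETRAINED_BERT_CACHE.
torch_cache_home = os.path.expanduser(
    os.getenv('TORCH_HOME',
              os.path.join(os.getenv('XDG_CACHE_HOME', '~/.cache'), 'torch')))
default_cache_path = os.path.join(torch_cache_home, 'pytorch_pretrained_bert')
PYTORCH_PRETRAINED_BERT_CACHE = os.getenv('PYTORCH_PRETRAINED_BERT_CACHE',
                                          default_cache_path)
```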
thomwolf
b832d5bb8a
Release: 0.6.2
2019-04-25 21:37:47 +02:00
Thomas Wolf
e6cf62d499
Merge pull request #488 from dhpollack/fix_multichoice
fixed BertForMultipleChoice model init and forward pass
2019-04-25 21:04:16 +02:00
Thomas Wolf
1cc1c3c344
Merge pull request #533 from lukovnikov/master
Docs for new learning rate code
2019-04-25 21:02:35 +02:00
Thomas Wolf
dee8af4e46
Merge pull request #518 from huggingface/schedules_in_examples
Fix training schedules in examples to match new API
2019-04-25 21:01:04 +02:00
lukovnikov
56a47ce2b7
- replaced OpenAIGPTAdam with OpenAIAdam in docs
2019-04-25 16:05:28 +02:00
lukovnikov
331a46ff04
- replaced OpenAIGPTAdam with OpenAIAdam in docs
2019-04-25 16:04:37 +02:00
lukovnikov
704037ad51
- updated docs for new LR API
- added some images for illustration
- updated comments in optimization
2019-04-25 15:59:39 +02:00
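The gist of the new LR API those docs cover, sketched under the assumption that the 0.6.2 optimization module exports the schedule classes by these names; parameter values are illustrative.

```python
from torch import nn
from pytorch_pretrained_bert.optimization import BertAdam, WarmupLinearSchedule

model = nn.Linear(10, 2)  # placeholder model

# Old-style: the schedule is named by string and parametrized on the optimizer.
optimizer = BertAdam(model.parameters(), lr=5e-5,
                     warmup=0.1, t_total=1000, schedule='warmup_linear')

# New-style: a schedule object carries warmup/t_total itself.
schedule = WarmupLinearSchedule(warmup=0.1, t_total=1000)
optimizer = BertAdam(model.parameters(), lr=5e-5, schedule=schedule)
```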
Thomas Wolf
d76a57b0ba
Merge pull request #506 from ailzhang/hubconf
Hubconf
2019-04-24 20:59:21 +02:00
thomwolf
80f995a141
revert BertForMultipleChoice linear classifier
2019-04-24 16:51:54 +02:00
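The multiple-choice head pattern that the earlier init/forward fix and this revert concern, as a generic sketch: choices are flattened into the batch dimension, encoded, scored by a single Linear(hidden, 1), then reshaped back to per-choice scores.

```python
from torch import nn

class MultipleChoiceHead(nn.Module):
    def __init__(self, encoder, hidden_size):
        super().__init__()
        self.encoder = encoder          # assumed to return a pooled (N, hidden) vector
        self.classifier = nn.Linear(hidden_size, 1)

    def forward(self, input_ids):
        batch, num_choices, seq_len = input_ids.shape
        flat = input_ids.view(-1, seq_len)        # (batch*num_choices, seq)
        pooled = self.encoder(flat)               # (batch*num_choices, hidden)
        logits = self.classifier(pooled)          # (batch*num_choices, 1)
        return logits.view(batch, num_choices)    # one score per choice
```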
Mathieu Prouveur
ed8fad7390
Update example files so that tr_loss is not affected by args.gradient_accumulation_steps
2019-04-24 14:07:00 +02:00
thomwolf
d94c6b0144
fix training schedules in examples to match new API
2019-04-23 11:17:06 +02:00
Thomas Wolf
c36cca075a
Merge pull request #515 from Rocketknight1/master
Fix --reduce_memory in finetune_on_pregenerated
2019-04-23 10:30:23 +02:00
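The idea behind --reduce_memory, sketched with numpy; shapes and filenames here are illustrative: pregenerated instances live in memory-mapped arrays instead of Python lists, so epochs larger than RAM remain indexable like ordinary arrays.

```python
import numpy as np

num_samples, seq_len = 100000, 128  # illustrative sizes
input_ids = np.memmap('input_ids.memmap', dtype=np.int32, mode='w+',
                      shape=(num_samples, seq_len))
input_ids[0] = np.zeros(seq_len, dtype=np.int32)  # written through to disk
```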