Commit Graph

249 Commits

Author SHA1 Message Date
thomwolf
44e9ddd7fe fix num_special_tokens in GPT-2 test 2019-06-14 17:17:43 +02:00
Thomas Wolf
ff276fc00c Merge branch 'master' into finish_torchhub_interfaces 2019-06-14 16:59:07 +02:00
Thomas Wolf
35e6baab37 Merge branch 'master' into attention 2019-06-14 16:41:56 +02:00
thomwolf
5e1207b8ad add attention to all bert models and add test 2019-06-14 16:28:25 +02:00
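The attention-output commits here (and in the GPT/GPT-2 entries further down) expose each layer's attention weights to the caller. As a rough, self-contained sketch of the idea, not the repository's actual code, a self-attention layer can simply return its softmax probabilities alongside its output:

```python
# Illustrative sketch only: a self-attention layer that returns its attention
# probabilities together with its output, which is the general idea behind
# exposing "attention outputs" from BERT/GPT-style models.
import math
import torch
import torch.nn as nn

class SelfAttentionWithWeights(nn.Module):
    def __init__(self, hidden_size, num_heads):
        super().__init__()
        assert hidden_size % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = hidden_size // num_heads
        self.qkv = nn.Linear(hidden_size, 3 * hidden_size)
        self.out = nn.Linear(hidden_size, hidden_size)

    def forward(self, x):
        b, t, h = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)

        def split(z):  # (b, t, h) -> (b, heads, t, head_dim)
            return z.view(b, t, self.num_heads, self.head_dim).transpose(1, 2)

        q, k, v = split(q), split(k), split(v)
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.head_dim)
        probs = scores.softmax(dim=-1)                    # per-head attention weights
        ctx = (probs @ v).transpose(1, 2).reshape(b, t, h)
        return self.out(ctx), probs                       # also hand back the weights
```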
thomwolf
a3274ac40b adding attention outputs in bert 2019-06-03 16:11:45 -05:00
VictorSanh
826496580b Revert "add output_attentions for BertModel"
This reverts commit de5e5682a1.
2019-06-03 17:10:25 -04:00
VictorSanh
de5e5682a1 add output_attentions for BertModel 2019-06-03 17:05:24 -04:00
VictorSanh
8f97f6c57f fix typo
cc @thomwolf
2019-06-01 17:29:07 -04:00
VictorSanh
c0c7ff5751 add transformer xl compatibility for torchhub 2019-06-01 16:08:24 -04:00
VictorSanh
a92b6dc3c1 add GPT2 torchhub compatibility 2019-06-01 15:27:43 -04:00
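The torchhub-compatibility commits above expose the models through `torch.hub`. A minimal sketch of what such a hubconf entry point can look like follows; the entry-point name and loading call are illustrative assumptions, not the repository's confirmed hubconf:

```python
# hubconf.py (sketch): torch.hub discovers top-level callables defined in this file.
dependencies = ['torch']

def gpt2Model(*args, **kwargs):
    """Hypothetical entry point returning a pretrained GPT-2 model."""
    from pytorch_pretrained_bert import GPT2Model
    return GPT2Model.from_pretrained('gpt2', *args, **kwargs)
```

Users would then load it with something like `torch.hub.load('huggingface/pytorch-pretrained-BERT', 'gpt2Model')` (repository path and entry-point name assumed).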
VictorSanh
0c5a4fe9c9 modify from_pretrained for OpenAIGPT 2019-05-31 00:27:18 -04:00
Victor SANH
96592b544b default in __init__s for classification BERT models (#650) 2019-05-30 15:53:13 -04:00
thomwolf
275179a003 output attentions in GPT-2 2019-05-08 22:24:42 +02:00
thomwolf
366a3b0285 clean up in tokenization 2019-05-08 21:43:51 +02:00
Thomas Wolf
701bd59b8b Merge pull request #585 from huntzhan/master
Make the epsilon of LayerNorm configurable.
2019-05-08 16:56:38 +02:00
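PR #585 makes the LayerNorm epsilon a parameter instead of a hard-coded constant. A minimal sketch of a layer norm with a configurable epsilon (illustrative, not the PR's exact diff):

```python
import torch
import torch.nn as nn

class BertLayerNorm(nn.Module):
    def __init__(self, hidden_size, eps=1e-12):   # eps exposed instead of hard-coded
        super().__init__()
        self.weight = nn.Parameter(torch.ones(hidden_size))
        self.bias = nn.Parameter(torch.zeros(hidden_size))
        self.variance_epsilon = eps

    def forward(self, x):
        u = x.mean(-1, keepdim=True)
        s = (x - u).pow(2).mean(-1, keepdim=True)
        x = (x - u) / torch.sqrt(s + self.variance_epsilon)
        return self.weight * x + self.bias
```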
Thomas Wolf
303b5e2b92 Merge pull request #545 from ailzhang/cache_dir
move pytorch_pretrained_bert cache folder under same path as torch
2019-05-08 16:55:27 +02:00
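PR #545 relocates the download cache so that pretrained weights live under the same base directory as other torch caches. The cache location can typically be overridden as well; the environment-variable and argument names below follow that era's library conventions but are stated here as assumptions:

```python
import os

# Assumed override points, shown as a sketch rather than documented behaviour.
# Set the env var before importing the library so it takes effect.
os.environ['PYTORCH_PRETRAINED_BERT_CACHE'] = '/tmp/my_bert_cache'

from pytorch_pretrained_bert import BertModel
model = BertModel.from_pretrained('bert-base-uncased', cache_dir='/tmp/my_bert_cache')
```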
thomwolf
0efc4ab632 adding dropout to GPT-2 and embedding dropout to GPT 2019-05-08 10:41:35 +02:00
thomwolf
ea9dbea9d5 update GPT2 loss computation for more flexibility 2019-05-07 23:27:18 +02:00
thomwolf
ce86336545 add predict_special_tokens option to GPT also 2019-05-07 16:47:22 +02:00
thomwolf
d1b6979aa5 GPT-2 option to avoid predicting special tokens 2019-05-07 16:25:53 +02:00
huntzhan
101ab4dd8e Make the epsilon of LayerNorm configurable. 2019-05-06 00:26:21 +08:00
thomwolf
e211785ada extract attention weights from GPT 2019-05-02 18:31:26 +02:00
thomwolf
db98a4a48b gpt-2 tokenizer 2019-05-01 11:40:48 +02:00
Ben Mann
74f7906db4 Fix #537 2019-04-30 19:48:22 -07:00
thomwolf
80f53f7380 gpt-2 from_pretrained can use special tokens 2019-04-30 11:10:22 +02:00
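The special-token commits here and just below let GPT-2's `from_pretrained` extend the vocabulary with user-defined tokens, mirroring the existing GPT behaviour. A hedged usage sketch (argument names are assumptions based on the GPT conventions these commits reference):

```python
from pytorch_pretrained_bert import GPT2Tokenizer, GPT2LMHeadModel

special = ['<bos>', '<eos>', '<pad>']
# Tokenizer and model both need to know about the added tokens so the
# embedding matrix is resized consistently with the extended vocabulary.
tokenizer = GPT2Tokenizer.from_pretrained('gpt2', special_tokens=special)
model = GPT2LMHeadModel.from_pretrained('gpt2', num_special_tokens=len(special))
```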
thomwolf
e79ceb1533 gpt-2 special tokens 2019-04-30 11:05:54 +02:00
thomwolf
c30139a013 add special tokens to gpt-2 2019-04-30 10:45:26 +02:00
Ailing Zhang
3963d57c89 move pytorch_pretrained_bert cache folder under same path as torch 2019-04-27 11:09:11 -07:00
thomwolf
b832d5bb8a Release: 0.6.2 2019-04-25 21:37:47 +02:00
Thomas Wolf
e6cf62d499 Merge pull request #488 from dhpollack/fix_multichoice
fixed BertForMultipleChoice model init and forward pass
2019-04-25 21:04:16 +02:00
lukovnikov
704037ad51 - updated docs for new LR API
- added some images for illustration
- updated comments in optimization
2019-04-25 15:59:39 +02:00
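The optimization commits from lukovnikov rework the learning-rate schedule API around schedule objects. The warmup-then-linear-decay shape they deal with is roughly the following (formula and default are an assumption for illustration, not the library's exact code):

```python
def warmup_linear(progress, warmup=0.1):
    """Return an LR multiplier given training progress in [0, 1] (sketch)."""
    if progress < warmup:
        return progress / warmup                             # linear ramp-up
    return max(0.0, (1.0 - progress) / (1.0 - warmup))       # linear decay to zero
```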
Thomas Wolf
d76a57b0ba Merge pull request #506 from ailzhang/hubconf
Hubconf
2019-04-24 20:59:21 +02:00
thomwolf
80f995a141 revert BertForMultipleChoice linear classifier 2019-04-24 16:51:54 +02:00
lukovnikov
69850b4011 python 2 compat 2019-04-21 14:02:38 +02:00
lukovnikov
bb7557d3ab - removed __all__ in optimization
- removed unused plotting code
- using ABC for LRSchedule
- added some schedule object init tests
2019-04-21 13:48:33 +02:00
lukovnikov
34ccc8ebf4 Merge remote-tracking branch 'upstream/master' 2019-04-21 13:16:15 +02:00
Ailing Zhang
bfd6f6b257 fix from_pretrained positional args 2019-04-17 16:31:40 -07:00
thomwolf
23d4554ec0 is python 2 happy now 2019-04-17 14:48:34 +02:00
thomwolf
265550ec34 relax network connection requirements 2019-04-17 14:22:35 +02:00
thomwolf
fa76520240 fix file_utils on python 2 2019-04-17 13:32:22 +02:00
thomwolf
bcde2c61cb fix #497 2019-04-17 12:35:38 +02:00
Thomas Wolf
2e153930cf Merge pull request #495 from SudoSharma/patch-2
Fix gradient overflow issue during attention mask
2019-04-17 11:10:36 +02:00
thomwolf
5afa497cbf fix GPT-2 tokenization to work also on python 3... 2019-04-17 11:04:41 +02:00
thomwolf
bc70779bf0 fixed GPT-2 tokenization on python 2 2019-04-17 10:56:15 +02:00
Abhi Sharma
9e666aaa29 Fix gradient overflow issue during attention mask
This fix is in reference to issue #382. GPT2 can now be trained in mixed precision, which I've confirmed with testing. I also tested unconditional generation on multiple seeds before and after changing 1e10 to 1e4 and there was no difference. Please let me know if there is anything else I can do to make this pull request better. Thanks for all your work!
2019-04-16 11:42:34 -07:00
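The fix above works because float16 cannot represent magnitudes much beyond 65504: an additive mask of -1e10 overflows to -inf and can poison mixed-precision training with NaNs, while -1e4 is still a very large negative bias for the softmax and remains representable. A small sketch of the difference (illustrative, not the PR's diff):

```python
import torch

mask = torch.tensor([1.0, 1.0, 0.0])        # 1 = attend, 0 = mask out
bad  = (1.0 - mask) * -1e10                 # overflows once cast to float16
good = (1.0 - mask) * -1e4                  # representable in float16

print(bad.half())    # masked position becomes -inf
print(good.half())   # masked position stays at -10000.0
```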
thomwolf
bdaba1897c updating GPT tokenization 2019-04-16 17:44:06 +02:00
thomwolf
18a8a15f78 improving GPT2 tokenization and adding tests 2019-04-16 17:00:55 +02:00
Thomas Wolf
3d78e226e6 Merge pull request #489 from huggingface/tokenization_serialization
Better serialization for Tokenizers and Configuration classes - Also fix #466
2019-04-16 08:49:54 +02:00
Thomas Wolf
64b6ef4db0 Merge pull request #490 from huggingface/better_finetuning_GPT_GPT-2
Clean up GPT and GPT-2 losses computation
2019-04-15 16:14:50 +02:00
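PR #490 reorganizes how the GPT and GPT-2 language-modeling losses are computed. The underlying loss is the standard shift-by-one cross entropy; a hedged sketch of that shape (not the repository's exact code, and the ignore index is an assumption):

```python
import torch.nn.functional as F

def lm_loss(lm_logits, labels, ignore_index=-1):
    # Position t predicts token t+1: drop the last logit and the first label.
    shift_logits = lm_logits[..., :-1, :].contiguous()
    shift_labels = labels[..., 1:].contiguous()
    return F.cross_entropy(
        shift_logits.view(-1, shift_logits.size(-1)),
        shift_labels.view(-1),
        ignore_index=ignore_index,
    )
```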
thomwolf
d616022455 fix openai special tokens loading 2019-04-15 16:07:45 +02:00