Commit Graph

284 Commits

Author SHA1 Message Date
Thomas Wolf
303b5e2b92
Merge pull request #545 from ailzhang/cache_dir
move pytorch_pretrained_bert cache folder under same path as torch
2019-05-08 16:55:27 +02:00
thomwolf
0efc4ab632 adding dropout to GPT-2 and embedding dropout to GPT 2019-05-08 10:41:35 +02:00
thomwolf
ea9dbea9d5 update GPT2 loss computation for more flexibility 2019-05-07 23:27:18 +02:00
thomwolf
ce86336545 add predict_special_tokens option to GPT also 2019-05-07 16:47:22 +02:00
thomwolf
d1b6979aa5 GPT-2 option to avoid predicting special tokens 2019-05-07 16:25:53 +02:00
huntzhan
101ab4dd8e Make the epsilon of LayerNorm configurable. 2019-05-06 00:26:21 +08:00
thomwolf
e211785ada extract attention weights from GPT 2019-05-02 18:31:26 +02:00
thomwolf
db98a4a48b gpt-2 tokenizer 2019-05-01 11:40:48 +02:00
Ben Mann
74f7906db4
Fix #537 2019-04-30 19:48:22 -07:00
thomwolf
80f53f7380 gpt-2 from_pretrained can use special tokens 2019-04-30 11:10:22 +02:00
thomwolf
e79ceb1533 gpt-2 special tokens 2019-04-30 11:05:54 +02:00
thomwolf
c30139a013 add special tokens to gpt-2 2019-04-30 10:45:26 +02:00
Ailing Zhang
3963d57c89 move pytorch_pretrained_bert cache folder under same path as torch 2019-04-27 11:09:11 -07:00
thomwolf
b832d5bb8a Release: 0.6.2 2019-04-25 21:37:47 +02:00
Thomas Wolf
e6cf62d499
Merge pull request #488 from dhpollack/fix_multichoice
fixed BertForMultipleChoice model init and forward pass
2019-04-25 21:04:16 +02:00
lukovnikov
704037ad51 - updated docs for new LR API
- added some images for illustration
- updated comments in optimization
2019-04-25 15:59:39 +02:00
Thomas Wolf
d76a57b0ba
Merge pull request #506 from ailzhang/hubconf
Hubconf
2019-04-24 20:59:21 +02:00
thomwolf
80f995a141 revert BertForMultipleChoice linear classifier 2019-04-24 16:51:54 +02:00
lukovnikov
69850b4011 python 2 compat 2019-04-21 14:02:38 +02:00
lukovnikov
bb7557d3ab - removed __all__ in optimization
- removed unused plotting code
- using ABC for LRSchedule
- added some schedule object init tests
2019-04-21 13:48:33 +02:00
lukovnikov
34ccc8ebf4 Merge remote-tracking branch 'upstream/master' 2019-04-21 13:16:15 +02:00
Ailing Zhang
bfd6f6b257 fix from_pretrained positional args 2019-04-17 16:31:40 -07:00
thomwolf
23d4554ec0 is python 2 happy now 2019-04-17 14:48:34 +02:00
thomwolf
265550ec34 relax network connection requirements 2019-04-17 14:22:35 +02:00
thomwolf
fa76520240 fix file_utils on python 2 2019-04-17 13:32:22 +02:00
thomwolf
bcde2c61cb fix #497 2019-04-17 12:35:38 +02:00
Thomas Wolf
2e153930cf
Merge pull request #495 from SudoSharma/patch-2
Fix gradient overflow issue during attention masking
2019-04-17 11:10:36 +02:00
thomwolf
5afa497cbf fix GPT-2 tokenization to also work on python 3... 2019-04-17 11:04:41 +02:00
thomwolf
bc70779bf0 fixed GPT-2 tokenization on python 2 2019-04-17 10:56:15 +02:00
Abhi Sharma
9e666aaa29
Fix gradient overflow issue during attention masking
This fix is in reference to issue #382. GPT2 can now be trained in mixed precision, which I've confirmed with testing. I also tested unconditional generation on multiple seeds before and after changing 1e10 to 1e4 and there was no difference. Please let me know if there is anything else I can do to make this pull request better. Thanks for all your work!
2019-04-16 11:42:34 -07:00
thomwolf
bdaba1897c updating GPT tokenization 2019-04-16 17:44:06 +02:00
thomwolf
18a8a15f78 improving GPT2 tokenization and adding tests 2019-04-16 17:00:55 +02:00
Thomas Wolf
3d78e226e6
Merge pull request #489 from huggingface/tokenization_serialization
Better serialization for Tokenizers and Configuration classes - Also fix #466
2019-04-16 08:49:54 +02:00
Thomas Wolf
64b6ef4db0
Merge pull request #490 from huggingface/better_finetuning_GPT_GPT-2
Clean up GPT and GPT-2 losses computation
2019-04-15 16:14:50 +02:00
thomwolf
d616022455 fix openai special tokens loading 2019-04-15 16:07:45 +02:00
thomwolf
df5d9c3551 load all models on cpu 2019-04-15 15:43:01 +02:00
thomwolf
60ea6c59d2 added best practices for serialization in README and examples 2019-04-15 15:00:33 +02:00
thomwolf
b3c6ee0ac1 tokenization updates 2019-04-15 14:24:52 +02:00
thomwolf
9761aa4845 add to_json_file method to configuration classes 2019-04-15 14:12:08 +02:00
thomwolf
e8568a3b17 fixing tests 2019-04-15 12:55:38 +02:00
thomwolf
870b734bfd added tokenizers serialization tests 2019-04-15 12:03:56 +02:00
thomwolf
3e65f255dc add serialization semantics to tokenizers - fix transfo-xl tokenizer 2019-04-15 11:47:25 +02:00
David Pollack
38ba7b439b fixed BertForMultipleChoice model init and forward pass 2019-04-15 10:38:01 +02:00
thomwolf
fe2756ff41 update double head model 2019-04-15 10:04:05 +02:00
Martin Boyanov
34cf67fd6c Extend the BertForSequenceClassification docs to mention the special CLS token. 2019-04-12 21:30:28 +03:00
thomwolf
b509bf7655 updating loss computation 2019-04-12 12:12:33 +02:00
thomwolf
1d203a34c0 back to simple indexing 2019-04-11 23:51:03 +02:00
thomwolf
074c869bbe fix OpenAIGPTMultipleChoiceHead 2019-04-11 20:53:50 +02:00
thomwolf
a05fad8dce fix typo 2019-04-11 13:16:17 +02:00
thomwolf
4a82f4f856 update special token addition 2019-04-11 13:11:22 +02:00
thomwolf
991b8e65f4 Merge branch 'master' of https://github.com/huggingface/pytorch-pretrained-BERT 2019-04-11 11:43:15 +02:00
thomwolf
e99b2014cc fixes #471 2019-04-11 11:43:13 +02:00
lukovnikov
fc7693adc3 schedule fix 2019-04-03 18:16:47 +02:00
lukovnikov
20686b78fc schedule fix 2019-04-03 18:13:52 +02:00
lukovnikov
5fed5bb3d6 schedule fix 2019-04-03 17:20:29 +02:00
lukovnikov
91a073f804 schedule fix 2019-04-03 17:10:08 +02:00
lukovnikov
1758c8fc72 - updated docs for optimization 2019-04-03 16:08:34 +02:00
lukovnikov
725a56329d Merge remote-tracking branch 'upstream/master' into optim
# Conflicts:
#	pytorch_pretrained_bert/optimization.py

- updated docs for optimization
2019-04-03 16:07:50 +02:00
Thomas Wolf
94980b529f
Merge pull request #404 from CatalinVoss/fix_lm_loss
Fix Language Modeling Loss
2019-04-03 11:35:30 +02:00
Thomas Wolf
db4dccd1b5
Merge pull request #389 from lukovnikov/master
Fix cosine schedule
2019-04-03 11:21:43 +02:00
thomwolf
19666dcb3b Should fix #438 2019-04-03 11:01:01 +02:00
thomwolf
1d8c232324 Fix #436 2019-04-03 10:51:03 +02:00
Mike Arpaia
8b5c63e4de Fixes to the TensorFlow conversion tool 2019-04-01 13:17:54 -06:00
Catalin Voss
01520d5412 Remove my unhelpful comments :) 2019-03-27 10:45:28 -07:00
Ikuya Yamada
0401317b23 Remove padding_idx from position_embeddings and token_type_embeddings 2019-03-26 21:56:35 +09:00
Catalin Voss
fda2f62395 Fix test failures due to old torch issue with non-contiguous view 2019-03-24 14:37:13 -07:00
Catalin Voss
0dd796e359 Also fix loss function issue with the double head models 2019-03-24 14:35:55 -07:00
Catalin Voss
472857c47f Fix typo syntax err (sorry, c/p from my repo) 2019-03-24 14:14:49 -07:00
Catalin Voss
2e6f5ffb96 Fix GPT language model loss here as well 2019-03-24 14:14:44 -07:00
Catalin Voss
5938f31fa7 Fix c/p typo from my experiment code 2019-03-24 14:14:40 -07:00
Catalin Voss
7797d21b8d Fix GPT2 language modeling loss computation 2019-03-24 14:14:35 -07:00
lukovnikov
262a9992d7 class weights 2019-03-18 18:29:12 +01:00
lukovnikov
19cc2c084e same 2019-03-18 15:13:35 +01:00
lukovnikov
2283dcca5e import revert 2019-03-18 13:40:12 +01:00
lukovnikov
b6c1cae67b branches, optim cosine fix 2019-03-18 13:32:04 +01:00
lukovnikov
ef28b2c747 branches, optim cosine fix 2019-03-18 13:18:07 +01:00
lukovnikov
90430ae7ec Merge remote-tracking branch 'origin/master'
# Conflicts:
#	pytorch_pretrained_bert/optimization.py
2019-03-18 13:15:29 +01:00
lukovnikov
bed6408dcc branches, optim cosine fix 2019-03-18 13:09:55 +01:00
thomwolf
e5f2d9122c adding absolute imports to gpt2, openai and transfo-xl 2019-03-14 09:55:01 +01:00
lukovnikov
20e652209c relation classification: replacing entity mention with mask token 2019-03-13 16:13:37 +01:00
lukovnikov
eac039d21f changing docker 2019-03-12 13:45:12 +01:00
lukovnikov
471daf1b6c changing docker 2019-03-12 13:32:42 +01:00
lukovnikov
9024613337 changing docker 2019-03-12 13:23:58 +01:00
lukovnikov
baf66d1419 restart cosine lr schedule 2019-03-12 13:22:23 +01:00
Thomas Wolf
9b03d67b83
Merge pull request #362 from Bharat123rox/patch-1
Make the hyperlink of NVIDIA Apex clickable
2019-03-11 09:08:51 +01:00
Thomas Wolf
13aa13dbc0
Merge pull request #358 from cdjhz/patch-1
add 'padding_idx=0' for BertEmbeddings
2019-03-11 09:06:55 +01:00
Bharat Raghunathan
f91ce0b803
Make the hyperlink of NVIDIA Apex clickable 2019-03-09 20:05:39 +05:30
lukovnikov
51efde54a9 cos fix 2019-03-09 02:45:25 +01:00
lukovnikov
f113a2dfdc readme de 2019-03-09 02:29:57 +01:00
lukovnikov
90a41dbe14 BertAdam schedule objects 2019-03-09 02:23:20 +01:00
lukovnikov
88874f6cf0 BertAdam schedule objects 2019-03-08 19:08:30 +01:00
Haozhe Ji
72fa8d03a7
add 'padding_idx=0' for BertEmbeddings 2019-03-07 20:02:55 +08:00
Philipp Glock
6190e8ce4c Fix: use dropout layer 2019-03-07 10:12:45 +01:00
thomwolf
5c85fc3977 fix typo - logger info 2019-03-06 10:05:21 +01:00
Thomas Wolf
21c88a07b7
Merge pull request #341 from potatochip/patch-1
catch exception if pathlib not installed
2019-03-06 09:48:01 +01:00
Thomas Wolf
477ec4b6cc
Merge pull request #337 from CatalinVoss/patch-2
Allow tokenization of sequences > 512 for caching
2019-03-06 09:45:49 +01:00
Thomas Wolf
7b9e5a54b5
Merge pull request #327 from lukovnikov/master
Issue #324: warmup linear fixes
2019-03-06 09:44:56 +01:00
Catalin Voss
4a49c22584 Warn instead of raising in BERT and GPT-2 tokenizers as well, to allow for pre-caching of tokens 2019-03-05 12:31:45 -08:00
Aaron Mangum
0c970caa4a
catch exception if pathlib not installed 2019-03-04 14:30:19 -08:00
Catalin Voss
9775b2eb27
Allow tokenization of sequences > 512 for caching
For many applications requiring randomized data access, it's easier to cache the tokenized representations than the words. So why not turn this into a warning?
2019-03-02 16:30:21 -08:00