thomwolf
d9e60f4f0d
Merge branch 'master' into pr/1383
2019-10-09 17:25:08 +02:00
Lysandre Debut
e84470ef81
Merge pull request #1384 from huggingface/encoding-qol
...
Quality of life enhancements in encoding + patch MLM masking
2019-10-09 11:18:24 -04:00
thomwolf
07d055f849
higher tolerance
2019-10-09 17:10:04 +02:00
thomwolf
48b438ff2a
doc and conversion
2019-10-09 17:06:30 +02:00
jinoobaek-qz
69629c4f0f
Improve naming and only do regex when necessary
2019-10-09 08:48:40 -04:00
jinoobaek-qz
bf34a252b8
Golden path
2019-10-09 08:48:40 -04:00
jinoobaek-qz
528d3f327b
Improve readability and make fewer assumptions about checkpoint format
2019-10-09 08:48:40 -04:00
jinoobaek-qz
56301bd9e8
Extract method
2019-10-09 08:48:40 -04:00
jinoobaek-qz
d6c5469712
Delete older checkpoint after saving new checkpoint
2019-10-09 08:48:40 -04:00
jinoobaek-qz
54a31f50fb
Add save_total_limit
2019-10-09 08:48:40 -04:00
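The two commits above add `save_total_limit`: after each new checkpoint is saved, older ones are deleted so that only the most recent N remain. The exact implementation lives in the training script; the sketch below is a hypothetical standalone version (the `rotate_checkpoints` name and `checkpoint-<step>` directory convention are assumptions, not the repository's exact API).

```python
import os
import re
import shutil

def rotate_checkpoints(output_dir, save_total_limit, prefix="checkpoint"):
    """Keep only the `save_total_limit` most recent checkpoint directories."""
    if save_total_limit is None or save_total_limit <= 0:
        return
    # Collect directories named like "checkpoint-<step>" and sort by step.
    checkpoints = []
    for name in os.listdir(output_dir):
        match = re.match(rf"{prefix}-(\d+)$", name)
        path = os.path.join(output_dir, name)
        if match and os.path.isdir(path):
            checkpoints.append((int(match.group(1)), path))
    checkpoints.sort()
    # Delete the oldest checkpoints beyond the limit.
    for _, path in checkpoints[: max(0, len(checkpoints) - save_total_limit)]:
        shutil.rmtree(path)
```

Sorting by the numeric step (rather than lexicographically) matters: `checkpoint-9` would otherwise sort after `checkpoint-100`.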
thomwolf
c19b8e4ae0
fixing CTRL tests and OpenAI GPT tests
2019-10-09 13:51:05 +02:00
thomwolf
6dce6dda1b
fixing TF 2.0 model - adding a stricter test of PyTorch/TF equivalence
2019-10-09 11:57:55 +02:00
thomwolf
c56d921dda
adding TF 2.0 model
2019-10-09 11:07:43 +02:00
thomwolf
1c5079952f
simpler distilbert mask - fix tf tests
2019-10-09 04:26:20 +02:00
Thomas Wolf
58b302caf3
Merge pull request #1398 from dveselov/patch-1
...
Fixed typo in docs README
2019-10-09 03:52:42 +02:00
Thomas Wolf
439fac723a
Merge pull request #1409 from brian41005/master
...
Evaluation result.txt path changing #1286
2019-10-09 03:14:34 +02:00
thomwolf
23b7138ab4
fix #1378 and #1453
2019-10-09 01:54:44 +02:00
Bilal Khan
5ce8d29abe
Change tensorboard imports to use built-in tensorboard if available
2019-10-08 16:29:43 -05:00
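The commit above prefers the TensorBoard writer bundled with PyTorch (available since torch 1.1) and falls back to the external `tensorboardX` package. A minimal sketch of that fallback pattern, wrapped in a helper so it degrades gracefully when neither package is installed (the `get_summary_writer_cls` helper is hypothetical, not the commit's exact code):

```python
import importlib

def get_summary_writer_cls():
    """Return a TensorBoard SummaryWriter class, preferring the one
    bundled with PyTorch and falling back to tensorboardX."""
    for module_name in ("torch.utils.tensorboard", "tensorboardX"):
        try:
            return importlib.import_module(module_name).SummaryWriter
        except ImportError:
            continue
    return None  # neither package is installed; TensorBoard logging is disabled
```

Both classes expose the same `SummaryWriter` interface, which is what makes the drop-in substitution safe.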
Julien Chaumond
d688af19e5
Update link to swift-coreml-transformers
...
cc @lysandrejik
2019-10-08 16:37:52 -04:00
thomwolf
45dc04f33d
tf model [WIP]
2019-10-08 17:37:17 +02:00
Rémi Louf
770b15b58c
rename class in __init__
2019-10-08 17:32:28 +02:00
thomwolf
248314772f
fix tokenization
2019-10-08 17:19:28 +02:00
thomwolf
03c2c762a6
update tokenizer
2019-10-08 17:12:03 +02:00
thomwolf
3edfa1d6aa
update model to use past
2019-10-08 17:11:58 +02:00
Rémi Louf
f4d41fe33e
Merge pull request #1448 from huggingface/contributing
...
add contribution guidelines
2019-10-08 16:55:34 +02:00
Rémi Louf
61ed889005
remove old seq2seq file
2019-10-08 16:30:58 +02:00
Rémi Louf
8abfee9ec3
rename Bert2Bert -> Bert2Rnd
2019-10-08 16:30:58 +02:00
Rémi Louf
82628b0fc9
add a placeholder test
2019-10-08 16:30:58 +02:00
Rémi Louf
0700983090
Add BertDecoderModel and Bert2Bert classes
...
I am not sure what happens when the class is initialized with
pretrained weights.
2019-10-08 16:30:58 +02:00
Rémi Louf
75feacf172
add general structure for Bert2Bert class
2019-10-08 16:30:58 +02:00
Rémi Louf
15a2fc88a6
add General attention classes
...
The modifications I introduced in a previous commit broke Bert's
internal API. I reverted those changes and added more general classes
to handle the encoder-decoder attention case.
There may be a more elegant way to preserve backward compatibility (I
am not comfortable with the current state of the code), but I cannot
see it right now.
2019-10-08 16:30:58 +02:00
Rémi Louf
cd6a59d5c1
add a decoder layer for Bert
2019-10-08 16:30:58 +02:00
Rémi Louf
45de313a9e
add bullet point on modifying an existing PR
2019-10-08 11:54:10 +02:00
Rémi Louf
ade05b6cef
add code contribution
2019-10-07 23:20:25 +02:00
Rémi Louf
e9c09052a4
add issues and requests guidelines
2019-10-07 22:30:55 +02:00
LysandreJik
8fcc6507ce
Multilingual
2019-10-07 15:02:42 -04:00
Rémi Louf
6e3e1c959e
Merge pull request #1447 from huggingface/dev-requirements
...
Provide requirements.txt for development dependencies
2019-10-07 18:49:26 +02:00
VictorSanh
7ce83b4931
update weights for distilgpt2
2019-10-07 12:30:27 -04:00
VictorSanh
9f81f1cba8
fix convert pt_to_tf2 for custom weights
2019-10-07 12:30:19 -04:00
Rémi Louf
7afd00a661
freeze dev requirements
2019-10-07 17:58:13 +02:00
Rémi Louf
a0dcefa382
generalize BertSelfAttention to take separate query, key, value
...
There is currently no way to specify the query, key and value separately
in the Attention module. However, the decoder's "encoder-decoder
attention" layers take the decoder's last output as the query and the
encoder's states as the key and value. We therefore modify the existing
code so that query, key and value can be passed separately.
This obviously strains some naming conventions; `BertSelfAttention` is
no longer strictly a self-attention module, the way the residual is
forwarded is now awkward, etc. We will need to do some refactoring once
the decoder is fully implemented.
2019-10-07 17:53:58 +02:00
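The commit body above describes the key idea: once query, key and value are separate arguments, the same attention code serves both self-attention (all three from the same sequence) and encoder-decoder attention (query from the decoder, key/value from the encoder). A framework-free sketch of scaled dot-product attention with separate inputs, illustrating the mechanism only (the actual change is inside `BertSelfAttention` and uses torch tensors):

```python
import math

def attention(query, key, value):
    """Scaled dot-product attention over lists of vectors, with query,
    key and value passed separately. In encoder-decoder attention the
    queries come from the decoder, the keys/values from the encoder."""
    d_k = len(key[0])
    output = []
    for q in query:
        # Score each key against this query, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in key]
        # Numerically stable softmax over the scores.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of the value vectors.
        output.append([sum(w * v[j] for w, v in zip(weights, value))
                       for j in range(len(value[0]))])
    return output
```

Calling `attention(x, x, x)` recovers plain self-attention, which is why a single generalized module can replace the old one.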
Rémi Louf
31adbb247c
add class wireframes for Bert decoder
2019-10-07 16:43:21 +02:00
Rémi Louf
dda1adad6d
rename BertLayer to BertEncoderLayer
2019-10-07 16:31:46 +02:00
Rémi Louf
0053c0e052
do some (light) housekeeping
...
Several packages were imported but never used, and the indentation and
line spacing did not follow PEP 8.
2019-10-07 16:29:15 +02:00
thomwolf
bd5363cc83
update CTRL configuration
2019-10-07 15:37:30 +02:00
thomwolf
dc89441167
update CTRL pytorch model
2019-10-07 15:37:25 +02:00
thomwolf
320b7a7e01
fix #1416
2019-10-07 14:26:59 +02:00
Rémi Louf
386e86e222
raise an exception when the class is instantiated directly via __init__
2019-10-07 13:00:06 +02:00
Rémi Louf
4446c02b8a
add wireframe for seq2seq model
2019-10-07 12:04:05 +02:00
Thomas Wolf
1615360c71
Merge pull request #1438 from SeanBE/master
...
fix pytorch-transformers migration description in README
2019-10-07 05:02:23 -04:00