Commit Graph

1839 Commits

Author SHA1 Message Date
Rémi Louf
75feacf172 add general structure for Bert2Bert class 2019-10-08 16:30:58 +02:00
Rémi Louf
15a2fc88a6 add General attention classes
The modifications that I introduced in a previous commit did break
Bert's internal API. I reverted these changes and added more general
classes to handle the encoder-decoder attention case.

There may be a more elegant way to maintain backward compatibility (I am
not comfortable with the current state of the code), but I cannot see it
right now.
2019-10-08 16:30:58 +02:00
Rémi Louf
cd6a59d5c1 add a decoder layer for Bert 2019-10-08 16:30:58 +02:00
Rémi Louf
a0dcefa382 generalize BertSelfAttention to take separate query, key, value
There is currently no way to specify the query, key and value separately
in the Attention module. However, the decoder's "encoder-decoder
attention" layers take the decoder's last output as the query and the
encoder's states as key and value. We thus modify the existing code so
that query, key and value can be passed separately.

This obviously breaks some naming conventions; `BertSelfAttention` is no
longer strictly a self-attention module. The way the residual is forwarded is
now awkward, etc. We will need to do some refactoring once the decoder is
fully implemented.
2019-10-07 17:53:58 +02:00
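The commit above describes the core idea: one attention module that accepts query, key and value as separate inputs, so it can serve both self-attention (all three are the same tensor) and the decoder's encoder-decoder attention (query from the decoder, key/value from the encoder). A minimal NumPy sketch of that interface — function and variable names are illustrative, not the repository's actual API:

```python
import numpy as np

def attention(query, key, value):
    """Scaled dot-product attention with separate query/key/value inputs.

    For self-attention, pass the same tensor three times. For the
    decoder's "encoder-decoder attention", pass the decoder's last
    output as `query` and the encoder's states as `key` and `value`.
    """
    d_k = query.shape[-1]
    # (num_queries, num_keys) similarity scores, scaled by sqrt(d_k)
    scores = query @ key.T / np.sqrt(d_k)
    # softmax over the key dimension
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # weighted sum of the value vectors
    return weights @ value
```

Note that the number of queries (decoder positions) and the number of keys/values (encoder positions) may differ, which is exactly why a single self-attention signature was insufficient.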
Rémi Louf
31adbb247c add class wireframes for Bert decoder 2019-10-07 16:43:21 +02:00
Rémi Louf
dda1adad6d rename BertLayer to BertEncoderLayer 2019-10-07 16:31:46 +02:00
Rémi Louf
0053c0e052 do some (light) housekeeping
Several packages were imported but never used, and indentation and line
spacing did not follow PEP8.
2019-10-07 16:29:15 +02:00
Rémi Louf
386e86e222 raise exception when class initialized with __init__ 2019-10-07 13:00:06 +02:00
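The "raise exception when class initialized with `__init__`" commit refers to a common pattern: a model class that must be built through a factory classmethod rather than constructed directly. A hedged sketch of the pattern, with a hypothetical class name:

```python
class PreTrainedSeq2Seq:
    """Sketch only (hypothetical name): direct construction is forbidden
    so that users go through the `from_pretrained` classmethod, which is
    responsible for loading weights and configuration."""

    def __init__(self):
        raise EnvironmentError(
            "Please instantiate this class through its "
            "`from_pretrained(...)` classmethod."
        )

    @classmethod
    def from_pretrained(cls, name):
        # Bypass __init__ deliberately; real loading logic would go here.
        model = cls.__new__(cls)
        model.name = name
        return model
```

Calling `PreTrainedSeq2Seq()` raises immediately, while `PreTrainedSeq2Seq.from_pretrained("some-checkpoint")` returns a usable instance.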
Rémi Louf
4446c02b8a add wireframe for seq2seq model 2019-10-07 12:04:05 +02:00
Christopher Goh
904158ac4d Rephrase forward method to reduce ambiguity 2019-10-06 23:40:52 -04:00
Christopher Goh
0f65d8cbbe Fix some typos in README 2019-10-06 23:40:52 -04:00
LysandreJik
f3e0218fbb Correct device assignment in run_generation 2019-10-05 21:05:16 -04:00
VictorSanh
0820bb0555 unnecessary carriage return 2019-10-04 17:23:15 -04:00
VictorSanh
f5891c3821 run_squad --> run_squad_w_distillation 2019-10-04 17:23:15 -04:00
VictorSanh
764a7923ec add distillation+finetuning option in run_squad 2019-10-04 17:23:15 -04:00
Lysandre Debut
bb464289ce New model addition issue template 2019-10-04 16:41:26 -04:00
LysandreJik
7bddb45a6f Decode documentation 2019-10-04 14:27:38 -04:00
Thomas Wolf
b3cfd97946
Merge pull request #1373 from TimYagan/fix-css
Fixed critical css font-family issues
2019-10-03 19:04:02 -04:00
Lysandre Debut
81a1e12469
Merge pull request #1313 from enzoampil/master
Add option to use a 'stop token'
2019-10-03 22:43:57 +00:00
Lysandre Debut
d3f24dfad7
Merge branch 'master' into master 2019-10-03 22:43:09 +00:00
LysandreJik
ecc4f1bdfa XLM use_lang_embedding flag in run_generation 2019-10-03 17:42:16 -04:00
LysandreJik
c2c2ca0fdb Added XLM to run_generation, with prompt language selection. 2019-10-03 17:18:48 -04:00
Thomas Wolf
1569610f2d
Merge pull request #1296 from danai-antoniou/add-duplicate-tokens-error
Added ValueError for duplicates in list of added tokens
2019-10-03 17:06:17 -04:00
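The merged PR above adds a `ValueError` when the list of tokens to add contains duplicates. A minimal sketch of such a check, with a hypothetical helper name (the library's actual method signature may differ):

```python
def add_tokens(new_tokens):
    """Reject duplicate entries in a list of tokens to be added,
    raising the ValueError described in the commit above."""
    if len(set(new_tokens)) != len(new_tokens):
        raise ValueError("The list of added tokens contains duplicates.")
    return list(new_tokens)
```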
drc10723
e1b2949ae6 DistilBert Documentation Code Example fixes 2019-10-03 15:51:33 -04:00
VictorSanh
e2ae9c0b73 fix links in doc index 2019-10-03 11:42:21 -04:00
Brian Ma
7af0777910 Update run_glue.py
add DistilBert model shortcut into ALL_MODELS
2019-10-03 15:31:11 +00:00
VictorSanh
c1689ac301 fix name 2019-10-03 10:56:39 -04:00
VictorSanh
4a790c40b1 update doc for distil* 2019-10-03 10:54:02 -04:00
VictorSanh
6be46a6e64 update links to new weights 2019-10-03 10:27:11 -04:00
VictorSanh
5f07d8f11a prepare release 2019-10-03 10:27:11 -04:00
VictorSanh
35071007cb incoming release 🔥 update links to arxiv preprint 2019-10-03 10:27:11 -04:00
VictorSanh
f1f23ad171 fix bug in convert_pt_chkpt_to_tf2 2019-10-03 10:27:11 -04:00
VictorSanh
2a91f6071f update README - TODO: update link to paper 2019-10-03 10:27:11 -04:00
VictorSanh
c51e533a5f update train.py 2019-10-03 10:27:11 -04:00
VictorSanh
a76c3f9cb0 update requirements 2019-10-03 10:27:11 -04:00
VictorSanh
bb9c5ead54 update distiller 2019-10-03 10:27:11 -04:00
VictorSanh
a12ab0a8db update binarized_data 2019-10-03 10:27:11 -04:00
VictorSanh
4d6dfbd376 update extract 2019-10-03 10:27:11 -04:00
VictorSanh
23edebc079 update extract_distilbert 2019-10-03 10:27:11 -04:00
VictorSanh
cbfcfce205 update token_counts 2019-10-03 10:27:11 -04:00
VictorSanh
19e4ebbe3f grouped_batch_sampler 2019-10-03 10:27:11 -04:00
VictorSanh
594202a934 lm_seqs_dataset 2019-10-03 10:27:11 -04:00
VictorSanh
38084507c4 add distillation_configs 2019-10-03 10:27:11 -04:00
LysandreJik
ebb32261b1 fix #1401 2019-10-02 17:52:56 -04:00
Santiago Castro
63ed224b7c initialy -> initially 2019-10-02 15:04:18 +00:00
danai-antoniou
a95158518d Moved duplicate token check 2019-10-02 07:44:15 +01:00
danai-antoniou
d73957899a Merge branch 'master' of https://github.com/danai-antoniou/pytorch-transformers into add-duplicate-tokens-error 2019-10-02 07:38:50 +01:00
thomwolf
391db836ab fix #1260 - remove special logic for decoding pairs of sequence 2019-10-01 19:09:13 -04:00
Thomas Wolf
963529e29b
Merge pull request #1288 from echan00/master
Typo with LM Fine tuning script
2019-10-01 18:46:07 -04:00
thomwolf
f7978f70ec use format instead of f-strings 2019-10-01 18:45:38 -04:00