Commit Graph

19383 Commits

Author SHA1 Message Date
Lysandre
1cfd974868 Option to benchmark only one of the two libraries 2019-10-22 13:32:23 -04:00
Lysandre
777faa8ae7 Fix #1597 2019-10-22 11:26:42 -04:00
Thomas Wolf
b8c9ea0010
Merge pull request #1580 from pminervini/master
Gradient norm clipping should be done right before calling the optimiser
2019-10-22 13:59:20 +02:00
Pasquale Minervini
abd7110e21 gradient norm clipping should be done right before calling the optimiser - fixing run_glue and run_ner as well 2019-10-21 19:56:52 +01:00
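The ordering described in the two clipping commits above can be illustrated with a minimal PyTorch training step; this is a hedged sketch with placeholder names, not the actual run_glue.py or run_ner.py code.

```python
import torch

# Minimal sketch (placeholder names, not run_glue.py itself): gradients are
# clipped immediately before optimizer.step(), so clipping acts on the final
# accumulated gradients rather than on a partially built .grad buffer.
def training_step(model, batch, optimizer, scheduler, max_grad_norm=1.0):
    loss = model(**batch)[0]
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
    optimizer.step()
    scheduler.step()
    optimizer.zero_grad()
    return loss.item()
```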
thomwolf
4d456542e9 Fix citation 2019-10-21 16:34:14 +02:00
Thomas Wolf
0e64fec1ab
Merge pull request #1568 from daemon/patch-1
Fix hanging when loading pretrained models
2019-10-21 14:31:57 +02:00
Lorenzo Ampil
3a52b65795 Add special tokens to documentation for bert examples to resolve issue: #1561 2019-10-21 12:55:51 +08:00
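For context on the documentation fix above, a minimal usage sketch assuming the BERT tokenizer API of that period; the example sentence and expected tokens are illustrative.

```python
from transformers import BertTokenizer

# Hedged example: encode with add_special_tokens=True so [CLS] and [SEP]
# are inserted, as the updated BERT documentation examples describe.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
input_ids = tokenizer.encode("Hello, world!", add_special_tokens=True)
print(tokenizer.convert_ids_to_tokens(input_ids))
# Roughly: ['[CLS]', 'hello', ',', 'world', '!', '[SEP]']
```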
erenup
86a630702d Merge branch 'huggingface/master' 2019-10-21 12:06:09 +08:00
Pasquale Minervini
3775550c4b gradient norm clipping should be done right before calling the optimiser 2019-10-20 22:33:56 +01:00
Pasquale Minervini
bf2c36a920
Merge pull request #1 from huggingface/master
update
2019-10-20 23:30:45 +02:00
Ralph Tang
a2c8c8ef00
Fix hanging when loading pretrained models
- Fix hanging when loading pretrained models from the cache without having internet access. This is a widespread issue on supercomputers whose internal compute nodes are firewalled.
2019-10-19 16:19:20 -04:00
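The hang described above is the familiar pattern of an unbounded network call made before falling back to an already-cached file; a rough sketch of the fallback behaviour, with placeholder names that are not the library's internal cache code.

```python
import requests

# Rough sketch (placeholder names, not the library's cache internals): bound the
# metadata request with a timeout so a firewalled node fails fast, then serve the
# file already present in the local cache instead of hanging.
def resolve_file(url, cached_path, timeout=10):
    try:
        response = requests.head(url, allow_redirects=True, timeout=timeout)
        response.raise_for_status()
    except requests.RequestException:
        # Offline or firewalled compute node: fall back to the cached file.
        return cached_path
    # Network reachable: a real implementation would compare ETags here and
    # re-download only when the cached copy is stale.
    return cached_path
```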
LysandreJik
82f6abd98a Benchmark section added to the documentation 2019-10-18 17:27:10 -04:00
LysandreJik
7dd29ed2f1 Benchmarks example script 2019-10-18 10:53:04 -04:00
Lysandre Debut
8efc0ec91a Add Benchmarks to issue templates 2019-10-18 10:45:44 -04:00
William Tambellini
0919389d9a Add speed log to examples/run_squad.py
Add a speed estimate log (time per example)
for evaluation to examples/run_squad.py
2019-10-17 14:41:04 -07:00
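A minimal sketch of a time-per-example evaluation log in the spirit of the change above; the loop body and names are placeholders, not the actual run_squad.py code.

```python
import logging
import time

logger = logging.getLogger(__name__)

# Placeholder evaluation loop illustrating a time-per-example log
# (not the actual run_squad.py code).
def evaluate(model, eval_dataloader, num_examples):
    start_time = time.time()
    for batch in eval_dataloader:
        pass  # forward pass and prediction collection go here
    eval_time = time.time() - start_time
    logger.info("Evaluation done in %.2f secs (%.4f sec per example)",
                eval_time, eval_time / num_examples)
```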
VictorSanh
fd97761c5a soft launch distilroberta 2019-10-17 15:28:58 -04:00
leo-du
ecd15667f3 fix repetition penalty 2019-10-17 14:47:14 -04:00
thomwolf
56e2ee4ead fix model2model 2019-10-17 16:33:31 +02:00
thomwolf
8cd56e3036 fix data processing in script 2019-10-17 16:33:26 +02:00
Rémi Louf
578d23e061 add training pipeline (formatting temporary) 2019-10-17 14:02:27 +02:00
Rémi Louf
47a06d88a0 use two different tokenizers for story and summary 2019-10-17 13:04:26 +02:00
Rémi Louf
bfb9b540d4 add Model2Model to __init__ 2019-10-17 12:59:51 +02:00
Rémi Louf
c1bc709c35 correct the truncation and padding of dataset 2019-10-17 10:41:53 +02:00
Rémi Louf
87d60b6e19 reword explanation of encoder_attention_mask 2019-10-17 10:18:19 +02:00
Rémi Louf
638fe7f5a4 correct composition of padding and causal masks 2019-10-17 10:13:07 +02:00
Rémi Louf
4e0f24348f document the MLM modification + raise exception on MLM training with encoder-decoder 2019-10-17 09:41:53 +02:00
Rémi Louf
624a5644cc revert black formatting to conform with lib style 2019-10-17 09:27:56 +02:00
Rémi Louf
9b71fc9a18 tying weights is going to be a clusterfuck 2019-10-16 21:31:38 +02:00
Rémi Louf
95ec1d08be separate inputs into encoder & decoder inputs 2019-10-16 20:55:42 +02:00
Rémi Louf
e4e0ee14bd add separator between data import and train 2019-10-16 20:05:32 +02:00
Rémi Louf
a424892fab correct syntax error: dim() and not dims() 2019-10-16 18:24:32 +02:00
Rémi Louf
33c01368b1 remove Bert2Rnd test 2019-10-16 18:13:05 +02:00
Lysandre Debut
c544194611
Remove special_tokens_mask from inputs in README
Co-authored-by: Thomas Wolf @thomwolf
2019-10-16 11:05:13 -04:00
Rémi Louf
0752069617 adapt attention masks for the decoder case
The introduction of a decoder introduces 2 changes:
- We need to be able to specify a separate mask in the cross
attention to mask the positions corresponding to padding tokens in the
encoder state.
- The self-attention in the decoder needs to be causal on top of not
attending to padding tokens.
2019-10-16 16:12:22 +02:00
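The two masks described in this commit can be sketched as follows; shapes and conventions are illustrative (1 marks positions that may be attended to), not the library's exact tensors.

```python
import torch

# Illustrative sketch of the decoder-side masks described above.
def build_decoder_masks(decoder_padding_mask, encoder_padding_mask):
    # decoder_padding_mask: (batch, tgt_len), 1 for real tokens, 0 for padding
    # encoder_padding_mask: (batch, src_len), 1 for real tokens, 0 for padding
    tgt_len = decoder_padding_mask.size(1)
    causal = torch.tril(torch.ones(tgt_len, tgt_len, dtype=decoder_padding_mask.dtype))
    # Decoder self-attention: causal on top of not attending to padding tokens.
    self_attention_mask = causal.unsqueeze(0) * decoder_padding_mask.unsqueeze(1)
    # Cross-attention: mask out positions that are padding in the encoder states.
    cross_attention_mask = encoder_padding_mask.unsqueeze(1)
    return self_attention_mask, cross_attention_mask
```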
Rémi Louf
c5a94a6100 fix function that defines masks in XLM
the definition of `get_masks` would blow up with a particular combination of
arguments. It was just a matter of moving a definition outside of a
control structure.
2019-10-16 13:00:32 +02:00
Rémi Louf
488a664151 add is_decoder attribute to PretrainedConfig
We currently instantiate encoders and decoders for the seq2seq by
passing the `is_decoder` keyword argument to the `from_pretrained`
classmethod. On the other hand, the model class looks for the value
of the `is_decoder` attribute in its config.

In order for the value to propagate from the kwarg to the configuration
we simply need to define `is_decoder` as an attribute to the base
`PretrainedConfig`, with a default at `False`.
2019-10-15 21:03:32 +02:00
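The propagation described above can be reduced to a minimal stand-in; this is an illustrative sketch, not the actual PretrainedConfig implementation.

```python
# Illustrative stand-in (not the actual PretrainedConfig): the base config defines
# `is_decoder` with a default of False, so a keyword argument passed through
# `from_pretrained` lands on the instantiated configuration.
class ConfigSketch:
    def __init__(self, **kwargs):
        self.is_decoder = kwargs.pop("is_decoder", False)

encoder_config = ConfigSketch()                  # encoder: is_decoder == False
decoder_config = ConfigSketch(is_decoder=True)   # decoder: is_decoder == True
```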
Rémi Louf
4c81960b9b comment the seq2seq functions 2019-10-15 20:52:28 +02:00
Rémi Louf
6d6c326737 take path to pretrained for encoder and decoder for init 2019-10-15 16:08:27 +02:00
Rémi Louf
0d81fc853e specify in readme that both datasets are required 2019-10-15 15:26:33 +02:00
Rémi Louf
19e9964780 remove Bert2Bert from module declaration 2019-10-15 15:20:28 +02:00
Rémi Louf
1aec940587 test the full story processing 2019-10-15 15:18:07 +02:00
Rémi Louf
22e1af6859 truncation function is fully tested 2019-10-15 14:43:50 +02:00
Rémi Louf
260ac7d9a8 wip commit, switching computers 2019-10-15 12:24:35 +02:00
thomwolf
be916cb3fb Merge branch 'master' of https://github.com/huggingface/transformers 2019-10-15 10:37:13 +02:00
thomwolf
5875aaf762 install tensorboard 2019-10-15 10:36:46 +02:00
Thomas Wolf
40f14ff545
Merge pull request #1513 from slayton58/amp_fp16_einsum
Force einsum to run in fp16
2019-10-15 10:25:00 +02:00
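Under NVIDIA apex AMP, forcing a specific op such as einsum to run in fp16 is typically done by registering it before initialization; a hedged sketch of that pattern (assuming apex is installed), mirroring the intent of the PR above rather than quoting its exact code.

```python
import torch
from apex import amp

# Hedged sketch (assumes NVIDIA apex is installed): register torch.einsum as a
# half-precision function before amp.initialize so it runs in fp16 under AMP.
amp.register_half_function(torch, "einsum")
# model, optimizer = amp.initialize(model, optimizer, opt_level="O1")
```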
Thomas Wolf
e703e4dfe1
Merge pull request #1509 from julian-pani/patch-3
remove leftover usage of DUMMY_INPUTS
2019-10-15 10:24:13 +02:00
thomwolf
898ce064f8 add tests on TF2.0 & PT checkpoint => model conversion functions 2019-10-15 10:04:19 +02:00
Thomas Wolf
d147671c6c
Merge pull request #1508 from tlkh/master
Added performance enhancements (XLA, AMP) to examples
2019-10-15 09:57:18 +02:00
thomwolf
2c1d5564ad add readme information 2019-10-15 09:56:52 +02:00