transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-16 02:58:23 +06:00

Author	SHA1	Message	Date
Rémi Louf	a424892fab	correct syntax error: dim() and not dims()	2019-10-16 18:24:32 +02:00
Rémi Louf	33c01368b1	remove Bert2Rnd test	2019-10-16 18:13:05 +02:00
Lysandre Debut	c544194611	Remove `special_tokens_mask` from inputs in README Co-authored-by: Thomas Wolf @thomwolf	2019-10-16 11:05:13 -04:00
Rémi Louf	0752069617	adapt attention masks for the decoder case. The introduction of a decoder introduces 2 changes: (1) we need to be able to specify a separate mask in the cross attention to mask the positions corresponding to padding tokens in the encoder state; (2) the self-attention in the decoder needs to be causal on top of not attending to padding tokens.	2019-10-16 16:12:22 +02:00
Rémi Louf	c5a94a6100	fix function that defines masks in XLM. The definition of `get_masks` would blow up with the proper combination of arguments. It was just a matter of moving a definition outside of a control structure.	2019-10-16 13:00:32 +02:00
Rémi Louf	488a664151	add `is_decoder` attribute to `PretrainedConfig`. We currently instantiate encoders and decoders for the seq2seq by passing the `is_decoder` keyword argument to the `from_pretrained` classmethod. On the other hand, the model class looks for the value of the `is_decoder` attribute in its config. In order for the value to propagate from the kwarg to the configuration we simply need to define `is_decoder` as an attribute of the base `PretrainedConfig`, with a default of `False`.	2019-10-15 21:03:32 +02:00
Rémi Louf	4c81960b9b	comment the seq2seq functions	2019-10-15 20:52:28 +02:00
Rémi Louf	6d6c326737	take path to pretrained for encoder and decoder for init	2019-10-15 16:08:27 +02:00
Rémi Louf	0d81fc853e	specify in readme that both datasets are required	2019-10-15 15:26:33 +02:00
Rémi Louf	19e9964780	remove Bert2Bert from module declaration	2019-10-15 15:20:28 +02:00
Rémi Louf	1aec940587	test the full story processing	2019-10-15 15:18:07 +02:00
Rémi Louf	22e1af6859	truncation function is fully tested	2019-10-15 14:43:50 +02:00
Rémi Louf	260ac7d9a8	wip commit, switching computers	2019-10-15 12:24:35 +02:00
thomwolf	be916cb3fb	Merge branch 'master' of https://github.com/huggingface/transformers	2019-10-15 10:37:13 +02:00
thomwolf	5875aaf762	install tensorboard	2019-10-15 10:36:46 +02:00
Thomas Wolf	40f14ff545	Merge pull request #1513 from slayton58/amp_fp16_einsum Force einsum to run in fp16	2019-10-15 10:25:00 +02:00
Thomas Wolf	e703e4dfe1	Merge pull request #1509 from julian-pani/patch-3 remove leftover usage of DUMMY_INPUTS	2019-10-15 10:24:13 +02:00
thomwolf	898ce064f8	add tests on TF2.0 & PT checkpoint => model conversion functions	2019-10-15 10:04:19 +02:00
Thomas Wolf	d147671c6c	Merge pull request #1508 from tlkh/master Added performance enhancements (XLA, AMP) to examples	2019-10-15 09:57:18 +02:00
thomwolf	2c1d5564ad	add readme information	2019-10-15 09:56:52 +02:00
Thomas Wolf	08bd8f9f39	Merge pull request #1505 from e-budur/master Fixed the sample code in the title 'Quick tour'.	2019-10-15 09:50:36 +02:00
Thomas Wolf	8aa3b753bd	Merge pull request #1434 from bryant1410/patch-1 Remove unnecessary use of FusedLayerNorm in XLNet	2019-10-15 09:44:19 +02:00
Thomas Wolf	621e7a2529	Merge pull request #1275 from stecklin/ner-fine-tuning Implement fine-tuning BERT on CoNLL-2003 named entity recognition task	2019-10-15 09:35:24 +02:00
thomwolf	c55badcee0	Add NER finetuning details by @stefan-it in example readme	2019-10-15 09:33:52 +02:00
Julien Chaumond	788e632622	[ner] Honor args.overwrite_cache	2019-10-15 09:17:31 +02:00
thomwolf	0f9ebb0b43	add seqeval as requirement for examples	2019-10-15 09:17:31 +02:00
thomwolf	66adb71734	update to transformers	2019-10-15 09:17:31 +02:00
Marianne Stecklina	5ff9cd158a	Add option to predict on test set	2019-10-15 09:17:31 +02:00
Marianne Stecklina	7f5367e0b1	Add cli argument for configuring labels	2019-10-15 09:17:31 +02:00
Marianne Stecklina	e1d4179b64	Make file reading more robust	2019-10-15 09:17:31 +02:00
Marianne Stecklina	383ef96747	Implement fine-tuning BERT on CoNLL-2003 named entity recognition task	2019-10-15 09:17:31 +02:00
Marianne Stecklina	5adb39e757	Add option to predict on test set	2019-10-15 09:14:53 +02:00
Marianne Stecklina	99b189df6d	Add cli argument for configuring labels	2019-10-15 09:14:53 +02:00
Marianne Stecklina	3e9420add1	Make file reading more robust	2019-10-15 09:14:53 +02:00
Marianne Stecklina	cde42c4354	Implement fine-tuning BERT on CoNLL-2003 named entity recognition task	2019-10-15 09:14:53 +02:00
hlums	74c5035808	Fix token order in xlnet preprocessing.	2019-10-14 21:27:11 +00:00
Rémi Louf	fe25eefc15	add instructions to fetch the dataset	2019-10-14 20:45:39 +02:00
Rémi Louf	412793275d	delegate the padding with special tokens to the tokenizer	2019-10-14 20:45:16 +02:00
Rémi Louf	447fffb21f	process the raw CNN/Daily Mail dataset. The data provided by Li Dong et al. were already tokenized, which means that they are not compatible with all the models in the library. We thus process the raw data directly and tokenize them using the models' tokenizers.	2019-10-14 18:12:20 +02:00
Thomas Wolf	80889a0226	Merge pull request #1512 from louismartin/fix-roberta-convert Fix import error in script to convert fairseq roberta checkpoints	2019-10-14 17:40:32 +02:00
Simon Layton	4e6a55751a	Force einsum to fp16	2019-10-14 11:12:41 -04:00
Thomas Wolf	f62f992cf7	Merge pull request #1502 from jeffxtang/master the working example code to use BertForQuestionAnswering	2019-10-14 16:14:52 +02:00
Rémi Louf	67d10960ae	load and prepare CNN/Daily Mail data. We write a function to load and preprocess the CNN/Daily Mail dataset as provided by Li Dong et al. The issue is that this dataset has already been tokenized by the authors, so we actually need to find the original, plain-text dataset if we want to apply it to all models.	2019-10-14 14:11:20 +02:00
thomwolf	d9d387afce	clean up	2019-10-14 12:14:40 +02:00
thomwolf	b7141a1bc6	maxi simplification	2019-10-14 12:14:08 +02:00
thomwolf	bfbe68f035	update forward pass	2019-10-14 12:04:23 +02:00
thomwolf	0ef9bc923a	Cleaning up seq2seq [WIP]	2019-10-14 11:58:13 +02:00
Louis MARTIN	49cba6e543	Fix import error in script to convert fairseq roberta checkpoints	2019-10-14 01:38:57 -07:00
JulianPani	0993586758	remove usage of DUMMY_INPUTS. Hey @thomwolf, this change `da26bae61b (diff-8ddce309e88e8eb5b4d02228fd8881daL28-L29)` removed the constant, but one usage of that constant remains in the code.	2019-10-14 02:09:53 +03:00
Timothy Liu	376e65a674	Added automatic mixed precision and XLA options to run_tf_glue.py	2019-10-13 13:19:06 +00:00
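
Commit 0752069617 above describes two masking requirements for the decoder: a cross-attention mask that hides encoder padding positions, and a decoder self-attention mask that is causal while also ignoring padding. The following is a minimal sketch of that idea in plain Python, not the code from the commit; the function names and the list-of-lists mask layout are hypothetical, chosen only for illustration.

```python
# Illustrative sketch only: the helper names below are hypothetical and are
# NOT the library's API. A mask entry of 1 means "query i may attend to key j".
from typing import List

Mask = List[List[int]]


def causal_mask(seq_len: int) -> Mask:
    """Lower-triangular mask: position i may attend to positions 0..i only."""
    return [[1 if j <= i else 0 for j in range(seq_len)] for i in range(seq_len)]


def decoder_self_attention_mask(padding: List[int]) -> Mask:
    """Causal mask combined with a padding mask (padding[j] == 0 marks padding)."""
    n = len(padding)
    causal = causal_mask(n)
    return [[causal[i][j] * padding[j] for j in range(n)] for i in range(n)]


def cross_attention_mask(tgt_len: int, encoder_padding: List[int]) -> Mask:
    """Each decoder position may attend to every non-padding encoder position."""
    return [list(encoder_padding) for _ in range(tgt_len)]


if __name__ == "__main__":
    # Decoder input of length 4 whose last token is padding.
    for row in decoder_self_attention_mask([1, 1, 1, 0]):
        print(row)
    # Two decoder positions attending to an encoder input of length 3
    # whose last token is padding.
    for row in cross_attention_mask(2, [1, 1, 0]):
        print(row)
```

In a tensor library the same masks would typically be built with a lower-triangular helper and broadcasting; the nested lists here only make the two conditions explicit.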

15053 Commits