15053 Commits

Author SHA1 Message Date
Rémi Louf
a424892fab correct syntax error: dim() and not dims() 2019-10-16 18:24:32 +02:00
Rémi Louf
33c01368b1 remove Bert2Rnd test 2019-10-16 18:13:05 +02:00
Lysandre Debut
c544194611
Remove special_tokens_mask from inputs in README
Co-authored-by: Thomas Wolf @thomwolf
2019-10-16 11:05:13 -04:00
Rémi Louf
0752069617 adapt attention masks for the decoder case
The introduction of a decoder introduces 2 changes:
- We need to be able to specify a separate mask in the cross
attention to mask the positions corresponding to padding tokens in the
encoder state.
- The self-attention in the decoder needs to be causal on top of not
attending to padding tokens.
2019-10-16 16:12:22 +02:00
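The two mask changes described in the commit above can be sketched roughly as follows. This is a minimal illustration in plain Python lists, not the code from the commit: the function names `causal_mask`, `decoder_self_attention_mask` and `cross_attention_mask` are hypothetical, and the convention that 1 marks an attendable position and 0 a masked one is an assumption.

```python
def causal_mask(seq_len):
    """Lower-triangular mask: position i may attend to positions j <= i."""
    return [[1 if j <= i else 0 for j in range(seq_len)] for i in range(seq_len)]


def decoder_self_attention_mask(padding_mask):
    """Combine the causal mask with a padding mask (1 = real token, 0 = padding).

    A position is attendable only if it is not in the future AND not padding.
    """
    n = len(padding_mask)
    causal = causal_mask(n)
    return [[causal[i][j] * padding_mask[j] for j in range(n)] for i in range(n)]


def cross_attention_mask(decoder_len, encoder_padding_mask):
    """Separate mask for cross attention: every decoder position may attend
    to every non-padding position of the encoder state."""
    return [list(encoder_padding_mask) for _ in range(decoder_len)]


# Hypothetical inputs: a decoder sequence of 3 tokens whose last token is padding,
# and an encoder sequence of 2 tokens whose last token is padding.
self_mask = decoder_self_attention_mask([1, 1, 0])
cross_mask = cross_attention_mask(2, [1, 0])
```

The cross-attention mask has one row per decoder position and one column per encoder position, which is why it cannot simply reuse the decoder's own self-attention mask.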
Rémi Louf
c5a94a6100 fix function that defines masks in XLM
The definition of `get_masks` would blow up with the right combination of
arguments. It was just a matter of moving a definition outside of a
control structure.
2019-10-16 13:00:32 +02:00
Rémi Louf
488a664151 add is_decoder attribute to PretrainedConfig
We currently instantiate encoders and decoders for the seq2seq by
passing the `is_decoder` keyword argument to the `from_pretrained`
classmethod. On the other hand, the model class looks for the value
of the `is_decoder` attribute in its config.

In order for the value to propagate from the kwarg to the configuration
we simply need to define `is_decoder` as an attribute to the base
`PretrainedConfig`, with a default at `False`.
2019-10-15 21:03:32 +02:00
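One way to picture the propagation described in the commit above is the sketch below. It assumes that keyword arguments are copied onto the configuration only when the configuration already declares an attribute of that name. `SketchConfig` is a hypothetical stand-in, not the library's `PretrainedConfig`, and `some_unknown_option` is a made-up argument used only for illustration.

```python
class SketchConfig:
    """Hypothetical stand-in for a base configuration class."""

    def __init__(self, **kwargs):
        # Declaring the attribute with a default of False is what allows
        # a later keyword argument to override it.
        self.is_decoder = kwargs.pop("is_decoder", False)

    @classmethod
    def from_pretrained(cls, **kwargs):
        config = cls()
        # Assumption: only attributes the config already declares are updated;
        # unknown keyword arguments are ignored here.
        for key, value in kwargs.items():
            if hasattr(config, key):
                setattr(config, key, value)
        return config


encoder_config = SketchConfig.from_pretrained()
decoder_config = SketchConfig.from_pretrained(is_decoder=True, some_unknown_option=1)
```

Under that assumption, a model class that reads `config.is_decoder` sees `False` for the encoder and `True` for the decoder, while the undeclared argument never reaches the configuration.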
Rémi Louf
4c81960b9b comment the seq2seq functions 2019-10-15 20:52:28 +02:00
Rémi Louf
6d6c326737 take path to pretrained for encoder and decoder for init 2019-10-15 16:08:27 +02:00
Rémi Louf
0d81fc853e specify in readme that both datasets are required 2019-10-15 15:26:33 +02:00
Rémi Louf
19e9964780 remove Bert2Bert from module declaration 2019-10-15 15:20:28 +02:00
Rémi Louf
1aec940587 test the full story processing 2019-10-15 15:18:07 +02:00
Rémi Louf
22e1af6859 truncation function is fully tested 2019-10-15 14:43:50 +02:00
Rémi Louf
260ac7d9a8 wip commit, switching computers 2019-10-15 12:24:35 +02:00
thomwolf
be916cb3fb Merge branch 'master' of https://github.com/huggingface/transformers 2019-10-15 10:37:13 +02:00
thomwolf
5875aaf762 install tensorboard 2019-10-15 10:36:46 +02:00
Thomas Wolf
40f14ff545
Merge pull request #1513 from slayton58/amp_fp16_einsum
Force einsum to run in fp16
2019-10-15 10:25:00 +02:00
Thomas Wolf
e703e4dfe1
Merge pull request #1509 from julian-pani/patch-3
remove leftover usage of DUMMY_INPUTS
2019-10-15 10:24:13 +02:00
thomwolf
898ce064f8 add tests on TF2.0 & PT checkpoint => model conversion functions 2019-10-15 10:04:19 +02:00
Thomas Wolf
d147671c6c
Merge pull request #1508 from tlkh/master
Added performance enhancements (XLA, AMP) to examples
2019-10-15 09:57:18 +02:00
thomwolf
2c1d5564ad add readme information 2019-10-15 09:56:52 +02:00
Thomas Wolf
08bd8f9f39
Merge pull request #1505 from e-budur/master
Fixed the sample code in the title 'Quick tour'.
2019-10-15 09:50:36 +02:00
Thomas Wolf
8aa3b753bd
Merge pull request #1434 from bryant1410/patch-1
Remove unnecessary use of FusedLayerNorm in XLNet
2019-10-15 09:44:19 +02:00
Thomas Wolf
621e7a2529
Merge pull request #1275 from stecklin/ner-fine-tuning
Implement fine-tuning BERT on CoNLL-2003 named entity recognition task
2019-10-15 09:35:24 +02:00
thomwolf
c55badcee0 Add NER finetuning details by @stefan-it in example readme 2019-10-15 09:33:52 +02:00
Julien Chaumond
788e632622 [ner] Honor args.overwrite_cache 2019-10-15 09:17:31 +02:00
thomwolf
0f9ebb0b43 add seqeval as requirement for examples 2019-10-15 09:17:31 +02:00
thomwolf
66adb71734 update to transformers 2019-10-15 09:17:31 +02:00
Marianne Stecklina
5ff9cd158a Add option to predict on test set 2019-10-15 09:17:31 +02:00
Marianne Stecklina
7f5367e0b1 Add cli argument for configuring labels 2019-10-15 09:17:31 +02:00
Marianne Stecklina
e1d4179b64 Make file reading more robust 2019-10-15 09:17:31 +02:00
Marianne Stecklina
383ef96747 Implement fine-tuning BERT on CoNLL-2003 named entity recognition task 2019-10-15 09:17:31 +02:00
Marianne Stecklina
5adb39e757 Add option to predict on test set 2019-10-15 09:14:53 +02:00
Marianne Stecklina
99b189df6d Add cli argument for configuring labels 2019-10-15 09:14:53 +02:00
Marianne Stecklina
3e9420add1 Make file reading more robust 2019-10-15 09:14:53 +02:00
Marianne Stecklina
cde42c4354 Implement fine-tuning BERT on CoNLL-2003 named entity recognition task 2019-10-15 09:14:53 +02:00
hlums
74c5035808 Fix token order in xlnet preprocessing. 2019-10-14 21:27:11 +00:00
Rémi Louf
fe25eefc15 add instructions to fetch the dataset 2019-10-14 20:45:39 +02:00
Rémi Louf
412793275d delegate the padding with special tokens to the tokenizer 2019-10-14 20:45:16 +02:00
Rémi Louf
447fffb21f process the raw CNN/Daily Mail dataset
The data provided by Li Dong et al. were already tokenized, which means
that they are not compatible with all the models in the library. We
thus process the raw data directly and tokenize them using the models'
tokenizers.
2019-10-14 18:12:20 +02:00
Thomas Wolf
80889a0226
Merge pull request #1512 from louismartin/fix-roberta-convert
Fix import error in script to convert fairseq roberta checkpoints
2019-10-14 17:40:32 +02:00
Simon Layton
4e6a55751a Force einsum to fp16 2019-10-14 11:12:41 -04:00
Thomas Wolf
f62f992cf7
Merge pull request #1502 from jeffxtang/master
the working example code to use BertForQuestionAnswering
2019-10-14 16:14:52 +02:00
Rémi Louf
67d10960ae load and prepare CNN/Daily Mail data
We write a function to load and preprocess the CNN/Daily Mail dataset as
provided by Li Dong et al. The issue is that this dataset has already
been tokenized by the authors, so we actually need to find the original,
plain-text dataset if we want to apply it to all models.
2019-10-14 14:11:20 +02:00
thomwolf
d9d387afce clean up 2019-10-14 12:14:40 +02:00
thomwolf
b7141a1bc6 maxi simplification 2019-10-14 12:14:08 +02:00
thomwolf
bfbe68f035 update forward pass 2019-10-14 12:04:23 +02:00
thomwolf
0ef9bc923a Cleaning up seq2seq [WIP] 2019-10-14 11:58:13 +02:00
Louis MARTIN
49cba6e543 Fix import error in script to convert fairseq roberta checkpoints 2019-10-14 01:38:57 -07:00
JulianPani
0993586758
remove usage of DUMMY_INPUTS
Hey @thomwolf  
This change da26bae61b (diff-8ddce309e88e8eb5b4d02228fd8881daL28-L29) removed the constant, but one usage of that constant remains in the code.
2019-10-14 02:09:53 +03:00
Timothy Liu
376e65a674 Added automatic mixed precision and XLA options to run_tf_glue.py 2019-10-13 13:19:06 +00:00