Commit Graph

15053 Commits

Author SHA1 Message Date
Lysandre
164c794eb3 New SQuAD API for distillation script 2020-01-10 11:42:53 +01:00
Lysandre
801f2ac8c7 Add PRETRAINED_INIT_CONFIGURATION to DistilBERT tokenizer 2020-01-10 11:42:21 +01:00
Yohei Tamura
bfec203d4e modified: src/transformers/tokenization_utils.py 2020-01-09 12:54:28 +01:00
Julien Chaumond
f599623a99 PreTrainedTokenizerFast: hotfix _convert_encoding
cc @n1t0
2020-01-08 15:46:37 -05:00
Rishabh Manoj
f26a353057
Update pipelines.py
Modified QA pipeline to consider all features for each example before generating topk answers. 
Current pipeline only takes one SquadExample, one SquadFeature, one start logit list, one end logit list to retrieve the answer, this is not correct as one SquadExample can produce multiple SquadFeatures.
2020-01-08 21:12:34 +05:30
Lysandre
16ce15ed4b DistilBERT token type ids removed from inputs in run_squad 2020-01-08 13:18:30 +01:00
Lysandre Debut
f24232cd1b Fix error with global step in run_squad.py 2020-01-08 11:39:00 +01:00
thomwolf
1b59b57b57 ignore_index equal -100 in T5 model 2020-01-08 09:52:10 +01:00
Romain Keramitas
569da80ced Make doc regarding masked indices more clear.
Signed-off-by: Romain Keramitas <r.keramitas@gmail.com>
2020-01-07 17:37:27 +01:00
Oren Amsalem
43114b89ba spelling correction (#2434) 2020-01-07 17:25:25 +01:00
Genta Indra Winata
d6a677b14b Fix typograpical errors (#2438) 2020-01-07 17:21:23 +01:00
Lysandre Debut
27c1b656cc Fix error with global step in run_lm_finetuning.py 2020-01-07 16:16:12 +01:00
Lysandre
24df44d9c7 Black version python 3.5 2020-01-07 15:53:42 +01:00
Lysandre Debut
73be60c47b Quotes 2020-01-07 15:34:23 +01:00
Lysandre
6806f8204e fix #2410 2020-01-07 15:20:45 +01:00
Simone Primarosa
176d3b3079 Add support for Albert and XLMRoberta for the Glue example (#2403)
* Add support for Albert and XLMRoberta for the Glue example
2020-01-07 14:55:55 +01:00
Morgan Funtowicz
9261c7f771 Remove f-string device creation on PyTorch GPU pipelines.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
2020-01-07 11:46:44 +01:00
Morgan Funtowicz
91d33c798b Fix issue on pipelines where pytorch's tensors are not copied on the user-specified GPU device.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
2020-01-07 11:12:31 +01:00
Dima Galat
2926852f14 fixed formatting 2020-01-07 11:56:03 +11:00
Dima Galat
e2810edc8f removing redundant .flush 2020-01-07 11:47:25 +11:00
Julien Chaumond
c301faa92b Distributed or parallel setup 2020-01-06 18:41:08 -05:00
alberduris
81d6841b4b GPU text generation: mMoved the encoded_prompt to correct device 2020-01-06 15:11:12 +01:00
alberduris
dd4df80f0b Moved the encoded_prompts to correct device 2020-01-06 15:11:12 +01:00
Lysandre Debut
1efc208ff3 Complete DataProcessor class 2020-01-06 15:02:25 +01:00
Simone Primarosa
c45d0cf60f Improve logging message in the single sentence classification processor 2020-01-06 14:54:36 +01:00
Simone Primarosa
bf89be77b9 Improve logging message in the single sentence classification processor 2020-01-06 14:54:36 +01:00
Simone Primarosa
bf8d4bc674 Improve logging message in glue feature conversion 2020-01-06 14:54:36 +01:00
Lysandre
74755c89b9 Example snippet for BertForQuestionAnswering 2020-01-06 14:41:53 +01:00
Aymeric Augustin
0ffc8eaf53 Enforce target version for black.
This should stabilize formatting.
2020-01-05 12:52:14 -05:00
karajan1001
f01b3e6680 fix #2399 an ImportError in official example (#2400)
* fix #2399 an ImportError in official example

* style

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-01-05 12:50:20 -05:00
Julien Chaumond
78528742f1 Fix syntax + link to community page 2020-01-05 12:43:39 -05:00
Clement
12e0aa4368 Proposition to include community models in readme 2020-01-05 12:37:11 -05:00
Morgan Funtowicz
80faf22b4a Updating documentation for converting tensorflow model to reflect the new cli convert format.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
2020-01-04 13:41:18 +01:00
Dima
d0e594f9db Releasing file lock 2020-01-02 09:45:48 +11:00
Julien Chaumond
629b22adcf [run_lm_finetuning] mask_tokens: document types 2020-01-01 12:55:10 -05:00
Julien Chaumond
594ca6dead [debug] Debug Heisenbug, the old school way. 2019-12-29 10:07:21 -05:00
Julien Chaumond
0df4e62da0
[http] Tweak http user-agent (#2353) 2019-12-29 10:06:50 -05:00
Thomas Wolf
f75bf05ce6
Merge pull request #2352 from huggingface/cli_tweaks
Cli tweaks
2019-12-28 15:40:00 +01:00
Julien Chaumond
0d467fd6de Typo 2019-12-27 23:06:48 -05:00
Julien Chaumond
d8293e84f3 [cli] upload: max number of files at the same time 2019-12-27 23:02:53 -05:00
Julien Chaumond
4d6c93e923 Kill __main__ 2019-12-27 22:55:22 -05:00
Julien Chaumond
9b2badf3c9 [cli] Update doc 2019-12-27 22:54:29 -05:00
Julien Chaumond
f78ebc22ad [cli] Add ability to delete remote object 2019-12-27 22:53:49 -05:00
Anthony MOI
bfe870be65
Hotfix tokenizers version for sdist installs 2019-12-27 11:05:52 -05:00
Thomas Wolf
74ea432847
Merge pull request #2286 from adelevie/patch-2
Typo in tokenization_utils.py
2019-12-27 10:50:47 +01:00
Thomas Wolf
492bea9aa0
Merge pull request #2292 from patrickvonplaten/add_cached_past_for_language_generation
Add cached past for language generation
2019-12-27 10:33:27 +01:00
Thomas Wolf
e213900fa2
Merge pull request #2290 from patrickvonplaten/fix_typo_in_doc_for_language_generation
duplicated line for repeating_words_penalty_for_language_generation
2019-12-27 10:29:06 +01:00
Thomas Wolf
9f5f646442
Merge pull request #2211 from huggingface/fast-tokenizers
Fast tokenizers
2019-12-27 10:24:29 +01:00
Aymeric Augustin
9024b19994 Auto-format (fixes previous commit). 2019-12-27 10:13:52 +01:00
Aymeric Augustin
3233b58ad4 Quote square brackets in shell commands.
This ensures compatibility with zsh.

Fix #2316.
2019-12-27 08:50:25 +01:00