Lysandre
164c794eb3
New SQuAD API for distillation script
2020-01-10 11:42:53 +01:00
Lysandre
801f2ac8c7
Add PRETRAINED_INIT_CONFIGURATION to DistilBERT tokenizer
2020-01-10 11:42:21 +01:00
Yohei Tamura
bfec203d4e
modified: src/transformers/tokenization_utils.py
2020-01-09 12:54:28 +01:00
Julien Chaumond
f599623a99
PreTrainedTokenizerFast: hotfix _convert_encoding
cc @n1t0
2020-01-08 15:46:37 -05:00
Rishabh Manoj
f26a353057
Update pipelines.py
Modified the QA pipeline to consider all features for each example before generating top-k answers.
The current pipeline takes only one SquadExample, one SquadFeature, one start logit list, and one end logit list to retrieve the answer. This is incorrect, because one SquadExample can produce multiple SquadFeatures.
2020-01-08 21:12:34 +05:30
Lysandre
16ce15ed4b
DistilBERT token type ids removed from inputs in run_squad
2020-01-08 13:18:30 +01:00
Lysandre Debut
f24232cd1b
Fix error with global step in run_squad.py
2020-01-08 11:39:00 +01:00
thomwolf
1b59b57b57
ignore_index equal -100 in T5 model
2020-01-08 09:52:10 +01:00
Romain Keramitas
569da80ced
Make doc regarding masked indices more clear.
Signed-off-by: Romain Keramitas <r.keramitas@gmail.com>
2020-01-07 17:37:27 +01:00
Oren Amsalem
43114b89ba
spelling correction (#2434)
2020-01-07 17:25:25 +01:00
Genta Indra Winata
d6a677b14b
Fix typographical errors (#2438)
2020-01-07 17:21:23 +01:00
Lysandre Debut
27c1b656cc
Fix error with global step in run_lm_finetuning.py
2020-01-07 16:16:12 +01:00
Lysandre
24df44d9c7
Black version python 3.5
2020-01-07 15:53:42 +01:00
Lysandre Debut
73be60c47b
Quotes
2020-01-07 15:34:23 +01:00
Lysandre
6806f8204e
fix #2410
2020-01-07 15:20:45 +01:00
Simone Primarosa
176d3b3079
Add support for Albert and XLMRoberta for the Glue example (#2403)
* Add support for Albert and XLMRoberta for the Glue example
2020-01-07 14:55:55 +01:00
Morgan Funtowicz
9261c7f771
Remove f-string device creation on PyTorch GPU pipelines.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
2020-01-07 11:46:44 +01:00
Morgan Funtowicz
91d33c798b
Fix issue on pipelines where pytorch's tensors are not copied on the user-specified GPU device.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
2020-01-07 11:12:31 +01:00
Dima Galat
2926852f14
fixed formatting
2020-01-07 11:56:03 +11:00
Dima Galat
e2810edc8f
removing redundant .flush
2020-01-07 11:47:25 +11:00
Julien Chaumond
c301faa92b
Distributed or parallel setup
2020-01-06 18:41:08 -05:00
alberduris
81d6841b4b
GPU text generation: Moved the encoded_prompt to correct device
2020-01-06 15:11:12 +01:00
alberduris
dd4df80f0b
Moved the encoded_prompts to correct device
2020-01-06 15:11:12 +01:00
Lysandre Debut
1efc208ff3
Complete DataProcessor class
2020-01-06 15:02:25 +01:00
Simone Primarosa
c45d0cf60f
Improve logging message in the single sentence classification processor
2020-01-06 14:54:36 +01:00
Simone Primarosa
bf89be77b9
Improve logging message in the single sentence classification processor
2020-01-06 14:54:36 +01:00
Simone Primarosa
bf8d4bc674
Improve logging message in glue feature conversion
2020-01-06 14:54:36 +01:00
Lysandre
74755c89b9
Example snippet for BertForQuestionAnswering
2020-01-06 14:41:53 +01:00
Aymeric Augustin
0ffc8eaf53
Enforce target version for black.
This should stabilize formatting.
2020-01-05 12:52:14 -05:00
karajan1001
f01b3e6680
fix #2399 an ImportError in official example (#2400)
* fix #2399 an ImportError in official example
* style
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-01-05 12:50:20 -05:00
Julien Chaumond
78528742f1
Fix syntax + link to community page
2020-01-05 12:43:39 -05:00
Clement
12e0aa4368
Proposition to include community models in readme
2020-01-05 12:37:11 -05:00
Morgan Funtowicz
80faf22b4a
Updating documentation for converting TensorFlow models to reflect the new CLI convert format.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
2020-01-04 13:41:18 +01:00
Dima
d0e594f9db
Releasing file lock
2020-01-02 09:45:48 +11:00
Julien Chaumond
629b22adcf
[run_lm_finetuning] mask_tokens: document types
2020-01-01 12:55:10 -05:00
Julien Chaumond
594ca6dead
[debug] Debug Heisenbug, the old school way.
2019-12-29 10:07:21 -05:00
Julien Chaumond
0df4e62da0
[http] Tweak http user-agent (#2353)
2019-12-29 10:06:50 -05:00
Thomas Wolf
f75bf05ce6
Merge pull request #2352 from huggingface/cli_tweaks
Cli tweaks
2019-12-28 15:40:00 +01:00
Julien Chaumond
0d467fd6de
Typo
2019-12-27 23:06:48 -05:00
Julien Chaumond
d8293e84f3
[cli] upload: max number of files at the same time
2019-12-27 23:02:53 -05:00
Julien Chaumond
4d6c93e923
Kill __main__
2019-12-27 22:55:22 -05:00
Julien Chaumond
9b2badf3c9
[cli] Update doc
2019-12-27 22:54:29 -05:00
Julien Chaumond
f78ebc22ad
[cli] Add ability to delete remote object
2019-12-27 22:53:49 -05:00
Anthony MOI
bfe870be65
Hotfix tokenizers version for sdist installs
2019-12-27 11:05:52 -05:00
Thomas Wolf
74ea432847
Merge pull request #2286 from adelevie/patch-2
Typo in tokenization_utils.py
2019-12-27 10:50:47 +01:00
Thomas Wolf
492bea9aa0
Merge pull request #2292 from patrickvonplaten/add_cached_past_for_language_generation
Add cached past for language generation
2019-12-27 10:33:27 +01:00
Thomas Wolf
e213900fa2
Merge pull request #2290 from patrickvonplaten/fix_typo_in_doc_for_language_generation
duplicated line for repeating_words_penalty_for_language_generation
2019-12-27 10:29:06 +01:00
Thomas Wolf
9f5f646442
Merge pull request #2211 from huggingface/fast-tokenizers
Fast tokenizers
2019-12-27 10:24:29 +01:00
Aymeric Augustin
9024b19994
Auto-format (fixes previous commit).
2019-12-27 10:13:52 +01:00
Aymeric Augustin
3233b58ad4
Quote square brackets in shell commands.
This ensures compatibility with zsh.
Fix #2316.
2019-12-27 08:50:25 +01:00