Julien Chaumond
5a6b138b00
[Umberto] model shortcuts ( #2661 )
* [Umberto] model shortcuts
cc @loretoparisi @simonefrancia
see #2485
* Ensure that tokenizers will be correctly configured
2020-01-30 21:05:53 -05:00
Julien Chaumond
7fe294bf07
Hotfix: same handling of non-existent files as for config
2020-01-30 20:05:04 -05:00
Julien Chaumond
b85c59f997
config.architectures
2020-01-30 19:26:59 -05:00
Julien Chaumond
f9bc3f5771
style tweak
2020-01-30 19:26:59 -05:00
Julien Chaumond
0b13fb822a
No need for a model_type here
cc @lysandrejik
2020-01-30 19:26:59 -05:00
Jared Nielsen
71a382319f
Correct documentation
2020-01-30 18:41:24 -05:00
Lysandre
01a14ebd8d
Add FlauBERT to automodels
2020-01-30 18:40:22 -05:00
Julien Chaumond
9fa836a73f
fill_mask helper ( #2576 )
* fill_mask helper
* [poc] FillMaskPipeline
* Revert "[poc] FillMaskPipeline"
This reverts commit 67eeea55b0.
* Revert "fill_mask helper"
This reverts commit cacc17b884.
* README: clarify that Pipelines can also do text-classification
cf. question at the AI&ML meetup last week, @mfuntowicz
* Fix test: test feature-extraction pipeline
* Test tweaks
* Slight refactor of existing pipeline (in preparation of new FillMaskPipeline)
* Extraneous doc
* More robust way of doing this
@mfuntowicz as we don't rely on the model name anymore (see AutoConfig)
* Also add RobertaConfig as a quickfix for wrong token_type_ids
* cs
* [BIG] FillMaskPipeline
2020-01-30 18:15:42 -05:00
Hang Le
b43cb09aaa
Add layerdrop
2020-01-30 12:05:01 -05:00
Lysandre
df27648bd9
Rename test_examples to test_doc_samples
2020-01-30 10:07:22 -05:00
Lysandre
93dccf527b
Pretrained models
2020-01-30 10:04:18 -05:00
Lysandre
90787fed81
Style
2020-01-30 10:04:18 -05:00
Lysandre
73306d028b
FlauBERT documentation
2020-01-30 10:04:18 -05:00
Lysandre
ce2f4227ab
Fix failing FlauBERT test
2020-01-30 10:04:18 -05:00
Hang Le
f0a4fc6cd6
Add Flaubert
2020-01-30 10:04:18 -05:00
Peter Izsak
a5381495e6
Added classifier dropout rate in ALBERT
2020-01-30 09:52:34 -05:00
Bram Vanroy
83446a88d9
Use _pad_token instead of pad_token_id
Requesting pad_token_id would cause an error message when it is None. Use private _pad_token instead.
2020-01-29 17:44:58 -05:00
BramVanroy
9fde13a3ac
Add check to verify existence of pad_token_id
In batch_encode_plus we have to ensure that the tokenizer has a pad_token_id so that, when padding, no None values are added as padding. That would happen with gpt2, openai-gpt and transfo-xl.
closes https://github.com/huggingface/transformers/issues/2640
2020-01-29 17:44:58 -05:00
Lysandre
e63a81dd25
Style
2020-01-29 16:29:20 -05:00
Lysandre
217349016a
Copy object instead of passing the reference
2020-01-29 16:15:39 -05:00
Jared Nielsen
adb8c93134
Remove lines causing a KeyError
2020-01-29 14:01:16 -05:00
Lysandre
c69b082601
Update documentation
2020-01-29 12:06:13 -05:00
Julien Plu
ca1d66734d
Apply quality and style requirements once again
2020-01-29 12:06:13 -05:00
Julien Plu
5e3c72842d
Bugfix on model name
2020-01-29 12:06:13 -05:00
Julien Plu
0731fa1587
Apply quality and style requirements
2020-01-29 12:06:13 -05:00
Julien Plu
a3998e76ae
Add TF2 CamemBERT model
2020-01-29 12:06:13 -05:00
Lysandre
b5625f131d
Style
2020-01-29 11:47:49 -05:00
Lysandre
44a5b4bbe7
Update documentation
2020-01-29 11:47:49 -05:00
Julien Plu
7fc628d98e
Apply style
2020-01-29 11:47:49 -05:00
Julien Plu
64ca855617
Add TF2 XLM-RoBERTa model
2020-01-29 11:47:49 -05:00
BramVanroy
9d87eafd11
Streamlining
- mostly stylistic streamlining
- removed 'additional context' sections. They seem to be rarely used and might cause confusion. If more details are needed, users can add them to the 'details' section.
2020-01-28 10:41:10 -05:00
BramVanroy
a3b3638f6f
phrasing
2020-01-28 10:41:10 -05:00
BramVanroy
c96ca70f25
Update ---new-benchmark.md
2020-01-28 10:41:10 -05:00
BramVanroy
7b5eda32bb
Update --new-model-addition.md
Motivate users to @-tag authors of models to increase visibility and expand the community
2020-01-28 10:41:10 -05:00
BramVanroy
c63d91dd1c
Update bug-report.md
- change references to pytorch-transformers to transformers
- link to code formatting guidelines
2020-01-28 10:41:10 -05:00
BramVanroy
b2907cd06e
Update feature-request.md
- add 'your contribution' section
- add code formatting link to 'additional context'
2020-01-28 10:41:10 -05:00
BramVanroy
2fec88ee02
Update question-help.md
Prefer that general questions are asked on Stack Overflow
2020-01-28 10:41:10 -05:00
BramVanroy
7e03d2bd7c
update migration guide
Streamline usage of pytorch-transformers and pytorch-pretrained-bert. Add a link to the migration guide in the README.
2020-01-28 10:41:10 -05:00
Lysandre
335dd5e68a
Change default save steps from 50 to 500 in all scripts
2020-01-28 09:42:11 -05:00
Lysandre
ea2600bd5f
Absolute definitive HeisenDistilBug solve
cc @julien-c @thomwolf
2020-01-27 21:58:36 -05:00
Wietse de Vries
5c3d441ee1
Fix formatting
2020-01-27 21:00:34 -05:00
Wietse de Vries
f5a236c3ca
Add Dutch pre-trained BERT model
2020-01-27 21:00:34 -05:00
Julien Chaumond
6b4c3ee234
[run_lm_finetuning] GPT2 tokenizer doesn't have a pad_token
ping @lysandrejik
2020-01-27 20:14:02 -05:00
Julien Chaumond
79815bf666
[serving] Fix typo
2020-01-27 19:58:25 -05:00
Julien Chaumond
5004d5af42
[serving] Update dependencies
2020-01-27 19:58:00 -05:00
Lysandre
9ca21c838b
Style
2020-01-27 14:49:12 -05:00
thomwolf
e0849a66ac
adding in the doc
2020-01-27 14:27:07 -05:00
thomwolf
6b081f04e6
style and quality
2020-01-27 14:27:07 -05:00
thomwolf
0e31e06a75
Add AutoModelForPreTraining
2020-01-27 14:27:07 -05:00
Julien Chaumond
ea56d305be
make style
2020-01-27 12:13:32 -05:00