Clement
2513fe0d02
added subtitle for recent contributors in readme ( #5130 )
2020-06-29 09:05:08 -04:00
Manuel Romero
30245c0c60
Fix table format for test results ( #5357 )
2020-06-29 09:02:33 -04:00
Manuel Romero
c34010551a
Create model card ( #5356 )
2020-06-29 09:01:55 -04:00
Ali Safaya
01aa0b8527
Create README.md ( #5353 )
2020-06-29 08:58:30 -04:00
chrisliu
96907367f1
arxiv-ai-gpt2 model card ( #5337 )
...
* Add model card and generation script for model arxiv_ai_gpt2
* Update arxiv-ai-gpt2 model card
Remove unnecessary lines
* Delete code in model cards
2020-06-29 08:53:20 -04:00
Ali Safaya
3cdf8b7ec2
Create model card for asafaya/bert-mini-arabic ( #5352 )
...
* Create README.md
* Update model_cards/asafaya/bert-mini-arabic/README.md
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-06-29 08:41:41 -04:00
Ali Safaya
9db1f41604
Create README.md ( #5351 )
2020-06-29 08:36:00 -04:00
Julien Chaumond
c950fef545
[docs] Small tweaks to #5323
2020-06-29 14:24:33 +02:00
Sylvain Gugger
4544f906e2
model cards for roberta and bert-multilingual ( #5324 )
...
* More model cards (cc @myleott)
* Apply suggestions from code review
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-06-29 05:06:05 -04:00
sgugger
92671532e7
More model cards
2020-06-29 10:58:54 +02:00
Pradhy729
9209d36f93
Added a model card README.md for my pretrained model. ( #5325 )
...
* Create README.md
* Removed unnecessary link from README.md
* Update README.md
2020-06-29 16:29:14 +08:00
Julien Plu
7cb52f53ef
Fix LR decay in TF Trainer ( #5269 )
...
* Recover old PR
* Apply style
* Trigger CI
2020-06-29 14:38:32 +08:00
krevas
321c05abab
Model cards for finance-koelectra models ( #5313 )
...
* Add finance-koelectra readme card
* Add finance-koelectra readme card
* Add finance-koelectra readme card
* Add finance-koelectra readme card
2020-06-29 13:47:44 +08:00
Sam Shleifer
28a690a80e
[mBART] skip broken forward pass test, stronger integration test ( #5327 )
2020-06-28 15:08:28 -04:00
Sam Shleifer
45e26125de
save_pretrained: mkdir(exist_ok=True) ( #5258 )
...
* all save_pretrained methods mkdir if not os.path.exists
2020-06-28 14:53:47 -04:00
Suraj Patil
12dfbd4f7a
[examples] fix example links ( #5344 )
2020-06-28 12:54:54 -04:00
Patrick von Platen
98109464c1
clean reformer reverse sort ( #5343 )
2020-06-28 14:32:25 +02:00
Sylvain Gugger
1af58c0706
New model sharing tutorial ( #5323 )
2020-06-27 11:10:02 -04:00
Sylvain Gugger
efae6645e2
Fix xxx_length behavior when using XLNet in pipeline ( #5319 )
2020-06-27 11:09:51 -04:00
Sam Shleifer
393b8dc09a
examples/seq2seq/run_eval.py fixes and docs ( #5322 )
2020-06-26 19:20:43 -04:00
Sam Shleifer
5543b30aa6
[pl_examples] default warmup steps=0 ( #5316 )
2020-06-26 15:03:41 -04:00
Sam Shleifer
bf0d12c220
CircleCI stores cleaner output at test_outputs.txt ( #5291 )
2020-06-26 13:59:31 -04:00
Thomas Wolf
601d4d699c
[tokenizers] Updates data processors, docstring, examples and model cards to the new API ( #5308 )
...
* remove references to old API in docstring - update data processors
* style
* fix tests - better type checking error messages
* better type checking
* include awesome fix by @LysandreJik for #5310
* updated doc and examples
2020-06-26 19:48:14 +02:00
Kevin Canwen Xu
fd405e9a93
Add BART-base modeling and configuration ( #5315 )
2020-06-27 00:53:10 +08:00
Sam Shleifer
798dbff6a7
[pipelines] Change summarization default to distilbart-cnn-12-6 ( #5289 )
2020-06-26 11:43:23 -04:00
Patrick von Platen
834b6884c5
Add benchmark notebook ( #5312 )
...
* add notebook
* Created with Colaboratory
* move notebook to correct folder
* correct link
* correct filename
* correct filename
* better name
2020-06-26 17:38:13 +02:00
Patrick von Platen
08c9607c3d
[Generation] fix docs for decoder_input_ids ( #5306 )
...
* fix docs
* Update src/transformers/modeling_utils.py
* Update src/transformers/modeling_tf_utils.py
* Update src/transformers/modeling_tf_utils.py
* Update src/transformers/modeling_utils.py
* Update src/transformers/modeling_tf_utils.py
* Update src/transformers/modeling_utils.py
2020-06-26 16:58:11 +02:00
Patrick von Platen
79a82cc06a
[Benchmarks] improve Example Plotter ( #5245 )
...
* improve plotting
* better labels
* fix time plot
2020-06-26 15:00:14 +02:00
Sylvain Gugger
88d7f96e33
Gpt2 model card ( #5283 )
...
* Bert base model card
* Add metadata
* Adapt examples
* GPT2 model card
* Remove the BERT model card
* Change language code
2020-06-26 08:08:31 -04:00
Sylvain Gugger
fc5bce9e60
Bert base model card ( #5276 )
...
* Bert base model card
* Add metadata
* Adapt examples
* Comment on text generation
* Update model_cards/bert-base-uncased-README.md
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-06-26 08:01:19 -04:00
Funtowicz Morgan
135791e8ef
Add pad_to_multiple_of on tokenizers (reimport) ( #5054 )
...
* Add new parameter `pad_to_multiple_of` on tokenizers.
* unittest for pad_to_multiple_of
* Add .name when logging enum.
* Fix missing .items() on dict in tests.
* Add special check + warning if the tokenizer doesn't have proper pad_token.
* Use the correct logger format specifier.
* Ensure tokenizers with no pad_token do not modify the underlying padding strategy.
* Skip test if tokenizer doesn't have pad_token
* Fix RobertaTokenizer on empty input
* Format.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* fix and updating to simpler API
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
2020-06-26 11:55:57 +02:00
Lysandre Debut
7cc15bdd96
Closes #5218
2020-06-25 18:19:21 -04:00
Joe Davison
2ffef0d0c7
Training & fine-tuning quickstart ( #5034 )
...
* add initial fine-tuning guide
* split code blocks to smaller segments
* fix up trainer section of fine-tune doc
* a few last typos
* Update usage -> task summary link
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-06-25 15:11:11 -06:00
Lysandre Debut
364a5ae1f0
Refactor Code samples; Test code samples ( #5036 )
...
* Refactor code samples
* Test docstrings
* Style
* Tokenization examples
* Run rest of tests
* First step to testing source docs
* Style and BART comment
* Test the remainder of the code samples
* Style
* let to const
* Formatting fixes
* Ready for merge
* Fix fixture + Style
* Fix last tests
* Update docs/source/quicktour.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Addressing @sgugger's comments + Fix MobileBERT in TF
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-06-25 16:46:00 -04:00
Thomas Wolf
315f464b0a
[tokenizers] Several small improvements and bug fixes ( #5287 )
...
* avoid recursion in id checks for fast tokenizers
* better typings and fix #5232
* align slow and fast tokenizers behaviors for Roberta and GPT2
* style and quality
* fix tests - improve typings
2020-06-25 22:17:14 +02:00
Sylvain Gugger
24f46ea3f3
Remove links for all docs ( #5280 )
2020-06-25 11:45:05 -04:00
Thomas Wolf
27cf1d97f0
[Tokenization] Fix #5181 - make #5155 more explicit - move back the default logging level in tests to WARNING ( #5252 )
...
* fix-5181
Padding to max sequence length while truncation to another length was wrong on slow tokenizers
* clean up and fix #5155
* fix XLM test
* Fix tests for Transfo-XL
* logging only above WARNING in tests
* switch slow tokenizers tests in @slow
* fix Marian truncation tokenization test
* style and quality
* make the test a lot faster by limiting the sequence length used in tests
2020-06-25 17:24:28 +02:00
Sam Shleifer
e008d520bb
[examples/seq2seq] more README improvements ( #5274 )
2020-06-25 10:13:01 -04:00
Julien Chaumond
6a495cae00
[model_cards] Example of how to specify inputs for the widget
2020-06-25 15:58:25 +02:00
Anthony MOI
0e1fce3c01
Fix convert_graph_to_onnx ( #5230 )
2020-06-25 08:17:02 +02:00
Moumeneb1
5543efd5cc
Create README.md ( #5259 )
2020-06-25 01:56:07 -04:00
Sam Shleifer
40457bcebb
examples/seq2seq supports translation ( #5202 )
2020-06-24 23:58:11 -04:00
Sylvain Gugger
d12ceb48ba
Tokenization tutorial ( #5257 )
...
* All done
* Link to the tutorial
* Typo fixes
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
* Add mention of the return_xxx args
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
2020-06-24 18:43:20 -04:00
Thomas Wolf
7ac9110711
Add more tests on tokenizers serialization - fix bugs ( #5056 )
...
* update tests for fast tokenizers + fix small bug in saving/loading
* better tests on serialization
* fixing serialization
* comment cleanup
2020-06-24 21:53:08 +02:00
Sylvain Gugger
0148c262e7
Fix first test ( #5255 )
2020-06-24 15:16:04 -04:00
Sylvain Gugger
70c1e1d2d5
Use master _static ( #5253 )
...
* Use _static from master everywhere
* Copy to existing too
2020-06-24 15:06:14 -04:00
Victor SANH
4965aee064
[HANS] Fix label_list for RoBERTa/BART (class flipping) ( #5196 )
...
* fix weirdness in roberta/bart for mnli trained checkpoints
* black compliance
* isort code check
2020-06-24 14:38:15 -04:00
Julien Chaumond
fc24a93e64
[HfApi] Add support for pipeline_tag
2020-06-24 16:54:00 +00:00
Setu Shah
0a3d0e02c5
Replace labels with -100 to skip loss calc ( #4718 )
2020-06-24 12:14:50 -04:00
Sylvain Gugger
6894b486d0
Fix version controller links (for realsies) ( #5251 )
2020-06-24 12:13:43 -04:00