Commit Graph

5759 Commits

Author SHA1 Message Date
Sai Saketh Aluru
1121ce9f98
Model cards for Hate-speech-CNERG models (#5236)
* Add dehatebert-mono-arabic readme card

* Update dehatebert-mono-arabic model card

* model cards for Hate-speech-CNERG models
2020-06-24 11:41:08 -04:00
Lysandre Debut
cf10d4cfdd
Cleaning TensorFlow models (#5229)
* Cleaning TensorFlow models

Update all classes

style

* Don't average loss
2020-06-24 11:37:20 -04:00
Sylvain Gugger
609e0c583f
Fix links (#5248) 2020-06-24 11:35:55 -04:00
Ali Modarressi
c9163a8d5a
delay decay schedule until the end of warmup (#4940) 2020-06-24 11:18:29 -04:00
Sylvain Gugger
f216b60671
Fix deploy doc (#5246)
* Try with the same command

* Try like this
2020-06-24 10:59:06 -04:00
Sylvain Gugger
49f6e7a3c6
Add some prints to debug (#5244) 2020-06-24 10:37:01 -04:00
Patrick von Platen
c2a26ec8a6
[Use cache] Align logic of use_cache with output_attentions and output_hidden_states (#5194)
* fix use cache

* add bart use cache

* fix bart

* finish bart
2020-06-24 16:09:17 +02:00
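The commit above aligns `use_cache` with how `output_attentions` and `output_hidden_states` are resolved. A minimal self-contained sketch of that config-fallback pattern (names here are illustrative, not the library's actual code):

```python
# Illustrative sketch: an explicit argument wins, otherwise the value
# stored on the model config is used. DummyConfig stands in for a real
# model config; resolve_flags is a hypothetical helper.

class DummyConfig:
    def __init__(self, use_cache=True, output_attentions=False):
        self.use_cache = use_cache
        self.output_attentions = output_attentions

def resolve_flags(config, use_cache=None, output_attentions=None):
    # Same pattern for every flag: explicit argument > config default.
    use_cache = use_cache if use_cache is not None else config.use_cache
    output_attentions = (
        output_attentions
        if output_attentions is not None
        else config.output_attentions
    )
    return use_cache, output_attentions

config = DummyConfig(use_cache=True, output_attentions=False)
print(resolve_flags(config))                   # prints (True, False)
print(resolve_flags(config, use_cache=False))  # prints (False, False)
```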
Sylvain Gugger
64c393ee74
Don't recreate old docs (#5243) 2020-06-24 09:59:07 -04:00
Patrick von Platen
b29683736a
fix print in benchmark (#5242) 2020-06-24 15:58:49 +02:00
Patrick von Platen
9fe09cec76
[Benchmark] Extend Benchmark to all model type extensions (#5241)
* add benchmark for all kinds of models

* improved import

* delete bogus files

* make style
2020-06-24 15:11:42 +02:00
Sylvain Gugger
7c41057d50
Add hugs (#5225) 2020-06-24 07:56:14 -04:00
Sylvain Gugger
5e85b324ec
Use the script in utils (#5224) 2020-06-24 07:55:58 -04:00
flozi00
5e31a98ab7
Create README.md (#5108)
* Create README.md

* Update model_cards/a-ware/roberta-large-squad-classification/README.md

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-06-24 04:45:51 -04:00
Adriano Diniz
033124e5f8
Update README.md (#5199)
Fix/add information in README.md
2020-06-24 04:42:46 -04:00
ahotrod
7ca6627ec3
Create README.md (#5217)
electra_large_discriminator_squad2_512 Question Answering LM
2020-06-24 04:40:50 -04:00
Kevin Canwen Xu
54e9ce785d
Fix PABEE division by zero error (#5233)
* Fix PABEE division by zero error

* patience=0 by default
2020-06-24 16:10:36 +08:00
Sylvain Gugger
9022ef021a
Only put tensors on a device (#5223)
* Only put tensors on a device

* Type hint and unpack list comprehension
2020-06-23 17:30:17 -04:00
Sylvain Gugger
173528e368
Add version control menu (#5222)
* Add version control menu

* Constify things

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Apply suggestions from code review

Co-authored-by: Julien Chaumond <chaumond@gmail.com>

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-06-23 17:05:12 -04:00
Sam Shleifer
76e5af4cfd
[pl_examples] revert deletion of optimizer_step (#5227) 2020-06-23 16:40:45 -04:00
Julien Chaumond
c01480bba3 [file_utils] Type user-agent 2020-06-23 18:31:13 +02:00
Sam Shleifer
58918c76f4
[bart] add config.extra_pos_embeddings to facilitate reuse (#5190) 2020-06-23 11:35:42 -04:00
Thomas Wolf
b28b537131
More clear error message in the use-case of #5169 (#5184) 2020-06-23 13:37:29 +02:00
Thomas Wolf
11fdde0271
Tokenizers API developments (#5103)
* Add return lengths

* make pad a bit more flexible so it can be used as collate_fn

* check all kwargs sent to encoding method are known

* fixing kwargs in encodings

* New AddedToken class in python

This class lets you specify specific tokenization behaviors for some special tokens. It is used in particular for GPT2 and Roberta, to control how whitespace is stripped around special tokens.

* style and quality

* switched to the huggingface tokenizers library for AddedTokens

* up to tokenizer 0.8.0-rc3 - update API to use AddedToken state

* style and quality

* do not raise an error on additional or unused kwargs for tokenize() but only a warning

* transfo-xl pretrained model requires torch

* Update src/transformers/tokenization_utils.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-06-23 13:36:57 +02:00
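The AddedToken class described in the commit above controls how whitespace is stripped around special tokens. A toy self-contained sketch of that idea (the real class lives in the huggingface tokenizers library; this version only illustrates the lstrip/rstrip flags):

```python
# ToyAddedToken and strip_around are hypothetical illustrations, not the
# tokenizers library's actual API.

class ToyAddedToken:
    def __init__(self, content, lstrip=False, rstrip=False):
        self.content = content
        self.lstrip = lstrip   # strip whitespace to the left of the token
        self.rstrip = rstrip   # strip whitespace to the right of the token

def strip_around(text, token):
    # Apply the token's stripping rules at its first occurrence in `text`.
    idx = text.find(token.content)
    if idx == -1:
        return text
    left = text[:idx].rstrip() if token.lstrip else text[:idx]
    right = text[idx + len(token.content):]
    right = right.lstrip() if token.rstrip else right
    return left + token.content + right

tok = ToyAddedToken("<mask>", lstrip=True)
print(strip_around("Hello   <mask> world", tok))  # prints Hello<mask> world
```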
Patrick von Platen
1ae132a07d
[Reformer] Axial Pos Emb Improve mem usage reformer (#5209)
* improve mem handling

* improve mem for pos ax encodings
2020-06-23 10:49:18 +02:00
Sam Shleifer
5144104070
[fix] remove unused import (#5206) 2020-06-22 23:39:04 -04:00
Sam Shleifer
0d158e38c9
[fix] mobilebert had wrong path, causing slow test failure (#5205) 2020-06-22 23:31:36 -04:00
Sam Shleifer
f5c2a122e3
Upgrade examples to pl=0.8.1 (#5146) 2020-06-22 20:40:10 -04:00
flozi00
06b60c8b05
[Modelcard] bart-squadv2 (#5011)
* [Modelcard] bart-squadv2

* Update README.md

* Update README.md
2020-06-22 18:40:19 -04:00
flozi00
35e0687256
Create README.md (#5013) 2020-06-22 18:40:00 -04:00
Fran Martinez
22d2c8ea2f
Create README.md for finetuned BERT model (#5009)
* Create README.md

* changes in model usage section

* minor changes in output visualization

* minor errata in readme
2020-06-22 18:39:29 -04:00
furunkel
2589505693
Add model card for StackOBERTflow-comments-small (#5008)
* Create README.md

* Update README.md
2020-06-22 18:39:22 -04:00
bogdankostic
d8c26ed139
Specify dataset used for crossvalidation (#5175) 2020-06-22 18:26:12 -04:00
Adriano Diniz
a34fb91d54
Create README.md (#5149) 2020-06-22 18:00:53 -04:00
Adriano Diniz
ffabcf5249
Create README.md (#5160) 2020-06-22 17:59:54 -04:00
Adriano Diniz
3363a19b12
Create README.md (#5152)
* Create README.md

* Apply suggestions from code review

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-06-22 17:59:33 -04:00
Michaël Benesty
0cca61925c
Add link to new community notebook (optimization) (#5195)
* Add link to new community notebook (optimization)

related to https://github.com/huggingface/transformers/issues/4842#event-3469184635

This notebook is about benchmarking model training with/without dynamic padding optimization. 
https://github.com/ELS-RD/transformers-notebook 

Using dynamic padding on MNLI provides a **4.7 times training time reduction**, with max pad length set to 512. The effect is strong because few examples in this dataset are >> 400 tokens. In real life it will depend on the dataset, but it always brings an improvement and, after more than 20 experiments listed in this [article](https://towardsdatascience.com/divide-hugging-face-transformers-training-time-by-2-or-more-21bf7129db9q-21bf7129db9e?source=friends_link&sk=10a45a0ace94b3255643d81b6475f409), it seems not to hurt performance.

Following advice from @patrickvonplaten I do the PR myself :-)

* Update notebooks/README.md

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-06-22 23:47:33 +02:00
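The dynamic padding optimization the notebook above benchmarks can be sketched in a few lines: instead of padding every example to a fixed max length (e.g. 512), pad each batch only to its own longest sequence. This is an illustrative stand-in, not the notebook's actual code:

```python
# dynamic_pad is a hypothetical collate-style helper: it pads a list of
# token-id lists to the length of the longest sequence in the batch,
# so short batches waste no compute on padding positions.

def dynamic_pad(batch, pad_id=0):
    """Pad a list of token-id lists to the batch's own max length."""
    max_len = max(len(seq) for seq in batch)
    return [seq + [pad_id] * (max_len - len(seq)) for seq in batch]

batch = [[1, 2, 3], [4, 5], [6]]
print(dynamic_pad(batch))  # prints [[1, 2, 3], [4, 5, 0], [6, 0, 0]]
```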
Lee Haau-Sing
1c5cd8e5f5
Add README.md (nyu-mll) (#5174)
* nyu-mll: roberta on smaller datasets

* Update README.md

* Update README.md

Co-authored-by: Alex Warstadt <alexwarstadt@gmail.com>
2020-06-22 17:24:27 -04:00
Sylvain Gugger
c439752482
Switch master/stable doc and add older releases (#5193) 2020-06-22 16:38:53 -04:00
Sylvain Gugger
417e492f1e
Quick tour (#5145)
* Quicktour part 1

* Update

* All done

* Typos

Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>

* Address comments in quick tour

* Update docs/source/quicktour.rst

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update from feedback

Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-06-22 16:08:09 -04:00
Thomas Wolf
75e1eed8d1
Cleaner warning when loading pretrained models (#4557)
* Cleaner warning when loading pretrained models

This makes the logging messages more explicit when using the various `from_pretrained` methods. It also emits these messages as `logging.warning` because they are a common source of silent mistakes.

* Update src/transformers/modeling_utils.py

Co-authored-by: Julien Chaumond <chaumond@gmail.com>

* Update src/transformers/modeling_utils.py

Co-authored-by: Julien Chaumond <chaumond@gmail.com>

* style and quality

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-06-22 21:58:47 +02:00
Lysandre Debut
4e741efa92
Have documentation fail on warning (#5189)
* Have documentation fail on warning

* Force ci failure

* Revert "Force ci failure"

This reverts commit f0a4666ec2.
2020-06-22 15:49:50 -04:00
Sylvain Gugger
1262495a91
Add TF auto model to the docs + fix sphinx warnings (#5187) 2020-06-22 14:43:52 -04:00
Adriano Diniz
88429c57bc
Create README.md (#5165) 2020-06-22 13:49:14 -04:00
Manuel Romero
76ee9c8bc9
Create README.md (#5107)
* Create README.md

@julien-c please check that the dataset meta tag is right

* Fix typo

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-06-22 13:47:30 -04:00
Manuel Romero
bf493d5569
Model card for t5-base-finetuned-emotion (recognition) (#5179) 2020-06-22 13:45:45 -04:00
Patrick von Platen
e9ef21175e
improve doc (#5185) 2020-06-22 19:00:11 +02:00
Thomas Wolf
ebc36108dc
[tokenizers] Fix #5081 and improve backward compatibility (#5125)
* fix #5081 and improve backward compatibility (slightly)

* add nlp to setup.cfg - style and quality

* align default to previous default

* remove test that doesn't generalize
2020-06-22 17:25:43 +02:00
Malte
d2a7c86dc3
Check if text is set to avoid IndexError (#4209)
Fix for https://github.com/huggingface/transformers/issues/3809
2020-06-22 11:09:05 -04:00
Iz Beltagy
90f4b24520
Add support for gradient checkpointing in BERT (#4659)
* add support for gradient checkpointing in BERT

* fix unit tests

* isort

* black

* workaround for `torch.utils.checkpoint.checkpoint` not accepting bool

* Revert "workaround for `torch.utils.checkpoint.checkpoint` not accepting bool"

This reverts commit 5eb68bb804.

* workaround for `torch.utils.checkpoint.checkpoint` not accepting bool

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-06-22 10:47:14 -04:00
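The "workaround for `torch.utils.checkpoint.checkpoint` not accepting bool" mentioned above is usually done by capturing the non-tensor flag in a closure so only tensors cross the checkpoint boundary. A heavily simplified, self-contained sketch of that pattern (`fake_checkpoint`, `layer_forward`, and `create_custom_forward` are illustrative names; this is not the commit's actual code):

```python
def fake_checkpoint(fn, *tensor_args):
    # Stand-in for torch.utils.checkpoint.checkpoint: a real checkpoint
    # re-runs fn during backward; here we just call it once.
    return fn(*tensor_args)

def layer_forward(hidden, output_attentions):
    # Toy layer: returns extra "attention" output only when asked.
    return (hidden * 2, "attn") if output_attentions else (hidden * 2,)

def create_custom_forward(output_attentions):
    # Capture the bool in a closure so the checkpoint API only ever
    # sees tensor-like positional arguments.
    def custom_forward(hidden):
        return layer_forward(hidden, output_attentions)
    return custom_forward

out = fake_checkpoint(create_custom_forward(True), 3)
print(out)  # prints (6, 'attn')
```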
Joseph Liu
f4e1f02210
Output hidden states (#4978)
* Configure all models to use output_hidden_states as argument passed to forward()

* Pass all tests

* Remove cast_bool_to_primitive in TF Flaubert model

* correct tf xlnet

* add pytorch test

* add tf test

* Fix broken tests

* Refactor output_hidden_states for mobilebert

* Reset and remerge to master

Co-authored-by: Joseph Liu <joseph.liu@coinflex.com>
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
2020-06-22 10:10:45 -04:00