Suraj Patil
08939cfdf7
[s2strainer] fix eval dataset loading ( #7477 )
2020-09-30 12:39:13 -04:00
Sylvain Gugger
a97a73e0ee
Small QOL improvements to TrainingArguments ( #7475 )
...
* Small QOL improvements to TrainingArguments
* With the self.
2020-09-30 12:12:03 -04:00
Sylvain Gugger
dc7d2daa4c
Alphabetize model lists ( #7478 )
2020-09-30 10:43:58 -04:00
Sylvain Gugger
fdccf82e28
Remove config assumption in Trainer ( #7464 )
...
* Remove config assumption in Trainer
* Initialize for eval
2020-09-30 09:03:25 -04:00
François REMY
cc4eff8087
Make transformers install check positive ( #7473 )
...
When transformers is correctly installed, you should get a positive message ^_^
2020-09-30 07:44:40 -04:00
Pengcheng He
7a0cf0ec93
Add DeBERTa model ( #5929 )
...
* Add DeBERTa model
* Remove dependency of deberta
* Address comments
* Patch DeBERTa
Documentation
Style
* Add final tests
* Style
* Enable tests + nitpicks
* position IDs
* BERT -> DeBERTa
* Quality
* Style
* Tokenization
* Last updates.
* @patrickvonplaten's comments
* Not everything can be a copy
* Apply most of @sgugger's review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Last reviews
* DeBERTa -> Deberta
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-09-30 07:07:30 -04:00
Lysandre Debut
44a93c981f
Number of GPUs for multi-gpu ( #7472 )
2020-09-30 06:53:20 -04:00
Lysandre Debut
886ef35ce6
Fix LXMERT with DataParallel ( #7471 )
2020-09-30 06:41:24 -04:00
Lysandre
35e94c68df
Number of GPUs
2020-09-30 12:29:26 +02:00
Lysandre Debut
056723ad1d
Multi-GPU setup ( #7453 )
2020-09-30 05:53:34 -04:00
Sylvain Gugger
4ba248748f
Get a better error when check_copies fails ( #7457 )
...
* Get a better error when check_copies fails
* Fix tests
2020-09-30 10:05:14 +02:00
Sam Shleifer
bef0175168
remove codecov PR comments ( #7400 )
2020-09-29 15:16:43 -04:00
Sylvain Gugger
a1c2ef7bd0
Add documentation for v3.3.1
2020-09-29 14:31:43 -04:00
Sylvain Gugger
1ba08dc221
Release: v3.3.1
2020-09-29 14:17:34 -04:00
Sylvain Gugger
8546dc55c2
Fix Trainer tests in a multiGPU env ( #7458 )
2020-09-29 14:06:41 -04:00
Sylvain Gugger
d0fd7154c5
Catch import datasets common errors ( #7456 )
2020-09-29 13:42:09 -04:00
Sylvain Gugger
f1220c5fe2
Add a code of conduct ( #7433 )
2020-09-29 13:38:47 -04:00
Teven
9e9a1fb8c7
Adding gradient checkpointing to GPT2 ( #7446 )
...
* GPT2 gradient checkpointing
* find_unused_parameters removed if checkpointing
* Update src/transformers/configuration_gpt2.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Added a test for generation with checkpointing
* Update src/transformers/configuration_gpt2.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-09-29 12:26:26 -04:00
Sylvain Gugger
52e8392b7e
Add automatic best model loading to Trainer ( #7431 )
...
* Add automatic best model loading to Trainer
* Some small fixes
* Formatting
2020-09-29 10:41:18 -04:00
Sylvain Gugger
1fc4de69ed
Document new features of make fixup ( #7434 )
2020-09-29 03:56:57 -04:00
GmailB
205bf0b7ea
Update README.md ( #7444 )
...
Hi, just corrected the example code, added 2 links, and fixed some typos
2020-09-29 03:18:01 -04:00
Sam Shleifer
74d8d69bd4
[s2s] consistent output format across eval scripts ( #7435 )
2020-09-28 23:20:03 -04:00
Typicasoft
671b278e25
Create README.md ( #7436 )
...
* Create README.md
MagBERT-NER : Added widget (Text)
* Rename model_cards/README.md to model_cards/TypicaAI/magbert-ner/README.md
2020-09-28 18:25:25 -04:00
Manuel Romero
a1a8ffa512
Update README.md ( #7429 )
...
Add links to models fine-tuned on a downstream task
2020-09-28 13:40:09 -04:00
Stas Bekman
f62f2ffdcc
[makefile] 10x speed up checking/fixing ( #7403 )
...
* [makefile] check/fix only modified since branching files
* fix phonies
* parametrize dirs
* have only one source for dirs to check
* look ma, no autoformatters here
2020-09-28 10:45:42 -04:00
Lysandre
16c213820e
Update docs to version v3.3.0
2020-09-28 16:32:00 +02:00
Lysandre
0613f05226
Release: v3.3.0
2020-09-28 16:24:43 +02:00
Sylvain Gugger
ca3fc36de3
Reorganize documentation navbar ( #7423 )
...
* Reorganize documentation navbar
* Update css to have clear sections
2020-09-28 16:22:58 +02:00
Lysandre Debut
7f4115c099
Pull request template ( #7392 )
...
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
2020-09-28 09:51:49 -04:00
Sylvain Gugger
0611eab5e3
Document RAG again ( #7377 )
...
Do not merge before Monday
2020-09-28 08:31:46 -04:00
Sylvain Gugger
7563d5a3cf
Catch PyTorch warning when saving/loading scheduler ( #7401 )
2020-09-28 08:20:10 -04:00
Boris Dayma
1749ca317e
docs: fix model sharing file names ( #5855 )
...
* docs: fix model sharing file names
* Update docs/source/model_sharing.rst
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
* docs(model_sharing.rst): fix new line
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-09-28 08:17:30 -04:00
Patrick von Platen
8279471506
correct RAG model cards ( #7420 )
2020-09-28 11:08:39 +02:00
Marcin Zabłocki
4083a55ab0
Flos fix ( #7384 )
2020-09-28 04:09:26 -04:00
Ola Piktus
ae3e84f3ba
[RAG] Clean Rag readme in examples ( #7413 )
...
* Improve README + consolidation script
* Reformat README
* Reformat README
Co-authored-by: Your Name <you@example.com>
2020-09-28 10:06:39 +02:00
Sam Shleifer
748425d47d
[T5] allow config.decoder_layers to control decoder size ( #7409 )
...
* Working asymmetrical T5
* rename decoder_layers -> num_decoder_layers
* Fix docstring
* Allow creation of asymmetric t5 students
2020-09-28 03:08:04 -04:00
Sam Shleifer
7296fea1d6
[s2s] rougeLSum expects \n between sentences ( #7410 )
...
Co-authored-by: Swetha Mandava <smandava@nvidia.com>
2020-09-27 16:27:19 -04:00
Suraj Patil
eab5f59682
[s2s] add create student script ( #7290 )
...
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-09-27 15:10:46 -04:00
Patrick von Platen
e50a931c11
[Longformer, Bert, Roberta, ...] Fix multi gpu training ( #7272 )
...
* fix multi-gpu
* fix longformer
* force to delete unnecessary layers
* fix notifications
* fix warning
* fix roberta
* fix tests
* remove hasattr
* fix tests
* fix roberta
* merge and clean authorized keys
2020-09-25 20:33:21 +02:00
Patrick von Platen
2c8ecdf8a8
fix rag retriever save pretrained ( #7399 )
2020-09-25 19:47:12 +02:00
Patrick von Platen
1a14687e6f
Update README.md
2020-09-25 19:43:48 +02:00
Patrick von Platen
3327c2b0f6
Update README.md
2020-09-25 19:43:36 +02:00
Ola Piktus
fe326bd5cf
Remove dependency on examples/seq2seq from rag ( #7395 )
...
Co-authored-by: Your Name <you@example.com>
2020-09-25 18:20:49 +02:00
Sylvain Gugger
ad39271ae8
Fix FP16 and attention masks in FunnelTransformer ( #7374 )
...
* Fix #7371
* Fix training
* Fix test values
* Apply the fix to TF as well
2020-09-25 12:20:39 -04:00
Patrick von Platen
4e5b036bdd
Update README.md
2020-09-25 18:16:46 +02:00
Patrick von Platen
55eccfbb49
Update README.md
2020-09-25 18:16:44 +02:00
Sylvain Gugger
e2e77f02c2
Fix BartModel output documentation ( #7390 )
2020-09-25 11:48:13 -04:00
Sylvain Gugger
bbb07830ff
Speedup check_copies script ( #7394 )
2020-09-25 11:47:22 -04:00
Stas Bekman
8859c4f841
[code quality] new make target that combines style and quality targets ( #7310 )
...
* [code quality] merge style and quality targets
Any reason why we don't run `flake8` in `make style`? I find myself needing to run `make style` and `make quality` all the time, but I need the latter just for the last 2 checks. Since we have no control over the source code why bother with separating checking and fixing - let's just have one target that fixes and then performs the remaining checks, as we know the first two have been done already.
This PR suggests merging the 2 targets into one efficient target.
I will edit the docs if this change resonates with the team.
* move checks into style, re-use target
* better name
* add fixup target
* document new target
2020-09-25 11:37:40 -04:00
Sam Shleifer
38a1b03f4d
Remove unhelpful bart warning ( #7391 )
2020-09-25 11:01:07 -04:00