Sylvain Gugger
ca3fc36de3
Reorganize documentation navbar ( #7423 )
...
* Reorganize documentation navbar
* Update css to have clear sections
2020-09-28 16:22:58 +02:00
Lysandre Debut
7f4115c099
Pull request template ( #7392 )
...
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
2020-09-28 09:51:49 -04:00
Sylvain Gugger
0611eab5e3
Document RAG again ( #7377 )
...
Do not merge before Monday
2020-09-28 08:31:46 -04:00
Sylvain Gugger
7563d5a3cf
Catch PyTorch warning when saving/loading scheduler ( #7401 )
2020-09-28 08:20:10 -04:00
Boris Dayma
1749ca317e
docs: fix model sharing file names ( #5855 )
...
* docs: fix model sharing file names
* Update docs/source/model_sharing.rst
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
* docs(model_sharing.rst): fix new line
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-09-28 08:17:30 -04:00
Patrick von Platen
8279471506
correct RAG model cards ( #7420 )
2020-09-28 11:08:39 +02:00
Marcin Zabłocki
4083a55ab0
Flos fix ( #7384 )
2020-09-28 04:09:26 -04:00
Ola Piktus
ae3e84f3ba
[RAG] Clean Rag readme in examples ( #7413 )
...
* Improve README + consolidation script
* Reformat README
* Reformat README
Co-authored-by: Your Name <you@example.com>
2020-09-28 10:06:39 +02:00
Sam Shleifer
748425d47d
[T5] allow config.decoder_layers to control decoder size ( #7409 )
...
* Working asymmetrical T5
* rename decoder_layers -> num_decoder_layers
* Fix docstring
* Allow creation of asymmetric t5 students
2020-09-28 03:08:04 -04:00
Sam Shleifer
7296fea1d6
[s2s] rougeLSum expects \n between sentences ( #7410 )
...
Co-authored-by: Swetha Mandava <smandava@nvidia.com>
2020-09-27 16:27:19 -04:00
Suraj Patil
eab5f59682
[s2s] add create student script ( #7290 )
...
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-09-27 15:10:46 -04:00
Patrick von Platen
e50a931c11
[Longformer, Bert, Roberta, ...] Fix multi gpu training ( #7272 )
...
* fix multi-gpu
* fix longformer
* force to delete unnecessary layers
* fix notifications
* fix warning
* fix roberta
* fix tests
* remove hasattr
* fix tests
* fix roberta
* merge and clean authorized keys
2020-09-25 20:33:21 +02:00
Patrick von Platen
2c8ecdf8a8
fix rag retriever save pretrained ( #7399 )
2020-09-25 19:47:12 +02:00
Patrick von Platen
1a14687e6f
Update README.md
2020-09-25 19:43:48 +02:00
Patrick von Platen
3327c2b0f6
Update README.md
2020-09-25 19:43:36 +02:00
Ola Piktus
fe326bd5cf
Remove dependency on examples/seq2seq from rag ( #7395 )
...
Co-authored-by: Your Name <you@example.com>
2020-09-25 18:20:49 +02:00
Sylvain Gugger
ad39271ae8
Fix FP16 and attention masks in FunnelTransformer ( #7374 )
...
* Fix #7371
* Fix training
* Fix test values
* Apply the fix to TF as well
2020-09-25 12:20:39 -04:00
Patrick von Platen
4e5b036bdd
Update README.md
2020-09-25 18:16:46 +02:00
Patrick von Platen
55eccfbb49
Update README.md
2020-09-25 18:16:44 +02:00
Sylvain Gugger
e2e77f02c2
Fix BartModel output documentation ( #7390 )
2020-09-25 11:48:13 -04:00
Sylvain Gugger
bbb07830ff
Speedup check_copies script ( #7394 )
2020-09-25 11:47:22 -04:00
Stas Bekman
8859c4f841
[code quality] new make target that combines style and quality targets ( #7310 )
...
* [code quality] merge style and quality targets
Any reason why we don't run `flake8` in `make style`? I find myself needing to run `make style` and `make quality` all the time, but I need the latter just for the last 2 checks. Since we have no control over the source code why bother with separating checking and fixing - let's just have one target that fixes and then performs the remaining checks, as we know the first two have been done already.
This PR proposes merging the 2 targets into one efficient target.
I will edit the docs if this change resonates with the team.
* move checks into style, re-use target
* better name
* add fixup target
* document new target
2020-09-25 11:37:40 -04:00
Sam Shleifer
38a1b03f4d
Remove unhelpful bart warning ( #7391 )
2020-09-25 11:01:07 -04:00
Patrick von Platen
5ff0d6d7d0
Update README.md
2020-09-25 16:58:29 +02:00
Quentin Lhoest
cf1c88e092
[RAG] Fix retrieval offset in RAG's HfIndex and better integration tests ( #7372 )
...
* Fix retrieval offset in RAG's HfIndex
* update slow tests
* style
* fix new test
* style
* add better tests
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-09-25 16:12:46 +02:00
Patrick von Platen
571c7a11c1
[Rag] Fix wrong usage of `num_beams` and `bos_token_id` in Rag Sequence generation ( #7386 )
...
* fix_rag_sequence
* add second bug fix
2020-09-25 14:35:49 +02:00
Suraj Patil
415071b4c2
doc changes ( #7385 )
2020-09-25 08:00:36 -04:00
Patrick von Platen
2dd652d757
[RAG] Add missing doc and attention_mask to rag ( #7382 )
...
* add docs
* add missing docs and attention_mask in fine-tune
2020-09-25 11:23:55 +02:00
Lysandre Debut
7cdd9da5bf
Check config type using `type` instead of `isinstance` ( #7363 )
...
* Check config type instead of instance
Bad merge
* Remove for loops
* Style
2020-09-25 05:09:09 -04:00
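A rough illustration of why an exact type check can matter here, using hypothetical classes and mappings rather than the library's actual code: when one config class subclasses another, `isinstance` also matches the parent, so a loop over a mapping can return the parent's entry first. A lookup keyed on `type(config)` matches only the exact class and needs no loop.

```python
# Hypothetical config classes; names are assumptions made for illustration.
class BaseConfig:
    pass


class DerivedConfig(BaseConfig):
    pass


# Hypothetical mapping from config class to a model name.
# The parent class is listed first on purpose.
CONFIG_TO_MODEL = {
    BaseConfig: "base-model",
    DerivedConfig: "derived-model",
}


def lookup_with_isinstance(config):
    # Loops in insertion order; a DerivedConfig instance also passes the
    # isinstance check against BaseConfig, so the parent entry wins.
    for config_class, model_name in CONFIG_TO_MODEL.items():
        if isinstance(config, config_class):
            return model_name
    raise ValueError(f"Unrecognized config: {config!r}")


def lookup_with_type(config):
    # Exact-class lookup: no loop, and subclasses never match their parent.
    return CONFIG_TO_MODEL[type(config)]


isinstance_result = lookup_with_isinstance(DerivedConfig())
type_result = lookup_with_type(DerivedConfig())
```

In this sketch the `isinstance` loop resolves a `DerivedConfig` to the parent's entry, while the `type` lookup resolves it to its own entry.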
Sam Shleifer
3c6bf8998f
modeling_bart: 3 small cleanups that don't change outputs ( #7381 )
...
* Mbart passing
* boom boom
* cleaner assert
* add assert
* Fix tests
2020-09-25 04:24:14 -04:00
Suraj Patil
9e68d075a4
Seq2SeqTrainer ( #6769 )
...
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-09-24 18:46:58 -04:00
Sam Shleifer
d9d0f1140b
[s2s] distributed eval allows num_return_sequences > 1 ( #7254 )
2020-09-24 17:30:09 -04:00
Patrick von Platen
0804d077c6
correct attention mask ( #7373 )
2020-09-24 23:22:04 +02:00
Stas Bekman
a8cbc4269c
[fsmt] build/test scripts ( #7257 )
...
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-09-24 17:10:26 -04:00
Sylvain Gugger
a8e7982f84
Remove mentions of RAG from the docs ( #7376 )
...
* Remove mentions of RAG from the docs
* Deactivate check
2020-09-24 17:07:14 -04:00
Stas Bekman
eadd870b2f
[seq2seq] make it easier to run the scripts ( #7274 )
2020-09-24 15:23:48 -04:00
Lysandre Debut
8d3bb781ee
Formatter ( #7368 )
...
* Formatter
* Docs
2020-09-24 10:59:21 -04:00
Teven
7dfdf793bb
Fixing case in which `Trainer` hung while saving model in distributed training ( #7365 )
...
* remote debugging
* remote debugging
* moved _store_flos call
* moved _store_flos call
* moved _store_flos call
* removed debugging artefacts
2020-09-24 09:56:40 -04:00
Sylvain Gugger
0ccb6f5c6d
Clean RAG docs and template docs ( #7348 )
...
* Clean RAG docs and template docs
* Fix typo
* Better doc
2020-09-24 09:24:41 -04:00
Sylvain Gugger
27174bd4fe
Make PyTorch model files independent from each other ( #7352 )
2020-09-24 08:53:54 -04:00
Julien Plu
d161ed1682
Update the TF models to remove their interdependencies ( #7238 )
...
* Refactor the models to remove their interdependencies
* Fix Flaubert model
* Fix Flaubert
* Fix XLM
* Fix Albert
* Fix Roberta
* Fix Albert
* Fix Flaubert
* Apply style + remove unused imports
* Fix Distilbert
* remove unused import
* fix Distilbert
* Fix Flaubert
* Apply style
* Fix Flaubert
* Add the copy comments for the check_copies script
* Fix MobileBert model name
* Address Morgan's comments
* Fix typo
* Oops typo
2020-09-24 08:30:59 -04:00
Jabin Huang
0cffa424f8
Update tokenization_auto.py ( #6870 )
...
Update tokenization_auto.py to handle inherited tokenizers
2020-09-24 06:52:10 -04:00
Daquan Lin
03fb8e79c6
Update modeling_tf_longformer.py ( #7359 )
...
correct a very small mistake
2020-09-24 11:37:29 +02:00
Sylvain Gugger
1ff5bd38a3
Check decorator order ( #7326 )
...
* Check decorator order
* Adapt for parametrized decorators
* Fix typos
2020-09-24 04:54:37 -04:00
Sylvain Gugger
0be5f4a00c
Expand the documentation doc a bit ( #7350 )
2020-09-24 04:34:18 -04:00
Sam Shleifer
38f1703795
wip: Code to add lang tags to marian model cards ( #6586 )
2020-09-23 18:11:06 -04:00
Theo Linnemann
129fdae040
Remove reference to args in XLA check ( #7344 )
...
Previously, the TFTrainingArguments object did a check to see if XLA was enabled, but did this by referencing `self.args.xla`, when it should be `self.xla`, because it is the args object. This can be verified a few lines above, where the XLA field is set.
2020-09-23 13:56:21 -04:00
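The bug pattern described in #7344 can be sketched as follows. This is a minimal stand-in with a hypothetical class and method names, not the real `TFTrainingArguments`: because the object is itself the arguments object, it has no `args` attribute, so `self.args.xla` fails while `self.xla` works.

```python
from dataclasses import dataclass


@dataclass
class SketchArguments:
    """Hypothetical stand-in for an arguments object (an assumption for illustration)."""

    xla: bool = False

    def xla_enabled_buggy(self) -> bool:
        # Bug pattern: there is no `args` attribute on the args object itself.
        return self.args.xla

    def xla_enabled_fixed(self) -> bool:
        # Fix: read the field directly from the object.
        return self.xla


sketch_args = SketchArguments(xla=True)

try:
    sketch_args.xla_enabled_buggy()
    buggy_error = None
except AttributeError as err:
    buggy_error = err

fixed_result = sketch_args.xla_enabled_fixed()
```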
Felipe Curti
d266613635
[Benchmarks] Change all args from `no_...` to their positive form ( #7075 )
...
* Changed name to all no_... arguments and all references to them, inverting the boolean condition
* Change benchmark tests to use new Benchmark Args
* Update src/transformers/benchmark/benchmark_args_utils.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/benchmark/benchmark.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Fix Style. Add --no options in help
* fix some part of tests
* Update src/transformers/benchmark/benchmark_args_utils.py
* Update src/transformers/benchmark/benchmark_args_utils.py
* Update src/transformers/benchmark/benchmark_args_utils.py
* fix all tests
* make style
* add backwards compatibility
* make backwards compatible
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: fmcurti <fcurti@DESKTOP-RRQURBM.localdomain>
2020-09-23 13:25:24 -04:00
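One way to picture the rename with backwards compatibility is the sketch below. The function name and the example keys are hypothetical and this is not the code from #7075; it only shows the general idea of accepting a deprecated `no_...` keyword, warning about it, and storing the inverted boolean under the positive name.

```python
import warnings


def to_positive_form(kwargs: dict) -> dict:
    """Map deprecated `no_<name>` flags to `<name>` with the boolean inverted.

    Hypothetical helper for illustration; the key names are assumptions.
    """
    result = {}
    for key, value in kwargs.items():
        if key.startswith("no_"):
            positive = key[len("no_"):]
            warnings.warn(
                f"'{key}' is deprecated; use '{positive}' instead.",
                DeprecationWarning,
            )
            # `no_<name>=True` means the feature is off, so invert the value.
            result[positive] = not value
        else:
            result[key] = value
    return result


with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    converted = to_positive_form({"no_speed": True, "batch_size": 8})
```

Old call sites keep working under this scheme, while new code can use the positive names directly.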
Doug Blank
8c697d58ef
Ensure that integrations are imported before transformers or ml libs ( #7330 )
...
* Ensure that integrations are imported before transformers or ml libs
* Black reformatter wanted a newline
* isort requests
* black requests
* flake8 requests
2020-09-23 13:23:45 -04:00
Sylvain Gugger
3323146e90
Models doc ( #7345 )
...
* Clean up model documentation
* Formatting
* Preparation work
* Long lines
* Main work on rst files
* Cleanup all config files
* Syntax fix
* Clean all tokenizers
* Work on first models
* Models beginning
* FlauBERT
* All PyTorch models
* All models
* Long lines again
* Fixes
* More fixes
* Update docs/source/model_doc/bert.rst
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update docs/source/model_doc/electra.rst
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Last fixes
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-09-23 13:20:45 -04:00