Commit Graph

5759 Commits

Author SHA1 Message Date
Julien Chaumond
a0a6387a0d
[model_cards] roberta-large-mnli: fix sep_token 2020-07-02 10:04:02 -04:00
Julien Chaumond
215db688da
Create roberta-large-mnli-README.md 2020-07-02 09:43:54 -04:00
Lysandre Debut
69d313e808
Bans SentencePiece 0.1.92 (#5418) 2020-07-02 09:23:00 -04:00
George Ho
84e56669af
Fix typo in glossary (#5466) 2020-07-02 09:19:33 -04:00
Teven
c6a510c6fa
Fixing missing arguments for TransfoXL tokenizer when using TextGenerationPipeline (#5465)
* overriding _parse_and_tokenize in `TextGenerationPipeine` to allow for TransfoXl tokenizer arguments
2020-07-02 13:53:33 +02:00
Teven
6726416e4a
Changed expected_output_ids in TransfoXL generation test (#5462)
* Changed expected_output_ids in TransfoXL generation test to match #4826 generation PR.

* making black happy

* making isort happy
2020-07-02 11:56:44 +02:00
tommccoy
812def00c9
fix use of mems in Transformer-XL (#4826)
Fixed duplicated memory use in Transformer-XL generation leading to bad predictions and performance.
2020-07-02 11:19:07 +02:00
Patrick von Platen
306f1a2695
Add Reformer MLM notebook (#5450)
* Add Reformer MLM notebook

* Update notebooks/README.md
2020-07-02 00:20:49 +02:00
Patrick von Platen
d16e36c7e5
[Reformer] Add Masked LM Reformer (#5426)
* fix conflicts

* fix

* happy rebasing
2020-07-01 22:43:18 +02:00
Funtowicz Morgan
f4323dbf8c
Don't discard entity_group when token is the latest in the sequence. (#5439)
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
2020-07-01 20:30:42 +02:00
Joe Davison
35befd9ce3
Fix tensor label type inference in default collator (#5250)
* allow tensor label inputs to default collator

* replace try/except with type check
2020-07-01 10:40:14 -06:00
Patrick von Platen
fe81f7d12c
finish reformer qa head (#5433) 2020-07-01 12:27:14 -04:00
Patrick von Platen
d697b6ca75
[Longformer] Major Refactor (#5219)
* refactor naming

* add small slow test

* refactor

* refactor naming

* rename selected to extra

* big global attention refactor

* make style

* refactor naming

* save intermed

* refactor functions

* finish function refactor

* fix tests

* fix longformer

* fix longformer

* fix longformer

* fix all tests but one

* finish longformer

* address sams and izs comments

* fix transpose
2020-07-01 17:43:32 +02:00
Sam Shleifer
e0d58ddb65
[fix] Marian tests import (#5442) 2020-07-01 11:42:22 -04:00
Funtowicz Morgan
608d5a7c44
Raises PipelineException on FillMaskPipeline when there are != 1 mask_token in the input (#5389)
* Added PipelineException

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* fill-mask pipeline raises exception when more than one mask_token detected.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Put everything in a function.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Added tests on pipeline fill-mask when input has != 1 mask_token

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Fix numel() computation for TF

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Addressing PR comments.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Remove function typing to avoid import on specific framework.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Quality.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Retry typing with @julien-c tip.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Quality².

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Simplify fill-mask mask_token checking.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Trigger CI
2020-07-01 17:27:47 +02:00
Sylvain Gugger
6c55e9fc32
Fix dropdown bug in searches (#5440)
* Trigger CI

* Fix dropdown bug in searches
2020-07-01 11:02:59 -04:00
Sylvain Gugger
734a28a767
Clean up diffs in Trainer/TFTrainer (#5417)
* Cleanup and unify Trainer/TFTrainer

* Forgot to adapt TFTrainingArgs

* In tf scripts n_gpu -> n_replicas

* Update src/transformers/training_args.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Address review comments

* Formatting

* Fix typo

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-07-01 11:00:20 -04:00
Sam Shleifer
43cb03a93d
MarianTokenizer.prepare_translation_batch uses new tokenizer API (#5182) 2020-07-01 10:32:50 -04:00
Sam Shleifer
13deb95a40
Move tests/utils.py -> transformers/testing_utils.py (#5350) 2020-07-01 10:31:17 -04:00
sgugger
9c219305f5 Trigger CI 2020-07-01 10:22:50 -04:00
Sylvain Gugger
64e3d966b1
Add support for past states (#5399)
* Add support for past states

* Style and forgotten self

* You mean, documenting is not enough? I have to actually add it too?

* Add memory support during evaluation

* Fix tests in eval and add TF support

* No need to change this line anymore
2020-07-01 08:11:55 -04:00
Sylvain Gugger
4ade7491f4
Fix examples titles and optimization doc page (#5408) 2020-07-01 08:11:25 -04:00
Moseli Motsoehli
d60d231ea4
Create README.md (#5422)
* Create README.md

* Update model_cards/MoseliMotsoehli/TswanaBert/README.md

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-07-01 05:01:51 -04:00
Jay
298bdab18a
Create model card for schmidek/electra-small-cased (#5400) 2020-07-01 04:01:56 -04:00
Julien Plu
fcf0652460
Fix TensorFlow dataset generator (#4881)
* fix TensorFlow generator

* Better features handling

* Apply style

* Apply style

* Fix squad as well

* Apply style

* Better factorization of TF Tensors creation
2020-06-30 19:49:11 -04:00
Hong Xu
501040fd30
In the run_ner.py example, give the optional label arg a default value (#5326)
Otherwise, if label is not specified, the following error occurs:

	Traceback (most recent call last):
	  File "run_ner.py", line 303, in <module>
	    main()
	  File "run_ner.py", line 101, in main
	    model_args, data_args, training_args = parser.parse_json_file(json_file=os.path.abspath(sys.argv[1]))
	  File "/home/user/anaconda3/envs/bert/lib/python3.7/site-packages/transformers/hf_argparser.py", line 159, in parse_json_file
	    obj = dtype(**inputs)
	TypeError: __init__() missing 1 required positional argument: 'labels'
2020-06-30 19:45:35 -04:00
Sam Shleifer
b45e65efa0
Avoid deprecation warning for F.tanh (#5413) 2020-06-30 16:41:43 -04:00
Sam Shleifer
23231c0f78
[GH Runner] fix yaml indent (#5412) 2020-06-30 16:17:12 -04:00
Sam Shleifer
ac61114592
[CI] gh runner doesn't use -v, cats new result (#5409) 2020-06-30 16:12:14 -04:00
Sam Shleifer
27a7fe7a8d
examples/seq2seq: never override $WANDB_PROJECT (#5407) 2020-06-30 15:29:13 -04:00
Sam Shleifer
32d2031458
[fix] slow fill_mask test failure (#5406) 2020-06-30 15:28:15 -04:00
Sam Shleifer
80aa4b8aa6
[CI] GH-runner stores artifacts like CircleCI (#5318) 2020-06-30 15:01:53 -04:00
Sylvain Gugger
87716a6d07
Documentation for the Trainer API (#5383)
* Documentation for the Trainer API

* Address review comments

* Address comments
2020-06-30 11:43:43 -04:00
Yacine Jernite
c4d4e8bdbd
Move GenerationMixin to separate file (#5254)
* separate_generation_code

* isort

* renamed

* rename_files

* move_shapelit
2020-06-30 10:42:08 -04:00
Lysandre
90d13954c4 Repin versions 2020-06-30 09:16:36 -04:00
Sylvain Gugger
0607b88945
How to share model cards with the CLI (#5374)
* How to share model cards

* Switch the two options

* Fix bad copy/cut

* Julien's suggestion
2020-06-30 08:59:32 -04:00
Kevin Canwen Xu
331d8d2936
Upload DistilBART artwork (#5394) 2020-06-30 18:11:11 +08:00
Manuel Romero
09e841490c
Model Card Fixing (#5369)
- Fix missing ```-``` in language meta
- T5 pic uploaded to a more permanent place
2020-06-30 18:02:24 +08:00
Manuel Romero
4c5bed192a
Model Card Fixing (#5373)
- T5 pic uploaded to a more permanent place
2020-06-30 18:01:45 +08:00
Manuel Romero
02509d4b06
Model Card Fixing (#5371)
- Model pic uploaded to a more permanent place
2020-06-30 18:01:11 +08:00
Manuel Romero
79f0118c72
Model Card Fixing (#5370)
- Fix missing ```-``` in language meta
- T5 pic uploaded to a more permanent place
2020-06-30 18:00:29 +08:00
MichaelJanz
9a473f1e43
Update Bertabs example to work again (#5355)
* Fix the bug 'Attempted relative import with no known parent package' when using the bertabs example. Also change the used model from bertabs-finetuned-cnndm, since it seems not be accessible anymore

* Update run_summarization.py

Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
2020-06-30 14:05:01 +08:00
Sylvain Gugger
7f60e93ac5
Mention openAI model card and merge content (#5378)
* Mention openAI model card and merge content

* Fix sentence
2020-06-29 18:27:36 -04:00
chrisliu
482a5993c2
Fix model card folder name so that it is consistent with model hub (#5368)
* Merge upstream

* Merge upstream

* Add generate.py link

* Merge upstream

* Merge upstream

* Fix folder name
2020-06-29 12:54:30 -04:00
chrisliu
97f24303e8
Add link to file and fix typos in model card (#5367)
* Merge upstream

* Merge upstream

* Add generate.py link
2020-06-29 11:34:52 -04:00
Lysandre Debut
b9ee87f5c7
Doc for v3.0.0 (#5366)
* Doc for v3.0.0

* Update docs/source/_static/js/custom.js

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/_static/js/custom.js

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-06-29 11:08:54 -04:00
Lysandre
b62ca59527 Release: v3.0.0 2020-06-29 10:40:13 -04:00
Sam Shleifer
a316a6aaa8
[seq2seq docs] Move evaluation down, fix typo (#5365) 2020-06-29 10:36:04 -04:00
Patrick von Platen
4bcc35cd69
[Docs] Benchmark docs (#5360)
* first doc version

* add benchmark docs

* fix typos

* improve README

* Update docs/source/benchmarks.rst

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* fix naming and docs

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-06-29 16:08:57 +02:00
Sylvain Gugger
482c9178d3
Pin mecab for now (#5362) 2020-06-29 09:51:13 -04:00