Commit Graph

5783 Commits

Author SHA1 Message Date
Yossi Synett
bc0d26d1de
[All Seq2Seq model + CLM models that can be used with EncoderDecoder] Add cross-attention weights to outputs (#8071)
* Output cross-attention with decoder attention output

* Update src/transformers/modeling_bert.py

* add cross-attention for t5 and bart as well

* fix tests

* correct typo in docs

* add sylvains and sams comments

* correct typo

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-11-06 19:34:48 +01:00
hassoudi
30f2507a07
Update README.md (#8360)
Fix websitr address
2020-11-06 11:45:46 -05:00
Jonathan Chang
5807ba3fa9
Fix typo (#8351) 2020-11-06 11:19:41 -05:00
hassoudi
82146496b6
Update README.md (#8338)
fixes
2020-11-06 06:20:58 -05:00
ktrapeznikov
9e5c4d39ab
Create README.md (#8312)
* Create README.md

* Update model_cards/ktrapeznikov/gpt2-medium-topic-news/README.md

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-11-06 06:19:59 -05:00
hasantanvir79
06ebc37967
Create README.md (#8255)
* Create README.md

Initial commit

* Updated Read me

Updated

* Apply suggestions from code review

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-11-06 03:34:24 -05:00
Karthik Uppuluri
41cd031cf2
Create README.md (#8169) 2020-11-06 03:26:07 -05:00
Karthik Uppuluri
f932ddeff5
Create README.md (#8170) 2020-11-06 03:25:52 -05:00
Karthik Uppuluri
08b92f78fa
Create README.md (#8168)
* Create README.md

* Update README.md
2020-11-06 03:25:33 -05:00
Karthik Uppuluri
77d62e78b0
Create README.md (#8167)
* Create README.md

Telugu BERTU Readme file

* Update model_cards/kuppuluri/telugu_bertu/README.md

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-11-06 03:24:31 -05:00
Yifan Peng
dd6bfcaefb
Create README.md (#8327) 2020-11-06 03:22:52 -05:00
smanjil
ddeecf08e6
german medbert model details (#8266)
* model details

* Apply suggestions from code review

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-11-06 03:21:13 -05:00
Jiaxin Pei
96baaafd34
Create README.md (#8258) 2020-11-06 03:19:12 -05:00
Stefan Schweter
185259c261
[model_cards] Update Italian BERT models and introduce new Italian XXL ELECTRA model 🎉 (#8343) 2020-11-06 03:17:03 -05:00
Manuel Romero
34bbf60bf8
Model card: GPT-2 fine-tuned on CommonGen (#8248) 2020-11-06 03:15:11 -05:00
Manuel Romero
973218fd3b
Model card: CodeBERT fine-tuned for Insecure Code Detection (#8247)
* Model card: CodeBERT fine-tuned for Insecure Code Detection

* Update model_cards/mrm8488/codebert-base-finetuned-detect-insecure-code/README.md

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-11-06 03:13:45 -05:00
Manuel Romero
f833ca418b
Model card: T5-base fine-tuned on QuaRel (#8334) 2020-11-06 03:09:55 -05:00
Stas Bekman
9edafaebef
[s2s] test_bash_script.py - actually learn something (#8318)
* use decorator

* remove hardcoded paths

* make the test use more data and do real quality tests

* shave off 10 secs

* add --eval_beams 2, reformat

* reduce train size, use smaller custom dataset
2020-11-05 23:15:14 -05:00
Leandro von Werra
17450397a7
Docs bart training ref (#8330)
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-11-05 17:20:57 -05:00
Stas Bekman
d787935a14
[s2s] test_distributed_eval (#8315)
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-11-05 16:01:15 -05:00
Sylvain Gugger
04e442d575
Make Trainer evaluation handle dynamic seq_length (#8336)
* Make Trainer evaluation handle dynamic seq_length

* Document behavior.

* Fix test

* Better fix

* Fixes for realsies this time

* Address review comments

* Without forgetting to save...
2020-11-05 15:13:51 -05:00
Guillaume Filion
27b402cab0
Output global_attentions in Longformer models (#7562)
* Output global_attentions in Longformer models

* make style

* small refactoring

* fix tests

* make fix-copies

* add for tf as well

* remove comments in test

* make fix-copies

* make style

* add docs

* make docstring pretty

Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
2020-11-05 21:10:43 +01:00
Sam Shleifer
7abc1d96d1
no warn (#8329) 2020-11-05 11:42:24 -05:00
Bobby Donchev
52f44dd6d2
change TokenClassificationTask class methods to static methods (#7902)
* change TokenClassificationTask class methods to static methods

Since we do not require self in the class methods of TokenClassificationTask we should probably switch to static methods. Also, since the class TokenClassificationTask does not contain a constructor it is currently unusable as is. By switching to static methods this fixes the issue of having to document the intent of the broken class.

Also, since the get_labels and read_examples_from_file methods are ought to be implemented. Static method definitions are unchanged even after inheritance, which means that it can be overridden, similar to other class methods.

* Trigger Build

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2020-11-05 09:38:30 -05:00
Guillem García Subies
77c8f6c627
Corrected typo in readme (#8320) 2020-11-05 07:48:36 -05:00
Patrick von Platen
226b9debb7
Update PULL_REQUEST_TEMPLATE.md 2020-11-05 09:40:15 +01:00
Patrick von Platen
6f35c61f93
Update bug-report.md 2020-11-05 09:39:05 +01:00
Yifan Peng
638c0b7c50
Create README.md (#8223)
* Create README.md

* Update README.md

* Apply suggestions from code review

Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-11-05 03:03:19 -05:00
Sylvain Gugger
9c4aa4ac1a
Clean up data collators and datasets (#8308)
* Clean up data collators and datasets

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Remove needless clone

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-11-04 17:24:49 -05:00
Manuel Romero
b1d3e95eb5
Fix path to old run_language_modeling.py script (#8302) 2020-11-04 13:17:57 -05:00
Sylvain Gugger
b6e58db277
Speedup doc build (#8301)
* Try -j option

* Try other thing

* Bigger machine

* Test lower sphinx version

* Remove trailing space
2020-11-04 11:51:21 -05:00
Victor SANH
969ccac2e9
adding model cards for distilled models (#8300)
* adding model cards for distil models

* forgot the languages
2020-11-04 11:41:45 -05:00
Nicolas Patry
7342d9a583
Improve QA pipeline error handling (#8286)
- The issue is that with previous code we would have the following:

```python
qa_pipeline = (...)
qa_pipeline(question="Where was he born ?", context="")
-> IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)
```

The goal here is to improve this to actually return a ValueError
wherever possible.

While at it, I tried to simplify QuestionArgumentHandler's code to
make it smaller and more compat while keeping backward compat.
2020-11-04 11:30:42 -05:00
Branden Chan
38630e7a87
Update model cards of deepset/roberta-base-squad2 v1 and v2 (#8241)
* update deepset/roberta-base-squad2 to v2

* Update model_cards/deepset/roberta-base-squad2/README.md

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-11-04 11:21:25 -05:00
Manuel Romero
04561ecbe6
Model card: T5-base fine-tuned on QASC (#8299) 2020-11-04 11:20:15 -05:00
Sylvain Gugger
854b44aa38 Revert size change as it doesn't change anything 2020-11-04 11:13:24 -05:00
Sylvain Gugger
414985c427 Upgrade resource for doc building 2020-11-04 10:44:19 -05:00
Sylvain Gugger
cf89724696
Fix validation file loading in scripts (#8298) 2020-11-04 10:42:18 -05:00
Patrick von Platen
cb966e640b
[Generate Test] fix greedy generate test (#8293)
* fix greedy generate test

* delet ipdb
2020-11-04 15:44:36 +01:00
Pengzhi Gao
734afa37f6
Fix typo in language-modeling README.md (#8287) 2020-11-04 09:38:02 -05:00
Stas Bekman
7a7e2c2606
[blenderbot] regex fix (#8282)
Fixing:

```
src/transformers/tokenization_blenderbot.py:163: DeprecationWarning: invalid escape sequence \s
    token = re.sub("\s{2,}", " ", token)
```
2020-11-04 09:02:28 -05:00
Ceyda Cinarel
29b536a73a
[WIP] Ner pipeline grouped_entities fixes (#5970)
* Bug fix: NER pipeline shouldn't group separate entities of same type

* style fix

* [Bug Fix] Shouldn't group entities that are both 'B' even if they are same type
	(B-type1 B-type1) != (B-type1 I-type1)
[Bug Fix] add an option `ignore_subwords` to ignore subsequent ##wordpieces in predictions. Because some models train on only the first token of a word and not on the subsequent wordpieces (BERT NER default). So it makes sense doing the same thing at inference time.
	The simplest fix is to just group the subwords with the first wordpiece.
	[TODO] how to handle ignored scores? just set them to 0 and calculate zero invariant mean ?
	[TODO] handle different wordpiece_prefix ## ? possible approaches:
		get it from tokenizer? but currently most tokenizers dont have a wordpiece_prefix property?
		have an _is_subword(token)
[Feature add] added option to `skip_special_tokens`. Cause It was harder to remove them after grouping.
[Additional Changes] remove B/I prefix on returned grouped_entities
[Feature Request/TODO] Return indexes?
[Bug TODO]  can't use fast tokenizer with grouped_entities ('BertTokenizerFast' object has no attribute 'convert_tokens_to_string')

* use offset_mapping to fix [UNK] token problem

* ignore score for subwords

* modify ner_pipeline test

* modify ner_pipeline test

* modify ner_pipeline test

* ner_pipeline change ignore_subwords default to true

* add ner_pipeline ignore_subword=False test case

* fix offset_mapping index

* fix style again duh

* change is_subword and convert_tokens_to_string logic

* merge tests with new test structure

* change test names

* remove old tests

* ner tests for fast tokenizer

* fast tokenizers have convert_tokens_to_string

* Fix the incorrect merge

Co-authored-by: Ceyda Cinarel <snu-ceyda@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2020-11-03 17:21:04 -05:00
Stas Bekman
1bb4bba53c
[CIs] Better reports everywhere (#8275)
* make it possible to invoke testconf.py in both test suites without crashing on having the same option added

* perl -pi -e 's|--make_reports|--make-reports|' to be consistent with other opts

* add `pytest --make-reports` to all CIs (and artifacts)

* fix
2020-11-03 16:57:12 -05:00
Sylvain Gugger
7f556d2e39
Data collator for token classification (#8274)
* Add DataCollatorForTokenClassification and clean tests

* Make quality
2020-11-03 16:33:27 -05:00
Philip May
6a064447f2
improve documentation of training_args.py (#8270)
* improve documentation of training_args.py

- do_train
- do_eval
- do_predict

* fix line too long

* fix style with black on training_args.py

* Update src/transformers/training_args.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/training_args.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/training_args.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix line length with utils/style_doc

* black reformatting

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-11-03 15:57:17 -05:00
Sylvain Gugger
4c19f3baab
Clean Trainer tests and datasets dep (#8268) 2020-11-03 15:50:55 -05:00
Patrick von Platen
068e6b5edd
make files independent (#8267) 2020-11-03 21:13:33 +01:00
Stas Bekman
cd360dcb26
[examples] minimal version requirement run-time check in PL (#8133)
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-11-03 13:17:11 -05:00
Stas Bekman
971c638ee9
forward the worker stderr to the parent process (#8262) 2020-11-03 12:04:53 -05:00
Lysandre
eb6313e823 Fix Tatoeba skip 2020-11-03 10:35:00 -05:00