Commit Graph

11 Commits

Author SHA1 Message Date
Amil Khare
c852036b4a
[cleanup] Hoist ModelTester objects to top level (#4939)
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-06-16 08:03:43 -04:00
Sylvain Gugger
4e10acb3e5
Add more models to common tests (#4910) 2020-06-10 13:19:53 -04:00
Bharat Raghunathan
6e603cb789
[All models] Extend config.output_attentions with output_attentions function arguments (#4538)
* DOC: Replace instances of ``config.output_attentions`` with function argument ``output_attentions``

* DOC: Apply Black Formatting

* Fix errors where output_attentions was undefined

* Remove output_attentions in classes per review

* Fix regressions on tests having `output_attention`

* Fix further regressions in tests relating to `output_attentions`

Ensure proper propagation of `output_attentions` as a function parameter
to all model subclasses

* Fix more regressions in `test_output_attentions`

* Fix issues with BertEncoder

* Rename related variables to `output_attentions`

* fix pytorch tests

* fix bert and gpt2 tf

* Fix most TF tests for `test_output_attentions`

* Fix linter errors and more TF tests

* fix conflicts

* DOC: Apply Black Formatting

* Fix errors where output_attentions was undefined

* Remove output_attentions in classes per review

* Fix regressions on tests having `output_attention`

* fix conflicts

* fix conflicts

* fix conflicts

* fix conflicts

* fix pytorch tests

* fix conflicts

* fix conflicts

* Fix linter errors and more TF tests

* fix tf tests

* make style

* fix isort

* improve output_attentions

* improve tensorflow

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-06-09 23:39:06 +02:00
Sylvain Gugger
f1fe18465d
Use labels to remove deprecation warnings (#4807) 2020-06-05 16:41:46 -04:00
Patrick von Platen
56ee2560be
[Longformer] Better handling of global attention mask vs local attention mask (#4672)
* better api

* improve automatic setting of global attention mask

* fix longformer bug

* fix global attention mask in test

* fix global attn mask flatten

* fix slow tests

* update docstring

* update docs and make more robust

* improve attention mask
2020-05-29 17:58:42 +02:00
Patrick von Platen
9c17256447
[Longformer] Multiple choice for longformer (#4645)
* add multiple choice for longformer

* add models to docs

* adapt docstring

* add test to longformer

* add longformer for mc in init and modeling auto

* fix tests
2020-05-29 13:46:08 +02:00
Suraj Patil
e444648a30
LongformerForTokenClassification (#4638) 2020-05-28 12:48:18 +02:00
Suraj Patil
ec4cdfdd05
LongformerForSequenceClassification (#4580)
* LongformerForSequenceClassification

* better naming x=>hidden_states, fix typo in doc

* Update src/transformers/modeling_longformer.py

* Update src/transformers/modeling_longformer.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-05-27 22:30:00 +02:00
Suraj Patil
03d8527de0
Longformer for question answering (#4500)
* added LongformerForQuestionAnswering

* add LongformerForQuestionAnswering

* fix import for LongformerForMaskedLM

* add LongformerForQuestionAnswering

* hardcoded sep_token_id

* compute attention_mask if not provided

* combine global_attention_mask with attention_mask when provided

* update example in  docstring

* add assert error messages, better attention combine

* add test for longformerForQuestionAnswering

* typo

* cast gloabl_attention_mask to long

* make style

* Update src/transformers/configuration_longformer.py

* Update src/transformers/configuration_longformer.py

* fix the code quality

* Merge branch 'longformer-for-question-answering' of https://github.com/patil-suraj/transformers into longformer-for-question-answering

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-05-25 18:43:36 +02:00
Patrick von Platen
aa925a52fa
[Tests, GPU, SLOW] fix a bunch of GPU hardcoded tests in Pytorch (#4468)
* fix gpu slow tests in pytorch

* change model to device syntax
2020-05-19 21:35:04 +02:00
Iz Beltagy
8f1d047148
Longformer (#4352)
* first commit

* bug fixes

* better examples

* undo padding

* remove wrong VOCAB_FILES_NAMES

* License

* make style

* make isort happy

* unit tests

* integration test

* make `black` happy by undoing `isort` changes!!

* lint

* no need for the padding value

* batch_size not bsz

* remove unused type casting

* seqlen not seq_len

* staticmethod

* `bert` selfattention instead of `n2`

* uint8 instead of bool + lints

* pad inputs_embeds using embeddings not a constant

* black

* unit test with padding

* fix unit tests

* remove redundant unit test

* upload model weights

* resolve todo

* simpler _mask_invalid_locations without lru_cache + backward compatible masked_fill_

* increase unittest coverage
2020-05-19 16:04:43 +02:00