* added LongformerForQuestionAnswering
* add LongformerForQuestionAnswering
* fix import for LongformerForMaskedLM
* add LongformerForQuestionAnswering
* hardcoded sep_token_id
* compute attention_mask if not provided
* combine global_attention_mask with attention_mask when provided
* update example in docstring
* add assert error messages, better attention combine
* add test for longformerForQuestionAnswering
* typo
* cast gloabl_attention_mask to long
* make style
* Update src/transformers/configuration_longformer.py
* Update src/transformers/configuration_longformer.py
* fix the code quality
* Merge branch 'longformer-for-question-answering' of https://github.com/patil-suraj/transformers into longformer-for-question-answering
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Add Type Hints to modeling_utils.py Closes#3911
Add Type Hints to methods in `modeling_utils.py`
Note: The coverage isn't 100%. Mostly skipped internal methods.
* Reformat according to `black` and `isort`
* Use typing.Iterable instead of Sequence
* Parameterize Iterable by its generic type
* Use typing.Optional when None is the default value
* Adhere to style guideline
* Update src/transformers/modeling_utils.py
* Update src/transformers/modeling_utils.py
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
* Warn the user about max_len being on the path to be deprecated.
* Ensure better backward compatibility when max_len is provided to a tokenizer.
* Make sure to override the parameter and not the actual instance value.
* Format & quality
* Adds predict stage for glue tasks, and generate result files which could be submitted to gluebenchmark.com website.
* Use Split enum + always output the label name
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
* first commit
* bug fixes
* better examples
* undo padding
* remove wrong VOCAB_FILES_NAMES
* License
* make style
* make isort happy
* unit tests
* integration test
* make `black` happy by undoing `isort` changes!!
* lint
* no need for the padding value
* batch_size not bsz
* remove unused type casting
* seqlen not seq_len
* staticmethod
* `bert` selfattention instead of `n2`
* uint8 instead of bool + lints
* pad inputs_embeds using embeddings not a constant
* black
* unit test with padding
* fix unit tests
* remove redundant unit test
* upload model weights
* resolve todo
* simpler _mask_invalid_locations without lru_cache + backward compatible masked_fill_
* increase unittest coverage
* Distributed eval: SequentialDistributedSampler + gather all results
* For consistency only write to disk from world_master
Close https://github.com/huggingface/transformers/issues/4272
* Working distributed eval
* Hook into scripts
* Fix#3721 again
* TPU.mesh_reduce: stay in tensor space
Thanks @jysohn23
* Just a small comment
* whitespace
* torch.hub: pip install packaging
* Add test scenarii
* makes fetching last learning late in trainer backward compatible
* split comment to multiple lines
* fixes black styling issue
* uses version to create a more explicit logic