transformers/docs/source
Vasudev Gupta 6dfd027279
BigBird (#10183)
* init bigbird

* model.__init__ working, conversion script ready, config updated

* add conversion script

* BigBirdEmbeddings working :)

* slightly update conversion script

* BigBirdAttention working :) ; some bug in layer.output.dense

* add debugger-notebook

* forward() working for BigBirdModel :) ; replaced gelu with gelu_fast

* tf code adapted to torch till rand_attn in bigbird_block_sparse_attention ; till now everything working :)

* BigBirdModel working in block-sparse attention mode :)

* add BigBirdForPreTraining

* small fix

* add tokenizer for BigBirdModel

* fix config & hence modeling

* fix base prefix

* init testing

* init tokenizer test

* pos_embed must be absolute, attn_type=original_full when add_cross_attn=True , nsp loss is optional in BigBirdForPreTraining, add assert statements

* remove position_embedding_type arg

* complete normal tests

* add comments to block sparse attention

* add attn_probs for sliding & global tokens

* create fn for block sparse attn mask creation

* add special tests

* restore pos embed arg

* minor fix

* attn probs update

* make big bird fully gpu friendly

* fix tests

* remove pruning

* correct tokenizer & minor fixes

* update conversion script , remove norm_type

* tokenizer-inference test add

* remove extra comments

* add docs

* save intermediate

* finish trivia_qa conversion

* small update to forward

* correct qa and layer

* better error message

* BigBird QA ready

* fix rebased

* add trivia-qa debugger notebook

* qa setup

* fixed till embeddings

* some issue in q/k/v_layer

* fix bug in conversion-script

* fixed till self-attn

* qa fixed except layer norm

* add qa end2end test

* fix gradient ckpting ; other qa test

* speed-up big bird a bit

* hub_id=google

* clean up

* make quality

* speed up einsum with bmm

* finish perf improvements for big bird

* remove wav2vec2 tok

* fix tokenizer

* include docs

* correct docs

* add helper to auto pad block size

* make style

* remove fast tokenizer for now

* fix some

* add pad test

* finish

* fix some bugs

* fix another bug

* fix buffer tokens

* fix comment and merge from master

* add comments

* make style

* commit some suggestions

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fix typos

* fix some more suggestions

* add another patch

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix copies

* another patch

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* update

* update nit suggestions

* make style

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-03-30 08:51:34 +03:00
..
_static Document v4.4.2 2021-03-18 15:19:25 -04:00
imgs [Templates] Add template "call-for-model" markdown and "call-for-big-bird" markdown (#9921) 2021-02-05 15:47:54 +03:00
internal Instantiate model only once in pipeline (#10888) 2021-03-29 10:39:14 -04:00
main_classes Add ImageFeatureExtractionMixin (#10905) 2021-03-26 11:23:56 -04:00
model_doc BigBird (#10183) 2021-03-30 08:51:34 +03:00
add_new_model.rst Add new model docs (#9667) 2021-02-01 17:55:10 +03:00
benchmarks.rst Make doc styler detect lists on rst (#9488) 2021-01-11 08:53:41 -05:00
bertology.rst Fix documentation links always pointing to master. (#9217) 2021-01-05 06:18:48 -05:00
community.md Add notebook on fine-tuning Bart (#10883) 2021-03-24 11:03:37 -04:00
conf.py Development on v4.5.0dev0 2021-03-16 11:41:15 -04:00
contributing.md Update installation page and add contributing to the doc (#5084) 2020-06-17 14:01:10 -04:00
converting_tensorflow_models.rst Fix broken links in the converting tf ckpt document (#9791) 2021-01-26 03:37:57 -05:00
custom_datasets.rst Rename NLP library to Datasets library (#10920) 2021-03-26 08:07:59 -04:00
examples.md per_device instead of per_gpu/error thrown when argument unknown (#4618) 2020-05-27 11:36:55 -04:00
favicon.ico Adding usage examples for common tasks (#2850) 2020-02-25 13:48:24 -05:00
glossary.rst Adds terms to Glossary (#10443) 2021-02-28 08:27:54 -05:00
index.rst BigBird (#10183) 2021-03-30 08:51:34 +03:00
installation.md split seq2seq script into summarization & translation (#10611) 2021-03-15 09:11:42 -04:00
migration.md Copyright (#8970) 2020-12-07 18:36:34 -05:00
model_sharing.rst [doc] nested markup is invalid in rst (#9898) 2021-01-30 09:59:19 -05:00
model_summary.rst ConvBERT Model (#9717) 2021-01-27 03:20:09 -05:00
multilingual.rst Fix documentation links always pointing to master. (#9217) 2021-01-05 06:18:48 -05:00
notebooks.md Update notebooks (#3620) 2020-04-06 14:32:39 -04:00
perplexity.rst Copyright (#8970) 2020-12-07 18:36:34 -05:00
philosophy.rst Minor documentation revisions from copyediting (#9266) 2020-12-23 10:15:49 -05:00
preprocessing.rst Minor documentation revisions from copyediting (#9266) 2020-12-23 10:15:49 -05:00
pretrained_models.rst Add m2m100 (#10236) 2021-03-06 22:14:16 +05:30
quicktour.rst Minor documentation revisions from copyediting (#9266) 2020-12-23 10:15:49 -05:00
sagemaker.md make local setup more clearer and added missing links (#10899) 2021-03-25 09:01:31 -04:00
serialization.rst Copyright (#8970) 2020-12-07 18:36:34 -05:00
task_summary.rst split seq2seq script into summarization & translation (#10611) 2021-03-15 09:11:42 -04:00
testing.rst [doc] [testing] extend the pytest -k section with more examples (#10761) 2021-03-17 09:23:38 -04:00
tokenizer_summary.rst Minor documentation revisions from copyediting (#9266) 2020-12-23 10:15:49 -05:00
training.rst [trainer] deepspeed integration (#9211) 2021-01-12 19:05:18 -08:00