transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-28 16:52:24 +06:00

Vasudev Gupta 6dfd027279 BigBird (#10183)

* init bigbird
* model.__init__ working, conversion script ready, config updated
* add conversion script
* BigBirdEmbeddings working :)
* slightly update conversion script
* BigBirdAttention working :) ; some bug in layer.output.dense
* add debugger-notebook
* forward() working for BigBirdModel :) ; replaced gelu with gelu_fast
* tf code adapted to torch till rand_attn in bigbird_block_sparse_attention ; till now everything working :)
* BigBirdModel working in block-sparse attention mode :)
* add BigBirdForPreTraining
* small fix
* add tokenizer for BigBirdModel
* fix config & hence modeling
* fix base prefix
* init testing
* init tokenizer test
* pos_embed must be absolute, attn_type=original_full when add_cross_attn=True , nsp loss is optional in BigBirdForPreTraining, add assert statements
* remove position_embedding_type arg
* complete normal tests
* add comments to block sparse attention
* add attn_probs for sliding & global tokens
* create fn for block sparse attn mask creation
* add special tests
* restore pos embed arg
* minor fix
* attn probs update
* make big bird fully gpu friendly
* fix tests
* remove pruning
* correct tokenzier & minor fixes
* update conversion script , remove norm_type
* tokenizer-inference test add
* remove extra comments
* add docs
* save intermediate
* finish trivia_qa conversion
* small update to forward
* correct qa and layer
* better error message
* BigBird QA ready
* fix rebased
* add triva-qa debugger notebook
* qa setup
* fixed till embeddings
* some issue in q/k/v_layer
* fix bug in conversion-script
* fixed till self-attn
* qa fixed except layer norm
* add qa end2end test
* fix gradient ckpting ; other qa test
* speed-up big bird a bit
* hub_id=google
* clean up
* make quality
* speed up einsum with bmm
* finish perf improvements for big bird
* remove wav2vec2 tok
* fix tokenizer
* include docs
* correct docs
* add helper to auto pad block size
* make style
* remove fast tokenizer for now
* fix some
* add pad test
* finish
* fix some bugs
* fix another bug
* fix buffer tokens
* fix comment and merge from master
* add comments
* make style
* commit some suggestions

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fix typos
* fix some more suggestions
* add another patch

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix copies
* another path

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* update
* update nit suggestions
* make style

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

2021-03-30 08:51:34 +03:00
_static	Document v4.4.2	2021-03-18 15:19:25 -04:00
imgs	[Templates] Add template "call-for-model" markdown and "call-for-big-bird" markdown (#9921)	2021-02-05 15:47:54 +03:00
internal	Instantiate model only once in pipeline (#10888)	2021-03-29 10:39:14 -04:00
main_classes	Add ImageFeatureExtractionMixin (#10905)	2021-03-26 11:23:56 -04:00
model_doc	BigBird (#10183)	2021-03-30 08:51:34 +03:00
add_new_model.rst	Add new model docs (#9667)	2021-02-01 17:55:10 +03:00
benchmarks.rst	Make doc styler detect lists on rst (#9488)	2021-01-11 08:53:41 -05:00
bertology.rst	Fix documentation links always pointing to master. (#9217)	2021-01-05 06:18:48 -05:00
community.md	Add notebook on fine-tuning Bart (#10883)	2021-03-24 11:03:37 -04:00
conf.py	Development on v4.5.0dev0	2021-03-16 11:41:15 -04:00
contributing.md	Update installation page and add contributing to the doc (#5084)	2020-06-17 14:01:10 -04:00
converting_tensorflow_models.rst	Fix broken links in the converting tf ckpt document (#9791)	2021-01-26 03:37:57 -05:00
custom_datasets.rst	Rename NLP library to Datasets library (#10920)	2021-03-26 08:07:59 -04:00
examples.md	per_device instead of per_gpu/error thrown when argument unknown (#4618)	2020-05-27 11:36:55 -04:00
favicon.ico	Adding usage examples for common tasks (#2850)	2020-02-25 13:48:24 -05:00
glossary.rst	Adds terms to Glossary (#10443)	2021-02-28 08:27:54 -05:00
index.rst	BigBird (#10183)	2021-03-30 08:51:34 +03:00
installation.md	split seq2seq script into summarization & translation (#10611)	2021-03-15 09:11:42 -04:00
migration.md	Copyright (#8970)	2020-12-07 18:36:34 -05:00
model_sharing.rst	[doc] nested markup is invalid in rst (#9898)	2021-01-30 09:59:19 -05:00
model_summary.rst	ConvBERT Model (#9717)	2021-01-27 03:20:09 -05:00
multilingual.rst	Fix documentation links always pointing to master. (#9217)	2021-01-05 06:18:48 -05:00
notebooks.md	Update notebooks (#3620)	2020-04-06 14:32:39 -04:00
perplexity.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
philosophy.rst	Minor documentation revisions from copyediting (#9266)	2020-12-23 10:15:49 -05:00
preprocessing.rst	Minor documentation revisions from copyediting (#9266)	2020-12-23 10:15:49 -05:00
pretrained_models.rst	Add m2m100 (#10236)	2021-03-06 22:14:16 +05:30
quicktour.rst	Minor documentation revisions from copyediting (#9266)	2020-12-23 10:15:49 -05:00
sagemaker.md	make local setup more clearer and added missing links (#10899)	2021-03-25 09:01:31 -04:00
serialization.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
task_summary.rst	split seq2seq script into summarization & translation (#10611)	2021-03-15 09:11:42 -04:00
testing.rst	[doc] [testing] extend the pytest -k section with more examples (#10761)	2021-03-17 09:23:38 -04:00
tokenizer_summary.rst	Minor documentation revisions from copyediting (#9266)	2020-12-23 10:15:49 -05:00
training.rst	[trainer] deepspeed integration (#9211)	2021-01-12 19:05:18 -08:00