transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 10:12:23 +06:00

Vasudev Gupta 6dfd027279 BigBird (#10183)	2021-03-30 08:51:34 +03:00

* init bigbird
* model.__init__ working, conversion script ready, config updated
* add conversion script
* BigBirdEmbeddings working :)
* slightly update conversion script
* BigBirdAttention working :) ; some bug in layer.output.dense
* add debugger-notebook
* forward() working for BigBirdModel :) ; replaced gelu with gelu_fast
* tf code adapted to torch till rand_attn in bigbird_block_sparse_attention ; till now everything working :)
* BigBirdModel working in block-sparse attention mode :)
* add BigBirdForPreTraining
* small fix
* add tokenizer for BigBirdModel
* fix config & hence modeling
* fix base prefix
* init testing
* init tokenizer test
* pos_embed must be absolute, attn_type=original_full when add_cross_attn=True , nsp loss is optional in BigBirdForPreTraining, add assert statements
* remove position_embedding_type arg
* complete normal tests
* add comments to block sparse attention
* add attn_probs for sliding & global tokens
* create fn for block sparse attn mask creation
* add special tests
* restore pos embed arg
* minor fix
* attn probs update
* make big bird fully gpu friendly
* fix tests
* remove pruning
* correct tokenzier & minor fixes
* update conversion script , remove norm_type
* tokenizer-inference test add
* remove extra comments
* add docs
* save intermediate
* finish trivia_qa conversion
* small update to forward
* correct qa and layer
* better error message
* BigBird QA ready
* fix rebased
* add triva-qa debugger notebook
* qa setup
* fixed till embeddings
* some issue in q/k/v_layer
* fix bug in conversion-script
* fixed till self-attn
* qa fixed except layer norm
* add qa end2end test
* fix gradient ckpting ; other qa test
* speed-up big bird a bit
* hub_id=google
* clean up
* make quality
* speed up einsum with bmm
* finish perf improvements for big bird
* remove wav2vec2 tok
* fix tokenizer
* include docs
* correct docs
* add helper to auto pad block size
* make style
* remove fast tokenizer for now
* fix some
* add pad test
* finish
* fix some bugs
* fix another bug
* fix buffer tokens
* fix comment and merge from master
* add comments
* make style
* commit some suggestions
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Fix typos
* fix some more suggestions
* add another patch
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fix copies
* another path
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* update
* update nit suggestions
* make style

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

albert.rst	Enforce all objects in the main init are documented (#9014)	2020-12-10 11:57:12 -05:00
auto.rst	AutoModelForTableQuestionAnswering (#9154)	2020-12-16 12:14:33 -05:00
bart.rst	BartForCausalLM analogs to `ProphetNetForCausalLM` (#9128)	2021-02-04 11:56:12 +03:00
barthez.rst	Fix documentation links always pointing to master. (#9217)	2021-01-05 06:18:48 -05:00
bert.rst	Enforce all objects in the main init are documented (#9014)	2020-12-10 11:57:12 -05:00
bertgeneration.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
bertweet.rst	Improve documentation coverage for Bertweet (#9379)	2021-01-04 13:12:59 -05:00
bigbird.rst	BigBird (#10183)	2021-03-30 08:51:34 +03:00
blenderbot_small.rst	BartForCausalLM analogs to `ProphetNetForCausalLM` (#9128)	2021-02-04 11:56:12 +03:00
blenderbot.rst	BartForCausalLM analogs to `ProphetNetForCausalLM` (#9128)	2021-02-04 11:56:12 +03:00
bort.rst	ADD BORT (#9813)	2021-01-27 21:25:11 +03:00
camembert.rst	Enforce all objects in the main init are documented (#9014)	2020-12-10 11:57:12 -05:00
convbert.rst	Fix doc for TFConverBertModel	2021-02-04 10:14:46 -05:00
ctrl.rst	Added TF CTRL Sequence Classification (#9151)	2020-12-17 18:10:57 -05:00
deberta_v2.rst	Integrate DeBERTa v2(the 1.5B model surpassed human performance on Su… (#10018)	2021-02-19 18:34:44 -05:00
deberta.rst	Integrate DeBERTa v2(the 1.5B model surpassed human performance on Su… (#10018)	2021-02-19 18:34:44 -05:00
dialogpt.rst	ADD BORT (#9813)	2021-01-27 21:25:11 +03:00
distilbert.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
dpr.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
electra.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
encoderdecoder.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
flaubert.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
fsmt.rst	Deprecate prepare_seq2seq_batch (#10287)	2021-02-22 12:36:16 -05:00
funnel.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
gpt.rst	Added TF OpenAi GPT1 Sequence Classification (#9105)	2020-12-15 11:27:08 -05:00
gpt2.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
herbert.rst	Improve documentation coverage for Herbert (#9428)	2021-01-06 09:13:43 -05:00
ibert.rst	Update ibert.rst (#10445)	2021-02-28 19:03:49 +03:00
layoutlm.rst	Layout lm tf 2 (#10636)	2021-03-25 12:32:38 -04:00
led.rst	Upgrade styler to better handle lists (#9423)	2021-01-06 07:46:17 -05:00
longformer.rst	Add message to documentation that longformer doesn't support token_type_ids (#9152)	2020-12-16 11:06:14 -05:00
lxmert.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
m2m_100.rst	fix M2M100 example (#10745)	2021-03-16 20:20:00 +05:30
marian.rst	Deprecate prepare_seq2seq_batch (#10287)	2021-02-22 12:36:16 -05:00
mbart.rst	Deprecate prepare_seq2seq_batch (#10287)	2021-02-22 12:36:16 -05:00
mobilebert.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
mpnet.rst	MPNet copyright files (#9015)	2020-12-10 09:29:38 -05:00
mt5.rst	Enforce all objects in the main init are documented (#9014)	2020-12-10 11:57:12 -05:00
pegasus.rst	Fix broken link (#10656)	2021-03-11 14:29:02 -05:00
phobert.rst	Improve documentation coverage for Phobert (#9427)	2021-01-06 10:04:32 -05:00
prophetnet.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
rag.rst	Add TFRag (#9002)	2021-03-09 00:49:51 +03:00
reformer.rst	Enforce all objects in the main init are documented (#9014)	2020-12-10 11:57:12 -05:00
retribert.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
roberta.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
speech_to_text.rst	Fix S2T example (#10741)	2021-03-16 08:55:07 -04:00
squeezebert.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
t5.rst	Deprecate prepare_seq2seq_batch (#10287)	2021-02-22 12:36:16 -05:00
tapas.rst	Fix URLs to TAPAS notebooks (#9435)	2021-01-06 07:20:41 -05:00
transformerxl.rst	Fix script that check objects are documented (#9259)	2020-12-22 11:12:58 -05:00
wav2vec2.rst	Add Fine-Tuning for Wav2Vec2 (#10145)	2021-03-01 12:13:17 +03:00
xlm.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
xlmprophetnet.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
xlmroberta.rst	Enforce all objects in the main init are documented (#9014)	2020-12-10 11:57:12 -05:00
xlnet.rst	Enforce all objects in the main init are documented (#9014)	2020-12-10 11:57:12 -05:00
xlsr_wav2vec2.rst	[XLSR-Wav2Vec2] Add multi-lingual Wav2Vec2 models (#10648)	2021-03-11 17:44:18 +03:00