transformers/docs/source/model_doc
Suraj Patil 860264379f
GPT Neo (#10848)
* lets begin

* boom boom

* fix out proj in attn

* fix attention

* fix local attention

* add tokenizer

* fix imports

* autotokenizer

* fix checkpoint name

* cleanup

* more clean-up

* more cleanup

* output attentions

* fix attn mask creation

* fix imports

* config doc

* add tests

* add slow tests

* quality

* add conversion script

* copyright

* typo

* another bites the dust

* fix attention tests

* doc

* add embed init in convert function

* fix copies

* remove tokenizer

* enable caching

* address review comments

* improve config and create attn layer list internally

* more consistent naming

* init hf config from mesh-tf config json file

* remove neo tokenizer from doc

* handle attention_mask in local attn layer

* attn_layers => attention_layers

* add tokenizer_class in config

* fix docstring

* raise if len of attention_layers is not same as num_layers

* remove tokenizer_class from config

* more consistent naming

* fix doc

* fix checkpoint names

* fp16 compat

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-03-30 09:42:30 -04:00
albert.rst Enforce all objects in the main init are documented (#9014) 2020-12-10 11:57:12 -05:00
auto.rst AutoModelForTableQuestionAnswering (#9154) 2020-12-16 12:14:33 -05:00
bart.rst BartForCausalLM analogs to ProphetNetForCausalLM (#9128) 2021-02-04 11:56:12 +03:00
barthez.rst Fix documentation links always pointing to master. (#9217) 2021-01-05 06:18:48 -05:00
bert.rst Enforce all objects in the main init are documented (#9014) 2020-12-10 11:57:12 -05:00
bertgeneration.rst Copyright (#8970) 2020-12-07 18:36:34 -05:00
bertweet.rst Improve documentation coverage for Bertweet (#9379) 2021-01-04 13:12:59 -05:00
bigbird.rst BigBird (#10183) 2021-03-30 08:51:34 +03:00
blenderbot_small.rst BartForCausalLM analogs to ProphetNetForCausalLM (#9128) 2021-02-04 11:56:12 +03:00
blenderbot.rst BartForCausalLM analogs to ProphetNetForCausalLM (#9128) 2021-02-04 11:56:12 +03:00
bort.rst ADD BORT (#9813) 2021-01-27 21:25:11 +03:00
camembert.rst Enforce all objects in the main init are documented (#9014) 2020-12-10 11:57:12 -05:00
convbert.rst Fix doc for TFConverBertModel 2021-02-04 10:14:46 -05:00
ctrl.rst Added TF CTRL Sequence Classification (#9151) 2020-12-17 18:10:57 -05:00
deberta_v2.rst Integrate DeBERTa v2(the 1.5B model surpassed human performance on Su… (#10018) 2021-02-19 18:34:44 -05:00
deberta.rst Integrate DeBERTa v2(the 1.5B model surpassed human performance on Su… (#10018) 2021-02-19 18:34:44 -05:00
dialogpt.rst ADD BORT (#9813) 2021-01-27 21:25:11 +03:00
distilbert.rst Copyright (#8970) 2020-12-07 18:36:34 -05:00
dpr.rst Copyright (#8970) 2020-12-07 18:36:34 -05:00
electra.rst Copyright (#8970) 2020-12-07 18:36:34 -05:00
encoderdecoder.rst Copyright (#8970) 2020-12-07 18:36:34 -05:00
flaubert.rst Copyright (#8970) 2020-12-07 18:36:34 -05:00
fsmt.rst Deprecate prepare_seq2seq_batch (#10287) 2021-02-22 12:36:16 -05:00
funnel.rst Copyright (#8970) 2020-12-07 18:36:34 -05:00
gpt_neo.rst GPT Neo (#10848) 2021-03-30 09:42:30 -04:00
gpt.rst Added TF OpenAi GPT1 Sequence Classification (#9105) 2020-12-15 11:27:08 -05:00
gpt2.rst Copyright (#8970) 2020-12-07 18:36:34 -05:00
herbert.rst Improve documentation coverage for Herbert (#9428) 2021-01-06 09:13:43 -05:00
ibert.rst Update ibert.rst (#10445) 2021-02-28 19:03:49 +03:00
layoutlm.rst Layout lm tf 2 (#10636) 2021-03-25 12:32:38 -04:00
led.rst Upgrade styler to better handle lists (#9423) 2021-01-06 07:46:17 -05:00
longformer.rst Add message to documentation that longformer doesn't support token_type_ids (#9152) 2020-12-16 11:06:14 -05:00
lxmert.rst Copyright (#8970) 2020-12-07 18:36:34 -05:00
m2m_100.rst fix M2M100 example (#10745) 2021-03-16 20:20:00 +05:30
marian.rst Deprecate prepare_seq2seq_batch (#10287) 2021-02-22 12:36:16 -05:00
mbart.rst Deprecate prepare_seq2seq_batch (#10287) 2021-02-22 12:36:16 -05:00
mobilebert.rst Copyright (#8970) 2020-12-07 18:36:34 -05:00
mpnet.rst MPNet copyright files (#9015) 2020-12-10 09:29:38 -05:00
mt5.rst Enforce all objects in the main init are documented (#9014) 2020-12-10 11:57:12 -05:00
pegasus.rst Fix broken link (#10656) 2021-03-11 14:29:02 -05:00
phobert.rst Improve documentation coverage for Phobert (#9427) 2021-01-06 10:04:32 -05:00
prophetnet.rst Copyright (#8970) 2020-12-07 18:36:34 -05:00
rag.rst Add TFRag (#9002) 2021-03-09 00:49:51 +03:00
reformer.rst Enforce all objects in the main init are documented (#9014) 2020-12-10 11:57:12 -05:00
retribert.rst Copyright (#8970) 2020-12-07 18:36:34 -05:00
roberta.rst Copyright (#8970) 2020-12-07 18:36:34 -05:00
speech_to_text.rst Fix S2T example (#10741) 2021-03-16 08:55:07 -04:00
squeezebert.rst Copyright (#8970) 2020-12-07 18:36:34 -05:00
t5.rst Deprecate prepare_seq2seq_batch (#10287) 2021-02-22 12:36:16 -05:00
tapas.rst Fix URLs to TAPAS notebooks (#9435) 2021-01-06 07:20:41 -05:00
transformerxl.rst Fix script that check objects are documented (#9259) 2020-12-22 11:12:58 -05:00
wav2vec2.rst Add Fine-Tuning for Wav2Vec2 (#10145) 2021-03-01 12:13:17 +03:00
xlm.rst Copyright (#8970) 2020-12-07 18:36:34 -05:00
xlmprophetnet.rst Copyright (#8970) 2020-12-07 18:36:34 -05:00
xlmroberta.rst Enforce all objects in the main init are documented (#9014) 2020-12-10 11:57:12 -05:00
xlnet.rst Enforce all objects in the main init are documented (#9014) 2020-12-10 11:57:12 -05:00
xlsr_wav2vec2.rst [XLSR-Wav2Vec2] Add multi-lingual Wav2Vec2 models (#10648) 2021-03-11 17:44:18 +03:00