transformers/docs/source/model_doc
Yih-Dar 8b240a0661
Add TFEncoderDecoderModel + Add cross-attention to some TF models (#13222)
* Add cross attentions to TFGPT2Model

* Add TFEncoderDecoderModel

* Add TFBaseModelOutputWithPoolingAndCrossAttentions

* Add cross attentions to TFBertModel

* Fix past or past_key_values argument issue

* Fix generation

* Fix save and load

* Add some checks and comments

* Clean the code that deals with past keys/values

* Add kwargs to processing_inputs

* Add serving_output to TFEncoderDecoderModel

* Some cleaning + fix use_cache value issue

* Fix tests + add bert2bert/bert2gpt2 tests

* Fix more tests

* Ignore crossattention.bias when loading GPT2 weights into TFGPT2

* Fix return_dict_in_generate in tf generation

* Fix is_token_logit_eos_token bug in tf generation

* Finalize the tests after fixing some bugs

* Fix another is_token_logit_eos_token bug in tf generation

* Add/Update docs

* Add TFBertEncoderDecoderModelTest

* Clean test script

* Add TFEncoderDecoderModel to the library

* Add cross attentions to TFRobertaModel

* Add TFRobertaEncoderDecoderModelTest

* make style

* Change the way of position_ids computation

* bug fix

* Fix copies in tf_albert

* Remove some copied from and apply some fix-copies

* Remove some copied

* Add cross attentions to some other TF models

* Remove encoder_hidden_states from TFLayoutLMModel.call for now

* Make style

* Fix TFRemBertForCausalLM

* Revert the change to longformer + Remove copies

* Revert the change to albert and convbert + Remove copies

* make quality

* make style

* Add TFRembertEncoderDecoderModelTest

* make quality and fix-copies

* test TFRobertaForCausalLM

* Fixes for failed tests

* Fixes for failed tests

* fix more tests

* Fixes for failed tests

* Fix Auto mapping order

* Fix TFRemBertEncoder return value

* fix tf_rembert

* Check copies are OK

* Fix TFBaseModelOutputWithPastAndCrossAttentions not being defined

* Add TFEncoderDecoderModelSaveLoadTests

* fix tf weight loading

* check the change of use_cache

* Revert the change

* Add missing test_for_causal_lm for TFRobertaModelTest

* Try cleaning past

* fix _reorder_cache

* Revert some files to original versions

* Keep as many copies as possible

* Apply suggested changes - Use raise ValueError instead of assert

* Move import to top

* Fix wrong require_torch

* Replace more assert by raise ValueError

* Add test_pt_tf_model_equivalence (the test won't pass for now)

* add test for loading/saving

* finish

* finish

* Remove test_pt_tf_model_equivalence

* Update tf modeling template

* Remove pooling, added in the prev. commit, from MainLayer

* Update tf modeling test template

* Move inputs["use_cache"] = False to modeling_tf_utils.py

* Fix torch.Tensor in the comment

* fix use_cache

* Fix missing use_cache in ElectraConfig

* Add a note to from_pretrained

* Fix style

* Change test_encoder_decoder_save_load_from_encoder_decoder_from_pt

* Fix TFMLP (in TFGPT2) activation issue

* Fix None past_key_values value in serving_output

* Don't call get_encoderdecoder_model in TFEncoderDecoderModelTest.test_configuration_tie until we have a TF checkpoint on Hub

* Apply review suggestions - style for cross_attns in serving_output

* Apply review suggestions - change assert + docstrings

* break the error message to respect the char limit

* deprecate the argument past

* fix docstring style

* Update the encoder-decoder rst file

* fix Unknown interpreted text role "method"

* fix typo

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-10-13 00:10:34 +02:00
albert.rst albert flax (#13294) 2021-08-30 17:29:27 +02:00
auto.rst Image Segmentation pipeline (#13828) 2021-10-08 09:59:53 +02:00
bart.rst FlaxBart (#11537) 2021-06-14 15:16:08 +05:30
barthez.rst Examples reorg (#11350) 2021-04-21 11:11:20 -04:00
beit.rst beit-flax (#13515) 2021-09-21 13:34:19 +02:00
bert_japanese.rst Honor contributors to models (#11329) 2021-04-21 09:47:27 -04:00
bert.rst [Flax] Correct flax docs (#12782) 2021-08-04 16:31:23 +02:00
bertgeneration.rst Honor contributors to models (#11329) 2021-04-21 09:47:27 -04:00
bertweet.rst Honor contributors to models (#11329) 2021-04-21 09:47:27 -04:00
bigbird_pegasus.rst Add BigBirdPegasus (#10991) 2021-05-07 09:27:43 +02:00
bigbird.rst Flax Big Bird (#11967) 2021-06-14 20:01:03 +01:00
blenderbot_small.rst Add BlenderBot small tokenizer to the init (#13367) 2021-09-22 19:00:47 -04:00
blenderbot.rst Honor contributors to models (#11329) 2021-04-21 09:47:27 -04:00
bort.rst Honor contributors to models (#11329) 2021-04-21 09:47:27 -04:00
byt5.rst Improve T5 docs (#13240) 2021-09-01 15:05:40 +02:00
camembert.rst Honor contributors to models (#11329) 2021-04-21 09:47:27 -04:00
canine.rst Wrong model is used in example, should be character instead of subword model (#12676) 2021-07-13 08:40:27 -04:00
clip.rst add and fix examples (#12810) 2021-07-20 09:28:50 -04:00
convbert.rst Honor contributors to models (#11329) 2021-04-21 09:47:27 -04:00
cpm.rst Honor contributors to models (#11329) 2021-04-21 09:47:27 -04:00
ctrl.rst Honor contributors to models (#11329) 2021-04-21 09:47:27 -04:00
deberta_v2.rst Deberta_v2 tf (#13120) 2021-08-31 06:32:47 -04:00
deberta.rst Deberta tf (#12972) 2021-08-12 05:01:26 -04:00
deit.rst Honor contributors to models (#11329) 2021-04-21 09:47:27 -04:00
detr.rst Improve detr (#12147) 2021-06-17 10:37:54 -04:00
dialogpt.rst ADD BORT (#9813) 2021-01-27 21:25:11 +03:00
distilbert.rst Fix typo distilbert doc (#13643) 2021-09-20 15:10:33 -04:00
dpr.rst [DPR] Correct init (#13796) 2021-09-30 18:55:20 +02:00
electra.rst [Flax] Add Electra models (#11426) 2021-05-04 20:56:09 +02:00
encoderdecoder.rst Add TFEncoderDecoderModel + Add cross-attention to some TF models (#13222) 2021-10-13 00:10:34 +02:00
flaubert.rst Honor contributors to models (#11329) 2021-04-21 09:47:27 -04:00
fnet.rst Add FNet (#13045) 2021-09-20 13:24:30 +02:00
fsmt.rst Honor contributors to models (#11329) 2021-04-21 09:47:27 -04:00
funnel.rst Honor contributors to models (#11329) 2021-04-21 09:47:27 -04:00
gpt_neo.rst FlaxGPTNeo (#12493) 2021-07-06 18:55:18 +05:30
gpt.rst Honor contributors to models (#11329) 2021-04-21 09:47:27 -04:00
gpt2.rst Add Mistral GPT-2 Stability Tweaks (#13573) 2021-10-04 07:37:09 -04:00
gptj.rst [docs/gpt-j] fix typo (#13851) 2021-10-04 12:30:50 +02:00
herbert.rst Fixed typo: herBERT -> HerBERT (#13936) 2021-10-08 10:27:32 -04:00
hubert.rst Add Wav2Vec2 & Hubert ForSequenceClassification (#13153) 2021-08-27 20:52:51 +03:00
ibert.rst Honor contributors to models (#11329) 2021-04-21 09:47:27 -04:00
layoutlm.rst Honor contributors to models (#11329) 2021-04-21 09:47:27 -04:00
layoutlmv2.rst Add LayoutLMv2 + LayoutXLM (#12604) 2021-08-30 12:35:42 +02:00
layoutxlm.rst Add tokenizer docs (#13373) 2021-09-02 09:46:05 +02:00
led.rst Make gradient_checkpointing a training argument (#13657) 2021-09-22 07:51:38 -04:00
longformer.rst Honor contributors to models (#11329) 2021-04-21 09:47:27 -04:00
luke.rst Add LUKE (#11223) 2021-05-03 09:07:29 -04:00
lxmert.rst Honor contributors to models (#11329) 2021-04-21 09:47:27 -04:00
m2m_100.rst replace tgt_lang by tgt_text (#13061) 2021-08-09 22:47:05 +05:30
marian.rst Rely on huggingface_hub for common tools (#13100) 2021-08-12 14:59:02 +02:00
mbart.rst fix example (#13387) 2021-09-02 11:32:18 +02:00
megatron_bert.rst Honor contributors to models (#11329) 2021-04-21 09:47:27 -04:00
megatron_gpt2.rst Honor contributors to models (#11329) 2021-04-21 09:47:27 -04:00
mobilebert.rst Honor contributors to models (#11329) 2021-04-21 09:47:27 -04:00
mpnet.rst MPNet copyright files (#9015) 2020-12-10 09:29:38 -05:00
mt5.rst Fix mT5 documentation (#13639) 2021-09-20 07:53:31 -04:00
pegasus.rst [Flax] Addition of FlaxPegasus (#13420) 2021-09-14 17:15:19 +02:00
phobert.rst Honor contributors to models (#11329) 2021-04-21 09:47:27 -04:00
prophetnet.rst Copyright (#8970) 2020-12-07 18:36:34 -05:00
rag.rst Honor contributors to models (#11329) 2021-04-21 09:47:27 -04:00
reformer.rst Honor contributors to models (#11329) 2021-04-21 09:47:27 -04:00
rembert.rst Add RemBERT model code to huggingface (#10692) 2021-07-24 11:31:42 -04:00
retribert.rst Examples reorg (#11350) 2021-04-21 11:11:20 -04:00
roberta.rst Add TFEncoderDecoderModel + Add cross-attention to some TF models (#13222) 2021-10-13 00:10:34 +02:00
roformer.rst [RoFormer] Fix some issues (#12397) 2021-07-06 03:31:57 -04:00
speech_to_text_2.rst Add SpeechEncoderDecoder & Speech2Text2 (#13186) 2021-09-01 13:33:31 +02:00
speech_to_text.rst fix: typo spelling grammar (#13212) 2021-08-30 08:09:14 -04:00
speechencoderdecoder.rst Add SpeechEncoderDecoder & Speech2Text2 (#13186) 2021-09-01 13:33:31 +02:00
splinter.rst Add splinter (#12955) 2021-08-17 08:29:01 -04:00
squeezebert.rst Honor contributors to models (#11329) 2021-04-21 09:47:27 -04:00
t5.rst Improve T5 docs (#13240) 2021-09-01 15:05:40 +02:00
t5v1.1.rst Improve T5 docs (#13240) 2021-09-01 15:05:40 +02:00
tapas.rst Honor contributors to models (#11329) 2021-04-21 09:47:27 -04:00
transformerxl.rst Honor contributors to models (#11329) 2021-04-21 09:47:27 -04:00
visual_bert.rst Fix VisualBERT docs (#13106) 2021-08-13 11:44:04 +05:30
vit.rst Add DINO conversion script (#13265) 2021-08-26 17:25:20 +02:00
wav2vec2.rst Add Wav2Vec2 & Hubert ForSequenceClassification (#13153) 2021-08-27 20:52:51 +03:00
xlm.rst Honor contributors to models (#11329) 2021-04-21 09:47:27 -04:00
xlmprophetnet.rst Copyright (#8970) 2020-12-07 18:36:34 -05:00
xlmroberta.rst Honor contributors to models (#11329) 2021-04-21 09:47:27 -04:00
xlnet.rst Examples reorg (#11350) 2021-04-21 11:11:20 -04:00
xlsr_wav2vec2.rst [XLSR-Wav2Vec2] Add multi-lingual Wav2Vec2 models (#10648) 2021-03-11 17:44:18 +03:00