transformers/docs/source
Yih-Dar 8b240a0661
Add TFEncoderDecoderModel + Add cross-attention to some TF models (#13222)
* Add cross attentions to TFGPT2Model

* Add TFEncoderDecoderModel

* Add TFBaseModelOutputWithPoolingAndCrossAttentions

* Add cross attentions to TFBertModel

* Fix past or past_key_values argument issue

* Fix generation

* Fix save and load

* Add some checks and comments

* Clean the code that deals with past keys/values

* Add kwargs to processing_inputs

* Add serving_output to TFEncoderDecoderModel

* Some cleaning + fix use_cache value issue

* Fix tests + add bert2bert/bert2gpt2 tests

* Fix more tests

* Ignore crossattention.bias when loading GPT2 weights into TFGPT2

* Fix return_dict_in_generate in tf generation

* Fix is_token_logit_eos_token bug in tf generation

* Finalize the tests after fixing some bugs

* Fix another is_token_logit_eos_token bug in tf generation

* Add/Update docs

* Add TFBertEncoderDecoderModelTest

* Clean test script

* Add TFEncoderDecoderModel to the library

* Add cross attentions to TFRobertaModel

* Add TFRobertaEncoderDecoderModelTest

* make style

* Change the way of position_ids computation

* bug fix

* Fix copies in tf_albert

* Remove some copied from and apply some fix-copies

* Remove some copied

* Add cross attentions to some other TF models

* Remove encoder_hidden_states from TFLayoutLMModel.call for now

* Make style

* Fix TFRemBertForCausalLM

* Revert the change to longformer + Remove copies

* Revert the change to albert and convbert + Remove copies

* make quality

* make style

* Add TFRembertEncoderDecoderModelTest

* make quality and fix-copies

* test TFRobertaForCausalLM

* Fixes for failed tests

* Fixes for failed tests

* fix more tests

* Fixes for failed tests

* Fix Auto mapping order

* Fix TFRemBertEncoder return value

* fix tf_rembert

* Check copies are OK

* Fix missing TFBaseModelOutputWithPastAndCrossAttentions is not defined

* Add TFEncoderDecoderModelSaveLoadTests

* fix tf weight loading

* check the change of use_cache

* Revert the change

* Add missing test_for_causal_lm for TFRobertaModelTest

* Try cleaning past

* fix _reorder_cache

* Revert some files to original versions

* Keep as many copies as possible

* Apply suggested changes - Use raise ValueError instead of assert

* Move import to top

* Fix wrong require_torch

* Replace more assert by raise ValueError

* Add test_pt_tf_model_equivalence (the test won't pass for now)

* add test for loading/saving

* finish

* finish

* Remove test_pt_tf_model_equivalence

* Update tf modeling template

* Remove pooling, added in the prev. commit, from MainLayer

* Update tf modeling test template

* Move inputs["use_cache"] = False to modeling_tf_utils.py

* Fix torch.Tensor in the comment

* fix use_cache

* Fix missing use_cache in ElectraConfig

* Add a note to from_pretrained

* Fix style

* Change test_encoder_decoder_save_load_from_encoder_decoder_from_pt

* Fix TFMLP (in TFGPT2) activation issue

* Fix None past_key_values value in serving_output

* Don't call get_encoderdecoder_model in TFEncoderDecoderModelTest.test_configuration_tie until we have a TF checkpoint on Hub

* Apply review suggestions - style for cross_attns in serving_output

* Apply review suggestions - change assert + docstrings

* break the error message to respect the char limit

* deprecate the argument past

* fix docstring style

* Update the encoder-decoder rst file

* fix Unknown interpreted text role "method"

* fix typo

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-10-13 00:10:34 +02:00
_static Deploy docs for v4.11.3 2021-10-06 12:58:47 -04:00
imgs [doc] DP/PP/TP/etc parallelism (#12524) 2021-07-09 17:39:09 -07:00
internal Fix doc building error 2021-08-12 05:49:02 -04:00
main_classes Add TFEncoderDecoderModel + Add cross-attention to some TF models (#13222) 2021-10-13 00:10:34 +02:00
model_doc Add TFEncoderDecoderModel + Add cross-attention to some TF models (#13222) 2021-10-13 00:10:34 +02:00
add_new_model.rst consistent nn. and nn.functional: part 5 docs (#12161) 2021-06-14 13:34:32 -07:00
add_new_pipeline.rst [Large PR] Entire rework of pipelines. (#13308) 2021-09-10 14:47:48 +02:00
benchmarks.rst [Docs] fixed broken link (#12205) 2021-06-16 15:14:53 -04:00
bertology.rst Fix documentation links always pointing to master. (#9217) 2021-01-05 06:18:48 -05:00
community.md docs: add HuggingArtists to community notebooks (#13050) 2021-08-10 09:36:44 +02:00
conf.py Docs for version v4.11.0 2021-09-27 14:19:38 -04:00
contributing.md Update installation page and add contributing to the doc (#5084) 2020-06-17 14:01:10 -04:00
converting_tensorflow_models.rst Examples reorg (#11350) 2021-04-21 11:11:20 -04:00
custom_datasets.rst Rename NLP library to Datasets library (#10920) 2021-03-26 08:07:59 -04:00
debugging.rst [debug] DebugUnderflowOverflow doesn't work with DP (#12816) 2021-07-21 09:36:02 -07:00
examples.md per_device instead of per_gpu/error thrown when argument unknown (#4618) 2020-05-27 11:36:55 -04:00
fast_tokenizers.rst Documentation about loading a fast tokenizer within Transformers (#11029) 2021-04-05 10:51:16 -04:00
favicon.ico Adding usage examples for common tasks (#2850) 2020-02-25 13:48:24 -05:00
glossary.rst Add video links to the documentation (#12162) 2021-06-15 06:37:37 -04:00
index.rst Add TFEncoderDecoderModel + Add cross-attention to some TF models (#13222) 2021-10-13 00:10:34 +02:00
installation.md Add mention of the huggingface_hub methods for offline mode (#12320) 2021-06-23 09:45:30 -04:00
migration.md consistent nn. and nn.functional: part 5 docs (#12161) 2021-06-14 13:34:32 -07:00
model_sharing.rst separate model card git push from the rest (#13514) 2021-09-14 18:07:36 +02:00
model_summary.rst Add video links to the documentation (#12162) 2021-06-15 06:37:37 -04:00
multilingual.rst Examples reorg (#11350) 2021-04-21 11:11:20 -04:00
notebooks.md Update notebooks (#3620) 2020-04-06 14:32:39 -04:00
parallelism.md Update parallelism.md (#13892) 2021-10-05 17:42:12 -07:00
performance.md Make gradient_checkpointing a training argument (#13657) 2021-09-22 07:51:38 -04:00
perplexity.rst Small changes in perplexity.rst to make the notebook executable on google collaboratory (#13541) 2021-09-13 13:32:32 +02:00
philosophy.rst Minor documentation revisions from copyediting (#9266) 2020-12-23 10:15:49 -05:00
preprocessing.rst doc mismatch fixed (#13345) 2021-08-31 06:28:37 -04:00
pretrained_models.rst Fix broken link to distill models in docs (#13848) 2021-10-04 11:57:54 -04:00
quicktour.rst [Large PR] Entire rework of pipelines. (#13308) 2021-09-10 14:47:48 +02:00
sagemaker.md remove documentation (#12657) 2021-07-12 18:02:51 +02:00
serialization.rst Autodocument the list of ONNX-supported models (#13884) 2021-10-05 22:43:16 -04:00
task_summary.rst Doctests job (#13088) 2021-08-12 03:42:25 -04:00
testing.rst [testing] auto-replay captured streams (#13803) 2021-09-30 09:26:49 -07:00
tokenizer_summary.rst Add video links to the documentation (#12162) 2021-06-15 06:37:37 -04:00
training.rst Change https:/ to https:// (#13644) 2021-09-20 12:31:46 -04:00
troubleshooting.md [troubleshooting] add 2 points of reference to the offline mode (#11236) 2021-04-14 08:39:23 -07:00