Mirror of https://github.com/huggingface/transformers.git (synced 2025-07-18 03:58:25 +06:00)
* add past_key_values
* add use_cache option
* make mask before cutting ids
* adjust position_ids according to past_key_values
* flatten past_key_values
* fix positional embeds
* fix _reorder_cache
* set use_cache to false when not decoder, fix attention mask init
* add test for caching
* add past_key_values for Roberta
* fix position embeds
* add caching test for roberta
* add doc
* make style
* doc, fix attention mask, test
* small fixes
* address Patrick's comments
* input_ids shouldn't start with pad token
* use_cache only when decoder
* make consistent with bert
* make copies consistent
* add use_cache to encoder
* add past_key_values to tapas attention
* apply suggestions from code review
* make copies consistent
* add attn mask in tests
* remove copied from longformer
* apply suggestions from code review
* fix bart test
* nit
* simplify model outputs
* fix doc
* fix output ordering
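Below is a minimal sketch of the caching API this change set adds: an incremental decoding loop that reuses `past_key_values` so the already-processed prefix is not re-encoded on every step. The `bert-base-uncased` checkpoint, the input sentence, and the greedy next-token pick are illustrative assumptions, not part of the commit.

```python
# Illustrative sketch (not from the commit): reuse past_key_values across
# decoding steps with a BERT checkpoint configured as a decoder.
import torch
from transformers import BertConfig, BertLMHeadModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")  # assumed checkpoint
config = BertConfig.from_pretrained("bert-base-uncased", is_decoder=True)
model = BertLMHeadModel.from_pretrained("bert-base-uncased", config=config)
model.eval()

inputs = tokenizer("Caching keys and values", return_tensors="pt")

with torch.no_grad():
    # First pass over the full prefix; use_cache=True returns the per-layer
    # key/value tensors alongside the logits.
    out = model(**inputs, use_cache=True)
    past = out.past_key_values

    # Later passes feed only the newest token; the cached states stand in
    # for the prefix, so per-step attention cost stays constant.
    next_token = out.logits[:, -1:].argmax(dim=-1)  # greedy pick, for illustration
    out = model(input_ids=next_token, past_key_values=past, use_cache=True)
    past = out.past_key_values  # updated cache now covers prefix + new token
```

When a cache is passed, the model offsets `position_ids` by the cached length internally, which is what the "adjust position_ids according to past_key_values" item above refers to.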
_static/
imgs/
internal/
main_classes/
model_doc/
benchmarks.rst
bertology.rst
conf.py
contributing.md
converting_tensorflow_models.rst
custom_datasets.rst
examples.md
favicon.ico
glossary.rst
index.rst
installation.md
migration.md
model_sharing.rst
model_summary.rst
multilingual.rst
notebooks.md
perplexity.rst
philosophy.rst
preprocessing.rst
pretrained_models.rst
quicktour.rst
serialization.rst
task_summary.rst
testing.rst
tokenizer_summary.rst
training.rst