transformers/docs/source/en
Younes Belkada 368a58e61c
[core ] Integrate Flash attention 2 in most used models (#25598)
* v1

* oops

* working v1

* fixup

* add some TODOs

* fixup

* padding support + try with module replacement

* nit

* alternative design

* oops

* add `use_cache` support for llama

* v1 falcon

* nit

* a bit of refactor

* nit

* nits nits

* add v1 padding support falcon (even though it seemed to work before)

* nit

* falcon works

* fixup

* v1 tests

* nit

* fix generation llama flash

* update tests

* fix tests + nits

* fix copies

* fix nit

* test- padding mask

* stype

* add more mem efficient support

* Update src/transformers/modeling_utils.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fixup

* nit

* fixup

* remove it from config when saving

* fixup

* revert docstring

* add more checks

* use values

* oops

* new version

* fixup

* add same trick for falcon

* nit

* add another test

* change tests

* fix issues with GC and also falcon

* fixup

* oops

* Update src/transformers/models/falcon/modeling_falcon.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* add init_rope

* updates

* fix copies

* fixup

* fixup

* more clarification

* fixup

* right padding tests

* add docs

* add FA in docker image

* more clarifications

* add some figures

* add todo

* rectify comment

* Change to FA2

* Update docs/source/en/perf_infer_gpu_one.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* split in two lines

* change test name

* add more tests

* some clean up

* remove `rearrange` deps

* add more docs

* revert changes on dockerfile

* Revert "revert changes on dockerfile"

This reverts commit 8d72a66b4b.

* revert changes on dockerfile

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <hi@lysand.re>

* address some comments

* docs

* use inheritance

* Update src/transformers/testing_utils.py

Co-authored-by: Lysandre Debut <hi@lysand.re>

* fixup

* Apply suggestions from code review

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/modeling_utils.py

* final comments

* clean up

* style

* add cast + warning for PEFT models

* fixup

---------

Co-authored-by: Felix Marty <9808326+fxmarty@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Lysandre Debut <hi@lysand.re>
2023-09-22 17:42:10 +02:00
..
internal Generate: add missing logits processors docs (#25653) 2023-08-25 11:56:17 +01:00
main_classes Tweaks to Chat Templates docs (#26168) 2023-09-15 12:50:57 +01:00
model_doc Add ViTMatte (#25843) 2023-09-19 10:56:10 -03:00
tasks [doc] fixed indices in obj detection example (#26343) 2023-09-22 10:29:27 -04:00
_config.py Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
_toctree.yml Add ViTMatte (#25843) 2023-09-19 10:56:10 -03:00
accelerate.md Fix typos (#25936) 2023-09-04 11:15:12 +01:00
add_new_model.md Fix typos (#25936) 2023-09-04 11:15:12 +01:00
add_new_pipeline.md Update add_new_pipeline.md (#26197) 2023-09-19 00:41:16 +02:00
add_tensorflow_model.md Remove utils/documentation_tests.txt (#26213) 2023-09-18 13:33:01 +02:00
attention.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
autoclass_tutorial.md Update autoclass_tutorial.md (#25929) 2023-09-04 11:16:49 +01:00
benchmarks.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
bertology.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
big_models.md Fix typos (#25936) 2023-09-04 11:15:12 +01:00
chat_templating.md Tweaks to Chat Templates docs (#26168) 2023-09-15 12:50:57 +01:00
community.md Update community.md (#25928) 2023-09-04 11:16:34 +01:00
contributing.md Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
create_a_model.md Fix typos (#25936) 2023-09-04 11:15:12 +01:00
custom_models.md Fix typos (#25936) 2023-09-04 11:15:12 +01:00
custom_tools.md [doc] Always call it Agents for consistency (#25958) 2023-09-05 12:27:20 +01:00
debugging.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
fast_tokenizers.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
generation_strategies.md Generate: legacy mode is only triggered when generation_config is untouched (#25962) 2023-09-12 12:08:17 +01:00
glossary.md Fix typos (#25936) 2023-09-04 11:15:12 +01:00
hpo_train.md enable optuna multi-objectives feature (#25969) 2023-09-12 18:01:22 +01:00
index.md Add ViTMatte (#25843) 2023-09-19 10:56:10 -03:00
installation.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
llm_tutorial.md Fix typos (#25936) 2023-09-04 11:15:12 +01:00
model_memory_anatomy.md Fix typos (#25936) 2023-09-04 11:15:12 +01:00
model_sharing.md Fix typos (#25936) 2023-09-04 11:15:12 +01:00
model_summary.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
multilingual.md Fix typo in example code (#25583) 2023-08-18 07:58:59 +02:00
notebooks.md Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
pad_truncation.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
peft.md [PEFT] Peft integration alternative design (#25077) 2023-08-18 19:08:03 +02:00
perf_hardware.md 🌐 [i18n-KO] Translated perf_hardware.md to Korean (#24966) 2023-07-25 07:44:24 -04:00
perf_infer_cpu.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
perf_infer_gpu_many.md [core ] Integrate Flash attention 2 in most used models (#25598) 2023-09-22 17:42:10 +02:00
perf_infer_gpu_one.md [core ] Integrate Flash attention 2 in most used models (#25598) 2023-09-22 17:42:10 +02:00
perf_infer_special.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
perf_torch_compile.md Fix rendering for torch.compile() docs (#25432) 2023-08-10 13:25:00 +02:00
perf_train_cpu_many.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
perf_train_cpu.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
perf_train_gpu_many.md deprecate sharded_ddp training argument (#24825) 2023-07-17 06:57:42 -04:00
perf_train_gpu_one.md [core ] Integrate Flash attention 2 in most used models (#25598) 2023-09-22 17:42:10 +02:00
perf_train_special.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
perf_train_tpu_tf.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
perf_train_tpu.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
performance.md [docs] Performance docs tidy up, part 1 (#23963) 2023-07-24 08:57:24 -04:00
perplexity.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
philosophy.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
pipeline_tutorial.md Support loading base64 images in pipelines (#25633) 2023-08-29 19:24:24 +01:00
pipeline_webserver.md Suggestions on Pipeline_webserver (#25570) 2023-08-18 10:17:44 +02:00
pr_checks.md Document check copies (#25291) 2023-08-04 14:56:29 +02:00
preprocessing.md Removal of deprecated vision methods and specify deprecation versions (#24570) 2023-06-29 15:09:51 +01:00
quicktour.md [TYPO] fix typo/format in quicktour.md (#25519) 2023-08-16 08:03:23 +02:00
run_scripts.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
sagemaker.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
serialization.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
task_summary.md Fix doctest (#25031) 2023-07-25 22:10:06 +02:00
tasks_explained.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
testing.md fix wrong path in some doc (#25658) 2023-08-23 08:34:30 +02:00
tf_xla.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
tflite.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
tokenizer_summary.md Fix typo: Roberta -> RoBERTa (#25302) 2023-08-03 14:17:30 -07:00
torchscript.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
training.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
transformers_agents.md [doc] Always call it Agents for consistency (#25958) 2023-09-05 12:27:20 +01:00
troubleshooting.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00