transformers/docs/source/ja
fxmarty 80377eb018
F.scaled_dot_product_attention support (#26572)
* add sdpa

* wip

* cleaning

* add ref

* yet more cleaning

* and more :)

* wip llama

* working llama

* add output_attentions=True support

* bigcode sdpa support

* fixes

* gpt-bigcode support, require torch>=2.1.1

* add falcon support

* fix conflicts falcon

* style

* fix attention_mask definition

* remove output_attentions from attnmaskconverter

* support whisper without removing any Copied from statement

* fix mbart default to eager renaming

* fix typo in falcon

* fix is_causal in SDPA

* check is_flash_attn_2_available in the models init as well in case the model is not initialized through from_pretrained

* add warnings when falling back on the manual implementation

* clarify doc

* wip replace _flash_attn_enabled by config.attn_implementation

* fix typo

* add tests

* style

* add a copy.deepcopy on the config in from_pretrained, as we do not want to modify it inplace

* obey config.attn_implementation if a config is passed in from_pretrained

* fix is_torch_sdpa_available when torch is not installed

* remove dead code

* Update src/transformers/modeling_attn_mask_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/modeling_attn_mask_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/modeling_attn_mask_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/modeling_attn_mask_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/modeling_attn_mask_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/bart/modeling_bart.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* remove duplicate pretraining_tp code

* add dropout in llama

* clarify comment on attn_mask

* add fmt: off for _unmask_unattended docstring

* clarify num_masks comment

* nuke pretraining_tp in LlamaSDPAAttention following Arthur's suggestion

* cleanup modeling_utils

* backward compatibility

* fix style as requested

* style

* improve documentation

* test pass

* style

* add _unmask_unattended tests

* skip meaningless tests for idefics

* hard_check SDPA requirements when specifically requested

* standardize the use of XXX_ATTENTION_CLASSES

* fix SDPA bug with mem-efficient backend on CUDA when using fp32

* fix test

* rely on SDPA is_causal parameter to handle the causal mask in some cases

* fix FALCON_ATTENTION_CLASSES

* remove _flash_attn_2_enabled occurrences

* fix test

* add OPT to the list of supported flash models

* improve test

* properly test on different SDPA backends, on different dtypes & properly handle separately the pad tokens in the test

* remove remaining _flash_attn_2_enabled occurrence

* Update src/transformers/modeling_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/modeling_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/modeling_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/modeling_attn_mask_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/en/perf_infer_gpu_one.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* remove use_attn_implementation

* fix docstring & slight bug

* make attn_implementation internal (_attn_implementation)

* typos

* fix tests

* deprecate use_flash_attention_2=True

* fix test

* add back llama that was removed by mistake

* fix tests

* remove _flash_attn_2_enabled occurrences bis

* add check & test that passed attn_implementation is valid

* fix falcon torchscript export

* fix device of mask in tests

* add tip about torch.jit.trace and move bt doc below sdpa

* fix parameterized.expand order

* move tests from test_modeling_attn_mask_utils to test_modeling_utils as a relevant test class is already there

* update sdpaattention class with the new cache

* Update src/transformers/configuration_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/bark/modeling_bark.py

* address review comments

* WIP torch.jit.trace fix. left: test both eager & sdpa

* add test for torch.jit.trace for both eager/sdpa

* fix falcon with torch==2.0 that needs to use sdpa

* fix doc

* hopefully last fix

* fix key_value_length that has no default now in mask converter

* is it flaky?

* fix speculative decoding bug

* tests do pass

* fix following #27907

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-12-09 05:38:14 +09:00
internal Translating en/internal folder docs to Japanese 🇯🇵 (#26747) 2023-10-17 15:01:21 -07:00
main_classes docs: replace torch.distributed.run by torchrun (#27528) 2023-11-27 16:26:33 +00:00
model_doc Translate model_doc files from clip to cpm to JP (#27774) 2023-12-07 11:12:24 -08:00
tasks Translate en/tasks folder docs to Japanese 🇯🇵 (#27098) 2023-12-04 14:10:54 -08:00
_toctree.yml Translate model_doc files from clip to cpm to JP (#27774) 2023-12-07 11:12:24 -08:00
accelerate.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
add_new_model.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
add_tensorflow_model.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
attention.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
autoclass_tutorial.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
benchmarks.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
bertology.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
big_models.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
chat_templating.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
community.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
create_a_model.md [docs] fixed links with 404 (#27327) 2023-11-06 19:45:03 +00:00
custom_models.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
custom_tools.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
fast_tokenizers.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
generation_strategies.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
glossary.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
hpo_train.md Remove-auth-token (#27060) 2023-11-13 14:20:54 +01:00
index.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
installation.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
llm_tutorial.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
model_memory_anatomy.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
model_sharing.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
model_summary.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
multilingual.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
pad_truncation.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
peft.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
perf_hardware.md docs: replace torch.distributed.run by torchrun (#27528) 2023-11-27 16:26:33 +00:00
perf_infer_cpu.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
perf_infer_gpu_many.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
perf_infer_gpu_one.md F.scaled_dot_product_attention support (#26572) 2023-12-09 05:38:14 +09:00
perf_infer_special.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
perf_torch_compile.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
perf_train_cpu_many.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
perf_train_cpu.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
perf_train_gpu_many.md docs: replace torch.distributed.run by torchrun (#27528) 2023-11-27 16:26:33 +00:00
perf_train_gpu_one.md [docs] fixed links with 404 (#27327) 2023-11-06 19:45:03 +00:00
perf_train_special.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
perf_train_tpu_tf.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
perf_train_tpu.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
performance.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
perplexity.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
philosophy.md [docs] fixed links with 404 (#27327) 2023-11-06 19:45:03 +00:00
pipeline_tutorial.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
pipeline_webserver.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
pr_checks.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
preprocessing.md Broken links fixed related to datasets docs (#27569) 2023-11-17 13:44:09 -08:00
quicktour.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
run_scripts.md docs: replace torch.distributed.run by torchrun (#27528) 2023-11-27 16:26:33 +00:00
serialization.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
task_summary.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
tasks_explained.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
testing.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
tf_xla.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
tflite.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
tokenizer_summary.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
torchscript.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
training.md Broken links fixed related to datasets docs (#27569) 2023-11-17 13:44:09 -08:00
transformers_agents.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00
troubleshooting.md add japanese documentation (#26138) 2023-10-11 10:26:37 -07:00