transformers/docs/source
fxmarty 1da1302ec8
Flash Attention 2 support for RoCm (#27611)
* support FA2

* fix typo

* fix broken tests

* fix more test errors

* left/right

* fix bug

* more test

* typo

* fix layout flash attention falcon

* do not support this case

* use allclose instead of equal

* fix various bugs with flash attention

* bump

* fix test

* fix mistral

* use skipTest instead of return, which may be misleading

* add fix causal arg flash attention

* fix copies

* more explicit comment

* still use self.is_causal

* fix causal argument

* comment

* fixes

* update documentation

* add link

* wrong test

* simplify FA2 RoCm requirements

* update opt

* make the flash_attn_uses_top_left_mask attribute private and make the comment more precise

* better error handling

* fix copy & mistral

* Update src/transformers/modeling_utils.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/modeling_utils.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/modeling_utils.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/utils/import_utils.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* use is_flash_attn_greater_or_equal_2_10 instead of is_flash_attn_greater_or_equal_210

* fix merge

* simplify

* inline args

---------

Co-authored-by: Felix Marty <felix@hf.co>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-12-04 21:52:17 +09:00
de          2023-11-27 16:26:33 +00:00  docs: replace torch.distributed.run by torchrun (#27528)
en          2023-12-04 21:52:17 +09:00  Flash Attention 2 support for RoCm (#27611)
es          2023-11-27 16:26:33 +00:00  docs: replace torch.distributed.run by torchrun (#27528)
fr          2023-12-01 14:00:07 +01:00  [i18n-fr] Translate installation to French (#27657)
hi          2023-10-25 11:21:49 -07:00  Hindi translation of pipeline_tutorial.md (#26837)
it          2023-11-27 16:26:33 +00:00  docs: replace torch.distributed.run by torchrun (#27528)
ja          2023-11-28 08:40:44 -08:00  Docs: Fix broken cross-references, i.e. ~transformer. -> ~transformers. (#27740)
ko          2023-11-27 16:26:33 +00:00  docs: replace torch.distributed.run by torchrun (#27528)
ms          2023-11-21 16:41:55 +00:00  TVP model (#25856)
pt          2023-11-27 16:26:33 +00:00  docs: replace torch.distributed.run by torchrun (#27528)
te          2023-10-20 15:27:55 -07:00  Added Telugu [te] translations (#26828)
tr          2023-11-08 08:35:20 -05:00  Translate index.md to Turkish (#27093)
zh          2023-11-27 12:36:37 -08:00  translation main-class files to chinese (#27588)
_config.py  2023-11-16 17:43:19 +01:00  [Styling] stylify using ruff (#27144)