transformers/tests/models/gpt_neox
Anton Vlasjuk 46df859975
[GPTNeoX] Flex Attention + Refactor (#34896)
* gpt neox flex attention + refactor

* some formatting

* small fix on dropout

* add assertion on flex attn test

* flaky ci :(

* add head mask support

* style

* handle dtype, replace torch where

* fixup flex with output attns

* code review and several other fixes

* Update src/transformers/modeling_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* style

* remove unnecessary comment

* remove incorrect comment

* make flex attn check more agnostic to versions and centralized

* change peft input dtype check to use the value tensor, since q and k could be affected by other things like RoPE

* i forgot

* flaky

* code review and small fixes

* Update src/transformers/models/gpt_neox/modeling_gpt_neox.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-12-04 14:48:28 +01:00
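
The commit title refers to PyTorch's FlexAttention API (torch.nn.attention.flex_attention). As a minimal illustration of the kind of call this feature is built on, assuming torch >= 2.5 on a CUDA device (a sketch only, not the PR's actual GPTNeoX code):

import torch
from torch.nn.attention.flex_attention import flex_attention, create_block_mask

def causal(b, h, q_idx, kv_idx):
    # Mask modifier: keep only key positions at or before the query position.
    return q_idx >= kv_idx

B, H, S, D = 1, 8, 128, 64  # illustrative shapes: batch, heads, seq len, head dim
q = torch.randn(B, H, S, D, device="cuda", dtype=torch.float16)
k = torch.randn(B, H, S, D, device="cuda", dtype=torch.float16)
v = torch.randn(B, H, S, D, device="cuda", dtype=torch.float16)

# Build the sparse causal block mask once; B=None/H=None broadcast it over batch and heads.
block_mask = create_block_mask(causal, B=None, H=None, Q_LEN=S, KV_LEN=S, device="cuda")
out = flex_attention(q, k, v, block_mask=block_mask)
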
__init__.py [WIP] Adding GPT-NeoX-20B (#16659) 2022-05-24 09:31:10 -04:00
test_modeling_gpt_neox.py [GPTNeoX] Flex Attention + Refactor (#34896) 2024-12-04 14:48:28 +01:00