transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-06 14:20:04 +06:00

Eduardo Pacheco 22d159ddf9 Adding Flash Attention 2 Support for GPT2 (#29226)		2024-03-28 09:31:24 +00:00

* First commit to add flash attention 2 for GPT-2
* more improvements
* Make GPT2 pass tests and fixed Decision Transformers copies
* Fixed missing arg
* fix copies
* Added expected speedup
* Update src/transformers/models/gpt2/modeling_gpt2.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/gpt2/modeling_gpt2.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/gpt2/modeling_gpt2.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Added test
* Fixed attn attribute
* Update docs/source/en/model_doc/gpt2.md
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update docs/source/en/model_doc/gpt2.md
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update Decision transformer attentions
* More updates
* Passing tests
* Fix copies
* Fix copies part 2
* Decision transformer updates
* Update src/transformers/models/gpt2/modeling_gpt2.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Fix copies
* Decision transformer not supporting flash attn
* Addressed comments
* Addressed comments
* Addressed comments

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
..
de	Make torch xla available on GPU (#29334)	2024-03-11 14:07:16 +00:00
en	Adding Flash Attention 2 Support for GPT2 (#29226)	2024-03-28 09:31:24 +00:00
es	[docs] Spanish translation of attention.md (#29681)	2024-03-15 11:55:35 -07:00
fr	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
hi	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
it	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
ja	[docs] Remove broken ChatML format link from chat_templating.md (#29643)	2024-03-13 13:04:51 -07:00
ko	Make torch xla available on GPU (#29334)	2024-03-11 14:07:16 +00:00
ms	[Docs] Add missing language options and fix broken links (#28852)	2024-02-06 12:01:01 -08:00
pt	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
te	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
tr	Translate index.md to Turkish (#27093)	2023-11-08 08:35:20 -05:00
zh	[docs] Remove broken ChatML format link from chat_templating.md (#29643)	2024-03-13 13:04:51 -07:00
_config.py	[`Styling`] stylify using ruff (#27144)	2023-11-16 17:43:19 +01:00