transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-04 13:20:12 +06:00

Anton Vlasjuk d95c864a25 🔴🔴🔴 [`Attention`] Refactor Attention Interface for Bart-based Models (#38108)	2025-05-22 17:12:58 +02:00

* starting attn refactor for encoder decoder models via bart (eager + sdpa)
* flash attention works, remove unnecessary code
* flex attention support for bart!, gotta check if the renaming is not too aggressive
* some comments
* skip flex grad test for standalone as done with the other test
* revert flex attn rename (for now), sdpa simplify, and todos
* more todos
* refactor mask creation for reuse
* modular attempt at biogpt
* first batch of other models
* fix attn dropout
* fix autoformer copies
* hubert
* another batch of models
* copies/style + last round of bart models --> whisper next?
* remove unnecessary _reshape function and remove copy to whisper
* add skip for decoder-only models out of enc-dec (same as in bart)
* bring back licences
* remove comment, added to pr read instead
* mostly docs
* disable sew flex attn as it's unclear attn mask for now
* oops
* test fixes for enc-dec
* torch fx fixes + try at flex attn
* skip on mbart
* some more fixes
* musicgen skip / delete old attn class logic + sdpa compose compile skip
* disable flex attn for musicgen, not worth the effort
* more fixes and style
* flex attention test for dropout and encoder decoder that dont have main input names
* informer fixes
* the weirdest thing I've encountered yet...
* style
* remove empty tensor attempt, found core root in previous commits
* disable time series due to tests being very text centric on inputs
* add speech to text to be ignoring the other attns, also due to tests
* update docs
* remaining issues resolved ?
* update docs for current state --> nllb moe and pegasus x sdpa is questionable :D
* some models have not set the is_causal flag...
* change dtype in softmax tol old behaviour + some modular fixes
* I hate it but it is what it is
* fixes from main for bart
* forgot this one
* some model fixes
* style
* current status
* marian works now
* fixing some copies
* some copy fixes + time series x informer
* last models possibly and fixes on style/copies
* some post merge fixes
* more fixes
* make attention interface callable and move warnings there
* style lol
* add comment to "unsupported"
* remove callable interface and change interface warnings + some copies
* fix
* ternary is ugly af, make it simpler
* how did that happen
* fix flex attn test
* failing the test
* no more fallback! fixing copies next
* style + attn fixed
* fixing copies and mask creation
* wrong copy
* fixup tests and disable flex attn for now
* fixup last tests?

ar	Fixed broken links (#37466)	2025-04-14 14:16:07 +01:00
de	Transformers cli clean command (#37657)	2025-04-30 12:15:43 +01:00
en	🔴🔴🔴 [`Attention`] Refactor Attention Interface for Bart-based Models (#38108)	2025-05-22 17:12:58 +02:00
es	Transformers cli clean command (#37657)	2025-04-30 12:15:43 +01:00
fr	[agents] remove agents 🧹 (#37368)	2025-04-11 18:42:37 +01:00
hi	[i18n-HI] Translated TFLite page to Hindi (#34572)	2024-11-04 09:40:30 -08:00
it	Fix typos (#37978)	2025-05-06 14:45:20 +01:00
ja	Remove Japanese sequence_classification doc and update references (#38246)	2025-05-21 08:33:41 -07:00
ko	Fix typo (#37964)	2025-05-06 14:59:00 +01:00
ms	[agents] remove agents 🧹 (#37368)	2025-04-11 18:42:37 +01:00
pt	Transformers cli clean command (#37657)	2025-04-30 12:15:43 +01:00
te	Fix typos in translated quicktour docs (#35302)	2024-12-17 09:32:00 -08:00
tr	Translate index.md to Turkish (#27093)	2023-11-08 08:35:20 -05:00
zh	Translating model_doc/bert.md to Chinese (#37806)	2025-05-19 10:14:57 -07:00
_config.py	[#29174] ImportError Fix: Trainer with PyTorch requires accelerate>=0.20.1 Fix (#29888)	2024-04-08 14:21:16 +01:00