transformers/docs/source
Anton Vlasjuk b275a41005
[GPT2] Add SDPA support (#31172)
* `gpt2` sdpa support

* fix (at least) one test, style, repo consistency

* fix sdpa mask in forward --> fixes generation

* test

* test2

* test3

* test4

* simplify shapes for attn mask creation and small comments

* hub fail test

* benchmarks

* flash attn 2 mask should not be inverted on enc-dec setup

* fix comment

* apply some suggestion from code review

- only save _attn_implementation once
- remove unnecessary comment

* change elif logic

* [run-slow] gpt2

* modify `test_gpt2_sample_max_time` to follow previous assertion patterns
2024-06-19 09:40:57 +02:00
de Docs / Quantization: Replace all occurrences of load_in_8bit with bnb config (#31136) 2024-05-30 16:47:35 +02:00
en [GPT2] Add SDPA support (#31172) 2024-06-19 09:40:57 +02:00
es [docs] Spanish translation of tokenizer_summary.md (#31154) 2024-06-03 16:52:23 -07:00
fr Add missing French translation of tutoriel_pipeline.md (#31396) 2024-06-13 17:48:54 +02:00
hi More fixes for doctest (#30265) 2024-04-16 11:58:55 +02:00
it Docs / Quantization: Replace all occurrences of load_in_8bit with bnb config (#31136) 2024-05-30 16:47:35 +02:00
ja docs: fix broken link (#31370) 2024-06-12 11:33:00 +01:00
ko docs: fix broken link (#31370) 2024-06-12 11:33:00 +01:00
ms Remove old TF port docs (#30426) 2024-04-23 16:06:20 +01:00
pt Use HF_HUB_OFFLINE + fix has_file in offline mode (#31016) 2024-05-29 11:55:43 +01:00
te docs: fix broken link (#31370) 2024-06-12 11:33:00 +01:00
tr Translate index.md to Turkish (#27093) 2023-11-08 08:35:20 -05:00
zh docs: fix broken link (#31370) 2024-06-12 11:33:00 +01:00
_config.py [#29174] ImportError Fix: Trainer with PyTorch requires accelerate>=0.20.1 Fix (#29888) 2024-04-08 14:21:16 +01:00