Ashvanth.S
3a826a45ca
Update Model card for GPT2 (#37101)
...
* Update Model card for gpt2
* Update link for gpt2 space
* fixes docs based on suggestions
* Add transformers-cli and quantization example for GPT-2
* Remove resources and flash attention docs and fix typos
2025-04-07 10:15:28 -07:00
Anton Vlasjuk
b275a41005
[GPT2] Add SDPA support (#31172)
...
* `gpt2` sdpa support
* fix (at least) one test, style, repo consistency
* fix sdpa mask in forward --> fixes generation
* test
* test2
* test3
* test4
* simplify shapes for attn mask creation and small comments
* hub fail test
* benchmarks
* flash attn 2 mask should not be inverted on enc-dec setup
* fix comment
* apply some suggestion from code review
- only save _attn_implentation once
- remove unnecessary comment
* change elif logic
* [run-slow] gpt2
* modify `test_gpt2_sample_max_time` to follow previous assertion patterns
2024-06-19 09:40:57 +02:00
Eduardo Pacheco
22d159ddf9
Adding Flash Attention 2 Support for GPT2 (#29226)
...
* First commit to add flash attention 2 for GPT-2
* more improvements
* Make GPT2 pass tests and fixed Decison Transformers copies
* Fixed missing arg
* fix copies
* Added expected speedup
* Update src/transformers/models/gpt2/modeling_gpt2.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/gpt2/modeling_gpt2.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/gpt2/modeling_gpt2.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Added test
* Fixed attn attribute
* Update docs/source/en/model_doc/gpt2.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update docs/source/en/model_doc/gpt2.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update Decision transformer attentions
* More updates
* Passing tests
* Fix copies
* Fix copies part 2
* Decision transformer updates
* Update src/transformers/models/gpt2/modeling_gpt2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Fix copies
* Decision transformer not supporting flash attn
* Addressed comments
* Addressed comments
* Addressed comments
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-03-28 09:31:24 +00:00
Maria Khalusova
5964f820db
[Docs] Model_doc structure/clarity improvements (#26876)
...
* first batch of structure improvements for model_docs
* second batch of structure improvements for model_docs
* more structure improvements for model_docs
* more structure improvements for model_docs
* structure improvements for cv model_docs
* more structural refactoring
* addressed feedback about image processors
2023-11-03 10:57:03 -04:00
Sylvain Gugger
eb849f6604
Migrate doc files to Markdown. (#24376)
...
* Rename index.mdx to index.md
* With saved modifs
* Address review comment
* Treat all files
* .mdx -> .md
* Remove special char
* Update utils/tests_fetcher.py
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
---------
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
2023-06-20 18:07:47 -04:00