Susnato Dhar
b5db8ca66f
Add flash attention for gpt_bigcode (#26479)
* added flash attention for gpt_bigcode
* changed docs
* Update src/transformers/models/gpt_bigcode/modeling_gpt_bigcode.py
* add FA-2 docs
* oops
* Update docs/source/en/perf_infer_gpu_one.md (last nit)
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix
* oops
* remove padding_mask
* change getattr->hasattr logic
* changed .md file
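One bullet above mentions changing getattr-based logic to hasattr. As a minimal, self-contained sketch (the `Config` class and `_flash_attn_2_enabled` flag here are illustrative stand-ins, not the actual transformers implementation), the difference is that `getattr` with a default collapses "flag explicitly set to a falsy value" and "flag not set at all" into the same answer, while `hasattr` only tests whether the attribute exists:

```python
class Config:
    """Hypothetical stand-in for a model config object."""
    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self, key, value)


def check_with_getattr(config):
    # getattr-based check: reads the value, falling back to False when
    # the attribute is missing. A flag set to False looks identical to
    # a flag that was never set.
    return bool(getattr(config, "_flash_attn_2_enabled", False))


def check_with_hasattr(config):
    # hasattr-based check: only asks whether the attribute is defined,
    # regardless of its value.
    return hasattr(config, "_flash_attn_2_enabled")


cfg = Config(_flash_attn_2_enabled=False)
print(check_with_getattr(cfg))  # False (value is falsy)
print(check_with_hasattr(cfg))  # True (attribute exists)
```

The hasattr form is useful when code needs to branch on whether a feature flag was configured at all, not on its truthiness.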
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-10-31 11:21:02 +00:00