Joel Lamy-Poirier
|
e0921c6b53
|
Add GPTBigCode model (Optimized GPT2 with MQA from Santacoder & BigCode) (#22575)
* Add model with cli tool
* Remove unwanted stuff
* Add new code
* Remove inference runner
* Style
* Fix checks
* Test updates
* make fixup
* fix docs
* fix doc
* fix test
* hopefully fix pipeline tests
* refactor
* fix CIs
* add comment
* rename to `GPTBigCodeForCausalLM`
* correct readme
* make fixup + docs
* make fixup
* fixes
* fixes
* Remove pruning
* Remove import
* Doc updates
* More pruning removal
* Combine copies
* Single MQA implementation, remove kv cache pre-allocation and padding
* Update doc
* Revert refactor to match gpt2 style
* Merge back key and value caches, fix some type hints
* Update doc
* Fix position ids with padding (PR 21080)
* Add conversion script temporarily
* Update conversion script
* Remove checkpoint conversion
* New model
* Fix MQA test
* Fix copies
* try fix tests
* FIX TEST!!
* remove `DoubleHeadsModel`
* add MQA tests
* add slow tests
* clean up
* add CPU checker
* final fixes
* fixes
- fix GPU issue
- fixed slow tests
- skip disk offload
* fix final issue
* Simplify and comment baddbmm fix
* Remove unnecessary code
* Transpose tweaks
* Use beta=1 on cpu, improve tests
---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
|
2023-04-10 10:57:21 +02:00 |
|