Yih-Dar
|
bd90cda9a6
|
CI with num_hidden_layers=2 🚀🚀🚀 (#25266)
* CI with layers=2
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
|
2023-08-02 20:22:36 +02:00 |
|
Yih-Dar
|
cf561d7cf1
|
Add torch >=1.12 requirement for Tapas (#24251)
* fix
* fix
* fix
* Update src/transformers/models/tapas/modeling_tapas.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
|
2023-06-13 19:19:40 +02:00 |
|
Joao Gante
|
918a06e25d
|
Generate: add test to check KV format (#23403)
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
|
2023-05-16 19:28:19 +01:00 |
|
Yih-Dar
|
975159bb61
|
Update tiny models and a few fixes (#22928)
* run_check_tiny_models
* update summary
* update mixin
* update pipeline_model_mapping
* update pipeline_model_mapping
* Update for gpt_bigcode
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
|
2023-04-24 14:45:22 +02:00 |
|
Joel Lamy-Poirier
|
e0921c6b53
|
Add GPTBigCode model (Optimized GPT2 with MQA from Santacoder & BigCode) (#22575)
* Add model with cli tool
* Remove unwanted stuff
* Add new code
* Remove inference runner
* Style
* Fix checks
* Test updates
* make fixup
* fix docs
* fix doc
* fix test
* hopefully fix pipeline tests
* refactor
* fix CIs
* add comment
* rename to `GPTBigCodeForCausalLM`
* correct readme
* make fixup + docs
* make fixup
* fixes
* fixes
* Remove pruning
* Remove import
* Doc updates
* More pruning removal
* Combine copies
* Single MQA implementation, remove kv cache pre-allocation and padding
* Update doc
* Revert refactor to match gpt2 style
* Merge back key and value caches, fix some type hints
* Update doc
* Fix position ids pith padding (PR 21080)
* Add conversion script temporarily
* Update conversion script
* Remove checkpoint conversion
* New model
* Fix MQA test
* Fix copies
* try fix tests
* FIX TEST!!
* remove `DoubleHeadsModel`
* add MQA tests
* add slow tests
* clean up
* add CPU checker
* final fixes
* fixes
- fix GPU issue
- fixed slow tests
- skip disk offload
* fix final issue
* Simplify and comment baddbmm fix
* Remove unnecessary code
* Transpose tweaks
* Use beta=1 on cpu, improve tests
---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
|
2023-04-10 10:57:21 +02:00 |
|