Mayank Mishra
|
a570e2ba87
|
add shared experts for upcoming Granite 4.0 language models (#35894)
* Modular GraniteMoE with shared Experts.
Signed-off-by: Shawn Tan <shawntan@ibm.com>
* Modified
* Import order.
* Modified for style
* Fix space.
* Test
* Remove extra granitemoe file.
* New converted file and tests
* Modified __init__ files.
* Formatting.
* Dummy PT objects
* register granitemoe shared model
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
* fix linting of a file
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
* fix import in modeling file
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
* update generated modeling file
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
* add documentation
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
* update docstrings
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
* update generated modeling file
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
* fix docstrings in config class
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
* merge main
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
---------
Signed-off-by: Shawn Tan <shawntan@ibm.com>
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
Co-authored-by: Shawn Tan <shawntan@ibm.com>
Co-authored-by: Shawn Tan <shawn@wtf.sg>
Co-authored-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
Co-authored-by: Sukriti Sharma <Ssukriti@users.noreply.github.com>
|
2025-02-14 16:55:28 +01:00 |
|