transformers/docs/source
Mayank Mishra a570e2ba87
add shared experts for upcoming Granite 4.0 language models (#35894)
* Modular GraniteMoE with shared Experts.

Signed-off-by: Shawn Tan <shawntan@ibm.com>

* Modified

* Import order.

* Modified for style

* Fix space.

* Test

* Remove extra granitemoe file.

* New converted file and tests

* Modified __init__ files.

* Formatting.

* Dummy PT objects

* register granitemoe shared model

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* fix linting of a file

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* fix import in modeling file

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* update generated modeling file

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* add documentation

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* update docstrings

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* update generated modeling file

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* fix docstrings in config class

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* merge main

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

---------

Signed-off-by: Shawn Tan <shawntan@ibm.com>
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
Co-authored-by: Shawn Tan <shawntan@ibm.com>
Co-authored-by: Shawn Tan <shawn@wtf.sg>
Co-authored-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
Co-authored-by: Sukriti Sharma <Ssukriti@users.noreply.github.com>
2025-02-14 16:55:28 +01:00
ar Remove INC notebook reference in documentation (#35936) 2025-01-28 17:10:02 +01:00
de Fix typos in translated quicktour docs (#35302) 2024-12-17 09:32:00 -08:00
en add shared experts for upcoming Granite 4.0 language models (#35894) 2025-02-14 16:55:28 +01:00
es Move DataCollatorForMultipleChoice from the docs to the package (#34763) 2025-02-13 12:01:28 +01:00
fr Add French translation of task_summary and tasks_explained (#33407) 2025-01-06 14:23:52 +01:00
hi [i18n-HI] Translated TFLite page to Hindi (#34572) 2024-11-04 09:40:30 -08:00
it Fix typos in translated quicktour docs (#35302) 2024-12-17 09:32:00 -08:00
ja Move DataCollatorForMultipleChoice from the docs to the package (#34763) 2025-02-13 12:01:28 +01:00
ko Move DataCollatorForMultipleChoice from the docs to the package (#34763) 2025-02-13 12:01:28 +01:00
ms Remove old benchmark code (#35730) 2025-01-21 17:56:43 +00:00
pt Fix typos in translated quicktour docs (#35302) 2024-12-17 09:32:00 -08:00
te Fix typos in translated quicktour docs (#35302) 2024-12-17 09:32:00 -08:00
tr Translate index.md to Turkish (#27093) 2023-11-08 08:35:20 -05:00
zh DeepSpeed github repo move sync (#36021) 2025-02-05 08:19:31 -08:00
_config.py [#29174] ImportError Fix: Trainer with PyTorch requires accelerate>=0.20.1 Fix (#29888) 2024-04-08 14:21:16 +01:00