Feature Request: Granite 4 Support #13275
Labels: enhancement (New feature or request)
Comments
For reference, support PRs in other platforms:
If this is the same idea as with llama 4, then I think we already support this. In short, it's just the code in this snippet:
[Embedded snippet: Lines 4536 to 4547 in 3bf785f]
@ngxson That's great! Thanks for pointing that out
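For context, the "same idea" here is presumably the shared (always-active) expert that Llama 4 uses alongside its routed experts, which is also what `GraniteMoEShared` (see the feature description below) adds. A minimal sketch of that pattern, using hypothetical stand-in names rather than the code at the cited lines:

```cpp
// Hedged illustration only: the shared-expert MoE pattern that Llama 4 and
// GraniteMoEShared both use. All names here are hypothetical stand-ins, not
// the llama.cpp code at the lines cited above.
#include <cstddef>
#include <utility>
#include <vector>

using Vec = std::vector<float>;

// Stand-in for an expert FFN forward pass (identity here, just for shape).
static Vec expert_forward(int /*expert_id*/, const Vec & x) { return x; }

// MoE output = weighted sum of the top-k routed experts, plus a shared expert
// that is applied to every token with no routing (some variants also gate it).
static Vec moe_with_shared_expert(const Vec & x,
                                  const std::vector<std::pair<int, float>> & topk) {
    Vec out(x.size(), 0.0f);
    for (const auto & [expert_id, gate] : topk) {
        const Vec y = expert_forward(expert_id, x);
        for (size_t i = 0; i < x.size(); ++i) { out[i] += gate * y[i]; }
    }
    const Vec shared = expert_forward(/*expert_id=*/-1, x);  // always active
    for (size_t i = 0; i < x.size(); ++i) { out[i] += shared[i]; }
    return out;
}
```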
Prerequisites
Feature Description
This issue is to track work to support IBM's Granite 4 model architecture (`GraniteMoEHybrid` in `transformers`). The model uses a number of components that are not yet supported in `llama.cpp`, but are being worked on independently, so I'm raising this issue to triangulate the different work streams that will be needed to support the model.
Necessary Components
- `jamba` by @compilade: llama : support Jamba hybrid Transformer-Mamba models #7531
- `bamba`: Bamba architecture #10810
- A `bamba` branch that's also out-of-date: https://github.com/gabe-l-hart/llama.cpp/tree/BambaArchitectureRefactor
- `GraniteMoEShared` layers: Model: Granite MoE shared #13269
- `mamba2` in non-CPU backends
  - The `metal` backend needs look like they're already addressed in llama : initial Mamba-2 support #9126, but for me that still doesn't work on my M3 (assertion error about non-contiguous data).
- `GraniteMoEHybrid` support tying all of the other pieces together (a rough sketch of the layer dispatch follows this list)
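To make that last item a bit more concrete: `GraniteMoEHybrid` interleaves mamba2 (SSM) blocks with attention blocks and uses MoE feed-forward layers with a shared expert, so the model graph needs a per-layer choice between the two mixer types. A rough sketch of that dispatch, using hypothetical names rather than actual llama.cpp structures:

```cpp
// Hedged sketch only: the per-layer dispatch a hybrid SSM/attention graph
// needs. Types and names are hypothetical, not actual llama.cpp structures;
// residual connections and norms are omitted for brevity.
#include <vector>

using Vec = std::vector<float>;

enum class BlockType { MAMBA2, ATTENTION };

struct HybridLayer {
    BlockType type;  // which sequence mixer this layer uses
    // per-layer weights would live here
};

// Stand-ins for the real blocks (identity here, just to show the control flow).
static Vec mamba2_block   (const HybridLayer &, const Vec & x) { return x; }
static Vec attention_block(const HybridLayer &, const Vec & x) { return x; }
static Vec moe_ffn        (const HybridLayer &, const Vec & x) { return x; }  // routed + shared experts

// Every layer picks its mixer by block type, then applies the MoE FFN.
static Vec forward(const std::vector<HybridLayer> & layers, Vec x) {
    for (const HybridLayer & layer : layers) {
        x = (layer.type == BlockType::MAMBA2) ? mamba2_block(layer, x)
                                              : attention_block(layer, x);
        x = moe_ffn(layer, x);
    }
    return x;
}
```

Presumably the per-layer block type would be recorded in the GGUF metadata at conversion time, but that plumbing is exactly what this issue is meant to coordinate.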
Motivation
I lead IBM's efforts to ensure that Granite models work everywhere, and `llama.cpp` is a critical part of "everywhere!"
Possible Implementation
No response