'Western Qwen': IBM Wows with Granite 4 LLM Launch and Hybrid Mamba/Transformer
Posted 3 months ago · Active 3 months ago
Source: venturebeat.com · Tech story
Sentiment: excited/mixed · Debate: 60/100
Key topics: LLM, IBM, AI
IBM has launched Granite 4, a new LLM with a hybrid Mamba/Transformer architecture, sparking interest and discussion among the HN community about its performance, potential applications, and comparisons to other models.
Snapshot generated from the HN discussion
Discussion Activity
Light discussion · First comment: 2h after posting · Peak period: 5 comments in 2-4h · Avg per period: 2.5
Based on 25 loaded comments
Key moments
- Story posted: Oct 3, 2025 at 12:26 AM EDT (3 months ago)
- First comment: Oct 3, 2025 at 1:58 AM EDT (2h after posting)
- Peak activity: 5 comments in 2-4h, the hottest window of the conversation
- Latest activity: Oct 4, 2025 at 7:59 AM EDT (3 months ago)
ID: 45458987 · Type: story · Last synced: 11/20/2025, 12:29:33 PM
Original sources:
IBM Granite 4.0: hyper-efficient, high performance hybrid models for enterprise
https://www.ibm.com/new/announcements/ibm-granite-4-0-hyper-...
> ISO/IEC 42001 is an international standard that specifies requirements for establishing, implementing, maintaining, and continually improving an Artificial Intelligence Management System (AIMS) within organizations. It is designed for entities providing or utilizing AI-based products or services, ensuring responsible development and use of AI systems.
https://www.iso.org/standard/42001
If anyone has access to ISO standards, I'm really curious what the practical effects of that certification are, i.e. what things Granite has that other models don't, because IBM had to add or do them to fulfill the certification.
The committee was formed in 2017, chaired by an AI expert: https://www.iso.org/committee/6794475.html
https://www.ibm.com/think/topics/mamba-model
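For anyone wondering what the Mamba half of the hybrid buys you: a state-space layer carries a fixed-size recurrent state from token to token instead of a per-token KV cache. A toy scalar sketch of that recurrence (real Mamba uses input-dependent, vectorized parameters; the constants here are purely illustrative):

```python
# Toy diagonal state-space recurrence: the core idea behind Mamba-style
# layers. A fixed-size hidden state is updated once per token, so memory
# stays constant regardless of sequence length (unlike a KV cache).
# Scalar a, b, c per channel here; real Mamba makes these input-dependent.

def ssm_scan(xs, a=0.9, b=0.5, c=1.0):
    h = 0.0  # constant-size state, no matter how long xs is
    ys = []
    for x in xs:
        h = a * h + b * x   # state update
        ys.append(c * h)    # readout
    return ys

ys = ssm_scan([1.0, 0.0, 0.0, 0.0])
# impulse response decays exponentially: 0.5, 0.45, 0.405, ...
```

The point of the sketch: the only thing carried forward is `h`, which is why a mostly-Mamba stack needs so much less memory at long context than a pure Transformer.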
No Mamba in the Ollama version though.
Would Granite run with llama.cpp and use Mamba?
EDIT: Looks like Granite 4 hybrid architecture support was added to llama.cpp back in May: https://github.com/ggml-org/llama.cpp/pull/13550
Yes and no. Ollama has written its own "engine" using the GGML libraries directly, but falls back to llama.cpp for models the new engine doesn't yet support.
./llama.cpp/llama-cli -hf unsloth/granite-4.0-h-small-GGUF:UD-Q4_K_XL
Also a support agent finetuning notebook with granite 4: https://colab.research.google.com/github/unslothai/notebooks...
The IBM article has this image showing that it's supposed to be a bit ahead of GPT OSS 120B for at least some tasks (horrible URL but oh well): https://www.ibm.com/content/dam/worldwide-content/creative-a...
So in general it's going to be worse than GPT-5 and Sonnet 4.5, but closer to GPT-5 mini. At least you can run this on prem, which you can't do with any of the others. Pretty good; it could possibly replace Qwen3 for quite a few use cases!
20GB @ 100,000 context.
But for some reason... LM Studio isn't loading it onto the GPU for me?
I just updated to 0.3.28 and it still won't load onto the GPU.
Switched from Vulkan to ROCm, and now it's working properly.
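The ~20GB at 100k-token context figure upthread makes sense once you account for how few attention layers a hybrid stack carries. A back-of-envelope sketch (every layer count, head count, and dimension below is a made-up illustration, not Granite 4's actual config):

```python
# Rough KV-cache estimate: why a hybrid Mamba/Transformer needs far less
# memory at long context than a pure Transformer. All parameters here are
# illustrative assumptions, not Granite 4's real architecture.

def kv_cache_gib(n_attn_layers, n_kv_heads, head_dim, context_len, bytes_per=2):
    # 2x for separate K and V tensors; fp16 = 2 bytes per element
    total = 2 * n_attn_layers * n_kv_heads * head_dim * context_len * bytes_per
    return total / 2**30

# Hypothetical 40-layer model, GQA with 8 KV heads of dim 128, 100k context
full = kv_cache_gib(n_attn_layers=40, n_kv_heads=8, head_dim=128,
                    context_len=100_000)
# Hybrid: suppose only 1 in 10 layers is attention; the Mamba layers keep
# a small constant-size state instead of a cache that grows with context
hybrid = kv_cache_gib(n_attn_layers=4, n_kv_heads=8, head_dim=128,
                      context_len=100_000)
print(f"full attention: {full:.1f} GiB, hybrid: {hybrid:.1f} GiB")
# ~15.3 GiB vs ~1.5 GiB for the cache alone, before weights
```

With the cache shrunk by an order of magnitude, most of that 20GB can go to the quantized weights instead.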
https://docs.unsloth.ai/new/ibm-granite-4.0
Fantastic work from unsloth folks as usual.
Running in Roo Code, it's using more like 26GB of VRAM.
~30 TPS
Roo Code does not work with it.
Kilo Code next. It seems to use about 22GB of VRAM.
Kilo Code works great.
The model, however, didn't one-shot my first benchmark. That's pretty bad news for this model, given that Magistral 2509 or Apriel 15B do better.
Better on pass 2, still not 100%.
Third pass achieved it.
I'm predicting it'll land around 30% on LiveCodeBench and probably around 15% on Aider polyglot. Very disappointed in its coding capability.
I just found:
https://artificialanalysis.ai/models/granite-4-0-h-small
25.1% on LiveCodeBench. Absolutely deserved.
2% on Terminal-Bench.
16% on the coding index. Completely deserved.