IBM Granite 4.0: Hyper-Efficient, High-Performance Hybrid Models for Enterprise
Posted 3 months ago · Active 3 months ago
ibm.com · Tech story
Key topics
AI
Enterprise Software
IBM
IBM announces Granite 4.0, a new generation of hybrid AI models for enterprise use, sparking discussion on their potential efficiency and performance benefits.
Snapshot generated from the HN discussion
Discussion Activity
Light discussion
First comment: 7h after posting
Peak period: 1 comment in 6-9h
Avg per period: 1
Key moments
01. Story posted: Oct 2, 2025 at 11:16 AM EDT (3 months ago)
02. First comment: Oct 2, 2025 at 6:08 PM EDT (7h after posting)
03. Peak activity: 1 comment in 6-9h, the hottest window of the conversation
04. Latest activity: Oct 4, 2025 at 7:59 AM EDT (3 months ago)
Discussion (3 comments)
danielhanchen
3 months ago
1 reply
Made some dynamic GGUFs for those interested! https://huggingface.co/unsloth/granite-4.0-h-small-GGUF (32B Mamba Hybrid + MoE)
CMay
3 months ago
1 reply
Thanks! Any idea why I'm getting such poor performance on these new models? Whether Small or Tiny, on my 24GB 7900XTX I'm seeing like 8 tokens/s using the latest llama.cpp with Vulkan. Even if it was running 4x faster than this, I would be asking why I'm getting so few tokens/s when the models are supposed to bring increased inference efficiency.
danielhanchen
3 months ago
Oh I think it's a Vulkan backend issue - someone raised it with me and said the ROCm backend is much faster
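For readers hitting the same slowdown, one way to check this suggestion is to build llama.cpp with each backend and benchmark the same GGUF on both. A minimal sketch, assuming current llama.cpp CMake option names (`GGML_VULKAN`, `GGML_HIP`) and an illustrative model filename:

```shell
# Vulkan build (the backend CMay reported as slow on the 7900XTX)
cmake -B build-vulkan -DGGML_VULKAN=ON
cmake --build build-vulkan --config Release -j

# ROCm/HIP build (the backend danielhanchen suggests is faster on AMD GPUs)
cmake -B build-rocm -DGGML_HIP=ON
cmake --build build-rocm --config Release -j

# Benchmark each build on the same model; llama-bench reports tokens/s directly
# (model path is illustrative, -ngl 99 offloads all layers to the GPU)
./build-vulkan/bin/llama-bench -m granite-4.0-h-small-Q4_K_M.gguf -ngl 99
./build-rocm/bin/llama-bench -m granite-4.0-h-small-Q4_K_M.gguf -ngl 99
```

Comparing the two tokens/s figures side by side would confirm (or rule out) the Vulkan backend as the bottleneck.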
View full discussion on Hacker News
ID: 45450841 · Type: story · Last synced: 11/17/2025, 12:10:15 PM