Not Hacker News!
AI Inference
20 stories • 24h: 0% • 7d: 0 • 236 comments
Top contributors: HenryNdubuaku, yvbbrjdr, sorenjan, officerk, PaulHoule
Stories
20 stories tagged with "ai inference"
Cactus (YC S25) – AI Inference on Smartphones
123 points • 63 comments • by HenryNdubuaku • posted 4 months ago • active about 1 month ago
Tags: AI inference, mobile devices, open-source
NVIDIA DGX Spark In-Depth Review: A New Standard for Local AI Inference
115 points • 93 comments • by yvbbrjdr • posted 3 months ago • active about 1 month ago
Tags: NVIDIA DGX Spark, AI inference, hardware review
Windows ML Is Generally Available
114 points • 46 comments • by sorenjan • posted 3 months ago • active about 1 month ago
Tags: Windows ML, AI inference, on-device AI
Analog Optical Computer for AI Inference and Combinatorial Optimization
101 points • 20 comments • by officerk • posted 4 months ago • active about 1 month ago
Tags: optical computing, AI inference, analog computing
ChunkLLM: A Lightweight Pluggable Framework for Accelerating LLM Inference
96 points • 8 comments • by PaulHoule • posted 2 months ago • active about 1 month ago
Tags: LLM optimization, AI inference, machine learning frameworks
Compiler Optimizations for 5.8ms GPT-OSS-120B Inference (Not on GPUs)
9 points • 0 comments • by olibaw • posted 3 months ago • active about 1 month ago
Tags: compiler optimizations, AI inference, hardware acceleration
Optimizing AI Inference with Edge Computing
8 points • 0 comments • by sachamorard • posted 4 months ago • active about 1 month ago
Tags: edge computing, AI inference, cloud computing
2:4 Semi-Structured Sparsity: 27% Faster AI Inference on NVIDIA Hardware
7 points • 2 comments • by HappyTeam • posted 3 months ago • active about 1 month ago
Tags: AI inference, sparsity, NVIDIA hardware
Sharing a Base Model in GPU VRAM Across Multiple Inference Stack Processes [video]
7 points • 1 comment • by medicis123 • posted 4 months ago • active about 1 month ago
Tags: GPU optimization, AI inference, VRAM management
Llama.cpp: Deterministic Inference Mode (CUDA): RMSNorm, MatMul, Attention
6 points • 0 comments • by diwank • posted 4 months ago • active about 1 month ago
Tags: AI inference, CUDA, llama.cpp
Speculative Cascades – A Hybrid Approach for Smarter, Faster LLM Inference
6 points • 0 comments • by emschwartz • posted 4 months ago • active about 1 month ago
Tags: LLM, AI inference, Google research
Lazy Loading Isn't the Magic Pill to Fix AI Inference
4 points • 0 comments • by ssingh_hn • posted about 2 months ago • active about 1 month ago
Tags: AI inference, lazy loading, performance optimization
Melange – Pegging AI Inference to the Cost of the Most Expensive Model
4 points • 1 comment • by Paralus • posted 3 months ago • active about 1 month ago
Tags: AI inference, cost optimization, cloud computing
Don't Buy These GPUs for Local AI Inference
4 points • 1 comment • by ericdotlee • posted 3 months ago • active about 1 month ago
Tags: GPU, AI inference, hardware recommendations
I Wrote Inference for Qwen3 0.6B in C/CUDA
4 points • 0 comments • by mk93074 • posted 3 months ago • active about 1 month ago
Tags: C programming, CUDA, AI inference
NVIDIA Unveils Rubin CPX
4 points • 0 comments • by dataking • posted 4 months ago • active about 1 month ago
Tags: NVIDIA, GPU, AI inference
Analog Optical Computer for AI Inference and Combinatorial Optimization
4 points • 0 comments • by bookofjoe • posted 4 months ago • active about 1 month ago
Tags: optical computing, AI inference, combinatorial optimization
Canonical Releases Silicon-Optimized Inference Snaps
3 points • 0 comments • by glitchc • posted 2 months ago • active about 1 month ago
Tags: Canonical, Ubuntu, AI inference, snap packaging
Euclyd – Startup to Take On AI Inference with SiP, Custom Memory
3 points • 0 comments • by frozenseven • posted 3 months ago • active about 1 month ago
Tags: AI inference, startup, hardware acceleration
NVIDIA B200 Low Power Usage for AI Inference Workloads
3 points • 1 comment • by viyops • posted 4 months ago • active about 1 month ago
Tags: NVIDIA, AI inference, GPU performance