Not Hacker News!
AI Inference
20 stories • 24h: 0% • 7d: 0 • 236 comments
Top contributors: HenryNdubuaku, yvbbrjdr, sorenjan, officerk, PaulHoule
Stories
20 stories tagged with "ai inference"
Cactus (YC S25) – AI Inference on Smartphones
123 points • 63 comments • by HenryNdubuaku • posted 4 months ago • active about 1 month ago
Tags: AI inference, mobile devices, open-source
NVIDIA DGX Spark In-Depth Review: A New Standard for Local AI Inference
115 points • 93 comments • by yvbbrjdr • posted 3 months ago • active about 1 month ago
Tags: NVIDIA DGX Spark, AI inference, hardware review
Windows ML Is Generally Available
114 points • 46 comments • by sorenjan • posted 3 months ago • active about 1 month ago
Tags: Windows ML, AI inference, on-device AI
Analog Optical Computer for AI Inference and Combinatorial Optimization
101 points • 20 comments • by officerk • posted 4 months ago • active about 1 month ago
Tags: optical computing, AI inference, analog computing
ChunkLLM: A Lightweight Pluggable Framework for Accelerating LLM Inference
96 points • 8 comments • by PaulHoule • posted 2 months ago • active about 1 month ago
Tags: LLM optimization, AI inference, machine learning frameworks
Compiler Optimizations for 5.8ms GPT-OSS-120B Inference (Not on GPUs)
9 points • 0 comments • by olibaw • posted 3 months ago • active about 1 month ago
Tags: compiler optimizations, AI inference, hardware acceleration
Optimizing AI Inference with Edge Computing
8 points • 0 comments • by sachamorard • posted 4 months ago • active about 1 month ago
Tags: edge computing, AI inference, cloud computing
2:4 Semi-Structured Sparsity: 27% Faster AI Inference on NVIDIA Hardware
7 points • 2 comments • by HappyTeam • posted 3 months ago • active about 1 month ago
Tags: AI inference, sparsity, NVIDIA hardware
Sharing a Base Model in GPU VRAM Across Multiple Inference Stack Processes [video]
7 points • 1 comment • by medicis123 • posted 4 months ago • active about 1 month ago
Tags: GPU optimization, AI inference, VRAM management
Llama.cpp: Deterministic Inference Mode (CUDA): RMSNorm, MatMul, Attention
6 points • 0 comments • by diwank • posted 4 months ago • active about 1 month ago
Tags: AI inference, CUDA, llama.cpp
Speculative Cascades – A Hybrid Approach for Smarter, Faster LLM Inference
6 points • 0 comments • by emschwartz • posted 4 months ago • active about 1 month ago
Tags: LLM, AI inference, Google research
Lazy Loading Isn't the Magic Pill to Fix AI Inference
4 points • 0 comments • by ssingh_hn • posted about 2 months ago • active about 1 month ago
Tags: AI inference, lazy loading, performance optimization
Melange – Pegging AI Inference to the Cost of the Most Expensive Model
4 points • 1 comment • by Paralus • posted 3 months ago • active about 1 month ago
Tags: AI inference, cost optimization, cloud computing
Don't Buy These GPUs for Local AI Inference
4 points • 1 comment • by ericdotlee • posted 3 months ago • active about 1 month ago
Tags: GPU, AI inference, hardware recommendations
I Wrote Inference for Qwen3 0.6B in C/CUDA
4 points • 0 comments • by mk93074 • posted 3 months ago • active about 1 month ago
Tags: C programming, CUDA, AI inference
NVIDIA Unveils Rubin CPX
4 points • 0 comments • by dataking • posted 4 months ago • active about 1 month ago
Tags: NVIDIA, GPU, AI inference
Analog Optical Computer for AI Inference and Combinatorial Optimization
4 points • 0 comments • by bookofjoe • posted 4 months ago • active about 1 month ago
Tags: optical computing, AI inference, combinatorial optimization
Canonical Releases Silicon-Optimized Inference Snaps
3 points • 0 comments • by glitchc • posted 2 months ago • active about 1 month ago
Tags: Canonical, Ubuntu, AI inference, snap packaging
Euclyd – Startup to Take On AI Inference with SiP, Custom Memory
3 points • 0 comments • by frozenseven • posted 3 months ago • active about 1 month ago
Tags: AI inference, startup, hardware acceleration
NVIDIA B200 Low Power Usage for AI Inference Workloads
3 points • 1 comment • by viyops • posted 4 months ago • active about 1 month ago
Tags: NVIDIA, AI inference, GPU performance