The Fluid Substrate: Streaming 1 TB Models from NVMe via io_uring
We wrote this paper to propose an architectural shift we call Fluid Federated Learning (FFL).
The core engineering contributions are:
Prism Protocol: We implemented a "Software-Defined Memory" architecture. It uses io_uring to stream sparse random projections of the model weights directly from NVMe storage to the GPU.
This lets us process "Virtual Batches" of terabyte-scale models on commodity hardware by exploiting the Johnson-Lindenstrauss lemma (we call this technique "Holographic Slicing").
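For readers who haven't seen it, the relevant guarantee is the standard Johnson-Lindenstrauss bound: for any 0 < ε < 1 and any n points in R^d, there is a linear map f : R^d → R^k with k = O(ε^{-2} log n) such that

    (1 - \varepsilon)\,\|u - v\|^2 \;\le\; \|f(u) - f(v)\|^2 \;\le\; (1 + \varepsilon)\,\|u - v\|^2

for every pair of points u, v. Sparse random matrices are known to achieve this, which is what makes streaming the projections cheap.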
Federated State-Space Duality (F-SSD): Instead of averaging gradients (which is slow and leaks information about client data), we exploit the duality between Transformers and SSMs (such as Mamba) to federate the Recurrent States.
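In a deliberately simplified picture, the discretized linear SSM recurrence is

    h_t = \bar{A}\,h_{t-1} + \bar{B}\,x_t, \qquad y_t = C\,h_t,

and instead of shipping full gradient vectors, each client k shares its fixed-size hidden state h^{(k)}, which the server fuses (take the plain average here as purely illustrative):

    h_{\text{global}} = \frac{1}{K} \sum_{k=1}^{K} h^{(k)}.

Because h is a constant-size summary of the client's sequence, this costs far less bandwidth than gradient exchange.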
The Result: We can run massive foundation models on edge devices with limited VRAM by treating the SSD as a "slow" memory tier without destroying optimization fidelity.
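To give a feel for the I/O path, here is a stripped-down sketch of the kind of io_uring read loop involved; the file name, chunk size, and queue depth are illustrative placeholders, not our production values.

    /* Sketch: stream weight chunks from NVMe with io_uring (liburing).
       Build: gcc -O2 -D_GNU_SOURCE sketch.c -luring
       File name, chunk size, and queue depth are illustrative only. */
    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <liburing.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    #define QUEUE_DEPTH 32            /* in-flight requests (assumed) */
    #define CHUNK_BYTES (1 << 20)     /* 1 MiB aligned chunks (assumed) */

    int main(void) {
        /* "weights.shard0" is a hypothetical on-disk layout. O_DIRECT
           bypasses the page cache, which is what you want when the SSD
           is acting as a memory tier rather than a file store. */
        int fd = open("weights.shard0", O_RDONLY | O_DIRECT);
        if (fd < 0) { perror("open"); return 1; }

        struct io_uring ring;
        if (io_uring_queue_init(QUEUE_DEPTH, &ring, 0) < 0) {
            perror("io_uring_queue_init"); return 1;
        }

        /* O_DIRECT requires sector-aligned buffers. */
        void *buf = NULL;
        if (posix_memalign(&buf, 4096, CHUNK_BYTES)) { perror("posix_memalign"); return 1; }

        off_t offset = 0;
        for (int i = 0; i < 8; i++) {  /* demo: read 8 chunks sequentially */
            struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
            if (!sqe) break;
            io_uring_prep_read(sqe, fd, buf, CHUNK_BYTES, offset);
            io_uring_submit(&ring);

            struct io_uring_cqe *cqe;
            if (io_uring_wait_cqe(&ring, &cqe) < 0) break;
            if (cqe->res < 0)
                fprintf(stderr, "read failed: %s\n", strerror(-cqe->res));
            /* A real pipeline would apply the sparse projection to `buf`
               here and hand the result to the GPU (e.g. via a pinned
               staging buffer), overlapping compute with the next read. */
            io_uring_cqe_seen(&ring, cqe);
            offset += CHUNK_BYTES;
        }

        free(buf);
        io_uring_queue_exit(&ring);
        close(fd);
        return 0;
    }

The serial submit/wait above is for clarity only; the whole point of io_uring is to keep QUEUE_DEPTH reads in flight with per-request buffers and overlap the projection math with the next batch of completions.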
I'm curious whether anyone here has experimented with io_uring for model serving. We found the async I/O overhead to be negligible compared to the memory gains, but I'm wondering if there are better ways to handle the sparse projections.
Happy to answer questions on the implementation.