Not

Hacker News!

Beta
Home
Jobs
Q&A
Startups
Trends
Users
Live
AI companion for Hacker News

Nov 24, 2025 at 12:30 PM EST

Title: Ask HN: Scheduling stateful nodes when MMAP makes memory accounting a lie

leo_e
1 points
0 comments

Mood: skeptical
Sentiment: neutral
Category: ask_hn
Key topics: Distributed Systems, Memory Management, Load Balancing, Mmap

We’re hitting a classic distributed systems wall and I’m looking for war stories or "least worst" practices.

The Context: We maintain a distributed stateful engine (think search/analytics). The architecture is standard: a Control Plane (Coordinator) assigns data segments to Worker Nodes. The workload involves heavy use of mmap and lazy loading for large datasets.

The Incident: We had a cascading failure where the Coordinator got stuck in a retry loop, effectively DDoS-ing a specific node.

The Signal: Coordinator sees Node A has significantly fewer rows (logical count) than the cluster average. It flags Node A as "underutilized."

The Action: Coordinator attempts to rebalance/load new segments onto Node A.

The Reality: Node A is actually sitting at 197 GB of RAM usage (near OOM). The data on it happens to be extremely wide (fat rows, huge blobs), so its logical row count is low, but its physical footprint is massive.

The Loop: Node A rejects the load (or times out). The Coordinator ignores the backpressure, sees the low row count again, and retries immediately.
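
One way to break the retry loop described above, independent of how placement is scored, is for the Coordinator to treat a rejection or timeout as a signal and put the node on a cooldown. The following is a minimal sketch of per-node exponential backoff with jitter; the class name `NodeBackoff`, its fields, and the default delays are hypothetical and not taken from the system in the post.

```python
import random
from dataclasses import dataclass, field


@dataclass
class NodeBackoff:
    """Per-node cooldown after a rejected or timed-out placement (illustrative)."""
    base_s: float = 1.0      # assumed first-retry ceiling, in seconds
    cap_s: float = 300.0     # assumed maximum cooldown, in seconds
    failures: dict = field(default_factory=dict)
    not_before: dict = field(default_factory=dict)

    def record_rejection(self, node: str, now: float) -> float:
        """Register a rejection and return the cooldown that was chosen."""
        n = self.failures.get(node, 0) + 1
        self.failures[node] = n
        # Double the ceiling per consecutive failure; clamp the exponent
        # so the arithmetic stays bounded.
        ceiling = min(self.cap_s, self.base_s * 2 ** min(n - 1, 30))
        # Jitter keeps retries for many segments from synchronizing.
        delay = random.uniform(ceiling / 2, ceiling)
        self.not_before[node] = now + delay
        return delay

    def record_success(self, node: str) -> None:
        """A successful placement clears the node's failure history."""
        self.failures.pop(node, None)
        self.not_before.pop(node, None)

    def eligible(self, node: str, now: float) -> bool:
        """True if the node may be offered new segments at time `now`."""
        return now >= self.not_before.get(node, 0.0)
```

In this sketch the scheduler would call `eligible(node, now)` before considering a node "underutilized", so a low row count alone can no longer trigger an immediate retry.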

The Core Problem: We are trying to write a "God Equation" for our load balancer. We started with row_count, which failed. We looked at disk usage, but that doesn't correlate with RAM because of lazy loading.

Now we are staring at mmap. Because the OS manages the page cache, the application-level RSS is noisy and doesn't strictly reflect "required" memory vs "reclaimable" cache.
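
As a rough illustration of the "required" versus "reclaimable" split, the sketch below parses `/proc/<pid>/smaps_rollup` (Linux only, available on reasonably recent kernels) and separates anonymous memory from file-backed resident pages. Treating `Rss - Anonymous` as "mostly reclaimable" is an assumption: dirty or locked file pages and shared memory are not freely reclaimable, and evicting hot mmap pages still costs latency. The function names are hypothetical.

```python
import re

# Matches lines such as "Rss:   1234 kB"; the address-range header line is skipped.
_FIELD = re.compile(r"^(\w+):\s+(\d+)\s+kB$")


def parse_smaps_rollup(text: str) -> dict:
    """Return the kB-valued fields of an smaps_rollup dump as a dict."""
    fields = {}
    for line in text.splitlines():
        m = _FIELD.match(line.strip())
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields


def split_rss(fields: dict) -> dict:
    """Split RSS into anonymous and file-backed parts (approximation)."""
    rss = fields.get("Rss", 0)
    anon = fields.get("Anonymous", 0)
    return {
        "rss_kb": rss,
        "anon_kb": anon,                        # heap, stacks: not droppable
        "file_backed_kb": max(rss - anon, 0),   # mmap'd files, page cache
    }


def read_self() -> dict:
    """Read the current process's rollup (Linux only)."""
    with open("/proc/self/smaps_rollup") as f:
        return split_rss(parse_smaps_rollup(f.read()))
```

A node could report `anon_kb` and `file_backed_kb` separately instead of a single RSS figure, leaving the interpretation to whichever side makes the placement decision.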

The Question: Attempting to fold every resource variable (CPU, IOPS, RSS, Disk, logical count) into a single scoring function feels like an NP-hard trap.

How do you handle placement in systems where memory usage is opaque/dynamic?

Dumb Coordinator, Smart Nodes: Should we just let the Coordinator blind-fire based on disk space, and rely 100% on the Node to return a hard 429 Too Many Requests based on local pressure?
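
A minimal sketch of what node-side admission could look like, assuming a Linux node with pressure stall information (PSI) enabled so that `/proc/pressure/memory` is readable. The thresholds, the `Retry-After` value, and the function names are hypothetical and would need tuning against real workloads; PSI is just one candidate for a "local pressure" signal.

```python
def parse_psi(text: str) -> dict:
    """Parse PSI output, e.g. 'some avg10=0.00 avg60=0.00 avg300=0.00 total=0'."""
    out = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        kind, pairs = parts[0], parts[1:]
        out[kind] = {k: float(v) for k, v in (p.split("=", 1) for p in pairs)}
    return out


def admit_segment(psi: dict,
                  some_avg10_limit: float = 10.0,   # assumed threshold (%)
                  full_avg10_limit: float = 1.0,    # assumed threshold (%)
                  retry_after_s: int = 60):
    """Return (status, headers) for a 'load segment' request."""
    some = psi.get("some", {}).get("avg10", 0.0)
    full = psi.get("full", {}).get("avg10", 0.0)
    if some > some_avg10_limit or full > full_avg10_limit:
        # Reject and tell the caller when to come back.
        return 429, {"Retry-After": str(retry_after_s)}
    return 202, {}


def read_memory_pressure() -> dict:
    """Read the node's current memory pressure (Linux with PSI only)."""
    with open("/proc/pressure/memory") as f:
        return parse_psi(f.read())
```

This only helps if the Coordinator honors the 429 and the `Retry-After` hint; a caller that ignores them reproduces the loop from the incident.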

Cost Estimation: Do we try to build a synthetic "cost model" per segment (e.g., predicted memory footprint) and schedule based on credits, ignoring actual OS metrics?
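
To make the credit idea concrete, here is a small sketch in which each segment carries a predicted memory cost and each node has a fixed credit budget. The cost formula (rows times average row width times an assumed resident fraction), the `Node` type, and the "most free credits" rule are all illustrative assumptions, not a recommendation for how the prediction should be built.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    capacity: int   # credit budget, e.g. bytes reserved for segments
    used: int = 0   # credits already committed to placed segments

    @property
    def free(self) -> int:
        return self.capacity - self.used


def estimate_cost(row_count: int, avg_row_bytes: int,
                  resident_fraction: float = 1.0) -> int:
    """Predicted memory footprint; resident_fraction models lazy loading."""
    return int(row_count * avg_row_bytes * resident_fraction)


def place(cost: int, nodes: List[Node]) -> Optional[Node]:
    """Charge the segment to the node with the most free credits, if any fits."""
    candidates = [n for n in nodes if n.free >= cost]
    if not candidates:
        return None   # nothing fits: queue the segment or add capacity
    best = max(candidates, key=lambda n: n.free)
    best.used += cost
    return best
```

Under this model a node holding few but very wide rows shows up as nearly full, because the budget is charged by predicted bytes rather than by row count.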

Control Plane Decoupling: Should we separate storage balancing (disk) from query balancing (memory)?

Feels like we are reinventing the wheel. References to papers or similar architecture post-mortems appreciated.

Discussion Activity: Light discussion

First comment: 2h
Peak period: 4 (Hour 3)
Avg / period: 2.5

Based on 5 loaded comments

Key moments

  1. Story posted: Nov 24, 2025 at 12:30 PM EST (8h ago)
  2. First comment: Nov 24, 2025 at 2:31 PM EST (2h after posting)
  3. Peak activity: 4 comments in Hour 3 (hottest window of the conversation)
  4. Latest activity: Nov 24, 2025 at 5:34 PM EST (3h ago)

Discussion (0 comments)

Discussion hasn't started yet.

ID: 46036614 | Type: story | Last synced: 11/24/2025, 5:33:35 PM

© 2025 Not Hacker News! — independent Hacker News companion.

Not affiliated with Hacker News or Y Combinator. We simply enrich the public API with analytics.