11/13/2025, 1:55:30 PM

When Reverse Proxies Surprise You: Hard Lessons from Operating at Scale

96 points
11 comments

Mood: thoughtful
Sentiment: positive
Category: tech
Key topics: reverse proxies, scalability, system administration
Debate intensity: 20/100

The article discusses lessons learned from operating reverse proxies at scale, and the discussion highlights key takeaways and shares related experiences.

Snapshot generated from the HN discussion

Discussion Activity

Moderate engagement
First comment: 5d after posting
Peak period: 6 comments in Day 6
Avg / period: 5.5
Comment distribution: 11 data points, based on 11 loaded comments

Key moments

  1. Story posted: 11/13/2025, 1:55:30 PM (6d ago)
  2. First comment: 11/18/2025, 9:46:32 AM (5d after posting)
  3. Peak activity: 6 comments in Day 6 (hottest window of the conversation)
  4. Latest activity: 11/18/2025, 9:15:25 PM (22h ago)

Discussion (11 comments)
whstl
1d ago
2 replies
It's nice to see someone else preaching this:

> Production Lesson: Never let exceptions dictate the norm. Handle them explicitly, in isolated paths or tiers, instead of polluting the mainline logic. What looks like "flexibility" is often just deferred fragility waiting to surface at scale.

I've seen this pattern far too often in production systems. In the name of "covering edge cases", a huge amount of complexity is moved into configuration languages, interfaces, APIs, etc., to be more flexible. Not only does this fail to free up developers' time (because it overcomplicates everything), it also makes things worse for the users of such structures. We already have something "flexible": source code itself. No need to reinvent the wheel.

immibis
1d ago
2 replies
whstl
1d ago
1 reply
I wish people would realize that moving back to code is possible, though.

It rarely happens because at this point the codebase is so littered with problems that things start requiring long QA, code freezes and once-a-month deployments, and it's impossible to get anything done.

dottedmag
1d ago
[delayed]
marcosdumay
1d ago
1 reply
Config values and a configurable plugins system completely solve the problem, dominating over the entire clock.

Iterating further from config values is a great predictor that a project will become a disaster to use, and probably fail completely.

btown
1d ago
1 reply
Ah, but what happens when your plugins need to themselves be configured for different client deployments?

You add a few flags, then you need to figure out backwards compatibility as your plugin evolves (which involves defining prioritization rules between options), then those rules get complex enough to have conditionals (say, for granular traffic patterns), which means you have a DSL. And when the DSL gets complex enough, it needs an entire Software Development Lifecycle, which means it's effectively hard-coded. Or, you have people fork the plugin, which is a hard-code in and of itself.

All in all, you don't avoid the "configurability clock," you just decentralize it!

The real problem is that clients inevitably have conflicting needs that cut across any modularization barriers you might think to build. When a configured plugin can have spooky action at a distance, perhaps under-tested due to configuration, is it truly modular? Thus, the clock emerges.

marcosdumay
22h ago
You do multiple plugins or use constant configuration values for them. That's why you want plugins, for putting all complex stuff in actual code that doesn't have to live with the main product.

That doesn't decentralize the clock, it gives a maximum capable interface for the few people that need to handle exceptional cases, and a minimally capable one to the people that just want to use your software as is. That is, you make the product live on two opposite values of the clock at the same time.
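A minimal sketch of the split described in this comment, assuming a toy `Proxy` class: a few plain config values form the minimal interface, and plugins written as ordinary code form the maximal one. Every name here (`Proxy`, `route`, `tenant_override`, the hostnames) is hypothetical and chosen only for illustration; none of it comes from the article or from any real proxy.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Request = Dict[str, str]
Plugin = Callable[[Request], Request]


@dataclass
class Proxy:
    # Minimal interface: a handful of plain config values.
    upstream: str = "http://127.0.0.1:8000"
    timeout_seconds: float = 5.0
    # Maximal interface: arbitrary code that lives outside the core.
    plugins: List[Plugin] = field(default_factory=list)

    def route(self, request: Request) -> Request:
        # Run each plugin in order, then attach the configured upstream.
        for plugin in self.plugins:
            request = plugin(request)
        return {**request, "upstream": self.upstream}


def tenant_override(request: Request) -> Request:
    """Hypothetical plugin: one client's exceptional case, kept in real code."""
    if request.get("host") == "legacy.example.com":
        return {**request, "path": "/v1" + request["path"]}
    return request


# Most users only touch config values; the exceptional client adds a plugin.
default_proxy = Proxy()
special_proxy = Proxy(plugins=[tenant_override])
```

In this sketch the core never grows conditionals for one client's needs; whether that holds up once plugins themselves need per-deployment configuration is exactly the objection raised earlier in the thread.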

nijave
1d ago
I see something similar with AI-generated code, where it tries much too hard to handle all the exceptions and ends up swallowing or obfuscating them instead of making things more reliable. Claude seems particularly bad unless you prompt it to minimize complexity.
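To make the contrast concrete, here is a small sketch of the difference between swallowing every exception and handling only the expected one. The function names and the default port `8080` are assumptions made up for this example.

```python
import json


def parse_port_swallowing(raw: str) -> int:
    """Anti-pattern: every failure collapses into a silent default."""
    try:
        return int(json.loads(raw)["port"])
    except Exception:
        # Malformed JSON, a missing key, and a non-numeric value all
        # look identical to the caller.
        return 8080


def parse_port_explicit(raw: str) -> int:
    """Handle the one expected case; let everything else surface."""
    config = json.loads(raw)  # malformed input raises json.JSONDecodeError
    try:
        return int(config["port"])
    except KeyError:
        # The only case treated as normal here: the key is absent.
        return 8080
```

With the second version, a corrupted config fails loudly at load time instead of quietly starting the service on a default port.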
stacktrace
1d ago
Very interesting read! But I want to point out a small correction - the DNS collapse issue at HAProxy, along with O(N^2), also had some O(N^3) code paths, which is just mind-blowing.

Also, I believe this should be the correct GitHub issue link - https://github.com/haproxy/haproxy/issues/1404

> Production Lesson: Code that "works fine" at small scale may still hide O(N²) or worse behavior. At hundreds or thousands of nodes, those costs stop being theoretical and start breaking production.
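As a generic illustration of how quadratic cost hides in ordinary-looking code, the sketch below matches servers to resolved records in two ways. It is not HAProxy's code; the function names and the data shapes (server names, `(name, address)` record tuples) are assumptions for this example.

```python
def assign_quadratic(servers, records):
    """For each server, scan the record list: O(N * M) comparisons."""
    assigned = {}
    for server in servers:
        for name, address in records:
            if name == server:
                assigned[server] = address
                break
    return assigned


def assign_linear(servers, records):
    """Index the records once, then do one dict lookup per server."""
    index = {}
    for name, address in records:
        # setdefault keeps the first address seen, matching the scan above.
        index.setdefault(name, address)
    return {server: index[server] for server in servers if server in index}
```

Both functions return the same mapping. With ten servers the difference is invisible; with thousands of servers and records, the nested scan performs millions of comparisons on every refresh.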

bell-cot
1d ago
Re-sort the takeaway points, to put this one first:

> Prioritize human factors. Outage recovery depends on what operators can see and do under stress. When dashboards fail, clear logs, simple commands, and predictable behavior matter more than complex mechanisms.

Why - to make it really, really clear to bullet-skimming managers and complexity-loving engineers that too-clever "solutions", and just-an-afterthought "testing & training", and poorly documented configurations will turn into worlds of pain when things really go wrong. The "smart people" won't be in the Operations Center then. Let alone with all the details fresh in their minds. And several of them may have taken jobs elsewhere, to not much care if the org is desperate for their help right now.

dwedge
23h ago
The engineer killing the proxy because they assumed processes running as "nobody" were stray (whatever that means - processes without a parent don't change username, and nobody doesn't mean no username) doesn't belong in that list. That was just an engineer out of their depth (I assume one used to dealing with other systems)
ID: 45914929 | Type: story | Last synced: 11/19/2025, 7:11:53 PM