How Many HTTP Requests/second Can a Single Machine Handle? (2024)
Posted 4 months ago · Active 4 months ago
binaryigor.com · Tech · story
Key topics
Performance Optimization
Scalability
Server Architecture
The article explores how many HTTP requests a single machine can handle, sparking a discussion on the limits of single-machine scalability and the trade-offs between simplicity and performance.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
First comment: 23m
Peak period: 48 comments (0-6h)
Avg / period: 11.2
Comment distribution: 56 data points
Based on 56 loaded comments
Key moments
1. Story posted: Aug 31, 2025 at 2:10 PM EDT (4 months ago)
2. First comment: Aug 31, 2025 at 2:33 PM EDT (23m after posting)
3. Peak activity: 48 comments in 0-6h (hottest window of the conversation)
4. Latest activity: Sep 2, 2025 at 11:40 PM EDT (4 months ago)
ID: 45085446 · Type: story · Last synced: 11/20/2025, 5:57:30 PM
Many frameworks game those benchmarks. While there are some metrics they are useful for, I've yet to see any production code that's actually stripped down like the gamified versions there.
A low-end ARM processor (like a Raspberry Pi) can crank out 1000 requests a second with a CGI-style program handling the requests — using a single CPU core. Of course this doesn’t happen with traditional CGI. (Actual performance with traditional CGI will be more like 20-50/s or worse.)
Like the stereotypical drivers of such vehicles, the industry has become so fat and stupid that an x86 system handling 500 requests/sec actually sounds impressive. Sadly, considering the bloated nature of modern stacks, it kinda is.
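The arithmetic behind those figures is worth making explicit (my own back-of-envelope, not from the thread): per-core throughput is just the inverse of per-request CPU time, so a 1 ms handler yields ~1000 req/s on one core, while a traditional CGI request that burns tens of milliseconds on fork/exec and interpreter startup lands in the 20-50 req/s range.

```python
def max_throughput(cores: int, per_request_seconds: float) -> float:
    """Upper bound on requests/second when each request fully
    occupies one core for per_request_seconds of CPU time."""
    return cores / per_request_seconds

# 1 ms of CPU per request on a single core -> ~1000 req/s
fast = max_throughput(1, 0.001)
# Traditional CGI: ~50 ms of fork/exec + startup -> ~20 req/s
cgi = max_throughput(1, 0.050)
```

The numbers are illustrative; real servers also lose capacity to kernel networking and scheduling overhead.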
Honestly, unless you're bandwidth/uplink limited (e.g. running a CDN), a single machine will take you really far.
Also simpler systems tend to have better uptime/reliability. Doesn't get much simpler than a single box.
So when people say 1k is "highload" and requires a whole cluster, I'm not sure what to think of it. You can squeeze so much more out of a single fairly modest machine.
That's the other thing: AWS tends to have really dated SSDs.
Honestly, it's like the industry has jumped the shark. 1k is not a lot of load. It's like when people say a single writer means you can't be performant; it's the opposite: most of the time a single writer lets you batch, and batching is where the magic happens.
With a test like this, you're really testing two different things:
1. How fast your database is,
2. How fast your frontend is
Since the query is simple, your frontend is basically a DB access layer and should be taking no time. And since the table is indexed the query should also take no time.
The only other interesting question is whether the database can handle the number of connections and whether the storage can keep up. The app is using connection pools, but the actual size of the database machine is never mentioned, which is a problem. How big is the DB instance? A small instance could be crushed with 80 connections. A database on a hard drive may not be able to handle the load either (though since the data volume is small, it could be that everything ends up cached anyway).
So this is sort of interesting, but sort of not interesting.
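One way to sanity-check whether a connection count like 80 is even relevant (my addition, using Little's Law, not anything from the article): the average number of in-flight queries equals arrival rate times per-query latency.

```python
def in_flight_queries(requests_per_second: float, query_latency_seconds: float) -> float:
    """Little's Law: average concurrency = arrival rate x latency.
    If this number exceeds the pool size, requests queue for a connection."""
    return requests_per_second * query_latency_seconds

# 1000 req/s with 2 ms indexed lookups -> only ~2 connections busy on average
light = in_flight_queries(1000, 0.002)
# The same 1000 req/s with 80 ms queries would saturate an 80-connection pool
heavy = in_flight_queries(1000, 0.080)
```

So with the simple indexed query under test, even a modest pool should be far from the bottleneck unless latency spikes.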
Both the app and the db are hosted on the same machine - they are sharing resources. This fact, the type of storage, and other details of the setup are covered in this section: https://binaryigor.com/how-many-http-requests-can-a-single-m...
I think you're right that I didn't mention the details of the db connection pool; they are here: https://github.com/BinaryIgor/code-examples/blob/master/sing...
Long story short, there's a Hikari Connection Pool with initial 10 connections, resizable to 20.
Same with the db - I wanted to see what kind of load a system (not just an app) deployed to a single machine can handle.
It can obviously be optimized even further; I didn't try to do that in the article.
Suppose it takes 0.99s to send REQUESTS_PER_SECOND requests. Then you sleep for 1s. Result: You send REQUESTS_PER_SECOND requests every 1.99s. (If sending the batch of requests could take longer than a second, then the situation gets even worse.)
The issue GP has with app and DB on the same box is a red herring -- that was explicitly the condition under test.
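The drift described above can be avoided by sleeping until a fixed deadline rather than for a fixed duration; a minimal sketch (names are mine, not from the article's load generator):

```python
import time

def run_batches(total_batches: int, send_batch) -> None:
    """Open-loop pacing: schedule batch n at t0 + n seconds.
    Sleeping a full second *after* each batch would stretch the
    cycle to (batch time + 1s), as described above."""
    t0 = time.monotonic()
    for n in range(total_batches):
        send_batch()
        # Sleep only until the next whole-second deadline.
        delay = (t0 + n + 1) - time.monotonic()
        if delay > 0:
            time.sleep(delay)
```

If a batch overruns its one-second slot, `delay` goes negative and the loop simply proceeds, so the generator falls behind rather than silently halving the offered rate.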
it's fine, we all went thru these gauntlets, but, if you're interested in learning more, take all of this feedback in good faith, and compare/contrast what your tool is doing vs. what sound/valid load testing tools like vegeta and hey and (maybe) k6 do. (but definitely not ab or wrk, which are unsound)
and, furthermore, if the application and DB are co-located on the same machine, you're co-mingling service loads, and definitely not measuring or capturing any kind of useful load numbers, in the end
tl;dr is that these benchmarks/results are ultimately unsound, it's not about optimization, it's about validity
if you want to benchmark the application, then either you (a) mock the DB at as close to 0 cost as you can, or (b) point all application endpoints to the same shared (separate-machine) DB instance, and make sure each benchmark run executes exactly the same set of queries against a DB instance that is 100% equivalent to the other runs, resetting in-between each run
Tests, on the other hand, were executed on multiple different machines - it's all described in the article. Sleep works properly, because there's an unbounded thread pool that makes the HTTP requests - each request has its own virtual thread.
That’s barely more than a raspberry pi? (4 vs 8 cores) Huge machines today have 20+ TBs of RAM and hundreds of cores. Even top-end consumer machines can have 512GB of RAM!
I do agree with the author that single machines can scale far beyond what most orgs / companies need, but I think they may be underestimating how far that goes by orders-of-magnitude
In 8 years, Ryzen went from 1166 geekbench 6 single core to 3398.
Single core perf doubled every 8 years, multicore every 6 years, and GPUs every 3 years !
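From the Geekbench figures quoted above, the implied single-core doubling time can be checked directly (my computation, assuming exponential growth over the stated period):

```python
import math

def doubling_time(years: float, old_score: float, new_score: float) -> float:
    """Years for performance to double, given observed growth over a period."""
    return years * math.log(2) / math.log(new_score / old_score)

# Ryzen single-core Geekbench 6: 1166 -> 3398 over 8 years
ryzen = doubling_time(8, 1166, 3398)  # ~5.2 years
```

By this measure the recent single-core trend quoted above is somewhat faster than one doubling per 8 years.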
Is this common? Why not use the local filesystem? Actually, I thought that using anything else beyond the local filesystem for the database is a no-no. Am I missing something?
Block storage is meant to be reliable, so databases go there. Yes it's slower but you don't lose data.
Generally, the only time you want a local database in the cloud is if it's being used for short-lived data meaningful only to that particular instance in time.
Or it can work if your database rarely changes and you make regular backups that are easy to revert to, like for a blog.
Databases with high availability and robust storage were possible before the cloud.
I'm not saying it can't be done. But block storage is built for reliability in a way that ephemeral instances are not. There's a good reason why every guide will tell you to set your database up on block storage rather than an instance's local disk. If your instance fails, just spin up another instantly and reconnect to the same block storage.
Pre-cloud, the equivalent would have been using redundant RAID storage to handle disk failures (easy), before upgrading to replication with an always-running replica (harder).
I know that you can have significantly bigger machines; network-mounted DB storage, on the other hand, is not slow - it's designed specifically for these kinds of use cases.
https://en.wikipedia.org/wiki/C10k_problem
Also, it always feels like I need a second instance at the very least for redundancy, but then we have to ensure they're stateless and that batch jobs are sharded across them (or only run on one), and again we hit an architecture explosion. I wish I were more comfortable just dropping a single Spring Boot instance on a VM and calling it a day; Spring Boot has a lot of bells and whistles and you can get pretty far without the architecture explosion, but it is almost inevitable.
Obviously at high load (1k TPS+), running on servers is way cheaper than serverless, so the tradeoff can start to swing.
It is not all that hard to hit 10k requests/second on modern hardware. 100k requests/second is achievable with some careful technology choices.
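As a baseline for such measurements, here is a deliberately minimal, stdlib-only server sketch you can point a load generator at (a threaded Python server will not itself reach 10k/s; that ceiling is what careful technology choices raise):

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        # Silence per-request logging; at high request rates it
        # would dominate the measured cost.
        pass

def serve() -> ThreadingHTTPServer:
    # Port 0 lets the OS pick a free port; srv.server_address reports it.
    srv = ThreadingHTTPServer(("127.0.0.1", 0), HelloHandler)
    threading.Thread(target=srv.serve_forever, daemon=True).start()
    return srv
```

Pointing a sound load tester (e.g. vegeta or hey, as mentioned above) at this gives a floor to compare a real application against.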
Use One Big Server (2022) - https://news.ycombinator.com/item?id=45085029 - Aug 2025 (61 comments)
A picture would have been worth quite a bit more than a thousand words.
This is an incredibly naive article.