Eliminating Cold Starts 2: Shard and Conquer
Posted 3 months ago · Active 3 months ago
blog.cloudflare.com · Tech · story
Sentiment: supportive, positive
Debate: 40/100
Key topics
Cloudflare
Serverless Computing
Performance Optimization
Cloudflare's engineering blog post on eliminating cold starts in their Workers platform sparks discussion on the effectiveness of their approach and the trade-offs of using a service that scales to zero.
Snapshot generated from the HN discussion
Discussion Activity
Moderate engagement
First comment: 3d after posting
Peak period: 10 comments in the 60-66h window
Avg / period: 4.5
Comment distribution: 18 data points (based on 18 loaded comments)
Key moments
1. Story posted: Sep 26, 2025 at 5:40 PM EDT (3 months ago)
2. First comment: Sep 29, 2025 at 5:54 AM EDT (3d after posting)
3. Peak activity: 10 comments in 60-66h, the hottest window of the conversation
4. Latest activity: Sep 30, 2025 at 12:45 AM EDT (3 months ago)
ID: 45391302 · Type: story · Last synced: 11/20/2025, 2:09:11 PM
It's very much still maturing as an offering. But it does exist!
I thought Lambda@Edge was going in this direction, but it's a slightly faster, more constrained version of Lambda with all the same potential downsides.
This is a curl request from my machine right now to an SSR React app hosted on a CF Worker:

```
DNS lookup:     0.296826s
Connect:        0.320031s
Start transfer: 2.710684s
Total:          2.710969s
```
Second request:

```
DNS lookup:     0.002970s
Connect:        0.015917s
Start transfer: 0.176399s
Total:          0.176621s
```
2.5 seconds difference.
Edit: you can kind of tell this from the connect timings listed above. TLS is faster the second time around, but not by enough to account for much of the overall speedup.
[1] https://blog.cloudflare.com/introducing-0-rtt/
First hit:

```
DNS Lookup:              0.026284s
Connect (TCP):           0.036498s
Time app connect (TLS):  0.059136s
Start Transfer:          1.282819s
Total:                   1.282928s
```
Second hit:

```
DNS Lookup:              0.003575s
Connect (TCP):           0.016697s
Time app connect (TLS):  0.032679s
Start Transfer:          0.242647s
Total:                   0.242733s
```
Metrics description:

- `time_namelookup`: the time, in seconds, from the start until name resolving was completed.
- `time_connect`: the time, in seconds, from the start until the TCP connect to the remote host (or proxy) was completed.
- `time_appconnect`: the time, in seconds, from the start until the SSL/SSH/etc. connect/handshake to the remote host was completed.
- `time_starttransfer`: the time, in seconds, from the start until the first byte was just about to be transferred. This includes `time_pretransfer` and also the time the server needed to calculate the result.
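For reference, timings like the ones quoted above can be reproduced with curl's `-w` write-out option; the URL below is a placeholder, not the commenter's actual endpoint:

```shell
# Print curl's timing breakdown for a single request.
# https://example.com/ is a placeholder; substitute your own endpoint.
curl -s -o /dev/null \
  -w 'DNS lookup:     %{time_namelookup}s
Connect (TCP):  %{time_connect}s
App connect:    %{time_appconnect}s
Start transfer: %{time_starttransfer}s
Total:          %{time_total}s
' https://example.com/
```

Running it twice in a row is the simplest way to compare a cold hit against a warm one.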
2.5 seconds seems way too long to be attributed to the Worker cold start alone.
My point is that a "cold start" is often more than just booting a VM instance.
And I've noticed that not everybody understands this. I used to have conversations in which people argued that there is no difference between deploying a web frontend to Cloudflare and to a stateful solution, because of this confusing advertising.
If you say, "If you are using this in production, 5-10/month is a real cost you need to pay, plus transactions," well, now the cost is about the same as deploying a fly.io shared-CPU container, which doesn't come with cold starts or vendor lock-in and can run as long as you want. Cloudflare knows that, so they don't want to introduce that charge, or even talk about it.