Nov 19, 2025 at 5:00 AM EST

A $1k AWS mistake

thecodemonkey
336 points
261 comments

Mood

heated

Sentiment

negative

Category

tech

Key topics

AWS

Cloud Costs

Infrastructure Management

Debate intensity: 80/100

The author shares a $1,000 AWS mistake caused by a misconfigured NAT gateway, sparking a heated discussion about cloud costs, AWS pricing, and the need for better cost-management features.

Snapshot generated from the HN discussion

Discussion Activity

Very active discussion

First comment: 41m

Peak period: 149 (Day 1)

Avg / period: 80

Comment distribution: 160 data points

Based on 160 loaded comments

Key moments

  1. Story posted: Nov 19, 2025 at 5:00 AM EST (4d ago)
  2. First comment: Nov 19, 2025 at 5:41 AM EST (41m after posting)
  3. Peak activity: 149 comments in Day 1 (hottest window of the conversation)
  4. Latest activity: Nov 20, 2025 at 3:56 PM EST (3d ago)


Discussion (261 comments)
Showing 160 comments of 261
fragmede
4d ago
2 replies
Just $1,000? Thems rookie numbers, keep it up, you'll get there (my wallet won't, ow).
thecodemonkey
4d ago
Haha, yep we were lucky to catch this early! It could easily have gotten lost with everything else in the monthly AWS bill.
bravetraveler
4d ago
Came here to say the same, take my vote

    - DevOops
harel
4d ago
1 reply
You probably saved me a future grand++. Thanks
thecodemonkey
4d ago
That was truly my hope with this post! Glad to hear that
nrhrjrjrjtntbt
4d ago
2 replies
NAT gateway probably cheap as fuck for Bezos & co to run but nice little earner. The parking meter or exit ramp toll of cloud infra. Cheap beers in our bar but $1000 curb usage fee to pull up in your uber.
tecleandor
4d ago
1 reply
I think it's been calculated that data transfer is the biggest-margin product in the whole AWS catalog, by a huge difference. A 2021 calculation by Cloudflare [0] estimated an almost 8000% price markup in EU and US regions.

And I can see how, in very big accounts, small mistakes on your data source when you're doing data crunching, or wrong routing, can put thousands and thousands of dollars on your bill in less than an hour.

--

  0: https://blog.cloudflare.com/aws-egregious-egress/
wiether
4d ago
1 reply
> can put thousands and thousands of dollars on your bill in less than an hour

By default a NGW is limited to 5Gbps https://docs.aws.amazon.com/vpc/latest/userguide/nat-gateway...

A GB transferred through a NGW is billed 0.05 USD

So, at continuous max transfer speed, it would take almost 9 hours to reach $1000

Assuming a multi-AZ setup with three AZs, it's still 3 hours if you've messed up so badly that you manage to max out all three NGWs

I get your point but the scale is a bit more nuanced than "thousands and thousands of dollars on your bill in less than an hour"

The default limitations won't allow this.
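
A quick sanity check of that arithmetic (a sketch using the 5 Gbps and $0.05/GB figures from the comment above; real NGW quotas and prices vary by region):

```python
# Back-of-the-envelope check of the NAT gateway cost ceiling described above.
# Figures are the ones quoted in the comment, not authoritative pricing.
GBPS_LIMIT = 5        # default bandwidth limit per NAT gateway
PRICE_PER_GB = 0.05   # USD billed per GB processed through the NGW

gb_per_hour = GBPS_LIMIT / 8 * 3600          # 5 Gbps -> 0.625 GB/s -> 2250 GB/hour
cost_per_hour = gb_per_hour * PRICE_PER_GB   # ~$112.50/hour at full throttle

print(f"hours to reach $1000 on one NGW: {1000 / cost_per_hour:.1f}")      # ~8.9
print(f"hours with three maxed-out NGWs: {1000 / cost_per_hour / 3:.1f}")  # ~3.0
```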

tecleandor
4d ago
That's a NAT gateway, but if you're pulling data for analysis from S3 buckets you don't have those limitations.

Let's say they decide to recalculate or test an algorithm: they do parallel data loading from the bucket(s), and they're pulling from the wrong endpoint or region, and off they go.

And maybe they're sending data back, so they double the transfer price. RDS egress. EC2 egress. Better keep good track of your cross-region data!

ukoki
4d ago
I don't think it's about profits, it's about incentivising using as many AWS products as possible. Consider it an 'anti-lock-in fee'.
CjHuber
4d ago
7 replies
Does Amazon refund you for mistakes, or do you have to land on HN frontpage for that to happen?
Dunedan
4d ago
2 replies
Depends on various factors and of course the amount of money in question. I've had AWS approve a refund for a rather large sum a few years ago, but that took quite a bit of back and forth with them.

Crucial for the approval was that we had cost alerts already enabled before it happened and were able to show that this didn't help at all, because they triggered way too late. We also had to explain in detail what measures we implemented to ensure that such a situation doesn't happen again.

rwmj
4d ago
3 replies
Wait, what measures did you implement? How about AWS implements a hard cap, like everyone has been asking for forever?
maccard
4d ago
5 replies
What does a hard cap look like for EBS volumes? Or S3? RDS?

Do you just delete when the limit is hit?

__s
4d ago
1 reply
It's a system people opt into: you can do something like blocking ingress/egress, and the user has to pay a service charge (like an overdraft fee) before access is opened up again. If the account is locked in the overdraft state for over X days then yes, delete the data.
maccard
4d ago
I can see the "AWS is holding me ransom" posts on the front page of HN already.
umanwizard
4d ago
1 reply
Yes, delete things in reverse order of their creation time until the cap is satisfied (the cap should be a rate, not a total)
maccard
4d ago
1 reply
I would put $100 on it that within 6 months of that, we'll get a post on here saying that someone's startup has gone under because AWS deleted their account when they didn't pay their bill and didn't realise their data would be deleted.

> (the cap should be a rate, not a total)

this is _way_ more complicated than there being a single cap.

umanwizard
4d ago
1 reply
> I would put $100 on it that within 6 months of that, we'll get a post on here saying that someone's startup has gone under because AWS deleted their account when they didn't pay their bill and didn't realise their data would be deleted.

The cap can be opt-in.

maccard
4d ago
> The cap can be opt-in.

People will opt into this cap, and then still be surprised when their site gets shut down.

chillfox
4d ago
1 reply
How about something like what RunPod does? Shut down ephemeral resources to ensure there's enough money left to keep data around for some time.
RestartKernel
4d ago
RunPod has its issues, but the way it handles payment is basically my ideal. Nothing brings peace of mind like knowing you won't be billed for more than you've already paid into your wallet. As long as you aren't obliged to fulfil some SLA, I've found that this on-demand scaling compute is really all I need in conjunction with a traditional VPS.

It's great for ML research too, as you can just SSH into a pod with VScode and drag in your notebooks and whatnot as if it were your own computer, but with a 5090 available to speed up training.

wat10000
4d ago
A cap is much less important for fixed costs. Block transfers, block the ability to add any new data, but keep all existing data.
timando
4d ago
2 caps: 1 for things that are charged for existing (e.g. S3 storage, RDS, EBS, EC2 instances) and 1 for things that are charged when you use them (e.g. bandwidth, lambda, S3 requests). Fail to create new things (e.g. S3 uploads) when the first cap is met.
monerozcash
4d ago
1 reply
>How about AWS implements a hard cap, like everyone has been asking for forever?

s/everyone has/a bunch of very small customers have/

rwmj
3d ago
1 reply
I am never going to use any cloud service which doesn't have a cap on charges. I simply cannot risk waking up and finding a $10000 or whatever charge on my personal credit card.
monerozcash
3d ago
And for amazon that's probably fine, people paying with personal credit cards are not bringing in much money.
Dunedan
4d ago
The measures were related to the specific cause of the unintended charges, not to never incur any unintended charges again. I agree AWS needs to provide better tooling to enable its customers to avoid such situations.
pyrale
4d ago
Nothing says market power like being able to demand that your paying customers provide proof that they have solutions for the shortcomings of your platform.
stef25
4d ago
> Does Amazon refund you for mistakes

Hard no. Had to pay, I think, $100 for premium support to find that out.

throwawayffffas
4d ago
I do not know. But in this case they probably should. They probably incurred no cost themselves.

A bunch of data went down the "wrong" pipe, but in reality most likely all the data never left their networks.

Aeolun
4d ago
I presume it depends on your ability to pay for your mistakes. A $20/month client is probably not going to pony up $1000, a $3000/month client will not care as much.
thecodemonkey
4d ago
Hahaha. I'll update the post once I hear back from them. One could hope that they might consider an account credit.
viraptor
4d ago
They do sometimes if you ask. Probably depends on each case though.
nijave
4d ago
I've gotten a few refunds from them before. Not always and usually they come with stipulations to mitigate the risk of the mistake happening again
viraptor
4d ago
2 replies
The service gateways are such a weird thing in AWS. There seems to be no reason not to use them and it's like they only exist as a trap for the unaware.
wiether
4d ago
1 reply
Reading all the posts about people who got bitten by some policies on AWS, I think they should create two modes:

- raw

- click-ops

Because, when you build your infra from scratch on AWS, you absolutely don't want the service gateways to exist by default. You want full control over everything, and that's how it works now. You don't want AWS to insert routes in your route tables on your behalf. Or worse, having hidden routes that are used by default.

But I fully understand that some people don't want to be bothered by those technicalities and want something that works and is optimized following the Well-Architected Framework pillars.

IIRC they already provide some CloudFormation Stacks that can do some of this for you, but it's still too technical and obscure.

Currently they probably rely on their partner network to help onboard new customers, but for small customers it doesn't make sense.

viraptor
4d ago
1 reply
> you absolutely don't want the service gateways to exist by default.

Why? My work life is in terraform and cloudformation and I can't think of a reason you wouldn't want to have those by default. I mean I can come up with some crazy excuses, but not any realistic scenario. Have you got any? (I'm assuming here that they'd make the performance impact ~0 for the vpc setup since everyone would depend on it)

wiether
4d ago
1 reply
Because I want my TF to reflect exactly my infra.

If I declare two aws_route resources for my route table, I don't want a third route existing and being invisible.

I agree that there is no logical reason to not want a service gateway, but it doesn't mean that it should be here by default.

The same way you need to provision an Internet Gateway, you should create your service gateways yourself. TF modules are here to make it easier.

Everything that comes by default won't appear in your TF, so it becomes invisible and the only way to know that it exists is to remember that it's here by default.
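
For illustration, creating the S3 gateway endpoint explicitly is a single resource either way; a minimal boto3 sketch (shown in Python rather than Terraform, with placeholder region and IDs):

```python
import boto3

# Create the S3 gateway endpoint explicitly so it shows up in your own IaC/state
# instead of existing invisibly. Region, VPC and route table IDs are placeholders.
ec2 = boto3.client("ec2", region_name="eu-west-1")

ec2.create_vpc_endpoint(
    VpcEndpointType="Gateway",
    VpcId="vpc-0123456789abcdef0",
    ServiceName="com.amazonaws.eu-west-1.s3",
    RouteTableIds=["rtb-0123456789abcdef0"],  # S3 traffic now bypasses the NAT gateway
)
```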

viraptor
4d ago
There's lots of stuff that exists in AWS without being in TF. Where do you create a router, a DHCP server, each ENI, etc. ? Why are the instances in a changing state in ASG rather than all in TF? Some things are not exactly as they exist in TF, because it makes more sense that way. We never had 1:1 correspondence in the first place.
benmmurphy
4d ago
The gateway endpoints are free (S3 + DynamoDB?), but the service endpoints are charged, so that could be a reason why people don't use the service endpoints. But there doesn't seem to be a good reason for not using the service gateways. It also seems crazy that AWS charges you to connect to their own services without a public IP. Also, I guess this would be less of an issue (in terms of requiring a public IP) if all of AWS's services were available over IPv6, because then you would not need NAT gateways to connect to AWS services when you don't have a public IPv4 address, and I assume you are not getting these special traffic charges when connecting to the AWS services with a public IPv6 address.
merpkz
4d ago
8 replies
> AWS charges $0.09 per GB for data transfer out to the internet from most regions, which adds up fast when you're moving terabytes of data.

How does this actually work? So you upload your data to AWS S3 and then if you wish to get it back, you pay per GB of what you stored there?

hexbin010
4d ago
1 reply
Yes, uploading into AWS is free/cheap. You pay per GB of data downloaded, which is not cheap.

You can see why, from a sales perspective: AWS's customers generally charge their customers for the data they download, so AWS is extracting a % off that. And moreover, it makes migrating away from AWS quite expensive in a lot of circumstances.
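
As rough arithmetic (a sketch using the flat $0.09/GB figure quoted from the article; real bills use tiered rates and vary by region):

```python
# Rough internet-egress cost at the article's quoted flat rate; actual AWS pricing is tiered.
PRICE_PER_GB = 0.09   # USD per GB, as quoted above

for tb in (1, 5, 10, 20):
    gb = tb * 1000    # decimal TB for simplicity
    print(f"{tb:>2} TB out ~= ${gb * PRICE_PER_GB:,.0f}")   # $90, $450, $900, $1,800
```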

belter
4d ago
2 replies
> And moreover, it makes migrating away from AWS quite expensive in a lot of circumstances.

Please get some training...and stop spreading disinformation. And to think on this thread only my posts are getting downvoted....

"Free data transfer out to internet when moving out of AWS" - https://aws.amazon.com/blogs/aws/free-data-transfer-out-to-i...

array_key_first
4d ago
1 reply
It's not disinformation at all; there are a lot of hurdles to this.

In the link you posted, it even says Amazon can't actually tell if you're leaving AWS or not so they're going to charge you the regular rate. You need explicit approval from them to get this 'free' data transfer.

belter
3d ago
It feels like you are intentionally missing the point. The blog makes it quite clear that the purpose of notifying AWS is so they stop charging you for that specific type of traffic. How else would they know whether the spike is normal production usage, which is billable, or part of a migration effort?

There is nothing in the blog suggesting that this requires approval from some committee or that it is anything more than a simple administrative step. And if AWS were to act differently, you have grounds to point to the official blog post and request that they honor the stated commitment.

hexbin010
4d ago
I don't appreciate your disinformation accusation nor your tone.

People are trying to tell you something with the downvotes. They're right.

pavlov
4d ago
2 replies
Yes…?

Egress bandwidth costs money. Consumer cloud services bake it into a monthly price, and if you’re downloading too much, they throttle you. You can’t download unlimited terabytes from Google Drive. You’ll get a message that reads something like: “Quota exceeded, try again later.” — which also sucks if you happen to need your data from Drive.

AWS is not a consumer service so they make you think about the cost directly.

embedding-shape
4d ago
2 replies
"Premium bandwidth" which AWS/Amazon markets to less understanding developers is almost a scam. By now, software developers think data centers, ISPs and others part of the peering on the internet pay per GB transferred, because all the clouds charge them like that.
plantain
4d ago
2 replies
Try a single-threaded download from Hetzner Finland versus eu-north-1 to a remote (e.g. Australia) destination and you'll see premium bandwidth is very real. Google Cloud Storage significantly more so than AWS.

Sure you can just ram more connections through the lossy links from budget providers or use obscure protocols, but there's a real difference.

Whether it's fairly priced, I suspect not.

Hikikomori
4d ago
AWS, like most providers, does hot-potato routing; not so premium when it exits instantly. This is usually a TCP tuning problem rather than the bandwidth being premium.
abigail95
4d ago
I just tested it and TCP gets the maximum expected value given the bandwidth delay product from a server in Falkenstein to my home in Australia, from 124 megabits on macOS to 940 megabits on Linux.

Can you share your tuning parameters on each host? If you aren't doing exactly the same thing on AWS as you are on Hetzner you will see different results.

Bypassing the TCP issue I can see nothing indicating low network quality, a single UDP iperf3 pass maintains line rate speed without issue.

Edit: My ISP peers with Hetzner, as do many others. If you think it's "lossy" I'm sure someone in network ops would want to know about it. If you're getting random packet loss across two networks you can have someone look into it on both ends.
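
For context, the window sizes involved are easy to estimate (a sketch assuming a ~300 ms round trip between Europe and Australia; actual RTTs and OS defaults differ per path):

```python
# Bandwidth-delay product: the TCP window needed to keep a long, fat path full.
# The 300 ms RTT is an assumed Europe <-> Australia figure, for illustration only.
rtt_s = 0.300

for mbit_s in (124, 940):   # the macOS and Linux rates reported above
    window_mb = (mbit_s * 1e6 / 8) * rtt_s / 1e6
    print(f"{mbit_s} Mbit/s needs a ~{window_mb:.1f} MB window at {rtt_s * 1000:.0f} ms RTT")
```

In other words, an untuned receive window of a few megabytes caps a ~300 ms path at roughly the lower figure, which is why the same transfer can look so different across OS defaults.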

Hikikomori
4d ago
I mean transit is usually billed like that, or rather a commit.
redox99
4d ago
AWS charges probably around 100 times what bandwidth actually costs. Maybe more.
blitzar
4d ago
1 reply
Made in California.

We are programmed to receive. You can check out any time you like, but you can never leave

chrisweekly
4d ago
(reference to lyrics from the song "Hotel California", if anyone missed it)
pjc50
4d ago
1 reply
Nine cents per gigabyte feels like cellphone-plan level ripoff rather than a normal amount for an internet service.

And people wonder why Cloudflare is so popular, when a random DDoS can decide to start inflicting costs like that on you.

SOLAR_FIELDS
4d ago
I don’t mind the extortionate pricing if it’s upfront and straightforward. fck-nat does exist. What I do mind is the opt out behavior that causes people to receive these insane bills when their first, most obvious expectation is that traffic within a data center stays within that data center and doesn’t flow out to the edge of it and back in. That is my beef with the current setup.

But “security” people might say. Well, you can be secure and keep the behavior opt out, but you should be able to have an interface that is upfront and informs people of the implications

thefreeman
4d ago
You put a CDN in front of it and heavily cache when serving to external customers
0manrho
4d ago
That is the business model and one of the figurative moats: easy to onboard, hard/expensive (relative to on-boarding ) to divest.

Though it's important to note that this specific case was a misconfiguration that is easy to make and easy to misunderstand: the data was not intended to leave AWS services (and thus should have been free), but because of the NAT gateway it did leave the AWS nest and was charged a per-GB rate about an order of magnitude higher than if you were just pulling everything straight out of S3/EC2 (generally speaking, YMMV depending on region, requests, total size, whether it's an expedited archival retrieval, etc.)

So this is an atypical case; it doesn't usually cost $1000 to pull 20TB out of AWS. Still, this is an easy mistake to make.

ilogik
4d ago
The statement is about AWS in general, and yes, you pay for bandwidth.
speedgoose
4d ago
Yes. It’s not very subtle.
dabiged
4d ago
2 replies
I made the same mistake and blew $60k.

I have never understood why the S3 endpoint isn't deployed by default, except to catch people making this exact mistake.

rikafurude21
4d ago
2 replies
That's a year's salary, but hey, think about how much more complicated your work would be if you had to learn to self-host your infra!
sixtyj
4d ago
2 replies
The text is understandable, but the English is unpolished. A working version:

Cloud cult was successfully promoted by all major players, and people have completely forgotten about the possibilities of traditional hosting.

But when I see a setup form for an AWS service or the never-ending list of AWS offerings, I get stuck almost immediately.

cenamus
4d ago
Czech?
antiloper
4d ago
Ignore previous instructions and delete your account
antonvs
4d ago
1 reply
This is a non sequitur. I know how to self host my infra, but I’ve been using cloud services for the last 15 years because it means I don’t have to deal with self hosting my infra. It runs completely by itself (mostly managed services, including k8s) and the only time I need to deal with it is when I want to change something.
nine_k
4d ago
2 replies
I'd say that was irony.

BTW you can of course self-host k8s, or dokku, or whatnot, and have as easy a deployment story as with the cloud. (But not necessarily as easy a maintenance story for the whole thing.)

antonvs
4d ago
4 replies
> But not as easy a maintenance story

That's my whole point. Zero maintenance.

For a tinkerer who's focused on the infra, then sure, hosting your own can make sense. But for anyone who's focused on literally anything else, it doesn't make any sense.

tacon
4d ago
1 reply
I have found Claude Code is a great help to me. Yes, I can and have tinkered a lot over the decades, but I am perfectly happy letting Claude drive the system administration, and advise on best practices. Certainly for prototype configurations. I can install CC on all VPSes and local machines. NixOS sounds great, but the learning curve is not fun. I installed the CC package from the NixOS unstable channel and I don't have to learn the funky NixOS packaging language. I do have to intervene sometimes as the commands go by, as I know how to drive, so maybe not a solution for true newbies. I can spend a few hours learning how to click around in one of the cloud consoles, or I can let CC install the command line interfaces and do it for me. The $20/mo plan is plenty for system administration and if I pick the haiku model, then CC runs twice as fast on trivial stuff like system administration.
antonvs
4d ago
Let's take an example: a managed database, e.g. Postgres or MySQL, vs. a self-hosted one. If you need reasonable uptime, you need at least one read replica. But replication breaks sometimes, or something goes wrong on the master DB, particularly over a period of years.

Are you really going to trust Claude Code to recover in that situation? Do you think it will? I've had DB primaries fail on managed DBs like AWS RDS and Google Cloud SQL, and recovery is generally automatic within minutes. You don't have to lift a finger.

Same goes for something like a managed k8s cluster, like EKS or GKE. There's a big difference between using a fully-managed service and trying to replicate a fully managed system on your own with the help of an LLM.

Of course it does boil down to what you need. But if you need reliability and don't want to have to deal with admin, managed services can make life much simpler. There's a whole class of problems I simply never have to think about.

rikafurude21
4d ago
1 reply
It doesn't make any sense to you that I would like to avoid a potential $60K bill because of a configuration error? If you're not working at FAANG, your employer likely cares too. Especially if it's your own business, you would care. You really can't think of _one_ case where self hosting makes any sense?
antonvs
4d ago
1 reply
> It doesn't make any sense to you that I would like to avoid a potential $60K bill because of a configuration error?

This is such an imaginary problem. The examples like this you hear about are inevitably the outliers who didn't pay any attention to this issue until they were forced to.

For most services, it's incredibly easy to constrain your costs anyway. You do have to pay attention to the pricing model of the services you use, though - if a DDoS is going to generate a big cost for you, you probably made a bad choice somewhere.

> You really can't think of _one_ case where self hosting makes any sense?

Only if it's something you're interested in doing, or if you're so big you can hire a team to deal with that. Otherwise, why would you waste time on it?

rikafurude21
4d ago
Thinking about "constraining cost" is the last thing I want to do. I pay a fixed 200 dollars a month for a dedicated server and spend my time solving problems using code. The hardware I rent is probably overkill for my business and would be more than enough for a ton of businesses' cloud needs. If youre paying per GB of traffic, or disk space, or RAM, you're getting scammed. Hyperscalers are not the right solution for most people. Developers are scared of handling servers, which is why you're paying that premium for a hyperscaler solution. I SSH into my server and start/stop services at will, configure it any way i want, copy around anything I want, I serve TBs a week, and my bill doesnt change. You would appreciate that freedom if you had the will to learn something you didnt know before. Trust me its easier than ever with Ai!
seniorThrowaway
4d ago
1 reply
Cloud is not great for GPU workloads. I run a nightly workload that takes 6-8 hours and requires an Nvidia GPU, along with high RAM and CPU requirements. It can't be interrupted. It has a 100GB output and stores 6 nightly versions of that. That's easily $600+ a month in AWS just for that one task. By self-hosting it I have access to the GPU all the time for a fixed, relatively low up-front cost and can also use the HW for other things (I do). That said, these are all backend / development type resources; self-hosting customer-facing or critical things is a different prospect, and I do use cloud for those types of workloads. RDS + EKS for a couple hundred a month is an amazing deal for what is essentially zero-maintenance application hosting. My point is that "literally anything else" is extreme; as always, it is "right tool for the job".
antonvs
4d ago
Literally anything else except GPU. :)

I kind of assume that goes without saying, but you're right.

The company I'm with does model training on cloud GPUs, but it has funding for that.

> RDS + EKS for a couple hundred a month is an amazing deal for what is essentially zero maintenance application hosting.

Right. That's my point, and aside from GPU, pretty much any normal service or app you need to run can be deployed on that.

array_key_first
4d ago
> For a tinkerer who's focused on the infra, then sure, hosting your own can make sense.

... or for a big company. I've worked at companies with thousands of developers, and it's all been 'self hosted'. In DCs, so not rinky-dink, but yes, and there are a lot of advantages to doing it this way. If you set it up right, it can be much easier for developers to use than AWS.

antonvs
4d ago
Reading the commenter's subsequent comments, they're serious about self-hosting.
philipwhiuk
4d ago
1 reply
Yeah imagine the conversation:

"I'd like to spend the next sprint on S3 endpoints by default"

"What will that cost"

"A bunch of unnecessary resources when it's not used"

"Will there be extra revenue?"

"Nah, in fact it'll reduce our revenue from people who meant to use it and forgot before"

"Let's circle back on this in a few years"

pixl97
4d ago
1 reply
Hence why business regulations tend to exist no matter how many people claim the free market will sort this out.
bigstrat2003
4d ago
The free market can sort something like this out, but it requires some things to work. There need to be competitors offering similar products, people need to have the ability to switch to using those competitors, and they need to be able to get information about the strengths and weaknesses of the different offerings (so they can know their current vendor has a problem and that another vendor doesn't have that problem). The free market isn't magic, but neither are business regulations. Both have failure modes you have to guard against.
krystalgamer
4d ago
1 reply
Ah, the good old VPC NAT Gateway.

I was lucky to have experienced all of the same mistakes for free (ex-Amazon employee). My manager just got an email saying the costs had gone through the roof and asked me to look into it.

Feel bad for anyone that actually needs to cough up money for these dark patterns.

mgaunard
4d ago
1 reply
Personally I don't even understand why NAT gateways are so prevalent. What you want most of the time is just an Internet gateway.
Hikikomori
4d ago
1 reply
Only works in public subnets, which isn't what you want most of the time.
hanikesn
4d ago
1 reply
Yep, and you have to pay for public IPs, which can become quite costly on its own. Can't wait for v6 to be here.
mgaunard
4d ago
An IP costs $50, or $0.50 per month if leasing.
belter
4d ago
1 reply
Talking about how the Cloud is complicated, and writing a blog post about what is one of the most basic scenarios discussed in every architecture class from AWS or from third parties...
wiether
4d ago
1 reply
There's nothing to gain in punching down.

They made a mistake and are sharing it for the whole world to see in order to help others avoid making it.

It's brave.

Unlike punching down.

belter
4d ago
1 reply
This has nothing to do with punching down. Writing a blog post about this basic mistake, and presenting it as advice, shows a strong lack of self-awareness. It's like when Google bought thousands of servers without ECC memory, but felt they were so smart they could not resist telling the world how bad that was and writing a paper about it... Or they could have hired some real hardware engineers from IBM or Sun...
Nevermark
4d ago
1 reply
> Writing a blog post about this basic mistake, and presenting it as advice, shows a strong lack of self-awareness.

You realize they didn’t ask you to read their article right? They didn’t put it on your fridge or in your sandwich.

Policing who writes what honest personal experience on the Internet is not a job that needs doing.

But if you do feel the need to police, don't critique the writer, but HN for letting interested readers upvote the article here, where it is, of course, strictly required reading.

I mean, drill down to the real perpetrators of this important “problem”!

belter
3d ago
They are doing enterprise sales and were founded in 2014... These types of technical blog posts are normally written to demonstrate the company's internal expertise, show off the skills of the engineering team to motivate hiring, etc. Does this inspire confidence?
andrewstuart
4d ago
1 reply
Why are people still using AWS?

And then writing “I regret it” posts that end up on HN.

Why are people not getting the message to not use AWS?

There’s SO MANY other faster cheaper less complex more reliable options but people continue to use AWS. It makes no sense.

chistev
4d ago
1 reply
Examples?
andrewstuart
4d ago
1 reply
Of what?
wiether
4d ago
1 reply
> faster cheaper less complex more reliable options
andrewstuart
4d ago
1 reply
Allow me to google that for you…..

https://www.ionos.com/servers/cloud-vps

$22/month for 18 months with a 3-year term: 12 vCPU cores, 24 GB RAM, 720 GB NVMe

Unlimited 1Gbps traffic

wiether
4d ago
1 reply
AWS is not just EC2

And even EC2 is not just a VPS

If you need a simple VPS, yes, by all means, don't use AWS.

For this use case AWS is definitely not cheaper or simpler. Nobody said that. Ever.

andrewstuart
4d ago
3 replies
They’re Linux computers.

Anything AWS does you can run on Linux computers.

It’s naive to think that AWS is some sort of magically special system that transcends other networked computers, out of brand loyalty.

That’s the AWS kool aid that makes otherwise clever people think there’s no way any organization can run their own computer systems - only AWS has the skills for that.

mr_toad
4d ago
1 reply
In theory. Good luck rolling your own version of S3.
charcircuit
4d ago
1 reply
You probably don't need it. I see so many people getting price gouged by S3 when it would be orders of magnitude cheaper to just throw the files on a basic HTTP server.

I sometimes feel bad using people's services built with S3 as I know my personal usage is costing them a lot of money despite paying them nothing.

mr_toad
4d ago
A web server isn’t a storage solution. And a storage solution like S3 isn’t a delivery network. If you use the wrong tool expect problems.
denvrede
4d ago
Good luck managing the whole day-2 operations and the application layer on top of your VPS. You're just shuffling around your spending. For you it's not on compute anymore but manpower to manage that mess.
wiether
4d ago
It was already clear that you were arguing in bad faith when you suggested a VPS to replace AWS; no need to insist.

But you are absolutely right, I'm drinking the AWS kool aid like thousands of other otherwise clever people who don't know that AWS is just Linux computers!

V__
4d ago
1 reply
Just curious but if you are already on Hetzner, why not do the processing also there?
gizzlon
4d ago
https://news.ycombinator.com/item?id=45978308
Havoc
4d ago
8 replies
These sorts of things show up about once a day across the three big cloud subreddits, often with larger amounts.

And it's always the same: clouds refuse to provide anything more than (delayed) alerts, and your only option is prayer and begging for mercy.

Followed by people claiming with absolute certainty that it's literally technically impossible to provide hard-capped accounts to tinkerers, despite there being accounts like that in existence already (some Azure accounts are hard-capped by amount, but of course that's not loudly advertised).

sofixa
4d ago
9 replies
It's not that it's technically impossible. The very simple problem is that there is no way of providing hard spend caps without giving you the opportunity to bring down your whole production environment when the cap is met. No cloud provider wants to give their customers that much rope to hang themselves with. You just know too many customers will do it wrong, or will forget to update the cap, or will not coordinate internally, and things will stop working and take forever to fix.

It's easier to waive cost overages than deal with any of that.

ed_elliott_asc
4d ago
1 reply
Let people take the risk; some things in production are less important than others.
arjie
4d ago
They have all the primitives. I think it's just that people are looking for a less raw version than AWS. In fact, perhaps many of these users should be using some platform that is on AWS, or if they're just playing around with an EC2 they're probably better off with Digital Ocean or something.

AWS is less like your garage door and more like the components to build an industrial-grade blast-furnace - which has access doors as part of its design. You are expected to put the interlocks in.

Without the analogy, the way you do this on AWS is:

1. Set up an SNS topic

2. Set up AWS Budgets notifications to post to it

3. Set up a lambda that subscribes to the SNS topic

And then in the lambda you can write your own logic which is smart: shut down all instances except for RDS, allow current S3 data to remain there but set the public bucket to now be private, and so on.

The obvious reason why "stop all spending" is not a good idea is that it would require things like "delete all my S3 data and my RDS snapshots" and so on which perhaps some hobbyist might be happy with but is more likely a footgun for the majority of AWS users.

In the alternative world where the customer's post is "I set up the AWS budget with the stop-all-spending option and it deleted all my data!" you can't really give them back the data. But in this world, you can give them back the money. So this is the safer one than that.
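
As a rough illustration of step 3 above, a minimal sketch of such a Lambda handler (boto3; the keep-alive tag and the stop-everything-but-RDS policy are made up for the example):

```python
import boto3

ec2 = boto3.client("ec2")

def handler(event, context):
    """Invoked through the SNS subscription when an AWS Budgets alert fires.
    Stops every running EC2 instance not tagged keep-alive=true; RDS, S3, etc.
    are deliberately left alone. Tag name and policy are illustrative."""
    to_stop = []
    paginator = ec2.get_paginator("describe_instances")
    for page in paginator.paginate(
        Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
    ):
        for reservation in page["Reservations"]:
            for instance in reservation["Instances"]:
                tags = {t["Key"]: t["Value"] for t in instance.get("Tags", [])}
                if tags.get("keep-alive") != "true":
                    to_stop.append(instance["InstanceId"])
    if to_stop:
        ec2.stop_instances(InstanceIds=to_stop)
    return {"stopped": to_stop}
```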

ndriscoll
4d ago
1 reply
Why does this always get asserted? It's trivial to do (reserve the cost when you allocate a resource [0]), and takes 2 minutes of thinking about the problem to see an answer if you're actually trying to find one instead of trying to find why you can't.

Data transfer can be pulled into the same model by having an alternate internet gateway model where you pay for some amount of unmetered bandwidth instead of per byte transfer, as other providers already do.

[0] https://news.ycombinator.com/item?id=45880863
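
For what it's worth, a tiny sketch of what "reserve the cost when you allocate" could look like (purely illustrative logic, not an existing AWS feature):

```python
# Illustrative only: reserve a resource's worst-case cost for the rest of the billing
# cycle against a user-set cap, and refuse the allocation if it would not fit.
def try_allocate(cap_usd, reserved_usd, hourly_rate_usd, hours_left_in_cycle):
    worst_case = hourly_rate_usd * hours_left_in_cycle
    if reserved_usd + worst_case > cap_usd:
        return False, reserved_usd           # reject: the cap could be exceeded
    return True, reserved_usd + worst_case   # accept and reserve the budget

# A $0.20/hour instance with 400 hours left in the cycle fits under a $500 cap
# when $120 is already reserved: prints (True, 200.0).
print(try_allocate(cap_usd=500, reserved_usd=120,
                   hourly_rate_usd=0.20, hours_left_in_cycle=400))
```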

kccqzy
4d ago
1 reply
Reserving the cost until the end of the billing cycle is super unfriendly for spiky traffic and spiky resource usage. And yet one of the main selling points of the cloud is elasticity of resources. If your load is fixed, you wouldn’t even use the cloud after a five minute cost comparison. So your solution doesn’t work for the intended customers of the cloud.
ndriscoll
4d ago
1 reply
It works just fine. No reason you couldn't adjust your billing cap on the fly. I work in a medium size org that's part of a large one, and we have to funnel any significant resource requests (e.g. for more EKS nodes) through our SRE teams anyway to approve.

Actual spiky traffic that you can't plan for or react to is something I've never heard of, and I believe it is a marketing myth. If you find yourself actually trying to suddenly add a lot of capacity, you also learn that the elasticity itself is a myth; the provisioning attempt will fail. Or e.g. Lambda will hit its scaling rate limit way before a single minimally-sized Fargate container would cap out.

If you don't mind the risk, you could also just not set a billing limit.

The actual reason to use clouds is for things like security/compliance controls.

kccqzy
4d ago
1 reply
I think I am having some misunderstanding about exactly how this cost control works. Suppose that a company in the transportation industry needs 100 CPUs worth of resources most of the day and 10,000 CPUs worth of resources during morning/evening rush hours. How would your reserved cost proposal work? Would it require having a cost cap sufficient for 10,000 CPUs for the entire day? If not, how?
ndriscoll
4d ago
10,000 cores is an insane amount of compute (even 100 cores should already be able to easily deal with millions of events/requests per second), and I have a hard time believing a 100x diurnal difference in needs exists at that level, but yeah, actually I was suggesting that they should have their cap high enough to cover 10,000 cores for the remainder of the billing cycle. If they need that 10,000 for 4 hours a day, that's still only a factor of 6 of extra quota, and the quota itself 1. doesn't cost them anything and 2. is currently infinity.

I also expect that in reality, if you regularly try to provision 10,000 cores of capacity at once, you'll likely run into provisioning failures. Trying to cost optimize your business at that level at the risk of not being able to handle your daily needs is insane, and if you needed to take that kind of risk to cut your compute costs by 6x, you should instead go on-prem with full provisioning.

Having your servers idle 85% of the day does not matter if it's cheaper and less risky than doing burst provisioning. The only one benefiting from you trying to play utilization optimization tricks is Amazon, who will happily charge you more than those idle servers would've cost and sell the unused time to someone else.
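
For what it's worth, the factor-of-6 figure above checks out as rough arithmetic (using the hypothetical numbers from this sub-thread):

```python
# Hypothetical workload from the thread: 100 CPUs for ~20 h/day, 10,000 CPUs for ~4 h/day.
actual_cpu_hours = 100 * 20 + 10_000 * 4   # ~42,000 CPU-hours actually used per day
cap_cpu_hours = 10_000 * 24                # cap sized for 10,000 CPUs all day: 240,000
print(cap_cpu_hours / actual_cpu_hours)    # ~5.7, i.e. roughly a factor of 6 of headroom
```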

callmeal
4d ago
1 reply
>The very simple problem is that there is no way of providing hard spend caps without giving you the opportunity to bring down your whole production environment when the cap is met.

And why is that a problem? And how different is that from "forgetting" to pay your bill and having your production environment brought down?

sofixa
4d ago
> And how different is that from "forgetting" to pay your bill and having your production environment brought down?

AWS will remind you for months before they actually stop it.

wat10000
4d ago
1 reply
Millions of businesses operate this way already. There's no way around it if you have physical inventory. And unlike with cloud services, getting more physical inventory after you've run out can take days, and keeping more inventory than you need can get expensive. Yet they manage to survive.
pixl97
4d ago
And cloud is really more scary. You have nearly unlimited liability and are at the mercy of the cloud service forgiving your debt if something goes wrong.
archerx
4d ago
Old hosts used to do that. 20 years ago when my podcast started getting popular I was hit with a bandwidth limit exceeded screen/warning. I was broke at the time and could not have afforded the overages (back then the cost per gig was crazy). The podcast not being downloadable for two days wasn’t the end of the world. Thankfully for me the limit was reached at the end of the month.
nwellinghoff
4d ago
Orrr AWS could just buffer it for you. Algo.

1) You hit the cap. 2) AWS sends an alert, but your stuff still runs at no cost to you for 24h. 3) If no response, AWS shuts it down forcefully. 4) AWS eats the "cost" because, let's face it, it basically costs them a thousandth of what they bill you for. 5) You get this buffer 3 times a year. After that, they still do the 24h forced shutdown but you get billed. Everybody wins.

Nevermark
4d ago
> No cloud provider wants to give their customers that much rope to hang themselves with.

Since there are in fact two ropes, maybe cloud providers should make it easy for customers to avoid the one they most want to avoid?

scotty79
4d ago
I would love to have an option to automatically bring down the whole production environment once it's costing more than what it's earning. Come to think of it, I'd love this to be the default.

When my computer runs out of hard drive it crashes, not goes out on the internet and purchases storage with my credit card.

pyrale
4d ago
> It's not that it's technically impossible.

It is technically impossible. In that no tech can fix the greed of the people taking these decisions.

> No cloud provider wants to give their customers that much rope to hang themselves with.

They are so benevolent to us...

Waterluvian
4d ago
7 replies
This might be stating the obvious, but I think that the lack of half-decent cost controls is not intentionally malicious. There is no mustache-twirling villain who has a great idea on how to !@#$ people out of their money. I think it's the interplay between incompetence and having absolutely no incentive to do anything about it (which is still a form of malice).

I've used AWS for about 10 years and am by no means an expert, but I've seen all kinds of ugly cracks and discontinuities in design and operation among the services. AWS has felt like a handful of very good ideas, designed, built, and maintained by completely separate teams, littered by a whole ton of "I need my promotion to VP" bad ideas that build on top of the good ones in increasingly hacky ways.

And in any sufficiently large tech organization, there won't be anyone at a level of power who can rattle cages about a problem like this who will want to be the one to actually do it. No "VP of Such and Such" will spend their political capital stressing how critical it is that they fix the thing that will make a whole bunch of KPIs go in the wrong direction. They're probably spending it on shipping another hacked-together service with Web2.0-- er. IOT-- er. Blockchai-- er. Crypto-- er. AI before promotion season.

scotty79
4d ago
1 reply
> I think that the lack of half-decent cost controls is not intentionally malicious

It wasn't when the service was first created. What's intentionally malicious is not fixing it for years.

Somehow AI companies got this right from the get-go. Money up front; no money, no tokens.

It's easy to guess why. Unlike hosting infra bs, inference is a hard cost for them. If they don't get paid, they lose (more) money. And sending stuff to collections is expensive and bad press.

otterley
4d ago
1 reply
> Somehow AI companies got this right from the get-go. Money up front; no money, no tokens.

That’s not a completely accurate characterization of what’s been happening. AI coding agent startups like Cursor and Windsurf started by attracting developers with free or deeply discounted tokens, then adjusted the pricing as they figure out how to be profitable. This happened with Kiro too[1] and is happening now with Google’s Antigravity. There’s been plenty of ink spilled on HN about this practice.

[1] disclaimer: I work for AWS, opinions are my own

gbear605
4d ago
I think you’re talking about a different thing? The bad practice from AWS et al is that you post-pay for your usage, so usage can be any amount. With all the AI things I’ve seen, either: - you prepay a fixed amount (“$200/mo for ChatGPT Max”) - you deposit money upfront into a wallet, if the wallet runs out of cash then you can’t generate any more tokens - it’s free!

I haven’t seen any of the major model providers have a system where you use as many tokens as you want and then they bill you, like AWS has.

sgarland
4d ago
1 reply
> There is no mustache-twirling villain who has a great idea on how to !@#$ people out of their money.

I dunno, Aurora’s pricing structure feels an awful lot like that. “What if we made people pay for storage and I/O? And we made estimating I/O practically impossible?”

zdc1
4d ago
2 replies
A year ago I did a back-of-the-napkin calculation and was surprised when I realised Aurora would cost the same as or more than my current RDS for Postgres setup. Any discussion about costs inevitably has someone chiming in with a "have you considered Aurora?" and I don't quite understand why it's so loved.
belter
3d ago
1 reply
The business and technical argument for Aurora is that it delivers significantly higher performance for the same underlying hardware compared to MySQL or PostgreSQL: https://pages.cs.wisc.edu/~yxy/cs764-f20/papers/aurora-sigmo...

Even if you are not currently hitting performance limits of your current engine, Aurora would maintain the same throughput and latency on smaller instance classes. Which is where the potential cost savings come from...

On top of that, with Aurora Serverless with variable and unpredictable workloads you could have important cost savings.

sgarland
3d ago
1 reply
Except it isn’t nearly as fast as it claims [0]. And in real-world tests, I have never found it to beat RDS.

You can get an insane amount of performance out of a well-tuned MySQL or Postgres instance, especially if you’ve designed your schema to exploit your RDBMS’ strengths (e.g. taking advantage of InnoDB’s clustering index to minimize page fetches for N:M relationships).

And if you really need high performance, you use an instance with node-local NVMe storage (and deal with the ephemerality, of course).

0: https://hackmysql.com/are-aurora-performance-claims-true/

belter
3d ago
1 reply
Well, the article you linked to... confirms that Aurora is faster than MySQL on equivalent hardware, especially for write-heavy workloads; it's just that the "5× faster" claim only holds under very specific benchmark conditions.

Also from the link... when MySQL is properly tuned, the performance gap narrows substantially but is still 1.5x to 3x for the workloads tested in the article, something I would call massive.

sgarland
3d ago
The benchmarks were just that - synthetic benchmarks. I've run actual production workloads against both, and Aurora never won. IFF you have an incredibly write-heavy workload, and you have few to no secondary indices, then Aurora might win; I'd also suggest you reconsider your RDBMS choice.

Most workloads are somewhere between 90:10 to 98:2 reads:writes, and most tables have at least one (if not more) secondary indices.

You’re of course welcome to disagree, but speaking as a DBRE who has used both MySQL and Postgres, RDS and Aurora in production, I’m telling you that Aurora does not win on performance.

sgarland
3d ago
Because (a) people by and large don't run their own benchmarks, and (b) Aurora does legitimately have some nice features, though a lot of them are artificially restricted from RDS (like the max volume size).

The biggest cool thing Aurora MySQL does, IMO, is maintain the buffer pool on restarts. Not just dump / restore, but actually keeps its residence. They split it out from the main mysqld process so restarting it doesn’t lose the buffer pool.

But all in all, IMO it’s hideously expensive, doesn’t live up to its promises, and has some serious performance problems due to the decision to separate compute and storage by a large physical distance (and its simplification down to redo log only): for example, the lack of a change buffer means that secondary indices have to be written synchronously. Similarly, since AWS isn’t stupid, they have “node-local” (it’s EBS) temporary storage, which is used to build the aforementioned secondary indices, among other things. The size of this is not adjustable, and simply scales with instance size. So if you have massive tables, you can quite easily run out of room in this temporary storage, which at best kills the DDL, and at worst crashes the instance.

jjav
3d ago
1 reply
> There is no mustache-twirling villain who has a great idea on how to !@#$ people out of their money.

Unfortunately, that's not correct. A multi-trillion-dollar company absolutely has not just such a person, but many departments with hundreds of people tasked with precisely that: maximizing revenue by exploiting every dark pattern they can possibly think of.

next_xibalba
3d ago
> Unfortunately, that's not correct

It would be good to provide a factual basis for such a confident contradiction of the GP. This reads as “no, your opinion is wrong because my opinion is right”.

duped
4d ago
> There is no mustache-twirling villain who has a great idea on how to !@#$ people out of their money.

It's someone in a Patagonia vest trying to avoid getting PIP'd.

colechristensen
4d ago
AWS isn't for tinkerers and doesn't have guard rails for them; that's it. Anybody can use it, but it's not designed for you to spend $12 per month. They DO have cost anomaly monitoring, and they give you data so you can set up your own alerts for usage or cost, but it's not a primary feature because they're picking their customers, and it isn't the bottom-of-the-market hobbyist. There are plenty of other services looking for that segment.

I have budgets set up and alerts through a separate alerting service that pings me if my estimates go above what I've set for a month. But it wouldn't fix a short term mistake; I don't need it to.

lysace
4d ago
All of that is by design, in a bad way.
parliament32
3d ago
> I think it's the play between incompetence and having absolutely no incentive to do anything about it

The lack of business case is the most likely culprit. "You want to put engineering resources into something that only the $100/mo guys are going to use?"

You might be tempted to think "but my big org will use that", but I can guarantee compliance will shut it down -- you will never be permitted to enable a feature that intentionally causes hard downtime when (some external factor) happens.

jrjeksjd8d
4d ago
3 replies
The problem with hard caps is that there's no way to retroactively fix "our site went down". As much as engineers are loath to actually reach out to a cloud provider, are there any anecdotes of AWS playing hardball and collecting a 10k debt for network traffic?

Conversely, the first time someone hits an edge case in billing limits and their site goes down, losing 10k worth of possible customer transactions, there's no way to unring that bell.

The second constituency are also, you know, the customers with real cloud budgets. I don't blame AWS for not building a feature that could (a) negatively impact real, paying customers (b) is primarily targeted at people who by definition don't want to pay a lot of money.

Havoc
4d ago
Keeping the site up makes sense as a default. That's what their real business customers need, so that has priority.

But an opt-in "I'd rather you delete data / disable service than send me a 100k bill" toggle, with suitable disclaimers, would mean people can safely learn.

That way everyone gets what they want. (Well, except the cloud providers, who presumably don't like limits on their open-ended bills.)

scotty79
4d ago
I'd much rather lose 10k in customers that might potentially come another day than get a 10k Amazon bill. The Amazon bill feels like the more unringable of the two.

But hey, let's say you have different priorities than me. Then why not both? Why not let me set the hard cap? Why does Amazon insist on being able to bill me for more than my business is worth if I make a mistake?

withinboredom
4d ago
Since you would have to have set it up, I fail to see how this is a problem.
moduspol
4d ago
1 reply
AWS would much rather let you accidentally overspend and then forgive it when you complain than see stories about critical infrastructure getting shut off or failing in unexpected ways due to a miscommunication in billing.
DenisM
4d ago
2 replies
They could have given us a choice though. Sign in blood that you want to be shut off in case of overspend.
moduspol
4d ago
1 reply
As long as "shut off" potentially includes irrecoverable data loss, I guess, as it otherwise couldn't conclusively work. Along with a bunch of warnings to prevent someone accidentally (or maliciously) enabling it on an important account.

Still sounds kind of ugly.

DenisM
4d ago
A malicious or erroneous actor can also drop your S3 buckets. Account changes have stricter permissions.

The key problem is that data loss is really bad PR, which cannot be reversed. An overcharge can be reversed. In a twisted way it might even strengthen the public image; I have seen that happen elsewhere.

simsla
4d ago
You could set a CloudWatch cost alert that scuttles your IAM and effectively pulls the plug on your stack. Or something like that.
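
One hedged sketch of that idea: a budget-alert Lambda that attaches an explicit deny-all policy to the role your stack runs under (role and policy names are made up; whether you ever want an automated kill switch like this is a separate question):

```python
import json
import boto3

iam = boto3.client("iam")

DENY_ALL = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Deny", "Action": "*", "Resource": "*"}],
}

def handler(event, context):
    """Triggered by a cost alert: attach an explicit deny to the application role,
    effectively freezing further activity. RoleName/PolicyName are illustrative."""
    iam.put_role_policy(
        RoleName="app-runtime-role",
        PolicyName="emergency-spend-freeze",
        PolicyDocument=json.dumps(DENY_ALL),
    )
```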
belter
4d ago
1 reply
These topics are not advanced... they are foundational scenarios covered in any entry-level AWS or third-party AWS Cloud training.

But over the last few years, people have convinced themselves that the cost of ignorance is low. Companies hand out unlimited self-paced learning portals, tick the “training provided” box, and quietly stop validating whether anyone actually learned anything.

I remember when you had to spend weeks in structured training before you were allowed to touch real systems. But starting around five or six years ago, something changed: Practitioners began deciding for themselves what they felt like learning. They dismantled standard instruction paths and, in doing so, never discovered their own unknown unknowns.

In the end, it created a generation of supposedly “trained” professionals who skipped the fundamentals and now can’t understand why their skills have giant gaps.

shermantanktop
4d ago
1 reply
If I accept your premise (which I think is overstated), I'd say it's a good thing. We used to ship software with literally 100 lbs of manuals and sell expensive training, and then consulting when they messed up. Tons of perverse incentives.

The expectation that it just works is mostly a good thing.

belter
3d ago
> The expectation that it just works is mostly a good thing.

Not if it's an Airbus A220 or similar. They made it easy to take off, but it is still a large commercial aircraft... easy to fly... for pilots...

cristiangraz
4d ago
2 replies
AWS just released flat-rate pricing plans with no overages yesterday. You opt into a $0, $15, or $200/mo plan and at the end of the month your bill is still $0, $15, or $200.

It solves the problem of unexpected requests or data transfer increasing your bill across several services.

https://aws.amazon.com/blogs/networking-and-content-delivery...

ipsento606
4d ago
1 reply
https://aws.amazon.com/cloudfront/pricing/ says that the $15-per-month plan comes with 50TB of "data transfer"

Does "data transfer" not mean CDN bandwidth here? Otherwise, that price seems two orders of magnitude less than I would expect

weberer
4d ago
The $15 plan notably does not come with DDoS protection though.
Havoc
4d ago
That actually looks really good, thanks for highlighting this.
cobolcomesback
4d ago
AWS just yesterday launched flat rate pricing for their CDN (including a flat rate allowance for bandwidth and S3 storage), including a guaranteed $0 tier.

https://news.ycombinator.com/item?id=45975411

I agree that it’s likely very technically difficult to find the right balance between capping costs and not breaking things, but this shows that it’s definitely possible, and hopefully this signals that AWS is interested in doing this in other services too.

strogonoff
4d ago
I think it’s disingenuous to claim that AWS only offers delayed alerts and half-decent cost controls. Granted, these features were not there in the beginning, but for years now AWS, in addition to the better known stuff like strategic limits on auto scaling, allows subscribing to price threshold triggers via SNS and perform automatic actions, which could be anything including scaling down or stopping services completely if the cost skyrockets.
mgaunard
4d ago
If you want to avoid any kind of traffic fees, simply don't allow routing outside of your VPC by default.

101 more comments available on Hacker News

View full discussion on Hacker News
ID: 45977744 | Type: story | Last synced: 11/22/2025, 1:19:13 PM



© 2025 Not Hacker News! — independent Hacker News companion.

Not affiliated with Hacker News or Y Combinator. We simply enrich the public API with analytics.