Go Ahead, Self-Host Postgres
Key topics
The debate around self-hosting Postgres versus relying on managed services is heating up, with veterans sharing their decades-long experiences of rock-solid self-hosting and others chiming in on the reliability and support trade-offs. While some swear by the control and reliability of self-hosting, others point out that the need for 24/7 support is often driven by global customer scope, and that managed services can be a safer bet for critical applications. The discussion reveals a divide between those who've never had issues with self-hosting and those who've been burned by outages, with some noting that even big companies like AWS aren't immune to downtime. As one commenter quipped, when AWS goes down, "you share the link to AWS being down and go back to sleep," highlighting the stark contrast between the stress of dealing with self-hosted outages and the relative ease of relying on a major provider.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
First comment: 8m after posting
Peak period: 120 comments in 0-6h
Avg / period: 17.8 comments
Based on 160 loaded comments
Key moments
- Story posted: Dec 20, 2025 at 10:43 AM EST (17 days ago)
- First comment: Dec 20, 2025 at 10:51 AM EST (8m after posting)
- Peak activity: 120 comments in 0-6h (hottest window of the conversation)
- Latest activity: Dec 22, 2025 at 2:33 PM EST (15 days ago)
Of all the places I've worked that had the attitude "If this goes down at 3AM, we need to fix it immediately", there was only one where that was actually justifiable from a business perspective. I've worked at plenty of places that had this attitude despite the fact that overnight traffic was minimal and nothing bad actually happened if a few clients had to wait until business hours for a fix.
I wonder if some of the preference for big-name cloud infrastructure comes from the fact that during an outage, employees can just say "AWS (or whatever) is having an outage, there's nothing we can do" vs. being expected to actually fix it
From this angle, the ability to fix problems more quickly when self-hosting could be considered an antifeature by the employee getting woken up at 3am
That includes big and small businesses, SaaS and non-SaaS, high scale (5M+ rps) to tiny scale (100s-10k rps), and all sorts of different markets and user bases. Even at the companies that were not staffed or providing a user service overnight, overnight outages were immediately noticed because on average, more than one external integration/backfill/migration job was running at any time. Sure, “overnight on call” at small places like that was more “reports are hardcoded to email Bob if they hit an exception, and integration customers either know Bob’s phone number or how to ask their operations contact to call Bob”, but those are still environments where off-hours uptime and fast resolution of incidents was expected.
Between me, my colleagues, and friends/peers whose stories I know, that’s an N of high dozens to low hundreds.
What am I missing?
IME the need for 24x7 for B2B apps is largely driven by global customer scope. If you have customers in North America and Asia, now you need 24x7 (and x365, because of little holiday overlap).
That being said, there are a number of B2B apps/industries where global scope is not a thing. For example, many providers who operate in the $4.9 trillion US healthcare market do not have any international users. Similarly the $1.5 trillion (revenue) US real estate market. There are states where one could operate where healthcare spending is over $100B annually. Banks. Securities markets. Lots of things do not have 24x7 business requirements.
All of those places needed their backend systems to be up 24/7. The banks ran reports and cleared funds with nightly batches—hundreds of jobs a night for even small banking networks. The healthcare companies needed to receive claims and process patient updates (e.g. your provider’s EMR is updated if you die or have an emergency visit with another provider you authorized for records sharing—and no, this is not handled by SaaS EMRs in many cases) overnight so that their systems were up to date when they next opened for business. The “regular” businesses that closed for the night still generated reports and frequently had IT staff doing migrations, or senior staff working on something at midnight due the next day (when the head of marketing is burning the midnight oil on that presentation, you don’t want to be the person explaining that she can’t do it because the file server hosting the assets is down after hours).
And again, that’s the norm I’ve heard described from nearly everyone in software/IT that I know: most businesses expect (and are willing to pay for or at least insist on) 24/7 uptime for their computer systems. That seems true across the board: for big/small/open/closed-off-hours/international/single-timezone businesses alike.
But there are also a not-insignificant number of important systems where nobody is on a pager, where there is no call rotation[1]. Computers are much more reliable than they were even 20 years ago. It is an Acceptable Business Choice to not have 24x7 monitoring for some subset of systems.
Until very recently[2], Citibank took their public website/user portal offline for hours a week.
[1] If a system does not have a fully staffed call rotation with escalations, it's not prepared for a real off-hours uptime challenge.
[2] They may still do this, but I don't have a way to verify right now.
Also, in addition to perception/reputation issues, B2B contracts typically include an SLA, and nobody wants to be in breach of contract.
I think the parent you're replying to is wrong, because I've worked at small companies selling into large enterprise, and the expectation is basically 24/7 service availability, regardless of industry.
You wake up. It's not your fault. You're helpless to solve it.
Eventually, AWS has a VP of something dial in to your call to apologize. They’re unprepared and offer no new information. They get handed off to a side call for executive bullshit.
AWS comes back. Your support rep only vaguely knows what’s going on. Your system serves some errors but digs out.
Then you go to sleep.
And Postgres upgrades are not transparent. So you'll have a one-to-two-hour task every 6 to 18 months, with only a small amount of control over when it happens. This is OK for a lot of people, and completely unthinkable for some other people.
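For context, the in-place route most people mean here is pg_upgrade. A minimal sketch of wrapping it, assuming hypothetical Debian-style paths for a 16-to-17 upgrade; both clusters must be stopped, and it should run as the postgres OS user:

```python
import subprocess

# Hypothetical paths; adjust to your install.
OLD_BIN = "/usr/lib/postgresql/16/bin"
NEW_BIN = "/usr/lib/postgresql/17/bin"
OLD_DATA = "/var/lib/postgresql/16/main"
NEW_DATA = "/var/lib/postgresql/17/main"


def run_pg_upgrade(check_only: bool) -> None:
    """Invoke pg_upgrade; --link hard-links data files instead of copying them."""
    cmd = [
        f"{NEW_BIN}/pg_upgrade",
        "-b", OLD_BIN, "-B", NEW_BIN,
        "-d", OLD_DATA, "-D", NEW_DATA,
        "--link",
    ]
    if check_only:
        cmd.append("--check")  # dry run: validates compatibility, changes nothing
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_pg_upgrade(check_only=True)   # always dry-run first
    run_pg_upgrade(check_only=False)  # then the real, one-way upgrade
```

The --check dry run is what makes the window predictable; once the real --link run starts, the old cluster should not be restarted, so you still have to schedule a window.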
I was disappointed alloy doesn't support timescaledb as a metrics endpoint. Considering switching to telegraf just because I can store the metrics on Postgres.
SQLite when prototyping, Postgres for production.
If you need to power a lawnmower and all you have is a 500bhp Scania V8, you may as well just do it.
Even better, stage to a production-like environment early, and then deploy day can be as simple as a DNS record change.
Because it's "low effort" to just fire it into sqlite and if I have to do ridiculous things to the schema as I footer around working out exactly what I want the database to do.
I don't want to use nodejs if I can possibly avoid it and you literally could not pay me to even look at Java, there isn't enough money in the world.
For most purposes, it works perfectly fine, but with two main caveats:
1. It is single user, single connection (i.e. no MVCC)
2. It doesn't support all postgres extensions (particularly postGIS), though it does support pgvector
https://github.com/supabase-community/pg-gateway is something that could be used to front pglite for prototyping, I guess, but I haven't used it.
- Backups: the provider will push a full generic disaster-recovery backup of my database to an off-provider location at least daily, without the need for a maintenance window
- Optimization: index maintenance and storage optimization are performed automatically and transparently
- Multi-datacenter failover: my database will remain available even if part(s) of my provider are down, with a minimal data loss window (like, 30 seconds, 5 minutes, 15 minutes, depending on SLA and thus plan expenditure)
- Point-in-time backups are performed at an SLA-defined granularity and with a similar retention window, allowing me to access snapshots via a custom DSN, not affecting production access or performance in any way
- Slow-query analysis: notifying me of relevant performance bottlenecks before they bring down production (see the sketch after this list)
- Storage analysis: my plan allows for #GB of fast storage, #TB of slow storage: let me know when I'm forecast to run out of either in the next 3 billing cycles or so
Because, well, if anyone provides all of that for a monthly fee, the whole "self-hosting" argument goes out of the window quickly, right? And I say that as someone who absolutely adores self-hosting...
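For what it's worth, the slow-query item on that wishlist is one of the easier ones to approximate yourself. A minimal sketch using pg_stat_statements, assuming the extension is installed and a hypothetical read-only DSN:

```python
import psycopg2  # pip install psycopg2-binary

# Hypothetical DSN; requires pg_stat_statements (add it to
# shared_preload_libraries, then CREATE EXTENSION pg_stat_statements).
DSN = "postgresql://monitor@db.internal/postgres"

# Column names are for PG 13+; older versions use mean_time/total_time.
QUERY = """
SELECT query, calls, mean_exec_time, total_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;
"""

with psycopg2.connect(DSN) as conn:
    with conn.cursor() as cur:
        cur.execute(QUERY)
        for query, calls, mean_ms, total_ms in cur.fetchall():
            print(f"{mean_ms:9.1f} ms avg | {calls:8d} calls | {query[:70]}")
```

Run it from cron and alert on the output, and you have a crude version of the "notify me before it brings down production" feature.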
Corollary: rental/SaaS models provide that property in large part because their providers have lots of slack.
And especially having worked in startups, I was expected to do many different things, from fixing infrastructure code one day to writing frontend code the next. If you're in a bigger company, maybe it's understandable to be specialized, but especially if you're at a company with only a few people, you must be willing to do the job, whatever it is.
Yes, I'd say backups and analysis are table stakes for hiring it out, and multi-datacenter failover is a relevant nice-to-have. But the reason to do it yourself is that you literally can't get anything as good as what you can build yourself on somebody else's computer.
The defaults I've used on Amazon and GCP (RDS, Cloud SQL) both do.
In case you want to self host but also have something that takes care of all that extra work for you
(I do use Maria at home for legacy reasons, and have used MySQL and Pg professionally for years.)
Can you give any details on that?
I switched to MariaDB back in the day for my personal projects because (so far as I could tell) it was being updated more regularly, and it was more fully open source. (I don't recall offhand at this point whether MySQL switched to a fully paid model, or just less-open.)
Until then it is nice to have options, even if they do require extra steps.
There’s also pg_auto_failover which is a Postgres extension and a bit less complex than Patroni, but it has its drawbacks.
FYI - it's already supported by cloudnativepg [1]
I was playing with this operator recently and I'm truly impressed - it's a piece of art when it comes to postgres automation; alongside with barman [2] it does everything I need and more
[1] https://cloudnative-pg.io/docs/1.28/connection_pooling [2] https://cloudnative-pg.io/plugin-barman-cloud/
Patroni has been around for a while. The database-as-a-service team where I work uses it under the hood. I used it to build database-as-a-service functionality on the infra platform team I was at prior to that.
It's basically push-button production PG.
Currently scratching my head on what the appropriate upgrade procedure is for a non-k8s/operator spilo/patroni cluster for minimal downtime and risk.
I read these and then wrote my own scripts that were tailored to my environment.
https://pganalyze.com/blog/5mins-postgres-zero-downtime-upgr...
https://www.pgedge.com/blog/always-online-or-bust-zero-downt...
https://knock.app/blog/zero-downtime-postgres-upgrades
Basically:
- Created a new cluster on new machines
- Started logically replicating
- Waited for that to complete, then left it there replicating for a while until I was comfortable with the setup
- We were already using haproxy and pgbouncer
- Then I did a cutover to the new setup
- This was for a database 600gb-1tb in size
- The client application was not doing anything overly fancy, which meant there was very little to change going from 12 to 17
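For readers following along, the "started logically replicating" step boils down to a publication on the old primary and a subscription on the new cluster. A minimal sketch with hypothetical DSNs, assuming wal_level = logical on the old side and the schema already restored on the new one (e.g. via pg_dump --schema-only):

```python
import psycopg2

OLD = "postgresql://postgres@old-primary/app"  # hypothetical DSNs
NEW = "postgresql://postgres@new-cluster/app"

# On the old primary: publish every table.
with psycopg2.connect(OLD) as conn:
    conn.autocommit = True
    conn.cursor().execute("CREATE PUBLICATION upgrade_pub FOR ALL TABLES;")

# On the new cluster: subscribe. CREATE SUBSCRIPTION cannot run inside a
# transaction block, hence autocommit. It does an initial table copy, then
# streams changes until you cut over.
with psycopg2.connect(NEW) as conn:
    conn.autocommit = True
    conn.cursor().execute(
        "CREATE SUBSCRIPTION upgrade_sub "
        "CONNECTION 'host=old-primary dbname=app user=postgres' "
        "PUBLICATION upgrade_pub;"
    )

# Note: sequences are not replicated; sync them manually right before cutover.
```

The cutover itself (pausing writes, syncing sequences, repointing pgbouncer) is the part that needs the real care.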
>> "God Send". Everything just worked. Replication was as reliable as one could imagine. It outlives several hardware incidents without manual intervention. It allowed cluster maintenance (software and hardware upgrades) without application downtime. I really dream PostgreSQL will be as reliable as MongoDB without need of external services.
https://www.postgresql.org/message-id/0e01fb4d-f8ea-4ca9-8c9...
Sure, the PostgreSQL HA story isn't what we all want it to be, but the reliability is exceptional.
Database engineering is very hard. MongoDB has had both poor defaults as well as bugs in the past. It will certainly have durability bugs in the future, just like Postgres and all other serious databases. I'm not sure that Postgres' durability stacks up especially well with modern MongoDB.
[1] https://jepsen.io/analyses/postgresql-12.3
[2] https://archive.fosdem.org/2019/schedule/event/postgresql_fs...
Yet they still call it HA because there's nothing else. Even a planned shutdown of the primary to patch the OS results in downtime, as all connections are terminated. The situation is even worse for major database upgrades: stop the application, upgrade the database, deploy a new release of the app because some features are not compatible between versions, test, re-analyze the tables, reopen the database, and only then can users resume work.
Everything in SQL/RDBMS was designed for a single-node instance, not counting replicas. It's not HA because there can be only one read-write instance at a time. They even claim to be more ACID than MongoDB, but the ACID properties are guaranteed only on a single node.
One exception is Oracle RAC, but PostgreSQL has nothing like that. Some forks, like YugabyteDB, provide real HA with most PostgreSQL features.
About the hype: many applications that run on PostgreSQL accept hours of downtime, planned or unplanned. Those who run larger, more critical applications on PostgreSQL are big companies with many expert DBAs who can handle the complexity of database automation, and who use logical replication for upgrades. But no solution offers both low operational complexity and high availability comparable to MongoDB's.
OTOH, Oracle takes most of my time with endless issues, bugs, unexpected feature modifications, even on OCI!
My theory of why Postgres still gets the hype is that either people don't know about the problem, or it's acceptable on some level. I've worked on a team that maintained an in-house database cluster (though we were using MySQL instead of PostgreSQL), and the HA story was pretty bad. But there were engineers manually recovering lost data and resolving data conflicts, whether from incident recovery or from customer tickets. So I guess that's one way of doing business.
I would expect a little bit more as a cost of the convenience, but in my experience it's generally multiple times the expense. It's wild.
This has kept me away from managed databases in all but my largest projects.
If anything that’s a feature for ease of use and compatibility.
I know there are other issues with Kubernetes but at least its transferable knowledge.
An expert will give you thousands of theoretical reasons why self-hosting the DB is a bad idea.
An "expert" will host it, enjoy the cost savings and deal with the once-a-year occurrence of the theoretical risk (if it ever occurs).
> If you're just starting out in software & want to get something working quickly with vibe coding, it's easier to treat Postgres as just another remote API that you can call from your single deployed app
> If you're a really big company and are reaching the scale where you need trained database engineers to just work on your stack, you might get economies of scale by just outsourcing that work to a cloud company that has guaranteed talent in that area. The second full freight salaries come into play, outsourcing looks a bit cheaper.
This is funny. I'd argue the exact opposite. I would self host only:
* if I were on a tight budget and trading an hour or two of my time for a cost saving of a hundred dollars or so is a good deal; or
* at a company that has reached the scale where employing engineers to manage self-hosted databases is more cost effective than outsourcing.
I have nothing against self-hosting PostgreSQL. Do whatever you prefer. But to me outsourcing this to cloud providers seems entirely reasonable for small and medium-sized businesses. According to the author's article, self hosting costs you between 30 and 120 minutes per month (after setup). It's easy to do the math...
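To make that math concrete under an assumed rate (my assumption, not a figure from the article): at a fully loaded engineering cost of $100/hour, 30-120 minutes per month is roughly $50-$200/month of time, which is the number to weigh against the managed-service premium for your instance size.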
Except now they are stuck trying to maintain and debug Postgres without the same visibility and agency they would have if they hosted it themselves. The situation isn't at all clear.
I use Google Cloud SQL for PostgreSQL and it's been rock solid. No issues; troubleshooting works fine; all extensions we need already installed; can adjust settings where needed.
In the limit I don't think we should need DBAs, but as long as we need to manage indices by hand, think more than 10 seconds about the hot queries, manage replication, tune the vacuumer, track updates, and all the other rot - then actually installing PG on a node of your choice is really the smallest of the problems you face.
This leads the developers to do all kinds of workarounds and reach for more cloud services (and then integrating them and - often poorly - ensuring consistency across them) because the cloud hosted DB is not able to handle the load.
On bare-metal, you can go a very long way with just throwing everything at Postgres and calling it a day.
Running on IaaS also gives you more scalability knobs to tweak: SSD IOPS and bandwidth, multiple drives for logs/partitions, memory-optimized VMs, and there are a lot of low-level settings that aren't accessible in managed SQL. Licensing costs are also horrible with managed SQL Server, where it seems like you pay the Enterprise level, but running it yourself offers lower-cost editions like Standard or Web.
You can take it even further in some contexts if you use sqlite.
I think one of the craziest ideas of the cloud decade was to move storage away from compute. It's even worse with things like AWS lambda or vercel.
Now vercel et al are charging you extra to have your data next to your compute. We're basically back to VMs at 100-1000x the cost.
Every company out there is using the cloud and yet still employs infrastructure engineers to deal with its complexity. The "cloud" reducing staff costs is and was always a lie.
Whether or not you need that equivalence is an orthogonal question.
There's probably a sweet spot where that is true, but because cloud providers offer more complexity (self-inflicted problems) and use PR to encourage you to use them ("best practices" and so on), in all the cloud-hosted shops I've seen in a decade of experience there have always been multiple full-time infra people busy with... something?
There was always something to do, whether to keep up with cloud provider changes/deprecations, implementing the latest "best practice", debugging distributed systems failures or self-inflicted problems and so on. I'm sure career/resume polishing incentives are at play here too - the employee wants the system to require their input otherwise their job is no longer needed.
Maybe in a perfect world you can indeed use cloud-hosted services to reduce/eliminate dedicated staff, but in practice I've never seen anything but solo founders actually achieve that.
It's complexity but it's also providing features. If you didn't use those cloud features, you'd be writing or gluing together and maintaining your own software to accomplish the same tasks, which takes even more staff
> Maybe in a perfect world you can indeed use cloud-hosted services to reduce/eliminate dedicated staff
So let's put it another way: either you're massively reducing/eliminating staff to achieve the same level of functionality, or you're keeping the equivalent staff but massively increasing functionality.
The point is, clouds let you deliver a lot more with a lot less people, no matter which way you cut it. The people spending money on them aren't mostly dumb.
I love self-hosting stuff and even have a bias towards it, but the cost/time tradeoff is more complex than most people think.
Every company beyond a particular size surely? For many small and medium sized companies hiring an infrastructure team makes just as little sense as hiring kitchen staff to make lunch.
Local reproducibility is easier, and performance is often much better
As I pointed out above, you may be better served mixing and matching so you spend your time on the critical aspects but offload those other tasks to someone else.
Of course, I’m not sitting at your computer so I can’t tell you what’s right for you.
Task runner/queue: at least for us, Postgres works for both cases.
We also self-host S3-compatible storage and allow user-uploaded content within strict limits.
Fact is a lot of these companies are on the cloud because their internal IT was a total fail.
Of course, my comment wasn't aimed at those who successfully keep their cloud bill in the low 3-figures, but the majority of companies with a 5-figure bill and multiple "infrastructure" people on payroll futzing around with YAML files. Even half the achieved savings should be enough incentive for those guys to learn something new.
But initial setup is maybe 10% of the story. The day 2 operations of monitoring, backups, scaling, and failover still needs to happen, and it still requires expertise.
If you bring that expertise in house, it costs much more than 10x ($3/day -> $30/day = $10,950/year).
If you get the expertise from experts who are juggling you along with a lot of other clients, you get something like PlanetScale or CrunchyData, which are also significantly more expensive.
> monitoring

Most monitoring solutions support Postgres and don't actually care where your DB is hosted. Of course this only applies if someone was actually looking at the metrics to begin with.
> backups
Plenty of options to choose from depending on your recovery time objective. From scheduled pg_dumps to WAL shipping to disk snapshots and a combination of them at any schedule you desire. Just ship them to your favorite blob storage provider and call it a day.
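As a concrete example of the cheapest tier (scheduled dumps), here is a minimal sketch assuming a hypothetical bucket and DSN and any S3-compatible store; pair it with WAL archiving if your recovery point objective is tighter than the dump interval:

```python
import datetime
import subprocess

import boto3  # works with any S3-compatible blob store

BUCKET = "db-backups"                           # hypothetical bucket
DB_URL = "postgresql://backup@db.internal/app"  # hypothetical DSN


def nightly_dump() -> None:
    """pg_dump in custom format (compressed, pg_restore-able), then ship it off-box."""
    stamp = datetime.datetime.now(datetime.timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    path = f"/tmp/app-{stamp}.dump"
    subprocess.run(
        ["pg_dump", "--format=custom", f"--file={path}", DB_URL],
        check=True,
    )
    boto3.client("s3").upload_file(path, BUCKET, f"postgres/{stamp}.dump")


if __name__ == "__main__":
    nightly_dump()  # run from cron or a systemd timer
```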
> scaling
That's the main reason I favor bare-metal infrastructure. Nothing on the cloud (at a price you can afford) can rival the performance of even a mid-range server, so scaling is effectively never an issue; if you're outgrowing that, the conversation we're having is not about getting a bigger DB but about using multiple DBs and sharding at the application layer.
> failover still needs to happen
Yes, get another server and use Patroni/etc. Or just accept the occasional downtime and up to 15 mins of data loss if the machine never comes back up. You'd be surprised how many businesses are perfectly fine with this. Case in point: two major clouds had hour-long downtimes recently and everyone basically forgot about it a week later.
An LLM could set this up for you, it's dead simple.
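To make "get another server and use Patroni/etc." concrete: the promotion step Patroni automates is a single call on the standby. A minimal sketch of the manual version with a hypothetical standby DSN (what Patroni adds on top is fencing the old primary so it can't take writes again):

```python
import psycopg2

# Hypothetical DSN for the streaming replica you want to promote.
STANDBY = "postgresql://postgres@standby.internal/postgres"

with psycopg2.connect(STANDBY) as conn:
    conn.autocommit = True
    with conn.cursor() as cur:
        cur.execute("SELECT pg_is_in_recovery();")
        if cur.fetchone()[0]:
            # pg_promote() (PG 12+) returns true once the standby
            # has finished promoting to read-write primary.
            cur.execute("SELECT pg_promote();")
            print("promoted:", cur.fetchone()[0])
```

The risk in the manual version is split-brain if the old primary comes back up still accepting writes, which is exactly the scenario the Patroni/etcd machinery exists to prevent.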
I also disagree that the ongoing maintenance, observability, and testing of a replicated database would take a few hours to set up and then require zero maintenance and never ping me with alerts.
Looking at all the recent AWS, Azure and Cloudflare outages, I posit that it doesn't.
Anyway, for companies not heavily into tech, lots of this stuff is not that expensive.
For medium sized companies you need "devops engineers". And in all honesty, more than you'd need sysadmins for the same deployment.
For large companies, they split up AWS responsibilities into entire departments of teams (for example, all clouds have made auth so damn difficult that most large companies have not one but multiple departments just dealing with authorization, before you so much as start your first app).
At my last two places it very quickly got to the point where the technical complexity of deployments, managing environments, dealing with large piles of data, etc. meant that we needed to hire someone to deal with it all.
They actually preferred managing VMs and self hosting in many cases (we kept the cloud web hosting for features like deploy previews, but that’s about it) to dealing with proprietary cloud tooling and APIs. Saved a ton of money, too.
On the other hand, the place before that was simple enough to build and deploy using cloud solutions without hiring someone dedicated (up to at least some pretty substantial scale that we didn’t hit).
> The "cloud" reducing staff costs
Both can be true at the same time.
Also:
> Otherwise you're waking up at 3am no matter what.
Do you account for frequency and variety of wakeups here?
Yes. In my career I've dealt with way more failures due to unnecessary distributed systems (that could have been one big bare-metal box) rather than hardware failures.
You can never eliminate wake-ups, but bare-metal systems have far fewer moving parts, which means you eliminate a whole bunch of failure scenarios, so you're only left with actual hardware failure (and hardware is pretty reliable nowadays).
There was, I have to admit, a log message that explained the problem... once I could find the specific log message and understand the 45 steps in the chain that got to that spot.
This doesn’t make sense as an argument. The reason the cloud is more complex is because that complexity is available. Under a certain size, a large number of cloud products simply can’t be managed in-house (and certainly not altogether).
Also your argument is incorrect in my experience.
At a smaller business I worked at, I was able to use these services to achieve uptime and performance that I couldn’t achieve self-hosted, because I had to spend time on the product itself. So yeah, we’d saved on infrastructure engineers.
At larger scales, what your false dichotomy suggests also doesn’t actually happen. Where I work now, our data stores are all self-managed on top of EC2/Azure, where performance and reliability are critical. But we don’t self-host everything. For example, we use SES to send our emails and we use RDS for our app DB, because their performance profiles and uptime guarantees are more than acceptable for the price we pay. That frees up our platform engineers to spend their energy on keeping our uptime on our critical services.
https://blog.notmyhostna.me/posts/what-i-wish-existed-for-se...
234 more comments available on Hacker News