Indexing 100M Vectors in 20 Minutes on PostgreSQL with 12GB RAM
Key topics
The impressive feat of indexing 100M vectors in 20 minutes on PostgreSQL with just 12GB RAM has sparked a lively debate about the need for cloud infrastructure. While some commenters, like duckbot3000, wonder if cloud is still necessary beyond remote backups, others, such as setr and riku_iki, point out the complexities of on-premises infrastructure and the value of cloud services in handling failovers and redundancy. The discussion highlights that cloud infrastructure simplifies tasks like disk redundancy and failover nodes, with some, like positron26, using it to streamline their operations, while others, like nwellinghoff, lament the limitations of managed cloud services, such as AWS RDS.
Snapshot generated from the HN discussion
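To see why the headline figure is striking, a back-of-envelope RAM estimate helps. This is a minimal sketch; the vector dimension (768) and float32 storage are assumptions for illustration, not figures from the article.

```python
# Back-of-envelope RAM estimate for 100M vectors.
# Dimension (768) and dtype (float32, 4 bytes) are assumptions --
# the article may use different parameters.

def raw_vector_bytes(n_vectors: int, dim: int, bytes_per_component: int = 4) -> int:
    """Bytes needed to hold just the raw vectors, ignoring all index overhead."""
    return n_vectors * dim * bytes_per_component

n = 100_000_000
gib = raw_vector_bytes(n, 768) / 2**30
print(f"Raw vectors alone: {gib:.0f} GiB")  # ~286 GiB, far above 12 GiB of RAM
```

Under these assumptions the raw vectors alone are over 20x the available RAM, which is why building the index without holding the dataset in memory is the interesting part.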
Discussion activity
- First comment: 10h after posting
- Peak period: 18 comments in the 120-132h window
- Average per period: 4.3
- Based on 26 loaded comments
Key moments
- Story posted: Dec 8, 2025 at 12:30 PM EST
- First comment: Dec 8, 2025 at 10:38 PM EST (10h after posting)
- Peak activity: 18 comments in the 120-132h window
- Latest activity: Dec 14, 2025 at 10:56 AM EST
Rent your VPS and add in extra volumes for like $10 per 100GB.
Netcup is underrated, but there are other providers listed on lowendbox/lowendtalk too, and I'm interested in trying out Hetzner sometime.
And that's not counting the fact that Netcup storage is RAID-backed (though Netcup is limited to 8TB on a VPS).
That works out to about €4.7/TB, roughly $4/TB, or €6/TB in a RAID 5 setup.
I do not understand why they are not using this new pricing model on their older servers, where the best you can get is around €10/TB (for the single 15TB U.2 drive).
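The per-TB figures in the comments above can be sanity-checked with a little arithmetic. The 4-disk RAID 5 layout below is an assumption; the thread does not say how many disks are involved.

```python
# Sanity-checking the per-TB prices quoted in the thread.
# The 4-disk RAID 5 layout is an assumption for illustration.

def price_per_tb(price: float, gb: float) -> float:
    """Normalize a price quoted per some number of GB to a per-TB price."""
    return price / (gb / 1000)

def raid5_cost_per_usable_tb(raw_cost_per_tb: float, n_disks: int) -> float:
    # RAID 5 loses one disk's worth of capacity to parity,
    # so usable capacity is (n-1)/n of raw capacity.
    return raw_cost_per_tb * n_disks / (n_disks - 1)

print(price_per_tb(10, 100))             # $10 per 100GB -> $100/TB for block volumes
print(raid5_cost_per_usable_tb(4.7, 4))  # ~6.27 EUR per usable TB, close to the ~6 quoted
```

With 4 disks, €4.7/TB raw comes out to about €6.3 per usable TB, which lines up with the €6/TB RAID 5 figure quoted above.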
Nice to know, but I was just guessing at what a reasonable price would be :-)
But realistically, I believe the major clouds (Google, AWS) likely have a more robust org and infrastructure for recovery than I can build and maintain.
Not every use case requires 100% uptime.
Actual active-active HA of your datastores is really hard to do (CAP theorem and all that). The majority of companies don't do it.
And yes, I agree, the PG failover setup (and especially dealing with a failure afterwards, to restore the ex-master) is beyond infuriating.
But it's not "pay 10x the amount while easily eating a 10x performance hit" infuriating :)
Engineers who started their career during the cloud craze and don't know anything else are also not the kind to rock the boat, lest the cash cow die and their whole "investment" in their career become useless.
Imagine paying $250+/mo for 32GB of RAM and 4 vCPUs. No wonder Amazon is swimming in cash; the markup on this is bonkers.
https://www.latitude.sh/pricing/c2-small-x86?gen=gen-2
Hetzner doesn't even have specs this low, from what I can tell!
https://www.hetzner.com/dedicated-rootserver/#cores_threads_...
Or, for €184/mo you can get one of their bare-metal GPU offerings with 64GB of RAM, an i5-13500, and an RTX 4000.
A year commitment takes 30% off.
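The cost gap described in these comments is easy to quantify from the figures quoted: roughly $250/mo for a 32GB/4-vCPU cloud VM versus €184/mo bare metal with a 30% yearly-commitment discount. Note the currencies differ, so this is an order-of-magnitude comparison, not an exact one.

```python
# Comparing the figures quoted in the thread. Currencies differ
# (USD vs EUR), so treat this as a rough comparison only.

cloud_monthly_usd = 250.0        # 32GB RAM, 4 vCPUs, managed cloud
bare_metal_monthly_eur = 184.0   # 64GB RAM, i5-13500, RTX 4000
yearly_discount = 0.30           # year-commitment discount quoted above

cloud_yearly_usd = cloud_monthly_usd * 12
bare_metal_committed_eur = bare_metal_monthly_eur * (1 - yearly_discount)

print(f"cloud per year: ${cloud_yearly_usd:.0f}")                    # $3000
print(f"bare metal per month, committed: {bare_metal_committed_eur:.2f} EUR")
```

Even ignoring exchange rates, the discounted bare-metal box with twice the RAM and a GPU costs about half the monthly price of the cloud VM.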
You must have the data upfront; you cannot build this incrementally.
There is also no mention of how this would handle updates, and from the description, even if updates are possible, the index will degrade over time, requiring a new batch indexing run.
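One common way to live with a batch-built index that degrades under writes is a churn-threshold rebuild policy. This is a minimal sketch of that idea; all names here are hypothetical, and the article describes no such mechanism.

```python
# Hypothetical rebuild-when-degraded policy for a batch-built index.
# The article describes no such mechanism; this only illustrates the
# "rebuild after enough churn" idea raised by the comment above.

class RebuildPolicy:
    def __init__(self, rebuild_ratio: float = 0.2):
        self.rebuild_ratio = rebuild_ratio  # rebuild once churn reaches 20%
        self.indexed_rows = 0               # rows covered by the last build
        self.churned_rows = 0               # inserts/updates since that build

    def on_build(self, total_rows: int) -> None:
        """Record a fresh batch build over total_rows rows."""
        self.indexed_rows = total_rows
        self.churned_rows = 0

    def on_write(self, n: int = 1) -> None:
        """Record n inserted or updated rows since the last build."""
        self.churned_rows += n

    def needs_rebuild(self) -> bool:
        if self.indexed_rows == 0:
            return self.churned_rows > 0
        return self.churned_rows / self.indexed_rows >= self.rebuild_ratio

policy = RebuildPolicy()
policy.on_build(100_000_000)
policy.on_write(25_000_000)
print(policy.needs_rebuild())  # True: 25% churn exceeds the 20% threshold
```

The trade-off is the one the comment points at: until the threshold triggers and a new 20-minute batch build runs, query quality on the churned fraction degrades.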