UUIDv7 Comes to PostgreSQL 18
Posted 3 months ago · Active 3 months ago
thenile.dev · Tech story · High profile
Sentiment: calm, mixed · Debate: 70/100
Key topics: UUIDv7, PostgreSQL, Database Design, Security
The introduction of UUIDv7 in PostgreSQL 18 sparks a discussion on its implications for database design, security, and usability, with some users expressing concerns and others highlighting its benefits.
Snapshot generated from the HN discussion
Discussion Activity
- Very active discussion
- First comment: 51m after posting
- Peak period: 32 comments in the 2-4h window
- Average per period: 6.1 comments
- Comment distribution: 67 data points
Based on 67 loaded comments
Key moments
1. Story posted: Sep 21, 2025 at 10:24 AM EDT (3 months ago)
2. First comment: Sep 21, 2025 at 11:14 AM EDT (51m after posting)
3. Peak activity: 32 comments in the 2-4h window (hottest window of the conversation)
4. Latest activity: Sep 22, 2025 at 2:31 PM EDT (3 months ago)
ID: 45323008 · Type: story · Last synced: 11/20/2025, 6:30:43 PM
Read the primary article or dive into the live Hacker News thread when you're ready.
This happens way more often than companies want to admit.
OP unsurprisingly left out the details of how they caught them.
> Our founder asked me "Is there anything we're sending over the API that might clue someone in when someone signs up? Are we sending a createdAt field or something?" and I said "No, but we do have a timestamp in one of the IDs..." -- well, we removed the field and this behavior stopped soon after.
We have applications where we control the creation of the primary key and where the primary key will be exposed to end users, such as when using a typical web app framework like Rails, Phoenix, Loco, Laravel, etc. For these applications, the UUIDv7 timestamp is too problematic for security, so we prefer binary-stored UUIDv4 even though it's less efficient.
We also have applications where we control the creation of the primary key and where we can ensure the primary key is never shown to users. For these applications, UUIDv7 is slower at inserts and joins, so we prefer BIGSERIAL for the primary key and binary-stored UUIDv4 for showing to users, such as in URLs.
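A minimal sketch of that second layout in PostgreSQL, using gen_random_uuid() for the exposed identifier (the uuid type is already binary storage); table and column names are just illustrative:

```sql
-- BIGSERIAL primary key for fast inserts and joins; a random UUIDv4
-- is the only identifier that ever appears in URLs or API responses.
CREATE TABLE posts (
    id        bigserial PRIMARY KEY,                          -- internal only
    public_id uuid NOT NULL DEFAULT gen_random_uuid() UNIQUE, -- safe to expose
    title     text NOT NULL
);

-- URL handlers resolve the public ID back to the internal key.
SELECT id, title FROM posts WHERE public_id = $1;
```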
(And why the heck are different types or variants of UUIDs called "versions"?)
You've embrittled your system.
> You've embrittled your system.
This is the main argument for keeping surrogate keys internal: they really should be thought of like pointers, and dangling pointers outside of your control are brittle. Ideally, anything exposed to the wild that points back to a surrogate key decodes with extra information you can use to invalidate it (like a safe pointer!).
It does make sense, and it's what you should do instead of using a UUID as the PK for this purpose.
And I say this as someone who recently had to convert some tables from auto-increment IDs to UUIDs. In that instance, they were sharded tables that were relying on the IDs to be globally unique and made heavy use of the IDs to scan records in time order. So UUIDs could solve the first problem while preserving the second.
Your distributed table case sounds like a great use case for UUIDv7.
It took me under 15 seconds to come up with that.
Privacy-wise:
- Knowing sequential IDs leaks the rate of creation and the amount of said entity, which can translate into number of customers or rate of sales.
- Knowing timed IDs leaks activity patterns. This gets worse as you cross-reference data.
- Random IDs reveal nothing.
---
Security-wise:
- Sequential IDs can be guessed.
Performance-wise:
- Sequential IDs may result in self-inflicted hotspots.
- Random IDs lend themselves to sharding, but make indexing, column compression, and maintaining order after inserts hard.

> Knowing sequential IDs leaks the rate of creation and the amount of said entity, which can translate into number of customers or rate of sales.

This implies the existence of an endpoint that returns a list of items, which could by itself be used to determine customers or rate of sales. It also means you have a broken security model that leaks a list of customers or a list of sales that you probably should not have access to in the first place.

> Knowing timed IDs leaks activity patterns. This gets worse as you cross-reference data.

Again, if you can list items freely you can do this anyway: capture what exists now and diff to determine update times and creation times.
- table-global sequence :: leaks activity signals to every user who can create and see new IDs. This is the naive sequence you get when creating an incremental ID in a table.
- user-local sequence :: reveals only how many invoices a single user has, which is safe if kept within the reach of a single user. This sequence, though, is slower and more awkward to generate (a sketch of one approach follows the example below).
Say you have a store that allows a user to check out just their own invoices.
- store.com/profile/invoices/{sequence_id}/
This does not imply that using a random ID will return the data of another user, so it isn't necessarily as unsafe as you guessed. You'll probably get a 404 that does not even acknowledge the existence of said ID (though it may be susceptible to timing attacks that guess whether it exists).
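A rough sketch of one way to implement such a user-local sequence in PostgreSQL; the counter table and names are hypothetical, and the row lock taken by the UPDATE keeps concurrent inserts for the same user from reusing a number:

```sql
-- One counter row per user (created when the user signs up).
CREATE TABLE invoice_counters (
    user_id bigint PRIMARY KEY,
    last_no bigint NOT NULL DEFAULT 0
);

CREATE TABLE invoices (
    user_id    bigint  NOT NULL,
    invoice_no bigint  NOT NULL,
    total      numeric NOT NULL,
    PRIMARY KEY (user_id, invoice_no)   -- dense per user, opaque globally
);

-- The UPDATE locks the user's counter row, serializing concurrent inserts.
WITH next AS (
    UPDATE invoice_counters
       SET last_no = last_no + 1
     WHERE user_id = 42
 RETURNING last_no
)
INSERT INTO invoices (user_id, invoice_no, total)
SELECT 42, last_no, 99.50 FROM next;
```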
---
With timed IDs, you do need a data leak outside the bubble of a single user. Database design should always try to guard against that anyway. That's why we salt our passwords and store only their digest (right?).
You can mask internal IDs with opaque IDs if you want to maintain a translation layer. There are also more distributed use cases that require coming up with new IDs in isolation, so they will be "exposed" anyway as you sync up with other nodes.
It would have to leak sensitive information to be "subtracting security", which implies you're relying on timestamp secrecy to ensure your security. This would be one of the "other problems" the GP mentioned.
It would be much easier to discuss the merits of your argument if you had an example of the dangers of leaking creation timestamps for database entries.
Otherwise, carparks & database creation timestamps have nothing in common that is meaningfully relevant to your argument. You cannot just generalise all worldly concepts & call it a day.
My analogy was meant for a reader with a modicum of ability to connect dots to better interpret the parent and aunt/uncle replies.
Genuinely, without any snark intended: please presume I'm an idiot here because I fully acknowledge I may be missing something blatantly obvious & am just trying to understand your argument better.
> The other post
> the parent and aunt/uncle replies.
I've gone & re-read the parent/grandparent replies in this thread on the assumption I had missed something, but I can't find any reference to estimating growth rates of online companies via publicly exposed DB record timestamps.
Nor can I conceive of an obvious system in my head by which one would do so. I acknowledge that such a hypothetical system almost certainly exists, but it seems non-obvious (to me) & as such it's quite difficult to reason about & discuss.
And I'm honestly not a fan of public services using primary keys publicly for anything important. I'd much rather have nicer or shorter URLs.
What might be an improvement is if you could look up records efficiently from just the random bits of the UUID, replacing the timestamp with an index.
Unlisted URLs, like YouTube videos, are a popular example of this used by a reputable tech company.
> UUIDv7 has 6 random bytes
Careful. The spec allows 74 bits to be filled randomly. However, you are allowed to exchange up to 12 bits for a more accurate timestamp and up to 42 bits for a counter. If an attacker can get a fix on the timestamp and counter, the random portion only provides 20 bits (~1M possibilities).
Python 3.14rc introduces a UUIDv7 implementation that has only 32 random bits, for example.
Basically, you need to see what your implementation does.
The numbers you provided are suspicious, but seem quite feasible to attack: 1M IDs out of ~4B possibilities means each guess has a ~1-in-4000 chance. You can make 4000 requests in roughly an hour at a one-per-second rate. A successful attack only needs to guess one ID; it doesn't need to enumerate all of them.
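If you want to check what your own stack exposes, PostgreSQL can pull the advertised fields back out of a UUID; a quick look, assuming PostgreSQL 18's uuidv7() and the uuid_extract_* helpers that shipped in PostgreSQL 17:

```sql
-- The version and creation timestamp of a UUIDv7 are plainly readable;
-- only the remaining random bits have to be guessed.
SELECT id,
       uuid_extract_version(id)   AS version,     -- 7
       uuid_extract_timestamp(id) AS created_at   -- millisecond wall-clock time
FROM (SELECT uuidv7() AS id) AS t;
```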
The backwards compatibility is a wild trade-off.
Either way, my comment was hyperbole, but the concept is the same: 10,000 records per millisecond, and you get the point. For 99.999% of SQL use cases, UUIDv7 is good.
I only advocate for UUIDs so much because three separate times in my career I have been the one who had to add UUIDs so we don't leak the number of patients or let users scrape the site by just incrementing IDs (amongst other protections). So much easier to just UUID everything.
File-sharing endpoints for a business? No. Use a separate UUIDv4-based "sharing UUID" that you map internally to the UUIDv7 PK.
https://en.wikipedia.org/wiki/German_tank_problem
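For context, the classic estimator behind that link: having observed k sequential IDs whose maximum is m, the estimated population total is N ≈ m + m/k - 1, so e.g. seeing invoice IDs up to m = 1000 across k = 10 samples suggests roughly 1099 invoices in total.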
If users/products are onboarded in bulk or during B2B account signup, then leaking the creation time of each of them through any search that returns their UUIDs becomes metadata that can be used to correlate users with each other, if imperfectly.
Often, the benefits of a UUID with natural ordering outweigh this. But it's something to weigh before deciding to switch to UUIDv7.
If your primary keys are monotonic or time based, bad actors can simply walk your API.
Anything else, as you're rightly pointing it out, is a bit of a stretch.
Last year I went to renew my ID and they told me: sorry, the (centralised) system is down. Before computers, things were done in a more resilient way: local, offline authoring plus syncing when convenient, which didn't result in "Sorry, computer says no. Schedule a new appointment."
https://news.ycombinator.com/item?id=45275973
An interesting compromise is transforming the UUIDv7 into a UUIDv4 at the API boundary, as e.g. UUIDv47 does [1]. On the other hand, if you are doing that, you can also go with u64 primary keys and transform those.
1: https://github.com/stateless-me/uuidv47
Seems like it would be wise to add caveats around using this form in external-facing applications or APIs.
A UUIDv7 leaks to the outside when it was created, but guessing the next value is still completely infeasible. 62 bits is plenty of security if each attempt requires an API request.
Why does everybody want to find excuses to leave footguns around?
Ideally you have some sort of rate limit on your APIs...
This depends very much on the type of UUID; e.g. a version 1 UUID is just a timestamp, a MAC address, and a "collision" counter to use if your clock ticks too slowly or you want batches of UUIDs.
The other big deal is that UUIDs can be created on the client and supplied by the client. That can make a lot of code drastically simpler (idempotency comes to mind).
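A minimal sketch of the idempotency point in PostgreSQL; the table and the UUID literal are made up, and ON CONFLICT DO NOTHING turns a retried request into a no-op:

```sql
CREATE TABLE orders (
    id         uuid PRIMARY KEY,                     -- generated by the client
    payload    jsonb NOT NULL,
    created_at timestamptz NOT NULL DEFAULT now()
);

-- Retrying the same request re-sends the same client-chosen id,
-- so duplicates are silently ignored instead of creating a second row.
INSERT INTO orders (id, payload)
VALUES ('018f4d2e-0000-7abc-8def-112233445566', '{"sku": "abc"}')
ON CONFLICT (id) DO NOTHING;
```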
Use UUIDv7 for your primary key, but only for large-scale databases and only for internal keys. Don't expose these keys to anything outside the database or application.
Use UUIDv4 columns with unique indexes over them as "external IDs", which are what is exposed via APIs to other systems.
Basically, create two IDs for one record: one random, which is not the primary key, and one sequential, which is the primary key.
I have done this in real systems, and it works.
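A sketch of that two-ID layout, same shape as the BIGSERIAL example earlier but with PostgreSQL 18's uuidv7() as the internal key; names are illustrative:

```sql
CREATE TABLE accounts (
    id          uuid PRIMARY KEY DEFAULT uuidv7(),              -- internal: time-ordered, index-friendly
    external_id uuid NOT NULL DEFAULT gen_random_uuid() UNIQUE, -- external: random, safe to expose
    name        text NOT NULL
);

-- Everything crossing the API boundary uses external_id only.
SELECT id, name FROM accounts WHERE external_id = $1;
```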
> UUIDs have a bad reputation, mostly based on randomly allocated UUIDs' effect on indexes. But UUIDs also give us 16 bytes of space to play with, which can be to our advantage. We can use the space of UUIDs to structure and encode data into our identifiers. This can be really useful for multi-tenant applications, for sharding or partitioning. UUIDs can also help improve your web app security, by not leaking sequentially allocated IDs.
> We'll take a look at index performance concerns with randomly allocated UUIDs, sequential IDs and sensibly structured UUIDs.
> We'll see how we can go about extracting information from these UUIDs, how to build these UUIDs inside PostgreSQL, and how we can look at the performance of these functions.
> Finally, we'll look at adding support for bitwise operations on UUIDs using an extension.
Slides: https://www.postgresql.eu/events/pgdaynl2025/schedule/sessio...
Live stream: https://youtube.com/watch?v=tJYEuIpzch4&t=2h36m
What May Surprise You About UUIDv7 https://medium.com/@sergeyprokhorenko777/what-may-surprise-y...