We evaluated UUIDv7 and determined that it's unwise to use it as a primary key.
We have applications where we control the creation of the primary key, and where the primary key will be exposed to end users, such as when using a typical web app framework built with Rails, Phoenix, Loco, Laravel, etc. For these applications, the UUIDv7 timestamp is too problematic for security, so we prefer binary-stored UUIDv4 even though it's less efficient.
We also have applications where we control the creation of the primary key, and where we can ensure the primary key is never shown to users. For these applications, UUIDv7 is slower at inserts and joins, so we prefer BIGSERIAL for the primary key, and a binary-stored UUIDv4 for showing to users, such as in URLs.
Why would exposing any primary key be bad for security? If your system's security *in any way* depends on the randomness of a database primary key, you have other problems. It's not the job of a primary key to add to security. Not to mention that UUIDv7 has 6 random bytes, which, for the vast majority of web applications, even finance, is more than enough randomness. Just imagine how many requests an attacker would need to make to guess even one UUID (281 trillion possible combinations for 6 random bytes, and they would also need to guess the Unix timestamp in milliseconds correctly). The only scenario I can think of is that you use the primary key as a sort of API key.
One of the big things here is de-anonymization and account correlation. Say you have an application where users'/products' affiliation with certain B2B accounts is considered sensitive; perhaps they need to interact with each other anonymously for bid fairness, perhaps people might be scraping for "how many users does account X have onboarded" as metadata for a financial edge.
If users/products are onboarded in bulk or during B2B account signup, then leaking the creation times of each of them with any search that returns their UUIDs becomes metadata that can be used to correlate users with each other, if imperfectly.
Often, the benefits of a UUID with natural ordering outweigh this. But it's something to weigh before deciding to switch to UUIDv7.
> system's security in any way depends on the randomness of a database private key
Unlisted URLs, like YouTube videos, are a popular example used by a reputable tech company.
> UUIDv7 has 6 random bytes
Careful. The spec allows 74 bits to be filled randomly. However, you are allowed to exchange up to 12 bits for a more accurate timestamp and a counter of up to 42 bits. If you can get a fix on the timestamp and counter, the random portion only provides 20 bits (~1M possibilities).
Python 3.14rc introduces a UUIDv7 implementation that has only 32 random bits, for example.
Basically, you need to see what your implementation does.
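To make that concrete: the 48-bit millisecond timestamp prefix is the one field every conforming UUIDv7 shares (RFC 9562), so you can at least check what any given implementation exposes. A minimal Python sketch; `uuid7_timestamp` is a made-up helper name, and generating v7 values assumes something like Python 3.14's `uuid.uuid7()` or a third-party library:

```python
import uuid
from datetime import datetime, timezone

def uuid7_timestamp(u: uuid.UUID) -> datetime:
    """Recover the 48-bit unix_ts_ms prefix that every UUIDv7 carries."""
    if u.version != 7:
        raise ValueError("not a UUIDv7")
    unix_ts_ms = u.int >> 80  # top 48 of the 128 bits
    return datetime.fromtimestamp(unix_ts_ms / 1000, tz=timezone.utc)

# Example, assuming a v7 generator is available:
# u = uuid.uuid7()              # Python 3.14+, or any library that emits v7
# print(uuid7_timestamp(u))     # creation time, to the millisecond
```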
Only 32 bits, so 4 billion guesses per microsecond... Even if YouTube has 1 million videos per microsecond, you would never guess them before hitting rate limits.
You're mixing a couple of things. The 32-bit random portion occurs in the Python implementation, which uses a millisecond counter.
The numbers you provided are suspicious, but seem quite feasible to attack. 1M IDs in 4B means each guess has a ~1-in-4,000 chance. You can make 4,000 requests in a little over an hour at a one-per-second rate. A successful attack only has to guess one ID; it doesn't need to enumerate all of them.
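For anyone checking the arithmetic, that per-guess estimate is just the ratio of occupied IDs to the 32-bit random space:

```python
ids_in_bucket = 1_000_000        # the hypothetical "1M videos" sharing one timestamp bucket
keyspace = 2 ** 32               # 32 random bits
print(keyspace / ids_in_bucket)  # ~4295, i.e. roughly a 1-in-4000 chance per guess
```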
Ah I was looking at the pg_uuidv7 python package.
The backwards compatibility is a wild trade off.
Either way, my comment was hyperbole, but the concept is the same: 10,000 records per millisecond, and you get the point. For 99.999% of SQL use cases, UUIDv7 is good.
I only advocate for UUIDs so much because, 3 separate times in my career, I have been the one who had to add UUIDs so we don't leak the number of patients or let users scrape the site by just incrementing IDs (amongst other protections). So much easier to just UUID everything.
Not sure if this is helpful here, but you're still looking at 32 bits of randomness, regardless of the time window. Use it for anything where you feel that's enough randomness to secure - a private home video of a cat breaking a cup? Sure.
File sharing endpoints for a business? No. Use a separate UUIDv4-based 'sharing UUID' that you map internally to the UUIDv7 PK.
The German Tank Problem springs to mind. While not precisely the same problem, it's still a case where more information than necessary is leaked in seemingly benign IDs. For the Germans, the serial numbers leaked production volumes. For UUIDv7, you're leaking timing and timestamps.
https://en.wikipedia.org/wiki/German_tank_problem
The rest of the ID will be random enough that guessing it will take an extremely long time, unless all the tanks were inserted in the same microsecond of course. I’m not sure this is a security issue with UUID though!
Because anything that knows the primary key now knows the timestamp. The UUID itself leaks information. It's not that it's not adding security. It's that it's actually subtracting security.
There’s every chance the API has timestamps on when it was inserted. Honestly I’d rather my data was ordered correctly than imagining the extremely rare situations that leaking the insert time is going to bring the world falling down. You usually want that information.
And I’m honestly not a fan of public services using primary keys publicly for anything important. I’d much rather nice or shorter URLs.
What might be an improvement is if you can look up records efficiently from the random bits of the UUID automatically, replacing the timestamp with an index.
The timestamp can be recovered from the UUID?
> leaks information
It would have to leak sensitive information to be "subtracting security", which implies you're relying on timestamp secrecy to ensure your security. This would be one of the "other problems" the gp mentioned.
Pretty much any information can be used for something. You're ignoring everything they said about how something not critical to application security may still be undesirable to leak for other reasons. Example: Target and Walmart may not depend on satellites being unable to image their parking lots from the perspective of loss prevention or corporate security. But it still leaks information they may not want financial analysts to know about their performance.
You've used an analogy instead of an example to demonstrate your point: analogies can be helpful for explaining concepts but are rarely accurate enough to prove logical parity.
It would be much easier to discuss the merits of your argument if you had an example of the dangers of leaking creation timestamps for database entries.
Otherwise, carparks & database creation timestamps have nothing in common that is meaningfully relevant to your argument. You cannot just generalise all worldly concepts & call it a day.
The other post literally mentions using creation timestamps to judge growth rates of companies on a platform.
My analogy was meant for a reader with a modicum of ability to connect dots to better interpret the parent and aunt/uncle replies.
> a reader with a modicum of ability to connect dots
Genuinely, without any snark intended: please presume I'm an idiot here because I fully acknowledge I may be missing something blatantly obvious & am just trying to understand your argument better.
> The other post
> the parent and aunt/uncle replies.
I've gone & re-read the parent / grand parent replies in this thread on the assumption I had missed something but I can't find any reference to estimating growth rates of online companies via publicly exposed db record timestamps.
Nor can I conceive of an obvious system in my head by which one would do so. I acknowledge that such a hypothetical system almost certainly exists, but it seems non-obvious (to me) & as such it's quite difficult to reason about & discuss.
I like you. Lol
Sam Walton used to fly investors in his plane over Walmart stores and ask them to count the cars in the parking lot, then he would fly them over competitors stores and ask the same. Just a fun fact about how this is a very real scenario!
Example: if user IDs are not random but e.g. BIGSERIAL (autoincremented) and they're exposed through some API, then API clients can infer the creation time of said users in the system. Now if my system is storing e.g. health data for a large population, then it'll be easy to guess the age of the user. Etc. This is not a security problem, this is an information governance problem. But it's a problem. Now if you say that I should not expose these IDs - fine, but then whatever I expose is essentially an ID anyway.
I really don't think using primary keys publicly is ever good; just because UUIDv4 has allowed people to smash junk into the URL doesn't mean it's better for the web or the users than a slug or a cleaner ID.
Depends how much entropy is in your primary keys.
If your primary keys are monotonic or time based, bad actors can simply walk your API.
Deploying UUIDv7 certainly requires more thought about the implications. In many cases leaking the creation time of a key is completely fine; in some cases it isn't.
An interesting compromise is transforming the UUIDv7 to a UUIDv4 at the API boundary, e.g. with UUIDv47 [1]. On the other hand, if you are doing that, you could also go with u64 primary keys and transform those.
1: https://github.com/stateless-me/uuidv47
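Roughly sketched in Python, the general idea behind such a transform (not the uuidv47 library's exact scheme, which uses SipHash and also presents a v4 version nibble) is to XOR the 48-bit timestamp with a keyed PRF of the bits that stay untouched, so the translation is its own inverse and needs no lookup table. `SECRET` and the function names below are placeholders:

```python
import hmac, hashlib, uuid

SECRET = b"server-side key, never stored with the data"  # placeholder

def _mask48(low80: int) -> int:
    # Keyed 48-bit PRF over the low 80 bits (version/variant + random), which stay untouched
    d = hmac.new(SECRET, low80.to_bytes(10, "big"), hashlib.sha256).digest()
    return int.from_bytes(d[:6], "big")

def translate(u: uuid.UUID) -> uuid.UUID:
    """XOR the 48-bit timestamp field with a keyed mask; applying it twice round-trips."""
    ts, low80 = u.int >> 80, u.int & ((1 << 80) - 1)
    return uuid.UUID(int=((ts ^ _mask48(low80)) << 80) | low80)

# external = translate(internal_v7)   # timestamp no longer readable without SECRET
# internal = translate(external)      # the same call inverts it
```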
I wonder if the issue is with exposing internal IDs to end users. I'm sure the experts here have already thought of this, but could someone explain why using encryption or even an HMAC for external views of a primary key doesn't make sense? Maybe because the extra processing is more expensive than just using UUIDv4? Using a KDF such as argon2id on the random bits of a UUIDv7 seems like it might work well for external IDs.
(And why the heck are different types or variants of UUIDs called "versions"?)
Because now, for the rest of eternity, every single person who writes any code that moves data from this table to somewhere else, for any purpose, has to remember that the primary key gives away the creation time of something, which can potentially be linked to something else. A lot of people won't notice that, and a lot of people who do notice it will get the remediation wrong. And you can now forget using a simple view on the database to give any information to any person or program that shouldn't get the creation times.
You've embrittled your system.
The question was why not use encryption (sqids/hashids/etc.) to secure publicly exposed surrogate keys; I don't think this reply is on point. Surrogate keys ideally are never exposed (for a slew of reasons beyond just leaking information), so securing them is a perfectly reasonable thing to do (as seen everywhere on the internet). OTOH, using any form of UUID as a surrogate key is an awful thing to do to your DB engine (making its job significantly harder for no benefit).
> You've embrittled your system.
This is the main argument for keeping surrogate keys internal - they really should be thought of like pointers, and dangling pointers outside of your control are brittle. Ideally, anything exposed to the wild that points back to a surrogate key decodes with extra information you can use to invalidate it (like a safe pointer!).
> but could someone explain why using encryption or even an HMAC for external views of a primary key doesn't make sense?
It does make sense, and it's what you should do instead of using a UUID as the PK for this purpose.
It does make all kinds of sense and is a majorly underutilized tool.
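For illustration, a minimal sketch of the keyed-hash flavour in Python (`SECRET_KEY` and the helper are made up). Because a keyed hash is one-way, you store the derived value in its own indexed column (or a translation table) and look rows up by it; if you need to decode the external ID back into the PK directly, use reversible encryption instead:

```python
import hashlib

SECRET_KEY = b"application-level key, rotate with care"  # placeholder

def external_id(internal_pk: int) -> str:
    """Derive a stable, opaque external ID from an internal bigint primary key."""
    digest = hashlib.blake2b(internal_pk.to_bytes(8, "big"),
                             key=SECRET_KEY, digest_size=16)
    return digest.hexdigest()

# external_id(42) -> 32 hex chars that leak neither insertion order nor creation time
```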
UUIDv7 makes sense when a distributed system needs to insert vast amounts of data that will be consumed chronologically. Typically an event log table.
Anything else, as you're rightly pointing it out, is a bit of a stretch.
The distributed part is what forces creating IDs outside of the server, where UUIDs become useful, and also where systems become reliable.
Last year I went to renew my ID and they told me: sorry, the (centralised) system is down. Before computers, things were done in a more resilient local, offline-authoring, sync-when-convenient way that didn't result in "Sorry, computer says no. Schedule a new appointment."
Recently someone shared a method for encrypting the timestamp portion as well:
https://news.ycombinator.com/item?id=45275973
I think privacy fits better than security here. If your primary key is being used as a secret, then you probably got your schema wrong. How will you encrypt them when required?
This is why the UUID versions should have been labeled by letter rather than number. Each UUID version doesn't replace the last. They do different things. The numbered versioning gives the impression that "higher numbers = better" and that's neither the case nor the intention.
Currently evaluating UUIDv7 as the primary key for some inventory origin. I think it should be OK to use it for such a use case, since it will indicate the time of creation? Any thoughts?
You have to ask what problems exactly you are solving. Unless there is a compelling reason to use them, sticking with auto-increment IDs is much simpler.
And I say this as someone who recently had to convert some tables from auto-increment IDs to UUIDs. In that instance, they were sharded tables that were relying on the IDs to be globally unique, and made heavy use of the IDs to scan records in time order. So UUIDs were something which could solve the first problem while preserving the functionality of the second requirement.
Yeah depends a lot on scale. If the inventory system only holds thousands of items, UUIDs just add a lot of headache for little gain.
Your distributed table case sounds like a great use case for UUIDv7.
It's a perfectly good choice; most of the complaints here are exaggerated. If your inventory has SKUs, use those externally/for links and for API lookups if possible.
If knowing IDs has a negative impact on security, then the application's system design is probably trash.
The actual concern is privacy.
Privacy wise,
- Knowing sequential IDs leaks the rate of creation and amount of said entity, which can translate into the number of customers or rate of sales.
- Knowing timed IDs leaks activity patterns. This gets worse as you cross-reference data.
- Random IDs reveal nothing.
---
Security wise,
- Sequential IDs can be guessed.
Performance wise,
- Sequential IDs may result in self-inflicted hotspots.
- Random IDs lend themselves to sharding, but make indexing, column compression, and maintaining order after inserts hard.

> Knowing sequential IDs leaks the rate of creation and amount of said entity, which can translate into the number of customers or rate of sales.
This implies the existence of an endpoint that returns a list of items, which could by itself be used to determine customers or rate of sales. This also means you have a broken security model that leaks a list of customers or a list of sales, which one should probably not have access to in the first place.
> Knowing timed IDs leaks activity patterns. This gets worse as you cross-reference data.
Again, if you can list items freely you can do this anyway: capture what exists now and do diffs to determine update times and creation times.
With sequential IDs you use one of two sequences:
- table-global sequence :: Leaks activity signals to all users who can create and see new IDs. This is the naive sequence you get when creating an incremental ID in a table.
- user-local sequence :: Reveals only how many invoices a single user has, which is safe if kept within the reach of that single user. This sequence, though, is slower and more awkward to generate.
Say you have a store that allows a user to check out just their own invoices.
- store.com/profile/invoices/{sequence_id}/
This does not imply that using a random ID will return data from another user, so it isn't necessarily as unsafe as you guessed. You'll probably get a 404 that does not even acknowledge the existence of said ID (but it may be susceptible to timing attacks that guess whether it exists).
---
With timed IDs, you need a data leak outside the bubble of a single user for this to matter. Database design should always try to guard against that anyway. That's why we salt our passwords and store only their digest (right?).
Why leak your primary keys? They are for the DBMS, not your end users.
Primary keys to what? Users wanting to get a specific piece of data will need to know some user-visible ID for it.
You can masquerade internal IDs with opaque IDs if you want to maintain a translation layer. There are also more distributed use cases that require coming up with new IDs in isolation, so they will be "exposed" anyway as you sync up with other nodes.
Welp, never use sequential IDs
Yeah, I am trying to imagine a universe where having the creation time of an item breaks your security model, and every path I go down ends with the system having terrible security.
I know that the person I'm stalking created a pseudonymous account on service X around time Y. Based on other information, I have a limited number of suspect accounts. The creation time leaks to me, either via a bug which would otherwise have been harmless, or because somebody writing code "can't imagine a universe where having the creation time of an item breaks your security". I use the creation time to figure out which of my candidates is actually the target.
It took me under 15 seconds to come up with that.
It took you 15 seconds because it's a terrible example; "around time Y" is doing insane lifting in this concept. Then "based on other information" - okay, so some other information is enabling this.
It turns out that in reality, I usually know both "around time Y" and "other information". You're going to narrow me down from 10 accounts to 1, or from 100 to 10.
In a huge number of cases you will have timestamps in the payload anyway, since most DB records will have unredacted createdOn, updatedOn fields for display in the UI.
Interesting comment from a previous thread on UUIDv7 in Postgres: https://news.ycombinator.com/item?id=39262286
It's 300% more likely the competitor took out one of the sales guys and had him dump the Salesforce database.
This happens way more often than companies want to admit.
OP unsurprisingly left out the details of how they caught them.
OP gives more context in one of the child comments:
> Our founder asked me "Is there anything we're sending over the API that might clue someone in when someone signs up? Are we sending a createdAt field or something?" and I said "No, but we do have a timestamp in one of the IDs..." -- well, we removed the field and this behavior stopped soon after.
"Fun With UUIDs" goes into more depth; it was presented at PGDay Lowlands, Sept 12, 2025.
> UUIDs have a bad reputation, mostly based on randomly allocated UUIDs' effect on indexes. But UUIDs also give us 16 bytes of space to play with, which can be to our advantage. We can use the space of UUIDs to structure and encode data into our identifiers. This can be really useful for multi-tenant applications, for sharding or partitioning. UUIDs can also help improve your web app security, by not leaking sequentially allocated IDs.
> We'll take a look at index performance concerns with randomly allocated UUIDs, sequential ids and sensibly structured UUIDs.
> We'll see how we can go about extracting information from these UUIDs, how to build these UUIDs inside PostgreSQL, and how we can look at the performance of these functions.
> Finally, we'll look at adding support for bitwise operations on UUIDs using an extension.
Slides https://www.postgresql.eu/events/pgdaynl2025/schedule/sessio...
Live stream https://youtube.com/watch?v=tJYEuIpzch4&t=2h36m
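As a rough illustration of the "structure the 16 bytes" idea from the abstract (a hypothetical layout, not the talk's scheme; it also ignores the RFC version/variant bits for brevity):

```python
import os, time, uuid

def structured_uuid(tenant_id: int) -> uuid.UUID:
    """Hypothetical layout: 16-bit tenant/shard id | 48-bit ms timestamp | 64 random bits."""
    assert 0 <= tenant_id < (1 << 16)
    ts_ms = int(time.time() * 1000) & ((1 << 48) - 1)
    rand = int.from_bytes(os.urandom(8), "big")
    return uuid.UUID(int=(tenant_id << 112) | (ts_ms << 64) | rand)

def tenant_of(u: uuid.UUID) -> int:
    # The tenant/shard prefix also keeps each tenant's rows clustered in the index
    return u.int >> 112
```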
There is a good solution to concerns over this...
Use UUIDv7 for your primary key but only for large scale databases and only for internal keys. Don't expose these keys to anything outside the database or application.
Use UUIDv4 columns with unique indexes over them as "external IDs" which are what is exposed via APIs to other systems.
Basically, create two IDs for one record - one random, which is not the primary key, and one time-ordered, which is the primary key.
I have done this in real systems.. and it works.
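A minimal sketch of that split (assuming a UUIDv7 generator such as Python 3.14's `uuid.uuid7()`; on the database side the external column just needs its own unique index):

```python
import uuid

def new_record_ids() -> tuple[uuid.UUID, uuid.UUID]:
    internal_pk = uuid.uuid7()   # time-ordered: clusters well in the index, never leaves the backend
    external_id = uuid.uuid4()   # random: what URLs, APIs, and other systems see
    return internal_pk, external_id
```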
Don't UUIDs have the big advantage of being easy to shard, over auto-incrementing IDs? That's usually a big deal.
The other big deal is that UUIDs can be created on the client and supplied by the client. That can make a lot of code drastically simpler (idempotency comes to mind).
All correct
95% of the comments here have nothing to do with reality.
What May Surprise You About UUIDv7 https://medium.com/@sergeyprokhorenko777/what-may-surprise-y...
One of the main points of using a UUID as a record identifier was to make it so people couldn't guess adjacent records by incrementing or decrementing the ID. If you add sequential ordering, won't that defeat the purpose?
Seems like it would be wise to add caveats around using this form in external facing applications or APIs.
There are still 62 bits of random data; even if you know the EXACT millisecond the row was created (you likely won't), you would still need to make 1 billion guesses per second for about 73 years on average to hit a given ID.
Ideally you have some sort of rate limit on your APIs...
UUIDv7 has a 48-bit timestamp, 12 bits that either provide sub-millisecond precision or are random (in pg they provide precision), and another 62 bits that are chosen at random.
A UUIDv7 leaks to the outside when it was created, but guessing the next ID is still completely infeasible. 62 bits is plenty of security if each attempt requires an API request.
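Back-of-the-envelope, since the numbers above are easy to check:

```python
keyspace = 2 ** 62                        # ~4.6e18 possibilities, even with a known millisecond
guesses_per_second = 1_000_000_000        # absurdly generous for anything behind an HTTP API
years = keyspace / guesses_per_second / (3600 * 24 * 365)
print(years)                              # ~146 years to sweep the space, ~73 on average to hit one ID
```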
... and the next person working on the system thinks "well, this thing is unpredictable, so it's OK if I leak an unsalted hash of it". If they think at all, which is far from certain.
Why does everybody want to find excuses to leave footguns around?
The next ID can't be found just by adding 1, can it? How would you guess the next value?
Does anyone know if there are any sorts of optimizations (either internal or available to the user) for a table with a UUIDv7 PK and a `date_created` column?
Various engines have what's called a "fast key" optimization specifically for integer sequences - if you're testing performance between an int/serial PK and a UUID, the impact ranges from profound to disgusting depending on the engine.
"Unguessable public identifiers" press (X) to doubt
This depends very much on the type of UUID e.g. a type 1 UUID is just a timestamp, a MAC address and a "collision" counter to use if your clock ticks too slowly or you want batches of UUIDs.
The article is about UUID v7, so any criticism of v1 is irrelevant here (plus Postgres has supported v4 out of the box for quite a while now)