ULID has an unnecessarily convoluted monotonically incrementing variant in its spec which requires serialization to generate IDs without conflicts. That makes the spec look like trying to solve multiple irrelevant problems in one shot. That’s confusing and more like a design smell. If you need a sequentially incrementing identifier, just use a big enough int.
I think the issue above and UUIDv7’s existence make ULID completely unnecessary.
I've found ULID to be really useful in my API backend that runs on AWS Lambda and writes to DynamoDB. This is a use case where I need Lambda's parallelism, meaning no sharing some common sequential integer. Meanwhile I can roughly order items by creation time which makes pagination and selecting items by time range possible, right on the sort key instead of as a filter. In other words items might end up out of order by a few milliseconds but in my use case that's not a big deal, and I get the benefit of communication-free parallelism.
The monotonic version of ULID is probably faster to generate the next number vs UUIDv7. Whether or not that performance difference matters to you is application-specific.
It’s a big deal because it causes a semantic confusion: for any ID scheme that claims to be “universal”, users expect pervasive uniqueness without resorting to serialization. The “U” in ULID becomes misleading when devs bump into the monotonic variant. That can lead to hard to detect bugs because conflicts or perf degradation may happen only in production. Bad idea to make it part of the spec.
It doesn’t even make sense to have such a scheme in ULID. The only advantage it has over a 64-bit integer is that ULID is 128-bits but, that doesn’t solve any problem or help with anything. It’s just waste, and will cause pain and confusion.
In my eyes, the fact the UUIDs (or specifically v4, as mentioned in the article) encode no information is an advantage. I don’t think encoding the creation date into the UUID is likely to be a huge security hole, but I can imagine some instances where you would want to keep that secret, or at least not have a good reason to publish it.
I would also argue, strongly, that if you are planning to use the datetime an account/event/item was generated, you should track it separately from the UUID, in a dedicated column.
This isn’t intended to refute the article - it was an interesting read and I definitely learned some stuff, but rather to provide an argument for why you might want to stick with v4.
It's a security vs convenience tradeoff. For most people, making the ID practically unguessable, while also preserving index locality in databases, is the right balance.
It is a huge security hole. The problem is that people are using v4 all over the place as a source of “unguessable” tokens. This is not the purpose of a UUID and is wrong of course, but it doesn’t change the reality that many services will become vulnerable if they change UUID schemes.
You can track it separately, but if you use a database like DynamoDB using a ULID as the SK leads to optimized access patterns. If you use a UUID as your SK, then sort order of two items created at the same second is not guaranteed.
> While there are 48 bits of time information in both UUIDv7 (with millisecond precision) and ULID, the ULID standard encodes the time as Unix Epoch in milliseconds for the whole 48 bits whereas UUIDv7 splits the time into 36 seconds bits and 12 millisecond bits.
Ironically, the information and link that you've shared is incorrect / outdated.
The newer draft proposal allows for multiple different options of time information to be added to a UUIDv7 encoding, beyond the typical 48-bits of Unix epoch millisecond data.
First thing that isn't correct or changed since this was written:
> There don't seem to be any existing built-in or extension functions that support generating UUIDv7s in PostgreSQL. To generate a UUIDv7, the function below could be tweaked in order to support the correct formatting.
I like the ULID properties for the database, but it misses so many of the human factors that would make it excel.
Your tokens/identifiers should look like this:
namespace_crockfordEncodedEntropy
- There must be token/identifier prefixes like Stripe. While one could concatenate/prefix these bytes in application code, it's best to just accept the overhead and store them in your records for ease of lookup. It makes an engineer's life so much easier when debugging foreign keys on the fly or during an emergency SEV.
- There should be no hyphens (in string serialization). You probably want to encode and store crockford directly rather than do bytes to hex encoding, again for engineer ease of use.
- You may want to leave some trailing bytes for request routing in active-active high SLA systems. Especially for when record lookup cannot fail.
Doesn’t use the crockford encoding, but does is another one that minimises confusables.
Definitely looks pretty solid, so keen to give them a shot in my next project.
uuidv7 will be supported in PostgreSQL 17, at which point generation should be as fast as uuidv4, or if you implement it in pgsql now it will be as fast as the proposed ULID algorithm.
Insert performance could be even better, iirc for BTree Indexes monotonic increasing values are better than random ones, but feel free to correct me on that ;)
Commenting on this a bit late, but in case anyone reads this later too:
UUIDv7 support unfortunately didn't make it to Postgres 17, since the RFC wasn't completely finalized yet by the time of feature freeze (April 8), see discussion on pgsql-hackers:
In that case I don't understand why the author didn't go for uuidv7? It seems like existing tooling (both in database and outside) deals with it better, it seems like there are no downsides unless you expect your identifiers to be generated past 4147 but don't care if they are generated past 10889 (I'd love to hear that use-case, seems like it must be interesting).
The way to take advantage of the bandwidth of multiple storage devices is to distribute concurrent writes across them, rather than forcing everything to commit sequentially using contended locks or rollbacks.
The DB (or any application) should not have any need to know what devices are underneath its mount point. If you’re striping across disks, that’s a device (or filesystem, for ZFS) level implementation.
I think the issue above and UUIDv7’s existence make ULID completely unnecessary.