
> Here's a thought experiment [...] I would pick the Unix epoch, because I think others would do the same.

Ah yes, nobody will ever need to represent a date/time before 1970-01-01.

As for your thought experiment: recently I had to pick a "NULL/no/unknown date" value to put into a non-nullable database column and I picked 1900-01-01. Unix epoch crossed my mind only briefly and I discarded it immediately because it's too arbitrary and, more importantly, too near valid dates that the user might use. (Use-case: materials from historical archives being ingested.)

So my opinion is that the RFC authors should be commended for having the foresight to choose an epoch that's far more widely applicable and far less arbitrary (the 1st day of the calendar system that most of the world is using today) than the Unix epoch.



> Ah yes, nobody will ever need to represent a date/time before 1970-01-01.

If only we had a way to represent numbers less than zero :)


Will UUIDs with your favourite encoding for negative numbers put 1969-01-01 before or after 1970-01-01 with a lexicographical sort of the UUID bytes?
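To make the question concrete, here is a minimal sketch. It assumes a hypothetical 48-bit big-endian millisecond field with negative values stored as two's complement; that encoding is my assumption for illustration, not something any UUID version specifies. The helper name `encode_ms` is made up as well.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def encode_ms(dt: datetime, bits: int = 48) -> bytes:
    """Encode milliseconds since the Unix epoch as big-endian two's complement."""
    ms = (dt - EPOCH) // timedelta(milliseconds=1)  # exact integer, may be negative
    return (ms % (1 << bits)).to_bytes(bits // 8, "big")

t_1969 = encode_ms(datetime(1969, 1, 1, tzinfo=timezone.utc))
t_1970 = encode_ms(datetime(1970, 1, 1, tzinfo=timezone.utc))
t_2024 = encode_ms(datetime(2024, 1, 1, tzinfo=timezone.utc))

# Bytes compare lexicographically, like an unsigned big-endian integer.
by_bytes = sorted([t_1969, t_1970, t_2024])
print([b.hex() for b in by_bytes])
```

With this encoding the 1969 value starts with `0xff` bytes, so a plain byte sort places it after both 1970 and 2024. A sign-flipped (offset binary) encoding would sort correctly, but it is no longer "just" a Unix timestamp.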


I think it's rather unlikely that anyone will ever need to generate a v6-8 UUID before 1970.


> Ah yes, nobody will ever need to represent a date/time before 1970-01-01.

More like: the few people interested in doing this can use UUIDv8 (unspecified time format). There are many, many reasons to store times before 1970 in computer systems, but I'm not sure any apply to these UUID formats.

tl;dr of the "Background" section of the RFC: they're designing it so that two UUIDs with fresh timestamps will have values near each other in the database keyspace, which in many cases improves efficiency. If your timestamp isn't freshly generated, I don't know why you'd embed it in a UUID with these formats.
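A rough sketch of what "fresh timestamps land near each other" means in practice. This follows the 48-bit Unix-millisecond layout that UUIDv7 has in RFC 9562; the draft under discussion may lay the fields out differently. The function name `uuid7_sketch` is hypothetical, and it skips the monotonicity handling a real generator would want.

```python
import os
import time
import uuid
from typing import Optional

def uuid7_sketch(unix_ms: Optional[int] = None) -> uuid.UUID:
    """Build a UUIDv7-style value: 48-bit ms timestamp, then version, variant, random bits."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")   # 80 random bits
    rand_a = rand >> 68                            # top 12 bits
    rand_b = rand & ((1 << 62) - 1)                # low 62 bits
    value = (unix_ms & ((1 << 48) - 1)) << 80      # timestamp in the most significant bits
    value |= 0x7 << 76                             # version 7
    value |= rand_a << 64
    value |= 0b10 << 62                            # RFC variant
    value |= rand_b
    return uuid.UUID(int=value)

earlier = uuid7_sketch(1_700_000_000_000)
later = uuid7_sketch(1_700_000_000_001)
print(earlier, later, earlier.bytes < later.bytes)
```

Because the timestamp occupies the leading bytes, two values generated a millisecond apart share a long common prefix and sort next to each other, whatever the random tail contains.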

It's arguably not good practice to extract the timestamp from the UUID at all. Instead, I might just treat them, once generated, as opaque approximately-sorted bytes. It's clearer to have separate fields for timestamps of particular importance. Though I might not feel too religious about that if tight on storage space.
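For reference, extraction itself is trivial, which is part of why it's tempting. A minimal sketch, again assuming the RFC 9562 UUIDv7 layout (48-bit Unix milliseconds in the leading bits); `timestamp_from_uuid7` is a made-up helper name, and the UUID below is just an example value.

```python
import uuid
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def timestamp_from_uuid7(u: uuid.UUID) -> datetime:
    """Read the leading 48 bits of a version 7 UUID as Unix milliseconds."""
    if u.version != 7:
        raise ValueError("not a version 7 UUID")
    unix_ms = u.int >> 80
    return EPOCH + timedelta(milliseconds=unix_ms)

u = uuid.UUID("017f22e2-79b0-7cc3-98c4-dc0c0c07398f")
print(timestamp_from_uuid7(u).isoformat())  # 2022-02-22T19:22:22+00:00
```

Note that nothing here can tell whether the generator's clock was right, or whether the value was forged, so the result is only as trustworthy as whoever produced the UUID.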


> If your timestamp isn't freshly generated, I don't know why you'd embed it in a UUID with these formats.

Ingesting past/archival data. As to why use these formats: to still benefit from their sorting properties. As to why use UUID at all: because it's a format understood by a wide variety of software.

> It's arguably not a good practice to extract the timestamp from the UUID at all.

Ya, I'd do it only if I trust the source (i.e., a closed system).


> Ingesting past/archival data. As to why use these formats: to still benefit from their sorting properties. As to why use UUID at all: because it's a format understood by a wide variety of software.

Their argument about the value of sorting is quite particular: put stuff roughly in ascending order by record creation time, so that creating a bunch of records will touch (e.g.) fewer btree nodes. The simplest way to achieve that is to just use the current time. If you have some other timestamp at hand and happen to be scanning the records in ascending order by it, by this argument you still get no advantage over just using the current timestamp:

> However some properties of [RFC4122] UUIDs are not well suited to this task. First, most of the existing UUID versions such as UUIDv4 have poor database index locality. Meaning new values created in succession are not close to each other in the index and thus require inserts to be performed at random locations. The negative performance effects of which on common structures used for this (B-tree and its variants) can be dramatic. As such newly inserted values SHOULD be time-ordered to address this.
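A toy illustration of the locality claim in that passage. It uses a sorted Python list as a stand-in for the index, which is a big simplification of a real B-tree, and counts how many inserts land at the very end. The key layouts and the name `count_tail_inserts` are assumptions for the sketch.

```python
import bisect
import random

def count_tail_inserts(keys):
    """Insert keys into a sorted list; count how many land at the very end."""
    index = []
    tail = 0
    for key in keys:
        pos = bisect.bisect_left(index, key)
        if pos == len(index):
            tail += 1
        index.insert(pos, key)
    return tail

rng = random.Random(0)
n = 1000

# UUIDv4-like: 16 fully random bytes.
random_keys = [rng.getrandbits(128).to_bytes(16, "big") for _ in range(n)]

# Time-ordered: 6-byte increasing millisecond prefix, then 10 random bytes.
base_ms = 1_700_000_000_000
time_keys = [
    (base_ms + i).to_bytes(6, "big") + rng.getrandbits(80).to_bytes(10, "big")
    for i in range(n)
]

tail_random = count_tail_inserts(random_keys)
tail_time = count_tail_inserts(time_keys)
print(f"random keys: {tail_random}/{n} tail inserts")
print(f"time-ordered keys: {tail_time}/{n} tail inserts")
```

Every time-ordered key is appended at the tail, while the random keys scatter across the whole index. What matters is the order of insertion relative to the key, so a key built from the current time gets this for free.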

My gut tells me that you should rarely if ever stick past timestamps in UUID primary keys. One reason why: can it ever change? Maybe your artifact dating technique was wrong, and you want to update the dates of a bunch of artifacts. But now you have to change your primary key. You probably don't want that.



