Enter your email below to watch this video
Mastering Postgres is single-handedly the most thorough and informative database course I’ve seen. I’ve gone from someone who barely knew the basics of Postgres and how it works to being comfortable with the advanced topics of Postgres.Ryan Hendrickson
Shorten dev cycles with branching and zero-downtime schema migrations.
We ended the discussion about binary data types. Talking about MD5, shot 256 UUIDs. This video, we're going to talk about UUIDs as a data type. What we're not going to talk about is using UUIDs as a primary key. We will talk about that, we have a little bit of groundwork to lay first.
First, we're going to talk about UUIDs as a data type then we need to talk about indexes bee trees. Then we need to talk about secondary keys then we can talk about big ends versus UUIDs. If you don't care, just choose a big end. There's a good reason behind that. It has to do with fracturing the bee trees we don't even know what a bee tree is yet. Let's talk about UUIDs as a data type. The data type here is just simply UUID. You would potentially have a bunch of other columns. You'd probably have an ID. You may have noticed I've been skipping the IDs that's 'cause we haven't gotten to primary keys yet. I just want to save that.
Usually you would have maybe like some data or name or value or whatever. It doesn't matter for this example, UUID, universally unique ID. It's a strong claim. In fact, I think it's a fair claim because I think you're more likely to get hit by lightning like 40 times than have a UUID collision. UUIDs are very, very helpful when you're generating IDs without coordination. That could be in separate systems, on separate machines, microservices, different services. They don't even have to be micro or generating them on the client. Maybe you're doing some sort of optimistic UI you generate something on the client it needs a unique ID. You generate a UUID out there. Totally legit. Perfectly good reason to use them.
Let's go ahead did we already run? Did we already create this table? We created this table then what I'm going to do is I'm just going to insert this as if I've been given this string from the outside world. Either some other system or the client has given this to me it looks like a string. If we read this back, we do select * from UUID example, we will see that it's still looks like a string. Let's take, let's do two things here. Let's do UUID value, let's do three things. PG type of UUID value, we get UUID. It's not a string, it's not text, although you could, you could cast it to text if you wanted, you don't need to. A display is just fine. It's a type of UUID. Then let's do PG column size, UUID value. That's the good stuff. That's the good stuff that we're after, 16 bytes. This entire string here, in fact, let's do this again. Let's do select PG column size of, let's just take, let's just take that string.
Actually let's do it this way. Let's do cast it to text then we'll copy it do it again, not cast to text from UUID example. If we run that, that is why you UUID is such a great data type for UUIDs, because it is efficiently stored, it's stored better on disc, much more compact. It is faster to operate against. This is what I'm talking about. If you think all, how many videos have we done? If you think all the way back to the beginning of this module you think about where I was telling you, pick the data type that is most representative of reality. That is most compact. That is the simplest. This is a great example.
You could store your UUIDs as text. It's better to store 'em as UUIDs. It's just, it's just better. That is what you want to do. Anytime you do have a UUID, you can also generate your own. You can have Postgres generate UUIDs for you. Let me show you how to do that. The only type of UUID that Postgres can generate out of the box without an extension is a random UUID. This is going to lead us nicely into the discussion about primary keys data types later on. There are, I think at the time of recording eight versions of UUID they don't supersede each other. They are just simply different versions.
Each part of a UUID is representative of some type of data, whether that is a timestamp, random bits, machines, whatever, some sort of entropy. Each piece has its own kind of domain that it's responsible for. Versions one through seven or eight, whatever it is currently all do it a little bit differently. The good thing about UU at version seven, which is not in Postgres by default you can get an extension for it. The good thing about UUID V7 is this first part is a timestamp. What that means is that it's always going to go lexographically sorted. It's always going to go at the end. If you needed to, you could extract the timestamp out of the UUID. While the, while UUID V7 is not available by default, you can find extensions for it. Or if you're using a hosted provider, they might have already built it with a V7 extension.
Just to talk about indexes one more time, V7 could make a very fine primary key. However, this gen random UUID would be a terrible primary key. There's a little primer on UUIDs. If you're already using them to generate stuff in the outside world, this is a great way to store them. If you're going to try to use 'em for primary keys, I implore you to get a V7 or wait until the video where we talk about primary keys. Regardless of where the UUIDs come from, please store them as a UUID type you'll get a compact 16 bytes instead of, I think it was like 40 bytes if you store it as a string. 'cause you have the 36 bytes of the string four bytes of overhead.