Mastering Postgres is such a crisp way to learn this subject. With today’s AI tools writing many of the queries via autocomplete it’s vital to have this knowledge to ensure you know what’s actually happening under the hood. My app’s performance improved overnight after learning more about indexing. Aaron’s excitement for the subject makes it really fun to learn whether I’m next to my code editor or on a treadmill run.Peter Ramsing
Shorten dev cycles with branching and zero-downtime schema migrations.
We've talked about nulls in a couple of places, but I want to go over a few things explicitly so that you're well-prepared to handle them because nulls are a way of life.
There are columns that are nullable, and that is totally legit and perfectly fine. You just need to be prepared to handle them. The thing that you must absolutely remember, the thing from which every other thing flows is null is unknown. Null is unknowable. Null is not equal to null because we have no idea what is inside here. It is hidden, it is unknown, it is unknowable.
Let's look at some things we can do. Let's just start with some reminders, select 1 = null, we get NULL back. select null = null, we get NULL back. Two things are hidden under two bushels, are they the same? Who knows? We don't know what's under there. You can, however, so while you cannot do 1 = null, you can say, is 1, or rather, let me phrase this as we're reading it, select 1 is distinct from 1. That's a long way of saying 1 is not equal to 1. Is it distinct from 1? In fact, it is not. It is not distinct from 1, which is the longest possible way of saying, does 1 = 1 true? 1 is not distinct from 1. Look at that, we get an actual value back. We don't get a NULL back. Let's drop this not. select 1 is distinct from null. It is in fact distinct from null. In fact, null is not distinct from null. Select null is distinct from null? No, they're the same thing when you use this operator.
When you use the other operator, you get no clue. You can also say, instead of doing the is distinct, you can do a strict question and say, select null is null, and it's gonna say TRUE. select 1 is null? It will say FALSE, so that is a distinct operator. Remember when we had categories, we can do order by parent_id. We can tell here that parent_id is being ordered ascending because the little guy's at the bottom, the big guy, or rather, the little guy's at the top and the big guy is at the very last or the bottom, but NULL is treated as a large value. So NULL is the biggest. If we order by descending, so the biggest number's at the very top, NULL. NULL is the biggest number. I guess they had to make a choice, and this is as good a choice as any, frankly. You can also change that around. In this case, if we want nulls first, we can just put nulls first. If we switch it to descending, we can say nulls last. We can control the order, we can control the order that the nulls are sorted in.
We're gonna look at a few functions that operate on nulls. If we bring back the select * from categories and we looked at and we see that Electronics has no parent_id, but what if, what if we want to say, "Hey, when it's null, provide this default value." So we can do id, name, and then there's a function called coalesce, which I do love. We can say, coalesce(parent_id, 0). In that case, it's going to place the 0 there.
Let's go ahead and just name this as parent_id. It's going to place the 0 there because parent_id is null. Generally, just broadly, coalesce takes the first non-null value and returns it. In this case, we're gonna get 1 back because the first three arguments are null, the last one is 2, but it doesn't matter 'cause we're stopping at the first non-null argument, which can be incredibly helpful. There's the opposite. There's the opposite of that, and you can have nullif.
If you want to generate nulls when two things are equal, you have that ability. What this does is it's going to, if A and B, the first argument and the second argument, are equal, it's going to return NULL. If they are not equal, it's going to return the first value. In this case, those things are not equal, and so it returns the first value. Those things are equal, and so it nulls it out. You can use that anywhere. You can use that in select nullif if you wanted to throw parent_id and you wanted to null out everyone that had a parent_id of 1. We'll take id, name, null out the parent_id from categories, and there you go.
All the ones that have a parent_id, well, and Electronics, in fact, all the ones that have a parent_id of 1 or NULL end up being NULL because we did our comparison here.
The last thing, and this is just a little bit of a refresh, the last thing is you need to be super careful when you're doing a not in and you're generating a list that might contain a null because if we remember back to the subqueries video, if you're doing a not in and you generate a list that might contain a null, it's going to come back as NULL. If we do select * from categories, and then thinking back to our subquery video, if we did where id not in, and then we had some subquery in here. We had some subquery that, you know, did a bunch of work and produced these ids, 2, 5, and 9, great. Now, you know, 2 is gone, 5 is gone, and 9 is gone. If you're not super careful and your subquery generates a null value, game over, you're hosed. This is a pretty good argument for definitely not allowing your subqueries to generate null values and being extremely careful with not ins and maybe preferring not exists in this case.
Just some rapid-fire tips on nulls. You just gotta be careful with them. They don't really operate like other values because in fact, they are not like other values. That is kind of the whole point of nulls. Consider if your columns actually need to be nullable, and if they do, that's fine, that's totally fine. Do not make up fake values to stand in for null. Null exists, it should be used when it is useful, but then you do have to be careful with your comparisons. If you want to kind of paper over some null values, you can use that coalesce function to provide a default.