Mastering Postgres
Introduction
Introduction to the course
Overview of course structure
Postgres vs. everyone
The psql CLI
Data Types
Introduction to schema
Integers
Numeric
Floating point
Storing money
NaNs and infinity
Casting types
Characters types
Check constraints
Domain types
Charsets and collations
Binary data
UUIDs
Boolean
Enums
Timestamps
Timezones
Dates and times
Advanced Data Types
Intervals
Serial type
Sequences
Identity
Network and mac addresses
JSON
Arrays
Generated columns
Text search types
Bit string
Ranges
Composite types
Nulls
Unique constraints
Exclusion constraints
Foreign key constraints
Indexing
Introduction to indexes
Heaps and CTIDs
B-Tree overview
Primary keys vs. secondary indexes
Primary key types
Where to add indexes
Index selectivity
Composite indexes
Composite range
Combining multiple indexes
Covering indexes
Partial indexes
Index ordering
Ordering nulls in indexes
Advanced Indexing
Functional indexes
Duplicate indexes
Hash indexes
Naming indexes
Understanding Query Plans
Introduction to explain
Explain structure
Scan nodes
Costs and rows
Explain analyze
Generating Results
Introduction to queries
Cross joins
Inner joins
Outer joins
Subqueries
Lateral joins
ROWS FROM
Filling gaps in sequences
Subquery elimination
Combining queries
Set generating functions
Indexing joins
Advanced SQL
Introduction to advanced SQL
Grouping
Grouping sets, rollups, cubes
Window functions
CTEs
CTEs with window functions
Recursive CTE
Hierarchical recursive CTE
Handling nulls
Row value syntax
Views
Materialized views
Removing duplicate rows
Upsert
Returning keyword
COALESCE + generated column
Full Text Search
Introduction to full text search
Searching with LIKE
Vectors, queries, and ranks
Websearch
Ranking
Indexing full text search
Highlighting
Advanced JSON
Intro to JSON
JSON vs JSONB
Validating JSON
Creating JSON objects + arrays
JSON extraction
JSON containment
JSON existence
JSON recordset
Updating JSON
Indexing JSON parts
GIN index
Vectors (pgvector)
Intro to pgvector
Vector embedding columns
Find related articles
Upsert vector embedding
Semantic search
Other operators
Vector indexes
Outro
Thank you
Bonus interviews
Heroku's glory days & Postgres vs the world (with Craig Kerstiens)
Creating a Postgres platform with Monica & Tudor from Xata.io
Bootstrapping an email service provider (with Jesse Hanley)
Locked video

Please purchase the course to watch this video.

Video thumbnail
Advanced SQL
Handling nulls

Full Course

$
349
$399
USD, one-time fee
Window function 😊. CTE with window function 😮. Hierarchical recursive CTE 🤯. Recommended all the way. Can’t wait to see the rest of the videos!
M Wildan Zulfikar
M Wildan Zulfikar

PostgreSQL database platform

Shorten dev cycles with branching and zero-downtime schema migrations.

Test Data

I've made my test data available for you to use and follow along.

Download
or
Use on Xata

Summary

Nulls in databases represent unknown values and behave differently from other data, requiring careful handling. You'll learn how to use functions like `COALESCE` to provide default values and `NULLIF` to create nulls when two values match. Special attention is given to how nulls can affect subqueries—particularly with `NOT IN`—to help you avoid unexpected results in your queries.

Video Transcript

We've talked about nulls in a couple of places, but I want to go over a few things explicitly so that you're well-prepared to handle them because nulls are a way of life.

There are columns that are nullable, and that is totally legit and perfectly fine. You just need to be prepared to handle them. The thing that you must absolutely remember, the thing from which every other thing flows is null is unknown. Null is unknowable. Null is not equal to null because we have no idea what is inside here. It is hidden, it is unknown, it is unknowable.

Let's look at some things we can do. Let's just start with some reminders, select 1 = null, we get NULL back. select null = null, we get NULL back. Two things are hidden under two bushels, are they the same? Who knows? We don't know what's under there. You can, however, so while you cannot do 1 = null, you can say, is 1, or rather, let me phrase this as we're reading it, select 1 is distinct from 1. That's a long way of saying 1 is not equal to 1. Is it distinct from 1? In fact, it is not. It is not distinct from 1, which is the longest possible way of saying, does 1 = 1 true? 1 is not distinct from 1. Look at that, we get an actual value back. We don't get a NULL back. Let's drop this not. select 1 is distinct from null. It is in fact distinct from null. In fact, null is not distinct from null. Select null is distinct from null? No, they're the same thing when you use this operator.

When you use the other operator, you get no clue. You can also say, instead of doing the is distinct, you can do a strict question and say, select null is null, and it's gonna say TRUE. select 1 is null? It will say FALSE, so that is a distinct operator. Remember when we had categories, we can do order by parent_id. We can tell here that parent_id is being ordered ascending because the little guy's at the bottom, the big guy, or rather, the little guy's at the top and the big guy is at the very last or the bottom, but NULL is treated as a large value. So NULL is the biggest. If we order by descending, so the biggest number's at the very top, NULL. NULL is the biggest number. I guess they had to make a choice, and this is as good a choice as any, frankly. You can also change that around. In this case, if we want nulls first, we can just put nulls first. If we switch it to descending, we can say nulls last. We can control the order, we can control the order that the nulls are sorted in.

We're gonna look at a few functions that operate on nulls. If we bring back the select * from categories and we looked at and we see that Electronics has no parent_id, but what if, what if we want to say, "Hey, when it's null, provide this default value." So we can do id, name, and then there's a function called coalesce, which I do love. We can say, coalesce(parent_id, 0). In that case, it's going to place the 0 there.

Let's go ahead and just name this as parent_id. It's going to place the 0 there because parent_id is null. Generally, just broadly, coalesce takes the first non-null value and returns it. In this case, we're gonna get 1 back because the first three arguments are null, the last one is 2, but it doesn't matter 'cause we're stopping at the first non-null argument, which can be incredibly helpful. There's the opposite. There's the opposite of that, and you can have nullif.

If you want to generate nulls when two things are equal, you have that ability. What this does is it's going to, if A and B, the first argument and the second argument, are equal, it's going to return NULL. If they are not equal, it's going to return the first value. In this case, those things are not equal, and so it returns the first value. Those things are equal, and so it nulls it out. You can use that anywhere. You can use that in select nullif if you wanted to throw parent_id and you wanted to null out everyone that had a parent_id of 1. We'll take id, name, null out the parent_id from categories, and there you go.

All the ones that have a parent_id, well, and Electronics, in fact, all the ones that have a parent_id of 1 or NULL end up being NULL because we did our comparison here.

The last thing, and this is just a little bit of a refresh, the last thing is you need to be super careful when you're doing a not in and you're generating a list that might contain a null because if we remember back to the subqueries video, if you're doing a not in and you generate a list that might contain a null, it's going to come back as NULL. If we do select * from categories, and then thinking back to our subquery video, if we did where id not in, and then we had some subquery in here. We had some subquery that, you know, did a bunch of work and produced these ids, 2, 5, and 9, great. Now, you know, 2 is gone, 5 is gone, and 9 is gone. If you're not super careful and your subquery generates a null value, game over, you're hosed. This is a pretty good argument for definitely not allowing your subqueries to generate null values and being extremely careful with not ins and maybe preferring not exists in this case.

Just some rapid-fire tips on nulls. You just gotta be careful with them. They don't really operate like other values because in fact, they are not like other values. That is kind of the whole point of nulls. Consider if your columns actually need to be nullable, and if they do, that's fine, that's totally fine. Do not make up fake values to stand in for null. Null exists, it should be used when it is useful, but then you do have to be careful with your comparisons. If you want to kind of paper over some null values, you can use that coalesce function to provide a default.