Mastering Postgres
Introduction
Introduction to the course
Overview of course structure
Postgres vs. everyone
The psql CLI
Data Types
Introduction to schema
Integers
Numeric
Floating point
Storing money
NaNs and infinity
Casting types
Characters types
Check constraints
Domain types
Charsets and collations
Binary data
UUIDs
Boolean
Enums
Timestamps
Timezones
Dates and times
Advanced Data Types
Intervals
Serial type
Sequences
Identity
Network and mac addresses
JSON
Arrays
Generated columns
Text search types
Bit string
Ranges
Composite types
Nulls
Unique constraints
Exclusion constraints
Foreign key constraints
Indexing
Introduction to indexes
Heaps and CTIDs
B-Tree overview
Primary keys vs. secondary indexes
Primary key types
Where to add indexes
Index selectivity
Composite indexes
Composite range
Combining multiple indexes
Covering indexes
Partial indexes
Index ordering
Ordering nulls in indexes
Advanced Indexing
Functional indexes
Duplicate indexes
Hash indexes
Naming indexes
Understanding Query Plans
Introduction to explain
Explain structure
Scan nodes
Costs and rows
Explain analyze
Generating Results
Introduction to queries
Cross joins
Inner joins
Outer joins
Subqueries
Lateral joins
ROWS FROM
Filling gaps in sequences
Subquery elimination
Combining queries
Set generating functions
Indexing joins
Advanced SQL
Introduction to advanced SQL
Grouping
Grouping sets, rollups, cubes
Window functions
CTEs
CTEs with window functions
Recursive CTE
Hierarchical recursive CTE
Handling nulls
Row value syntax
Views
Materialized views
Removing duplicate rows
Upsert
Returning keyword
COALESCE + generated column
Full Text Search
Introduction to full text search
Searching with LIKE
Vectors, queries, and ranks
Websearch
Ranking
Indexing full text search
Highlighting
Advanced JSON
Intro to JSON
JSON vs JSONB
Validating JSON
Creating JSON objects + arrays
JSON extraction
JSON containment
JSON existence
JSON recordset
Updating JSON
Indexing JSON parts
GIN index
Vectors (pgvector)
Intro to pgvector
Vector embedding columns
Find related articles
Upsert vector embedding
Semantic search
Other operators
Vector indexes
Outro
Thank you
Bonus interviews
Heroku's glory days & Postgres vs the world (with Craig Kerstiens)
Creating a Postgres platform with Monica & Tudor from Xata.io
Bootstrapping an email service provider (with Jesse Hanley)
Locked video

Please purchase the course to watch this video.

Video thumbnail
Advanced SQL
Hierarchical recursive CTE

Full Course

$
349
$399
USD, one-time fee
Finally treated myself to mastering Postgres by Aaron Francis. I highly recommend it! I love how each video is proving a concept - it makes the content easy to retain and engaging.
Andrew Bass

The PostgreSQL Development Platform

Start your project with a Postgres database, Authentication, instant APIs, Edge Functions, Realtime subscriptions, Storage, and Vector embeddings.

Test Data

I've made my test data available for you to use and follow along.

Download
or
Use on Supabase (coming soon)

Summary

Learn how to use a recursive CTE (Common Table Expression) to explore hierarchical data in a categories table, where each category can have a parent category. CTE efficiently maps the path from the root category (e.g., electronics) down to specific items (e.g., Canon cameras). This approach allows you to generate category trees directly in the database, eliminating the need for extra processing outside of it.

Video Transcript

We're gonna use a recursive CTE to generate or to traverse some hierarchical data. We have a table full of categories and every category potentially has a parent category that lives in that same table. We're gonna find our path through that table with this recursive CTE.

Here is the data that we're working with. It's only 15 rows, but you get a sense of the structure. So this is a bunch of categories. We're gonna start down here at the Canon camera because this is a Canon camera right here. And the parent ID of Canon cameras, or is cameras, just generally cameras. The parent ID of cameras is 11, which is generally video gear. Then the parent of video gear is just broadly, electronics.

We wanna map our way through that to show the path to each individual child. What we're gonna do is we're gonna start with recursive. With recursive all categories as, and we'll open that up. We're gonna start with the most parent of parents, the parentest of them all. That is select * from categories where parent ID is, no, these are the grandparents, the great, great, great, great great grandparents. The ones that have no parents above them. This is our root node or our anchor condition that generates these root nodes.

Then we're gonna say union all of something. And we're gonna say select * from all categories. Now what is the something? The something is interesting because we're gonna do select * from, let's drop this down because we'll need the space later, from all categories, which is in fact, the table that we're building up, which at the moment looks like this. That is what the table looks, that's what the all categories recursive CTE looks like when we reach this first recursive condition.

At this point, this is what all categories looks like. Let's keep that in mind, and we are going to enter, join categories. We're gonna bring in categories on all categories, which at this point is just this row. All_categories.id equals categories.parent_id. What we're saying is, all right, we've got this one node here and let's bring together all of its children. Anyone that has a parent ID of one, we're pairing up with this row that has an ID of one. If we run this, it's not gonna work because must have the same number of columns, which is correct, in fact, and because we're joining stuff in, we're getting a much wider result set down here, which is fine 'cause we don't actually want all of the columns. Let's do this, let's do ID and name. That's a good place to start, it's not a good place to end. Then when we join this stuff in, we don't want to reselect or reshow the data from all categories because that is, you know, represented in the rows above.

What we're gonna do is we're just gonna select the data that we joined in by saying categories, ID and ccategories.name. And now if we run that, nothing, I mean it worked, it didn't break. But that's nothing. Let's keep moving because that is good that it didn't break. That's always a good start. But that's not, oh, it's not anything. We need to generate, we need to generate this path through all of the categories.

Let's do that, we're gonna start with our anchor condition, and say name as path. And then down here we'll just say, we'll just say foo. Foo does not exist. Why do I always do that? We're gonna start with foo, so there we go. The anchor condition has name as the path. What we're gonna do down here, in fact, is we're going to concat the path, which means the rows above us, we're gonna take path. We're gonna put this little arrow in here. We're gonna say we need categories, which is the one we've joined in, which is the child, categories.name. That should give us with any luck, hey, look at that. There we go, so coming back down to Canon, we see it goes from electronics to video gear to cameras to Canon. The first run through, we just had that one. The second run through, we got all of those children. Then the third run through, we got the third level or the third depth, and then the fourth run through, we got the fourth depth of information here.

We can prove that by just saying one as depth and then coming over here and saying depth plus one. If we run that, you'll see, there is our depth. That's the first run. That is the second run, that's the third run. That is the fourth run. Pretty cool, right? It's a little mind bendy but hey, we're developers. We do love a little bit of recursion. It makes us think very, very hard. I love this as a pattern, especially when you have a table like this that is a little bit self-referential in terms of who the parent is. This is a great way to generate those trees.

Of course, you don't have to do it with this goofy little arrow syntax that I've done. You could put it in an array, you could put it in a JSON object, you could do it however you want. And of course, you could do this logic on the application side. And that's always kind of the trade off is where do you want do this kind of work? I think in this case, I would do this in the database. The database has the affordances to do that. It's going to work very well, usually very performantly versus bringing all of the data over into your programming language of choice and then parsing it or looping it over there, that is what recursive CTEs are built for.

I would use the tools available to you, and in this case, I think recursive CTE is the right one.