Mastering Postgres
Introduction
Introduction to the course
Overview of course structure
Postgres vs. everyone
The psql CLI
Data Types
Introduction to schema
Integers
Numeric
Floating point
Storing money
NaNs and infinity
Casting types
Characters types
Check constraints
Domain types
Charsets and collations
Binary data
UUIDs
Boolean
Enums
Timestamps
Timezones
Dates and times
Advanced Data Types
Intervals
Serial type
Sequences
Identity
Network and mac addresses
JSON
Arrays
Generated columns
Text search types
Bit string
Ranges
Composite types
Nulls
Unique constraints
Exclusion constraints
Foreign key constraints
Indexing
Introduction to indexes
Heaps and CTIDs
B-Tree overview
Primary keys vs. secondary indexes
Primary key types
Where to add indexes
Index selectivity
Composite indexes
Composite range
Combining multiple indexes
Covering indexes
Partial indexes
Index ordering
Ordering nulls in indexes
Advanced Indexing
Functional indexes
Duplicate indexes
Hash indexes
Naming indexes
Understanding Query Plans
Introduction to explain
Explain structure
Scan nodes
Costs and rows
Explain analyze
Generating Results
Introduction to queries
Cross joins
Inner joins
Outer joins
Subqueries
Lateral joins
ROWS FROM
Filling gaps in sequences
Subquery elimination
Combining queries
Set generating functions
Indexing joins
Advanced SQL
Introduction to advanced SQL
Grouping
Grouping sets, rollups, cubes
Window functions
CTEs
CTEs with window functions
Recursive CTE
Hierarchical recursive CTE
Handling nulls
Row value syntax
Views
Materialized views
Removing duplicate rows
Upsert
Returning keyword
COALESCE + generated column
Full Text Search
Introduction to full text search
Searching with LIKE
Vectors, queries, and ranks
Websearch
Ranking
Indexing full text search
Highlighting
Advanced JSON
Intro to JSON
JSON vs JSONB
Validating JSON
Creating JSON objects + arrays
JSON extraction
JSON containment
JSON existence
JSON recordset
Updating JSON
Indexing JSON parts
GIN index
Vectors (pgvector)
Intro to pgvector
Vector embedding columns
Find related articles
Upsert vector embedding
Semantic search
Other operators
Vector indexes
Outro
Thank you
Bonus interviews
Heroku's glory days & Postgres vs the world (with Craig Kerstiens)
Creating a Postgres platform with Monica & Tudor from Xata.io
Bootstrapping an email service provider (with Jesse Hanley)
Locked video

Please purchase the course to watch this video.

Video thumbnail
Generating Results
Indexing joins

Full Course

$
349
$399
USD, one-time fee
Aaron is a natural teacher and this course is the best introduction to Postgres I have come across. Lessons are easy to follow and he recommends some great tools for working with Postgres.
Joel Drumgoole
Joel Drumgoole

PostgreSQL database platform

Shorten dev cycles with branching and zero-downtime schema migrations.

Test Data

I've made my test data available for you to use and follow along.

Download
or
Use on Xata

Summary

Foreign key constraints ensure referential integrity between related tables, but they don't automatically create indexes in the child table. While the parent table's primary key is indexed by default, adding an index to the foreign key column in the child table can significantly improve join performance—especially in large datasets. You'll learn how indexing foreign keys can optimize queries, such as efficiently joining tables like users and bookmarks.

Video Transcript

Do you remember back when we were talking about building a schema and we talked about the difference between a foreign key and a foreign key constraint? A foreign key constraint enforces referential integrity, and a foreign key is just a concept where you have a pointer to another table and a row in that table.

Here, we're gonna talk about adding indexes to make these joins faster, so we're gonna be indexing our foreign keys now. When you create a primary key, a unique constraint in the parent table, that is enforced by an index. That index is automatically created in the parent table. However, when you create a foreign key, even if foreign key constraint, an index is not automatically created in the child table.

We're gonna continue to use users and bookmarks, I'm gonna prove to you that that much is true. Then we will add an index and see how that can speed us up. First, we're just gonna create a parent table and a child table. This is the one we used back for foreign key constraints. I'm just gonna create these straight in front of you from scratch to prove that no indexes are created for this state_id down here. Create states, create cities, and then select * from pg_indexes where tablename =.

Let's start with states. If we start with states, we see we have the unique index on the primary key. If we switch to cities, we see we have the unique index on the primary key. No index was automatically created for this foreign key, which honestly is fine. That would be, in my opinion, a little bit of an overreach for them to do that automatically for us.

Now, we can do it for ourselves. We're armed with a little bit of knowledge now. The knowledge, the thing that we have just proven here, we have proven that you can create a foreign key that absolutely references another table and even enforces referential integrity. When it comes time to join up, you may encounter a performance penalty, because you don't have an index here.

Closing that out, let's look back at users and bookmarks. This is the join, right? The parent table is users and they have many bookmarks, and we're gonna join those up together. If we do that, we see that we're still in good shape. We've got user ID 39, user ID 39, that join is working, nothing is new here.

If we explain this, you'll see, let me make some space here. You'll see we're doing an index scan on the users primary key. That part is good, because we're doing users.id < 100, but then, we're literally scanning the entire bookmarks table to look for the people with those user IDs that we found. We're scanning the whole thing. That is a problem, right? Especially when these tables get quite large, and you've got, let's say 10,000 users in the left hand, and then you're scanning the bookmarks table looking for 10,000 different user IDs over there.

It's just not a good idea to scan the whole table, usually ever, but especially in this case. What we can do here is we can do create index, and let's stick to our naming, let's do idx_bookmarks. I'm kind of torn on this. You might do idx, you might do fkey, you know? I'm just going to stick to idx, it doesn't matter right now, idx_bookmarks_user_id on bookmarks(user_id).

If we create that index, then we want to see that sequential scan on the bookmarks table go away. I can already tell just by the nesting that that's a lot better, it's a lot shallower. In this case, we are using the index on the bookmarks table and the index on the users table to accomplish this join efficiently and effectively. You could, of course, add a secondary column here. You could add, you know, is_secure, if that was like a generated column or something, or you could add a saved_on for a bookmark date. You could include that in your filtering condition here. You know, saved_on > current date or, you know, current date less a week, or something like that.

You can still have a composite index with the leading, the left hand prefix, being that join column, and you get the benefit of that join with also the benefit of that elimination through the index, that filtering through the index. Regardless, if you are going to be joining up two different tables, it usually, usually, usually makes perfect sense to have an index on that foreign key, either by itself or as a part of a composite index over in the child table.