The most comprehensive course on PostgreSQL

Foreign key constraints ensure referential integrity between related tables, but they don't automatically create indexes in the child table. While the parent table's primary key is indexed by default, adding an index to the foreign key column in the child table can significantly improve join performance—especially in large datasets. You'll learn how indexing foreign keys can optimize queries, such as efficiently joining tables like users and bookmarks.

Do you remember back when we were talking about building a schema and we talked about the difference between a foreign key and a foreign key constraint? A foreign key constraint enforces referential integrity, and a foreign key is just a concept where you have a pointer to another table and a row in that table.

Here, we're gonna talk about adding indexes to make these joins faster, so we're gonna be indexing our foreign keys now. When you create a primary key, a unique constraint in the parent table, that is enforced by an index. That index is automatically created in the parent table. However, when you create a foreign key, even if foreign key constraint, an index is not automatically created in the child table.

We're gonna continue to use users and bookmarks, I'm gonna prove to you that that much is true. Then we will add an index and see how that can speed us up. First, we're just gonna create a parent table and a child table. This is the one we used back for foreign key constraints. I'm just gonna create these straight in front of you from scratch to prove that no indexes are created for this state_id down here. Create states, create cities, and then select * from pg_indexes where tablename =.

Let's start with states. If we start with states, we see we have the unique index on the primary key. If we switch to cities, we see we have the unique index on the primary key. No index was automatically created for this foreign key, which honestly is fine. That would be, in my opinion, a little bit of an overreach for them to do that automatically for us.

Now, we can do it for ourselves. We're armed with a little bit of knowledge now. The knowledge, the thing that we have just proven here, we have proven that you can create a foreign key that absolutely references another table and even enforces referential integrity. When it comes time to join up, you may encounter a performance penalty, because you don't have an index here.

Closing that out, let's look back at users and bookmarks. This is the join, right? The parent table is users and they have many bookmarks, and we're gonna join those up together. If we do that, we see that we're still in good shape. We've got user ID 39, user ID 39, that join is working, nothing is new here.

If we explain this, you'll see, let me make some space here. You'll see we're doing an index scan on the users primary key. That part is good, because we're doing users.id < 100, but then, we're literally scanning the entire bookmarks table to look for the people with those user IDs that we found. We're scanning the whole thing. That is a problem, right? Especially when these tables get quite large, and you've got, let's say 10,000 users in the left hand, and then you're scanning the bookmarks table looking for 10,000 different user IDs over there.

It's just not a good idea to scan the whole table, usually ever, but especially in this case. What we can do here is we can do create index, and let's stick to our naming, let's do idx_bookmarks. I'm kind of torn on this. You might do idx, you might do fkey, you know? I'm just going to stick to idx, it doesn't matter right now, idx_bookmarks_user_id on bookmarks(user_id).

If we create that index, then we want to see that sequential scan on the bookmarks table go away. I can already tell just by the nesting that that's a lot better, it's a lot shallower. In this case, we are using the index on the bookmarks table and the index on the users table to accomplish this join efficiently and effectively. You could, of course, add a secondary column here. You could add, you know, is_secure, if that was like a generated column or something, or you could add a saved_on for a bookmark date. You could include that in your filtering condition here. You know, saved_on > current date or, you know, current date less a week, or something like that.

You can still have a composite index with the leading, the left hand prefix, being that join column, and you get the benefit of that join with also the benefit of that elimination through the index, that filtering through the index. Regardless, if you are going to be joining up two different tables, it usually, usually, usually makes perfect sense to have an index on that foreign key, either by itself or as a part of a composite index over in the child table.

Full Course

PostgreSQL database platform

Summary

Video Transcript