The most comprehensive course on PostgreSQL

Learn why having duplicate indexes in a database is unnecessary and can lead to confusion. For example, if you already have an index on an email column and create a composite index that starts with email (like `email, is_pro`), it’s essentially a duplicate. By removing the single-column index, the composite index can efficiently handle both simple and complex queries, streamlining your database performance.

Early on, when we talked about indexing, I said, it's a separate data structure that must be maintained. Every insert, update, or delete must touch that B-tree. You want to have as many indexes as you need, but as few as you can get away with. There's kind of a balance there. One thing that you definitely don't need is a duplicate index. You don't want to have the same index on two columns with all of the same parameters because that doesn't help anybody. It only costs, and potentially confuses you, the query planner, your teammates, you just don't need it. That much makes sense. With what we've learned about how B-tree indexes are accessed, you might accidentally have a duplicate, even if it doesn't look exactly like a duplicate.

Here's how that would normally happen. You would have something like, create index email on users (email);. As time goes on, this index exists and then business requirements change, right? That's totally normal. That's completely fine. Somebody comes along and says, "Hey, we need to do email_is_pro," for example. You go in and you say, great, I know how to do that. I know how to create a composite index. Maybe I'll do a partial index, but let's go ahead and do a composite index here.

Let's do an (email, is_pro); index, and you move on with your day, you go home, you're happy. That's totally fine. This happens. You do need to consider though, that you've just created a duplicate index. If we look at this one, we'll copy that, and we'll come down here. These are the two indexes that we have created. If we add a little space here, you'll see, hmm, that lines up quite nicely. Those indexes have the exact same leftmost prefix, which means, in the order of access, those are functionally the same index when it comes to querying for email.

If we were to do select * from users where email = '[email protected]';, we'll see that it should use, go through and explain, see that it should use this one up here first. We see it does prefer the smaller index. It does prefer the more compact index. If we were to drop, if we were to drop index email, we would see that it just happily switches over to, there you go, (email, is_pro);. The reason that it does that is because we start on the left side and work towards the right, meaning as far as a B-tree is concerned, those two are the exact same.

This B-tree is a little bit different. It does have that one extra value of is_pro, but it can happily be used to satisfy this query right here. Unless you have an extremely good use case and you know exactly why you want two indexes that share a leftmost prefix, consider this a duplicate index and get rid of this one, because this second one will satisfy the needs of this one, but not vice versa. You can get away with just using this one for the email only queries and the email and is pro queries.

Full Course

PostgreSQL database platform

Summary

Video Transcript