Mastering Postgres is single-handedly the most thorough and informative database course I’ve seen. I’ve gone from someone who barely knew the basics of Postgres and how it works to being comfortable with the advanced topics of Postgres.Ryan Hendrickson
Shorten dev cycles with branching and zero-downtime schema migrations.
We've talked a lot about composite indexes, and frankly for good reason. They're very powerful. I do want show you that Postgres has your back.
If you do have two discrete indexes, it can scan both of those indexes and put those results together. Even if it is less performant than a composite index, it's better than a table scan and usually it's better than reading one index and then doing the other half of the filtering after having pulled the rows out of the heap.
Here are the indexes that I've already created. First on first name and last on last name, and this is the one I haven't yet created. First last on first last name. That's our composite index that we'll use for comparisons here in a second. The query we're going to run is select * from users where first name is Aaron and last name is Francis. We will see something interesting here. We see this BitmapAnd operation. We have a scan on last, a scan on first, and here are the conditions. But once those scans are completed, we go to the BitmapAnd and so what's happening is Postgres is scanning the first name index and finding all of the rows that match, scanning the last name index, and finding all of the rows that match the condition Francis. Then it's combining those and says, show me all of the rows where it is true, it matched for both cases. That gives us our BitmapAnd, and if we change this to an or, you'll see it changes to a BitmapOr, and we covered bitwise operations a little bit back in the bit string data type video I think. This should be ringing a little bit of a bell.
This is saying where the first name is Aaron, or the last name is Francis. To accomplish that, it's going to scan two separate indexes and then do a bitwise OR and say, hey, gimme either one, it's fine, it's totally fine. Then it's gonna go grab them out of the heap over there. Now the question is, is that better or worse than creating one composite index? Well, I'm gonna tell you that it's better. It's better to traverse one single B-tree and get the row IDs or the addresses that you need and go over to the heap and get your rows out. Better than traversing two and getting those and then putting them together either and or or. There's one way that we can tell for sure. That's by creating the composite index, leaving the other two indexes there and letting Postgres decide which one it wants to use. Let's do that now.
Let's go ahead and uncomment that and we're gonna run this. What this is doing is creating an index named first last, over first name and last name. Now if we run this again, you'll see it just decided, Postgres decided to use first last instead of doing that separate index scan and then bitmapping and them together. This is much better, much faster. However, if you look at this OR you'll see it decided to still use the BitmapOr across the two separate indexes. As with everything database related or performance related, you must test and you must check on your own data or own schema or your own query patterns and use cases. The way that a B-tree is structured, this is going to be hard to satisfy with a single B-tree. That's just not what B-trees are built for. However, having two B-trees that you can then combine the results for, that's a pretty good use case for an OR here. When you switch to an AND it's going to be so much faster to have that in order in the same B-tree. It depends on your use case. We are very thankful and very grateful to Postgres that it does have this ability to combine indexes.
I think the question for you is, what is your most common access pattern? What are your queries look like? What does your data look like to combine indexes? I think the question for you here is what's your most common access pattern? And you kind of need to build your indexing strategy around that. It is great that it can combine two separate index scans, and that might be the strategy that you are relying on. That's fine. It's not a second rate strategy. You just need to know that in most cases when you're ANDing these conditions together, a single composite index is going to perform better, but you might not always be ANDing them together. You might be ORing them together, in which case we've just seen two separate B-trees combined with a BitmapOr is more performant. Take a look at your access patterns, take a look at your data, and now hopefully armed with this knowledge, you can test out a few different strategies and see which one works for you.