Mastering Postgres
Introduction
Introduction to the course
Overview of course structure
Postgres vs. everyone
The psql CLI
Data Types
Introduction to schema
Integers
Numeric
Floating point
Storing money
NaNs and infinity
Casting types
Characters types
Check constraints
Domain types
Charsets and collations
Binary data
UUIDs
Boolean
Enums
Timestamps
Timezones
Dates and times
Advanced Data Types
Intervals
Serial type
Sequences
Identity
Network and mac addresses
JSON
Arrays
Generated columns
Text search types
Bit string
Ranges
Composite types
Nulls
Unique constraints
Exclusion constraints
Foreign key constraints
Indexing
Introduction to indexes
Heaps and CTIDs
B-Tree overview
Primary keys vs. secondary indexes
Primary key types
Where to add indexes
Index selectivity
Composite indexes
Composite range
Combining multiple indexes
Covering indexes
Partial indexes
Index ordering
Ordering nulls in indexes
Advanced Indexing
Functional indexes
Duplicate indexes
Hash indexes
Naming indexes
Understanding Query Plans
Introduction to explain
Explain structure
Scan nodes
Costs and rows
Explain analyze
Generating Results
Introduction to queries
Cross joins
Inner joins
Outer joins
Subqueries
Lateral joins
ROWS FROM
Filling gaps in sequences
Subquery elimination
Combining queries
Set generating functions
Indexing joins
Advanced SQL
Introduction to advanced SQL
Grouping
Grouping sets, rollups, cubes
Window functions
CTEs
CTEs with window functions
Recursive CTE
Hierarchical recursive CTE
Handling nulls
Row value syntax
Views
Materialized views
Removing duplicate rows
Upsert
Returning keyword
COALESCE + generated column
Full Text Search
Introduction to full text search
Searching with LIKE
Vectors, queries, and ranks
Websearch
Ranking
Indexing full text search
Highlighting
Advanced JSON
Intro to JSON
JSON vs JSONB
Validating JSON
Creating JSON objects + arrays
JSON extraction
JSON containment
JSON existence
JSON recordset
Updating JSON
Indexing JSON parts
GIN index
Vectors (pgvector)
Intro to pgvector
Vector embedding columns
Find related articles
Upsert vector embedding
Semantic search
Other operators
Vector indexes
Outro
Thank you
Bonus interviews
Heroku's glory days & Postgres vs the world (with Craig Kerstiens)
Creating a Postgres platform with Monica & Tudor from Xata.io
Bootstrapping an email service provider (with Jesse Hanley)
Locked video

Please purchase the course to watch this video.

Video thumbnail
Advanced Indexing
Duplicate indexes

Full Course

$
349
$399
USD, one-time fee
Like many, I only dealt with databases when necessary. This course changed that, making me feel confident as my team's 'Database Expert'! Aaron is an engaging and insightful educator, making this a 10 out of 10 course. I'll definitely buy more courses from him. Highly recommend—no regrets!
Kyle Bennett
Kyle Bennett

PostgreSQL database platform

Shorten dev cycles with branching and zero-downtime schema migrations.

Test Data

I've made my test data available for you to use and follow along.

Download
or
Use on Xata

Summary

Learn why having duplicate indexes in a database is unnecessary and can lead to confusion. For example, if you already have an index on an email column and create a composite index that starts with email (like `email, is_pro`), it’s essentially a duplicate. By removing the single-column index, the composite index can efficiently handle both simple and complex queries, streamlining your database performance.

Video Transcript

Early on, when we talked about indexing, I said, it's a separate data structure that must be maintained. Every insert, update, or delete must touch that B-tree. You want to have as many indexes as you need, but as few as you can get away with. There's kind of a balance there. One thing that you definitely don't need is a duplicate index. You don't want to have the same index on two columns with all of the same parameters because that doesn't help anybody. It only costs, and potentially confuses you, the query planner, your teammates, you just don't need it. That much makes sense. With what we've learned about how B-tree indexes are accessed, you might accidentally have a duplicate, even if it doesn't look exactly like a duplicate.

Here's how that would normally happen. You would have something like, create index email on users (email);. As time goes on, this index exists and then business requirements change, right? That's totally normal. That's completely fine. Somebody comes along and says, "Hey, we need to do email_is_pro," for example. You go in and you say, great, I know how to do that. I know how to create a composite index. Maybe I'll do a partial index, but let's go ahead and do a composite index here.

Let's do an (email, is_pro); index, and you move on with your day, you go home, you're happy. That's totally fine. This happens. You do need to consider though, that you've just created a duplicate index. If we look at this one, we'll copy that, and we'll come down here. These are the two indexes that we have created. If we add a little space here, you'll see, hmm, that lines up quite nicely. Those indexes have the exact same leftmost prefix, which means, in the order of access, those are functionally the same index when it comes to querying for email.

If we were to do select * from users where email = '[email protected]';, we'll see that it should use, go through and explain, see that it should use this one up here first. We see it does prefer the smaller index. It does prefer the more compact index. If we were to drop, if we were to drop index email, we would see that it just happily switches over to, there you go, (email, is_pro);. The reason that it does that is because we start on the left side and work towards the right, meaning as far as a B-tree is concerned, those two are the exact same.

This B-tree is a little bit different. It does have that one extra value of is_pro, but it can happily be used to satisfy this query right here. Unless you have an extremely good use case and you know exactly why you want two indexes that share a leftmost prefix, consider this a duplicate index and get rid of this one, because this second one will satisfy the needs of this one, but not vice versa. You can get away with just using this one for the email only queries and the email and is pro queries.