Mastering Postgres
Introduction
Introduction to the course
Overview of course structure
Postgres vs. everyone
The psql CLI
Data Types
Introduction to schema
Integers
Numeric
Floating point
Storing money
NaNs and infinity
Casting types
Characters types
Check constraints
Domain types
Charsets and collations
Binary data
UUIDs
Boolean
Enums
Timestamps
Timezones
Dates and times
Advanced Data Types
Intervals
Serial type
Sequences
Identity
Network and mac addresses
JSON
Arrays
Generated columns
Text search types
Bit string
Ranges
Composite types
Nulls
Unique constraints
Exclusion constraints
Foreign key constraints
Indexing
Introduction to indexes
Heaps and CTIDs
B-Tree overview
Primary keys vs. secondary indexes
Primary key types
Where to add indexes
Index selectivity
Composite indexes
Composite range
Combining multiple indexes
Covering indexes
Partial indexes
Index ordering
Ordering nulls in indexes
Advanced Indexing
Functional indexes
Duplicate indexes
Hash indexes
Naming indexes
Understanding Query Plans
Introduction to explain
Explain structure
Scan nodes
Costs and rows
Explain analyze
Generating Results
Introduction to queries
Cross joins
Inner joins
Outer joins
Subqueries
Lateral joins
ROWS FROM
Filling gaps in sequences
Subquery elimination
Combining queries
Set generating functions
Indexing joins
Advanced SQL
Introduction to advanced SQL
Grouping
Grouping sets, rollups, cubes
Window functions
CTEs
CTEs with window functions
Recursive CTE
Hierarchical recursive CTE
Handling nulls
Row value syntax
Views
Materialized views
Removing duplicate rows
Upsert
Returning keyword
COALESCE + generated column
Full Text Search
Introduction to full text search
Searching with LIKE
Vectors, queries, and ranks
Websearch
Ranking
Indexing full text search
Highlighting
Advanced JSON
Intro to JSON
JSON vs JSONB
Validating JSON
Creating JSON objects + arrays
JSON extraction
JSON containment
JSON existence
JSON recordset
Updating JSON
Indexing JSON parts
GIN index
Vectors (pgvector)
Intro to pgvector
Vector embedding columns
Find related articles
Upsert vector embedding
Semantic search
Other operators
Vector indexes
Outro
Thank you
Bonus interviews
Heroku's glory days & Postgres vs the world (with Craig Kerstiens)
Creating a Postgres platform with Monica & Tudor from Xata.io
Bootstrapping an email service provider (with Jesse Hanley)
Locked video

Please purchase the course to watch this video.

Video thumbnail
Generating Results
Filling gaps in sequences

Full Course

$
349
$399
USD, one-time fee
Going through the Mastering Postgres course by Aaron Francis and holy cow is it well designed. These types of well-polished knowledge products bring me an immense amount of joy.
Adam Taylor

PostgreSQL database platform

Shorten dev cycles with branching and zero-downtime schema migrations.

Test Data

I've made my test data available for you to use and follow along.

Download
or
Use on Xata

Summary

Learn how to use a left join with `generate_series` in Postgres to fill in missing dates in a sales report. By creating a complete date range, such as from January 1st to January 31st, you ensure that even days with no sales data are included, with zeros replacing missing entries. This approach, combined with the `COALESCE` function, ensures a comprehensive and gap-free report.

Video Transcript

This is my favorite part of courses like this because all the atomic pieces start to come together and we really start to make a lot of progress.

Here we're going to look at generating a series and left-joining to cover gaps in sequences. This one is going to be fun. I've made up a sales table here. We have select * from sales. Sales dates go from January 1st to January 31st. And then there's just an amount and a user ID. But, if we say select sale_date and sum of amount from sales group by, which we'll talk about group by in a little bit. Group by sale_date order by sale_date ascending. You'll see this does add it all up. But, look, right there. 01-03, 01-05, unlucky. There were no sales on the 4th of January for whatever reason. But our report is not necessarily showing that, it just looks like missing data. When you have a report like this and you have this nice, neat, orderly sequence and then a piece of missing data, it really like, (groans) makes me wonder, "Is the whole thing right?"

What we're going to do here is we're going to generate a series and join this data in, such that we get 01-04 on this report, even if it does show a zero. I think the most fun way for us to do this is just to build it up together. I have this here, generate_series 01-01 to 01-31 by day. We've seen this before. We can run that, and if we write the right words, we can run that and get back exactly what we're looking for here. We are going to use this as the basis to join on. This is going to be our left-hand table. It makes some space there. We're going to say, as all_dates, and then, left join. Then we're going to open a subquery. We're just doing it all here. We're going to left join, because no matter what, we want the date to show up. We need the date to show up, that is the point, even if there are no matches in the right-hand table.

The right-hand table is going to become this guy. We do not need this ordering inside, we can move that outside later. We're going to select sale_date and sum(amount), and we'll select that as total_amount from sales group by sale_date as, let's just say, well, we can just alias it back to sales. That's totally legal. We're going to just call this sales on sales.sale_date =, what did we call it? all_dates. We didn't give that one a name, so we're going to say, all_dates(sale_date). We're going to give it a table alias and a column alias. all_dates.sale_date. We have our semicolons. We can get rid of this guy. And now, if we run that, you'll see, kind of, I mean, yeah, that actually, (chuckles) that actually worked first try. This is, in fact, my first take on this, and that worked.

We have the big gnarly sale date over here, which looks like a timestamp time, er, timestamptz. And then we have the sale_date and the total_amount. And that is kind of, that's kind of what we're looking for. Close. Let's make it a little bit prettier. What we can do here, is up here, we can say select, let's do all_dates.sale_date, cast it to a date. How does that look? Okay, we got all dates, that's good.

I forgot what we're actually doing here. We have the sale date and we need the, yep, now we need the total_amount. And we're getting closer. That feels a little bit better. There's one function that we can use here, which is called coalesce. And that's going to put that zero in there. Without the coalesce, let's just run them side by side. Without the coalesce, we have a null. We're intuiting here, and now I'm going to tell you, that the coalesce function simply chooses the first non-null argument. In fact, you could write, just for fun, I don't know what you do for fun, but maybe this is it, and it will still work because it's going to say, "No, no, no, no. There we go, there's one." And in the case of January 4th, it's going to say, "No, there's a zero." So remember, zero and null are not the same thing. Null is unknowable, zero is, well, zero. Coalesce will happily skip over total_amount being null and choose zero. A real-life, honest-to-goodness example of a left join in a generate_series on dates.

Let's look at it one more time and then we'll call it. What we've done here, this is silly. I don't know what you do for fun. That was very silly. All right, so here we go. If we run that, what we've done here is, we're going to skip over the select for now. We took a generate_series and gave it a table alias and column alias. We used that as our left-hand table in a left join, which we know means, "Please preserve everything from the left-hand table no matter what." And we're guaranteed that this is going to be complete because we're stepping from 01 to 31 by one-day increments. That gives us our nice, clean list of dates.

Then we're saying, "Let's left-join something in, but instead of left-joining a table, let's left-join this in," and this is a subquery. Inside this subquery, we selected the sale_date, we grouped by that sale_date, and we aggregated, we summed up the amounts and gave that an alias from the sales table. In fact, we turned around and aliased it right back to the same name, which can be confusing, but is totally legal. Then we said, all right, sales.sale_date, which is this guy here, equals all dates.sale_date, which is that guy there. It preserved everything from the left-hand table, matched up the sale_dates, which brings along the total_amount.

We had those three columns that we had to deal with. We cast the sales_date, or the all_dates.sale_date to a date instead of a timestamptz. Then we coalesced the total amount because on the days where there were no sales, it was showing up as null, and we wanted it to show up as zero, and coalesce will pick the first non-null argument.

In many other databases, you'd be stuck using a recursive CTE to do this because we don't have a generate_series in many other databases. But thank you, Postgres. We have generate_series, and that can help fill in gaps in sequences.