Mastering Postgres
Introduction
Introduction to the course
Overview of course structure
Postgres vs. everyone
The psql CLI
Data Types
Introduction to schema
Integers
Numeric
Floating point
Storing money
NaNs and infinity
Casting types
Characters types
Check constraints
Domain types
Charsets and collations
Binary data
UUIDs
Boolean
Enums
Timestamps
Timezones
Dates and times
Advanced Data Types
Intervals
Serial type
Sequences
Identity
Network and mac addresses
JSON
Arrays
Generated columns
Text search types
Bit string
Ranges
Composite types
Nulls
Unique constraints
Exclusion constraints
Foreign key constraints
Indexing
Introduction to indexes
Heaps and CTIDs
B-Tree overview
Primary keys vs. secondary indexes
Primary key types
Where to add indexes
Index selectivity
Composite indexes
Composite range
Combining multiple indexes
Covering indexes
Partial indexes
Index ordering
Ordering nulls in indexes
Advanced Indexing
Functional indexes
Duplicate indexes
Hash indexes
Naming indexes
Understanding Query Plans
Introduction to explain
Explain structure
Scan nodes
Costs and rows
Explain analyze
Generating Results
Introduction to queries
Cross joins
Inner joins
Outer joins
Subqueries
Lateral joins
ROWS FROM
Filling gaps in sequences
Subquery elimination
Combining queries
Set generating functions
Indexing joins
Advanced SQL
Introduction to advanced SQL
Grouping
Grouping sets, rollups, cubes
Window functions
CTEs
CTEs with window functions
Recursive CTE
Hierarchical recursive CTE
Handling nulls
Row value syntax
Views
Materialized views
Removing duplicate rows
Upsert
Returning keyword
COALESCE + generated column
Full Text Search
Introduction to full text search
Searching with LIKE
Vectors, queries, and ranks
Websearch
Ranking
Indexing full text search
Highlighting
Advanced JSON
Intro to JSON
JSON vs JSONB
Validating JSON
Creating JSON objects + arrays
JSON extraction
JSON containment
JSON existence
JSON recordset
Updating JSON
Indexing JSON parts
GIN index
Vectors (pgvector)
Intro to pgvector
Vector embedding columns
Find related articles
Upsert vector embedding
Semantic search
Other operators
Vector indexes
Outro
Thank you
Bonus interviews
Heroku's glory days & Postgres vs the world (with Craig Kerstiens)
Creating a Postgres platform with Monica & Tudor from Xata.io
Bootstrapping an email service provider (with Jesse Hanley)
Locked video

Please purchase the course to watch this video.

Video thumbnail
Advanced Data Types
Bit string

Full Course

$
349
$399
USD, one-time fee
The course broke down complex subjects into easy-to-understand videos that I've referred back to multiple times since completing it. It strengthened my knowledge by filling gaps and expanding into new areas I hadn't known about—absolutely worth doing!
Lawrence
Lawrence

PostgreSQL database platform

Shorten dev cycles with branching and zero-downtime schema migrations.

Test Data

I've made my test data available for you to use and follow along.

Download
or
Use on Xata

Summary

Learn about bit strings, which are sequences of ones and zeros used to store multiple true/false values in a compact format. While they can be useful for larger datasets, like feature flags, they're not always ideal, especially for small numbers of Boolean values, as they can be confusing. Bit strings offer an alternative to integers, JSON columns, or separate tables in Postgres, providing flexibility for handling specific types of data.

Video Transcript

This is one of my favorite esoteric data types, and it is a bit string, and that is just simply a string full of on-off values, ones and zeros, that's it. And what those ones and zeros mean, totally unclear, and this is the problem with the bit string is it's completely opaque and it is a total mess, as a human, to try to look at it and be like, what does any of that mean? So I can't super advocate for using it everywhere, but it is the right tool for some jobs, so I wanna tell you about it, but please don't go convert everything in your database to bit strings, because your coworkers will not be happy about that.

Let me show you these bit strings. A bit string looks something like this. You can prefix it with a B, and then open your string, and then you can say, let's have four bits, with the very first one being on and the other ones being off. If we ran that, you'll see we get that, but PG type of will tell us that we are, in fact, working with a bit column. You could also do it like this, 0001, cast to a bit four, and that would give you a bit column as well. Let's get rid of these guys and just go back down to the bit strings, and those look identical to me.

Now, what is the point, just kind of in general, of a bit string? It can be a very nice way to store multiple, discrete pieces of true/false, on/off information, to store a bunch of those in a single column. Now, it is very hard to reason about as a human, so if you have two or three Boolean values, I'm gonna say put those two or three Boolean values in two or three Boolean columns. I think it changes when you get to 64 pieces of information.

You don't want 64 columns, and so a bit string can be really helpful there. You could also potentially use an integer and do your bitwise operations on just an integer. That's a little bit of a stylistic preference, I think it is. If you're doing bitwise operations, I think it's kind of nice to look at it and see it as a bit string. That's much more understandable to me, even if it's still totally opaque what the actual bits mean.

It's still very clear to me, when I look at it and see a bit string, that there are eight values and two of 'em are on, versus just seeing the number 17 or whatever. So if you want to use an integer, you can. Bit strings are also there, but let me show you how you can do some bit operations here. We're gonna pretend that this is the user's feature flags. Is this the best way to store feature flags? Probably not. You could probably do it as discrete Boolean columns or as a JSON column or even as a separate table, but we're just gonna pretend for now, and then we will say, let's try to figure out if this user has feature one enabled, and the way that we can do that is by using a bit mask. Let's imagine this user has feature one and feature three enabled, and this is the mask right here.

This is the bit mask, and what we're gonna do here, what a bit mask does is it compares both of these bit strings and returns a value where if any bit is on in both places, it will be on in the result. If it is on in one place, but not the other, it will be off in the result, and if it's off in both places, it will be off in the result, and so we should expect to see, there you go. We should expect to see 001 right there, because this is the only bit where it's on in both the user's feature flag and the bit mask. Now, if we were to change the mask to this, you would run it, and you would see, yep, indeed, that is what we were expecting. Is this useful in every situation? Absolutely not.

Is it super cool? You better believe it is. In Postgres, you have a bit and a bit varying column, so let's say you've got a table here, you'd probably have some ID, we can skip that, and we will say that, let's do a bit three, and we will call that bit three, and then you can have a bit varying, and you can say bit varying up to, I don't know, 32 bits or something like that. If you create that and then you insert into bits values, we'll say B 001, and then the bit varying, you can put in as many as you want, up to 32, and so if we run that, it worked just fine. However, if you run that, it must match the way that it was declared. In our case, it was declared as a three-bit column.

Keep this one in the back of your mind for any time where a problem that is kind of shaped like a bit mask comes up, you do have bit strings in Postgres. Of course, you could just use integers, you could use discrete Boolean columns, you could use a JSON column, you could use a separate table, so there are lots of solutions, and that's one of the great things about being a developer is you get to decide which solution is best for your needs.