Mastering Postgres
Introduction
Introduction to the course
Overview of course structure
Postgres vs. everyone
The psql CLI
Data Types
Introduction to schema
Integers
Numeric
Floating point
Storing money
NaNs and infinity
Casting types
Characters types
Check constraints
Domain types
Charsets and collations
Binary data
UUIDs
Boolean
Enums
Timestamps
Timezones
Dates and times
Advanced Data Types
Intervals
Serial type
Sequences
Identity
Network and mac addresses
JSON
Arrays
Generated columns
Text search types
Bit string
Ranges
Composite types
Nulls
Unique constraints
Exclusion constraints
Foreign key constraints
Indexing
Introduction to indexes
Heaps and CTIDs
B-Tree overview
Primary keys vs. secondary indexes
Primary key types
Where to add indexes
Index selectivity
Composite indexes
Composite range
Combining multiple indexes
Covering indexes
Partial indexes
Index ordering
Ordering nulls in indexes
Advanced Indexing
Functional indexes
Duplicate indexes
Hash indexes
Naming indexes
Understanding Query Plans
Introduction to explain
Explain structure
Scan nodes
Costs and rows
Explain analyze
Generating Results
Introduction to queries
Cross joins
Inner joins
Outer joins
Subqueries
Lateral joins
ROWS FROM
Filling gaps in sequences
Subquery elimination
Combining queries
Set generating functions
Indexing joins
Advanced SQL
Introduction to advanced SQL
Grouping
Grouping sets, rollups, cubes
Window functions
CTEs
CTEs with window functions
Recursive CTE
Hierarchical recursive CTE
Handling nulls
Row value syntax
Views
Materialized views
Removing duplicate rows
Upsert
Returning keyword
COALESCE + generated column
Full Text Search
Introduction to full text search
Searching with LIKE
Vectors, queries, and ranks
Websearch
Ranking
Indexing full text search
Highlighting
Advanced JSON
Intro to JSON
JSON vs JSONB
Validating JSON
Creating JSON objects + arrays
JSON extraction
JSON containment
JSON existence
JSON recordset
Updating JSON
Indexing JSON parts
GIN index
Vectors (pgvector)
Intro to pgvector
Vector embedding columns
Find related articles
Upsert vector embedding
Semantic search
Other operators
Vector indexes
Outro
Thank you
Bonus interviews
Heroku's glory days & Postgres vs the world (with Craig Kerstiens)
Creating a Postgres platform with Monica & Tudor from Xata.io
Bootstrapping an email service provider (with Jesse Hanley)
Locked video

Please purchase the course to watch this video.

Video thumbnail
Vectors (pgvector)
Semantic search

Full Course

$
349
$399
USD, one-time fee
Going through the Mastering Postgres course by Aaron Francis and holy cow is it well designed. These types of well-polished knowledge products bring me an immense amount of joy.
Adam Taylor

PostgreSQL database platform

Shorten dev cycles with branching and zero-downtime schema migrations.

Test Data

I've made my test data available for you to use and follow along.

Download
or
Use on Xata

Summary

Learn how to create a vibe-based semantic search feature using embeddings, where users can find articles that match the overall theme rather than just specific keywords. By converting a search query into an embedding and comparing it against your article database, you can find the best matches based on the "vibe" of the content. This approach enhances your search functionality in Postgres, all without requiring major changes to your existing tech stack.

Video Transcript

We just looked at creating a related articles feature. I want to show you how you can use embeddings to do a sort of, kinda like a semantic search. Like the words may not actually appear in the article, but the vibes might, right? Maybe that's what embeddings are. Maybe embeddings are vibes. When the user types in a search phrase, what we're going to do is we're going to send that phrase off to be turned into an embedding, and then we're going to query our database against that embedding.

We're going to do this all from the application side, because again, we have 1,500... Yeah, 1,536 dimensions, and I just, I don't want to deal with that. I don't want to copy and paste all that and put it into TablePlus. We're going to do it again from the application side, use your powerful brain to translate this into the language of your choice. The actual PHP part doesn't matter. We're just going to use this same command. We're just going to say, bail out early. Now we can, let's copy some stuff. Let's do this.

Let's say OpenAI embeddings instead of contents. What do you want our search query to be? Let's do... Let's see if we can hone in on that reputation is portable article. Now the vibes of that article basically says like, it doesn't matter what platform you hang out on, whether that's Twitter, LinkedIn, Bluesky, YouTube, whatever. If you do good work, your reputation is portable, you can just take it to every platform 'cause you're known for doing good work. Let's say, 'making a name for yourself', that is going to be our user's search query. Then hopefully, reputation is portable is going to come back. This is how we access the actual array of floating point numbers. The response, you go into the embeddings, take the first one and it's on a key called embedding. Then let's do this.

We're going to search the database. We're going to say, we'll change this to a select, and actually we don't need any of that, and we probably don't need any of that. All right, select *. Let's just do title. Select title from articles where, nope, order by embedding. Then the L2 distance away from that, limit 5. The actual embedding is that embedding. Since we don't need that much space, let's clean this up just a little bit, auto format, and we'll call this result, and then we can just dump and die the result there. Just to recap from OpenAI, we're going to make a vector embedding out of this search query, 'making a name for yourself'. Then we're going to query the Postgres connection and say, select title from articles order by the similarity to that embedding.

Hopping over here, we can run embed again, and something didn't work. Array to string. Ah, I did the exact same thing again, which I have done before, which is I need to turn it into the correct format. We're going to implode that embedding and turn it into an array like string. Now let's try that again. There you go. Yes! "Reputation is portable." "You're always doing something wrong." "What if you tried hard?" "This doesn't warrant a blog post." And "because I wanted to." And honestly, just approaching it from a vibes based angle, all five of those articles are the exact same vibe, and reputation is portable is in fact probably the one I would want to read if I had searched for making a name for yourself. Now let's do one more. Let's hop back over here, and let's do one more. You know, we could do one that has an actual match. The thing is, these words appear in the article.

In that case, a full text search would probably work just the same 'cause there's recursive CTE, but you see, we do get five other, rather four other database esque articles that come back. The vibes are directionally correct. Then what if we search for one where the words aren't there? Let's say 'solving a problem with sql'. Why would a user ever search that? I don't really know, but let's see what comes up. Look, that's great. Recursive CTE, distance querying, schema builder, count star, efficient pagination. Just from a vibes based angle, maybe vectors are vibes, maybe that's my new catchphrase. Vectors are vibes. From a vibes based angle, this search that we implemented in like two seconds is returning articles that I would want to read based on those search queries. Now of course, you can combine full text search with vector embedding based similarity, and you can kinda make your own search engine.

Again, I will say that there are search engines that are specifically built to be search engines, and those are probably better for the more complicated use cases. If you're running, you know, a simple SaaS product or a blog or maybe even e-commerce, honestly, this would be great. The great part about this is you don't have to bring anything else into your stack. You just use Postgres. You'll need to find some provider to create embeddings. Again, there are lots of local models that will create embeddings that are good enough for search like this. We've looked at related products and we've looked at some sort of vibes based search. Hopefully, you're starting to get a sense of how powerful these vector operations can be.