Mastering Postgres is single-handedly the most thorough and informative database course I’ve seen. I’ve gone from someone who barely knew the basics of Postgres and how it works to being comfortable with the advanced topics of Postgres.Ryan Hendrickson
Shorten dev cycles with branching and zero-downtime schema migrations.
We just looked at creating a related articles feature. I want to show you how you can use embeddings to do a sort of, kinda like a semantic search. Like the words may not actually appear in the article, but the vibes might, right? Maybe that's what embeddings are. Maybe embeddings are vibes. When the user types in a search phrase, what we're going to do is we're going to send that phrase off to be turned into an embedding, and then we're going to query our database against that embedding.
We're going to do this all from the application side, because again, we have 1,500... Yeah, 1,536 dimensions, and I just, I don't want to deal with that. I don't want to copy and paste all that and put it into TablePlus. We're going to do it again from the application side, use your powerful brain to translate this into the language of your choice. The actual PHP part doesn't matter. We're just going to use this same command. We're just going to say, bail out early. Now we can, let's copy some stuff. Let's do this.
Let's say OpenAI embeddings instead of contents. What do you want our search query to be? Let's do... Let's see if we can hone in on that reputation is portable article. Now the vibes of that article basically says like, it doesn't matter what platform you hang out on, whether that's Twitter, LinkedIn, Bluesky, YouTube, whatever. If you do good work, your reputation is portable, you can just take it to every platform 'cause you're known for doing good work. Let's say, 'making a name for yourself', that is going to be our user's search query. Then hopefully, reputation is portable is going to come back. This is how we access the actual array of floating point numbers. The response, you go into the embeddings, take the first one and it's on a key called embedding. Then let's do this.
We're going to search the database. We're going to say, we'll change this to a select, and actually we don't need any of that, and we probably don't need any of that. All right, select *. Let's just do title. Select title from articles where, nope, order by embedding. Then the L2 distance away from that, limit 5. The actual embedding is that embedding. Since we don't need that much space, let's clean this up just a little bit, auto format, and we'll call this result, and then we can just dump and die the result there. Just to recap from OpenAI, we're going to make a vector embedding out of this search query, 'making a name for yourself'. Then we're going to query the Postgres connection and say, select title from articles order by the similarity to that embedding.
Hopping over here, we can run embed again, and something didn't work. Array to string. Ah, I did the exact same thing again, which I have done before, which is I need to turn it into the correct format. We're going to implode that embedding and turn it into an array like string. Now let's try that again. There you go. Yes! "Reputation is portable." "You're always doing something wrong." "What if you tried hard?" "This doesn't warrant a blog post." And "because I wanted to." And honestly, just approaching it from a vibes based angle, all five of those articles are the exact same vibe, and reputation is portable is in fact probably the one I would want to read if I had searched for making a name for yourself. Now let's do one more. Let's hop back over here, and let's do one more. You know, we could do one that has an actual match. The thing is, these words appear in the article.
In that case, a full text search would probably work just the same 'cause there's recursive CTE, but you see, we do get five other, rather four other database esque articles that come back. The vibes are directionally correct. Then what if we search for one where the words aren't there? Let's say 'solving a problem with sql'. Why would a user ever search that? I don't really know, but let's see what comes up. Look, that's great. Recursive CTE, distance querying, schema builder, count star, efficient pagination. Just from a vibes based angle, maybe vectors are vibes, maybe that's my new catchphrase. Vectors are vibes. From a vibes based angle, this search that we implemented in like two seconds is returning articles that I would want to read based on those search queries. Now of course, you can combine full text search with vector embedding based similarity, and you can kinda make your own search engine.
Again, I will say that there are search engines that are specifically built to be search engines, and those are probably better for the more complicated use cases. If you're running, you know, a simple SaaS product or a blog or maybe even e-commerce, honestly, this would be great. The great part about this is you don't have to bring anything else into your stack. You just use Postgres. You'll need to find some provider to create embeddings. Again, there are lots of local models that will create embeddings that are good enough for search like this. We've looked at related products and we've looked at some sort of vibes based search. Hopefully, you're starting to get a sense of how powerful these vector operations can be.