Aaron is a natural teacher and this course is the best introduction to Postgres I have come across. Lessons are easy to follow and he recommends some great tools for working with Postgres.Joel Drumgoole
Shorten dev cycles with branching and zero-downtime schema migrations.
Now we're going to look at how we can reach into a json blob and pluck out certain parts. We can do this on the select sIDe, or we can do this on the wear sIDe. We can filter by a part of a json blob. We're not yet to indexing. We'll get there in a little bit. For now, I want to look at how do you access parts of a json blob. There are a couple different operators that we're going to look at right now. Here is a hard coded json blob. You can imagine this lives in some column, in some table. For now, we're just going to operate on this because it's really easy to look at and reason about. Let's say we want to extract just the order ID. We can use this single arrow operator here and we can say order ID. That gives us back one, two, three, four, five.
Now, if we were to extract, let's extract status, if we take a look at that, we see that shipped is in quotes, which is interesting. If we did, let's do PG type of, and we wrapped the entire thing, you'll see we're still getting json back. When we use that single arrow operator, that is the json extraction operator, there is a json unquoting extraction operator, which can be a little bit more useful if you're expecting a string back instead of a string surrounded in quotes. The unquoting operator is very similar to the plane extraction operator, except that it's this double arrow guy. If we run that, you'll see shipped with no quotes. We can verify by doing PG type of, PG type of, and we look at that and it's text. When we do these single arrow, that is json. Interestingly, if we did order ID, that also turns it into text. We can cast that into, we'll say an end four, and that will now come back as an integer. Now if you are trying to chain something, you do want it to come back as json because you want to keep adding arrows onto it.
For example, if we were to do something like this, let's take customer, so we messed something up here, we remove too many parentheses. Let's take customer. Customer is going to be this object here and I would run type of, but obviously this is json. You can keep going and we can do customer name and that is now a json string, which we can unquote and turn it into just regular text. You can mix and match. You can go one arrow deep to get a json object and then you could per perhaps do, let's do items, and then you can use an index. We'll do items one, and we see that it is zero based. It's a zero based array, so we're getting this first item here for one. If we did zero, we would get the zeroth item. We can run that, we can run that again, and we can say, give me the, gimme the price out of that guy. Or we could say, give me the product and that's going to be a quoted string, which we can unquote with the json unquoting operator. There are a couple other operators I want to show you, but there's one thing that I want to tell you first, and that is you're not going to get an error if you try to access paths that don't exist. You'll get null back, but it's not going to throw an error.
This can be really nice if your json doesn't have a terribly rigid schema and you don't know if items are going to be there or if the status is going to be there, you can go ahead and access it as if it is, and then you need to be sure to account for nulls on the other side. If we take a look back here and we pull items out and we say items, let's just put a bunch of foobar in there and we just get null back so we don't get a problem, we just get null back. In fact, we can keep chaining and say bazbuz, and we'll just get null back. You're pretty safe here. You need to account for the eventuality that this comes back as null. That's fine, we're used to that. The good news is you don't have to wrap this in some sort sort of error catching behavior. It's just going to say, I don't know, there's nothing there. I'm just going to return null.
Another neat thing about arrays is you can access them as we did from beginning to end. You can also access them from end to beginning and pull from the end of the array back. We can say, give me one back from the count, or we could go two back and that gets us equivalently to the zeroth product, so those things end up being the same. You can access those arrays from either end, which is quite nice. Another operator that I want to show you is this guy right here. With this guy, you can do some interesting stuff. You can do, let's say customer and then name. If you do that, you kind of get to walk that path in a a little bit tighter text only format rather than those arrows.
You could potentially build that up somewhere else and then just plop it right in there, which can be quite nice. There's also an extraction operator for that that operates only on the last part of that path. This also works just fine items, 0, product. You're going to get laptop in items, 0, product or items, 1, product, you'll get mouse. Going from the end, you get the same thing. This little path operator has an extraction option and it's, I don't know, it's kind of preference. It's kind of preference which one you use. This might be easier to construct over on the application side. There is one other method that I want to show you, and I'm going to show you this because it adopts a standard from the outside world outsIDe of Postgres itself. It adopts the json path standard. This isn't my favorite. I prefer Postgres' terse extraction operators, but we're here to master Postgres, not just learn things that Aaron likes, so let me show you a few of these.
We'll do jsonb path query, and you'll see here that I've taken away the casting to jsonb. This is literally just text, which is actually fine here because this jsonb path query is going to up cast it to json. Now, typically this is already going to be jsonb because it is coming out of your database. Normally you would see something like this, select that from data and we'll put the path in there. Sometimes you'll see this called attributes. If you're fancy, it might be called aurata, great word. Anywhere where your json is stored, you can pass that in as the first argument. We're going to pass a string. Now the json path, the dollar sign denotes the root node or the document. If we did $.status, you'll see we get back shipped in quotes indicating that this is still a json string. Now we can do something you might be familiar with customer.name. You get Alice back. You can do array traversal items[0].product, and you get laptop back.
Now the question is how do we unquote this? That's why I don't super love that or this style because you do end up having to do something goofy like that, which is unquote, and then you don't pass anything in, which means just unquote this scalar value that has been given to you. That is a way that you could get the status unquoted like that, and that works just fine. There are several more functions. I'll leave a link to the docs down below. The Postgres support for json is extremely robust. There are several functions that operate on just arrays, extracting values out of arrays. You can chain them all together. It gets to be very powerful. These are probably going to be your bread and butter. These are the ones you're going to use in maybe 90, 95% of cases, especially cases where we start creating generated columns or doing functional indexes on expressions that reach into a json blob. That stuff is super duper fun.
There are a few other things we're going to look at next, and that's something like the containment operators and a few more of the advanced json operators. These extraction operators are the ones you need to commit to memory because they're going to be the ones that are most useful.