Mastering Postgres has dramatically improved my understanding of indexes. It has helped me make informed decisions around my queries and schema rather than blindly adding indexes and hoping for the best.Dwight Watson
Shorten dev cycles with branching and zero-downtime schema migrations.
Ranges are not necessarily types unto themselves. It's sort of a meta type and each range has a subtype.
You can have an integer four range, an integer eight range, and a numeric range, a date range, timestamp range. You can have all kinds of different ranges. And when you declare a range, you declare the lower bound and the upper bound, either inclusive or exclusive. You could also have an unbounded upper and an unbounded lower. That's totally possible. And then you can do queries to determine does a certain discrete point exist within that range. Do these ranges overlap? Are these ranges contain? Does one contain the other range? You can also create exclusion constraints on some of these ranges such that you can't have overlapping ranges. That might be incredibly useful, for example, for a hotel room because you don't want two reservations overlapping in the same room, that would be bad. So ranges are very, very neat.
I'm gonna show you how to create several of them. And then later on we're gonna look at exclusion constraints and we'll come back to using ranges there. The first thing I want to show you is creating a range. I'm gonna open these brackets here and I'm gonna say, let's say one to five. And I'm going to cast this to an int for range. I will select that back and you'll see that it did, it totally changed. It completely changed what I gave it, which is kind of confusing.
Keep this in mind, it changed it to (1,6) instead of [1,5]. But if I changed this to numrange, you'll see it is now [1,5]. Let's write these down 'cause we're gonna figure this out together. One five bracket for that guy and then we'll come down here and say as an int4range became (1,6). What can we determine from this? Well, we can determine that there are two separate styles for writing a range. You have the bracket and you have the parentheses. What if we were to change this to (1,6) for the int4range, still (1,6) and (1,6) for the numrange. (1,6) there. What is going on here is that when you declare a range, you declare the lower bound and you declare the upper bound. You have to declare whether that is inclusive or exclusive.
What is happening here is when you declare a range, you have to declare the lower bound and the upper bound unless it's unbounded and you have to declare if that lower bound and upper bound are included or excluded from the range. In the case of this numrange here, what we're saying is include one and include five. The hard brackets, the square brackets mean inclusive. What happened down here is we did one five like that, it transformed it into one, six exclusive. The valid values here are one, two, three, four, five. And you could say that it's one through five inclusive or you could say it's one through six exclusive. These two things end up being the same thing when it comes to integers.
Now why did it change? Why did it change for the int4range? See it changed to one, six, whereas for the numrange it left it at one, five with the five being inclusive. Why did that happen? To answer that question, we have to think about these different data types, which is why I love doing stuff in order like this because now I know we've already covered integer forge, we've already covered numeric, we've already covered a lot of stuff and we can build upon it. Let's look at this again. The numrange was left at one, five inclusive on both ends. The intrange was changed to one, six with the upper bound excluded. That is because the numrange is continuous, there's no discreet step. If we were to list all of the values that were allowed in this numrange, we would be here for infinity because it would be one and 1.0000000000001. And then I'm not gonna do it, but you get it right? This is a continuous range without discreet steps.
A timestamp is functionally the same thing. Even though the precision is limited, it's functionally the same thing, a timestamp range. However, a date range has discreet steps. It is day by day by day. An intrange has discreet steps. This and this are the exact same for an intrange. However, when you do it like this for numrange, they're not the same because then suddenly 5.99 becomes valid, whereas in this case it's just five and then it's over. There is no continuous function where other numbers could be included.
I do want to show you one other way to create a range and then I have a table with some range data that we're gonna query against. Another way to create a range, we did it like this. We did one, five, one, five close, and then we did numrange. Totally fine, totally legit. You can also do this numrange one, five, but notice that it's different. It did one, five and then it excluded the upper bound even though the lower bound was included. That is the default. When you use these constructor functions, so if you did intrange like that, I think you need to do int4range like that. You'll see again it's the same style, lower bound included, upper bound excluded. You do have a third parameter here where you can put in your lower and upper bound inclusion or exclusion. That will change it. If we were to switch this back to numrange, you would see one, five with the square brackets on both sides and you could say actually exclude the lower bound and include the upper bound. That is totally fine.
This can be, I don't know, this can be a little bit nicer or maybe less cumbersome to work with than the string version depending on, you know, where you're coming from. If you're a human writing it, it's kind of the same but maybe programmatically it's easier to have discreet arguments like this. I don't know. Both are legit, I don't really care. Now here is a table id, normal, int4range, numrange, date_range, ts_range.
Let's read all of this back. Select * from, what did I call it? Range example, I should have guessed that. Select * from range example. And let's take a look at some of the stuff we have in here. Okay, so we have some that have lower bounds included and upper bounds excluded on integers numbers and dates. And you see the ts_range, which is the timestamp range, has lower bound included and upper bound included and then lower bound excluded. Kind of a mix of all of it. You'll see we do have some unbounded ranges. You see the five, nothing, one, nothing, this date range totally wide open on both ends. It's viable. You could absolutely do this. You could absolutely maybe say that this hotel room is under construction or is no longer existing maybe. Let's block it out with an unbounded date range in both directions. Hey, why not? Empty, however, is a different thing altogether that represents no range whatsoever. That is fundamentally different than all (laughs) the entirety of time. That is a completely different range.
Now if we look at some of these queries here, so the include operator, so you can see if a range contains a value using this little wonky guy here, the at sign and then the less than, greater than, who knows? If you run this, you'll see we're looking for the intrange where the range includes five. We got one to 11, two to 101 and five unbounded because five was inclusive down here. If we were to change five to exclusive over there and run this again, you'll see that guy did in fact drop out. If we were to change this to 11, unfortunately that's not gonna work because 11 is excluded there. We do get a few other guys popping in there, but we don't get the one that said 11. You can also check for overlaps. We just checked for the containment operator, that's what it's called, the containment operator. Does the end range contain the number five?
We can also look to see is there any overlap in the ranges and we can do that with the overlap operator, which is a double and here. If we were to look for ranges that overlap with 10 inclusive and 20 exclusive in the intrange column, let's just select ID and intrange to narrow this down a little bit. We see that there are some that overlap. Six unbounded definitely goes up through and past 10 and 20, 10 and 20 definitely overlaps, 10 to 101 overlaps, and one to 11 barely overlaps. But if we were to change that number one would drop out because if we look at that again, 11 is excluded and here we're saying 11 is included. The item number one does in fact drop out. Like I said earlier, you can use exclusion constraints to make sure that in your table you don't have any overlaps.
This right here we're just querying for overlaps and containment and that sort of thing. You can also query for intersections and upper and lower bounds. Let's look at that real quick. I'm gonna break away from this table for a second and we're just gonna construct it manually. Let's do an int4range from 10 to 20 and let's check the intersection or rather compute the intersection of another int4range from 15 to 20. Let's make it 25 and see what we get back there. We get a range result. If we did our old friend pg_typeof, and we got that back, it's an int4range. The intersection of two int4ranges is in fact an int4range and it is inclusive of 15 all the way up to exclusive of 20.
How can we make that inclusive of 20? Well, we can declare this range as inclusive of 20, and then because this one goes beyond 20, this will come back as inclusive of 20, which of course is the same for a discreet range of exclusive of 21. Hopefully all of this is making sense. It takes a little bit of practice honestly to get used to the inclusive, exclusive, continuous range, discreet range, all of that kind of stuff. But I want to show you just a few more things and that's programmatically determining the upper and lower bounds of these ranges should you need to figure that out. We still have our range here of 10 to 20 inclusive on both ends, and you can run and upper, but you get back 21.
Frankly I don't love this but I understand it and honestly I can't think of a better way. Let me show you what you can do here. I don't like that it says 21 because I said 20 inclusive and technically 21 is correct. If you were to look at upper inclusive. It says upper inclusive is false, which is correct when you say, all right, well this is a discreet range and that there are discreet steps of one, one unit per step 20 is included. Technically 21 excluded is true. Here's why I can't think of a better way because if we said numeric where a numeric range where 20 is excluded and normally you'd be doing this on a column, not just on a literal here, and I mistyped that that should be numrange, not numeric. If you had this situation right here, there is no way to represent the upper bound except by saying 20 excluded because like we said, there are infinitely many steps between 19 and 20.
If we run that, we get the upper bound is 20, but it is not inclusive, which is literally the only way that you could represent that. Of course you do have lower as well, for both of these functions. You could say lower and lower inclusive. Now just when you thought we were done with ranges, I'm gonna throw one more spanner in the works and that is a multi range. So far we've looked at upper and lower bounds. There is such thing as a multi range which can contain several upper and lower bound pairs to form a non-contiguous range, which would be multiple ranges. Let me show you that real quick. Okay, hang in there. We're almost done. Here is an int4multirange. You can see that it is inclusive of three, two, exclusive of seven, and then it starts over at eight and then exclusive of nine. If we run that, you'll see we just get back kind of the exact same thing that we put in. But we can also do the contains operator on this. If we said contains four, we would see true, seven would be false, and nine would be false. But eight should be there because it is inclusive. This is a nice way to store multiple discreet ranges.
We're getting way out into the bounds of sometimes useful, but it is sometimes useful. I think at the beginning I said anything you can imagine existing Postgres has a data type for it. I mean we're here for goodness sakes. I copied this part out of the docs just so I could show you all of them. You've got int4range, int4multirange, int eight, int eight, num, num, ts and tstz, which is nice. You could have a timestamp without a time zone, which questionable usefulness or a timestamp with a time zone, which is quite nice. Those both have multi ranges and then date range also has a datemultirange. So a lot about ranges there. Ranges are of course kind of a meta type with the subtype being something you're familiar with in four and eight, date, timestamp, tz, something like that.
Even beyond that, we have multi ranges which contain ranges, which themselves contain actual primitive types. Hopefully you can see a potential use case for your application. Either way it's good knowledge to have should you encounter a situation where you need to determine if a discreet point exists within a continuous or in the case of multi range, non-contiguous set of points. Range is what you're looking for.