Enter your email below to watch this video
I bought Mastering Postgres because I use Postgres a lot and wanted to support Aaron and his content. I already learned more than I expected and have been very impressed with the quantity and work Aaron put in.Timo Strackfeldt
Shorten dev cycles with branching and zero-downtime schema migrations.
If you are sick of hearing about numbers, I've got good news for you. We're going to talk about characters now. We're not going to talk about numbers, we're going to talk about characters.
There are three different columns in which you can store character data. There is the fixed length character column, there's the character varying column there's the text column. Three columns that are in fact identical. They're the exact same thing. Now you may hear me say that think, "Ah, I think Aaron might have lost his mind." Well, let me tell you, that may be true. Under the hood, they're stored in the exact same data structure. This may be weird if you're coming from a different database or just like trying to intuit what would be more performant.
In fact, using the very compact fixed length character column is no more performant than using a character varying or in fact a text. I know that that sounds weird it is weird that I highlighted it multiple times in the docs here. In this big square right here, it says that character might be the worst performing of the three. That was very surprising to me. That's why we're here, to learn things. Let me show you some of these columns. We're going to start here with a fixed width character column. Before you go off create your tables like this, stay tuned.
What we're doing is we're declaring this abbreviation column as five characters wide. What happens if you insert something that is fewer than five characters wide? It gets padded with spaces. Select * from example. You'll see a is in there it's five. What if it was longer than five? Bcde should be five, f should be six you get an error. Do not use this type. Do not use the fixed width character type. It has several drawbacks. One is it doesn't actually enforce that it's five characters long. It just pads it out to be five characters long. If your abbreviation must be five characters long, you're hosed 'cause I just inserted one that was one character long. The other thing is because under the hood these storage types are all the same, you don't get any benefit from using a character type column, a fixed width character type column. You don't get any performance benefit. You get a little bit of a storage hit 'cause it is storing all of those spaces a little bit of a performance hit because it does oftentimes have to trim those spaces right back off. It doesn't give you good enforcement 'cause you could insert fewer characters. It doesn't save any space 'cause it's going to pad it out it's using the same underlying storage type. You might have a performance hit because it does have to trim off those trailing spaces quite often. Then when you get into comparisons, sometimes it depends on the coalition, whether the white spaces are significant.
Sometimes it depends on the operator if the white spaces are significant. The like operator, I think they're significant. The equals operator, I think they're insignificant. Just don't use it. Just don't use it. Let's keep moving. I will tell you the correct way to enforce a five-character limit, we're not going to do that right now. We'll do that in a future video. Char or char is the SQL standard. You could also do character. When it comes to, something you may be familiar with, varchar, varchar, varchar, the Postgres fancy way of saying that is character varying, which is I feel like it has a lot of good gravitas.
When you have a character varying, this is separate from the fixed width character. This is saying, ah, it may be up to some certain limit you can pass in some certain limit here. This is where I want you to think carefully about if this limit is correct. If we were to, we should probably change this to varchar. If we were to change this to last_name, 255 seems right. If you just put in 30 because you think, ah, that's good enough, this is where you need to be really careful have a good understanding of what your domain is because that might be a little stingy it can be a little bit of a pain to change later on, which again is going to lead us to, let's use an unbounded character column add some check constraints or a domain on top of it, which we will do soon. I feel like we're running up against needing to learn a lot about check constraints we will do that soon.
One thing here is this will not pad spaces out to 30. This allows you to put 30 characters in the column, it doesn't pad out like the fixed width column does. You also don't have to pass an N. You don't have to pass a number in here. This becomes the exact same as this. Now, text character varying without a limit can hold arbitrarily large sets of text. Should you expose this to an open text box without any validation, you could get a lot of text. We might need to put a constraint on this. I will say that after the text reaches a certain size, Postgres has this neat implementation called TOAST. It's like oversized attribute storage or something. The table of oversized attribute storage. I don't know, it's all caps TOAST. I forget what it stands for, basically what it does is it takes this large chunk of text. This is not something you're in control of you don't have to worry about this.
This is just really interesting I thought you should know. It takes that large chunk of text then splits it out into a different table such that your rows on disk can still be really compact you can still get to that TOAST or that oversized attribute if you want. MySQL has the same thing when you put a set of text in there that's too big, it puts it off into the big text parking lot basically. This is totally invisible to you, it is good to know you don't have to do this manually. Should you have a giant column, you don't have to think, well, I should cordon that off into its own table. Postgres is going to handle that for you. My recommendation is do not use the fixed width character column. That is not what you're looking for.
I'll show you what you're looking for in the next video. If you're going to use the varchar or the character varying, do make sure that it is large enough to hold the entirety of your domain. Let's say it's somebody's last name, that's their identity, they cannot change it, make sure that it's large enough to hold their last name. If it's like the title of a blog post, you're in charge of that. You can say, sorry, 75 characters is all you get. That's totally fine, I'm fine with that. I think in most cases what you'll end up wanting is a text.
A text with maybe a check constraint or maybe a domain on top of that. I think in order of preference, it's probably text, character varying, very, very last, i.e., don't use it, is fixed width character, which is totally different than other databases. If this surprises you, that's fine. It surprised me too when I first learned it. That is the case in Postgres it is a little bit different than the other databases. Now let's talk about those check constraints domains.