Mastering Postgres
Introduction
Introduction to the course
Overview of course structure
Postgres vs. everyone
The psql CLI
Data Types
Introduction to schema
Integers
Numeric
Floating point
Storing money
NaNs and infinity
Casting types
Characters types
Check constraints
Domain types
Charsets and collations
Binary data
UUIDs
Boolean
Enums
Timestamps
Timezones
Dates and times
Advanced Data Types
Intervals
Serial type
Sequences
Identity
Network and mac addresses
JSON
Arrays
Generated columns
Text search types
Bit string
Ranges
Composite types
Nulls
Unique constraints
Exclusion constraints
Foreign key constraints
Indexing
Introduction to indexes
Heaps and CTIDs
B-Tree overview
Primary keys vs. secondary indexes
Primary key types
Where to add indexes
Index selectivity
Composite indexes
Composite range
Combining multiple indexes
Covering indexes
Partial indexes
Index ordering
Ordering nulls in indexes
Advanced Indexing
Functional indexes
Duplicate indexes
Hash indexes
Naming indexes
Understanding Query Plans
Introduction to explain
Explain structure
Scan nodes
Costs and rows
Explain analyze
Generating Results
Introduction to queries
Cross joins
Inner joins
Outer joins
Subqueries
Lateral joins
ROWS FROM
Filling gaps in sequences
Subquery elimination
Combining queries
Set generating functions
Indexing joins
Advanced SQL
Introduction to advanced SQL
Grouping
Grouping sets, rollups, cubes
Window functions
CTEs
CTEs with window functions
Recursive CTE
Hierarchical recursive CTE
Handling nulls
Row value syntax
Views
Materialized views
Removing duplicate rows
Upsert
Returning keyword
COALESCE + generated column
Full Text Search
Introduction to full text search
Searching with LIKE
Vectors, queries, and ranks
Websearch
Ranking
Indexing full text search
Highlighting
Advanced JSON
Intro to JSON
JSON vs JSONB
Validating JSON
Creating JSON objects + arrays
JSON extraction
JSON containment
JSON existence
JSON recordset
Updating JSON
Indexing JSON parts
GIN index
Vectors (pgvector)
Intro to pgvector
Vector embedding columns
Find related articles
Upsert vector embedding
Semantic search
Other operators
Vector indexes
Outro
Thank you
Bonus interviews
Heroku's glory days & Postgres vs the world (with Craig Kerstiens)
Creating a Postgres platform with Monica & Tudor from Xata.io
Bootstrapping an email service provider (with Jesse Hanley)
Locked video

Please purchase the course to watch this video.

Video thumbnail
Advanced Data Types
Arrays

Full Course

$
349
$399
USD, one-time fee
The course "Mastering Postgres" helped me practice query commands and understand why they're important. Inspiring me to explore the 'why' and 'how' behind each one. Truly a great course!
Nakolus
Nakolus

PostgreSQL database platform

Shorten dev cycles with branching and zero-downtime schema migrations.

Test Data

I've made my test data available for you to use and follow along.

Download
or
Use on Xata

Summary

Learn how to use arrays as a standalone data type in Postgres and when they should be preferred over creating separate tables. Postgres arrays offer a variety of functions and operators for efficient querying. Learn how to create arrays, insert data, and use operators to manipulate and query array data effectively.

Video Transcript

Another data type that Postgres supports is arrays. And now we do have arrays in JSON but we're talking about something different, which is just arrays as their own standalone data type. And that's what we're gonna look at here.

There are a lot of functions and operators and ways that you can query arrays that's actually incredibly cool and incredibly robust, we will say. We'll do a bunch of videos on that in the querying section of this course but I wanna show you the data type and show you a few of those operators. I think the interesting question for you is when should you use arrays versus breaking out the data into its own table? It's kind of a case by case basis, I can imagine something like storing an array of sensor readings or something like that, that doesn't necessarily need to be broken out into its own table because perhaps those readings are always associated with a sensor at a certain timestamp and you're doing those every minute. That makes plenty of sense to me. You're pulling back the whole thing and you want the whole chunk of readings right there together.

I could also see either storing tags as a textual representation of the tag or in fact storing the tag IDs and kind of simplifying, removing that intermediate linking table. That can make sense that proposes its... Or that raises its own issues in that you don't have that referential integrity anymore. If a tag gets deleted, it's not necessarily gonna go in and delete that item out of the array so there are trade-offs. You can decide when they are useful for you, but I am gonna show you that they exist and how cool they are so let me show you that.

Here we have a table and we just have an ID, and then we have an integer text, Boolean, and nested array columns. And you can see that we're declaring the type and then we're using these square brackets to denote that that column is an array, but it is an array full of integers. You could also say that it is an array just like that. That would be totally fine.

This one down here is a nested array and so we're saying that it is an array, an inside each array element in fact is its own array. Now that we have created that, we're gonna insert some values that I have right here. We're gonna just insert into the integer text and Boolean, we'll save the nested here for a second. And you have two formats that you can use here to declare or say that something is an array. This is the little more verbose format that's quite clear in my opinion so we can insert that. And then you have this format here, which is the curly brace format. This curly brace represents the beginning of an array, the end of an array. You can see this is our nested array right here because it is saying that the whole thing is an array.

And then inside of that are three elements, and then inside of each element are three integers. If we run that, you'll see we'll get rid of all of that and we'll do a select * from array_example and so you see the format that comes back to us is that curly brace format. If that looks familiar to you, that's probably why because that is the standard output format. Just like the JSON columns, there are a ton of functions that we can use against these array columns, some of which are kind of esoteric and some of which are totally awesome.

I'm trying to keep our discussion separate, I wanna talk about data types and building a good foundation here. And then when we get into more of the indexing and querying, we will build upon those data types and figure out what indexes go where, how to query it, how to update it, that kind of stuff. I'm trying to keep that... I'm trying to keep us on the straight and narrow in terms of data types.

However, I kind of wanna show you a few of these operators so let me show you just a few. Here is our base query, we just have this one row here that actually has a text array in it. The first thing you can do is pull out a single element. However, you cannot pull out the zeroth element because it is one based indexing. Honestly kind of wild. You know, I just don't see that very much any anymore. One based indexing where if you are using a JSON array in Postgres, that is zero based indexing, so keep that in mind. One based indexing, you can also do slices. We can say one to three or we can say, just gimme all of the items up to three. Or we can say, give me all of the items three and beyond. You can do open-ended slices, you can do closed slices, you can do one to 10. And even if there aren't 10, it's still gonna be totally fine so that is how you can do some slicing. I think the next one, let's do daisy. So we can say, show me ID and text_array from array_example where... I dunno why that capital W... Where text_array and then we do this wonky little guy right here and we can say array of let's say poppy, in fact. This is the array includes operator.

If we were to say, you know, something else, I'd say, "Well, there are no rows that include poppyasdf but there are some rows that include poppy." And of course you could use the shorthand poppy like that and that would still come back good. The last one that I wanna show you here, again there are many, but the last one I wanna show you here is fundamentally different because this returns a result set. If we did unnest of text array, it's gonna turn it into rows, which you can then operate on. If you had, and this is a little preview, if you had with flowers as and then you created a CTE out of this, select * from flowers, then you can do whatever you want. Select * from flowers let's name this... Boy this is fun, you guys are gonna love the querying section. As flower so where flower equals, what do we wanna say? Let's do poppy again so there you go. It's pretty cool, frankly. I do love this stuff especially unnest, which takes an array and turns it into a result set and the items that are not arrays will just be copied down. That's a nice way. That's a nice way to get back into SQL land out of just kinda like array land. We do love SQL land.

So hopefully you can see a few places where arrays might be useful. There are of course trade-offs like we talked about at the beginning but that's up for you to decide for your own use case. They are a very valuable and very functional data type, and we'll look at adding indexes and queries on them later.