Welcome back to Database School. I'm your host, Aaron Francis, but more importantly, you're here to listen to the guests. Today we're going to be talking about Postgres, which may or may not be how you're supposed to pronounce it, but we have the expert here with us: Craig Kerstiens, and I think I pronounced that correctly. So Craig, do you want to give yourself a little introduction and we can get started?

Sure. So I'm Craig. I work over at Crunchy Data; we're kind of all things Postgres, been around for a little over 10 years. I've been doing Postgres, for better or worse, for going on maybe north of 15 years now. I kind of stumbled into it, but I was, you know, early, early product at Heroku. I was actually kind of the black sheep early at Heroku: I came on board there to launch their Python support. Everyone knows Heroku from the Ruby community, right? But I came on board to work on some of their other languages, and ended up doing a whole lot with Heroku Postgres. I helped them scale to, I think at the peak, something like 1.5 million Postgres databases across all our customers. So the idea of a one-in-a-million problem was, what, once a week, right?

Yeah.

I left there to go to Citus Data. Very different set of things: instead of millions of customers, our average customer was 40 terabytes, and the largest was 960 terabytes in a Postgres cluster. So not quite petabyte scale, but I'm going to round up. I'm going to cheat a little.

We can round up. Yeah, that's fine.

They were acquired by Microsoft, and I ran all Azure Postgres for a little while, and then was kind of ready to get back to small-team building again. I kind of joke that it's unfinished business from my Heroku Postgres days, of just making Postgres absolutely amazing for developers. So that's a little bit of my history: basically Postgres for a long time. But I don't come from a database background. I think you've probably talked to a bunch of folks
that are like, "Yeah, I spent 10 years on Oracle internals, and I contributed this feature and that feature to SQLite." I come at it much more from a developer angle, and we'll probably get into some of that.

You're a man after my own heart. That is good. I like to hear that.

Yeah. I've actually never committed a line of code to Postgres, and it'll probably stay that way. But it is a great database and I'm definitely fond of it.

So I want to hear about the glory days of Heroku. I think I was just arguing with Dax on Twitter, as one does, about how everybody's trying to rebuild Heroku, and I'm like, yeah, let's do it, bring back good old Heroku. So when did you join them, and what was it like back then?

Right around acquisition time. So I was around pre-acquisition and was a full-time employee just after. I don't remember my exact number, maybe employee 22 or 23.

Oh, wow.

It was still small then. The all-hands meeting wasn't a meeting; everyone just sat around the lunch table on Wednesdays and the founders would talk. And man, it was an interesting time. Now I kind of feel like I belong in tech. Imposter syndrome is a thing, right? You're surrounded by smart people. I was reminiscing with a lot of us early Heroku folks, and we all felt like we didn't belong, and we were building something amazing and pushing each other. There's a lot of talk these days about remote versus in-office, and I don't think Heroku could have been created with that same culture remote. There's just something about feeding off each other in a room. Not to say you can't build great things remotely, but it was a time and place. Us and GitHub were similar companies. We had some folks in those early days leave Heroku to go to GitHub and, vice versa, some of the early GitHub folks come over to Heroku. Man, there were a lot of philosophies in there. I mean, at the core of it is how easy it is
to deploy an app. And I think it's easier today, but there's still nothing that matches or beats git push heroku master.

Hey, we'll get back to the interview in just one second, but since you're here, you like databases, and you probably like Postgres, let me tell you about Mastering Postgres. This is a course that I am teaching, coming out October 15th. It's a way that you can master the most important part of your stack: your database. So go to masteringpostgres.com and sign up to hear when we release it. Okay, back to the interview.

How is that possible? In the year of our Lord 2024, how is it possible that we've lost git push heroku master?

I think I chimed in a little bit on that thread, and someone's like, "But it's so easy to build a Dockerfile, and Rails gives you one by default." And I'm just like, cool, great. And then when you want to update it or muck with something, system dependencies, that's still all on you, right?

Yes. And what if I need a database? What if I need a Redis? The Heroku marketplace was ahead of its time, but it's the right time, right?

The first Heroku add-ons... Redis To Go: Heroku built it and then gave it away to Redis To Go to run.

Huh, I didn't know that. That's funny.

I think it was also the same with, I want to say, maybe Zencoder, the first video encoding service. I think Heroku built the first one of that and gave it away to another company. Heroku didn't want to be everything, right? But we wanted best-in-class support for these add-ons. So many of the early Heroku add-ons, Heroku built, and then we just gave them away to a company and a friend that we knew to run them as full-on companies.

Just to start the extension, or add-on, flywheel turning a little?

Pretty much, yeah. I mean, we had people coming and asking to run [ __ ] and to run Redis, and eventually we caved on Redis, but we said no, it's just Postgres, for a long, long, long time.

That is super interesting. I did not know that. That's some good
lore right there. And then am I understanding correctly that you either built or spearheaded the whole Dataclips thing?

So myself and the rest of the team, aside from one person, were anti-Dataclips.

Okay. First tell us what Dataclips is, slash was, and then tell us that story as well.

So Dataclips inside Heroku was one of the best features we ever built and used within Heroku Postgres. Internally we dogfooded it so much, and it was one of the least adopted features of Heroku broadly; people just didn't use it. It's basically like a GitHub gist, but live against your database. You write some SQL, you save it, and you get a unique URL, and you can just send this around the internet and share it. You can secure it if you want with SSO, but you can also have it be an unguessable URL that you can just go and append .csv to, and it'll download a CSV. So internally, once we had this, we would just write SQL that's live against the database, and it'll auto-refresh every hour. And then, instead of rolling out Metabase or Power BI or whatever BI reporting tool, in just Google Sheets you can do, like, equals IMPORTDATA, open parenthesis, quote, url.
csv, and it'll live-load this.

Yep. I love IMPORTDATA in Google Sheets. You're speaking directly to me. Okay, so keep going.

So we ran the bulk of Heroku's business, like BI reports, for a long, long, long time just on Dataclips against live production, a read replica of our production database. Load these up in Google Sheets, create some charts on them, and boom, that was our KPI board meeting. Every team: where's our dashboard? We would sync Zendesk tickets into a Heroku database via, you know, Heroku, and then it's like, let's chart how many critical tickets, and what type. All of it was basically just CRUD: sync some data into Postgres and chart it.

Now, where did it come from? One PM had this idea. He wanted this feature and functionality. And the way you got buy-in at Heroku wasn't that PMs told people what to do or execs told people what to do. It was: you motivate someone that this is a good idea, you get people excited, and they go build it. It was not top-down. If you thought something was interesting, you'd build a demo and show it off, and you'd get people behind it, and that's how new products launched. I mean, Heroku's buildpack stuff and the Cedar stack was a couple of people hacking at a Tahoe offsite for a week, then showing the rest of the company how it could work, and light bulbs going off, right? So very much: best idea wins, and you get alignment behind it. He got no buy-in from the team. No one wanted to build this. Everyone thought it was a terrible idea. So he went and found some third-party consultants.

I mean, at the time I was just like, no, we should be doing these other things for databases. It was Matt Soldo who was the PM and pushed it, and we all had a mea culpa later: we were wrong on this one, the entire team. But he just found a third-party consultant to
build this as a third-party SSO app. And man, detangling that and making it first-party was such a pain, because it was a hack. Well, it wasn't a hack, but it was not built in the way we would have built it if we'd built it first-party. So untangling that was fun. But yeah, at Crunchy that was one of the first things where I'm like, all right, we're rebuilding this, and we're rebuilding it better, and it's going to be core and central to our product. Because once you have the data, writing SQL is an easy way to get the answer. So many times in a live meeting it's like, all right, I can write this query against the database and have an answer. Within Heroku Postgres we always had a saying: if we've got data, let's use it; if all we've got are opinions, let's go with mine.

I like that. That's mine now. I'm going to steal that.

We went to using data a lot. So live in a meeting, we'll move on from a topic, and five minutes later we'll circle back because it's like, oh, I've got the results of how many people actually set up logging, or don't use HA. I mean, as a database provider, I love to look at how many customers have "prod" in the name of their database and don't have high availability enabled.

That's a pretty good target list of people to reach out to and get them to activate that. I think what resonates with me about all that is I like the central source of truth, and I like owning it. You were talking about Zendesk tickets and stuff, and once you start trying to tie all these sources together, I've always found that the easiest way to do that is to use my programming language/framework and just put it in my database. That's the thing that I like. As we're building out these learning platforms, I'm syncing all the data down to our database so that I can just look at stuff. And so that
super resonates with me as the developer. So this guy paid somebody to build it, then y'all brought it in-house, and then you said it wasn't very widely adopted. Did Heroku users not use it that much, but the people that did loved it? Because I still use Heroku for one thing, and I used to use Heroku a lot, but I never used Postgres, and so I never used Dataclips. But I have heard it's got a cult following of people that still love Dataclips, miss Dataclips, sounds like you included. So what was the story once y'all rolled out the first-party version of it? Did people use it?

I mean, yeah, I think it had that cult following, but it was a really small cult, right? It never caught on; it never had mass use. At one point in time we had this idea... Heroku Postgres was built as Postgres for Heroku users, but we never created the Heroku experience for Postgres, like the idea of git push heroku master. We had a prototype one time of: what if you ran git bisect on your database? Show me when this data changed, based on this query. Within the cloud we already have point-in-time recovery; we can just replay the write-ahead log and see when data changes. If you want to know when Aaron updated his email to be, you know, Try Hard Studios, like, okay, when did that happen? When did he change his account to that, for whatever reason? Well, we have all that history in the WAL, and we could just spin up multiple followers and slowly replay bit by bit by bit. That's a magical kind of experience. It's fascinating: back then we had a lot of the stuff that exists in the current clouds. Like "I want branching for my database": we actually had that supported at one point, and no one used it, where it's just like, there's no point. Postgres itself, way, way, way back, used to have time travel. People talk about it; they've used, like, the
Datomic time travel and love that functionality, and it's like, no one used it in Postgres. They emailed the mailing list and it was crickets. So a lot of these things, I think, are fun and shiny, but not what purveyors of databases use all the time. Dataclips is one of those things where I'm shocked. I think it's because often, once a company gets to having a BI team or an analytics team, they want to own their tools and not be kind of in the native stack, usually.

Yeah, that makes sense to me. So it falls in this real small sliver of companies that would use it. It's the people that are probably pretty early founder-hacker types that want to go in and write raw SQL against the database, versus setting up Metabase, Tableau, or some other BI tool once you get a big team.

You've got just enough people where it's like, well, you can connect to your database and write SQL, and that's easy to do, but I need to share this with three people, right? And one of them is not technical, and one of them is, so they want to see the SQL. But I found it fascinating once we started sharing it at Heroku. One of my favorite things was the marketing team would be like, "Hey, we need to know our monthly active users," and I'm like, okay, great, here's the report for that. And they'd come back next month and be like, "Hey, I need this again." And I'm like, okay, well, we had at one point a Facebook app partnership; did I filter those out last month or not? I don't remember how I did that.

Oh, right.

Or they need, like, "Hey, instead of last month, I need the last two weeks." And they see the data clip, which had the SQL in there, and they're like, "Hey Craig, what happens if I change this from one month to two weeks?" And suddenly we have marketing people like, "Hey, am I writing SQL now? Am I a developer?" Which was...

You
got them hooked. You gave them a taste. There's no turning back now.

Fascinating, and I love that. I mean, the other thing I always started to do after that, too, was... do you have a psqlrc set up for yourself or not?

No, I saw yours the other day and bookmarked it, but no.

How many times have you written a query and then gone back six months later, like, what was that query? Where is it? You can easily set it up to save the history of every query you run, with the database name that you're connected to.

Cool.

So it's like core DB versus marketing DB, right? I don't want to mix and match those and be sorting through every query I've ever run. But once I'm connected to this DB, it's a pretty common set of queries that I run. So it's super handy to auto-save all of your query history.

Yeah, cool. That makes perfect sense. So you found the perfect tool, and everyone else was like, ah, we're either too big or we're too small, but the people that were right there, they loved it. So after Heroku... I want to get to this tweet. I'll give a tease of the tweet, but I have one question before that. So here's the inciting incident. I follow you on Twitter, obviously, and the tweet was: you were talking to a VC, and he says, "Way back when, we bet on MySQL." You said, "Not crazy for that time." And then the VC says, "Postgres really came out of nowhere in the last few years. Any thoughts on how that happened?" And you said, "How much time you got?" And that is when I was like, I've got all the time in the world. So that's what I want to get to. But before we do that: you went to work for Citus, is that right?

Yep, yep.

So, as you know but others may not, I'm coming out of predominantly a MySQL world, historically. Explain to me the similarities and differences, to the extent that you know them, between Citus and Vitess.

I'd poke at it right away
because Citus has constraints and foreign keys, which I think Vitess has some sort of support for now. I did love the blog post saying this is not possible in a distributed database, and I was just like, um, over here we've had this for a few years.

No comment.

Citus generally was originally focused way more on the HTAP space, hybrid transactional/analytical processing: ingest a lot of OLTP stuff, but then we want to do really fast counts and aggregations, fan-out and crazy parallelism.

Okay.

At Citus we stumbled into this idea of multi-tenant apps. Vitess does different things with MySQL in how it shards. I believe it's a little more magical, where Citus follows, I think, more of a Postgres view, where you're very explicit with how you shard. With Citus you say, "This is my shard key," and you want to group the right things under the shard key. So if you're salesforce.com...

Pause there for two seconds. Give us a one- or two-sentence overview: what does Citus do, on the tin?

Citus is an extension that turns Postgres into a sharded, distributed, horizontally scalable database.

Okay. So you've discovered this multi-tenant model and you're about to say something about Salesforce. Carry on.

So let's see how quickly I can give the full primer. For those that haven't sharded: a shard, I believe the term comes from Ultima Online. I think it comes from some game.

I do think that's right, yeah.

It's a way of dividing up your database. You've got database 1, database 2, database 3. The poor man's sharding that I've seen in a bunch of apps is: users 1 through 100 go in this database, users 101 through 200 go in this database. Within sharding you've got nodes, physical nodes, like node 1, 2, 3, 4, and then you've got shards. When you're sharding, you want to have more shards than you have nodes. So if I've got four nodes, I maybe want to create 32
shards up front. The reason is it's easy to go and move a shard after the fact; it's not as easy to split a shard up. So I would have, like, table 1, table 5, table 9 live on node one inside my Postgres instance. And if I'm salesforce.com, instead of going users 1 through 100, which are going to be my oldest, biggest, most valuable customers as time goes on, all packed in together, I'm going to say, well, customer 1 goes on shard 1 on node one, customer 2 goes on shard 2 on node two, customer 3, and so forth, and I get back around to customer 5 going back on node one. But then I can later go and say, oh, I want to move customer 5 out to their own node. So I can create a new Postgres instance, and then with logical replication... or we had this way, way, way back; I think I have a blog post from maybe 10, maybe 12 years ago. I think the company was Think Through Math, a big Heroku customer that used Heroku followers: basically spin up a follower, promote it, and then you go and delete half the data from one and delete the other half of the data from the old primary, and now you've just split it up. It works pretty well. So with Citus, if you have something like a customer ID, well, you're not joining across databases, and that's where you get nice foreign key constraints and all that, because customer ID goes on every single table. Citus is... I was actually, I don't know why, thinking about this in the shower this morning. I think I was talking with someone about Citus yesterday.

Oh, we share an affliction. I have that same problem, so you're not alone there.

It's a weird analogy, but Citus is kind of like chemotherapy. Okay, if you have cancer and you catch it early, it's pretty effective; it works, right? Like when you're just now building an app. And that's if you have cancer, right? If you don't have cancer, you're not going in there being like, hey, I'm about to go out and get sunburned, can I go ahead and get some
chemo, like, preemptively?

Okay, fair enough. Good so far.

If you catch it late stage, right: you're a 10-terabyte database, you have crazy weird joins, you don't have your customer ID on every table, you've got complicated, weird queries that don't join on customer ID. Maybe the chemo's going to work, but if it's late, late-stage cancer, you're going through a lot of pain for an unclear outcome.

Right. So if you end up at a point where sharding makes sense, but your data model and query workload are not already tuned for that, you're going to be kind of up a creek. And so at some point before that, you've got to start doing the legwork to clean up your data model and queries such that you can then introduce sharding without the pain.

Yep. And Citus has come a long way in helping test if your app's ready. They've got a mode where you can warn on queries and that sort of thing; it's gotten better and better on that. But I would say on average, when people showed up with a big, mature app, it was four to six months of a few engineers re-architecting their app, their data model. And at that point you've already got a big database. The idea of "well, we're going to go and backfill our customer ID on every table": once you've just done rails new, that's pretty quick and easy. On a Rails app with 150 models, it's not so quick and easy to do that and not impact production.

Yeah, okay, that makes sense. All right, I was wondering where that analogy was going to go, but I think it works. I'll accept that. We can close out by talking about Crunchy, because y'all have a lot of interesting stuff going on over there. But here's what I want to ask you. You're telling this VC... I don't know if it's real or made up, but it's a great story. The VC is saying, where did
Postgres come from? It seems like it came out of nowhere. You seem to have a lot of opinions on that, which I want to hear. My perspective is, I grew up in a MySQL household. My dad was actually SQL Server, but I was a PHP boy, still am, and so we were always LAMP stack, MAMP stack, you know, MySQL and PHP. So that was my long history. And then it does, in my opinion, seem like Postgres kind of came out of nowhere in our world. It does feel like Laravel, which is a PHP framework, has always kind of been MySQL, and Rails has always kind of been Postgres. But I will say Laravel Cloud is launching, and the first database flavor that they're going with is Postgres. And so it does kind of feel like the tides are turning a little bit. And I'm sure that you follow Mark Callaghan on Twitter. I think he is a pure, noble saint. He seems totally driven by data only, and he seems very thorough, so I don't think he has a dog in any fight. But every time he tweets about the new MySQL releases, it's like, hey, another regression; hey, another performance regression. And it's like, what is going on over there? So from my perspective, that is what I'm seeing in terms of the battle or the war or whatever. So it sounds like me and this either real or fictional VC are on the same page. What is your take on why Postgres came out of nowhere, you know, whatever, 20 years later?

Yeah, I think there are a few things in there, so I'll back up and cover a few things. Heroku wasn't my first foray into Postgres. I was at a company back in '06, '07, '08, Truviso, that extended Postgres to be a streaming database, to essentially do MapReduce on data as it came in. This was back in the CEP, complex event processing, space. Postgres wasn't new to the database world. Postgres, right, is post-Ingres. Like,
databases follow one of two lineages, right? And Ingres is half of the databases out there. I forget... actually, no, SQL Server is not, but if you look at, like, DB2, Sybase. And if you look at Redshift, right: Redshift was ParAccel. Amazon licensed ParAccel, ParAccel ran out of money, and they hired all of the ParAccel engineers. So Amazon never acquired ParAccel, but it was purely ParAccel technology. If you look back at Greenplum, Aster Data, Netezza, all of these data warehousing products...

I've never heard of any of these things. You claim you're just some application developer... you're the expert. I've never heard of any of these things in my life.

So Postgres has been around for a long time. Because of its permissive license, because of its amazing code quality, if you're hacking on databases as a PhD student, you start with Postgres and then you modify it.

Interesting.

If you were building a data warehouse 25 years ago, 20 years ago, 15 years ago, what you would do with Postgres is you would fork it, because that's just what you did back then. It's a super permissive license; it's fine to do. Fork it, turn it into an MPP database, massively parallel processing, and then start building out your data warehouse. And three or four years later, you start to have something that just about works. Building a data warehouse is a long, slow process. So fast forward a little bit, and there's maintaining it and keeping current. Redshift is kind of stuck on Postgres; I believe 8.2 is the version. Now, there are hints of things that have come in. Window functions arrived, I believe, in Postgres 8.4. When I first found window functions, I was just like, man, I can do all of this in SQL and not have to go back and forth to my programming language. This is an interesting, real database. So I give some of that context: if you look way back at product companies and, like,
acquisitions, there were real, legit data warehouses that now kind of resemble Postgres, but they truly started as Postgres. If you talk to others today, like a Yugabyte or a Cockroach, they're like, "Oh, we're Postgres," but it's like, we started with the parser, or the wire protocol. That's not starting with the Postgres code base; you started with a completely different code base. Back then you actually started with Postgres. Now, at Truviso, when we went from Postgres 8.4 to 9.0, it took us like six months to go from one version to the next, just to rebase all of that. So fast forward a little bit and you've got extensions, which make it even more fascinating to build on Postgres. Maybe I'll come back to that piece. So I showed up at Heroku, and I was over here launching Python, I ran product for our payments and billing group, did some stuff with the add-ons marketplace. And here's all these Rails developers throwing just, like, text stuff into a field, and "Oh, we'll convert timestamps back and forth." I'm like, why aren't you using timestamptz? Why aren't you using range types? Heroku is supposed to be really, really good Rails developers, right? Really savvy. And I'm like, guys, why are you treating... I think DHH famously said the database is just a dumb hash in the sky. I'm like, no, this is the source of truth. I knew the Python community and the Django community really well, and they're like, "We're going to treat all databases the same." And I'm like, you're kidding me. At one point SQLite listed null as a data type, and I'm like, I'm very confused by this. I think it was just a confusion in the docs. But you've got five data types... you know SQLite well, right? Five data types, or are there more now?

Oh no, it's still that few, yeah.

And if you've ever built a scheduling app, right? If you're building a scheduling
app or calendaring app, you don't want overlapping... you know, like, hey, we want 30 people in a class and we don't want to overflow that. It's college registration time, and everyone wants not to have the 8 a.m. class, so they flood to the noon class. Don't you want a proper constraint in the database to ensure that doesn't happen, so you don't have to unwind it later? Postgres has constraints for that, right? Why aren't you using these things? Every developer I've ever met that says, "No, no, no, I enforce constraints at the application layer," I'm like, cool, can I have five minutes with your database and just run a few queries and see what happens? Every time, they're like, "Huh. Can you tell me which of those is correct?" I'm like, why won't you believe me when I find duplicate or incorrect data? So at Heroku I was just evangelizing Postgres internally. I'm like, hey, dumb Rails developers, you should really learn how to use a database; here's all the amazing stuff it can do. And I accidentally got recruited over there. But I think Heroku was a massive piece, right? Every Rails dev got a Postgres database, and we made it easy to use, and we started pushing on it becoming great for Rails developers. In Rails you can pass in a DSN, like the Postgres URL, postgres://. We worked to get that into a standard within Postgres, so that the libpq driver would parse it. So it's the Postgres driver that's parsing it, and now you can just pass that in. You don't have to break out username, password, host name, all that, which is just ugly and painful. So we worked with the Rails community and then pushed on that in the Python community: you should just accept a Postgres connection string as the thing to connect with. We really just worked to make it amazing both with the language communities and within Postgres itself, and tried to get them to meet both ways.

So pause there. Was Heroku supporting Postgres only... was
that a big reason that it was adopted in Rails?

Yeah, it was. Heroku was synonymous with Rails. And it was funny, because at Heroku we had one guy on the ops team that's like, "Hey, MySQL versus Postgres, which should we use?" This person was not a major contributor to anything else at Heroku, except he's like, "In all of the Linux and Unix forums I'm on, Postgres seems to have a really good security track record and be safe for your data, and I think if we're running a database, it should not lose your data." It's like, okay. There was a little more thought than that behind it, but not massively. It was a little bit of: this seems like it's got a good track record, let's go with it.

That is wild, because I so closely associate the Rails community with Postgres, and Rails with Heroku, and Heroku with Postgres. But chicken or egg, I didn't realize that Heroku having Postgres only, and being the spot for Rails, is how Rails and Postgres ended up thick as thieves. I did not know that.

Yeah. I wish I still had it; I don't think I saved it from the work email. We have an email from Amazon, years later, saying that Amazon never wanted to support Postgres. If MySQL RDS had existed, we probably would have just wrapped that. With Heroku Postgres, we had all these Rails developers asking for a database and thought, how hard could this be? If RDS had been there, we probably would have just said, all right, let's wrap this, or let's use it. It didn't exist, so we built Heroku Postgres. Amazon, about the same time or shortly after, happened to come out with MySQL RDS. And then years later the GM of that group was like, "We got tired of all the complaints about us not supporting Postgres. You're the reason we went and supported Postgres on RDS." There were no plans for that. They were all in on MySQL, and they got tired of the complaints, like: we want this Heroku thing over here, we like Postgres,
can you support it? Heroku supports it, why don't you? So you can thank Heroku a lot. I think there's a few pieces, but Heroku definitely had some track record behind it.
This is great lore. I freaking love this. Okay, so you touched for a second on the extensions. My point of view, or opinion, is that, and this is not derogatory, Postgres is kind of like React: you cannot leave, because the community and the ecosystem are so incredibly strong. I think it's a meme on Twitter at this point: "just use Postgres for that." Oh, you have a problem? Just use Postgres for that. Because truly, it does seem like there is an extension for everything. So what was the inciting incident that allowed that ecosystem to flourish?
Yeah, so it's been there for a while. If you read back to Stonebraker's original thesis, a big idea was extensibility. It wasn't quite, I think, the extension framework as it exists today. A colleague of mine, twice over, or just once, Dimitri Fontaine... extensions were always there, but not as easy to use. He went and added the...
Right here, Fontaine. It's a good one. I got lots of notes out of that one.
Yeah. Oh man, I need to go have some wine with Dimitri. He's very French.
He's a good writer.
Yeah. I can tell you off camera plenty of stories about Dimitri too, that are not in the appendix of the book. So he wrote the CREATE EXTENSION command, right? Extensions were there, but they had to be loaded weird and weren't as user friendly. We actually contracted out with him at one point. At Heroku we had the idea of building an extension marketplace for Postgres.
Oh, that's interesting.
Yeah, and this is 12 years ago that we're thinking about this, and you see people kind of trying this now. He was like, "Well, can I build the server side of it in anything?" And at Heroku at
the time we had everything. We had Go, we had Java, I don't know if it was Java or Scala in the stack, we had Python, we had Ruby. We were fully polyglot. It got cleaned up later and standardized, but we were everything. And so we're like, "Sure, write it in whatever you want." Learned my lesson. He came back with this all written in Common Lisp, and, yeah, that never made it to production.
What a legend.
That's still in my back pocket: one day I'm going to ship an extension marketplace. But with that framework of CREATE EXTENSION, you've still got to build them and have them on the server. I would say even before that, Postgres was more and more of a data platform. You had things like full-text search. You started to have PostGIS, which was an extension, but it shipped right alongside. You had things like hstore. JSONB: we almost wound up with something very different, a completely different approach, which was J store. It was a different competing thing at the time. Man, Postgres developers are fascinating. I think now they're more tuned into the general web world, but at the time one was like, "What is JSON? Why would I care? I built a website once. It was a good website. It was HTML." And this wasn't 20 years ago, this was like 10 years ago. And we're like, "Hang on, no, no, no, here's why JSON matters in the world." And so when it got JSON, the JSON support in Postgres, well, not 9.2, 9.2 was cheating, right, that was JSON in a text field with validation on it. JSONB, which my colleague Will says stands for "better"...
We do love Will. So good.
Oh, other Will. You need to know other Will.
Oh, okay, I don't know other Will. I'm sure he's great. I love main Will, primary Will.
Shout out to other Will, who is a webmaster. You can visit his website, bitfission. It's an amazing website. If
you're there, make sure to sign the guest book.
Not a lot of webmasters around anymore. I think him and maybe Justin Jackson are the only two webmasters left.
He's been our webmaster. He was webmaster at Heroku and at Citus and here at Crunchy too.
Okay, we'll call him Webmaster Will, because "other Will" is a little derogatory. So, Webmaster Will and Will King. So JSONB is "JSON better," according to Webmaster Will.
Yeah. So if you look at it, the idea of, "Oh, you need a NoSQL database? Oh, you need full-text search?" Postgres has been a platform for a while. I think I started saying that in something like 2012: it's a data platform. But then enter extensions, right, which I don't know of any other database that... well, now DuckDB has extensions, but outside of that I don't know of another one. I caught up with Hannes a month or two ago, and a lot of DuckDB stuff is modeled after Postgres. It's like, man, there's some good stuff in here. I think they're doing a lot on developer experience and improving that. But Postgres has had this for a while; it's just not been accessible and developer friendly. And so you have low-level hooks that can turn Postgres into whatever. One person joked and wrote a [ __ ] FD... [ __ ] compatibility FDW, which just wrote your data to /dev/null, because you can do that. You can change where the data goes, whether it gets synced, the query plan, all that sort of thing. So I think as much as Postgres continues to move forward and be steady and advance bit by bit by bit, the releases are getting more and more boring, because it's like, "Well, we're faster, we're still your data, yes, and if something gets really, really standardized and stops changing, we'll add it." Otherwise you've got this huge extension ecosystem to go do crazy stuff until it kind of becomes a standard, and that sort of thing.
So is that how, informally, is
that how you think it works, that the extensions are proving grounds, and once something reaches some sort of critical mass, the team kind of looks at rolling it in? Is that how it's worked historically?
Sort of, with some functionality. There was some extension stuff around JSONPath that then became first-class operators. Bits and pieces. Some stuff you'll never find in core. I doubt you'll find Citus-like sharded, distributed stuff in core. I doubt you'll find columnar or time series in core. So I think there will always be a world for extensions. Postgres itself ships with, I want to say it's 17, extensions in contrib. UUID was one for a while; now I think it's going to get a more standard UUID type. So as these things standardize... I don't think contrib is going to grow. It was an interesting idea. I think it's 5 years away, but in five years I think we kind of figure out a full-on marketplace for extensions. Now, how that works within the community versus vendors versus others is interesting, but I think we get there eventually.
That's super interesting. The idea of an extension marketplace for a database is very novel to me. I'm sure many have thought of it, but I have never thought of that. So how does Postgres win the hearts and minds of so many people? Because I do feel like the Postgres diehards are die hard. We get a little bit of understanding from the Heroku story about how it was introduced and spread into perhaps the Rails community and then further, because Heroku supports all kinds of languages. But what is it these days that makes the advocates advocate? Is it the lack of ownership by Oracle? Is it the extensions? Is it the stability? What are you seeing in your line of work where it's like, this is why people love it?
You know, on the advocate side, it's probably a mix of all those, right? But to be super honest,
I think developers, we're bad at rational decision making. We're like, "Rails is so much better than Python and Django, and here's why," and I'm like, Ruby and Python are the same language.
Yeah.
Yes, there are differences, and now I'm going to get roasted in the comments. It's going to be fun.
Yeah, you said it, not me. Don't come at me, guys.
I was at a conference about 10 years ago, at the speakers' dinner, and someone was commenting, "Yeah, we all just cargo cult and jump on the bandwagon, and we defend our things like we were so thought out and formed our own opinions, when it's, you know, someone else just said it and I follow along. For example, as a Django developer, I know I use Postgres, but I can't tell you why." And half the table got quiet, because I was down at the end, right? And I was just like, "Concurrent index creation, transactional DDL, window functions." And it was just like, "Oh. Oh, that's really cool." It is a really good database, right? So I think when you find someone that digs in, that prints out the manual, which is, is it 2,000 pages, and reads it... Have you read the full manual yet?
I've only printed 500 pages. I have it right here, but I didn't print a lot of the stuff. It's a lot of docs, man.
And in the early days, several of the Heroku team members all printed it out and read it cover to cover. They were like, "We need to know this as well as anyone if we're running a database as a service."
Thank you. Finally, somebody that agrees with me. I love that.
I said they did. I didn't. I was not supportive.
Pass along my regards to your former co-workers and tell them that I think they're very, very smart. I think that's a great idea, especially if you're going to be running it as a service. I'm just trying to teach it, so I printed it out and read it. But yeah, like-minded folks.
But yeah, it
is a good database, right? I do think Oracle and what they've done with MySQL doesn't hurt it. I do think an incredibly permissive license, where you can take it and go and build on top of it, doesn't hurt it. I think you look at frameworks that now take advantage of the features, which they didn't for a long time. Every database was just a dumb database with text and timestamps and integers and decimals, and why do we need more data types? Because you may want to migrate at some point to another one. Have you ever migrated a database? If you have that philosophy about your database, why don't you have that philosophy about your app code? "We can't use anything unique to Laravel because it might not exist in Rails, because we might need to migrate to Rails." I've never heard that argument made.
Yeah.
But I hear it made about a database all the time. "We can't use unique MySQL features because we might migrate to Postgres," or "we can't use unique..." Well, then make the same argument for your framework and your primary programming language. And so I do think that the real diehards appreciate some things. Transactional DDL is still so nice. You run a migration, it fails midway, what happens? That doesn't happen every day, but man, when that does happen, it's not a fun time.
No.
And so Postgres being safe and reliable, as a data person, is near and dear to my heart. Now, I think the rest of the programming world, we do cargo cult, and it's the cool thing, and you're not wrong for choosing Postgres. That's where it gets the bulk of its love. But the diehards, I think, have real reasons. It's licensing, no central owner. To me it's not just open source, it's no central owner.
Mhm.
Find another thing, outside of maybe the Linux kernel, that is run in a similar way.
Yeah, that's interesting, the no central owner.
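For readers who want to see the transactional DDL point from this exchange in action (a migration that fails midway and leaves nothing half-applied), here is a minimal sketch. It is an assumption-laden stand-in, not Postgres itself: it uses Python's built-in sqlite3 module so it runs anywhere (SQLite also happens to roll back schema changes inside a transaction), and the table names, the `run_migration` helper, and the deliberately broken third statement are all made up for illustration.

```python
import sqlite3


def run_migration(conn, statements):
    """Run every statement in one transaction; undo all of them on failure."""
    conn.execute("BEGIN")
    try:
        for sql in statements:
            conn.execute(sql)
        conn.execute("COMMIT")
        return True
    except sqlite3.Error:
        # Schema changes made earlier in this transaction are rolled back too.
        conn.execute("ROLLBACK")
        return False


def table_names(conn):
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


# isolation_level=None puts the connection in autocommit mode, so the explicit
# BEGIN / COMMIT / ROLLBACK above are the only transaction control in play.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")

# Hypothetical migration: the third statement fails midway.
migration = [
    "CREATE TABLE teams (id INTEGER PRIMARY KEY, name TEXT)",
    "ALTER TABLE users ADD COLUMN team_id INTEGER",
    "ALTER TABLE no_such_table ADD COLUMN oops TEXT",
]

ok = run_migration(conn, migration)
tables_after = table_names(conn)
user_columns = [row[1] for row in conn.execute("PRAGMA table_info(users)")]
print(ok, tables_after, user_columns)
```

After the failed run, the `teams` table and the new `team_id` column are both gone, so the schema is exactly what it was before the migration started. Against a real Postgres server the same pattern would go through a driver and a live connection, which this sketch deliberately avoids.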
Yeah, I guess SQLite is still maintained, still super controlled, by DRH and, I think, his two friends, which is why libSQL was forked, because they don't allow pull requests. Laravel has a corporation. Rails has a foundation, I guess.
And even then, the foundation...
Yeah, exactly.
I don't know if it's in the core bylaws or not, but no one company can comprise more than 50% of the core board of Postgres. I mean, you look at the distribution of commits... and it's an oddity. People are like, "Oh, the core team, that's the people that commit stuff." It's like, no, core is more of a steering body of seven or nine people that controls the trademark and defends it, and handles the code of conduct, and just makes sure things are okay. Committers are the ones that do all the work, right? And that's 50, 60 people. And if you look at the distribution of companies, I don't know that any company has more than 10 committers, and I don't even think it's that. I think Microsoft was really high at one point, at six or seven. I don't think they're that high anymore. But it's really spread out in kind of a crazy way, and that no central ownership is a fascinating thing beyond just the open source part.
Yeah, that is kind of wild. I don't know if I knew that about Postgres, but I certainly haven't thought about it in terms of other projects in the open source world. I will have to noodle on that, because that is very interesting, and I think you're right, I think it's unique.
It's crazy. When you pause, you're like, oh yeah, this open source database and that open source database, and it's driven by a company, right? Part of it is it's hard to build a database. And I don't think it can be recreated in this day and age. Getting 10 different developers to agree and just work together in an open source fashion, without financial backing behind it, sounds crazy.
Sounds hard. Sounds real hard.
I want to close with two things. One I'm going to spring on you, because you mentioned it, and then I want to talk about Citus and all its offerings. What do you know about DuckDB? It seems so cool and I know so little about it, but it seems spiritually similar to SQLite, for obvious reasons. So what do you know about DuckDB? What are they doing? What do you like about it? Just tell me your thoughts on DuckDB.
Yeah, to me it's fascinating, and I think the general space of databases is fascinating right now. If you dove into databases and started podcasting about them 10 years ago, you'd have, like, Will King level viewership on his personal YouTube channel.
I can't wait till he watches the entire episode. Shots fired. Will King catching strays over there. Sorry about that, buddy.
Databases aren't sexy, right? And yet they're almost becoming sexy now. Is that kind of a thing? DuckDB: I think the creator, Hannes, really appreciates the idea of applying developer experience to a database, which is not a thing I've ever heard anyone talk about. I think SQLite is a great local developer database. You kind of don't need it as a server-hosted database, generally. It's the vectorized execution, doing deep database engineering. The vectorized stuff is fascinating. When I say vectors, I don't mean AI; I mean database vector execution. PostGIS has had vectors in it for years and years and years. What they're doing is different and interesting. But the general developer experience side, to me, is fascinating. I actually think it'll be around for years and years to come, based on that. But I've also been in the database world long enough to know that with every database you go and look at today, that you evaluate, there's a really good chance they're not around in 10 years. And I say that having mentioned Aster Data and AA and Greenplum, all those that I've
never heard of, that were the hot thing 15 years ago, right? It's like, how could this not be the next Oracle and a $10 billion company? And yet where are they now? It's hard to build databases. It's hard to build a company around them. DuckDB is fascinating because developer experience and databases don't normally go together in the same sentence.
Mhm.
I always joke Postgres is the least bad database. Databases aren't good. They're not fun and friendly. It's not like, "Yay, I get to work with my database today." You may say that, but I think most of us...
Speak for yourself, honestly.
But for most folks it's like, "Hey, Postgres works, it's reliable, it's safe." I think people do feel that way about DuckDB, which is great for it.
Yeah, I've been watching them because of my foray into SQLite with Turso, and I think the Turso folks are friendly with the DuckDB people. I think they're maybe at Small Data SF right now, or last week or something. It just seems so cool. I watch all the stuff they're doing on Twitter and I'm like, I want to do that, but who has the time? But someday I'm going to dive into DuckDB for real and learn everything there is to know about it, because I do think it's a lot of fun. All right, here's where I want to wrap it. You work at Crunchy Data. It seems like they've got, like, 10 things going on. So tell me about Crunchy Bridge for all these different platforms, and what exactly does Crunchy do?
Postgres.
Postgres. Go on.
Anywhere you want it, any way. We have Fortune 500 companies, large banks and healthcare companies, that rely on us for on-premise Postgres. They're like: hey, we've heard about this Postgres thing, we want to do open source, can you help us? We have, I don't know, $100 million to spend on Oracle, and we're trying to check out this open source
thing. Can you tell... it's like, wild. We live in a different world. Enterprises move differently, right? So we have everything you need to run production Postgres. While Postgres is great, as an enterprise you're like, "All right, we're going to evaluate monitoring. Okay, there's five products, and we're an enterprise, so we're going to do a bake-off and do a six-week trial of every monitoring tool. And now we're going to do a six-month trial of every backup tool and every DR tool and every HA tool." And we've finally evaluated 20 tools, and we found we like, you know, A, Q, and T and F and S. Uh-oh, S doesn't work with A. Crap, we've got to go back and reevaluate now. Do we want to get rid of A, or do we want to get rid of... So basically, kind of like Red Hat was to Linux, curated for the enterprises: a curated Postgres distribution. That's kind of where we started.
Interesting, okay.
You're an enterprise, you want production Postgres, we've got you covered. We've got one of the oldest Kubernetes operators, stateful operators. Way back in the early days we got to know CoreOS and this operator pattern thing, and built a stateful operator for Postgres back when everyone was like, "No, no, no, you can't do databases in Kubernetes, you can't do databases in containers." That was the wild west, you shouldn't be doing this. We did it, and we did it well and safely. I think it's coming up on almost six years old now as a stateful Kubernetes operator. I came on board about four years ago to build out Crunchy Bridge, and as I mentioned earlier, it's like unfinished business with Postgres. A database should be kind of like your bank. When your bank sends you a letter, it's probably because you get free social security protection for a year because they had a data breach. You don't want to hear from me. You want to build your
app. You want me to fade away into the background. So how can I give you everything you need out of the box? Our $10 a month plan runs in an isolated VPC, with support for VPC peering, SSO (there's no SSO tax), encryption at rest, encryption in transit, point-in-time recovery, and our equivalent of Dataclips, for like 10 bucks a month for a true production-grade database.
And are you orchestrating that within, for example, my AWS account, or are you hosting that on your own?
It's in our own, but it's in an entirely isolated VPC. We run completely isolated data planes: a centralized control plane for all the provisioning, monitoring, all that, but every customer's data is completely isolated.
Got it, okay.
And a lot of it's just, hey, how can we make this so simple? When your disk starts to get full, we alert you, alert you, alert you, and then automatically scale it up and add size to it. We've had a ton of folks migrate from Heroku or from RDS or wherever, and for the exact same hardware we've seen up to a 3 to 5x performance boost. Often it's 10, 20, 30%. I was kind of shocked at the 3 to 5x. It turns out running Postgres really well is a valuable thing. And then we recently started to expand beyond just standard Postgres and move more into the data warehousing space. There we've embedded DuckDB under the covers, and we've got Postgres that talks to DuckDB natively, and DuckDB that knows how to respond to Postgres, so things like mapping data types and structs and geospatial functions. And we natively operate on Iceberg. You just connect to Postgres and then you can query an Iceberg table, just like that. Are you familiar with Iceberg or not?
I have no idea what Iceberg is.
Are you familiar with Parquet?
Yeah, barely.
So Parquet is like a columnar compressed file, right? If you had like 100 million records... think of it as a fancy,
fancy CSV with data types, in a columnar compressed format, right?
Okay, mhm.
Iceberg came out of Netflix, but is, I think, an Apache Foundation file type that is Parquet++. So if you've got a 100 million record file that you write out one time, right?
Yeah.
Well, if you've got 100 new records, or suddenly it's, well, we've got 100 users we've got to delete from this file, with Parquet you've got to write out a new file. Iceberg has this metadata, a repository that points to: here's this, like, 100,000 file, and then here's this change set, here's this change set, here's this change set. So you don't have to go write out a whole new set. It's like Parquet++. Snowflake, Databricks, ClickHouse all are generally starting to support Iceberg now. It's like the future in that data lake space. So it's fascinating, but the thing is, for us, you're using Postgres, you love Postgres, you don't have to think about Iceberg. You don't have to know about it. If you want to say, "I want to just archive my data into S3 in a compressed columnar format," just continue to work with Postgres. We do all that heavy lifting for you. So that's our biggest, like, latest push. I managed to recruit the Citus lead architect, the Citus lead engineer, like the world's expert on Postgres extensions, and I don't say that lightly, to start building. They're like, "Let's go do crazy data warehousing things with Postgres and kind of take it to where it's never been."
Cool.
But basically, if you want someone to help you run Postgres or manage Postgres for you, the short of it is we do it really well and you don't need to be a Postgres expert. I gave up on trying to teach people Postgres and that they should learn SQL. I appreciate your campaign so much, because I'm just like, man, these developers, they just want to write, you know, Active Record, and man, the queries that come out of it are rough. SQL is not a good language. It's a powerful
one. It's not a good language; it's a powerful one.
Well, I'm doing what I can to turn the tides there. I come from, it sounds like, a very similar place as you, from the developer point of view, and my goal is to hopefully empower developers to know and like and be proficient in SQL and these different databases. And then if you want to use Active Record, that's fine, but at least now you understand what's happening and you can spot a terrible Active Record query whenever you see it, versus just, "Ah, I just use Active Record or Eloquent or whatever." I want you to be an expert user of an ORM, should you decide to use an ORM. That's my...
Yeah, I just don't want you to be afraid of your database. That's why we built and launched the Postgres playground. I don't know if you've poked at it, but it's Postgres running in your web browser, and now it has guided tutorials where we bootstrap the database with some examples. We'll go run some queries ahead of time in your browser against it so that you have stats in your database in the browser, and you don't have to worry about, "What happens if I run this command?" It's in your browser. Refresh, data's back. The idea is, how do we let you work with Postgres and get more comfortable with it, so you're not scared of your database?
Mhm. Yeah, that in-browser playground is cool. I've absolutely played around with that. Y'all also have a great blog. I have some of the posts printed out around here. Y'all have a great blog with lots of contributors over at Crunchy. It's very impressive. Okay, well, thank you for being so generous with your time. Is there anything else that we didn't cover? And where can people find you to learn more, to follow along?
I think you mentioned the Crunchy Data blog and newsletter. I spend a lot of time there. I'm pretty easy to find, if you
can kind of spell my last name: Craig Kerstiens. So find me, I'm pretty easy to find on the internet. And by all means, drop me notes, DMs, emails. I'm generally happy to help. It's been a fun tech ride. I'm not done yet, I'm not washed up and retiring tomorrow, but it's been a fun tech ride, and so I'm very happy to try to connect with folks and give back wherever I can.
Well, this is one way that you have done so. I super appreciate it. Thank you for doing this, and we will talk soon.
Yeah, thanks so much. See you.
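As a rough illustration of the Parquet-versus-Iceberg description from the conversation above (a big file written once, plus metadata pointing at small change sets, so a delete of 100 users does not mean rewriting everything), here is a toy model in plain Python. It is only a sketch of the idea as described in the episode: it is not the real Iceberg metadata layout or any real library's API, and every name in it (`ToyTable`, `ChangeSet`, `commit`, `scan`) is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeSet:
    added: dict = field(default_factory=dict)   # record id -> record
    deleted: set = field(default_factory=set)   # record ids removed


@dataclass
class ToyTable:
    base_file: dict                              # written once: id -> record
    change_sets: list = field(default_factory=list)

    def commit(self, added=None, deleted=None):
        """Record a small change set instead of rewriting the base file."""
        change = ChangeSet(dict(added or {}), set(deleted or ()))
        self.change_sets.append(change)
        # Rows physically written for this commit: just the change itself.
        return len(change.added) + len(change.deleted)

    def scan(self):
        """Current view of the table: base file with change sets replayed."""
        rows = dict(self.base_file)
        for change in self.change_sets:
            for record_id in change.deleted:
                rows.pop(record_id, None)
            rows.update(change.added)
        return rows


# A "big" base file, scaled down to 1,000 records for the sketch.
table = ToyTable(base_file={i: {"user": f"user{i}"} for i in range(1000)})

# Delete 100 users, then add 100 new records, as two small change sets.
written_for_delete = table.commit(deleted=range(100))
written_for_insert = table.commit(
    added={i: {"user": f"user{i}"} for i in range(1000, 1100)}
)

current = table.scan()
print(len(current), written_for_delete, written_for_insert)
```

The base file is never touched after the first write; each commit only records what changed, and a reader reconstructs the current rows by replaying the change sets. A single-file approach would instead rewrite all of the records for each of those two changes.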