Slinging CDK Knowledge with Matt Coulter

Episode Summary

The keynote speeches at re:Invent are always a gamble, but fortunately this past year Matt Coulter, Technical Architect at Liberty Mutual, delivered. While today he doesn’t have 20K people in front of him, Matt still has some great things to say. Matt and Corey talk about what it is like to present in front of such large audiences. Matt also divulges some of the details on what Liberty Mutual has been working on in the cloud. Notably, Matt chimes in on what he has been working on in CDK. For Matt, a focus on reusable, well-built tools in CDK and how they’re solving issues is crucial, as is taking some architectural leaps at Liberty Mutual. Check out his take on all things CDK!

Episode Show Notes & Transcript

About Matt
Matt is an AWS DevTools Hero, Serverless Architect, Author and conference speaker. 

He is focused on creating the right environment for empowered teams to rapidly deliver business value in a well-architected, sustainable and serverless-first way.

You can usually find him sharing reusable, well architected, serverless patterns over at cdkpatterns.com or behind the scenes bringing CDK Day to life.


Links:

Transcript
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.


Corey: It seems like there is a new security breach every day. Are you confident that an old SSH key, or a shared admin account, isn’t going to come back and bite you? If not, check out Teleport. Teleport is the easiest, most secure way to access all of your infrastructure. The open source Teleport Access Plane consolidates everything you need for secure access to your Linux and Windows servers—and I assure you there is no third option there. Kubernetes clusters, databases, and internal applications like AWS Management Console, Jenkins, GitLab, Grafana, Jupyter Notebooks, and more. Teleport’s unique approach is not only more secure, it also improves developer productivity. To learn more visit: goteleport.com. And no, that is not me telling you to go away, it is: goteleport.com.


Corey: This episode is sponsored in part by our friends at Rising Cloud, which I hadn’t heard of before, but they’re doing something vaguely interesting here. They are using AI, which is usually where my eyes glaze over and I lose attention, but they’re using it to help developers be more efficient by reducing repetitive tasks. So, the idea being that you can run stateless things without having to worry about scaling, placement, et cetera, and the rest. They claim significant cost savings, and they’re able to wind up taking what you’re running as it is in AWS with no changes, and run it inside of their data centers that span multiple regions. I’m somewhat skeptical, but their customers seem to really like them, so that’s one of those areas where I really have a hard time being too snarky about it because when you solve a customer’s problem and they get out there in public and say, “We’re solving a problem,” it’s very hard to snark about that. Multus Medical, Construx.ai and Stax have seen significant results by using them. And it’s worth exploring. So, if you’re looking for a smarter, faster, cheaper alternative to EC2, Lambda, or Batch, consider checking them out. Visit risingcloud.com/benefits. That’s risingcloud.com/benefits, and be sure to tell them that I sent you because watching people wince when you mention my name is one of the guilty pleasures of listening to this podcast.


Corey: Welcome to Screaming in the Cloud. I’m Corey Quinn. I’m joined today by Matt Coulter, who is a Technical Architect at Liberty Mutual. You may have had the privilege of seeing him on the keynote stage at re:Invent last year—in Las Vegas or remotely—that last year of course being 2021. But if you make better choices than the two of us did, and found yourself not there, take the chance to go and watch that keynote. It’s really worth seeing.


Matt, first, thank you for joining me. I’m sorry, I don’t have 20,000 people here in the audience to clap this time. They’re here, but they’re all remote as opposed to sitting in the room behind me because you know, social distancing.


Matt: And in this left earphone, I just have some applause going, just permanently, just to keep me going. [laugh].


Corey: That’s sort of my own internal laugh track going on. It’s basically whatever I say is hilarious, to that. So yeah, doesn’t really matter what I say, how I say it, my jokes are all for me. It’s fine. So, what was it like being on stage in front of that many people? It’s always been a wild experience to watch and for folks who haven’t spent time on the speaking circuit, I don’t think that there’s any real conception of what that’s like. Is this like giving a talk at work, where I just walk on stage randomly, whatever I happened to be wearing? And, oh, here’s a microphone, I’m going to say words. What is the process there?


Matt: It’s completely different. For context for everyone, before the pandemic, I would have pretty regularly talked in front of, I don’t know, maybe one, two hundred people in Liberty, in Belfast. So, I used to be able to just, sort of, walk in front of them, and lean against the pillar, and use my clicker, and click through, but the process for actually presenting something as big as a keynote at re:Invent is so different. For starters, you think that when you walk onto the stage, you’ll actually be able to see the audience, but the way the lights are set up, you can pretty much see about one row of people, and they’re not the front row, so anybody I knew, I couldn’t actually see.


And yeah, you can only see, sort of, the void, and then you have your screens, so you’ve six sets of screens that tell you your notes as well as what slides you’re on, you know, so you can pivot. But other than that, I mean, it feels like you’re just talking to yourself outside of whenever people, thankfully, applaud. It’s such a long process to get there.


Corey: I’ve always said that there are a few different transition stages as the audience size increases, but for me, the final stage is more or less anything above 750 people. Because as you say, you aren’t able to see that many beyond that point, and it doesn’t really change anything meaningfully. The most common example that you see in the wild is jokes that work super well with a small group of people fall completely flat to large audiences. It’s why so much corporate humor is cheesy because yeah, everyone in the rehearsals is sitting there laughing and the joke kills, but now you’ve got 5000 people sitting in a room and that joke just sounds strained and forced because there’s no longer a conversation, and no one has the shared context that—the humor has to change. So, in some cases when you’re telling a story about what you’re going to say on stage, during a rehearsal, they’re going to say, “Well, that joke sounds really corny and lame.” It’s, “Yeah, wait until you see it in front of an audience. It will land very differently.” And I’m usually right on that.


I would also advise, you know, doing what you do and having something important and useful to say, as opposed to just going up there to tell jokes the whole time. I wanted to talk about that because you talked about how you’re using various CDK and other serverless style patterns in your work at Liberty Mutual.


Matt: Yeah. So, we’ve been using CDK pretty extensively since it was, sort of, Q3 2019. At that point, it was new. Like, it had just gone GA at the time, just came out of dev preview. And we’ve been using CDK from the perspective of we want to be building serverless-first, well-architected apps, and ideally we want to be building them on AWS.


Now, the thing is, we have 5000 people in our IT organization, so there’s sort of a couple of ways you can take to try and get those people onto the cloud: You can either go the route of being, like, there is one true path to architecture, this is our architecture and everything you want to build can fit into that square box; or you can go the other approach and try and have the golden path where you say this is the paved road that is really easy to do, but if you want to differentiate from that route, that’s okay. But what you need to do is feed back into the golden path if that works. Then everybody can improve. And that’s where we’ve started using CDK. So, what you heard me talk about was the software accelerator, and it’s sort of a different approach.


It’s where anybody can build a pattern and then share it so that everybody else can rapidly, you know, just reuse it. And what that means is effectively you can, instead of having to have hundreds of people on a central team, you can actually just crowdsource, and sort of decentralize the function. And if things are good, then a small team can actually come in and audit them, so to speak, and check that it’s well-architected, and doesn’t have flaws, and drive things that way.


Corey: I have to confess that I view the CDK as sort of a third stage automation approach, and it’s one that I haven’t done much work with myself. The first stage is clicking around in the console; the second is using CloudFormation or Terraform; the third stage is what we’re talking about here is CDK or Pulumi, or something like that. And then you ascend to the final fourth stage, which is what I use, which is clicking around in the AWS console, but then you lie to people about it. ClickOps is poised to take over the world. But that’s okay. You haven’t gotten that far yet. Instead, you’re on the CDK side. What advantages does CDK offer that effectively CloudFormation or something like it doesn’t?


Matt: So, first off, for ClickOps in Liberty, we actually have the AWS console as read-only in all of our accounts, except for sandbox. So, you can ClickOps in sandbox to learn, but if you want to do something real, unfortunately, it’s going to fail you. So—


Corey: I love that pattern. I think I might steal that.


Matt: [laugh]. So, originally, we went heavy on CloudFormation, which is why CDK worked well for us. And because we’ve actually—it’s been a long journey. I mean, we’ve been deploying—2014, I think it was, we first started deploying to AWS, and we’ve used everything from Terraform, to you name it. We’ve built our own tools, believe it or not, that are basically CDK.


And the thing about CloudFormation is, it’s brilliant, but it’s also incredibly verbose and long because you need to specify absolutely everything that you want to deploy, and every piece of configuration. And that’s fine if you’re just deploying a side project, but if you’re in an enterprise that has responsibilities to protect user data, and you can’t just deploy anything, they end up thousands and thousands and thousands of lines long. And then we have amazing guardrails, so if you tried to deploy a CloudFormation template with a flaw in it, we can either just fix it, or reject the deploy. But CloudFormation is not known to be the fastest to deploy, so you end up in this developer cycle, where you build this template by hand, and then it goes through that CloudFormation deploy, and then you get the failure message that it didn’t deploy because of some compliance thing, and developers just got frustrated, and were like, sod this. [laugh].


I’m not deploying to AWS. Back to on-prem. And that’s where CDK was a bit different because it allowed us to actually build abstractions with all of our guardrails baked in, so that it just looked like a standard class, for developers, like, developers already know Java, Python, TypeScript, the languages of CDK, and so we were able to just make it easy by saying, “You want API Gateway? There’s an API Gateway class. You want, I don’t know, an EC2 instance? There you go.” And that way, developers could focus on the thing they wanted, instead of all of the compliance stuff that they needed to care about every time they wanted to deploy.
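
The "guardrails baked into a standard class" idea can be sketched in a few lines of plain Python. This is an illustration only: it does not use the real CDK library, and `BucketConfig`, `CompliantBucket`, `violations`, and the tag policy are hypothetical names and rules invented here, not Liberty Mutual's actual constructs.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class BucketConfig:
    """Low-level settings that a hand-written template would spell out itself."""
    name: str
    encrypted: bool = False
    public_access_blocked: bool = False
    versioned: bool = False
    tags: Dict[str, str] = field(default_factory=dict)


class CompliantBucket:
    """Hypothetical wrapper: compliance settings are fixed, not left to the caller."""

    # Assumed policy, for illustration only.
    REQUIRED_TAGS = {"data-classification": "internal"}

    def __init__(self, name: str, versioned: bool = True,
                 extra_tags: Optional[Dict[str, str]] = None):
        # Required tags are merged last so callers cannot override them.
        tags = {**(extra_tags or {}), **self.REQUIRED_TAGS}
        self.config = BucketConfig(
            name=name,
            encrypted=True,              # guardrail: always on
            public_access_blocked=True,  # guardrail: always on
            versioned=versioned,         # a choice developers may still make
            tags=tags,
        )


def violations(config: BucketConfig) -> List[str]:
    """The kind of check a deploy-time guardrail might run against a config."""
    problems = []
    if not config.encrypted:
        problems.append("encryption disabled")
    if not config.public_access_blocked:
        problems.append("public access not blocked")
    return problems


if __name__ == "__main__":
    print(violations(BucketConfig(name="hand-written")))
    print(violations(CompliantBucket("claims-documents").config))
```

A developer who writes `CompliantBucket("claims-documents")` gets a configuration that already passes the check, so the compliance failure shows up at authoring time rather than after a slow deploy.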


Corey: Personally, I keep lobbying AWS to add my preferred language, which is crappy shell scripting, but for some reason they haven’t really been quick to add that one in. The thing that I think surprises me, on some level—though, perhaps it shouldn’t—is not just the adoption of serverless that you’re driving at Liberty Mutual, but the way that you’re interacting with that feels very futuristic, for lack of a better term. And please don’t think that I’m in any way describing this in a way that’s designed to be insulting, but I do a bunch of serverless nonsense on Twitter for Pets. That’s not an exaggeration. twitterforpets.com has a bunch of serverless stuff behind it because you know, I have personality defects.


But no one cares about that static site that’s been a slide dump a couple of times for me, and a running joke. You’re at Liberty Mutual; you’re an insurance company. When people wind up talking about big enterprise institutions, you’re sort of a shorthand example of exactly what they’re talking about. It’s easy to contextualize or think of that as being very risk averse—for obvious reasons; you are an insurance company—as well as wanting to move relatively slowly with respect to technological advancement because mistakes are going to have drastic consequences to all of your customers, people’s lives, et cetera, as opposed to tweets or—barks—not showing up appropriately at the right time. How did you get to the, I guess, advanced architectural philosophy that you clearly have been embracing as a company, while having to be respectful of the risk inherent that comes with change, especially in large, complex environments?


Matt: Yeah, it’s funny because so for everyone, we were talking before this recording started about, I’ve been with Liberty since 2011. So, I’ve seen a lot of change in the length of time I’ve been here. And I’ve built everything from IBM applications right the way through to the modern serverless apps. But the interesting thing is, the journey to where we are today definitely started eight or nine years ago, at a minimum because there was something identified in the leadership that they said, “Listen, we’re all about our customers. And that means we don’t want to be wasting millions of dollars, and thousands of hours, and big trains of people to build software that does stuff. We want to focus on why are we building a piece of software, and how quickly can we get there? If you focus on those two things you’re doing all right.”


And that’s why starting from the early days, we focused on things like, okay, everything needs to go through CI/CD pipelines. You need to have your infrastructure as code. And even if you’re deploying on-prem, you’re still going to be using the same standards that we use to deploy to AWS today. So, we had years and years and years of just baking good development practices into the company. And then whenever we started to move to AWS, the question became, do we want to just deploy the same thing or do we want to take full advantage of what the cloud has to offer? And I think because we were primed and because the leadership had the right direction, you know, we were just sitting there ready to say, “Okay, serverless seems like a way we can rapidly help our customers.” And that’s what we’ve done.


Corey: A lot of the arguments against serverless—and let’s be clear, they rhyme with the previous arguments against cloud that lots of people used to make; including me, let’s be clear here. I’m usually wrong when I try to predict the future. “Well, you’re putting your availability in someone else’s hands,” was the argument about cloud. Yeah, it turns out the clouds are better at keeping things up than we are as individual companies.


Then with serverless, it’s the, “Well, if they’re handling all that stuff for you on their side, when they’re down, you’re down. That’s an unacceptable business risk, so we’re going to be cloud-agnostic and multi-cloud, and that means everything we build serverlessly needs to work in multiple environments, including in our on-prem environment.” And from the way that we’re talking about servers and things that you’re building, I don’t believe that is technically possible, unless some of the stuff you’re building is ridiculous. How did you come to accept that risk organizationally?


Matt: These are the conversations that we’re all having. Sort of, I’d say once a week, we all have a multi-cloud discussion—and I really liked the article you wrote, it was maybe last year, maybe the year before—but multi-cloud to me is about taking the best capabilities that are out there and bringing them together. So, you know, like, Azure [ID 00:12:47] or whatever, things from the other clouds that they’re good at, and using those rather than thinking, “Can I build a workload that I can simultaneously pay all of the price to run across all of the clouds, all of the time, so that if one’s down, theoretically, I might not have an outage?” So, the way we’ve looked at it is we embraced really early the Well-Architected Framework from AWS. And it talks about things like you need to have multi-region availability, you need to have your backups in place, you need to have things like circuit breakers in place for if a third party goes down, and we’ve just tried to build really resilient architectures as best as we can on AWS. And do you know what I think, if [laugh] AWS is not—I know at re:Invent, it went down extraordinarily often compared to normal, but in general—


Corey: We were all tired of re:Invent; their us-east-1 was feeling the exact same way.


Matt: Yeah, so that’s—it deserved a break. But, like, if somebody can’t buy insurance for an hour, once a year, [laugh] I think we’re okay with it versus spending millions to protect that one hour.
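
For readers unfamiliar with the circuit breakers Matt mentions, here is a minimal sketch of the pattern: after repeated failures of a third-party call, stop calling it for a cool-down period and fail fast instead. The class name, thresholds, and injectable `clock` parameter are assumptions chosen for illustration, not anything described in the episode.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # cool-down in seconds
        self._clock = clock               # injectable so the sketch is testable
        self._failures = 0
        self._opened_at = None            # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; skipping call")
            # Cool-down elapsed: let one trial call through (half-open state).
            self._opened_at = None
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self.max_failures:
                self._opened_at = self._clock()
            raise
        # Any success closes the circuit and clears the failure count.
        self._failures = 0
        return result
```

While the circuit is open, the caller can fall back to something else, such as the manual business process Matt describes later, instead of waiting on a dependency that is known to be down.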


Corey: And people make assumptions based on this where, okay, we had this problem with us-east-1 that froze things like the global Route 53 control planes; you couldn’t change DNS for seven hours. And I highlighted that as, yeah, this is a problem, and it’s something to severely consider, but I will bet you anything you’d care to name that there is an incredibly motivated team at AWS, actively fixing that as we speak. And by—I don’t know how long it takes to untangle all of those dependencies, but I promise they’re going to be untangled in relatively short order versus running data centers myself, when I discover a key underlying dependency I didn’t realize was there, well, we need to break that. That’s never going to happen because we’re trying to do things as a company, and it’s just not the most important thing for us as a going concern. With AWS, their durability and reliability is the most important thing, arguably compared to security.


Would you rather be down or insecure? I feel like they pick down—I would hope in most cases they would pick down—but they don’t want to do either one. That is something they are drastically incentivized to fix. And I’m never going to be able to fix things like that and I don’t imagine that you folks would be able to either.


Matt: Yeah, so, two things. The first thing is the important stuff, like, for us, that’s claims. We want to make sure at any point in time, if you need to make a claim you can because that is why we’re here. And we can do that with people whether or not the machines are up or down. So, that’s why, like, you always have a process—a manual process—that the business can operate, irrespective of whether the cloud is still working.


And that’s why we’re able to say if you can’t buy insurance in that hour, it’s okay. But the other thing is, we did used to have a lot of data centers, and I have to say, the people who ran those were amazing—I think half the staff now work for AWS—but there was this story that I heard where there was an app that used to go down at the same time every day, and nobody could work out why. And it was because someone was coming in to clean the room at that time, and they unplugged the server to plug in a vacuum, and then were cleaning the room, and then plugging it back in again. And that’s the kind of thing that just happens when you manage people, and you manage a building, and manage a premises. Whereas if you heard that happened at AWS, I mean, that would be front page news.


Corey: Oh, it absolutely would. There’s also—as you say, if it’s the sales function, if people aren’t able to buy insurance for an hour, when us-east-1 went down, the headlines were all screaming about AWS taking an outage, and some of the more notable customers were listed as examples of this, but the story was that, “AWS has massive outage,” not, “Your particular company is bad at technology.” There’s sort of a reputational risk mitigation by going with one of these centralized things. And again, as you’re alluding to, what you’re doing is not life-critical as far as the sales process and getting people to sign up. If an outage meant that suddenly a bunch of customers were no longer insured, that’s a very different problem. But that’s not your failure mode.


Matt: Exactly. And that’s where, like, you got to look at what your business is, and what you’re specifically doing, but for 99.99999% of businesses out there, I’m pretty sure you can be down for the tiny window that AWS is down per year, and it will be okay, as long as you plan for it.


Corey: So, one thing that really surprised me about the entirety of what you’ve done at Liberty Mutual is that you’re a big enterprise company, and you can take a look at any enterprise company, and say that they have dueling mottos, which is, “I am not going to comment on that,” or, “That’s not funny.” Like, the safe mode for any large concern is to say nothing at all. But a lot of folks—not just you—at Liberty have been extremely vocal about the work that you’re doing, how you view these things, and I almost want to call it advocacy or evangelism for the CDK. I’m slightly embarrassed to admit that for a little while there, I thought you were an AWS employee in their DevRel program because you were such an advocate in such strong ways for the CDK itself.


And that is not something I expected. Usually you see the most vocal folks working in environments that, let’s be honest, tend to play a little bit fast and loose with things like formal corporate communications. Liberty doesn’t and yet, there you folks are telling these great stories. Was that hard to win over as a culture, or am I just misunderstanding how corporate life is these days?


Matt: No, I mean, so it was different, right? There was a point in time where, I think, we all just sort of decided that—I mean, we’re really good at what we do from an engineering perspective, and we wanted to make sure that, given the messaging we were given, those 5000 tech employees in Liberty Mutual, if you consider the difference in broadcasting to 5000 versus going external, it may sound like there’s millions, billions of people in the world, but in reality, the difference in messaging is not that much. So, to me what I thought, like, whenever I started anyway—it’s not, like, we had a meeting and all decided at the same time—but whenever I started, it was a case of, instead of me just posting on all the internal channels—because I’ve been doing this for years—it’s just at that moment, I thought, I could just start saying these things externally and still bring them internally because all you’ve done is widened the audience; you haven’t actually made it shallower. And that meant that whenever I was having the internal conversations, nothing actually changed except for it meant external people, like all their Heroes—like Jeremy Daly—could comment on these things, and then I could bring that in internally. So, it almost helped the reverse takeover of the enterprise to change the culture because I didn’t change that much except for change the audience of who I was talking to.


Corey: This episode is sponsored by our friends at Oracle. HeatWave is a new high-performance accelerator for the Oracle MySQL Database Service. Although I insist on calling it “my squirrel.” While MySQL has long been the world’s most popular open source database, shifting from transacting to analytics required way too much overhead and, ya know, work. With HeatWave you can run your OLTP and OLAP, don’t ask me to ever say those acronyms again, workloads directly from your MySQL database and eliminate the time consuming data movement and integration work, while also performing 1100X faster than Amazon Aurora, and 2.5X faster than Amazon Redshift, at a third of the cost. My thanks again to Oracle Cloud for sponsoring this ridiculous nonsense.


Corey: One thing that you’ve done that I want to say is admirable, and I stumbled across it when I was doing some work myself over the break, and only right before this recording did I discover that it was you is the cdkpatterns.com website. Specifically what I love about it is that it publishes a bunch of different patterns of ways to do things. This deviates from a lot of tutorials on, “Here’s how to build this one very specific thing,” and instead talks about, “Here’s the architecture design; here’s what the baseline pattern for that looks like.” It’s more than a template, but less than a, “Oh, this is a messaging app for dogs and I’m trying to build a messaging app for cats.” It’s very generalized, but very direct, and I really, really like that model of demo.


Matt: Thank you. So, watching some of your Twitter threads where you experiment with new—


Corey: Uh oh. People read those. That’s a problem.


Matt: I know. So, whenever you experiment with a new piece of AWS, I’ve always wondered what it would be like to be your enabling architect. Because technically, my job in Liberty is, I’m meant to try and stay ahead of everybody and try and ease the on-ramp to these things. So, if I was your enabling architect, I would be looking at it going, “I should really have a pattern for this.” So that whenever you want to pick up that new service, it’s there. The patterns in cdkpatterns.com, there’s 24, 25 of them right there, but internally, there’s way more than dozens now.


The goal is, the pattern is the least amount of code for you to learn a concept. And then that way, you can not only see how something works, but you can maybe pick up one of the pieces of the Well-Architected Framework while you’re there: All of it’s unit tested, all of it is proper, you know, like, commented code. The idea is to not be crap, but not be gold-plated either. I’m currently in the process of upgrading that all to V2 as well. So, that [unintelligible 00:21:32].


Corey: You mentioned a phrase just now: “Enabling architect.” I have to say this is one that has not crossed my desk before. Is that an internal term you use? Is that an enterprise concept I’ve somehow managed to avoid? Is that an AWS job role? What is that?


Matt: I’ve just started saying [laugh] it’s my job over the past couple of years. That—I don’t know, patent pending? But the idea to me is—


Corey: No, it’s evocative. I love the term, I’d love to learn more.


Matt: Yeah, because you can sort of take two approaches to your architecture: You can take the traditional approach, which is the ‘house of no’ almost, where it’s like, “This is the architecture. How dare you want to deviate. This is what we have decided. If you want to change it, here’s the Architecture Council and go through enterprise architecture as people imagine it.” But as people might work out quite quickly, whenever they meet me, the whole, like, long conversational meetings are not for me. What I want to do is teach engineers how to help themselves, so that’s why I see myself as enabling.


And what I’ve been doing is using techniques like Wardley Mapping, which is where you can go out and you can actually take all the components of people’s architecture and you can draw them on a map for—it’s a map of how close they are to the customer, as well as how cutting edge the tech is, or how aligned to our strategic direction it is. So, you can actually map out all of the teams, and—there’s 160, 170 engineers in Belfast and Dublin, and I can actually go in and say, “Oh, that piece of your architecture would be better if it was evolved to this. Well, I have a pattern for that,” or, “I don’t have a pattern for that, but you know what? I’ll build one and let’s talk about it next week.” And that’s always trying to be ahead, instead of people coming to me and I have to say no.


Corey: AWS Proton was designed to do something vaguely similar, where you could set out architectural patterns of—like, the two examples that they gave—I don’t know if it’s in general availability yet or still in public preview, but the ones that they gave were to build a REST API with Lambda, and building something-or-other with Fargate. And the idea was that you could basically fork those, or publish them inside of your own environment of, “Oh, you want a REST API; go ahead and do this.” It feels like their vision is a lot more prescriptive than what yours is.


Matt: Yeah. I talked to them quite a lot about Proton, actually because, as always, there’s different methodologies and different ways of doing things. And as I showed externally, we have our software accelerator, which is kind of our take on Proton, and it’s very open. Anybody can contribute; anybody can consume. And then that way, it means that you don’t necessarily have one central team, you can have—think of it more like an SRE function for all of the patterns, rather than… the Proton way is you’ve separate teams that are your DevOps teams that set up your patterns and then a separate team that’s the consumer, and they have different permissions, different rights to do different things. If you use a Proton pattern, anytime an update is made to that pattern, it auto-deploys your infrastructure.


Corey: I can see that breaking an awful lot.


Matt: [laugh]. Yeah. So, the idea is sort of if you’re a consumer, I assume you [unintelligible 00:24:35] be going to change that infrastructure. You can, they’ve built in an escape hatch, but the whole concept of it is there’s a central team that looks to what the best configuration for that is. So, I think Proton has so much potential, I just think they need to loosen some of the boundaries for it to work for us, and that’s the feedback I’ve given them directly as well.


Corey: One thing that I want to take a step beyond this is, you care about this more than most do. I mean, people will work with computers, yes. We get paid for that. Then they’ll go and give talks about things. You’re doing that as well. They’ll launch a website occasionally, like, cdkpatterns.com, which you have. And then you just sort of decide to go for the absolute hardest thing in the world, and you’re one of four authors of a book on this. Tell me more.


Matt: Yeah. So, this is something that there’s a few of us have been talking since one of the first CDK Days, where we’re friends, so there’s AWS Heroes. There’s Thorsten Höger, Matt Bonig, Sathyajith Bhat, and myself, came together—it was sometime in the summer last year—and said, “Okay. We want to write a book, but how do we do this?” Because, you know, we weren’t authors before this point; we’d never done it before. We weren’t even sure if we should go to a publisher, or if we should self-publish.


Corey: I argue that no one wants to write a book. They want to have written a book, and every first-time author I’ve ever spoken to at the end has said, “Why on earth would anyone want to do this a second time?” But people do it.


Matt: Yeah. And that’s where we talked to Alex DeBrie, actually, about his book, the amazing DynamoDB Book. And it was his advice that told us to self-publish. And he gave us his starter template that he used for his book, which took so much of the pain out because all we had to do was then work out how we were going to work together. And I will say, I write quite a lot of stuff in general for people, but writing a book is completely different because once it’s out there, it’s out there. And if it’s wrong, it’s wrong. You got to release a new version and be like, “Listen, I got that wrong.” So, it did take quite a lot of effort from the group to pull it together. But now that we have it, I want to—I don’t have a printed copy because it’s only PDF at the minute, but I want a copy just put here [laugh] in, like, the frame. Because it’s… it’s what we all want.


Corey: Yeah, I want you to do that through almost a traditional publisher, selfishly, because O’Reilly just released the AWS Cookbook, and I had a great review quote on the back talking about the value added. I would love to argue that they use one of mine for The CDK Book—and then of course they would reject it immediately—of, “I don’t know why you do all this. Using the console and lying about it is way easier.” But yeah, obviously not the direction you’re trying to take the book in. But again, the industry is not quite ready for the lying version of ClickOps.


It’s really neat to just see how willing you are to—how to frame this?—to give of yourself and your time and what you’ve done so freely. I sometimes make a joke—that arguably isn’t that funny—that, “Oh, AWS Hero. That means that you basically volunteer for a $1.6 trillion company.”


But that’s not actually what you’re doing. Having figured out all the sharp edges and hacked your way through the jungle to get to something that is functional, you’re a trailblazer. You’re trying to save other people who are working with that same thing from difficult experiences on their own, having to all thrash and find our own way. And not everyone is as diligent and as willing to continue to persist on these things. Is that a somewhat fair assessment of how you see the Hero role?



Matt: Yeah. I mean, no two Heroes are the same, from what I’ve judged. I haven’t met every Hero yet because pandemic, so Vegas was the first time [I met most 00:28:12], but from my perspective, I mean, in the past, whatever number of years I’ve been coding, I’ve always been doing the same thing. Somebody always has to go out and be the first person to try the thing and work out what the value is, and where it’ll work for us or not work for us. The only difference with the external and public piece is that last 5%, which is a very different thing to do, but I personally, I like even having conversations like this where I get to meet people that I’ve never met before.


Corey: You sort of discovered the entire secret of why I have an interview podcast.


Matt: [laugh]. Yeah because this is what I get out of it, just getting to meet other people and have new experiences. But I will say there’s Heroes out there doing very different things. You’ve got, like, Hiro—as in Hiro, H-I-R-O—who actually started AWS Newbies and she’s taught—ah, it’s hundreds of thousands of people how to actually just start with AWS, through a course designed for people who weren’t coders before. That kind of thing is next-level compared to anything I’ve ever done because you know, they have actually built a product and just given it away. I think that’s amazing.


Corey: At some level, building a product and giving it away sounds like, “You know, I want to never be lonely again.” Well, that’ll work because you’re always going to get support tickets. There’s an interesting narrative around how to wind up effectively managing the community, and users, and demands, based on open-source maintainers, that we’re all wrestling with as an industry, particularly in the wake of that whole log4j nonsense that we’ve been tilting at that windmill, and that’s going to be with us for a while. One last thing I want to talk about before we wind up calling this an episode is, you are one of the organizers of CDK Day. What is that?


Matt: Yeah, so CDK Day, it’s a complete community-organized conference. The past two have been worldwide, fully virtual just because of the situation we’re in. And I mean, they’ve been pretty popular. I think we had about 5000 people attend the last one, and the idea is, it’s a full day of the community just telling their stories of how they liked or disliked using the CDK. So, it’s not a marketing event; it’s not a sales event; we actually run the whole event on a budget of exactly $0. But yeah, it’s just a day of fun to bring the community together and learn a few things. And, you know, if you leave it thinking CDK is not for you, I’m okay with that as much as if you just make a few friends while you’re there.


Corey: This is the first time I’d realized that it wasn’t a formal AWS event. I almost feel like that’s the tagline that you should have under it. It’s—because it sounds like the CDK Day, again, like, it’s this pure evangelism, “This is why it’s great and why you should use it.” But I love conferences that embrace critical views. One of the first talks I ever built out that did anything beyond small user groups was “Heresy in the Church of Docker.”


Then they asked me to give that at ContainerCon, which was incredibly flattering. And I don’t think they made that mistake a second time, but it was great to just be willing to see some group of folks that are deeply invested in the technology, but also very open to hearing criticism. I think that’s the difference between someone who is writing a nuanced critique versus someone who’s just [pure-on 00:31:18] zealotry. “But the CDK is the answer to every technical problem you’ve got.” Well, I start to question the wisdom of how applicable it really is, and how objective you are. I’ve never gotten that vibe from you.


Matt: No, and that’s the thing. So, I mean, as we’ve worked out in this conversation, I don’t work for AWS, so it’s not my product. I mean, if it succeeds or if it fails, it doesn’t impact my livelihood. I mean, there are people on the team who I would be sad for, but the point is, my end goal is always the same. I want people to be enabled to rapidly deliver their software to help their customers.


If that’s CDK, perfect, but CDK is not for everyone. I mean, there are other options available in the market. And if, even, ClickOps is the way to go for you, I am happy for you. But if it’s a case of we can have a conversation, and I can help you get closer to where you need to be with some other tool, that’s where I want to be. I just want to help people.


Corey: And if I can do anything to help along that axis, please don’t hesitate to let me know. I really want to thank you for taking the time to speak with me and being so generous, not just with your time for this podcast, but all the time you spend helping the rest of us figure out which end is up, as we continue to find that the way we manage environments evolves.


Matt: Yeah. And, listen, just thank you for having me on today because I’ve been reading your tweets for two years, so I’m just starstruck at this moment to even be talking to you. So, thank you.


Corey: No, no. I understand that, but don’t worry, I put my pants on two legs at a time, just like everyone else. That’s right, the thought leader on Twitter, you have to jump into your pants. That’s the rule. Thanks again so much. I look forward to having a further conversation with you about this stuff as I continue to explore, well honestly, what feels like a brand new paradigm for how we manage code.


Matt: Yeah. Reach out if you need any help.


Corey: I certainly will. You’ll regret asking. Matt [Coulter 00:33:06], Technical Architect at Liberty Mutual. I’m Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you’ve enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you’ve hated this podcast, please leave a five-star review on your podcast platform of choice, write an angry comment, then click the submit button, but lie and say you hit the submit button via an API call.


Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.


Announcer: This has been a HumblePod production. Stay humble.


Corey: This episode is sponsored in part by our friends at Rising Cloud, which I hadn’t heard of before, but they’re doing something vaguely interesting here. They are using AI, which is usually where my eyes glaze over and I lose attention, but they’re using it to help developers be more efficient by reducing repetitive tasks. So, the idea being that you can run stateless things without having to worry about scaling, placement, et cetera, and the rest. They claim significant cost savings, and they’re able to wind up taking what you’re running as it is in AWS with no changes, and run it inside of their data centers that span multiple regions. I’m somewhat skeptical, but their customers seem to really like them, so that’s one of those areas where I really have a hard time being too snarky about it because when you solve a customer’s problem and they get out there in public and say, “We’re solving a problem,” it’s very hard to snark about that. Multus Medical, Construx.ai and Stax have seen significant results by using them. And it’s worth exploring. So, if you’re looking for a smarter, faster, cheaper alternative to EC2, Lambda, or Batch, consider checking them out. Visit risingcloud.com/benefits. That’s risingcloud.com/benefits, and be sure to tell them that I sent you because watching people wince when you mention my name is one of the guilty pleasures of listening to this podcast.

Corey: Welcome to Screaming in the Cloud. I’m Corey Quinn. I’m joined today by Matt Coulter, who is a Technical Architect at Liberty Mutual. You may have had the privilege of seeing him on the keynote stage at re:Invent last year—in Las Vegas or remotely—that last year of course being 2021. But if you make better choices than the two of us did, and found yourself not there, take the chance to go and watch that keynote. It’s really worth seeing.

Matt, first, thank you for joining me. I’m sorry, I don’t have 20,000 people here in the audience to clap this time. They’re here, but they’re all remote as opposed to sitting in the room behind me because you know, social distancing.

Matt: In this left earphone, I just have some applause going, just permanently, just to keep me going. [laugh].

Corey: That’s sort of my own internal laugh track going on. It’s basically whatever I say is hilarious, to that. So yeah, doesn’t really matter what I say, how I say it, my jokes are all for me. It’s fine. So, what was it like being on stage in front of that many people? It’s always been a wild experience to watch and for folks who haven’t spent time on the speaking circuit, I don’t think that there’s any real conception of what that’s like. Is this like giving a talk at work, where I just walk on stage randomly, whatever I happened to be wearing? And, oh, here’s a microphone, I’m going to say words. What is the process there?

Matt: It’s completely different. For context for everyone, before the pandemic, I would have pretty regularly talked in front of, I don’t know, maybe one, two hundred people in Liberty, in Belfast. So, I used to be able to just, sort of, walk in front of them, and lean against the pillar, and use my clicker, and click through, but the process for actually presenting something as big as a keynote at re:Invent is so different. For starters, you think that when you walk onto the stage, you’ll actually be able to see the audience, but the way the lights are set up, you can pretty much see about one row of people, and they’re not the front row, so anybody I knew, I couldn’t actually see.

And yeah, you can only see, sort of like, the void, and then you have your screens, so you’ve six sets of screens that tell you your notes as well as what slides you’re on, you know, so you can pivot. But other than that, I mean, it feels like you’re just talking to yourself outside of whenever people, thankfully, applaud. It’s such a long process to get there.

Corey: I’ve always said that there are a few different transition stages as the audience size increases, but for me, the final stage is more or less anything above 750 people. Because as you say, you aren’t able to see that many beyond that point, and it doesn’t really change anything meaningfully. The most common example that you see in the wild is jokes that work super well with a small group of people fall completely flat to large audiences. It’s why so much corporate humor is cheesy because yeah, everyone in the rehearsals is sitting there laughing and the joke kills, but now you’ve got 5000 people sitting in a room and that joke just sounds strained and forced because there’s no longer a conversation, and no one has the shared context that—the humor has to change. So, in some cases when you’re telling a story about what you’re going to say on stage, during a rehearsal, they’re going to say, “Well, that joke sounds really corny and lame.” It’s, “Yeah, wait until you see it in front of an audience. It will land very differently.” And I’m usually right on that.

I would also advise, you know, doing what you do and having something important and useful to say, as opposed to just going up there to tell jokes the whole time. I wanted to talk about that because you talked about how you’re using various CDK and other serverless style patterns in your work at Liberty Mutual.

Matt: Yeah. So, we’ve been using CDK pretty extensively since it was, sort of, Q3 2019. At that point, it was new. Like, it had just gone GA at the time, just came out of dev preview. And we’ve been using CDK from the perspective of we want to be building serverless-first, well-architected apps, and ideally we want to be building them on AWS.

Now, the thing is, we have 5000 people in our IT organization, so there’s sort of a couple of ways you can take to try and get those people onto the cloud: You can either go the route of being, like, there is one true path to architecture, this is our architecture and everything you want to build can fit into that square box; or you can go the other approach and try and have the golden path where you say this is the paved road that is really easy to do, but if you want to differentiate from that route, that’s okay. But what you need to do is feed back into the golden path if that works. Then everybody can improve. And that’s where we’ve started using CDK. So, what you heard me talk about was the software accelerator, and it’s sort of a different approach.

It’s where anybody can build a pattern and then share it so that everybody else can rapidly, you know, just reuse it. And what that means is effectively you can, instead of having to have hundreds of people on a central team, you can actually just crowdsource, and sort of decentralize the function. And if things are good, then a small team can actually come in and audit them, so to speak, and check that it’s well-architected, and doesn’t have flaws, and drive things that way.

Corey: I have to confess that I view the CDK as sort of a third stage automation approach, and it’s one that I haven’t done much work with myself. The first stage is clicking around in the console; the second is using CloudFormation or Terraform; the third stage is what we’re talking about here, which is CDK or Pulumi, or something like that. And then you ascend to the final fourth stage, which is what I use, which is clicking around in the AWS console, but then you lie to people about it. ClickOps is poised to take over the world. But that’s okay. You haven’t gotten that far yet. Instead, you’re on the CDK side. What advantages does CDK offer that effectively CloudFormation or something like it doesn’t?

Matt: So, first off, for ClickOps in Liberty, we actually have the AWS console as read-only in all of our accounts, except for sandbox. So, you can ClickOps in sandbox to learn, but if you want to do something real, unfortunately, it’s going to fail you. So—

Corey: I love that pattern. I think I might steal that.

Matt: [laugh]. So, originally, we went heavy on CloudFormation, which is why CDK worked well for us. And because we’ve actually—it’s been a long journey. I mean, we’ve been deploying—2014, I think it was, we first started deploying to AWS, and we’ve used everything from Terraform, to you name it. We’ve built our own tools, believe it or not, that are basically CDK.

And the thing about CloudFormation is, it’s brilliant, but it’s also incredibly verbose and long because you need to specify absolutely everything that you want to deploy, and every piece of configuration. And that’s fine if you’re just deploying a side project, but if you’re in an enterprise that has responsibilities to protect user data, and you can’t just deploy anything, they end up thousands and thousands and thousands of lines long. And then we have amazing guardrails, so if you tried to deploy a CloudFormation template with a flaw in it, we can either just fix it, or reject the deploy. But CloudFormation is not known to be the fastest to deploy, so you end up in this developer cycle, where you build this template by hand, and then it goes through that CloudFormation deploy, and then you get the failure message that it didn’t deploy because of some compliance thing, and developers just got frustrated, and were like, sod this. [laugh].

I’m not deploying to AWS. Back to on-prem. And that’s where CDK was a bit different because it allowed us to actually build abstractions with all of our guardrails baked in, so that it just looked like a standard class, for developers, like, developers already know Java, Python, TypeScript, the languages of CDK, and so we were able to just make it easy by saying, “You want API Gateway? There’s an API Gateway class. You want, I don’t know, an EC2 instance? There you go.” And that way, developers could focus on the thing they wanted, instead of all of the compliance stuff that they needed to care about every time they wanted to deploy.
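The "guardrails baked into a class" idea can be sketched without any AWS tooling. What follows is a rough, library-free illustration in plain Python, not Liberty Mutual's code and not the real CDK API: the `SecureBucket` class, the `synth` helper, and the choice of an S3 bucket as the example resource are all assumptions made for this sketch. The dictionary keys follow the CloudFormation `AWS::S3::Bucket` resource format.

```python
import json


class SecureBucket:
    """Hypothetical wrapper: callers pick a name, the guardrails are fixed."""

    def __init__(self, logical_id: str, versioned: bool = True):
        self.logical_id = logical_id
        self.versioned = versioned

    def to_cloudformation(self) -> dict:
        # Compliance settings live here, so no caller can forget them.
        properties = {
            "BucketEncryption": {
                "ServerSideEncryptionConfiguration": [
                    {"ServerSideEncryptionByDefault": {"SSEAlgorithm": "aws:kms"}}
                ]
            },
            "PublicAccessBlockConfiguration": {
                "BlockPublicAcls": True,
                "BlockPublicPolicy": True,
                "IgnorePublicAcls": True,
                "RestrictPublicBuckets": True,
            },
        }
        if self.versioned:
            properties["VersioningConfiguration"] = {"Status": "Enabled"}
        return {
            self.logical_id: {"Type": "AWS::S3::Bucket", "Properties": properties}
        }


def synth(*constructs) -> dict:
    """Merge each construct's resources into one template-shaped dict."""
    resources = {}
    for construct in constructs:
        resources.update(construct.to_cloudformation())
    return {"Resources": resources}


# The developer writes one line; the verbose, compliant template is generated.
template = synth(SecureBucket("ClaimsDocuments"))
print(json.dumps(template, indent=2))
```

The point of the sketch is the ratio: one line of developer intent expands into the long, compliance-heavy configuration that would otherwise be written by hand and rejected at deploy time if any piece were missing.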

Corey: Personally, I keep lobbying AWS to add my preferred language, which is crappy shell scripting, but for some reason they haven’t really been quick to add that one in. The thing that I think surprises me, on some level—though, perhaps it shouldn’t—is not just the adoption of serverless that you’re driving at Liberty Mutual, but the way that you’re interacting with that feels very futuristic, for lack of a better term. And please don’t think that I’m in any way describing this in a way that’s designed to be insulting, but I do a bunch of serverless nonsense on Twitter for Pets. That’s not an exaggeration. twitterforpets.com has a bunch of serverless stuff behind it because you know, I have personality defects.

But no one cares about that static site that’s been a slide dump a couple of times for me, and a running joke. You’re at Liberty Mutual; you’re an insurance company. When people wind up talking about big enterprise institutions, you’re sort of a shorthand example of exactly what they’re talking about. It’s easy to contextualize or think of that as being very risk averse—for obvious reasons; you are an insurance company—as well as wanting to move relatively slowly with respect to technological advancement because mistakes are going to have drastic consequences to all of your customers, people’s lives, et cetera, as opposed to tweets or—barks—not showing up appropriately at the right time. How did you get to the, I guess, advanced architectural philosophy that you clearly have been embracing as a company, while having to be respectful of the risk inherent that comes with change, especially in large, complex environments?

Matt: Yeah, it’s funny because so for everyone, we were talking before this recording started about, I’ve been with Liberty since 2011. So, I’ve seen a lot of change in the length of time I’ve been here. And I’ve built everything from IBM applications right the way through to the modern serverless apps. But the interesting thing is, the journey to where we are today definitely started eight or nine years ago, at a minimum because there was something identified in the leadership that they said, “Listen, we’re all about our customers. And that means we don’t want to be wasting millions of dollars, and thousands of hours, and big trains of people to build software that does stuff. We want to focus on why are we building a piece of software, and how quickly can we get there? If you focus on those two things you’re doing all right.”

And that’s why starting from the early days, we focused on things like, okay, everything needs to go through CI/CD pipelines. You need to have your infrastructure as code. And even if you’re deploying on-prem, you’re still going to be using the same standards that we use to deploy to AWS today. So, we had years and years and years of just baking good development practices into the company. And then whenever we started to move to AWS, the question became, do we want to just deploy the same thing or do we want to take full advantage of what the cloud has to offer? And I think because we were primed and because the leadership had the right direction, you know, we were just sitting there ready to say, “Okay, serverless seems like a way we can rapidly help our customers.” And that’s what we’ve done.

Corey: A lot of the arguments against serverless—and let’s be clear, they rhyme with the previous arguments against cloud that lots of people used to make; including me, let’s be clear here. I’m usually wrong when I try to predict the future. “Well, you’re putting your availability in someone else’s hands,” was the argument about cloud. Yeah, it turns out the clouds are better at keeping things up than we are as individual companies.

Then with serverless, it’s the, “Well, if they’re handling all that stuff for you on their side, when they’re down, you’re down. That’s an unacceptable business risk, so we’re going to be cloud-agnostic and multi-cloud, and that means everything we build serverlessly needs to work in multiple environments, including in our on-prem environment.” And from the way that we’re talking about servers and things that you’re building, I don’t believe that is technically possible, unless some of the stuff you’re building is ridiculous. How did you come to accept that risk organizationally?

Matt: These are the conversations that we’re all having. Sort of, I’d say once a week, we all have a multi-cloud discussion—and I really liked the article you wrote, it was maybe last year, maybe the year before—but multi-cloud to me is about taking the best capabilities that are out there and bringing them together. So, you know, like, Azure [ID 00:12:47] or whatever, things from the other clouds that they’re good at, and using those rather than thinking, “Can I build a workload that I can simultaneously pay all of the price to run across all of the clouds, all of the time, so that if one’s down, theoretically, I might not have an outage?” So, the way we’ve looked at it is we embraced really early the Well-Architected Framework from AWS. And it talks about things like you need to have multi-region availability, you need to have your backups in place, you need to have things like circuit breakers in place for if a third party goes down, and we’ve just tried to build really resilient architectures as best as we can on AWS. And do you know what I think, if [laugh] AWS is not—I know at re:Invent, it went down extraordinarily often compared to normal, but in general—

Corey: We were all tired of re:Invent; their us-east-1 was feeling the exact same way.

Matt: Yeah, so that’s—it deserved a break. But, like, if somebody can’t buy insurance for an hour, once a year, [laugh] I think we’re okay with it versus spending millions to protect that one hour.
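For readers unfamiliar with the circuit breakers Matt mentions for third-party outages, here is a minimal sketch of the pattern in plain Python. The class, its parameters, and the quote-service functions are hypothetical names chosen for illustration; this is one common way to write the pattern, not a description of what Liberty Mutual runs.

```python
import time


class CircuitBreaker:
    """After repeated failures, stop calling the dependency for a while."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after  # seconds to wait before retrying
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()  # open: skip the dependency entirely
            # Cooling-off period is over: allow one trial call (half-open).
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback()
        self.failures = 0
        return result


calls = {"count": 0}


def flaky_quote_service():
    # Stand-in for a third-party API that is currently down.
    calls["count"] += 1
    raise ConnectionError("third party is down")


def cached_quote():
    return "please try again shortly"


breaker = CircuitBreaker(max_failures=2, reset_after=60.0)
results = [breaker.call(flaky_quote_service, cached_quote) for _ in range(3)]
print(results, calls["count"])
```

In this run the third request never reaches the failing service: after two consecutive failures the breaker opens and serves the fallback directly, which is what keeps a struggling dependency from dragging the caller down with it.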

Corey: And people make assumptions based on this where, okay, we had this problem with us-east-1 that froze things like the global Route 53 control planes; you couldn’t change DNS for seven hours. And I highlighted that as, yeah, this is a problem, and it’s something to severely consider, but I will bet you anything you’d care to name that there is an incredibly motivated team at AWS, actively fixing that as we speak. And by—I don’t know how long it takes to untangle all of those dependencies, but I promise they’re going to be untangled in relatively short order versus running data centers myself, when I discover a key underlying dependency I didn’t realize was there, well, we need to break that. That’s never going to happen because we’re trying to do things as a company, and it’s just not the most important thing for us as a going concern. With AWS, their durability and reliability is the most important thing, arguably compared to security.

Would you rather be down or insecure? I feel like they pick down—I would hope in most cases they would pick down—but they don’t want to do either one. That is something they are drastically incentivized to fix. And I’m never going to be able to fix things like that and I don’t imagine that you folks would be able to either.

Matt: Yeah, so, two things. The first thing is the important stuff, like, for us, that’s claims. We want to make sure at any point in time, if you need to make a claim you can because that is why we’re here. And we can do that with people whether or not the machines are up or down. So, that’s why, like, you always have a process—a manual process—that the business can operate, irrespective of whether the cloud is still working.

And that’s why we’re able to say if you can’t buy insurance in that hour, it’s okay. But the other thing is, we did used to have a lot of data centers, and I have to say, the people who ran those were amazing—I think half the staff now work for AWS—but there was this story that I heard where there was an app that used to go down at the same time every day, and nobody could work out why. And it was because someone was coming in to clean the room at that time, and they unplugged the server to plug in a vacuum, and then were cleaning the room, and then plugging it back in again. And that’s the kind of thing that just happens when you manage people, and you manage a building, and manage a premises. Whereas if you heard that happened at AWS, I mean, that would be front page news.

Corey: Oh, it absolutely would. There’s also—as you say, if it’s the sales function, if people aren’t able to buy insurance for an hour, when us-east-1 went down, the headlines were all screaming about AWS taking an outage, and some of the more notable customers were listed as examples of this, but the story was that, “AWS has massive outage,” not, “Your particular company is bad at technology.” There’s sort of a reputational risk mitigation by going with one of these centralized things. And again, as you’re alluding to, what you’re doing is not life-critical as far as the sales process and getting people to sign up. If an outage meant that suddenly a bunch of customers were no longer insured, that’s a very different problem. But that’s not your failure mode.

Matt: Exactly. And that’s where, like, you got to look at what your business is, and what you’re specifically doing, but for 99.99999% of businesses out there, I’m pretty sure you can be down for the tiny window that AWS is down per year, and it will be okay, as long as you plan for it.
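To put the "an hour, once a year" figure from earlier in the conversation into availability terms, the arithmetic is short. The function names below are made up for this sketch, and the year is taken as 365 days.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def availability(downtime_hours_per_year: float) -> float:
    """Fraction of the year a service is up, given its yearly downtime."""
    return 1 - downtime_hours_per_year / HOURS_PER_YEAR


def allowed_downtime_hours(availability_fraction: float) -> float:
    """Yearly downtime budget implied by an availability target."""
    return (1 - availability_fraction) * HOURS_PER_YEAR


one_hour = availability(1.0)
print(f"One hour down per year: {one_hour:.4%} available")
print(f"A 99.99% target allows {allowed_downtime_hours(0.9999) * 60:.1f} minutes per year")
```

One hour of downtime a year works out to roughly 99.99% availability, which frames the trade-off Matt describes: the question is what it would cost to buy back that last hour.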

Corey: So, one thing that really surprised me about the entirety of what you’ve done at Liberty Mutual is that you’re a big enterprise company, and you can take a look at any enterprise company, and say that they have dueling mottos, which is, “I am not going to comment on that,” or, “That’s not funny.” Like, the safe mode for any large concern is to say nothing at all. But a lot of folks—not just you—at Liberty have been extremely vocal about the work that you’re doing, how you view these things, and I almost want to call it advocacy or evangelism for the CDK. I’m slightly embarrassed to admit that for a little while there, I thought you were an AWS employee in their DevRel program because you were such an advocate in such strong ways for the CDK itself.

And that is not something I expected. Usually you see the most vocal folks working in environments that, let’s be honest, tend to play a little bit fast and loose with things like formal corporate communications. Liberty doesn’t and yet, there you folks are telling these great stories. Was that hard to win over as a culture, or am I just misunderstanding how corporate life is these days?

Matt: No, I mean, so it was different, right? There was a point in time where, I think, we all just sort of decided that—I mean, we’re really good at what we do from an engineering perspective, and we wanted to make sure that, given the messaging we were given, those 5000 tech employees in Liberty Mutual, if you consider the difference in broadcasting to 5000 versus going external, it may sound like there’s millions, billions of people in the world, but in reality, the difference in messaging is not that much. So, to me what I thought, like, whenever I started anyway—it’s not, like, we had a meeting and all decided at the same time—but whenever I started, it was a case of, instead of me just posting on all the internal channels—because I’ve been doing this for years—it’s just at that moment, I thought, I could just start saying these things externally and still bring them internally because all you’ve done is widened the audience; you haven’t actually made it shallower. And that meant that whenever I was having the internal conversations, nothing actually changed except for it meant external people, like all the Heroes—like Jeremy Daly—could comment on these things, and then I could bring that in internally. So, it almost helped the reverse takeover of the enterprise to change the culture because I didn’t change that much except for change the audience of who I was talking to.

Corey: This episode is sponsored by our friends at Oracle. HeatWave is a new high-performance accelerator for the Oracle MySQL Database Service. Although I insist on calling it “my squirrel.” While MySQL has long been the world’s most popular open source database, shifting from transacting to analytics required way too much overhead and, ya know, work. With HeatWave you can run your OLTP and OLAP, don’t ask me to ever say those acronyms again, workloads directly from your MySQL database and eliminate the time-consuming data movement and integration work, while also performing 1100X faster than Amazon Aurora, and 2.5X faster than Amazon Redshift, at a third of the cost. My thanks again to Oracle Cloud for sponsoring this ridiculous nonsense.

Corey: One thing that you’ve done that I want to say is admirable, and I stumbled across it when I was doing some work myself over the break, and only right before this recording did I discover that it was you, is the cdkpatterns.com website. Specifically what I love about it is that it publishes a bunch of different patterns of ways to do things. This deviates from a lot of tutorials on, “Here’s how to build this one very specific thing,” and instead talks about, “Here’s the architecture design; here’s what the baseline pattern for that looks like.” It’s more than a template, but less than a, “Oh, this is a messaging app for dogs and I’m trying to build a messaging app for cats.” It’s very generalized, but very direct, and I really, really like that model of demo.

Matt: Thank you. So, watching some of your Twitter threads where you experiment with new—

Corey: Uh oh. People read those. That’s a problem.

Matt: I know. So, whenever you experiment with a piece of AWS that’s new to you, I’ve always wondered what it would be like to be your enabling architect. Because technically, my job in Liberty is, I’m meant to try and stay ahead of everybody and try and ease the on-ramp to these things. So, if I was your enabling architect, I would be looking at it going, “I should really have a pattern for this,” so that it’s there whenever you want to pick up that new service. The patterns on cdkpatterns.com, there’s 24, 25 of them right there, but internally, there’s way more than dozens now.

The goal is, the pattern is the least amount of code for you to learn a concept. And then that way, you can not only see how something works, but you can maybe pick up one of the pieces of the well-architected framework while you’re there: All of it’s unit tested, all of it is proper, you know, like, commented code. The idea is to not be crap, but not be gold-plated either. I’m currently in the process of upgrading that all to V2 as well. So, that [unintelligible 00:21:32].

Corey: You mentioned a phrase just now: “Enabling architect.” I have to say this is one that has not crossed my desk before. Is that an internal term you use? Is that an enterprise concept I’ve somehow managed to avoid? Is that an AWS job role? What is that?

Matt: I’ve just started saying [laugh] it’s my job over the past couple of years. That—I don’t know, patent pending? But the idea to me is—

Corey: No, it’s evocative. I love the term, I’d love to learn more.

Matt: Yeah, because you can sort of take two approaches to your architecture: You can take the traditional approach, which is the ‘house of no’ almost, where it’s like, “This is the architecture. How dare you want to deviate. This is what we have decided. If you want to change it, here’s the Architecture Council and go through enterprise architecture as people imagine it.” But as people might work out quite quickly, whenever they meet me, the whole, like, long conversational meetings are not for me. What I want to do is teach engineers how to help themselves, so that’s why I see myself as enabling.

And what I’ve been doing is using techniques like Wardley Mapping, which is where you can go out and you can actually take all the components of people’s architecture and you can draw them on a map for—it’s a map of how close they are to the customer, as well as how cutting edge the tech is, or how aligned to our strategic direction it is. So, you can actually map out all of the teams, and—there’s 160, 170 engineers in Belfast and Dublin, and I can actually go in and say, “Oh, that piece of your architecture would be better if it was evolved to this. Well, I have a pattern for that,” or, “I don’t have a pattern for that, but you know what? I’ll build one and let’s talk about it next week.” And that’s always trying to be ahead, instead of people coming to me and I have to say no.
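[Editor's note: The mapping exercise Matt describes can be sketched in code. This is a minimal illustration, not Liberty Mutual's tooling. It assumes the standard Wardley evolution stages (genesis, custom, product, commodity) for the "how cutting edge" axis and a 0-to-1 visibility score for "how close to the customer". The component names, team names, `PatternCatalog`, and `suggestEvolutions` are all hypothetical.]

```typescript
// Hypothetical model of a Wardley map; every name here is illustrative.

// Stages along the evolution axis, from novel to commodity.
type Evolution = "genesis" | "custom" | "product" | "commodity";
const STAGES: Evolution[] = ["genesis", "custom", "product", "commodity"];

interface Component {
  name: string;
  team: string;
  // 0 = deep in the stack, 1 = directly visible to the customer.
  visibility: number;
  evolution: Evolution;
}

// Assumed catalog of existing patterns: component name -> the stage
// that the pattern would move the component to.
type PatternCatalog = Record<string, Evolution>;

interface Suggestion {
  component: string;
  team: string;
  from: Evolution;
  to: Evolution;
}

// List the components that a catalogued pattern would move further along
// the evolution axis, most customer-visible components first.
function suggestEvolutions(map: Component[], catalog: PatternCatalog): Suggestion[] {
  const suggestions: Suggestion[] = [];
  const byVisibility = map.slice().sort((a, b) => b.visibility - a.visibility);
  for (const c of byVisibility) {
    const target: Evolution | undefined = catalog[c.name];
    if (target !== undefined && STAGES.indexOf(target) > STAGES.indexOf(c.evolution)) {
      suggestions.push({ component: c.name, team: c.team, from: c.evolution, to: target });
    }
  }
  return suggestions;
}

// Made-up example data for two teams.
const teamMap: Component[] = [
  { name: "batch-scheduler", team: "claims", visibility: 0.3, evolution: "custom" },
  { name: "quote-api", team: "quotes", visibility: 0.9, evolution: "custom" },
  { name: "object-storage", team: "claims", visibility: 0.1, evolution: "commodity" },
];

const catalog: PatternCatalog = {
  "quote-api": "product",
  "batch-scheduler": "commodity",
  "object-storage": "commodity",
};

for (const s of suggestEvolutions(teamMap, catalog)) {
  console.log(`${s.team}: ${s.component} could evolve from ${s.from} to ${s.to}`);
}
```

[In this sketch a component with no catalog entry is simply skipped, which corresponds to the "I don't have a pattern for that, but I'll build one" case in the conversation.]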

Corey: AWS Proton was designed to do something vaguely similar, where you could set out architectural patterns of—like, the two examples that they gave—I don’t know if it’s in general availability yet or still in public preview, but the ones that they gave were to build a REST API with Lambda, and building something-or-other with Fargate. And the idea was that you could basically fork those, or publish them inside of your own environment of, “Oh, you want a REST API; go ahead and do this.” It feels like their vision is a lot more prescriptive than what yours is.

Matt: Yeah. I talked to them quite a lot about Proton, actually, because, as always, there’s different methodologies and different ways of doing things. And as I showed externally, we have our software accelerator, which is kind of our take on Proton, and it’s very open. Anybody can contribute; anybody can consume. And then that way, it means that you don’t necessarily have one central team, you can have—think of it more like an SRE function for all of the patterns, rather than… the Proton way is you have separate teams that are your DevOps teams that set up your patterns and then a separate team that’s the consumer, and they have different permissions, different rights to do different things. If you use a Proton pattern, anytime an update is made to that pattern, it auto-deploys your infrastructure.

Corey: I can see that breaking an awful lot.

Matt: [laugh]. Yeah. So, the idea is sort of if you’re a consumer, I assume you [unintelligible 00:24:35] be going to change that infrastructure. You can, they’ve built in an escape hatch, but the whole concept of it is there’s a central team that looks to what the best configuration for that is. So, I think Proton has so much potential, I just think they need to loosen some of the boundaries for it to work for us, and that’s the feedback I’ve given them directly as well.

Corey: One thing that I want to take a step beyond this is, you care about this more than most do. I mean, people will work with computers, yes. We get paid for that. Then they’ll go and give talks about things. You’re doing that as well. They’ll launch a website occasionally, like, cdkpatterns.com, which you have. And then you just sort of decide to go for the absolute hardest thing in the world, and you’re one of four authors of a book on this. Tell me more.

Matt: Yeah. So, this is something that a few of us have been talking about since one of the first CDK Days, where we’re friends. So, there’s AWS Heroes Thorsten Höger, Matt Bonig, Sathyajith Bhat, and myself, who came together—it was sometime in the summer last year—and said, “Okay. We want to write a book, but how do we do this?” Because, you know, we weren’t authors before this point; we’d never done it before. We weren’t even sure if we should go to a publisher, or if we should self-publish.

Corey: I argue that no one wants to write a book. They want to have written a book, and every first-time author I’ve ever spoken to at the end has said, “Why on earth would anyone want to do this a second time?” But people do it.

Matt: Yeah. And that’s why we talked to Alex DeBrie, actually, about his book, the amazing DynamoDB Book. And it was his advice that told us to self-publish. And he gave us his starter template that he used for his book, which took so much of the pain out because all we had to do was then work out how we were going to work together. And I will say, I write quite a lot of stuff in general for people, but writing a book is completely different because once it’s out there, it’s out there. And if it’s wrong, it’s wrong. You’ve got to release a new version and be like, “Listen, I got that wrong.” So, it did take quite a lot of effort from the group to pull it together. But now that we have it, I want to—I don’t have a printed copy because it’s only PDF at the minute, but I want a copy just put here [laugh] in, like, the frame. Because it’s… it’s what we all want.

Corey: Yeah, I want you to do that through almost a traditional publisher, selfishly, because O’Reilly just released the AWS Cookbook, and I had a great review quote on the back talking about the value added. I would love to argue that they use one of mine for The CDK Book—and then of course they would reject it immediately—of, “I don’t know why you do all this. Using the console and lying about it is way easier.” But yeah, obviously not the direction you’re trying to take the book in. But again, the industry is not quite ready for the lying version of ClickOps.

It’s really neat to just see how willing you are to—how to frame this?—to give of yourself and your time and what you’ve done so freely. I sometimes make a joke—that arguably isn’t that funny—that, “Oh, AWS Hero. That means that you basically volunteer for a $1.6 trillion company.”

But that’s not actually what you’re doing. What you’re doing is having figured out all the sharp edges and hacked your way through the jungle to get to something that is functional, you’re a trailblazer. You’re trying to save other people who are working with that same thing from difficult experiences on their own, having to all thrash and find our own way. And not everyone is as diligent and as willing to continue to persist on these things. Is that a somewhat fair assessment of how you see the Hero role?

Matt: Yeah. I mean, no two Heroes are the same, from what I’ve judged. I haven’t met every Hero yet because pandemic, so Vegas was the first time [I met most 00:28:12], but from my perspective, I mean, in the past, whatever number of years I’ve been coding, I’ve always been doing the same thing. Somebody always has to go out and be the first person to try the thing and work out what the value is, and where it’ll work for us or not work for us. The only difference with the external and public piece is that last 5%, which is a very different thing to do, but personally, I like even having conversations like this where I get to meet people that I’ve never met before.

Corey: You sort of discovered the entire secret of why I have an interview podcast.

Matt: [laugh]. Yeah because this is what I get out of it, just getting to meet other people and have new experiences. But I will say there’s Heroes out there doing very different things. You’ve got, like, Hiro—as in Hiro, H-I-R-O—actually started AWS Newbies and she’s taught—ah, it’s hundreds of thousands of people how to actually just start with AWS, through a course designed for people who weren’t coders before. That kind of thing is next-level compared to anything I’ve ever done because you know, they have actually built a product and just given it away. I think that’s amazing.

Corey: At some level, building a product and giving it away sounds like, “You know, I want to never be lonely again.” Well, that’ll work because you’re always going to get support tickets. There’s an interesting narrative around how to wind up effectively managing the community, and users, and demands, based on open-source maintainers, that we’re all wrestling with as an industry, particularly in the wake of that whole log4j nonsense that we’ve been tilting at that windmill, and that’s going to be with us for a while. One last thing I want to talk about before we wind up calling this an episode is, you are one of the organizers of CDK Day. What is that?

Matt: Yeah, so CDK Day, it’s a complete community-organized conference. The past two have been worldwide, fully virtual just because of the situation we’re in. And I mean, they’ve been pretty popular. I think we had about 5000 people attend the last one, and the idea is, it’s a full day of the community just telling their stories of how they liked or disliked using the CDK. So, it’s not a marketing event; it’s not a sales event; we actually run the whole event on a budget of exactly $0. But yeah, it’s just a day of fun to bring the community together and learn a few things. And, you know, if you leave it thinking CDK is not for you, I’m okay with that as much as if you just make a few friends while you’re there.

Corey: This is the first time I’d realized that it wasn’t a formal AWS event. I almost feel like that’s the tagline that you should have under it. It’s—because it sounds like the CDK Day, again, like, it’s this pure evangelism: “This is why it’s great and why you should use it.” But I love conferences that embrace critical views. One of the first talks I ever built out that did anything beyond small user groups was “Heresy in the Church of Docker.”

Then they asked me to give that at ContainerCon, which was incredibly flattering. And I don’t think they made that mistake a second time, but it was great to just be willing to see some group of folks that are deeply invested in the technology, but also very open to hearing criticism. I think that’s the difference between someone who is writing a nuanced critique versus someone who’s just [pure-on 00:31:18] zealotry. “But the CDK is the answer to every technical problem you’ve got.” Well, I start to question the wisdom of how applicable it really is, and how objective you are. I’ve never gotten that vibe from you.

Matt: No, and that’s the thing. So, I mean, as we’ve worked out in this conversation, I don’t work for AWS, so it’s not my product. I mean, if it succeeds or if it fails, it doesn’t impact my livelihood. I mean, there are people on the team who I would be sad for, but the point is, my end goal is always the same. I want people to be enabled to rapidly deliver their software to help their customers.

If that’s CDK, perfect, but CDK is not for everyone. I mean, there are other options available in the market. And if, even, ClickOps is the way to go for you, I am happy for you. But if it’s a case of we can have a conversation, and I can help you get closer to where you need to be with some other tool, that’s where I want to be. I just want to help people.

Corey: And if I can do anything to help along that axis, please don’t hesitate to let me know. I really want to thank you for taking the time to speak with me and being so generous, not just with your time for this podcast, but all the time you spend helping the rest of us figure out which end is up, as we continue to find that the way we manage environments evolves.

Matt: Yeah. And, listen, just thank you for having me on today because I’ve been reading your tweets for two years, so I’m just starstruck at this moment to even be talking to you. So, thank you.

Corey: No, no. I understand that, but don’t worry, I put my pants on two legs at a time, just like everyone else. That’s right, the thought leader on Twitter, you have to jump into your pants. That’s the rule. Thanks again so much. I look forward to having a further conversation with you about this stuff as I continue to explore, well honestly, what feels like a brand new paradigm for how we manage code.

Matt: Yeah. Reach out if you need any help.

Corey: I certainly will. You’ll regret asking. Matt [Coulter 00:33:06], Technical Architect at Liberty Mutual. I’m Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you’ve enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you’ve hated this podcast, please leave a five-star review on your podcast platform of choice, write an angry comment, then click the submit button, but lie and say you hit the submit button via an API call.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production. Stay humble.
