Join me as I continue the Whiteboard Confessional series with a look at the necessity of internal tooling. I touch upon the fact that every business has a wealth of non-technical folks who might not be comfortable using SSH on the command line, why early-stage companies can’t always build internal tools but why they eventually need to, how the no-code and low-code movement makes it easier to build internal tooling via drag-and-drop UIs, why it’s important to handle access to said internal tools, and more.
Episode Show Notes & Transcript
About Corey Quinn
Over the course of my career, I’ve worn many different hats in the tech world: systems administrator, systems engineer, director of technical operations, and director of DevOps, to name a few. Today, I’m a cloud economist at The Duckbill Group, the author of the weekly Last Week in AWS newsletter, and the host of two podcasts: Screaming in the Cloud and, you guessed it, AWS Morning Brief, which you’re about to listen to.
Corey: Welcome to AWS Morning Brief: Whiteboard Confessional. I’m Cloud Economist Corey Quinn. This weekly show exposes the semi-polite lie that is whiteboard architecture diagrams. You see, a child can draw a whiteboard architecture, but the real world is a mess. We discuss the hilariously bad decisions that make it into shipping products, the unfortunate hacks the real world forces us to build, and that the best name for your staging environment is “theory”. Because invariably whatever you’ve built works in theory, but not in production. Let’s get to it.
Corey: This episode is brought to you by Trend Micro Cloud One™, a security services platform for organizations building in the Cloud. I know you're thinking that that's a mouthful because it is, but what's easier to say? “I'm glad we have Trend Micro Cloud One™, a security services platform for organizations building in the Cloud,” or, “Hey, bad news. It's going to be a few more weeks. I kind of forgot about that security thing.” I thought so. Trend Micro Cloud One™ is an automated, flexible all-in-one solution that protects your workflows and containers with cloud-native security. Identify and resolve security issues earlier in the pipeline, and access your cloud environments sooner, with full visibility, so you can get back to what you do best, which is generally building great applications. Discover Trend Micro Cloud One™, a security services platform for organizations building in the Cloud. Whew. At trendmicro.com/screaming.
In almost any production environment, there's going to be a few tasks as your company grows that someone winds up having to perform in your production app. And in many cases, the people who have to perform those tasks are themselves not excessively technical, which means if you fail to properly invest in internal tooling, well, that means you're going to have someone who winds up getting this, effectively, printed out page that hangs in their cubicle—or equivalent during these uncertain times—where they wind up following a checklist of, step one: SSH into a production server. Step two: copy and paste the following command, which in turn, I don't know, spins up a Ruby on Rails console, or does some task on the database and returns a query. Now, this is universally recognized as awful because, for better or worse, most business users are not overwhelmingly comfortable when it comes to using SSH on the command line.
Now, in an ideal world with unlimited resources, you would be able to have an internal tools developer who could focus on things like that specifically for your teams. And in fact, most very large hyper-scale companies have entire herds of people doing nothing but that. But when you're building something from scratch, and you're a relatively small, scrappy team, it's much more challenging because you take a step back and have to make some unfortunate and challenging determinations of, “Okay, am I going to A) sit here and have very expensive people build tooling, or B) have them work on features, which, you know, bring money into the company?” I'm not going to sit here and say that people are wrong for not investing in internal tooling early on.
But at some point, the longer you go without making those investments, the greater your risk is because someone is going to get something wrong. They're going to fat-finger a command somewhere; they're going to run it on the wrong system; a key pair is going to not do what it needs to do; some error-checking was not built into whatever script you're having them run, and a command is going to fail, but it's going to continue on as if it succeeded and potentially run the wrong thing in the wrong place. It effectively is setting up a recipe for disaster, and when this happens, as it inevitably will, the natural response is going to be to blame the poor schmuck who had to go ahead and run your crappy shell script command because you couldn't bother to invest in internal tooling. This is an area that's near and dear to my heart because it's something that I spend a fair bit of time worrying about myself. Again, I've built a ridiculous architecture that powers my newsletters, and I have a separate aspect of that, that lets my ad sales folks wind up injecting sponsor stuff into the newsletter for me.
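Editor's note: the failure mode described above, where one command fails but the script carries on as if it succeeded, can be guarded against with very little code. The following is a minimal sketch, not anything from the episode: the function name `run_steps` and the sample commands are hypothetical, and it assumes the checklist can be expressed as a list of commands run in order.

```python
import subprocess
import sys


def run_steps(steps):
    """Run each command in order, stopping at the first one that fails.

    `steps` is a list of argument lists, e.g. [["ls", "-l"], ...].
    Returns the steps that completed successfully.
    """
    completed = []
    for step in steps:
        # check=True makes subprocess.run raise CalledProcessError on a
        # non-zero exit status, so later steps never run after a failure.
        subprocess.run(step, check=True)
        completed.append(step)
    return completed


if __name__ == "__main__":
    # Hypothetical three-step task; the second step exits with an error.
    ok = [sys.executable, "-c", "print('step done')"]
    bad = [sys.executable, "-c", "import sys; sys.exit(3)"]
    try:
        run_steps([ok, bad, ok])
    except subprocess.CalledProcessError as err:
        print(f"stopped: {err.cmd} exited with status {err.returncode}")
```

The design choice is simply to fail loudly: the person running the task sees which step broke, and nothing after it runs against the wrong state.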
Fun fact that isn't super well known, I don't see any of the sponsor stuff that goes out in my newsletter until after I've already written that week's issue because I don't want to wind up finding myself having to change what I say to avoid irritating a sponsor, you know, like someone with a sense of self-preservation or an appreciation for maintaining their income might do. So, it's sort of an editorial firewall for me. In order for that to make sense, though, there was no way in the world I was going to get away with having people who are managing the ad sales portion, SSH-ing into a box, and running this arcane script that talks to DynamoDB. And, “Oh, yeah, just run this script; it invokes a lambda function, and—hey, where are you going? Come back,” is how that story is going to play out.
So, my initial approach was to look into what it would take to pay someone who's good at building web forms and front-end tooling. It turns out those people cost a lot of money. My approach was to ultimately use Retool, which I've talked about repeatedly on this show, but there are a lot of tools in this space. AWS Honeycode, for example, is one of the worst examples of something like this. The value there is that it ties together a bunch of APIs with a drag-and-drop Visual Basic style interface that lets you build internal web apps.
And their pricing model is such that you would never in a million years use this for anything public. But for internal tooling, it's a great approach. Sure, you need some developer time to set up the APIs, or the scripts that it calls on the back end, but it's really an accelerated function here because you don't need anyone to spend time on UI, past drag and drop. When it comes time to update something, you can wind up changing an API parameter or building a quick API on the other side and the interface remains remarkably consistent for users. There are a number of tools like this out there, and I'm a big fan of the no-code/low-code movement, specifically because it solves incredible business issues here.
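Editor's note: the "developer time to set up the APIs" mentioned above is often a small function that validates what the form submits before anything is written. Here is a rough sketch under stated assumptions: `add_sponsor_slot`, the field names, and the plain dictionary standing in for the data store are all hypothetical, and none of this reflects Retool's API or the actual newsletter system.

```python
from datetime import date

# Hypothetical fields the drag-and-drop form would submit.
REQUIRED_FIELDS = ("issue_date", "sponsor", "copy")


def add_sponsor_slot(store, payload):
    """Validate a form submission and record it in `store`.

    `store` is a plain dict standing in for a real database table.
    Raises ValueError instead of writing partial or malformed data.
    """
    missing = [
        field for field in REQUIRED_FIELDS
        if not isinstance(payload.get(field), str) or not payload[field].strip()
    ]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))

    # fromisoformat raises ValueError for anything that is not a valid date.
    key = date.fromisoformat(payload["issue_date"]).isoformat()
    if key in store:
        raise ValueError(f"issue {key} already has a sponsor slot")

    store[key] = {
        "sponsor": payload["sponsor"].strip(),
        "copy": payload["copy"].strip(),
    }
    return store[key]
```

Because the form only ever calls this one function, the storage behind it can change without the people using the form noticing.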
This episode is sponsored in part by our good friends over at ChaosSearch, which is a fully managed log analytics platform that leverages your S3 buckets as a data store with no further data movement required. Whether you're looking to process multiple terabytes up to petabyte-scale data a day, or just a few hundred gigabytes, this is still economical and worth looking into. You don't have to manage Elasticsearch yourself. If your ELK stack is falling over, take a look at using ChaosSearch for log analytics. Now, if you do a direct cost comparison, you're going to save 70 to 80 percent on the infrastructure costs, which does not include the actual expense of paying infrastructure people to mess around with running Elasticsearch themselves. You can take it from me or you can take it from many of their happy customers, but visit chaossearch.io today to learn more.
Now, please don't misunderstand me. I’m not having this conversation to shill for any particular product or service unless they're sponsoring this episode. I'm talking instead about the higher-level pattern of making sure that you take the time to invest in tooling before it winds up blowing up in your face, and you wind up inevitably blaming someone for something that isn't really their fault. Again, I run a business; I am extremely sympathetic to the fact that there's an infinite amount of work, and for most of us who are not funded by SoftBank, there's not an infinite amount of resources to throw at that infinite amount of work. So we make tradeoffs, we make decisions, and yes, there's always going to be something that is overly complicated and technical because either it doesn't happen frequently enough to wind up investing in tooling around, or it's complicated, or it's simple enough that, just run this one command and it'll be fine.
That can get you by for a surprising period of time, but eventually, someone is going to copy and paste something wrong, and it's going to lead to disaster. So, fundamentally, what I'm suggesting and advocating for here is invest at least a little bit of effort in getting to a point of internal tooling that doesn't require four to eight hours of training someone how the Linux command line works, which is nutty in this year of our Lord 2020, and give them something that looks a lot more like an internal web page. Now, as we have learned from a very public recent Twitter hack, you're going to want to be careful with how you handle access to said internal tools because at some point, what you're building is fundamentally going to look increasingly like an admin panel. From my perspective, for sending out my newsletter, there is no button inside any of these systems that will cause a newsletter to send.
This sounds like it’s an intelligent safety approach, but it's not. In fact, it's a limitation of ConvertKit’s lack of a broadcast API. In fact, after all of my highly technical stuff finishes, I have to copy and paste the HTML, like some kind of farm animal, into a web page. Now, that has a whole series of problems but the silver lining behind it is that if someone were to break into my newsletter production system, which is possible, all they would be able to do would be to muck up some of the content and delete some stuff, and there are backups of all of these things. It isn't going to ever get to a point where someone has gotten access to this stuff and now my career is ruined, or I have spammed a bunch of nonsense to my newsletter subscribers. Be very clear here: the stuff that I spam my newsletter subscribers with is highly intentional, and it is my, basically, rambling equivalent in text form of this podcast.
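Editor's note: on the earlier point about handling access to internal tools, one common pattern is to check every action against a per-role allow list and record the attempt either way. The sketch below is illustrative only: the roles, action names, and `authorize` function are hypothetical and are not taken from any system described in the episode.

```python
# Hypothetical mapping of roles to the actions each may perform.
PERMISSIONS = {
    "ad_sales": {"view_sponsors", "edit_sponsors"},
    "admin": {"view_sponsors", "edit_sponsors", "delete_issue"},
}


def authorize(user, action, audit_log):
    """Allow `action` only if the user's role permits it.

    Every attempt, allowed or denied, is appended to `audit_log`
    so there is a record of who tried to do what.
    """
    allowed = action in PERMISSIONS.get(user["role"], set())
    audit_log.append((user["name"], action, "allowed" if allowed else "denied"))
    if not allowed:
        raise PermissionError(f"{user['name']} may not perform {action}")
    return True
```

Unknown roles get an empty permission set, so the default is to deny rather than to allow.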
So, we've covered a fair bit of ground here. In summation, invest in internal tooling insofar as you can, understand there are going to be times that you're not going to be able to make those investments, and gain the wisdom to know the difference between those two scenarios before it blows up in your face, and you blame the wrong person for your own shortcomings.
This has been the AWS Morning Brief: Whiteboard Confessional, I'm Cloud Economist Corey Quinn. And if you've liked this podcast, please leave a five-star review on Apple Podcasts, whereas if you've hated this podcast, please leave a five-star review on Apple Podcasts as soon as you build a tool to do it for you automatically.
Thank you for joining us on Whiteboard Confessional. If you have terrifying ideas, please reach out to me on Twitter at @quinnypig and let me know what I should talk about next time.
Announcer: This has been a HumblePod production. Stay humble.