“Cloudash”ing onto Mac with Maciej Winnicki

Episode Summary

Figuring out what the hell is going on in an AWS environment can be a head scratcher, even on a good day. But thanks to the earnest work of folks like Maciej Winnicki, Founder of Cloudash, these itches are, indeed, being scratched. Maciej and Cloudash got put on blast by Corey on Twitter, but for all the right reasons! Maciej goes into the details on Cloudash and how it works as a native desktop app for Mac. He talks about the areas Cloudash covers, from Lambda to CloudFront and others. Corey is a happy paying customer, entirely of his own accord! Listen in as he and Maciej take a deep dive into the technical aspects of Cloudash and into what makes Maciej's creation really unique!

Episode Show Notes & Transcript

About Maciej
Maciej Winnicki is a serverless enthusiast with over 6 years of experience in writing software with no servers whatsoever. Serverless Engineer at Stedi, Cloudash Founder, ex-Engineering Manager, and one of the early employees at Serverless Inc.



Transcript
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.


Corey: This episode is sponsored in part by LaunchDarkly. Take a look at what it takes to get your code into production. I’m going to just guess that it’s awful because it’s always awful. No one loves their deployment process. What if launching new features didn’t require you to do a full-on code and possibly infrastructure deploy? What if you could test on a small subset of users and then roll it back immediately if results aren’t what you expect? LaunchDarkly does exactly this. To learn more, visit launchdarkly.com and tell them Corey sent you, and watch for the wince.


Corey: This episode is sponsored in part by our friends at Rising Cloud, which I hadn’t heard of before, but they’re doing something vaguely interesting here. They are using AI, which is usually where my eyes glaze over and I lose attention, but they’re using it to help developers be more efficient by reducing repetitive tasks. So, the idea being that you can run stateless things without having to worry about scaling, placement, et cetera, and the rest. They claim significant cost savings, and they’re able to wind up taking what you’re running as it is in AWS with no changes, and run it inside of their data centers that span multiple regions. I’m somewhat skeptical, but their customers seem to really like them, so that’s one of those areas where I really have a hard time being too snarky about it because when you solve a customer’s problem and they get out there in public and say, “We’re solving a problem,” it’s very hard to snark about that. Multus Medical, Construx.ai and Stax have seen significant results by using them. And it’s worth exploring. So, if you’re looking for a smarter, faster, cheaper alternative to EC2, Lambda, or Batch, consider checking them out. Visit risingcloud.com/benefits. That’s risingcloud.com/benefits, and be sure to tell them that I sent you because watching people wince when you mention my name is one of the guilty pleasures of listening to this podcast.


Corey: Welcome to Screaming in the Cloud. I’m Cloud Economist Corey Quinn. And my guest today is Maciej Winnicki, who is the founder of Cloudash. Now, before I dive into the intricacies of what that is, I’m going to just stake out a position that one of the biggest painful parts of working with AWS in any meaningful sense, particularly in a serverless microservices way, is figuring out what the hell’s going on in the environment. There’s a bunch of tools offered to do this and they’re all—yeee, they aspire to mediocrity. Maciej, thank you for joining me today.


Maciej: Thank you for having me.


Corey: So, I turned out to have accidentally blown up Cloudash, sort of before you were really ready for the attention. You, I think, tweeted about it or put it on Hacker News or something; I stumbled over it because it turns out that anything that vaguely touches cloud winds up in my filters because of awesome technology, and personality defects on my part. And I tweeted about it as I set it up and got the thing running, and apparently this led to a surge of attention on this thing that you’ve built. So, let me start off with an apology. Oops, I didn’t realize it was supposed to be a quiet launch.


Maciej: I actually thank you for that. Like, that was great. And we got a lot of attention from your tweet thread, actually, because at the end, that was the most critical part. At the end of the Twitter thread, you wrote that you’re staying as a customer, so we have it on our website and this is perfect. But actually, as you said, that’s correct.


Our marketing strategy for releasing Cloudash was to post it on LinkedIn. I know this is not, kind of, the best strategy, but that was our plan. Like, it was like, hey, like, me and my friend, Tomasz, who’s also working on Cloudash, we thought like, let’s just post it on LinkedIn and we’ll see how it goes. And accidentally, I’m receiving a notification from Twitter, “Hey, Corey started tweeting about it.” And I was like, “Oh, my God, I’m having a heart attack.” But then I read the, you know—


Corey: Oops.


Maciej: [laugh]. Yeah. I read the, kind of, conclusion, and I was super happy. And again, thank you for that because this is actually when Cloudash kind of started rolling as a product and as a, kind of, business. So yeah, that was great.


Corey: To give a little backstory and context here is, I write a whole bunch of serverless nonsense. I build API’s Gateway, I hook them up to Lambda’s Function, and then it sort of kind of works. Ish. From there, okay, I would try and track down what was going on because in a microservices land, everything becomes a murder mystery; you’re trying to figure out what’s broken, and things have exploded. And I became a paying customer of IOpipe. And then New Relic bought them. Well, crap.


Then I became a paying customer of Epsagon. And they got acquired by Cisco, at which point I immediately congratulated the founders, who I know on a social basis, and then closed my account because I wanted to get out before Cisco ruins it because, Cisco. Then it was, what am I going to use next? And right around that time is when I stumbled across Cloudash. And it takes a different approach than any other entity in the space that I’ve seen because you are a native Mac desktop app. I believe you’re Mac only, but you seem to be Electron, so I could be way off base on that.


Maciej: So, we’re Linux as well right now and soon we’ll be Windows as well. But yeah, so, right now it’s macOS and Linux. Yeah, that’s correct. So, our approach is a little bit different.


So, let me start by saying what’s Cloudash? Like, Cloudash is a desktop app for, kind of, monitoring and troubleshooting serverless architectures, services, like, serverless stuff in general. And the approach that we took is a little bit different because we are not web-based, we’re desktop-based. And there’s a couple of advantages of that approach. The first one is that, like, you don’t need to share your data with us because we’re not, kind of, downloading your metrics and logs to our back end to process them, et cetera, et cetera. We are just using the credentials, the AWS profiles that you have defined on your computer, so nothing goes out of your AWS account.


And I think this is, like, considering, like, from the security perspective, this is very crucial. You don’t need to create a role that you give us access to or anything like that. You just use the stuff that you have on your desktop, and everything stays on your AWS account. So, nothing—we don’t download it, we don’t process it, we don’t do anything from that. And that’s one approach—well, that’s the one advantage. The other advantage is, like, kind of, onboarding, as I kind of mentioned because we’re using the AWS profiles that you have defined in your computer.
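The idea of reusing the AWS profiles already defined on the machine can be sketched in a few lines. This is a minimal illustration, not how Cloudash itself is implemented: it assumes the standard AWS shared config file at `~/.aws/config`, where the default profile is a `[default]` section and named profiles are `[profile NAME]` sections. The function names `list_profiles` and `load_local_profiles` are hypothetical.

```python
import configparser
from pathlib import Path


def list_profiles(config_text: str) -> list[str]:
    """Return the profile names declared in AWS shared config file text."""
    # interpolation=None so values containing '%' are never expanded.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)

    profiles = []
    for section in parser.sections():
        # The default profile is "[default]"; named profiles are
        # "[profile NAME]". Other sections (e.g. sso-session) are skipped.
        if section == "default":
            profiles.append("default")
        elif section.startswith("profile "):
            profiles.append(section[len("profile "):].strip())
    return profiles


def load_local_profiles(path: Path = Path.home() / ".aws" / "config") -> list[str]:
    """Read profiles from the local config file; nothing leaves the machine."""
    return list_profiles(path.read_text()) if path.exists() else []
```

A desktop tool built this way only ever reads what the user's own command-line tooling already reads, which is the onboarding and security point being made above.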


Corey: Well, you’re doing significantly more than that because I have a lot of different accounts configured different ways, and when I go to one of them that uses SSO, it automatically fires me off to the SSO login page if I haven’t logged in that day for a 12 hour session—


Maciej: Yes.


Corey: —for things that have credentials stored locally, it uses those; and for things that use role-chaining to assume roles from the things I have credentials for, it just does the role assumption, and it works flawlessly. It just works the way that most of my command-line tools do. I’ve never seen a desktop app that does this.


Maciej: Yeah. So, we put a lot of effort into making sure that this works great because we know that, like, no one will use Cloudash if there’s—like, not no one, but like, we’re targeting, like, serverless teams, maybe, in enterprise companies, or serverless teams working on some startups. And in most cases, those teams or those engineers, they use SSO, or at least MFA, right? So, we have it covered. And as you said, like, the onboarding part should be really easy because you just pick your AWS profile, you just pick a region, and just pick, right now, a CloudFormation stack because we get the information about your service based on the CloudFormation stack. So yeah, we put a lot of effort into making sure that this works without any issues.
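To make the "service from a CloudFormation stack" idea concrete, here is a hedged sketch of turning a stack's resource list into the pieces a serverless dashboard would show. The input dictionaries follow the shape of the entries in a CloudFormation ListStackResources response (`ResourceType`, `PhysicalResourceId`); the choice of resource types and the name `group_stack_resources` are assumptions for illustration, and the conversation does not describe Cloudash's actual implementation.

```python
from collections import defaultdict

# Illustrative subset of resource types; a real tool would likely track more.
INTERESTING_TYPES = {
    "AWS::Lambda::Function": "functions",
    "AWS::ApiGateway::RestApi": "rest_apis",
}


def group_stack_resources(summaries):
    """Group stack resources by kind, keeping only the types of interest."""
    grouped = defaultdict(list)
    for resource in summaries:
        bucket = INTERESTING_TYPES.get(resource["ResourceType"])
        if bucket:
            grouped[bucket].append(resource["PhysicalResourceId"])
    return dict(grouped)
```

Everything else in the stack (IAM roles, buckets, and so on) is simply ignored, which keeps the resulting view limited to what someone troubleshooting a function or an API needs.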


Corey: There are some challenges to it that I saw while setting it up, and that’s also sort of the nature of the fact you are, in fact, integrating with CloudWatch. For example, it’s region specific. Well, what if I want to have an app that’s multi-region? Well, you’re going to have a bad time because doing [laugh] anything multi-region in AWS means you’re going to have a bad time. That gets particularly obnoxious and easy to get to when you’re doing something like Lambda@Edge, where, oh, where do the logs live? That’s going to be in a CloudFront distribution in whatever region it winds up being accessed from. So, it comes down to what distribution endpoint or point of presence did that particular request go through, and it becomes this giant game of whack-a-mole. It’s frustrating, and it’s obnoxious, and it’s also in no way your fault.


Maciej: Yeah, I mean, we are at the beginning. Right now, it’s the most straightforward, kind of pe—how people think about stacks of serverless. They think in terms of regions because I think for us, regions, or replicated stacks, or things like that are not really popular yet. Maybe they will become—like, this is how AWS works as a whole, so it’s not surprising that we’re kind of following this path. I think my point is that our main goal, the ultimate goal, is to make monitoring and, as I said, troubleshooting serverless apps as simple as possible.


So, once we hear from our customers, from our users that, “Hey, we would like to get a little bit better experience around regions,” we will definitely implement that because why not, right? And I think the whole point of Cloudash—and maybe we can go deeper into that later—is that we want to bring context into your metrics and logs. If you’re seeing, for example, an X-Ray trace ID in your logs, you should be able with one click to just see that trace. It’s not yet implemented in Cloudash, but we have it in the backlog. But my point is that, like, there should be some journey when you’re debugging stuff, and you shouldn’t be just, like, left alone having, like, 20 tabs, CloudWatch tabs open and trying to figure out where I was—like, where’s the Lambda? Where’s the API Gateway logs? Where are the CloudFront logs? And how can I kind of connect all of that? Because that’s—it’s an issue right now.


Corey: Even what you’ve done so far is incredibly helpful compared to the baseline experience that folks will often have, where I can define a service that is comprised of a number of different functions—I have one set up right now that has seven functions in it—I grab any one of those things, and I can set how far the lookback is, when I look at that function, ranging from 5 minutes to 30 days. And it shows me at the top the metrics of invocations, the duration that the function runs for, and the number of errors. And then, in the same pane down below it, it shows the CloudWatch logs. So, “Oh, okay, great. I can drag and zoom into a specific timeframe, and I see just the things inside of that.”
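The view described here, a lookback from 5 minutes to 30 days with invocations, duration, and errors above the logs, maps fairly directly onto CloudWatch concepts. The sketch below only builds the request parameters and calls no AWS API. It assumes the `AWS/Lambda` namespace with its `Invocations`, `Duration`, and `Errors` metrics, the default `/aws/lambda/<function>` log group name, and the query shape used by CloudWatch GetMetricData. The helper names, the target of roughly 120 datapoints, and the rounding rule are illustrative choices, not Cloudash's behavior.

```python
import math
from datetime import datetime, timedelta, timezone

FIFTEEN_DAYS_MINUTES = 15 * 24 * 60


def lookback_window(minutes, now=None, target_points=120):
    """Return (start, end, period_seconds) for a lookback of `minutes`."""
    end = now or datetime.now(timezone.utc)
    start = end - timedelta(minutes=minutes)
    # Assumption: older data is only available at coarser granularity, so
    # use 5-minute multiples beyond 15 days and 1-minute multiples otherwise.
    step = 300 if minutes > FIFTEEN_DAYS_MINUTES else 60
    raw_period = (minutes * 60) / target_points
    period = max(step, math.ceil(raw_period / step) * step)
    return start, end, period


def lambda_metric_queries(function_name, period):
    """Build GetMetricData-style queries for the three headline metrics."""
    specs = [
        ("invocations", "Invocations", "Sum"),
        ("duration", "Duration", "Average"),
        ("errors", "Errors", "Sum"),
    ]
    return [
        {
            "Id": query_id,
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/Lambda",
                    "MetricName": metric_name,
                    "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
                },
                "Period": period,
                "Stat": stat,
            },
        }
        for query_id, metric_name, stat in specs
    ]


def lambda_log_group(function_name):
    """Default CloudWatch Logs group name for a Lambda function."""
    return f"/aws/lambda/{function_name}"
```

Dragging to zoom into a narrower timeframe would then amount to calling `lookback_window` again with a smaller range and reusing the same start and end for both the metric queries and the log query.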


And I know this sounds like well, what’s the hard part here? Yeah, except nothing else does it in an easy-to-use, discoverable way that just sort of hangs out here. Honestly, the biggest win for me is that I don’t have to log in to the browser, navigate through some ridiculous other thing to track down what I’m talking about. It hangs out on my desktop all the time, and whether it’s open or not, whenever I fire it up, it just works, basically, and I don’t have to think about it. It reduces the friction from, “This thing is broken,” to, “Let me see what the logs say.”


Very often I can go from not having it open at all to staring at the logs and having to wait a minute because there’s some latency before the event happens and it hits CloudWatch logs itself. I’m pretty impressed with it, and I’ve been keeping an eye on what this thing is costing me. It is effectively nothing in terms of CloudWatch retrieval charges. Because it’s not sitting there sucking all this data up all the time, for everything that’s running. Like, we’ve all seen the monitoring system that winds up costing you more in ancillary fees than they charge you. This doesn’t do that.


I also—while we’re talking about money, I want to make very clear—because disclaiming the direction the money flows in is always important—you haven’t paid me a dime, ever, to my understanding. I am a paying customer at full price for this service, and I have been since I discovered it. And that is very much an intentional choice. You did not sponsor this podcast, you are not paying me to say nice things. We’re talking because I legitimately adore this thing that you’ve built, and I want it to exist.


Maciej: That’s correct. And again, thank you for that. [laugh].


Corey: It’s true. You can buy my attention, but not my opinion. Now, to be clear, when I did that tweet thread, I did get the sense that this was something that you had built as sort of a side project, as a labor of love. It does not have VC behind it, of which I’m aware, and that’s always going to, on some level, shade how I approach a service and how critical I’m going to be on it. Just because it’s, yeah, if you’ve raised a couple hundred million dollars and your user experience is trash, I’m going to call that out.


But if this is something where you just soft launched, yeah, I’m not going to be a jerk about weird usability bugs here. I might call it out as “Ooh, this is an area for improvement,” but not, “What jackwagon thought of this?” I am trying to be a kinder, gentler Corey in the new year. But at the same time, I also want to be very clear that there’s room for improvement on everything. What surprised me the most about this is how well you nailed the user experience despite not having a full team of people doing UX research.


Maciej: That was definitely a priority. So, maybe a little bit of history. So, I started working on Cloudash, I think it was April… 2019. I think? Yeah. It’s 2021 right now. Or we’re 2022. [unintelligible 00:11:33].


Corey: Yeah. 2022, now. I—


Maciej: I’m sorry. [laugh].


Corey: —I’ve been screwing that up every time I write the dates myself, I’m with you.


Maciej: [laugh]. Okay, so I started working on Cloudash, in 2020, April 2020.


Corey: There we go.


Maciej: So, after eight months, I released some beta, like, free; you could download it from GitHub. Like, you can still download on GitHub, but at that time, there was no license, you didn’t have to buy a license to run it. So, it was, like, very early, like, 0.3 version that was working, but sort of, like, [unintelligible 00:12:00] working. There were some bugs.


And that was the first time that I tweeted about it on Twitter. It got some attention, and, like, some people started using it. I got some feedback, very initial feedback. And I was like, every time I open Cloudash, I get the sense that, like, this is useful. I’m talking about my own tool, but like, [laugh] that’s the thing.


So, further in the history. So, I’m kind of a serverless engineer on my own. I am a software engineer, I started focusing on serverless, in, like, 2015, 2016. I was working for Serverless Inc. as an early employee.


I was then working as an engineering manager for a couple of companies. I work as an engineering manager right now at Stedi; we’re also, like, fully serverless. So I’m, kind of, trying to fix my own issues with serverless, or trying to improve the whole experience around serverless in AWS. So, that’s the main purpose why we’re building Cloudash: Because we want to improve the experience. And one use case I’m often mentioning is that, let’s say that you’re kind of on duty. Like, so in the middle of the night PagerDuty is calling you, so you need to figure out what’s going on with your Lambda or API Gateway.


Corey: Yes. PagerDuty, the original [Call of Duty: Nagios 00:13:04]. “It’s two in the morning; who is it?” “It’s PagerDuty. Wake up, jackass.” Yeah. We all had those moments.


Maciej: Exactly. So, the PagerDuty is calling you and you’re, kind of, in the middle of night, you’re not sure what’s going on. So, the kind of thing that we want to optimize is from waking up into understanding what’s going on with your serverless stuff should be minimized. And that’s the purpose of Cloudash as well. So, you should just run one tool, and you should immediately see what’s going on. And that’s the purpose.


And probably with one or two clicks, you should see the logs responsible, for example, in your Lambda. Again, like that’s exactly what we want to cover, that was the initial thing that we want to cover, to kind of minimize the time you spent on troubleshooting serverless apps. Because as we all know, kind of, the longer it’s down, the less money you make, et cetera, et cetera, et cetera.


Corey: This episode is sponsored by our friends at Oracle Cloud. Counting the pennies, but still dreaming of deploying apps instead of "Hello, World" demos? Allow me to introduce you to Oracle's Always Free tier. It provides over 20 free services in infrastructure, networking, databases, observability, management, and security. And—let me be clear here—it's actually free. There's no surprise billing until you intentionally and proactively upgrade your account. This means you can provision a virtual machine instance or spin up an autonomous database that manages itself, all while gaining the networking, load balancing, and storage resources that somehow never quite make it into most free tiers, needed to support the application that you want to build. With Always Free, you can do things like run small-scale applications or do proof-of-concept testing without spending a dime. You know that I always like to put asterisks next to the word free. This is actually free, no asterisk. Start now. Visit snark.cloud/oci-free. That's snark.cloud/oci-free.


Corey: One of the things that I appreciate about this is that I have something like five different microservices now that power my newsletter production pipeline every week. And periodically, I’ll make a change and something breaks because testing is something that I should really get around to one of these days, but when I’m the only customer, cool. Doesn’t really matter until suddenly I’m trying to write something and it doesn’t work. Great. Time to go diving in, and always I’m never in my best frame of mind for that because I’m thinking about writing for humans not writing for computers. And that becomes a challenge.


And okay, how do I get to the figuring out exactly what is broken this time? Regression testing: It really should be a thing more than it has been for me.


Maciej: You should write those tests. [laugh].


Corey: Yeah. And then I fire this up, and okay, great. Which sub-service is it? Great. Okay, what happened in the last five minutes on that service? Oh, okay, it says it failed successfully in the logs. Okay, that’s on me. I can’t really blame you for that. But all right.


And then it’s a matter of adding more [print or 00:14:54] debug statements, and understanding what the hell is going on, mostly that I’m bad at programming. And then it just sort of works from there. It’s a lot easier to, I guess, to reason about this from my perspective than it is to go through the CloudWatch dashboards, where it’s okay, here’s a whole bunch of metrics on different graphs, most of which you don’t actually care about—as opposed to unified view that you offer—and then “Oh, you want to look at logs, that’s a whole separate sub-service. That’s a different service team, obviously, so go open that up in another browser.” And I’m sitting here going, “I don’t know who designed this, but are there any windows in their house? My God.”


It’s just the saddest thing I can possibly experience when I’m in the middle of trying to troubleshoot. Let’s be clear, when I’m troubleshooting, I am in no mood to be charitable to anyone or anything, so that’s probably unfair to those teams. But by the same token, it’s intensely frustrating when I keep smacking into limitations that get in my way while I’m just trying to get the thing up and running again.


Maciej: As you mentioned about UX, like, we’ve spent a lot of time thinking about the UX, trying different approaches, trying to understand which metrics are the most important. And as we all know, kind of, serverless simplifies a lot of stuff, and there’s, like, way less metrics that you need to look into when something is happening, but we want to make sure that the stuff that we show—which is duration, errors, and p95—are probably the most important in most cases, so like, covering most of this stuff. So sorry, I didn’t mention that before; it was very important from the very beginning. And also, like, literally, I spent a lot of time, like, working on the colors, which sounds funny, [laugh] but I wanted to get them right. We’re not yet working on dark mode, but maybe soon.


Anyways, the visual part, it’s always close to my heart, so we spent a lot of time on it. Going back to what you just said. So, definitely the experience around using CloudWatch right now, and CloudWatch logs, CloudWatch metrics, is not really tailored for any specific use case because they have to be generic, right? Because AWS has, like, I don’t know, like, 300, or whatever number of services, probably half of them producing logs—maybe not half, maybe—


Corey: We shouldn’t name a number because they’ll release five more between now and when this publishes in 20 minutes.


Maciej: [laugh]. So, CloudWatch has to be generic. What we want to do with Cloudash is to take those generic tools—because we use, of course, CloudWatch logs, CloudWatch metrics, we fetch data from them—but make the visual part more tailored for a specific use case—in our case, it’s the serverless use case—and make sure that it’s really, kind of—it shows only the stuff that you need to see, not everything else. So again, like that’s the main purpose. And then one more thing, we—like this is also some kind of measurement of success, we want to reduce the number of tabs that you need to have open in your browser when you’re dealing with CloudWatch. So, we tried to put the most important stuff in one view so you don’t need to flip between tabs, as you usually do when you try to understand some kind of broader scope, or broader context of your, you know, error in Lambda.


Corey: What inspired you to do this as a desktop application? Because a lot of companies are doing similar things, as SaaS, as webapps. And I have to—as someone who yourself—you’re a self-described serverless engineer—it seems to me that building a webapp is sort of like the common description use case of a lot of serverless stuff. And you’re sitting here saying, “Nope, it’s desktop app time.” Which again, I’m super glad you did. It’s exactly what I was looking for. How do you get here?


Maciej: I’d been thinking about both kinds of types of apps. So like, definitely webapp was the initial idea how to build something, it was the webapp. Because as you said, like, that’s the default mode. Like, we are thinking webapp; like, let’s build a webapp because I’m an engineer, right? There is some inspiration coming from Dynobase, which was made by a friend [unintelligible 00:18:55] who also lives in Poland—I didn’t mention that; we’re based in [Poznań 00:18:58], Poland.


And when I started thinking about it, there’s a lot of benefits of using this approach. The biggest benefit, as I mentioned, is security; and the second benefit is it’s just, like, the most cost-effective because we don’t need to run a backend, right? We don’t need to download all your metrics, all your logs. I think, like, let’s think about it, like, from that perspective. Listen, so for everyone in the company to start working, they have to download all of your stuff from your AWS account. Like, that sounds insane because you don’t need all of that stuff elsewhere.


Corey: Store multiple copies of it. Yeah I, generally when I’m looking at this, I care about the last five to ten minutes.


Maciej: Exactly.


Corey: I don’t—


Maciej: Exactly.


Corey: —really care what happened three-and-a-half years ago on this function. Almost always. But occasionally I want to look back at, “Oh, this has been breaking. How long has it been that way?” But I already have that in the AWS environment unless I’ve done the right thing and turned on, you know, log expiry.


Maciej: Exactly. So, this is a lot of, like, I don’t want to be, like, you know, mean to anyone but like, that’s a lot of waste. Like, that’s a lot of waste of compute power because you need to download it; of cost because you need to get this data out of AWS, which you need to pay for, you know, GetMetricData and stuff like this. So, you need to—


Corey: And almost all of it’s—what is it? Write once, read never. Because it’s, you don’t generally look at these things.


Maciej: Yeah, yeah. Exactly.


Corey: And so much of this, too, for every invocation I have, even though it’s low traffic stuff, it’s the start with a request ID and what version is running, it tells me ‘latest.’ Helpful. A single line of comment in this case says ‘200.’ Why it says that, I couldn’t tell you. And then it says ‘End request ID.’ The end.


Now, there’s no way to turn that off unless you disabled the ability to write to CloudWatch logs in the function, but ingest on that cost 50 cents a gigabyte, so okay, I guess that’s AWS’s money-making scam of the year. Good for them. But there’s so much of that, it’s like looking at—like, when things are working, it’s like looking at a low traffic site that’s behind a load balancer, where there’s a whole—you have gigabytes, in some cases, of load balancer—of web server logs on the thing that’s sitting in your auto-scaling group. And those logs are just load balancer health checks. 98% of it is just that.


Same type of problem here, I don’t care about that, I don’t want to pay to store it, I certainly don’t want to pay to store it twice. I get it, that makes an awful lot of sense. It also makes your security job a hell of a lot easier because you’re not sitting on a whole bunch of confidential data from other people. Because, “Well, it’s just logs. What could possibly be confidential in there?” “Oh, my sweet summer child, have you seen some of the crap people put in logs?”
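For a rough sense of scale on the ingest charge mentioned above, the arithmetic is simple enough to write down. This is a back-of-the-envelope sketch: the 50 cents per gigabyte comes from the conversation, while the bytes-per-invocation figure, the decimal definition of a gigabyte, and the function name are assumptions for illustration.

```python
BYTES_PER_GB = 10**9  # assumption: decimal gigabytes


def estimate_ingest_cost(invocations, bytes_per_invocation, price_per_gb=0.50):
    """Estimate log ingest cost in dollars for a number of invocations."""
    total_gb = invocations * bytes_per_invocation / BYTES_PER_GB
    return total_gb * price_per_gb
```

Under these assumed numbers, one million invocations that each write about 500 bytes of start, end, and status lines come to 0.5 GB, or roughly 25 cents of ingest. Shipping the same logs to a second system would mean paying to move and store that mostly unread data again.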


Maciej: I’ve seen many things in logs. I don’t want to mention them. But anyways—and also, you know, like, usually when you give access to your AWS account, it can ruin you. You know, like, there might be a lot of—like, you need to really trust the company to give access to your AWS account. Of course, in most cases, the roles are scoped to, you know, only CloudWatch stuff, actions, et cetera, et cetera, but you know, like, there are some situations in which something may not be properly provisioned. And then you give access to everything.


Corey: And you can get an awful lot of data you wouldn’t necessarily want out of that stuff. Give me just the PDF printout of last month’s bill for a lot of environments, and I can tell you disturbing levels of detail about what your architecture is, just because when you—you can infer an awful lot.


Maciej: Yeah.


Corey: Yeah, I hear you. It makes your security story super straightforward.


Maciej: Yeah, exactly. So, I think, just to repeat the inspiration: when I started thinking about Cloudash, like, definitely one of the inspirations was Dynobase, which is, kind of, a GUI, like, a more powerful UI for DynamoDB. So, if you’re interested in that stuff, you can also check this out.


Corey: Oh, yeah, I’ve been a big fan of that, too. That’ll be a separate discussion on a different episode, for sure.


Maciej: [laugh]. Yeah.


Corey: But looking at all of this, looking at the approach of, the only real concern—well, not even a concern. The only real challenge I have with it for my use case is that when I’m on the road, the only thing that I bring with me for a computer is my iPad Pro. I’m not suggesting by any means that you should build this as an iPad app; that strikes me as, like, 15 levels of obnoxious. But it does mean that sometimes I still have to go diving into the CloudWatch console when I’m not home. Which, you know, without this, without Cloudash, that’s what I was doing originally anyway.


Maciej: You’re the only person that requested that. And we will put that into backlog, and we will get to that at some point. [laugh].


Corey: No, no, no. Smart question is to offer me a specific enterprise tier pricing—


Maciej: Oh, okay. [laugh].


Corey: —that is eye-poppingly high. It’s like, “Hey, if you want to subsidize feature development, we’re thrilled to empower that.” But—


Maciej: [laugh]. Yeah, yeah. To be honest, like, that would be hard to write [unintelligible 00:23:33] implement as iPad app, or iPhone app, or whatever because then, like, what’s the story behind? Like, how can I get the credentials, right? It’s not possible.


Corey: Yeah, you’d have to have some fun with that. There are a couple of ways I can think of offhand, but then that turns into a sandboxing issue, and it becomes something where you have to store credentials locally, regardless, even if they’re ephemeral. And that’s not great. Maybe turn it into a webapp someday or something. Who knows.


What I also appreciate is that we had a conversation when you first launched, and I wound up basically going on a Zoom call with you and more or less tearing apart everything you’ve built—in an ideally constructive way—but looking at a lot of the things you’ve changed in your website, you listened to an awful lot of feedback. You doubled your pricing, for example. Used to be ten bucks a month; now you’re twenty. Great. I’m a big believer in charging more.


You absolutely add that kind of value because it’s, “Well, twenty bucks a month for a desktop app. That sounds crappy.” It’s, “Yeah, jackwagon, what’s your time worth?” I was spending seven bucks a month in serverless charges, and 120 or 130 a month for Epsagon, and I was thrilled to pieces to be doing it because the value I got from being able to quickly diagnose what the hell was going on far outstripped the actual cost of doing these things. Don’t fall into the trap of assuming that well, I shouldn’t pay for software. I can just do it myself. Your time is never free. People think it is, but it’s not.


Maciej: That’s true. The original price of $9.99, I think that price was the launch promo. After some time, we’ve decided—and after adding more features: API Gateway support—we’ve decided that this is, like, solving way more problems, so like, you should probably pay a little bit more for that. But you’re kind of lucky because you subscribed to it when it was 9.99, and this will be your kind of prize for the end of, you know—


Corey: Well, I’m going to argue with you after the show to raise the price on mine, just because it’s true. It’s the—you want to support the things that you want to exist in the world. I also like the fact that you offered an annual plan because I will go weeks without ever opening the app. And that doesn’t mean it isn’t adding value. It’s that oh, yeah, I will need that now that I’m hitting these issues again.


And if I’m paying on a monthly basis, and it shows up with a, “Oh, you got charged again.” “Well, I didn’t use it this month; I should cancel.” And [unintelligible 00:25:44] to an awful lot of subscriber churn. But in the course of a year, if I don’t have at least one instance in which case, wow, that ten minute span justified the entire $200 annual price tag, then, yeah, you built the wrong thing or it’s not for me, but I can think of three incidents so far since I started using it in the past four months that have led to that being worth everything you will charge me a year, and then some, just because it made it so clear what was breaking.


Maciej: So, in that regard, we are also thinking about the team licenses, that’s definitely on the roadmap. There will be some changes to that. And we’re definitely working on more and more features. And if we’re—like, the roadmap is mostly about supporting more and more AWS services, so right now it’s Lambda, API Gateway, we’re definitely thinking about SQS, SNS, to get some sense how your messages are going through, probably something, like, DynamoDB metrics. And this is all kind of serverless, but why not go wider? Like, why not go to Fargate? Like, Fargate is theoretically serverless, but you know, like, it’s serverless on—


Corey: It’s serverless with a giant asterisk next to it.


Maciej: Yeah, [laugh] exactly. So, but why not? Like, it’s exactly the same thing in terms of, there is some user flow, there is some user journey, when you want to debug something. You want to go from API Gateway, maybe to the container to see, I don’t know, like, DynamoDB metric or something like that, so it should be all easy. And this is definitely something.


Later, why not EC2 metrics? Like, it would be a little bit harder. But I’m just saying, like, first thing here is that you are not, like, at this point, we are serverless, but once we cover serverless, why not going wider? Why not supporting more and more services and just making sure that all those use cases are correctly modeled with the UI and UX, et cetera?


Corey: That’s going to be an interesting challenge, just because that feels like what a lot of the SaaS monitoring and observability tooling has done. And then you fire this thing up, and it looks an awful lot like the AWS console. And it’s, “Yeah, I just want to look at this one application that doesn’t use any of the rest of those things.” Again, I have full faith and confidence in your ability to pull this off. You clearly have done that well based upon what we’ve seen so far. I just wonder how you’re going to wind up tackling that challenge when you get there.


Maciej: And maybe not EC2. Maybe I went too far. [laugh].


Corey: Yeah, honestly, even EC2-land, it feels like that is more or less a solved problem. If you want to treat it as a bunch of EC2, you can use Nagios. It’s fine.


Maciej: Yeah, totally.


Corey: There are tools that have solved that problem. But not much that I’ve seen has solved the serverless piece the way that I want it solved. You have.


Maciej: So, it’s definitely a long road to make sure that the serverless—and by serverless, I mean serverless how AWS understands serverless, so including Fargate, for example. So, there’s a lot of stuff that we can improve. There’s a lot of stuff that we can make easier with Cloudash than it is with CloudWatch. Just staying inside serverless, it will take us a lot of time to make sure that it is all correct, correctly modeled, correctly designed, et cetera. So yeah, I went too far with EC2, sorry.


Corey: Exactly. That’s okay. We all go too far with EC2, I assure you.


Maciej: Sorry everyone using EC2 instances. [laugh].


Corey: If people want to kick the tires on it, where can they find it?


Maciej: They can find it on cloudash.dev.


Corey: One D in the middle. That one throws me sometimes.


Maciej: One D. Actually, after talking to you, we have a double-D domain as well, so we can also try ‘Clouddash’ with double-D. [laugh].


Corey: Excellent, excellent. Okay, that is fantastic. Because I keep trying to put the double-D in when I’m typing it in my search tool on my desktop, and it doesn’t show up. And it’s like, “What the—oh, right.” But yeah, we’ll get there one of these days.


Maciej: Only the domain. It’s only the domain. You will be redirected to single-D.


Corey: Exactly.


Maciej: [laugh].


Corey: We’ll have to expand later; I’ll finance the feature request there. It’ll go well. If people want to learn more about what you have to think about these things, where else can they find you?


Maciej: On Twitter, and my Twitter handle is @mthenw. M-then-W, which is M-T-H—mthenw. And my co-founder @tlakomy. You can probably add that to [show notes 00:29:35]. [laugh].


Corey: Oh, I certainly will. It’s fine, yeah. Here’s a whole bunch of letters. I hear you. My Twitter handle used to be my amateur radio callsign. It turns out most people don’t think like that. And yeah, it’s become an iterative learning process. Thank you so much for taking the time to speak with me today and for building this thing. I really appreciate both of them.


Maciej: Thank you for having me here. I encourage everyone to visit cloudash.dev, if you have any feature requests, any questions just send us an email at [email protected], or just go to GitHub repository in the issues; just create an issue, describe what you want and we can talk about it.


We are always happy to help. The main purpose, the ultimate goal of Cloudash is to make the serverless engineer’s life easier, on very high level. And on a little bit lower level, just to make, you know, troubleshooting and debugging serverless apps easier.


Corey: Well, from my perspective, you’ve succeeded.


Maciej: Thank you.


Corey: Thank you. Maciej Winnicki, founder of Cloudash. I’m Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you’ve enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you’ve hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment telling me exactly why I’m wrong for using an iPad to do these things, but not being able to send it because you didn’t find a good way to store the credentials.


Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.


Announcer: This has been a HumblePod production. Stay humble.

Corey: This episode is sponsored in part by our friends at Rising Cloud, which I hadn’t heard of before, but they’re doing something vaguely interesting here. They are using AI, which is usually where my eyes glaze over and I lose attention, but they’re using it to help developers be more efficient by reducing repetitive tasks. So, the idea being that you can run stateless things without having to worry about scaling, placement, et cetera, and the rest. They claim significant cost savings, and they’re able to wind up taking what you’re running as it is in AWS with no changes, and run it inside of their data centers that span multiple regions. I’m somewhat skeptical, but their customers seem to really like them, so that’s one of those areas where I really have a hard time being too snarky about it because when you solve a customer’s problem and they get out there in public and say, “We’re solving a problem,” it’s very hard to snark about that. Multus Medical, Construx.ai and Stax have seen significant results by using them. And it’s worth exploring. So, if you’re looking for a smarter, faster, cheaper alternative to EC2, Lambda, or batch, consider checking them out. Visit risingcloud.com/benefits. That’s risingcloud.com/benefits, and be sure to tell them that I sent you because watching people wince when you mention my name is one of the guilty pleasures of listening to this podcast.

Corey: Welcome to Screaming in the Cloud. I’m Cloud Economist Corey Quinn. And my guest today is Maciej Winnicki, who is the founder of Cloudash. Now, before I dive into the intricacies of what that is, I’m going to just stake out a position that one of the biggest painful parts of working with AWS in any meaningful sense, particularly in a serverless microservices way, is figuring out what the hell’s going on in the environment. There’s a bunch of tools offered to do this and they’re all—yeee, they aspire to mediocrity. Maciej, thank you for joining me today.

Maciej: Thank you for having me.

Corey: So, I turned out to have accidentally blown up Cloudash, sort of before you were really ready for the attention. You, I think, tweeted about it or put it on Hacker News or something; I stumbled over it because it turns out that anything that vaguely touches cloud winds up in my filters because of awesome technology, and personality defects on my part. And I tweeted about it as I set it up and got the thing running, and apparently this led to a surge of attention on this thing that you’ve built. So, let me start off with an apology. Oops, I didn’t realize it was supposed to be a quiet launch.

Maciej: I actually thank you for that. Like, that was great. And we get a lot of attention from your tweet thread, actually because at the end, that was the most critical part. At the end of the twitter, you wrote that you’re staying as a customer, so we have it on our website and this is perfect. But actually, as you said, that’s correct.

Our marketing strategy for releasing Cloudash was to post it on LinkedIn. I know this is not, kind of, the best strategy, but that was our plan. Like, it was like, hey, like, me and my friend, Tomasz, who’s also working on Cloudash, we thought like, let’s just post it on LinkedIn and we’ll see how it goes. And accidentally, I’m receiving a notification from Twitter, “Hey, Corey started tweeting about it.” And I was like, “Oh, my God, I’m having a heart attack.” But then I read the, you know—

Corey: Oops.

Maciej: [laugh]. Yeah. I read the, kind of, conclusion, and I was super happy. And again, thank you for that because this is actually when Cloudash kind of started rolling as a product and as a, kind of, business. So yeah, that was great.

Corey: To give a little backstory and context here is, I write a whole bunch of serverless nonsense. I build API’s Gateway, I hook them up to Lambda’s Function, and then it sort of kind of works. Ish. From there, okay, I would try and track down what was going on because in a microservices land, everything becomes a murder mystery; you’re trying to figure out what’s broken, and things have exploded. And I became a paying customer of IOpipe. And then New Relic bought them. Well, crap.

Then I became a paying customer of Epsagon. And they got acquired by Cisco, at which point I immediately congratulated the founders, who I know on a social basis, and then closed my account because I wanted to get out before Cisco ruins it because, Cisco. Then it was, what am I going to use next? And right around that time is when I stumbled across Cloudash. And it takes a different approach than any other entity in the space that I’ve seen because you are a native Mac desktop app. I believe you’re Mac only, but you seem to be Electron, so I could be way off base on that.

Maciej: So, we’re Linux as well right now and soon we’ll be Windows as well. But yeah, so, right now it’s macOS and Linux. Yeah, that’s correct. So, our approach is a little bit different.

So, let me start by saying what’s Cloudash? Like, Cloudash is a desktop app for, kind of, monitoring and troubleshooting serverless architectures services, like, serverless stuff in general. And the approach that we took is a little bit different because we are not web-based, we’re desktop-based. And there’s a couple of advantages of that approach. The first one is that, like, you don’t need to share your data with us because we’re not, kind of, downloading your metrics and logs to our back end and to process them, et cetera, et cetera. We are just using the credentials, the AWS profiles that you have defined on your computer, so nothing goes out of your AWS account.

And I think this is, like, considering, like, from the security perspective, this is very crucial. You don’t need to create a role that you give us access to or anything like that. You just use the stuff that you have on your desktop, and everything stays on your AWS account. So, nothing—we don’t download it, we don’t process it, we don’t do anything from that. And that’s one approach—well, that’s the one advantage. The other advantage is, like, kind of, onboarding, as I kind of mentioned because we’re using the AWS profiles that you have defined in your computer.
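
As a rough illustration of what "using the AWS profiles that you have defined on your computer" can involve, here is a minimal Python sketch that reads text in the shared AWS config file format and labels each profile by how it obtains credentials. This is not Cloudash's code. The function name `classify_profiles`, the labels, and the sample profiles are assumptions made up for this example; only the config keys (`sso_start_url`, `sso_session`, `role_arn`, `source_profile`, `region`) come from the AWS CLI config format.

```python
import configparser

# Hypothetical sample in the format of ~/.aws/config (all values are made up).
SAMPLE_CONFIG = """
[default]
region = us-east-1

[profile sso-dev]
sso_start_url = https://example.awsapps.com/start
sso_region = us-east-1
sso_account_id = 111111111111
sso_role_name = Developer
region = eu-west-1

[profile chained]
role_arn = arn:aws:iam::222222222222:role/ReadOnly
source_profile = default
"""


def classify_profiles(config_text):
    """Return {profile_name: {"kind": ..., "region": ...}} for each profile."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    profiles = {}
    for section in parser.sections():
        # [sso-session NAME] sections hold shared SSO settings, not profiles.
        if section.startswith("sso-session "):
            continue
        # Named profiles are written as [profile NAME]; the default is [default].
        name = section[len("profile "):] if section.startswith("profile ") else section
        keys = parser[section]
        if "sso_start_url" in keys or "sso_session" in keys:
            kind = "sso"
        elif "role_arn" in keys:
            kind = "assume-role"
        else:
            # Assumption: long-lived keys live in the separate credentials file.
            kind = "static"
        profiles[name] = {"kind": kind, "region": keys.get("region")}
    return profiles


if __name__ == "__main__":
    for name, info in classify_profiles(SAMPLE_CONFIG).items():
        print(name, info)
```

A desktop tool could offer a list like this in a profile picker and hand the chosen name to an AWS SDK, which then resolves SSO or role assumption itself. Nothing in the sketch sends data anywhere.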

Corey: Well, you’re doing significantly more than that because I have a lot of different accounts configured different ways, and when I go to one of them that uses SSO, it automatically fires me off to the SSO login page if I haven’t logged in that day for a 12 hour session—

Maciej: Yes.

Corey: —for things that have credentials stored locally, it uses those; and for things that are using role-chaining to use assuming roles from the things I have credentials for, and the things that I just do role assumption in, and it works flawlessly. It just works the way that most of my command-line tools do. I’ve never seen a desktop app that does this.

Maciej: Yeah. So, we put a lot of effort into making sure that this works great because we know that, like, no one will use Cloudash if there’s—like, not no one, but like, we’re targeting, like, serverless teams, maybe, in enterprise companies, or serverless teams working on some startups. And in most cases, those teams or those engineers, they use SSO, or at least MFA, right? So, we have it covered. And as you said, like, it should be the onboarding part is really easy because you just pick your AWS profile, you just pick region, and just pick, right now, a CloudFormation stack because we get the information about your service based on CloudFormation stack. So yeah, we put a lot of effort in making sure that this works without any issues.
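
To make the "information about your service based on CloudFormation stack" idea concrete, the following Python sketch filters a list of stack resources down to the serverless pieces. It assumes the input has the shape of the `StackResourceSummaries` list returned by CloudFormation's ListStackResources call. The helper name `serverless_resources`, the short labels, and the sample data are hypothetical, and how Cloudash actually does this is not described in the conversation.

```python
# Resource types treated as "serverless" in this sketch (an assumption).
SERVERLESS_TYPES = {
    "AWS::Lambda::Function": "lambda",
    "AWS::ApiGateway::RestApi": "api-gateway",
}

# Hypothetical sample shaped like CloudFormation StackResourceSummaries.
SAMPLE_RESOURCES = [
    {
        "LogicalResourceId": "Handler",
        "PhysicalResourceId": "newsletter-Handler-ABC123",
        "ResourceType": "AWS::Lambda::Function",
    },
    {
        "LogicalResourceId": "Api",
        "PhysicalResourceId": "a1b2c3d4e5",
        "ResourceType": "AWS::ApiGateway::RestApi",
    },
    {
        "LogicalResourceId": "Table",
        "PhysicalResourceId": "newsletter-Table-XYZ",
        "ResourceType": "AWS::DynamoDB::Table",
    },
]


def serverless_resources(stack_resource_summaries):
    """Group physical resource IDs by a short label, skipping other types."""
    found = {}
    for resource in stack_resource_summaries:
        label = SERVERLESS_TYPES.get(resource["ResourceType"])
        if label:
            found.setdefault(label, []).append(resource["PhysicalResourceId"])
    return found


if __name__ == "__main__":
    print(serverless_resources(SAMPLE_RESOURCES))
```

The physical IDs are what later calls need: a Lambda function name for metrics and logs, an API ID for API Gateway.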

Corey: There are some challenges to it that I saw while setting it up, and that’s also sort of the nature of the fact you are, in fact, integrating with CloudWatch. For example, it’s region specific. Well, what if I want to have an app that’s multi-region? Well, you’re going to have a bad time because doing [laugh] anything multi-region in AWS means you’re going to have a bad time. That gets particularly obnoxious when you’re doing something like Lambda@Edge, where, oh, where do the logs live? That’s going to be in a CloudFront distribution in whatever region it winds up being accessed from. So, it comes down to what distribution endpoint or point of presence did that particular request go through, and it becomes this giant game of whack-a-mole. It’s frustrating, and it’s obnoxious, and it’s also in no way your fault.

Maciej: Yeah, I mean, we are at the beginning. Right now, it’s the most straightforward, kind of pe—how people think about stacks of serverless. They think in terms of regions because I think for us, regions, or replicated stacks, or things like that are not really popular yet. Maybe they will become—like, this is how AWS works as a whole, so it’s not surprising that we’re kind of following this path. I think my point is that our main goal, the ultimate goal, is to make monitoring and, as I said, troubleshooting serverless apps as simple as possible.

So, once we hear from our customers, from our users that, “Hey, we would like to get a little bit better experience around regions,” we will definitely implement that because why not, right? And I think the whole point of Cloudash—and maybe we can go more deep into that later—is that we want to bring context into your metrics and logs. If you’re seeing, for example, an X-Ray trace ID in your logs, you should be able with one click to just see the trace. It’s not yet implemented in Cloudash, but we have it in the backlog. But my point is that, like, there should be some journey when you’re debugging stuff, and you shouldn’t be just, like, left alone having, like, 20 tabs, CloudWatch tabs open and trying to figure out where I was—like, where’s the Lambda? Where’s the API Gateway logs? Where are the CloudFront logs? And how can I kind of connect all of that? Because that’s—it’s an issue right now.

Corey: Even what you’ve done so far is incredibly helpful compared to the baseline experience that folks will often have, where I can define a service that is comprised of a number of different functions—I have one set up right now that has seven functions in it—I grab any one of those things, and I can set how far the lookback is, when I look at that function, ranging from 5 minutes to 30 days. And it shows me at the top the metrics of invocations, the duration that the function runs for, and the number of errors. And then, in the same pane down below it, it shows the CloudWatch logs. So, “Oh, okay, great. I can drag and zoom into a specific timeframe, and I see just the things inside of that.”
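
The view described here (invocations, duration, and errors for one function over an adjustable lookback) maps fairly directly onto a CloudWatch GetMetricData request. The Python sketch below only builds the request parameters and does not call AWS. The helper names `pick_period` and `build_request`, the 120-point target, and the choice of statistics are assumptions for illustration, not a description of how Cloudash queries CloudWatch.

```python
import math
from datetime import datetime, timedelta, timezone

# Lambda metrics mentioned in the conversation, with an assumed statistic each.
LAMBDA_STATS = {"Invocations": "Sum", "Duration": "Average", "Errors": "Sum"}


def pick_period(lookback, target_points=120):
    """Choose a period in seconds so the graph has roughly target_points points.

    Assumption: use 60 seconds for short windows, otherwise round up to a
    multiple of 300 seconds so the period also suits older data.
    """
    raw = lookback.total_seconds() / target_points
    if raw <= 60:
        return 60
    return math.ceil(raw / 300) * 300


def build_request(function_name, lookback, now=None):
    """Build keyword arguments shaped like a CloudWatch GetMetricData request."""
    end = now or datetime.now(timezone.utc)
    period = pick_period(lookback)
    queries = [
        {
            "Id": metric_name.lower(),  # query IDs must start with a lowercase letter
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/Lambda",
                    "MetricName": metric_name,
                    "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
                },
                "Period": period,
                "Stat": stat,
            },
        }
        for metric_name, stat in LAMBDA_STATS.items()
    ]
    return {"MetricDataQueries": queries, "StartTime": end - lookback, "EndTime": end}


if __name__ == "__main__":
    request = build_request("my-function", timedelta(minutes=5))
    print(request["StartTime"], request["EndTime"], len(request["MetricDataQueries"]))
```

Because the request is scoped to one function and one time window, a tool working this way only pulls data when someone is actually looking at it, which fits the low retrieval cost Corey describes.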

And I know this sounds like well, what’s the hard part here? Yeah, except nothing else does it in an easy-to-use, discoverable way that just sort of hangs out here. Honestly, the biggest win for me is that I don’t have to log in to the browser, navigate through some ridiculous other thing to track down what I’m talking about. It hangs out on my desktop all the time, and whether it’s open or not, whenever I fire it up, it just works, basically, and I don’t have to think about it. It reduces the friction from, “This thing is broken,” to, “Let me see what the logs say.”

Very often I can go from not having it open at all to staring at the logs and having to wait a minute because there’s some latency before the event happens and it hits CloudWatch logs itself. I’m pretty impressed with it, and I’ve been keeping an eye on what this thing is costing me. It is effectively nothing in terms of CloudWatch retrieval charges. Because it’s not sitting there sucking all this data up all the time, for everything that’s running. Like, we’ve all seen the monitoring system that winds up costing you more in ancillary fees than they charge you. This doesn’t do that.

I also—while we’re talking about money, I want to make very clear—because disclaiming the direction the money flows in is always important—you haven’t paid me a dime, ever, to my understanding. I am a paying customer at full price for this service, and I have been since I discovered it. And that is very much an intentional choice. You did not sponsor this podcast, you are not paying me to say nice things. We’re talking because I legitimately adore this thing that you’ve built, and I want it to exist.

Maciej: That’s correct. And again, thank you for that. [laugh].

Corey: It’s true. You can buy my attention, but not my opinion. Now, to be clear, when I did that tweet thread, I did get the sense that this was something that you had built as sort of a side project, as a labor of love. It does not have VC behind it, of which I’m aware, and that’s always going to, on some level, shade how I approach a service and how critical I’m going to be on it. Just because it’s, yeah, if you’ve raised a couple hundred million dollars and your user experience is trash, I’m going to call that out.

But if this is something where you just soft launched, yeah, I’m not going to be a jerk about weird usability bugs here. I might call it out as “Ooh, this is an area for improvement,” but not, “What jackwagon thought of this?” I am trying to be a kinder, gentler Corey in the new year. But at the same time, I also want to be very clear that there’s room for improvement on everything. What surprised me the most about this is how well you nailed the user experience despite not having a full team of people doing UX research.

Maciej: That was definitely a priority. So, maybe a little bit of history. So, I started working on Cloudash, I think it was April… 2019. I think? Yeah. It’s 2021 right now. Or we’re 2022. [unintelligible 00:11:33].

Corey: Yeah. 2022, now. I—

Maciej: I’m sorry. [laugh].

Corey: —I’ve been screwing that up every time I write the dates myself, I’m with you.

Maciej: [laugh]. Okay, so I started working on Cloudash, in 2020, April 2020.

Corey: There we go.

Maciej: So, after eight months, I released some beta, like, free; you could download it from GitHub. Like, you can still download on GitHub, but at that time, there was no license, you didn’t have to buy a license to run it. So, it was, like, very early, like, 0.3 version that was working, but sort of, like, [unintelligible 00:12:00] working. There were some bugs.

And that was the first time that I tweeted about it on Twitter. It gets some attention, but, like, some people started using it. I get some feedback, very initial feedback. And I was like, every time I open Cloudash, I get the sense that, like, this is useful. I’m talking about my own tool, but like, [laugh] that’s the thing.

So, further in the history. So, I’m kind of service engineer by my own. I am a software engineer, I started focusing on serverless, in, like, 2015, 2016. I was working for Serverless Inc. as an early employee.

I was then working as an engineering manager for a couple of companies. I work as an engineering manager right now at Stedi; we’re also, like, fully serverless. So I, kind of, trying to fix my own issues with serverless, or trying to improve the whole experience around serverless in AWS. So, that’s the main purpose why we’re building Cloudash: Because we want to improve the experience. And one use case I’m often mentioning is that, let’s say that you’re kind of on duty. Like, so in the middle of night PagerDuty is calling you, so you need to figure out what’s going on with your Lambda or API Gateway.

Corey: Yes. PagerDuty, the original [Call of Duty: Nagios 00:13:04]. “It’s two in the morning; who is it?” “It’s PagerDuty. Wake up, jackass.” Yeah. We all had those moments.

Maciej: Exactly. So, the PagerDuty is calling you and you’re, kind of, in the middle of night, you’re not sure what’s going on. So, the kind of thing that we want to optimize is from waking up into understanding what’s going on with your serverless stuff should be minimized. And that’s the purpose of Cloudash as well. So, you should just run one tool, and you should immediately see what’s going on. And that’s the purpose.

And probably with one or two clicks, you should see the logs responsible, for example, in your Lambda. Again, like that’s exactly what we want to cover, that was the initial thing that we want to cover, to kind of minimize the time you spent on troubleshooting serverless apps. Because as we all know, kind of, the longer it’s down, the less money you make, et cetera, et cetera, et cetera.

Corey: This episode is sponsored by our friends at Oracle Cloud. Counting the pennies, but still dreaming of deploying apps instead of "Hello, World" demos? Allow me to introduce you to Oracle's Always Free tier. It provides over 20 free services and infrastructure, networking, databases, observability, management, and security. And—let me be clear here—it's actually free. There's no surprise billing until you intentionally and proactively upgrade your account. This means you can provision a virtual machine instance or spin up an autonomous database that manages itself, all while gaining the networking, load balancing, and storage resources that somehow never quite make it into most free tiers needed to support the application that you want to build. With Always Free, you can do things like run small scale applications or do proof-of-concept testing without spending a dime. You know that I always like to put asterisks next to the word free. This is actually free, no asterisk. Start now. Visit snark.cloud/oci-free. That's snark.cloud/oci-free.

Corey: One of the things that I appreciate about this is that I have something like five different microservices now that power my newsletter production pipeline every week. And periodically, I’ll make a change and something breaks because testing is something that I should really get around to one of these days, but when I’m the only customer, cool. Doesn’t really matter until suddenly I’m trying to write something and it doesn’t work. Great. Time to go diving in, and always I’m never in my best frame of mind for that because I’m thinking about writing for humans not writing for computers. And that becomes a challenge.

And okay, how do I get to the figuring out exactly what is broken this time? Regression testing: It really should be a thing more than it has been for me.

Maciej: You should write those tests. [laugh].

Corey: Yeah. And then I fire this up, and okay, great. Which sub-service is it? Great. Okay, what happened in the last five minutes on that service? Oh, okay, it says it failed successfully in the logs. Okay, that’s on me. I can’t really blame you for that. But all right.

And then it’s a matter of adding more [print or 00:14:54] debug statements, and understanding what the hell is going on, mostly that I’m bad at programming. And then it just sort of works from there. It’s a lot easier to, I guess, to reason about this from my perspective than it is to go through the CloudWatch dashboards, where it’s okay, here’s a whole bunch of metrics on different graphs, most of which you don’t actually care about—as opposed to the unified view that you offer—and then “Oh, you want to look at logs, that’s a whole separate sub-service. That’s a different service team, obviously, so go open that up in another browser.” And I’m sitting here going, “I don’t know who designed this, but are there any windows in their house? My God.”

It’s just the saddest thing I can possibly experience when I’m in the middle of trying to troubleshoot. Let’s be clear, when I’m troubleshooting, I am in no mood to be charitable to anyone or anything, so that’s probably unfair to those teams. But by the same token, it’s intensely frustrating when I keep smacking into limitations that get in my way while I’m just trying to get the thing up and running again.

Maciej: As you mentioned about UX, like, we’ve spent a lot of time thinking about the UX, trying different approaches, trying to understand which metrics are the most important. And as we all know, kind of, serverless simplifies a lot of stuff, and there’s, like, way fewer metrics that you need to look into when something is happening, but we want to make sure that the stuff that we show—which is duration, errors, and p95—is probably the most important in most cases, so like, covering most of this stuff. So sorry, I didn’t mention that before; it was very important from the very beginning. And also, like, literally, I spent a lot of time, like, working on the colors, which sounds funny, [laugh] but I wanted to get them right. We’re not yet working on dark mode, but maybe soon.

Anyways, the visual part, it’s always close to my heart, so we spent a lot of time going back to what you just said. So, definitely the experience around using CloudWatch right now, and CloudWatch logs, CloudWatch metrics, is not really tailored for any specific use case because they have to be generic, right? Because AWS has, like, I don’t know, like, 300, or whatever number of services, probably half of them producing logs—maybe not half, maybe—

Corey: We shouldn’t name a number because they’ll release five more between now and when this publishes in 20 minutes.

Maciej: [laugh]. So, CloudWatch has to be generic. What we want to do with Cloudash is to take those generic tools—because we use, of course, CloudWatch logs, CloudWatch metrics, we fetch data from them—but make the visual part more tailored for a specific use case—in our case, it’s the serverless use case—and make sure that it’s really, kind of—it shows only the stuff that you need to see, not everything else. So again, like that’s the main purpose. And then one more thing, we—like this is also some kind of measurement of success, we want to reduce the number of tabs that you need to have open in your browser when you’re dealing with CloudWatch. So, we tried to put the most important stuff in one view so you don’t need to flip between tabs, as you usually do when you try to understand some kind of broader scope, or broader context of your, you know, error in Lambda.

Corey: What inspired you to do this as a desktop application? Because a lot of companies are doing similar things, as SaaS, as webapps. And I have to—as someone who yourself—you’re a self-described serverless engineer—it seems to me that building a webapp is sort of like the common description use case of a lot of serverless stuff. And you’re sitting here saying, “Nope, it’s desktop app time.” Which again, I’m super glad you did. It’s exactly what I was looking for. How did you get here?

Maciej: I’d been thinking about both kinds of types of apps. So like, definitely webapp was the initial idea how to build something, it was the webapp. Because as you said, like, that’s the default mode. Like, we are thinking webapp; like, let’s build a webapp because I’m an engineer, right? There is some inspiration coming from Dynobase, which was made by a friend [unintelligible 00:18:55] who also lives in Poland—I didn’t mention that; we’re based in [Poznań 00:18:58], Poland.

And when I started thinking about it, there’s a lot of benefits of using this approach. The biggest benefit, as I mentioned, is security; and the second benefit is just most, like, cost-effective because we don’t need to run in the backend, right? We don’t need to download all your metrics, all your logs. We I think, like, let’s think about it, like, from the perspective. Listen, so everyone in the company to start working, they have to download all of your stuff from your AWS account. Like, that sounds insane because you don’t need all of that stuff elsewhere.

Corey: Store multiple copies of it. Yeah, generally when I’m looking at this, I care about the last five to ten minutes.

Maciej: Exactly.

Corey: I don’t—

Maciej: Exactly.

Corey: —really care what happened three-and-a-half years ago on this function. Almost always. But occasionally I want to look back at, “Oh, this has been breaking. How long has it been that way?” But I already have that in the AWS environment unless I’ve done the right thing and turned on, you know, log expiry.

Maciej: Exactly. So, this is a lot of, like, I don’t want to be, like, you know, mean to anyone but like, that’s a lot of waste. Like, that’s a lot of waste of compute power because you need to download it; of cost because you need to get this data out of AWS, which you need to pay for, you know, get metric data and stuff like this. So, you need to—

Corey: And almost all of it is, what is it? Write once, read never. Because you don’t generally look at these things.

Maciej: Yeah, yeah. Exactly.

Corey: And so much of this, too, for every invocation I have, even though it’s low-traffic stuff, it’s the ‘START’ line with a request ID and what version is running; it tells me ‘latest.’ Helpful. A single log line in this case says ‘200.’ Why it says that, I couldn’t tell you. And then it says ‘END RequestId.’ The end.

Now, there’s no way to turn that off unless you disable the ability to write to CloudWatch logs in the function, but ingest on that costs 50 cents a gigabyte, so okay, I guess that’s AWS’s money-making scam of the year. Good for them. But there’s so much of that. When things are working, it’s like looking at a low-traffic site that’s behind a load balancer, where you have gigabytes, in some cases, of web server logs on the thing that’s sitting in your auto-scaling group. And those logs are just load balancer health checks. 98% of it is just that.

Same type of problem here, I don’t care about that, I don’t want to pay to store it, I certainly don’t want to pay to store it twice. I get it, that makes an awful lot of sense. It also makes your security job a hell of a lot easier because you’re not sitting on a whole bunch of confidential data from other people. Because, “Well, it’s just logs. What could possibly be confidential in there?” “Oh, my sweet summer child, have you seen some of the crap people put in logs?”
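
To put a number on that ingest cost, here is a small back-of-the-envelope Python sketch. The 50 cents per gigabyte figure is the one quoted in the conversation; the function name `monthly_ingest_cost`, the traffic volume, and the roughly 300 bytes of boilerplate log lines per invocation are hypothetical assumptions chosen only to show the arithmetic, and a gigabyte is taken as 10^9 bytes.

```python
INGEST_PRICE_PER_GB = 0.50  # figure quoted in the conversation; check current pricing


def monthly_ingest_cost(invocations_per_month, bytes_per_invocation,
                        price_per_gb=INGEST_PRICE_PER_GB):
    """Estimate the monthly log ingestion cost in dollars.

    Assumes 1 GB = 1,000,000,000 bytes and a flat per-gigabyte price.
    """
    gigabytes = invocations_per_month * bytes_per_invocation / 1_000_000_000
    return gigabytes * price_per_gb


# Hypothetical workload: 10 million invocations, each writing about 300 bytes
# of start/end boilerplate. That is 3 GB ingested, or $1.50 at the quoted price.
cost = monthly_ingest_cost(10_000_000, 300)
print(f"${cost:.2f} per month")
```

Storing a second copy of the same logs in a third-party tool would add that vendor’s own ingest and storage charges on top of this.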

Maciej: I’ve seen many things in logs. I don’t want to mention them. But anyways, and also, you know, like, usually when you give access to your AWS account, it can ruin you. You know, like, you need to really trust the company to give access to your AWS account. Of course, in most cases, the roles are scoped to, you know, only CloudWatch stuff, actions, et cetera, et cetera, but you know, like, there are some situations in which something may not be properly provisioned. And then you give access to everything.

Corey: And you can get an awful lot of data you wouldn’t necessarily want out of that stuff. Give me just the PDF printout of last month’s bill for a lot of environments, and I can tell you disturbing levels of detail about what your architecture is, just because you can infer an awful lot.

Maciej: Yeah.

Corey: Yeah, I hear you. It makes your security story super straightforward.

Maciej: Yeah, exactly. So, I’ll just repeat my inspiration. When I started thinking about Cloudash, definitely one of the inspirations was Dynobase, which is a kind of GUI, like, a more powerful UI for DynamoDB. So, if you’re interested in that stuff, you can also check this out.

Corey: Oh, yeah, I’ve been a big fan of that, too. That’ll be a separate discussion on a different episode, for sure.

Maciej: [laugh]. Yeah.

Corey: But looking at all of this, looking at the approach, the only real concern, well, not even a concern. The only real challenge I have with it for my use case is that when I’m on the road, the only thing that I bring with me for a computer is my iPad Pro. I’m not suggesting by any means that you should build this as an iPad app; that strikes me as, like, 15 levels of obnoxious. But it does mean that sometimes I still have to go diving into the CloudWatch console when I’m not home. Which, you know, without this, without Cloudash, that’s what I was doing originally anyway.

Maciej: You’re the only person that requested that. And we will put that into the backlog, and we will get to that at some point. [laugh].

Corey: No, no, no. The smart response is to offer me a specific enterprise tier pricing…

Maciej: Oh, okay. [laugh].

Corey: …that is eye-poppingly high. It’s like, “Hey, if you want to subsidize feature development, we’re thrilled to empower that.” But…

Maciej: [laugh]. Yeah, yeah. To be honest, that would be hard to write [unintelligible 00:23:33] implement as an iPad app, or iPhone app, or whatever because then, like, what’s the story behind it? Like, how can I get the credentials, right? It’s not possible.

Corey: Yeah, you’d have to have some fun with that. There are a couple of ways I can think of offhand, but then that turns into a sandboxing issue, and it becomes something where you have to store credentials locally, regardless, even if they’re ephemeral. And that’s not great. Maybe turn it into a webapp someday or something. Who knows.

What I also appreciate is that we had a conversation when you first launched, and I wound up basically going on a Zoom call with you and more or less tearing apart everything you’ve built (in an ideally constructive way), but looking at a lot of the things you’ve changed on your website, you listened to an awful lot of feedback. You doubled your pricing, for example. Used to be ten bucks a month; now you’re twenty. Great. I’m a big believer in charging more.

You absolutely add that kind of value because it’s, “Well, twenty bucks a month for a desktop app. That sounds crappy.” It’s, “Yeah, jackwagon, what’s your time worth?” I was spending seven bucks a month in serverless charges, and 120 or 130 a month for Epsagon, and I was thrilled to pieces to be doing it because the value I got from being able to quickly diagnose what the hell was going on far outstripped the actual cost of doing these things. Don’t fall into the trap of assuming that, “Well, I shouldn’t pay for software. I can just do it myself.” Your time is never free. People think it is, but it’s not.

Maciej: That’s true. The original price of $9.99, I think that was the launch promo. After some time, and after adding more features like API Gateway support, we’ve decided that this is, like, solving way more problems, so like, you should probably pay a little bit more for that. But you’re kind of lucky because you subscribed to it when it was 9.99, and this will be your kind of price until the end of, you know…

Corey: Well, I’m going to argue with you after the show to raise the price on mine, just because it’s true. You want to support the things that you want to exist in the world. I also like the fact that you offered an annual plan because I will go weeks without ever opening the app. And that doesn’t mean it isn’t adding value. It’s that, “Oh, yeah, I will need that now that I’m hitting these issues again.”

And if I’m paying on a monthly basis, and it shows up with a, “Oh, you got charged again.” “Well, I didn’t use it this month; I should cancel.” And [unintelligible 00:25:44] to an awful lot of subscriber churn. But in the course of a year, if I don’t have at least one instance where, wow, that ten-minute span justified the entire $200 annual price tag, then, yeah, you built the wrong thing or it’s not for me. But I can think of three incidents so far in the four months since I started using it that have been worth everything you will charge me in a year, and then some, just because it made it so clear what was breaking.

Maciej: So, in that regard, we are also thinking about team licenses; that’s definitely on the roadmap. There will be some changes to that. And we’re definitely working on more and more features. The roadmap is mostly about supporting more and more AWS services, so right now it’s Lambda and API Gateway; we’re definitely thinking about SQS and SNS, to get some sense of how your messages are going through, and probably something like DynamoDB metrics. And this is all kind of serverless, but why not go wider? Like, why not go to Fargate? Like, Fargate is theoretically serverless, but you know, like, it’s serverless on…

Corey: It’s serverless with a giant asterisk next to it.

Maciej: Yeah, [laugh] exactly. So, but why not? Like, it’s exactly the same thing in terms of, there is some user flow, there is some user journey, when you want to debug something. You want to go from API Gateway, maybe to the container to see, I don’t know, like, DynamoDB metric or something like that, so it should be all easy. And this is definitely something.

Later, why not EC2 metrics? Like, it would be a little bit harder. But I’m just saying, like, at this point, we are serverless, but once we cover serverless, why not go wider? Why not support more and more services and just make sure that all those use cases are correctly modeled with the UI and UX, et cetera?

Corey: That’s going to be an interesting challenge, just because that feels like what a lot of the SaaS monitoring and observability tooling has done. And then you fire this thing up, and it looks an awful lot like the AWS console. And it’s, “Yeah, I just want to look at this one application that doesn’t use any of the rest of those things.” Again, I have full faith and confidence in your ability to pull this off. You clearly have done that well based upon what we’ve seen so far. I just wonder how you’re going to wind up tackling that challenge when you get there.

Maciej: And maybe not EC2. Maybe I went too far. [laugh].

Corey: Yeah, honestly, even EC2-land, it feels like that is more or less a solved problem. If you want to treat it as a bunch of EC2, you can use Nagios. It’s fine.

Maciej: Yeah, totally.

Corey: There are tools that have solved that problem. But not much that I’ve seen has solved the serverless piece the way that I want it solved. You have.

Maciej: So, it’s definitely a long road to cover serverless, and by serverless, I mean serverless how AWS understands serverless, so including Fargate, for example. So, there’s a lot of stuff that we can improve. There’s a lot of stuff that we can make easier with Cloudash than it is with CloudWatch. Just staying inside serverless, it will take us a lot of time to make sure that it is all correct, and correctly modeled, correctly designed, et cetera. So yeah, I went too far with EC2, sorry.

Corey: Exactly. That’s okay. We all go too far with EC2, I assure you.

Maciej: Sorry everyone using EC2 instances. [laugh].

Corey: If people want to kick the tires on it, where can they find it?

Maciej: They can find it on cloudash.dev.

Corey: One D in the middle. That one throws me sometimes.

Maciej: One D. Actually, after talking to you, we have a double-D domain as well, so you can also try ‘Clouddash’ with double-D. [laugh].

Corey: Excellent, excellent. Okay, that is fantastic. Because I keep trying to put the double-D in when I’m typing it in my search tool on my desktop, and it doesn’t show up. And it’s like, “What the—oh, right.” But yeah, we’ll get there one of these days.

Maciej: Only the domain. It’s only the domain. You will be redirected to single-D.

Corey: Exactly.

Maciej: [laugh].

Corey: We’ll have to expand later; I’ll finance the feature request there. It’ll go well. If people want to learn more about what you think about these things, where else can they find you?

Maciej: On Twitter, and my Twitter handle is @mthenw. M-then-W, which is M-T-H-E-N-W. And my co-founder is @tlakomy. You can probably add that to [show notes 00:29:35]. [laugh].

Corey: Oh, I certainly will. It’s fine, yeah. Here’s a whole bunch of letters. I hear you. My Twitter handle used to be my amateur radio callsign. It turns out most people don’t think like that. And yeah, it’s become an iterative learning process. Thank you so much for taking the time to speak with me today and for building this thing. I really appreciate both of them.

Maciej: Thank you for having me here. I encourage everyone to visit cloudash.dev. If you have any feature requests or any questions, just send us an email at [email protected], or just go to the issues in the GitHub repository; just create an issue, describe what you want, and we can talk about it.

We are always happy to help. The main purpose, the ultimate goal of Cloudash is to make the serverless engineer’s life easier, on a very high level. And on a little bit lower level, just to make, you know, troubleshooting and debugging serverless apps easier.

Corey: Well, from my perspective, you’ve succeeded.

Maciej: Thank you.

Corey: Thank you. Maciej Winnicki, founder of Cloudash. I’m Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you’ve enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you’ve hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment telling me exactly why I’m wrong for using an iPad to do these things, but not being able to send it because you didn’t find a good way to store the credentials.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production. Stay humble.
