Join Pete and Jesse as they talk about the important role tagging plays in influencing DevOps, why tagging strategies need to change over time, why improving your organization's tagging strategy isn't an overnight fix, how tagging is all about cost attribution, some alternative use cases for tagging that you might not have thought about, why tools like Cloud Custodian can help you understand more about your cloud environment, how positive reinforcement can be used to get your team on board with tagging, and more.
Episode Show Notes & Transcript
- Unconventional Guide to AWS Cost Management: https://www.duckbillgroup.com/resources/unconventional-guide-to-aws-cost-management/
Corey: This episode is sponsored in part by LaunchDarkly. Take a look at what it takes to get your code into production. I’m going to just guess that it’s awful because it’s always awful. No one loves their deployment process. What if launching new features didn’t require you to do a full-on code and possibly infrastructure deploy? What if you could test on a small subset of users and then roll it back immediately if results aren’t what you expect? LaunchDarkly does exactly this. To learn more, visit launchdarkly.com and tell them Corey sent you, and watch for the wince.
Pete: Hello, and welcome to the AWS Morning Brief: Fridays From the Field. I am Pete Cheslock.
Jesse: I’m Jesse DeRose.
Pete: We’re back again. And we’re here. We made it, Jesse.
Jesse: I was worried. This was a journey. Thank you, everybody, for coming on this journey with us.
Pete: It was quite an experience going through the Unconventional Guide to AWS Cost Savings. We’ve made it. I just can’t believe we’re here.
Pete: So, what are we talking about today for the culmination of our magnum opus of cost savings optimizations?
Jesse: This is a fun one. And I know I keep saying that this is my favorite about every one, but I have to admit that this one, this topic today probably is my absolute favorite. This one I get really nerdy over. Today, we’re talking about how to predict your future and make your CFO happy. No—spoiler alert—there are not any crystal balls involved in this one. There are no stock market conversations.
This is talking about how you can use all of the different things that we’ve talked about throughout the course of this Unconventional Guide to really bring it all together into a couple ideas that will help you better understand your cloud costs, and really better understand your business, I think.
Pete: Yeah. All of the things we talked about really lead up to this one, which is the clients of ours that are the most mature, who are incredibly optimized in their Amazon usage, are the ones who have adopted a majority of these specific items. They all lead to this last one, that ability to predict your future usage based on something that’s happening internally, or if a salesperson comes to you and says, “Hey, we’re about to close this deal, but I need to discount our service.” People are going to start wanting to know well, what is the cheapest that you could sell your service for and still have a positive gross margin?
Jesse: Yeah. So, if you’ve done a lot of the things that we’ve talked about in the last couple episodes—I apologize, I know homework’s not the best for a podcast—but if you’ve had the opportunity to work on some of those things, you should have a ton of valuable insights into your spend. We’re talking about tagging, and showback models in particular, maybe even a chargeback model. But you can ultimately use all of this data to better understand what your forecasted spend is going to look like with a new potential customer coming onto the platform. Or if you get into the topic that we’re going to talk about today, which is mostly unit economics, you can really understand how much can I discount my service and still make a profit, like Pete mentioned?
Pete: Yeah, I mean, imagine there’s a global pandemic that happens, and it causes your usage to spike by 500% within the course of a month. How did your spend change? Do you know where it changed? And did it change in ways that you were expecting it to? Like, my databases grew by a lot, and this other thing didn’t grow by very much.
Like, that would be expected. But also another thing that—a question that we actually like to ask a lot of our clients: if your sales just doubled overnight, how would your spend change? Where are the places that are most expensive to operate your service? And again, this is kind of generic. I’ve worked in a lot of SaaS services, so I always think of sales, but just think of whether you’re using the cloud for a SaaS service that you provide and sell, like, B2C, things like that, or B2B, you still have users.
They might be internal users. Well, what if your users doubled overnight? What if half the company was using your internal service and now the whole company is? How does that change your usage?
Jesse: And it’s also important to think about not just your AWS usage, but all of the other services that you use that support your overall business model: things like monitoring and observability tools, logging vendors, maybe third-party SIEM tools. All of these are affecting your overall total infrastructure cost and are all part of this conversation. So, it’s really important to start thinking about those architecture diagrams. Remember when we said, way, way back at the beginning of this conversation, to overlay costs on top of your architecture diagram? Understanding that, understanding what parts of your product or what parts of your architecture are the most expensive, will really help you understand what’s going to change.
Pete: Yeah, let’s say you’ve got a six-figure bill to Datadog or one of the big log management vendors out there, but inside of that bill, is that all just evenly spread across the whole business? What if your log vendor was—the entire spend was all by one service that some developer left the debug logging enabled for? You know, you’d want a way of understanding that maybe that spend was concentrated in maybe a non-production aspect of your account. Because then again, that wouldn’t grow, right? That wouldn’t affect your growth in your sales the same way as if maybe all of your services were equally sending logs of a certain volume over.
So, all of those extra services, they all add up, and we see it more and more, as more of our clients start adopting more than just Amazon services: they might be adopting a Snowflake, they might be adopting third-party services running databases running in other services, or EMR type workloads that are not on EMR, and they’re running on Qubole or things like that. There’s just a lot of these services that more and more people are consuming from that fall outside of just the AWS invoice.
Jesse: And this also gets back to not just architecture diagrams, but also tagging and showback models, cost visibility, really understanding where your spend is going. And this is fantastic to understand where your spend is going, but finance is probably going to want something a little bit more than this. It’s not just about how much are we spending, or where are we spending it, and maybe it’s not even a finance question. Maybe this is a sales conversation, assuming that you’re a SaaS company. Maybe this is, as Pete mentioned before, “Hey, we want to understand where can we provide discounts? What services can we ultimately discount to negotiate getting new customers on the platform?”
Pete: So, Jesse, we hear a lot of these terms a lot, and I’d love an ‘explain like I am a five-year-old’ version of it, but we hear chargeback. And we hear showback, and honestly, I’ve never worked at those massive companies where you might implement these things, but can you give us just a real quick—for all the listeners out there, when we say showback, what does that mean? And when we say chargeback, what does that mean?
Jesse: So, a showback model essentially takes all of your cloud costs, all of your total infrastructure spend, your AWS spend, all the third-party spend, and it shows every team, every product, every microservice, maybe, depending—or maybe even business unit, depending on how your organization is split up—it shows each one of those units, how much they are actually spending, how much they’re actually using these different cloud vendors. So, this is where tagging comes in super handy because if you’ve tagged all of your taggable resources, and properly attributed all of your cloud costs with tagging and linked accounts, you have a very clear idea of who’s spending what. You know very clearly, maybe 70 to 80% of your total infrastructure spend is related to one particular product because all the cost is attributed to one particular product. And maybe that’s something you didn’t know before. Maybe now you know okay, maybe that product needs to be a little bit more expensive so that we can make sure that we are making money off of it, or profiting off of it, whereas other services can be discounted because they’re not as expensive.
Whereas in a chargeback model, you are ultimately not just showing each of these teams, hey, here’s how much you spent on AWS and Datadog usage and all these other vendors every month, you’re actually charging them for that usage. You’re actually pulling their cloud costs from their budget.
Pete: Yeah. They might actually have a budget of money. It’s all—if you want to really explain like I’m five, it’d be like, I give my child their—they get $1 for all of the tasks that they do throughout the week, I don’t actually give them the money because I usually have to subtract out their, like, Roblox spend of the week [laugh] and things like that. It’s all virtual, but at the end of the day, you know, we’re kind of virtually giving this business unit some money, and then, kind of, virtually charging them for their services within.
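To make the distinction concrete, here is a minimal Python sketch of the two models as described above. The team names, spend figures, and budgets are invented for illustration, and a real report would draw on tagged billing data rather than a hard-coded dictionary.

```python
# Hypothetical monthly infrastructure spend per team (AWS plus third-party vendors).
spend_by_team = {"payments": 42_000.0, "search": 18_000.0}

# Hypothetical monthly budgets, only needed for the chargeback model.
budgets = {"payments": 50_000.0, "search": 15_000.0}


def showback(spend):
    """Showback: report each team's spend and its share of the total."""
    total = sum(spend.values())
    return {
        team: {"spend": cost, "share": cost / total}
        for team, cost in spend.items()
    }


def chargeback(spend, team_budgets):
    """Chargeback: deduct each team's spend from its own budget."""
    return {team: team_budgets[team] - cost for team, cost in spend.items()}


report = showback(spend_by_team)
remaining = chargeback(spend_by_team, budgets)

for team in spend_by_team:
    print(
        f"{team}: spent {report[team]['spend']:.0f} "
        f"({report[team]['share']:.0%} of total), "
        f"budget remaining {remaining[team]:.0f}"
    )
```

In this sketch the only difference between the two models is whether the numbers are merely reported or actually subtracted from a budget, which mirrors the "virtual money" framing above.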
Jesse: Yeah. And this is mature. We don’t see a lot of companies doing this. This is hard because you have to take other steps first to get here. And so this is why we harp so much on cost attribution through tagging and through linked accounts.
This is why we harp so much on cost visibility and overlaying those cloud costs on your architecture diagrams to understand all of this data to lead to this point, which is understanding, where, how much is my primary product actually costing us? How much is my secondary product actually costing us? Or maybe how much is this business unit costing us in terms of total infrastructure spend?
Pete: Yeah, I mean, I can kind of share my history with this at previous companies is that, again, eventually someone in the financial department is going to say, “What was our cost for Amazon?” They specifically will want to know the production cost because that figures into a term called gross margin, which you often hear at SaaS businesses. Gross margin is basically you take all the revenue that came in and you subtract away what it took to support that revenue. And mostly, that is just the Amazon bill and these other vendors, perhaps, and you end up with a percentage. And hopefully, it’s a positive percentage.
It means you’re theoretically making money at a gross, I mean, obviously, before you pay salaries and all those other items, but that being kind of beside the point for now, that number, you’re probably going to get asked for. So, you wouldn’t want to give like your straight Amazon bill, like, “Oh, well, we spent $100,000 last month,” because some of that spend was probably in research and development; it was probably in a development account or a QA account. You really just want your production spend. So, at a previous company, the first step we took was break out our spend via production and development, just two criteria. Now, for us because we started with just a handful of accounts—this was before a lot of accounts were more prevalent, before organizations—before it was easy to handle a lot of accounts—we had a Prod and a Dev. Super easy. Look at Prod, look at Dev. There’s the two bills.
But then as time went on, we needed to get more granular. We were running some development workloads, testing out new databases at scale in kind of a hidden dark deployed mode, in production. Well, we want to subtract that spend from there. And that requires tagging. I mean, that’s why we really harped on tagging for a couple of episodes because tagging is the only way you’re going to be able to do that.
Now, we see more often a lot of our clients do maybe an account per product, or account per business unit. Those are, again, really effective ways to corral your spend to make it really easy to break it out and add it up. It’s really just trying to break it down to the most reasonable spend unit possible that you can then play around with and adjust. Mostly to go back to your CFO when they asked you, “Hey, I need to know this specific answer.” You’ve got it hopefully somewhat available.
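The gross margin arithmetic Pete describes can be sketched in a few lines of Python. The revenue and the per-environment spend below are made-up numbers chosen to match the "$100,000 last month" example; the point is only that using the whole bill instead of production spend understates the margin.

```python
def gross_margin_pct(revenue, cost_of_revenue):
    """Gross margin as a percentage: (revenue - cost to support it) / revenue."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return (revenue - cost_of_revenue) / revenue * 100


# Hypothetical monthly figures: a $100,000 bill split by environment.
spend_by_env = {"prod": 70_000.0, "dev": 20_000.0, "qa": 10_000.0}
revenue = 200_000.0

total_bill = sum(spend_by_env.values())

# Using the straight bill mixes R&D spend into the cost of revenue.
margin_from_total_bill = gross_margin_pct(revenue, total_bill)

# Using only production spend is the figure finance is usually after.
margin_from_prod_only = gross_margin_pct(revenue, spend_by_env["prod"])

print(f"Margin using the whole bill: {margin_from_total_bill:.1f}%")
print(f"Margin using production only: {margin_from_prod_only:.1f}%")
```

The split by environment here stands in for whatever breakdown is available, whether that is separate Prod and Dev accounts or tags within a shared account.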
Jesse: Okay, so this is where we’re going to start talking about unit economics. And hopefully, your eyes will not glaze over. I want to make sure that—this is important, this is really actually beneficial. It’s not just a specific economic thing that you learned back in Econ 101. This is actually going to be useful and beneficial for your organization.
So, unit economics describes your product in terms of revenues and costs in relation to a unit KPI—that’s where the ‘unit’ term comes from in ‘unit economics’—that tracks closely with customer demand. So, that’s a really gross definition, I know, and I apologize.
Pete: You know, and we can even extend that a little bit further and give some good examples. Like, maybe if you are a website that provides eLearning services, your unit might be the number of daily active users or thousands of daily active users, right, could be a thing. That could be a unit that you’re selling. I actually worked at a SaaS company where we sold a piece of software that would run per server, and we broke our unit down to the servers—the things that we sold—down to that level.
Jesse: Yeah, if you are in the airline industry, for example, your unit would probably be every passenger. How many passengers are you able to sell tickets to on every plane? What do those costs look like?
Corey: If your mean time to WTF for a security alert is more than a minute, it's time to look at Lacework. Lacework will help you get your security act together for everything from compliance service configurations to container app relationships, all without the need for PhDs in AWS to write the rules. If you're building a secure business on AWS with compliance requirements, you don't really have time to choose between antivirus or firewall companies to help you secure your stack. That's why Lacework is built from the ground up for the Cloud: low effort, high visibility and detection. To learn more, visit lacework.com.
Pete: And you don’t need just, like, one unit. Maybe you have one unit for your whole platform like the whole gross production spend breaks down into one specific unit, you could do that. But you could also have units at a per-service level because maybe it’s like you’re processing a lot of documents. I worked for an email archiving company, forever ago, and our unit was the amount of emails that were indexed and archived so we could figure out, we might have one customer who just didn’t generate a lot of emails, but they had tons of users. Well, one of our units was the volume of emails that we were indexing and archiving for that customer, whereas on the flip side, if maybe our spend was driven more by user count, and not document count, maybe that’s what we want our unit to be, is per user.
Jesse: Yeah. It’s really important to call out that you might have a single easy-to-define unit; you might have a more complex relationship that’s weighted with a couple different factors of different components of the architecture. But ultimately, your unit KPI and how you break out your costs to support your customers will be unique to your overall business.
Pete: Exactly. And this is where you’re only going to find this answer out with a lot of conversations, internally. It could come to you pretty easily, you know, just based on how your business is. But I think for a lot of folks using Amazon, especially if you’re just in a specific business unit inside of a broader business, it could be a little bit more challenging to figure out. But what you’re really trying to do is just figure out, when X changes, our spend changes, and we spend more or we spend less. Try to solve for X. That’s really what you’re trying to do.
Jesse: Okay. So, now we’ve covered the unit KPI part of this conversation. Awesome. So, we’re done, Pete, right? We just take our AWS bill and then—
Jesse: —divide it by the unit and we’re done.
Pete: So, easy. I’ve got my unit. I’ve got my bill. I got an iPhone that can do a calculator. Good to go.
Jesse: [laugh]. Good. We’re done. Well, wait. What about if I have multiple AWS accounts? Wait, what if I have multiple different products?
Pete: Yeah, that’s… I mean, I’ve kind of got a calculator. I mean, I might be here all day, but…
Jesse: [laugh]. I’ve got a whiteboard. We’ve got some time.
Pete: Yeah, we got time. That’s a great point, though. Again, what if you do have things that are just spread all over the place? What if you’ve got two different products, two different services inside the same account? Because of course, you would. That’s a super normal thing. I’m not even saying that sarcastically. That’s a super normal thing.
Pete: Well, how do you handle this? How do you handle shared services?
Jesse: Yeah. I mean I—
Pete: We could go on for too long on that one, but these are questions you really want to start asking.
Jesse: Yeah. And remember that you’re potentially going to have different unit KPIs for different products, for different business units. That’s fine. That’s expected. But make sure that you are measuring appropriately for each of those.
The incoming costs, the incoming revenues, and costs for each of these isn’t going to change. That’s coming from your tagged usage and your linked accounts, but maybe the unit that you’re dividing that spend by is going to change, and that’s fine. This is where a spreadsheet comes in super, super handy. I love my Excel spreadsheets for this. Very, very easy to just bring in all of the bill data across different accounts, and really clearly attribute this spend is for the service, or the spend is for this product, or the spend is for this business unit, and then divide that by the unit that we have in question to get your actual unit KPI, to get your unit economic metric.
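The spreadsheet calculation Jesse describes, attributed spend divided by a unit that can differ per product, can be sketched as follows. The product names, spend figures, and unit counts are hypothetical and only loosely echo the email-archiving and per-user examples from earlier in the episode.

```python
# Hypothetical monthly spend already attributed to each product
# (in practice this would come from tagged usage and linked accounts).
spend_by_product = {"archiving": 60_000.0, "search": 30_000.0}

# Each product can use a different unit: emails indexed for one,
# active users for the other. Both counts are invented.
units_by_product = {"archiving": 120_000_000, "search": 15_000}


def cost_per_unit(spend, units):
    """Divide each product's attributed spend by its own unit count."""
    result = {}
    for product, cost in spend.items():
        count = units.get(product)
        if not count:
            raise ValueError(f"no unit count recorded for {product!r}")
        result[product] = cost / count
    return result


unit_costs = cost_per_unit(spend_by_product, units_by_product)

for product, value in unit_costs.items():
    print(f"{product}: {value:.6f} per unit")
```

Tracking this figure month over month is what lets the earlier questions be answered: if the unit count doubles, the unit cost gives a first estimate of how the spend would move.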
Pete: Yeah, and this is where the superpower comes in. Once you have this number, now you can better understand and make product-level decisions. Again, whether you’re a SaaS product with a product you’re selling to external customers or building an internal tool, your product is the thing that the internal users consume. Your product decisions can now be driven by this. I mean, I have recollections of conversations with product teams, where they would talk about certain services internally, how they wanted to expand and do all the stuff with it.
And I said, “Well, right now that one service represents one-third of our total spend, right? Our gross margin, that is one-third. But we looked at the users, and it’s only being used by one percent.” When you have these big numbers and saying, “Wow, the company spends a third of their money on something one percent of people use,” then maybe that’s not the place we want to be investing product decisions into. Maybe it is, but you don’t know enough to have that conversation unless you have this data.
Jesse: Absolutely. I think there’s one other small caveat that we haven’t touched on that I do want to call out, and this comes back to your conversation about tagging. We have noticed that a lot of teams want to tag to a certain extent, and then start building their showback models immediately, which is great that you’ve got investment, you’ve got energy, you really want to get to that showback model, get to that chargeback model, that unit economic model space. But if your usage or cloud usage is not thoroughly tagged and accurately tagged, your resulting data is not going to be accurate either. So, we think about this in terms of a cost margin or a cost error.
So, for example, if your production spend or your production usage is only 60% tagged, that means you’ve got 40% error in that data that’s coming in; your cloud spend for production has 40% error margin, which is huge.
Pete: Yeah, exactly. Track your untagged spend, as well as your tagged spend. I mean, make sure you have a story for the things that are not tagged. That includes things like data transfer and things that maybe are not as taggable within AWS. That’s an important aspect of this that you’ll want to make sure you’re at least not forgetting about.
Even if you can’t tag it, you don’t have a solution for it, make sure it’s in the back of your head that this is maybe not as accurate of a forecast because we’re just taking data transfer and dividing it by product versus actually looking at which product uses the most data transfer.
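As a rough illustration of the error margin idea, the following Python sketch computes tag coverage from a list of line items and then spreads the untagged remainder evenly across products, which is the crude fallback Pete mentions. The line items are invented to match the 60% tagged example, and the even split is an assumption, not a recommendation.

```python
# Hypothetical production line items as (product tag, cost); None means untagged,
# for example data transfer that could not be attributed.
line_items = [
    ("product-a", 45_000.0),
    ("product-b", 15_000.0),
    (None, 40_000.0),
]


def tag_coverage(items):
    """Fraction of total spend that carries a product tag."""
    total = sum(cost for _, cost in items)
    tagged = sum(cost for tag, cost in items if tag is not None)
    return tagged / total


def allocate_untagged_evenly(items):
    """Crude fallback: split untagged spend equally across tagged products."""
    by_product = {}
    untagged = 0.0
    for tag, cost in items:
        if tag is None:
            untagged += cost
        else:
            by_product[tag] = by_product.get(tag, 0.0) + cost
    share = untagged / len(by_product)
    return {product: cost + share for product, cost in by_product.items()}


coverage = tag_coverage(line_items)
error_margin = 1 - coverage
allocated = allocate_untagged_evenly(line_items)

print(f"Tagged: {coverage:.0%}, unattributed: {error_margin:.0%}")
print(allocated)
```

The even split keeps the totals adding up, but it says nothing about which product actually generated the untagged cost, so any per-product figure built on it inherits that uncertainty.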
Jesse: Yeah, and this is a tough concept, so don’t feel bad if you listen to this episode again, don’t feel bad if you go download the Unconventional Guide from the Duckbill Group website—we’ll have the link in the show notes—this one is a tough concept because it brings in a lot of other moving parts to ultimately get at this one unifying really, really important idea. This is one that we see a lot of clients and potential clients struggle with. So, if you’re taking some time to understand this concept, you’re not alone.
Pete: Exactly. This is the goal of all of the previous work, and this is something that you would measure in just a multi-year commitment in most businesses. And the larger the business is, the longer that work is going to take because it’s hard, there are a lot of moving pieces, and so many things need to be done in advance of all of this. And again, realize you’re not doing this work in a vacuum. There’s things that are moving and shifting as it’s all happening. So, don’t beat yourself up if you’re looking at this and thinking to yourself, “This is just a huge task. I’m never going to get this done.” It’s just not something that’s going to happen overnight.
All right, well, hey, if you’ve enjoyed this podcast, if you’ve enjoyed this series, please go to lastweekinaws.com/review and give it a five-star rating on your podcast platform of choice, whereas if you hated this podcast, please go to lastweekinaws.com/review and give it a five-star rating, but then tell us what’s the next series you want to have if you didn’t like this one.
Also, don’t forget, you can give us your feedback and any questions that we’ll be continuing to answer in future episodes at lastweekinaws.com/QA. You don’t need to put your name, can be totally anonymous. Give us your question. We’d love to dive into some of those topics.
And finally, you can download our Unconventional Guide, the whole PDF of everything we’ve talked about, at the Duckbill Group website. We’ll include that link in our show notes and you can head over there. Thanks again.
Announcer: This has been a HumblePod production. Stay humble.