Episode Summary

Join Pete and Jesse for another installment of the Unconventional Guide to AWS Cost Management. This episode focuses on tagging, why it’s important, how many organizations treat it as an afterthought, how tagging is a lot like your doctor telling you to eat more vegetables, how tagging is an easy problem to solve except for the fact no one is incentivized to solve it, different strategies for tagging, why tagging for the sake of tagging won’t deliver the results you’re hoping for, the stakeholders you should rope into a discussion about your tagging strategy, how a two-person startup might not understand why they are tagging things today but will be super happy they did in three years, why tagging policies should change over time, and more.

Episode Show Notes & Transcript



Corey: This episode is sponsored in part by LaunchDarkly. Take a look at what it takes to get your code into production. I’m going to just guess that it’s awful because it’s always awful. No one loves their deployment process. What if launching new features didn’t require you to do a full-on code and possibly infrastructure deploy? What if you could test on a small subset of users and then roll it back immediately if results aren’t what you expect? LaunchDarkly does exactly this. To learn more, visit launchdarkly.com and tell them Corey sent you, and watch for the wince.

Pete: Hello, and welcome to the AWS Morning Brief: Fridays From the Field. I'm Pete Cheslock.

Jesse: I'm Jesse DeRose.

Pete: And we're back again, Jesse. We are back. But really have we gone anywhere to begin with?

Jesse: We've been making our way slowly but surely through this Unconventional Guide. Lots of really interesting recommendations, lots of really interesting feedback from all of you, which we really, really appreciate. We can't wait to dive into some of those ideas deeper in future episodes.

Pete: Yeah. And don't forget, you can give us additional feedback and questions at lastweekinaws.com/QA, feel free to add your name. Or not. Doesn't matter. It can be totally anonymous. That's fine with us. So today, we're talking about a topic that is very near and dear to our hearts. 

Jesse: Yes.

Pete: It is tagging.

Jesse: Yes.

Pete: Tagging your resources in Amazon, or I mean really any cloud provider; any place you can tag something you probably should. And we're going to talk a little bit about strategies for that, how people use their tags, just all the fun things related to it. Tagging, it's easy to do, right, Jesse? You just tag your resources and all your problems go away.

Jesse: Yep. Thanks, everybody, have a good night.

Pete: So yeah, if you've enjoyed this podcast, please go to—no, I’m just kidding. 

Jesse: [laugh].

Pete: Tagging is probably the thing that most companies are doing poorly, simply because it's hard, and it's an afterthought, and if you didn't have a really solid forced strategy to ensure tags and force compliance, you're probably not going back to fix it.

Jesse: Yeah. It's not thought about as something that's a first-class citizen in the cloud world. When you think about the things that are important to your business model, you might think about getting your application out the door and running, maybe talking about business requirements for availability, failover, data retention, but tagging is nowhere on that list. That's not something that I think any organization thinks about as part of an MVP, let alone future iterations of their products.

Pete: Tagging feels much like the same feeling I get when my doctor says that I should eat more veggies. 

Jesse: Oof.

Pete: I know they're good for me; I know we need to do this. They have vitamins, and fiber, and all these wonderful things. But in order to make those veggies something I want to eat, we have to learn to make it more delicious. Personally, I find duck fat works to make them more delicious. I wish we could apply a duck fat strategy to the tagging problem.

Jesse: Yeah, it's not an easy problem to solve. Or rather, I should say it is an easy problem to solve, but it's not something that anybody is quickly incentivized to solve. Tagging, just for the sake of tagging, it doesn't work.

Pete: Yeah, it's that there really are no incentives for it. No good incentives. It's usually because someone came over to your desk and said, “Hey, what's this charge for? And who's using it? And what's the deal with this?” 

And you're going into Cost Explorer, and you're like, “Uh, I don't know. It's in this one account.” And that's as far as you can go to figure out who did what and why that thing is the way it is.

Jesse: Yeah. There are so many different tagging strategies that we've seen. We've seen some clients talk about tagging as a way to potentially penalize engineers who aren't tagging or who are spending too much money. We've seen organizations who are tagging to reward teams that are tagging all their spend or keeping their spend optimized. Across the board, there are just so many different ways to go about this.

Pete: So let's assume you are like most of the companies that we've seen. Definitely not all: there are some rare gems out there that are making tagging a long-term and continual process, and we're actually going to talk about how to do that in a future episode. But let's say you're just looking at your bill, you're looking at your usage, and you're saying to yourself, “Okay. I need to be better at this.” What do they say? “The journey of a thousand miles starts with a single step.” What is that first step?

Jesse: Yeah, there's a lot of different ways to go about this. I think there's a couple great places to start. Now, I will say AWS has a thrilling 24-page best practices white paper that we’ll throw a link to in the show notes.

Pete: Have you read that, Jesse?

Jesse: I will say that I have read parts of it. I have not read all of it, and so I want to make it very, very clear to all of our listeners, this is not a document that needs to become the holy grail for your organization. I think in the same way that you could read the SRE book from Google and have some good takeaways, you can skim through this white paper, maybe read through a couple of the sections that seem most applicable to your organization, and then start with those ideas, start with those best practices, and then build them over time organically; develop them over time organically.

Pete: I like to read it some nights when I'm just having trouble sleeping, and maybe by page two or three I’m just out.

Jesse: Yeah. There's a lot of content in there talking about what to tag, why to tag. I think the best place for any organization to start is to think about what are the important things that we need to tag. And that's a conversation that's going to involve not just engineers, but also finance, potentially IT, maybe also security teams, depending on how your organization is built. Because ultimately, what you want to do is understand what are the things that my organization cares about when it comes to our cloud usage? 

Maybe engineers care about which teams are using which services or they care about who owns which services. So, for example, when there's that i3 instance that somebody spun up and forgot to spin down, and everyone goes, “Well, it's not me, clearly you did it.” And then you realize that it's tagged with an owner tag that says that I did it, then you can't really argue with that. But then also think about maybe finance wants to know, what is the accounting unit for each of these resources? Is it usage for cost of goods sold, or COGS for example; something maybe in a production environment? 

Is it in a development environment and associated with research and development, for example? Is it something that the entire organization needs to use? Like sort of a general or accounting section. And then ultimately they can use that information to break down spend from a financial perspective, for forecasting purposes, for business finance purposes, to really help better understand how the organization overall is using the Cloud.
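One way to picture the outcome of that conversation is a small, written-down tag policy that can be checked mechanically. The sketch below is only an illustration: the tag keys (`owner`, `team`, `environment`, `accounting-unit`) and their allowed values are hypothetical names chosen to mirror the examples in this discussion, not an AWS-defined schema or anything the hosts prescribe.

```python
# Hypothetical tag policy. Keys and allowed values are illustrative assumptions
# modeled on the owner / team / environment / accounting-unit examples above.
REQUIRED_TAGS = {
    "owner": None,  # None means any non-empty value is accepted
    "team": None,
    "environment": {"production", "development", "qa", "security"},
    "accounting-unit": {"cogs", "r-and-d", "general"},
}


def validate_tags(tags):
    """Return a list of problems found in `tags`; an empty list means they pass."""
    problems = []
    for key, allowed in REQUIRED_TAGS.items():
        value = tags.get(key, "").strip()
        if not value:
            problems.append(f"missing required tag: {key}")
        elif allowed is not None and value.lower() not in allowed:
            problems.append(f"invalid value {value!r} for tag {key}")
    return problems


if __name__ == "__main__":
    # The forgotten i3 instance: it has an owner, but finance cannot classify it.
    forgotten_i3 = {"owner": "jesse", "environment": "development"}
    for problem in validate_tags(forgotten_i3):
        print(problem)
```

Keeping the policy as plain data makes it easy for finance, security, and engineering to review the same list, and easy to change when the policy is revisited.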

Corey: This episode is sponsored by ExtraHop. ExtraHop provides threat detection and response for the Enterprise (not the starship). On-prem security doesn’t translate well to cloud or multi-cloud environments, and that’s not even counting IoT. ExtraHop automatically discovers everything inside the perimeter, including your cloud workloads and IoT devices, detects these threats up to 35 percent faster, and helps you act immediately. Ask for a free trial of detection and response for AWS today at extrahop.com/trial.

Pete: Yeah, tagging can really go beyond and capture a lot of information. You can store, what is it, 255 key-value pairs for a lot of Amazon resources. Now, there are limitations, of course. The most effective way to get just 100 percent coverage and allocation on your spend is to do an account per maybe business unit or product. That is obviously pretty complex and can be challenging to do, but tagging can get you into the 90 percent range of coverage. 

And those tags, I think to Jesse's point, what you mentioned was, if you're just tagging for the sake of tagging, don't waste your time. If you're looking at all these resources, and you're clicking around the Amazon tooling that can help you tag and categorize, unless you have a plan for what you want to answer with those tags, just don't even waste your time. But having those conversations with finance, having those conversations with your security teams, I've seen interesting use cases, not for an access control reason, but as a way of just tracking these resources from a security perspective. One of my favorite uses of tagging was at a previous company where we ran multiple accounts per environment. So, this was kind of earlier in Amazon, when they didn't have a lot of tooling for running multiple accounts. 

You wouldn't want to run 100 accounts on Amazon at this time; that was just way too much. The tooling just wasn't advanced yet. But we ran maybe four accounts. We ran our production, maybe our development, our QA, you know, a security account. Really, really basic. 

But every once in a while, we had a need to run systems in our production environment that were kind of like test systems. They were not load testing, but sometimes we’d want to analyze large amounts of data, test out new versions of software, but that software running on all those servers isn't directly impactful with our cost of goods sold. So, when looking at the Amazon bill, your Amazon bill for production, it's going to look inflated. And your CFO is going to come over and say, “Why is the bill so high? Why did it grow so much?” 

And so what we did was use tagging to basically identify what resources were running in production that we could subtract from our cost of goods sold, and that actually allowed us, from a financial standpoint, to improve our gross margin numbers and make them as accurate as possible.
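The arithmetic behind that story can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the process Pete's former company used: the `accounting-unit` tag key, the `cogs` value, the resource names, and the dollar amounts are all hypothetical, and untagged spend is counted as COGS here simply as a conservative default.

```python
from decimal import Decimal


def split_production_spend(line_items, tag_key="accounting-unit", cogs_value="cogs"):
    """Split production spend into (COGS total, non-COGS total) using one tag.

    `line_items` is an iterable of (resource_id, monthly_cost, tags) tuples.
    Resources without the tag are treated as COGS (an assumption of this sketch).
    """
    cogs = Decimal("0")
    non_cogs = Decimal("0")
    for _resource_id, cost, tags in line_items:
        if tags.get(tag_key, cogs_value) == cogs_value:
            cogs += cost
        else:
            non_cogs += cost
    return cogs, non_cogs


if __name__ == "__main__":
    # Made-up line items for a production account.
    production_bill = [
        ("web-fleet", Decimal("1200.00"), {"accounting-unit": "cogs"}),
        ("data-analysis-test", Decimal("800.00"), {"accounting-unit": "r-and-d"}),
        ("untagged-volume", Decimal("100.00"), {}),
    ]
    cogs, non_cogs = split_production_spend(production_bill)
    print(f"COGS: {cogs}, subtracted from COGS: {non_cogs}")
```

`Decimal` is used instead of floats so that money totals do not pick up binary rounding noise.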

Jesse: So let's say you've talked to engineers, you've talked to finance, you’ve talked to product, security, IT, whoever else, and you've built this great tagging strategy, this great tagging policy that you now want to enforce across the engineering organization. There are lots of different ways to enforce it. And again, a lot of engineers aren't incentivized to add tags to their resources, so in order to make sure that all of your taggable resources are tagged, there are a couple of different things to think about. I think the number one thing is to think about how you can automatically add tags to your resources through your automation, through systems like your CI/CD deployment pipeline. Can you automatically add all of the tags related to your tagging policies based on a CI/CD pipeline that creates these resources? Or maybe infrastructure as code that automatically has these tags set?
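As a rough sketch of what "automatically add tags" could look like inside a deployment script, the helper below merges pipeline-wide default tags into whatever tags a resource already declares. The function names and tag keys are hypothetical. The only AWS-specific detail is the list of `Key`/`Value` pairs, which is the shape many AWS APIs and CloudFormation resources use for tags.

```python
def apply_default_tags(resource_tags, default_tags):
    """Merge pipeline-wide default tags with a resource's own tags.

    Tags set explicitly on the resource take precedence over the defaults.
    """
    merged = dict(default_tags)
    merged.update(resource_tags)
    return merged


def to_aws_tag_list(tags):
    """Convert a plain dict into a sorted list of {"Key": ..., "Value": ...} pairs."""
    return [{"Key": key, "Value": value} for key, value in sorted(tags.items())]


if __name__ == "__main__":
    # Hypothetical defaults that a CI/CD pipeline might inject for every deploy.
    pipeline_defaults = {"team": "platform", "environment": "production"}
    resource_tags = {"owner": "pete"}
    print(to_aws_tag_list(apply_default_tags(resource_tags, pipeline_defaults)))
```

Because the defaults are applied in one place, engineers only need to set the tags that differ for a given resource, which lowers the effort of complying with the policy.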

Pete: Yeah, the key point is that no matter what you start with, it's going to change. It's going to change next year or the year after. That one great strategy you created, it's going to live as long as it's going to live, but it's definitely not going to live forever. So, just getting started, I think, to Jesse's point, integrating tags into your CI and CD systems, into your Terraform, your CloudFormation, however you provision these assets, is a great way of doing that. And just make it the default for net new services. You can go back and fix other things later, but start by making that the default. But I think one other great thing is when you do come up with this great strategy, don't keep it a secret.

Jesse: Yes, absolutely. We'll talk about this one a little bit later in another episode, but please, please share it with everybody.

Pete: Yeah, exactly. You spent all this time to pull together all these different groups and come up with this strategy. Now is the time to, again, pull together all these groups and let them know about it. So, there are these different ways of getting started, and Amazon tooling has gotten so much better at helping people identify their untagged resources and tag different things. These tools are just so much better than they were before that I feel like there's less of an excuse now for tagging coverage to be a very low percentage. 
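To make "a very low percentage" concrete, tag coverage can be measured as the share of spend that carries a required tag. The sketch below assumes cost data has already been exported into simple (resource id, cost, tags) tuples. That tuple format, the `owner` key, and the sample figures are hypothetical, and this is not a description of how any AWS tool calculates coverage.

```python
def tag_coverage(line_items, required_key="owner"):
    """Return (fraction of spend that has `required_key`, ids of untagged resources).

    `line_items` is an iterable of (resource_id, cost, tags) tuples.
    An empty bill is reported as 0.0 coverage (an arbitrary choice in this sketch).
    """
    total = 0.0
    tagged = 0.0
    untagged_ids = []
    for resource_id, cost, tags in line_items:
        total += cost
        if tags.get(required_key):
            tagged += cost
        else:
            untagged_ids.append(resource_id)
    coverage = tagged / total if total else 0.0
    return coverage, untagged_ids


if __name__ == "__main__":
    # Made-up monthly costs in dollars.
    bill = [
        ("api-servers", 900.0, {"owner": "jesse"}),
        ("mystery-bucket", 100.0, {}),
    ]
    coverage, untagged = tag_coverage(bill)
    print(f"{coverage:.0%} of spend is tagged; untagged: {untagged}")
```

Weighting by cost rather than by resource count keeps attention on the untagged resources that actually move the bill.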

It's just so well integrated into the CloudFormations, you know, all these things. It's all there. It's all at your disposal. And if you think it's going to be a waste of time, Jesse and I can hopefully calm your fears: any investment you make into tagging, as long as it's well planned out with various teams, will pay just a massive amount of dividends in the future. You don't realize it yet. 

If you're a startup, and you're two people, and you're tagging your items from day one, and it's like, “Ugh, it’s such a pain.” In two or three years from now, you're going to look back and be like, “I'm so glad we did that.” And hilariously, Jesse, we run a lot of applications, right—

Jesse: Yeah.

Pete: —in our own accounts. How do we do with tagging? 

Jesse: Yeah, I was just about to say, we ran into this exact issue within our organization because we started spinning up resources without a clear tagging policy, and then all of a sudden, when our bill reached a point where Corey said we needed to know where these costs were going, we didn’t. We didn't have a tagging policy in place. So, we are a clear example of what not to do. And we realized, this is important. And we've seen this in other organizations as well. It is never ever too early to start tagging your resources in AWS.

Pete: The cobbler's children have no shoes, basically.

Jesse: Yeah.

Pete: It's so true. And what's funny is that, largely, we only started to care when the credits ran out. Which is incredibly common. No one cares what you're spending when you're a startup and you've got your two years of credits, or whichever program that you followed to get those free credits. You got your credits from setting up your company on Stripe, and you got your Amazon startup credits, all this other stuff. Yeah, go nuts. Just provision things. There's no bill. And then the month that those credits run out, and you're like, “Huh. Shit.” [laugh].

Jesse: Yeah, I didn't realize that we were, uh… spending that much money over there. 

Pete: Exactly. So, we could definitely say we've experienced it at our previous companies, places where we've had to adopt a tagging strategy later, or where tagging just wasn't well supported until later on and then you have to add it. It is so much harder. So, the earlier you can start, the better. But don't fret: if you're later in your company's lifecycle and you don't have any tagging in place, just get started today. Start having those conversations. That's the first step: start having those conversations.

Jesse: Absolutely. And to Pete's point earlier, this tagging policy is going to change over time, so make it a point to reassess this information, maybe on a quarterly basis, maybe on an annual basis, to course-correct over time because these tags are going to change over time. The needs of the organization are going to change over time. And that's fine. That's absolutely valid. So, make sure that you just put that on the calendar now so you can have those conversations.

Pete: Yeah, and we will be back next week to talk more about this and, once you've started tagging, how to improve and continually improve upon that and what strategies you can follow. 

So, if you have enjoyed this podcast, please go to lastweekinaws.com/review, give it a five-star review on your podcast platform of choice, whereas if you hated this podcast, please go to lastweekinaws.com/review and give it a five-star rating on your podcast platform of choice and tell us how good you're doing at tagging. Again, I always like to remind folks you can go to lastweekinaws.com/QA. Send us your questions. We would love to hear from you, and we'll be answering those in future episodes. Thanks again.

Announcer: This has been a HumblePod production. Stay humble.