The Independent AWS Security Researcher with Scott Piper

Episode Summary

Scott Piper, now a Principal Engineer at Block and firmly established as the independent AWS security researcher, is becoming a regular fixture on the show. Corey was a little let down at the news that Scott had taken a corporate gig, but Scott is able to quell his concerns: he is still as public and prolific as ever. Scott holds true to his security research, and as he tells it, his public presence is something companies can learn to value instead of sweat. Corey frequently mines Scott’s Twitter for his weekly security newsletter, an excellent source for the latest in security. Scott offers kind words on the newsletter, discusses IMDSv2, and shares some requests for areas where AWS could improve its security game!

Episode Show Notes & Transcript

About Scott
Cloud security historian.
Developed flaws.cloud, CloudMapper, and Parliament.
Founding team for fwd:cloudsec

Transcript
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.


Corey: This episode is sponsored in part by our friends at Vultr. Optimized cloud compute plans have landed at Vultr to deliver lightning fast processing power, courtesy of third gen AMD EPYC processors without the IO, or hardware limitations, of a traditional multi-tenant cloud server. Starting at just 28 bucks a month, users can deploy general purpose, CPU, memory, or storage optimized cloud instances in more than 20 locations across five continents. Without looking, I know that once again, Antarctica has gotten the short end of the stick. Launch your Vultr optimized compute instance in 60 seconds or less on your choice of included operating systems, or bring your own. It's time to ditch convoluted and unpredictable giant tech company billing practices, and say goodbye to noisy neighbors and egregious egress forever. Vultr delivers the power of the cloud with none of the bloat. "Screaming in the Cloud" listeners can try Vultr for free today with $150 in credit when they visit getvultr.com/screaming. That's G E T V U L T R.com/screaming. My thanks to them for sponsoring this ridiculous podcast.


Corey: Couchbase Capella Database-as-a-Service is flexible, full-featured and fully managed with built in access via key-value, SQL, and full-text search. Flexible JSON documents aligned to your applications and workloads. Build faster with blazing fast in-memory performance and automated replication and scaling while reducing cost. Capella has the best price performance of any fully managed document database. Visit couchbase.com/screaminginthecloud to try Capella today for free and be up and running in three minutes with no credit card required. Couchbase Capella: make your data sing.


Corey: Welcome to Screaming in the Cloud, I’m Corey Quinn. I am joined by a returning guest with a bit of a different job. Scott Piper was formerly an independent security researcher—basically the independent security researcher in the AWS space—but now he’s a Principal Engineer over at Block. Scott, welcome back.


Scott: Thanks for having me, again, Corey.


Corey: So, you’ve taken a corporate job, and when that happened, I have to confess, I was slightly discouraged because oh, now it’s going to be like one of those stories of when someone you know goes to work at Apple because no one knows anyone at Apple; we just used to know people who went there and then we kind of lost touch because it’s a very insular thing. Not that Block slash Square slash whatever they’re calling themselves this week has that reputation. But InfoSec is always a very nuanced space and companies that have large footprints and, you know, handle financial transaction processing generally don’t encourage loud voices that attract attention around anything that isn’t directly aligned with the core mission of the company. But you’re still as public and prolific as ever. Was that a difficult balance for you to strike?


Scott: So, when I was considering employment options, that was something that I made clear to any companies that I was talking to, that this is something that probably will and should continue because a lot of my value to these companies is because I’m able to have discussions, able to impact change because of that public persona. So yeah, so I think that it was something that they were aware of, and a risk that they took. [laugh]. But yeah, it’s been useful.


Corey: This is the sort of conversation I would have expected to have with, “Yeah, things seem to be continuing the same, and I haven’t rocked any boats, yet and they haven’t fired me, knock on wood.” Except that recently you’ve launched yet something else that I am personally a fan of. Now, before we get into the specifics of what it is you’re up to these days, I should call out that since your last appearance on this show, I have really leaned into the Thursday newsletter podcast duo of Last Week in AWS: Security Edition. Rounding up what happened the previous week—yes, it was the previous week, and it comes out on Thursdays—because, you know, timing and publication, things are hard, computers, you know how it is—aimed at a target audience that is very much not you: People who have to care about security, but are not immersed in the space. It’s a, “All right, what now? What do I have to pay attention to?”


Because there’s a lot of noise in this space, there’s a lot of vendor-captured stuff out there. There’s very little that is for people who work in security but don’t have the word security anywhere near their job title. And I have to confess that one of my easy shortcuts is, “Oh, it’s a pretty thin issue this week,” which is not inherently a bad thing, let’s be clear, it’s not yay, the three things you need to care about in security then eight more of filler; that’s not what we’re about. But I always want to make sure I didn’t miss something meaningful, and one of my default publication steps is, “What’s Scott been tweeting about this week?” Just to make sure that I didn’t miss something that I really should be talking about.


And every single time I pull up your Twitter feed, I find myself learning something, whether it’s a new concept, or whether it is a nuance on an existing thing I was already aware of. So first, thank you for all the work that you do as a member of the community, despite having a, “Regular corporate job,” quote-unquote, you’re still very present. It’s appreciated.


Scott: Thank you. Yeah. And I mean, that newsletter is great for people that don’t want to be spending multiple hours per day trolling through Twitter and reading that. So, it provides, also, something great for the community to not have to spend all that time on Twitter like I do [laugh], unfortunately.


Corey: It also strives—sort of—to be something approaching an upbeat position of not quite as cynical and sarcastic as the Monday issue. I try to be not just this is the thing that happened, but go a little bit into and this is why it matters. This is how to think about it. This thing that Amazon put out is nonsense, however, here’s the kernel hidden within it that might lead to something, such as thinking about how you do sign-on, or how to think about protecting MFA devices, or stuff like that you normally care about a lot right after you really should have cared about it but didn’t at all. So, it’s just the idea of aiming at a slightly different audience.


Scott: Yeah, definitely. And it provides value in that it takes some delay so that you can read what everybody has written, how they’ve responded to the different news outtakes; you’re not just including the hot takes. For example, as of this morning, there’s a certain incident with an authentication provider, and it’s not really clear if there was actually a breach or not. And so it’s valuable to take a moment to understand what happened, let all the voices express their points, so you can summarize those issues.


Corey: An internal term that we’ve used to describe the position here is that I am prolific but I also have things to do as a part of my job that do not involve sitting there hitting refresh on Twitter like mad all the time. The idea is to have the best take not the first take—


Scott: Exactly.


Corey: And if that means that I lose a bunch of eyeballs and early ad impressions in the middle of the night and whatnot, well, great. I don’t sell ad impressions anyway, so what does it matter? It winds up lending itself to a more thoughtful analysis of figuring out, in the sober light of day, is this a nothing-burger or is this enormous? With that SSO issue that you’re alluding to—[cough] Okta—sorry, something caught in my throat there—very clearly, something is going on, but if I had written next week’s newsletter last night while it was still very unclear, it would have been a very different tone than the one that I would have written this morning after their public statement, and even still certainly a different tone than it would take in a couple of days, once more information is almost certain to come to light. And that is something that is, I think, underappreciated, certainly on Twitter, where an old tweet—there’s nothing worse than an old tweet unless you’re using it to drag someone for something—that, “Well, we have different perspectives on that nowadays. It’s not 2018 anymore.” Right. Okay, cool.


Scott: Yep. [laugh].


Corey: But something that you’ve done has been a bit of a pivot lately. Historically, you have been right there in my sweet spot of needling cloud providers for their transgressions in various ways. Cool, right there with you. We could co-author a book on the subject. But lately, you’ve started a community list of IMDSv2 abuses.


Now, first, we should talk about what IMDSv2 is. It’s a name that clearly came from Amazon because that’s a name only a cloud provider bad at naming things could possibly love. What is it?


Scott: So, it’s the Instance Metadata Service, Version Two. If there’s a version two, you can imagine there was a version one at some point. And the version two—


Corey: And there’s a version two because Amazon prod—the first one was terrible, but they don’t turn anything off, ever, so this is the way and the light and the future; we’re going to leave that old thing around until your great-grandchild dies of old age.


Scott: Exactly, yeah. So, when EC2s first came out, and IAM roles first came out, you wanted to give your EC2s the ability to use AWS privileges, so this is how those EC2s are getting access to their credentials that they can use. And the way in which this was originally done was there’s this magic IP address, this 169.254.169.254 IP address, which is very important for security on AWS because if anything can access that magic IP address from an EC2 instance, you can steal the credentials of that EC2, and therefore basically become that EC2 instance, in terms of what it can do in the AWS environment.


And so in 2019, there was a large breach of Capital One that was related to this. And so as a result of that—I think that AWS probably had this new version, probably, in the works for a while, but I think that motivated their faster release of this new version, and so IMDSv2 changed how you would obtain these credentials. So, you basically—instead of making a single GET request to this IP address, now you had to make multiple requests, starting with a PUT request instead of a GET request, there was a challenge and response, there’s the hop limit. So, there’s all these various things that are going to make it harder and basically mitigate a lot of the different types of vulnerabilities that previously would be used in order to obtain these credentials. The problem, though, is that IMDSv1 still exists on EC2s, unless you as a customer are enforcing IMDSv2.


And so, in order to do this in a large environment, it’s difficult—theoretically, it’s a simple thing; all you should have to do is update your SDK and now you’re able to make use of the latest version. And if you’re using any version of the SDK that was released in the past two-plus years, you already should be using IMDSv2 there, but you have to enforce it. And so that’s where the problem is. And what was most problematic to me is now that I work for a company, we have run into the problem that there are some vendor solutions that we use that weren’t allowing us to enforce IMDSv2 across all of our different accounts. And this is something I’ve heard from a number of other customers as well.


And so I decided to create this list with vendors that I’ve had to deal with, vendors that other customers have had to deal with, in order to basically try and solve this problem once and for all. It’s been multiple years now and a lot of these vendors, unfortunately, were also security vendors. And so that makes the conversation a little bit easier, to basically put them on this wall-of-shame and say, “You’re a security vendor and you’re not allowing your customers to enforce best practices of security.”
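
[Editor's sketch, not part of the conversation] The IMDSv2 flow Scott describes (a PUT to obtain a session token, then a GET that presents it) might look roughly like the Python below. The endpoint paths and header names are the ones AWS documents for IMDSv2; the function names, the six-hour token lifetime, and the two-second timeout are assumptions chosen for illustration. The two request builders run anywhere, but fetch_metadata only works from inside an EC2 instance.

```python
import urllib.request

# Link-local metadata address mentioned in the conversation.
IMDS_BASE = "http://169.254.169.254"
# Assumption: request the longest token lifetime IMDSv2 allows (six hours).
TOKEN_TTL_SECONDS = 21600


def build_token_request(ttl_seconds: int = TOKEN_TTL_SECONDS) -> urllib.request.Request:
    """Step 1: a PUT that asks the metadata service for a session token."""
    return urllib.request.Request(
        f"{IMDS_BASE}/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def build_metadata_request(token: str, path: str) -> urllib.request.Request:
    """Step 2: a GET that presents the session token in a header."""
    return urllib.request.Request(
        f"{IMDS_BASE}/latest/meta-data/{path.lstrip('/')}",
        method="GET",
        headers={"X-aws-ec2-metadata-token": token},
    )


def fetch_metadata(path: str, timeout: float = 2.0) -> str:
    """Run both steps in order. Only reachable from inside an EC2 instance."""
    with urllib.request.urlopen(build_token_request(), timeout=timeout) as resp:
        token = resp.read().decode()
    with urllib.request.urlopen(build_metadata_request(token, path), timeout=timeout) as resp:
        return resp.read().decode()
```

Under IMDSv1 the second request alone, without the token header, was enough. Requiring a PUT plus a custom header is a large part of why bugs that only let an attacker trigger a plain GET no longer reach the credentials.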


Corey: I want to call out a couple of things around that. Originally the metadata service was used for a number of other things—still is—beyond credentials. It is not the credential service as envisioned by a lot of folks. The way that—also we’ll find those credentials empty until there’s an EC2 instance role, and those credentials will both be scoped to what that instance does and automatically rotated in the fullness of time so they’re not long-lived credentials that once you have them, they will last forever. This is, of course, a best practice and something you should be leveraging, but scope those credentials down, or you wind up with one of the ways that was chained together in the Capital One breach a few years ago.


It’s also worth noting that service would have been more useful earlier in time with a few functions. For example, you can use the metadata service to retrieve the instance tags about the EC2 instance. When I requested it in 2015, it was not possible. But they released it in January of this year, 2022, long after we had all come up with workarounds for this, where we could have used that to set the hostname internally on the system, if you’re looking for something basic and easy. It would have been something that you could have used to automatically self-register with DNS without having to jump through a whole bunch of hoops to do it manually.


And you look at this, and it’s wow, that’s a whole lot of crappy tooling I can just throw into the trash heap of history that you don’t need anymore. But the IMDSv2, you’re right, makes it a lot harder, there has to be a conversation, not just something you can sort of bankshot something off of to get access to it. And it’s a terrific mitigation. What I’ve liked about your list of more or less shaming companies for doing this is, on the one hand, you have companies who take themselves off of the list as soon as it’s up there. It’s, “Oh, we love when people talk about us. Wait, what’s that? They’re saying something unkind? On the internet?” And they’ll fix it, which honestly is better than I expected.


And then every once in a while you’ll see something that’s horrifying of, “Oh, yeah, we’re not vulnerable to that at all because we tell you to create permanent long-lived credentials, store them on disk and we’ll use those instead.” And it’s… that is, like, guaranteeing that no one is going to break down your door by making your walls out of tissue paper. Don’t do that. Like, that has gone so far around the bend that it has come back around again. So, hopefully that got fixed.


Scott: And I think you pointed out a couple of things I want to talk about with this is that, one, it has actually been very successful in terms of getting large vendors to make changes. Currently, of the seven vendors that have ever been listed there, three of them have already made fixes and have been removed from the list. And the list has only been up for about a month. And so, in terms of getting enterprise solution vendors to make changes within, like, just a few weeks is very surprising to me. And these are things that people have been asking for for years now, and so it had motivated them a lot there.


And the other thing that I want to point out is people have looked at the success that it’s had and considered maybe we should make wall-of-shame lists, for all the things that we want. And I want to point out that there are some things about this problem, the IMDSv2 specifically, that make it work for having this wall-of-shame list like this. One of them is that not supporting or not allowing customers to enforce IMDSv2 is basically always bad. There is not a use case where you can make a claim—


Corey: There is no nuance where that, in this case, is the thing to do, like having an open S3 bucket: There are use cases where that is very much something you want to do, but it’s the uncommon case.


Scott: Exactly. That I think is an important thing. Another thing is it’s not just putting up a list, you know, like that is what people are seeing publicly, but behind the scenes, there’s a lot of other things that are happening. One, I am communicating with various customers, customers that are reporting this issue to me, in order to try to better understand what’s happening there, so that I can then relay that information to the company. So, I’m not just putting up the list; I’m also, behind the scenes, having conversations with these different companies to try to get timelines from them, to try to make sure that they are aware of the problem, they are aware that they’re on this list, how to get off the list. So, there’s that conversation happening.


There’s also the conversation that I’m having with AWS in order to make various requests that AWS improve this for customers, to make this easier. And this is something that is public on that repo. I have my list of requests to AWS so that people can relay that to their own TAMs at AWS to basically say these are things we want as well. And so this includes things like, “I want an AWS account to have the ability to default to always be enforcing IMDSv2.” You know, so as an example, when you create an EC2 through the web console—which people can say, oh, you should always be using Infrastructure as Code; the reality is many folks are using the web console to create EC2s to do other changes.


And when you create an EC2 in the web console, by default, it’s going to allow IMDSv1 still. And so my request to AWS is, you should allow me to just default enforce IMDSv2. Also, the web console does not give you visibility into which EC2s are enforcing it and which ones are not. And also, you do not have the ability in the web console to enforce it. You cannot click on an EC2 and say, “Please enforce it now.”


So, it’s all these various, like, minor changes that I’m requesting AWS to do.


Corey: It has to be done at instance creation time.


Scott: Exactly. And so there is an API call that you can make in order to change it afterwards, but that’s only an API so you have to use the CLI or some other mechanism; you can’t do it in the web console. But the other thing that I’m requesting AWS do is if security is a priority for AWS and they have all these other partners that are security companies, that they should be requiring their partners to also be enforcing this in their various products. So, if a partner is basically not allowing your AWS customers to enforce security best practices, then perhaps that partnership should be revoked in some way. And so that’s a more aggressive thing that I’m asking AWS to do, but I think is reasonable.
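
[Editor's sketch, not part of the conversation] As a rough illustration of the after-the-fact API route Scott mentions, the Python below first picks out instances that still accept IMDSv1 and then asks EC2 to require tokens on them. It assumes input shaped like the Reservations list returned by EC2's DescribeInstances call and a client object exposing modify_instance_metadata_options, as a boto3 EC2 client does. The function names are hypothetical, and this is one possible approach, not the method used by anyone in the episode.

```python
from typing import Any, Dict, Iterable, List


def instances_allowing_imdsv1(reservations: Iterable[Dict[str, Any]]) -> List[str]:
    """Return IDs of instances whose metadata endpoint does not require a token."""
    flagged: List[str] = []
    for reservation in reservations:
        for instance in reservation.get("Instances", []):
            options = instance.get("MetadataOptions", {})
            # A disabled endpoint serves neither IMDSv1 nor IMDSv2, so skip it.
            if options.get("HttpEndpoint") == "disabled":
                continue
            # "optional" means IMDSv1 still works; "required" enforces IMDSv2.
            if options.get("HttpTokens") != "required":
                flagged.append(instance["InstanceId"])
    return flagged


def enforce_imdsv2(ec2_client: Any, instance_ids: Iterable[str]) -> List[str]:
    """Require session tokens on each instance and return the IDs that were changed."""
    changed: List[str] = []
    for instance_id in instance_ids:
        ec2_client.modify_instance_metadata_options(
            InstanceId=instance_id,
            HttpTokens="required",
            HttpEndpoint="enabled",
        )
        changed.append(instance_id)
    return changed


# Hypothetical usage with boto3 (third-party; needs AWS credentials and permissions):
#   import boto3
#   ec2 = boto3.client("ec2")
#   pages = ec2.get_paginator("describe_instances").paginate()
#   reservations = [r for page in pages for r in page["Reservations"]]
#   enforce_imdsv2(ec2, instances_allowing_imdsv1(reservations))
```

Running the audit function on its own first is the safer order: any software on an instance that still speaks only IMDSv1 will lose access to its credentials once tokens are required.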


Corey: I’d also like them to get all of their own first-party services to support this, too.


Scott: That’s true as well. So, AWS is currently on the list. And so, they have one service, Data Pipelines, which if you are an AWS customer and you are using that service, you are not going to be able to enforce IMDSv2 in your environment. So, AWS themselves, unfortunately, is not allowing customers to enforce this. And then AWS themselves in their own production servers, we have seen indications that they do not enforce IMDSv2 on their own production servers.


So, the best practice that they are telling customers to follow, they unfortunately are not following it themselves. And so the way in which we saw this was Orca is a security company that ended up finding this issue with AWS—and there’s a lot of questions in terms of what all exactly they found—but they had this post that they called “Breaking Formation” in which they were somehow able to find—basically exploit to some degree—and again, it’s unclear exactly what they were able to exploit here—but they were able to exploit AWS production servers that are responsible for the CloudFormation service. And in their blog post, they had a screenshot which showed that those production servers are not enforcing IMDSv2. And so AWS themselves is struggling with this as well, as are many customers. So, it’s something that, you know, I put together this list of requests in hopes that AWS can make it easier for not only customers but also themselves to be able to enforce it.


Corey: There are a lot of different things that we wish companies did differently, particularly if that company is AWS. Why is this the particular windmill that you’ve decided to tilt at given—let’s say—it’s not exactly slim pickins out there as far as changes that we wish companies would make? Obviously, you mentioned at one point, there is no drawback to enabling this, but a lot could be said for other aspects as well. Why is this one so important?


Scott: So, in part, I personally have some, I guess, history with this [laugh], basically, IMDSv2, and so we can discuss this. This is back when Capital One had their breach in 2019, there was this Senator, Senator Ron Wyden, who sent this email over to AWS, to Steve Schmidt, who was the CISO at the time there and still is the CISO, and he basically—


Corey: Now, he’s head of security for all of Amazon.


Scott: Yeah, yeah.


Corey: CJ is now the AWS CISO. And he has the good sense to hide.


Scott: Yeah. [laugh]. So, at the time, this Senator Ron Wyden had sent over this email—and obviously it’s not Senator Ron Wyden himself, you know, it’s one of his, like, technical people on staff that is able to give him this information—and he sends this email to AWS saying, “Hey, this metadata service played a role in this very significant breach. Why hasn’t this been fixed?” And Steve Schmidt responded, and because it’s communications with a senator, I guess it has to become public.


So, Steve Schmidt responds, saying that, “Hey, we never knew that this was an issue before,” is essentially what he responds with. And that irked me because I had reported this to AWS previously, as had many other people. So, there was a conference presentation by this guy Andrés Riancho at BlackHat, I believe in 2014, and he had presented previously in 2013, so it was a known issue; it had been around for a while. But I took the time to actually report it to AWS Security. So, I went through the correct channel of making sure that AWS was aware of a security concern, as a security researcher—so reporting it through that correct channel there—and provided Senator Ron Wyden with all this information.


And so he then requested that the FTC begin a federal investigation into AWS, related to basically not following the best practices that security researchers have recommended. So, that was, kind of like, my early, I guess, involvement with this issue. So, it’s something that I’ve been interested in for a while to make sure that this is resolved completely at some point.


Corey: This episode is sponsored by our friends at Oracle Cloud. Counting the pennies, but still dreaming of deploying apps instead of “Hello, World” demos? Allow me to introduce you to Oracle’s Always Free tier. It provides over 20 free services and infrastructure, networking, databases, observability, management, and security. And—let me be clear here—it’s actually free. There’s no surprise billing until you intentionally and proactively upgrade your account. This means you can provision a virtual machine instance or spin up an autonomous database that manages itself, all while gaining the networking, load balancing, and storage resources that somehow never quite make it into most free tiers needed to support the application that you want to build. With Always Free, you can do things like run small-scale applications or do proof-of-concept testing without spending a dime. You know that I always like to put asterisks next to the word free? This is actually free, no asterisk. Start now. Visit snark.cloud/oci-free that’s snark.cloud/oci-free.


Corey: It’s always fun watching where people come from, as far as the security problems that they call out. There was, I believe in the cloud security forum Slack, a thread recently about what security issues are top-of-mind and that should be fixed as a baseline expectation. In fact, let me dig it out because that is one of those things that I think is well worth having the conversation properly on this.


Good examples of risky, insecure defaults in AWS. And people are talking about IMDSv1, and they’re talking about all kinds of other in-depth things, and my contribution to it was, “If I go and I spin up an AWS account, until I go out of my way, I’m operating as root in that account. That seems bad.” And a few responses to that were basically facepalming: “Oh, of course.” I wish that there were an easy way to get AWS SSO as the default because it is the right answer for so many different things. It solves so many painful problems that otherwise you’re going to wind up stuck with.


And this stuff is hard and confusing; when people are starting out with this for the first time, they’re not approaching this from, “All right, how do I be extremely secure?” They want to get some work done. For fun a year ago, I spun up a test account—unattached to any organization—and because account aliases are globally unique, I somehow came up with the account ‘shitposting’ because that’s pretty much what I use it for. The actual reason I wanted that was I wanted something completely unattached from any other account that I could easily take screenshots from at any point, and the worst case scenario is okay, I’ve exposed some credential of my own in an account that has no privileged access to anything; I just have to apologize for all the Bitcoin mining now. And honestly, I think AWS would love that marketing campaign; they’d see my face on a billboard looking horrified. It’ll be great.


But I turned on every security service as I went because, of course, security is the most important thing. And there were so many to turn on, and the bill was approaching 50 bucks a month for an empty account. And it’s… it starts to feel a little weird and more than a little wrong.


Scott: [laugh]. Yeah, my personal concern in terms of default security features is really that problem of the cost controls, I think that that still is a big issue that AWS does not have cost controls such that when a student wants to try and use AWS for the very first time and somehow they spin up a large EC2 instance, or they just you know, end up creating an access key and that access key gets leaked and somehow their account gets compromised and used for Bitcoin mining, now they’re stuck with that large AWS bill. For a student who has no budget, is in debt, and now is suddenly being, you know, hit with multiple thousands of dollars on their bill, that I think is very problematic, and that is something that I wish AWS would change as a default is basically, if you are creating an AWS account for the very first time, have some type of—I don’t know how this would look, but maybe just be able to say, like, I don’t ever want this AWS account to spend more than $100 per month, and I’m okay if you end up destroying all my data in the account because I have no money and money is more important to me than whatever data I may store in here.


Corey: Make an answer to that question mandatory, just as putting a credit card in is mandatory. Because there are two extremes here. It’s more or less the same problem of AWS not knowing who its customers are beyond an AWS account, but there’s a spectrum somewhere between I’m a student who wants to learn how the cloud works, and my approach to security is very much the same. Don’t let randos spin up resources in my account, and I don’t ever want to be charged. If that means you turn off my “Hello World” blog post, okay, great.


On the other end, it’s this is Netflix. And this is our, you know, eight-millionth account that we’re spinning up to do a thing and what do you mean you’re applying service quotas to it? I thought we had an understanding?—everything is a service quota, let’s be clear—


Scott: Yep.


Corey: —or a company that’s about to run a Superbowl ad. Yeah, there’s going to be a lot of traffic there. Don’t touch it. Just make it work. We don’t care what it costs.


Understanding where you fall on the cost perspective—as well as a security point of view of, “We’re a bank, which means forget security best practices, we have compliance obligations that cannot be altered in this account and here’s what they are.” There has to be a way that is easy and approachable for people to wind up moving that slider to whatever position best represents them. Because there are accounts where I never want to be charged a thing. And that’s an important thing because—and I’ve been talking about this for a while because I’m convinced it’s a matter of time—that poor kid who wound up trading on margin at Robinhood, woke up saw that he was seven-hundred-and-some-odd grand in debt and killed himself. When it all settled out, I think he turned something like a $30,000 profit when all was said and done, which just serves to make it worse.


I can see a scenario in which that happens, and part of the contributors to it are that we used to see that the surprise bill for compromised accounts was 10, 15, 20 grand. Now, they’re 70 to 90 because there are more regions, more services to run containers—because of course there are—and the payoff is such that the people exploiting this have gotten very practiced and very operationalized at spinning up those resources quickly, and they cost a lot very quickly. I mean, the third use case that they’re not aiming at yet is people like me, where it’s, oh, you have a free account that’s sandboxed; I want to get the high score on the free tier because all their fraud is attuned to you making money. With me, it’s nope, just going to run up the score to embarrass Amazon. That’s not a common exploit vector, but I’m very much here.


Scott: [laugh]. Yep. And that also is the thing though: The Denial of Wallet attack is also a concern on AWS, as well, where you’ve written a blog post about this, how if you are able to make use of data transfer in different ways, you can run up very high multi-million dollar bills in people’s AWS accounts and even AWS’s own protections and defenses against trying to look for cost spikes and things like that are delayed by multiple hours. And so you can still end up spending a lot of money in people’s accounts, or one thing that’s wild is S3 object locking; that feature, the whole purpose behind it is to ensure data can never be deleted. It exists for various compliance reasons, so even AWS themselves cannot delete certain data.


So, if an attacker is able to abuse that functionality in somebody’s account, they can end up locking data such that for the next 100 years, it can never be deleted and you’re going to have to pay for that for the next 100 years inside your account. The only way of not paying for that anymore is to move everything that you have in an AWS account to a new account, and then ask AWS to delete that account, which is not going to be reasonable under most circumstances.


Corey: Yeah, alternatively, it’s one of those scenarios where well, the only other option is to start physically ripping hard drives out of racks in a bunch of different data centers. It’s wild to me. It’s such an attack surface that honestly I believed for the longest time that AWS Security is otherworldly good. And as we start seeing from these breaches, no, what really is otherworldly good is their ability to apply pressure to people not to go public with things they discover that they then wind up keeping quiet because once this whole Orca stuff came out, we started digging, and Aidan Steele found some stuff where you could just get unfiltered, raw outputs of CloudTrail events by setting up a couple of rules in weird ways.


And that was a giant problem, and it was never disclosed publicly. I don’t know if any of my events were impacted; I can’t trust that they would have told me if they were. And for the first time, I’m looking at things like confidential computing, which are designed around well, what if you don’t trust your cloud provider? Historically, I guess I was naive because my approach was, “Well, then you shouldn’t be using the cloud.” Now it’s, “Well, that’s actually kind of a good point.”


Because it’s not that I don’t trust my cloud provider to necessarily do what they’re telling me. I just don’t trust them to tell me what they’re doing. And that’s part of it. The, “Well, we found an issue, but you can’t prove we had an issue, so we’re going to say nothing.” And when it comes to light—because it always does—it erodes trust in a big way. And trust is everything in cloud.


Scott: Yeah. And so with some of the breaches that have come out, I created another GitHub repo to start tracking all the different security incidents that I could find for the three cloud providers, Azure, GCP, and AWS. And so on there, I started listing not only some of the blog posts from security companies that had been able to exploit vulnerabilities in the cloud providers, but also just anything else that I felt was a security mistake in some way. And so there’s a number of things I tried to avoid on there. Like, I tried to avoid listing something that’s kind of like a business decision, for example, services that get released that don’t have CloudTrail support. That’s a security concern to me, but that’s kind of a business decision that they decided to release a service before it supported all that functionality.


So, I tried to start listing off all those different things in order to also keep track of, you know, is there a security provider that’s worse than the others? Are there any type of common patterns that I can see? And so I tried to look through some of those different things. And that’s been interesting because also I really only focus on AWS, and so I haven’t really known what all has been happening with GCP and Azure. And that was interesting because there’s been two issues that have happened on AWS where the exact same issue happened on the other cloud providers. And so that tells me, that’s concerning to me because that tells me that—


Corey: Because those are not discovered at the same time, let’s be clear.


Scott: Yeah. These were, like, over a year apart. And so basically, somebody had found something on GCP, and then a year-plus later, somebody else found the exact same issue on AWS. And then similarly, there was an issue with Azure and then a year-plus later, same issue on AWS. And that’s concerning because that tells me that AWS may not be monitoring what are the security issues that are impacting other cloud providers, and therefore checking whether or not they happen to themselves.


That’s something that you would expect a mature security team to be doing: monitoring what are public incidents that are happening to my competitors, and am I impacted similarly? Or what can I do to try and identify those issues, fix them, make sure they never happen? All those types of steps in terms of security maturity. And that’s something that I’m a little concerned about, that we’ve seen those issues happen before. There’s also, on AWS specifically, they have had a number of issues related to their IAM-managed policies that keep cropping up.


And so they have had a number of incidents where they were releasing policies that shouldn’t have been released in some way. And that’s concerning because it showed that they don’t really have the change management process that you would expect. Usually, you would expect a company to be having GitHub PRs and approval processes and things like that, in order to make sure that there’s a second set of eyes on something before it gets released.
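
One way to picture the "second set of eyes" Scott is asking for is an automated check that runs before a policy change ships. The Python sketch below is an assumption about how such a gate might look, not a description of any process AWS uses. It diffs the `Allow` actions of two IAM policy documents and flags additions that do not look read-only. The helper names and the list of read-style prefixes are hypothetical, and the `exampleservice` actions in the sample are made up for illustration.

```python
import json

# Assumed, non-exhaustive set of action-name prefixes treated as read-only.
READ_PREFIXES = ("Get", "List", "Describe")


def _as_list(value):
    """IAM policy JSON allows a single item or a list for Statement and Action."""
    return value if isinstance(value, list) else [value]


def allowed_actions(policy: dict) -> set:
    """Collect every action named in an Allow statement."""
    actions = set()
    for stmt in _as_list(policy.get("Statement", [])):
        if stmt.get("Effect") == "Allow":
            actions.update(_as_list(stmt.get("Action", [])))
    return actions


def review_change(old: dict, new: dict) -> dict:
    """Summarize what a proposed policy change adds and removes."""
    before, after = allowed_actions(old), allowed_actions(new)
    added = after - before
    # An action looks like "service:Name"; flag names without a read-style prefix.
    # Wildcards such as "s3:*" are flagged too, since "*" has no such prefix.
    suspicious = {a for a in added if not a.split(":", 1)[-1].startswith(READ_PREFIXES)}
    return {
        "added": sorted(added),
        "removed": sorted(before - after),
        "needs_human_review": sorted(suspicious),
    }


if __name__ == "__main__":
    current = {"Version": "2012-10-17", "Statement": [
        {"Effect": "Allow", "Action": ["s3:GetObject", "ec2:DescribeInstances"], "Resource": "*"}]}
    # Hypothetical bad change: every read action replaced by write-style actions.
    proposed = {"Version": "2012-10-17", "Statement": [
        {"Effect": "Allow", "Action": ["exampleservice:CreateThing", "exampleservice:DeleteThing"], "Resource": "*"}]}
    print(json.dumps(review_change(current, proposed), indent=2))
```

A prefix check like this is deliberately crude. It would not replace a reviewer, but it would turn a wholesale swap of read actions for write actions into a blocked change instead of a published one.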


Corey: Particularly things of this level of sensitivity. This is not—like, I was making fun of them a day or two ago for having broken the copyright footer and not updating them since 2020 because instead of the ‘copyright’ symbol, they used an ‘at’ symbol. Minor stuff, but like that’s fun to needle people about, but it doesn’t actually matter for anything.


Scott: Yeah.


Corey: Security matters and mistakes show.


Scott: Yeah. And so there had been some examples where they released a policy that was called, like, ‘cheese puffs something’ and it’s like, okay, that’s clearly, like, an internal service of some sort. But I’d called them out and, like, I’d sent an email to AWS Security being like, “Hey, you need to make sure that you have change management processes on your IAM policies because one day you’re going to do something that is bad.” And one day they did. They made a change to the read-only access policy, and that basically—they removed every single privilege; somebody, you know, internally, had ended up removing every single privilege from the read-only access policy and replacing them with a whole bunch of write privileges for, I think, the Cassandra service.


And so, that was like, clearly they’ve made a mistake that they should have made sure they were correcting because you know, they had these previous incidents. Another kind of similar one was in December, there was a support policy where they had added S3 GetObject to that policy, and that was concerning in terms of have they just given all of their support employees access to everybody’s content in their S3 buckets? And so AWS made some statements saying that there were other controls in place there so it wouldn’t have been possible. But it’s those types of things that [crosstalk 00:33:17]—


Corey: Originally, those statements were made on Twitter, let’s be clear here.


Scott: Yes. Yeah. [laugh].


Corey: And I feel like there’s a—while I deeply appreciate how accessible a lot of their senior people are, I cannot point the executive leadership team at a client to some tweets that someone made. That is not a public statement of record that works on this.


Scott: Exactly.


Corey: They’re learning. We’ll get there sooner or later, I presume. I want to thank you for taking the time to speak with me, as always. I’ll throw links to these repos into the show notes, but if people want to know more about what you have to say, where’s the best place to find you?


Scott: So, my Twitter, which, unfortunately, is a handle written in hex, but it’s—‘dabbadoo’ is how you would pronounce it, but it’s probably easiest to see a link for it. So, that’s probably the main place to look for me.


Corey: That’s why my old Twitter handle was my amateur radio callsign. I don’t use that one anymore. It’s just easier. And I think that’s the right answer. Besides, given what you do, it’s easy enough if people want your attention. They screw up badly enough, you’ll come to them.


Scott: Yep. [laugh].


Corey: Scott, I really appreciate your time. Thanks again.


Scott: Thank you.


Corey: Scott Piper, Principal Engineer at Block and, more or less, roving security troubadour for lack of a better term. I’m Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you’ve enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you’ve hated this podcast, please leave a five-star review on your podcast platform of choice or a comment on the YouTubes saying that this episode is completely invalid because you wind up using the old version of the metadata service and you’ve never had a problem. That you know of.


Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.


Announcer: This has been a HumblePod production. Stay humble.

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: This episode is sponsored in part by our friends at Vultr. Optimized cloud compute plans have landed at Vultr to deliver lightning fast processing power, courtesy of third gen AMD EPYC processors without the IO, or hardware limitations, of a traditional multi-tenant cloud server. Starting at just 28 bucks a month, users can deploy general purpose, CPU, memory, or storage optimized cloud instances in more than 20 locations across five continents. Without looking, I know that once again, Antarctica has gotten the short end of the stick. Launch your Vultr optimized compute instance in 60 seconds or less on your choice of included operating systems, or bring your own. It's time to ditch convoluted and unpredictable giant tech company billing practices, and say goodbye to noisy neighbors and egregious egress forever. Vultr delivers the power of the cloud with none of the bloat. "Screaming in the Cloud" listeners can try Vultr for free today with a $150 in credit when they visit getvultr.com/screaming. That's G E T V U L T R.com/screaming. My thanks to them for sponsoring this ridiculous podcast.

Corey: Couchbase Capella Database-as-a-Service is flexible, full-featured and fully managed with built in access via key-value, SQL, and full-text search. Flexible JSON documents aligned to your applications and workloads. Build faster with blazing fast in-memory performance and automated replication and scaling while reducing cost. Capella has the best price performance of any fully managed document database. Visit couchbase.com/screaminginthecloud to try Capella today for free and be up and running in three minutes with no credit card required. Couchbase Capella: make your data sing.

Corey: Welcome to Screaming in the Cloud, I’m Corey Quinn. I am joined by a returning guest with a bit of a different job. Scott Piper was formerly an independent security researcher—basically the independent security researcher in the AWS space—but now he’s a Principal Engineer over at Block. Scott, welcome back.

Scott: Thanks for having me, again, Corey.

Corey: So, you’ve taken a corporate job, and when that happened, I have to confess, I was slightly discouraged because oh, now it’s going to be like one of those stories of when someone you know goes to work at Apple because no one knows anyone at Apple; we just used to know people who went there and then we kind of lost touch because it’s a very insular thing. Not the Block slash Square slash whatever they’re calling themselves this week has that reputation. But InfoSec is always a very nuanced space and companies that have large footprints and, you know, handle financial transaction processing generally don’t encourage loud voices that attract attention around anything that isn’t directly aligned with the core mission of the company. But you’re still as public and prolific as ever. Was that a difficult balance for you to strike?

Scott: So, when I was considering employment options, that was something that I made clear to any companies that I was talking to, that this is something that probably will and should continue because a lot of my value to these companies is because I’m able to have discussions, able to impact change because of that public persona. So yeah, so I think that it was something that they were aware of, and a risk that they took. [laugh]. But yeah, it’s been useful.

Corey: This is the sort of conversation I would have expected to have with, “Yeah, things seem to be continuing the same, and I haven’t rocked any boats, yet and they haven’t fired me, knock on wood.” Except that recently you’ve launched yet something else that I am personally a fan of. Now, before we get into the specifics of what it is you’re up to these days, I should call out that since your last appearance on this show, I have really leaned into the Thursday newsletter podcast duo of Last Week in AWS: Security Edition. Rounding up what happened the previous week—yes, it was the previous week, and it comes out on Thursdays—because, you know, timing and publication, things are hard, computers, you know how it is—aimed at a target audience that is very much not you: People who have to care about security, but are not immersed in the space. It’s a, “All right, what now? What do I have to pay attention to?”

Because there’s a lot of noise in this space, there’s a lot of vendor-captured stuff out there. There’s very little that is for people who work in security but don’t have the word security anywhere near their job title. And I have to confess that one of my easy shortcuts is, “Oh, it’s a pretty thin issue this week,” which is not inherently a bad thing, let’s be clear, it’s not yay, the three things you need to care about in security then eight more of filler; that’s not what we’re about. But I always want to make sure I didn’t miss something meaningful, and one of my default publication steps is, “What’s Scott been tweeting about this week?” Just to make sure that I didn’t miss something that I really should be talking about.

And every single time I pull up your Twitter feed, I find myself learning something, whether it’s a new concept, or whether it is a nuance on an existing thing I was already aware of. So first, thank you for all the work that you do as a member of the community, despite having a, “Regular corporate job,” quote-unquote, you’re still very present. It’s appreciated.

Scott: Thank you. Yeah. And I mean, that newsletter is great for people that don’t want to be spending multiple hours per day trolling through Twitter and reading that. So, it provides, also, something great for the community to not have to spend all that time on Twitter like I do [laugh], unfortunately.

Corey: It also strives—sort of—to be something approaching an upbeat position of not quite as cynical and sarcastic as the Monday issue. I try to be not just this is the thing that happened, but go a little bit into and this is why it matters. This is how to think about it. This thing that Amazon put out is nonsense, however, here’s the kernel hidden within it that might lead to something, such as thinking about how you do sign-on, or how to think about protecting MFA devices, or stuff like that you normally care about a lot right after you really should have cared about it but didn’t at all. So, it’s just the idea of aiming in a slightly different audience.

Scott: Yeah definitely. And it provides value that it does, it takes some delay so that you can read what everybody has written, how they’ve responded to the different news outtakes, you’re not just including the hot takes. For example, as of this morning, there’s a certain incident with an authentication provider, and it’s not really clear if there was actually a breach or not. And so it’s valuable to take a moment to understand what happened, get all the voices to have expressed their points, so you can summarize those issues.

Corey: An internal term that we’ve used to describe the position here is that I am prolific but I also have things to do as a part of my job that do not involve sitting there hitting refresh on Twitter like mad all the time. The idea is to have the best take not the first take—

Scott: Exactly.

Corey: And if that means that I lose a bunch of eyeballs and early ad impressions in the middle of the night and whatnot, well, great. I don’t sell ad impressions anyway, so what does it matter? It winds up lending itself to a more thoughtful analysis of figuring out, in the sober light of day, is this a nothing-burger or is this enormous? With that SSO issue that you’re alluding to—[cough] Okta—sorry, something caught in my throat there—very clearly, something is going on, but if I had written next week’s newsletter last night while it was still very unclear, it would have been a very different tone than the one that I would have written this morning after their public statement, and even still a certainly different tone that it would take a couple of days once more information is almost certain to come to light. And that is something that is, I think, underappreciated in certainly on Twitter, where an old tweet—there’s nothing worse than an old tweet unless you’re using it to drag someone for something—that, “Well, we have different perspectives on that nowadays. It’s not 2018 anymore.” Right. Okay, cool.

Scott: Yep. [laugh].

Corey: But something that you’ve done has been a bit of a pivot lately. Historically, you have been right there in my sweet spot of needling cloud providers for their transgressions in various ways. Cool, right there with you. We could co-author a book on the subject. But lately, you’ve started a community list of [IMSDv2 00:07:04] abuses.

Now, first, we should talk about what IMSDv2 is. It’s the name that it clearly came from Amazon because that’s a name only a cloud provider bad at naming things could possibly love. What is it?

Scott: So, it’s the Instance Metadata Service, Version Two. If there’s a version two, you can imagine there was a version one at some point. And the version two—

Corey: And there’s a version two because Amazon prod—the first one was terrible, but they don’t turn anything off, ever, so this is the way and the light and the future; we’re going to leave that old thing around until your great-grandchild dies of old age.

Scott: Exactly, yeah. So, when EC2s first came out, and IAM roles first came out, you wanted to give your EC2s the ability to use AWS privileges, so this is how those EC2s are getting access to their credentials that they can use. And the way in which this was originally done was there’s this magic IP address, this 169.254.169.254 IP address, which is very important for security on AWS because if anything can access that magic IP address from an EC2 instance, you can steal their credentials of that EC2, and therefore basically become that EC2 instance, in terms of what it can do in the AWS environment.

And so in 2019, there was a large breach of Capital One that was related to this. And so as a result of that—I think that AWS probably had this new version, probably, in the works for a while, but I think that motivated their faster release of this new version, and so IMDSv2 changed how you would obtain these credentials. So, you basically—instead of making a single GET request to this IP address, now you had to make multiple requests, they were now PUT request instead of a GET request, there was a challenge and response, there’s the hop limit. So, there’s all these various things that are going to make it harder and basically mitigate a lot of the different types of vulnerabilities that previously would be used in order to obtain these credentials. The problem, though, is that IMDSv1 still exists on EC2s, unless you as a customer are enforcing IMDSv2.

And so, in order to do this in a large environment, it’s difficult—theoretically, it’s a simple thing; all you should have to do is update your SDK and now you’re able to make use of the latest version. And if you’re using any version of the SDK that was released in the past over two years, you already should be using IMDSv2 there, but you have to enforce it. And so that’s where the problem is. And what was most problematic to me is now that I work for a company, we have run into the problem that there are some vendor solutions that we use that weren’t allowing us to enforce IMDSv2 across all of our different accounts. And this is something I’ve heard from a number of other customers as well.

And so I decided to create this list with vendors that I’ve had to deal with, vendors that other customers have had to deal with, in order to basically try and solve this problem once and for all. It’s been multiple years now and a lot of these vendors, unfortunately, were also security vendors. And so that makes the conversation a little bit easier, to basically put them on this wall-of-shame and say, “You’re a security vendor and you’re not allowing your customers to enforce best practices of security.”

Corey: I want to call on a couple of things around that. Originally the metadata service was used for a number of other things—still is—beyond credentials. It is not the credential service as envisioned by a lot of folks. The way that—also we’ll find those credentials empty until there’s an EC2 instance role, and those credentials will both be scoped what that instance does and automatically rotated in the fullness of time so they’re not long-lived credentials that once you have them, they will last forever. This is, of course, a best practice and something you should be leveraging, but scope those credentials down, or you wind up with one of the ways that was chained together in the Capital One breach a few years ago.

It’s also worth noting that service would have been more useful earlier in time with a few functions. For example, you can use the metadata service to retrieve the instance tags about the EC2 instance. When I requested it in 2015, it was not possible. But they had released it in January of this year, 2022, long after we have all come up with workarounds for this, where we could have used that to set the hostname internally on the system, if you’re looking for something basic and easy. It would have been something then you could have used to automatically self-register with DNS without having to jump through a whole bunch of hoops to do it manually.

And you look at this, and it’s wow, that’s a whole lot of crappy tooling I can just throw into the trash heap of history you don’t need anymore. But the IMSDv2, you’re right, makes it a lot harder, there has to be a conversation, not just something you can sort of bankshot something off of to get access to it. And it’s a terrific mitigation. What I’ve liked about your list of more or less shaming companies for doing this is, on the one hand, you have companies who take themselves off of the list as soon as it’s up there. It’s, “Oh, we love when people talk about us. Wait, what’s that? They’re saying something unkind? On the internet?” And they’ll fix it, which honestly is better than I expected.

And then every once in a while you’ll see something that’s horrifying of, “Oh, yeah, we’re not vulnerable to that at all because we tell you to create permanent long-lived credentials, store them on disk and we’ll use those instead.” And it’s… that is, like, guaranteeing that no one is going to break down your door by making your walls out of tissue paper. Don’t do that. Like, that has gone so far around the band that has come back around again. So, hopefully that got fixed.

Scott: And I think you pointed out a couple of things I want to talk about with this is that, one, it has actually been very successful in terms of getting large vendors to make changes. Currently, of the seven vendors that have ever been listed there, are three of them have already made fixes and have been removed from the list. And the list has only been up for about a month. And so, in terms of getting enterprise solution vendors to make changes within, like, just a few weeks is very surprising to me. And these are things that people have been asking for for years now, and so it had motivated them a lot there.

And the other thing that I want to point out is people have looked at the success that it’s had and considered maybe we should make wall-of-shame lists, for all the things that we want. And I want to point out that there are some things about this problem, the IMDSv2 specifically, that make it work for having this wall-of-shame list like this. One of them is that not supporting or not allowing customers to enforce IMDSv2 is basically always bad. There is not a use case where you can make a claim—

Corey: There is no nuance where that, in this case, is the thing to do, like having an open S3 bucket: There are use cases where that is very much something you want to do, but it’s the uncommon case.

Scott: Exactly. That I think is an important thing. Another thing is it’s not just putting up a list, you know, like that is what people are seeing publicly, but behind the scenes, there’s a lot of other things that are happening. One, I am communicating with various customers, customers that are reporting this issue to me, in order to try to better understand what’s happening there, so that I can then relay that information to the company. So, I’m not just putting up the list; I’m also, behind the scenes, having conversations with these different companies to try to get timelines from them, to try to make sure that they are aware of the problem, they are aware that they’re on this list, how to get off the list. So, there’s that conversation happening.

There’s also the conversation that I’m happening with AWS in order to make various requests that AWS improve this for customers, to make this easier. And this is something that is public on that repo. I have my list of requests to AWS so that people can relay that to their own TAMs at AWS to basically say these are things we want as well. And so this includes things like, “I want an AWS account to have the ability to default to always be enforcing IMDSv2.” You know, so as an example, when you create an EC2 through the web console—which people can say, oh, you should always be using Infrastructure as Code; the reality is many folks are using the web console to create EC2s to do other changes.

And when you create an EC2 in the web console, by default, it’s going to allow IMDSv1 still. And so my request to AWS is, you should allow me to just default enforce IMDSv2. Also, the web console does not give you visibility into which EC2s are enforcing it and which ones are not. And also, you do not have the ability in the web console to enforce it. You cannot click on an EC2 and say, “Please enforce it now.”

So, it’s all these various, like, minor changes that I’m requesting AWS to do.

Corey: It has to be done at instance creation time.

Scott: Exactly. And so there is an API that you can make in order to change it afterwards, but that’s only an API so you have to use the CLI or some other mechanism; you can’t do it in the web console. But the other thing that I’m requesting AWS do is if security is a priority for AWS and they have all these other partners that are security companies, that they should be requiring their partners to also be enforcing this in their various products. So, if a partner is basically not allowing your AWS customers to enforce security best practices, then perhaps that partnership should be revoked in some way. And so that’s a more aggressive thing that I’m asking AWS to do, but I think is reasonable.

Corey: I’d also like them to get all of their own first-party services to support this, too.

Scott: That’s true as well. So, AWS is currently on the list. And so, they have one service, Data Pipelines, which if you are an AWS customer and you are using that service, you are not going to be able to enforce IMDSv2 in your environment. So, AWS themselves, unfortunately, is not allowing customers to enforce this. And then AWS themselves in their own production servers, we have seen indications that they do not enforce IMDSv2 on their own production servers.

So, the best practice that they are telling customers to follow, they unfortunately are not following it themselves. And so the way in which we saw this was Orca is a security company that ended up finding this issue with AWS—and there’s a lot of questions in terms of what all exactly they found—but they had this post that they called “Breaking Formation” in which they were somehow able to find—basically exploit to some degree—and again, it’s unclear exactly what they were able to exploit here—but they were able to exploit AWS production servers that are responsible for the CloudFormation service. And in their blog post, they had a screenshot which showed that those production servers are not enforcing IMDSv2. And so AWS themselves is struggling with this as well, as are many customers. So, it’s something that, you know, I put together this list of requests in hopes that AWS can make it easier for not only customers but also themselves to be able to enforce it.

Corey: There are a lot of different things that we wish companies did differently, particularly if that company is AWS. Why is this the particular windmill that you’ve decided to tilt at given—let’s say—it’s not exactly slim pickins out there as far as changes that we wish companies would make? Obviously, you mentioned at one point, there is no drawback to enabling this, but a lot could be said for other aspects as well. Why is this one so important?

Scott: So, in part, I personally have some, I guess, history with this [laugh], basically, IMDSv2, and so we can discuss this. This is back when Capital One had their breach in 2019, there was this Senator, Senator Ron Wyden, who sent this email over to AWS, to Steve Schmidt, who was the CISO at the time there and still is the CISO, and he basically—

Corey: Now, he’s head of security for all of Amazon.

Scott: Yeah, yeah.

Corey: CJ is now the AWS CISO. And he has the good sense to hide.

Scott: Yeah. [laugh]. So, at the time, this Senator Ron Wyden had send over this email—and obviously it’s not Senator Ron Wyden himself, you know, it’s one of his, like, technical people on staff that is able to give him this information—and he sends this email to AWS saying, “Hey, this metadata service played a role in this very significant breach. Why hasn’t this been fixed?” And Steve Schmidt responded, and because it’s communications between a senator, I guess it has to become public.

So, Steve Schmidt responds, saying that, “Hey, we never knew that this was an issue before,” is essentially what he responds with. And that irked me because I had reported this to AWS previously, as had many other people. So, there was a conference presentation by this guy Andrés Riancho at BlackHat, I believe in 2014, and he had presented previously in 2013, so it was a known issue; it had been around for a while. But I took the time to actually report it to AWS Security. So, I went through the correct channel of making sure that AWS was aware of a security concern, as a security researcher—so reporting it through that correct channel there—and provided Senator Ron Wyden with all this information.

And so, then he then requested that the FTC begin a federal investigation into AWS, related to basically not following the best practices that security researchers have recommended. So, that was, kind of like, my early, I guess, involvement with this issue. So, it’s something that I’ve been interested in for a while to make sure that this is resolved completely at some point.

Corey: This episode is sponsored by our friends at Oracle Cloud. Counting the pennies, but still dreaming of deploying apps instead of “Hello, World” demos? Allow me to introduce you to Oracle’s Always Free tier. It provides over 20 free services and infrastructure, networking, databases, observability, management, and security. And—let me be clear here—it’s actually free. There’s no surprise billing until you intentionally and proactively upgrade your account. This means you can provision a virtual machine instance or spin up an autonomous database that manages itself, all while gaining the networking, load balancing, and storage resources that somehow never quite make it into most free tiers needed to support the application that you want to build. With Always Free, you can do things like run small-scale applications or do proof-of-concept testing without spending a dime. You know that I always like to put asterisks next to the word free? This is actually free, no asterisk. Start now. Visit snark.cloud/oci-free that’s snark.cloud/oci-free.

Corey: It’s always fun watching where people come from, as far as the security problems that they call out. There was, I believe in the cloud security forum Slack, a thread recently about what security issues are top-of-mind and that should be fixed as a baseline expectation. In fact, let me dig it out because that is one of those things that I think is well worth having the conversation properly on this.

Good examples of risky, insecure defaults in AWS. And people are talking about IMDSv1, and they’re talking about all kinds of other in-depth things, and my contribution to it was, “If I go and I spin up an AWS account, until I go out of my way, I’m operating as root in that account. That seems bad.” And a few responses to that were basically facepalming: “Oh, of course.” I wish that there were an easy way to get AWS SSO as the default because it is the right answer for so many different things. It solves so many painful problems that otherwise you’re going to wind up stuck with.

And this stuff is hard and confusing; when people are starting out with this for the first time, they’re not approaching this from, “All right, how do I be extremely secure?” They want to get some work done. For fun a year ago, I spun up a test account—unattached to any organization—and because account aliases are globally unique, I somehow came up with the account ‘shitposting’ because that’s pretty much what I use it for. The actual reason I wanted that was I wanted something completely unattached from any other account that I could easily take screenshots from at any point, and the worst case scenario is okay, I’ve exposed some credential of my own in an account that has no privileged access to anything; I just have to apologize for all the Bitcoin mining now. And honestly, I think AWS would love that marketing campaign; they’d see my face on a billboard looking horrified. It’ll be great.

But I turned on every security service as I went because, of course, security is the most important thing. And there were so many to turn on, and the bill was approaching 50 bucks a month for an empty account. And it starts to feel a little weird and more than a little wrong.

Scott: [laugh]. Yeah, my personal concern in terms of default security features is really that problem of the cost controls. I think that that still is a big issue, that AWS does not have cost controls, such that when a student wants to try and use AWS for the very first time and somehow they spin up a large EC2 instance, or they just, you know, end up creating an access key and that access key gets leaked and somehow their account gets compromised and used for Bitcoin mining, now they’re stuck with that large AWS bill. For a student who has no budget, is in debt, and now is suddenly being, you know, hit with multiple thousands of dollars on their bill, that I think is very problematic, and that is something that I wish AWS would change as a default: basically, if you are creating an AWS account for the very first time, have some type of—I don’t know how this would look, but maybe just be able to say, like, I don’t ever want this AWS account to spend more than $100 per month, and I’m okay if you end up destroying all my data in the account because I have no money and money is more important to me than whatever data I may store in here.

Corey: Make an answer to that question mandatory, just as putting a credit card in is mandatory. Because there are two extremes here. It’s more or less the same problem of AWS not knowing who its customers are beyond an AWS account, but there’s a spectrum somewhere between I’m a student who wants to learn how the cloud works, and my approach to security is very much the same. Don’t let randos spin up resources in my account, and I don’t ever want to be charged. If that means you turn off my “Hello World” blog post, okay, great.

On the other end, it’s this is Netflix. And this is our, you know, eight-millionth account that we’re spinning up to do a thing and what do you mean you’re applying service quotas to it? I thought we had an understanding?—everything is a service quota, let’s be clear—

Scott: Yep.

Corey: —or a company that’s about to run a Superbowl ad. Yeah, there’s going to be a lot of traffic there. Don’t touch it. Just make it work. We don’t care what it costs.

Understanding where you fall on the cost perspective—as well as a security point of view of, “We’re a bank, which means forget security best practices, we have compliance obligations that cannot be altered in this account and here’s what they are.” There has to be a way that is easy and approachable for people to wind up moving that slider to whatever position best represents them. Because there are accounts where I never want to be charged a thing. And that’s an important thing because—and I’ve been talking about this for a while because I’m convinced it’s a matter of time—that poor kid who wound up trading on margin at Robinhood woke up, saw that he was seven-hundred-and-some-odd grand in debt, and killed himself. When it all settled out, I think he turned something like a $30,000 profit when all was said and done, which just serves to make it worse.

I can see a scenario in which that happens, and part of the contributors to it are that we used to see that the surprise bill for compromised accounts was 10, 15, 20 grand. Now, they’re 70 to 90 because there are more regions, more services to run containers—because of course there are—and the payoff is such that the people exploiting this have gotten very practiced and very operationalized at spinning up those resources quickly, and they cost a lot very quickly. I mean, the third use case that they’re not aiming at yet is people like me, where it’s, oh, you have a free account that’s sandboxed; I want to get the high score on the free tier because all their fraud is attuned to you making money. With me, it’s nope, just going to run up the score to embarrass Amazon. That’s not a common exploit vector, but I’m very much here.

Scott: [laugh]. Yep. And that also is the thing though: The Denial of Wallet attack is also a concern on AWS, as well, where you’ve written a blog post about this, how if you are able to make use of data transfer in different ways, you can run up very high multi-million dollar bills in people’s AWS accounts and even AWS’s own protections and defenses against trying to look for cost spikes and things like that are delayed by multiple hours. And so you can still end up spending a lot of money in people’s accounts, or one thing that’s wild is S3 object locking; that feature, the whole purpose behind it is to ensure data can never be deleted. It exists for various compliance reasons, so even AWS themselves cannot delete certain data.

So, if an attacker is able to abuse that functionality in somebody’s account, they can end up locking data such that for the next 100 years, it can never be deleted and you’re going to have to pay for that for the next 100 years inside your account. The only way of not paying for that anymore is to move everything that you have in an AWS account to a new account, and then ask AWS to delete that account, which is not going to be reasonable under most circumstances.
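To put the object-locking scenario in rough numbers, here is a minimal back-of-the-envelope sketch. The 10 TB of locked data and the $0.023 per GB-month storage price are assumptions chosen for illustration, not figures from the conversation, and real pricing varies by region and storage class and changes over time.

```python
# Rough cost of data that cannot be deleted for a fixed retention period.
# All inputs are illustrative assumptions, not quoted AWS prices.


def locked_storage_cost(gigabytes: float, price_per_gb_month: float, years: int) -> float:
    """Return the total storage cost in dollars over the whole retention period."""
    months = years * 12
    return gigabytes * price_per_gb_month * months


if __name__ == "__main__":
    # Hypothetical scenario: 10 TB locked for 100 years at a flat price.
    cost = locked_storage_cost(gigabytes=10_000, price_per_gb_month=0.023, years=100)
    print(f"${cost:,.0f}")
```

Under those assumed inputs the total comes to about $276,000, which is why a long retention period set by an attacker is expensive even when the amount of data is modest.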

Corey: Yeah, alternatively, it’s one of those scenarios where well, the only other option is to start physically ripping hard drives out of racks in a bunch of different data centers. It’s wild to me. It’s such an attack surface that honestly I believed for the longest time that AWS Security is otherworldly good. And as we start seeing from these breaches, no, what really is otherworldly good is their ability to apply pressure to people not to go public with things they discover that they then wind up keeping quiet because once this whole Orca stuff came out, we started digging, and Aidan Steele found some stuff where you could just get unfiltered, raw outputs of CloudTrail events by setting up a couple of rules in weird ways.

And that was a giant problem, and it was never disclosed publicly. I don’t know if any of my events were impacted; I can’t trust that they would have told me if they were. And for the first time, I’m looking at things like confidential computing, which are designed around well, what if you don’t trust your cloud provider? Historically, I guess I was naive because my approach was, “Well, then you shouldn’t be using the cloud.” Now it’s, “Well, that’s actually kind of a good point.”

Because it’s not that I don’t trust my cloud provider to necessarily do what they’re telling me. I just don’t trust them to tell me what they’re doing. And that’s part of it. The, “Well, we found an issue, but you can’t prove we had an issue, so we’re going to say nothing.” And when it comes to light—because it always does—it erodes trust in a big way. And trust is everything in cloud.

Scott: Yeah. And so with some of the breaches that have come out, I created another GitHub repo to start tracking all the different security incidents that I could find for the three cloud providers, Azure, GCP, and AWS. And so on there, I started listing not only some of the blog posts from security companies that had been able to exploit vulnerabilities in the cloud providers, but also just anything else that I felt was a security mistake in some way. And so there’s a number of things I tried to avoid on there. Like, I tried to avoid listing something that’s kind of like a business decision, for example, services that get released that don’t have CloudTrail support. That’s a security concern to me, but that’s kind of a business decision that they decided to release a service before it supported all that functionality.

So, I tried to start listing off all those different things in order to also keep track of you know, is there a security provider that’s worse than the others? Are there any type of common patterns that I can see? And so I tried to look through some of those different things. And that’s been interesting because also I really only focus on AWS, and so I haven’t really known what all has been happening with GCP and Azure. And that was interesting because there’s been two issues that have happened on AWS where the exact same issue happened on the other cloud providers. And so that tells me, that’s concerning to me because that tells me that—

Corey: Because those are not discovered at the same time, let’s be clear.

Scott: Yeah. These were, like, over a year apart. And so basically, somebody had found something on GCP, and then a year-plus later, somebody else found the exact same issue on AWS. And then similarly, there was an issue with Azure and then a year-plus later, same issue on AWS. And that’s concerning because that tells me that AWS may not be monitoring what are the security issues that are impacting other cloud providers, and therefore checking whether or not they happen to themselves.

That’s something that you would expect a mature security team to be doing is to be monitoring what are public incidents that are happening to my competitors, and am I impacted similarly? Or what can I do to try and identify those issues, fix them, make sure they never happen? All those types of steps in terms of security maturity. And that’s something that I’m a little concerned about, given that we’ve seen those issues happen before. There’s also, on AWS specifically, they have had a number of issues related to their IAM-managed policies that keep cropping up.

And so they have had a number of incidents where they were releasing policies that shouldn’t have been released in some way. And that’s concerning; it showed that they don’t really have a change management process that you would expect. Usually, you would expect a company to be having GitHub PRs and approval processes and things like that, in order to make sure that there’s a second set of eyes on something before it gets released.

Corey: Particularly things of this level of sensitivity. This is not—like, I was making fun of them a day or two ago for having broken the copyright footer and not updating them since 2020 because instead of the ‘copyright’ symbol, they used an ‘at’ symbol. Minor stuff, but like that’s fun to needle people about, but it doesn’t actually matter for anything.

Scott: Yeah.

Corey: Security matters and mistakes show.

Scott: Yeah. And so there had been some examples where they released a policy that was called, like, ‘cheese puffs something’ and it’s like, okay, that’s clearly, like, an internal service of some sort. But I’d called them out and, like, I’d sent an email to AWS Security being like, “Hey, you need to make sure that you have change management processes on your IAM policies because one day you’re going to do something that is bad.” And one day they did. They made a change to the read-only access policy, and basically somebody internally had removed every single privilege from the read-only access policy and replaced them with a whole bunch of write privileges for, I think, the Cassandra service.
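As an illustration of the kind of automated second set of eyes that could catch a read-only policy gaining write privileges, here is a minimal sketch of a pre-release check. The list of read-style prefixes and the sample policy are assumptions made up for this example; this is not AWS's internal process, it matches prefixes case-sensitively, and it does not handle `NotAction`.

```python
from typing import List

# Operation-name prefixes treated as read-only in this sketch.
# This list is an assumption for illustration, not an official AWS definition.
READ_PREFIXES = ("Get", "List", "Describe", "Select")


def non_read_actions(policy: dict) -> List[str]:
    """Return every allowed action that does not look like a read operation."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]

    flagged = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # a single action may be a bare string
            actions = [actions]
        for action in actions:
            # Actions look like "service:Operation"; a bare "*" has no service part.
            operation = action.split(":", 1)[-1]
            if not operation.startswith(READ_PREFIXES):
                flagged.append(action)
    return flagged


if __name__ == "__main__":
    # Hypothetical policy that is supposed to be read-only.
    candidate = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": "*",
            }
        ],
    }
    print(non_read_actions(candidate))
```

A check like this could run in a review pipeline and block the release of any policy with "read-only" in its name whenever the returned list is non-empty.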

And so, that was like, clearly they’ve made a mistake that they should have made sure they were correcting because you know, they had these previous incidents. Another kind of similar one was in December, there was a support policy where they had added S3 GetObject to that policy, and that was concerning in terms of have they just given all of their support employees access to everybody’s content in their S3 buckets? And so AWS made some statements saying that there were other controls in place there so it wouldn’t have been possible. But it’s those types of things that [crosstalk 00:33:17]—

Corey: Originally, those statements were made on Twitter, let’s be clear here.

Scott: Yes. Yeah. [laugh].

Corey: And I feel like there’s a—while I deeply appreciate how accessible a lot of their senior people are, I cannot point the executive leadership team at a client to some tweets that someone made. That is not a public statement of record that works on this.

Scott: Exactly.

Corey: They’re learning. We’ll get there sooner or later, I presume. I want to thank you for taking the time to speak with me, as always. I’ll throw links to these repos into the [show notes 00:33:46], but if people want to know more about what you have to say, where’s the best place to find you?

Scott: So, my Twitter, which, unfortunately, is a handle written in hex, but it’s—‘dabbadoo’ is how you would pronounce it, but it’s probably easiest to see a link for it. So, that’s probably the main place to look for me.

Corey: That’s why my old Twitter handle was my amateur radio callsign. I don’t use that one anymore. It’s just easier. And I think that’s the right answer. Besides, given what you do, it’s easy enough if people want your attention. They screw up badly enough, you’ll come to them.

Scott: Yep. [laugh].

Corey: Scott, I really appreciate your time. Thanks again.

Scott: Thank you.

Corey: Scott Piper, Principal Engineer at Block and, more or less, roving security troubadour for lack of a better term. I’m Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you’ve enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you’ve hated this podcast, please leave a five-star review on your podcast platform of choice or a comment on the YouTubes saying that this episode is completely invalid because you wind up using the old version of the metadata service and you’ve never had a problem. That you know of.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production. Stay humble.
