Screaming in the Cloud
S3: 15 Years and 100 Trillion Objects Later with Kevin Miller
Episode Summary
Kevin Miller is currently the global General Manager for Amazon Simple Storage Service (S3), an object storage service that offers industry-leading scalability, data availability, security, and performance. Prior to this role, Kevin has had multiple leadership roles within AWS, including as the General Manager for Amazon S3 Glacier, Director of Engineering for AWS Virtual Private Cloud, and engineering leader for AWS Virtual Private Network and AWS Direct Connect. Kevin was also Technical Advisor to Charlie Bell, Senior Vice President for AWS Utility Computing. Kevin is a graduate of Carnegie Mellon University with a Bachelor of Science in Computer Science.
Episode Show Notes and Transcript



Transcript
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Cloud Economist Corey Quinn. This weekly show features conversations with people doing interesting work in the world of Cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.


Corey: Join me on April 22nd at 1 PM ET for a webcast on Cloud & Kubernetes Failures & Successes in a Multi-everything World. I'll be joined by Fairwinds President Kendall Miller and their Solution Architect, Ivan Fetch. We’ll discuss the importance of gaining visibility into this multi-everything cloud native world. For more info and to register visit www.fairwinds.com/corey.


Corey: If your mean time to WTF for a security alert is more than a minute, it's time to look at Lacework. Lacework will help you get your security act together for everything from compliance service configurations to container app relationships, all without the need for PhDs in AWS to write the rules. If you're building a secure business on AWS with compliance requirements, you don't really have time to choose between antivirus or firewall companies to help you secure your stack. That's why Lacework is built from the ground up for the Cloud: low effort, high visibility and detection. To learn more, visit www.lacework.com.



Corey: Welcome to Screaming in the Cloud. I’m Corey Quinn. I’m joined this week by Kevin Miller, who’s currently the general manager for Amazon S3 which presumably needs no introduction itself, but there’s always someone. Kevin, welcome to the show. Thanks for joining us, and what is S3?


Kevin: Well, Corey, thanks for having me. Yes, Amazon S3 was actually the first generally available AWS service. We actually just celebrated our 15-year anniversary here on Pi Day, 3/14. And S3 is an object storage service that makes it easy for customers to put and store any amount of data that they want. We operate in all AWS regions worldwide, and we have a number of features to help customers manage their storage at scale because scalability is really one of the core building blocks and tenets for S3, where we provide the ability for customers to scale up and scale down the amount of storage they use. They don’t have to pre-provision storage, and when they delete objects that they don’t need, they stop paying for them immediately.


So, we just make it easy for customers to store whatever they need, access it from applications, whether those are applications running in AWS or somewhere else on the internet, and we really just want to make it super easy for customers to build with storage and use it with their applications.


Corey: So, a previous guest in, I say the first quarter of the show’s life—as of this time—was Mai-Lan Tomsen Bukovec, who at the time was also the general manager of S3, and she has since ascended to, perhaps, S4 or complex storage service. And you have transitioned from a role where you were the general manager of Glacier—


Kevin: Correct.


Corey: —or Amazon S3 Glacier, and that’s the point of the question. Is Glacier part of S3? Is it something distinct? I know they’re tightly related, but it always seems that it’s almost like the particle-wave experiment in physics where, “is it part of S3 or is it a distinct service?” depends entirely on the angle you’re looking at it through?


Kevin: Right. Well, Amazon S3 Glacier is a business that we run separately, with its own general manager; Joe Fitzgerald looks after that business today. Certainly, most of our customers use Glacier through S3, so they can put data into S3 and they actually can put it directly into the Glacier storage class—or the Glacier Deep Archive storage class—or customers can configure lifecycle policies to move data into Glacier at a certain point. So, the primary interface customers use is through S3, but it is run as a standalone business because there’s just a set of technology and human decisions that need to be made specific to that type of storage, that archive storage. So, I work very closely with Joe; he and I are peers, but they are run as separate businesses.
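In boto3 terms, a lifecycle policy of the kind Kevin describes might look like the sketch below. The bucket name, prefix, and transition ages are illustrative placeholders, not anything from the episode.

```python
def glacier_lifecycle_policy(prefix: str = "logs/") -> dict:
    """Lifecycle rules that tier objects down to Glacier, then Deep Archive."""
    return {
        "Rules": [
            {
                "ID": "archive-old-objects",
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    # Placeholder ages: 90 days to Glacier, a year to Deep Archive.
                    {"Days": 90, "StorageClass": "GLACIER"},
                    {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
                ],
            }
        ]
    }


def apply_policy(bucket: str) -> None:
    # Requires AWS credentials for a real bucket; not executed here.
    import boto3

    boto3.client("s3").put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=glacier_lifecycle_policy()
    )
```

After the rule is applied, S3 transitions matching objects automatically; no application change is needed on the read path until a restore is required.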


Corey: So you, of course, transitioned. I guess we’ll say that you’ve thawed. You are no longer the GM of Glacier; you’re now the GM of S3. And you just had a somewhat big announcement to celebrate that 15-year anniversary: S3 Object Lambda.


Kevin: Yes. We’re very excited about S3 Object Lambda. And we’ve spoken to a number of customers who were looking for features with S3, and the way that they described it was that they liked the S3 API, they want to access their data through that standard API, there’s lots of software that knows how to use that including, obviously, the AWS SDK. And so they liked that GET interface to get data out and to put data in, but they wanted a way to change the data a little bit as it was being retrieved. And there’s a lot of use cases for why they wanted to do it.


Everything from redacting certain data to maybe changing the size of an image for particular workloads, or maybe they have a large amount of XML data and for certain applications, they want a JSON formatted input. And so rather than have a lot of complicated business logic to do that, they said, well, why can’t I just put something in the path so that as the data is being retrieved through the GET API, I can make that change, the data can be reformatted.


Corey: It’s similar to the Lambda@Edge approach where instead of having to change or modify the source constantly and have every possible permutation, you just operate on the request.


Kevin: Yeah, that’s right. So, I want one copy of my data; I don’t want to have to create lots of derivative copies of it. But I want to be able to make changes to it as it’s going through the APIs. So, that’s what we built: it’s integrated with Lambda; it’s full Lambda. So really, it’s pretty powerful.


Anything you can do in a Lambda function, customers can do in these functions that are then run. So, an application makes a GET request, that invokes the Lambda function, the function can process the data, and then whatever is returned is sent and streamed back to the application. So, customers can build transformation logic that runs in line with that request and transforms the data that goes to applications.
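The flow Kevin describes can be sketched as a Lambda handler. The event shape below (`getObjectContext` with a presigned `inputS3Url`, plus `outputRoute`/`outputToken` for `write_get_object_response`) follows the documented S3 Object Lambda contract; the redaction logic itself is a made-up example.

```python
import json
import urllib.request


def redact(text: str) -> str:
    """Toy transformation: drop a sensitive field from a JSON record."""
    record = json.loads(text)
    record.pop("ssn", None)  # hypothetical field name for illustration
    return json.dumps(record)


def handler(event, context):
    # S3 Object Lambda hands the function a presigned URL for the original
    # object, plus a route and token used to stream the response back.
    ctx = event["getObjectContext"]
    with urllib.request.urlopen(ctx["inputS3Url"]) as resp:
        original = resp.read().decode("utf-8")

    import boto3  # requires AWS credentials; not executed here

    boto3.client("s3").write_get_object_response(
        Body=redact(original),
        RequestRoute=ctx["outputRoute"],
        RequestToken=ctx["outputToken"],
    )
    return {"statusCode": 200}
```

The application never sees any of this: it issues an ordinary GET and receives the redacted body.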


Corey: So, at the time that we’re recording this, the announcement is hours old. This is not something that has had time yet to permeate the ecosystem; people are still working through the various implications of it, so it may very well be that this winds up aging before we can even turn the episode around. But what is the most horrifying use case of this that you’ve seen so far? Because I’m looking at this and I’m thinking, “Oh, you know what I can use this for?” People are thinking, “Oh, a database?” “No, that’s what Route 53 is. Now, I can use S3 as a messaging queue.”


Kevin: Well, possibly. I keep saying that I’m going to use it as a random number generator. But that was—yeah—


Corey: I thought that was the bill.


Kevin: [laugh]. Not quite. We have a lot of use cases that we’re hearing and seeing already in just the first few hours. I don’t know that I would call any super-horrifying. But we have everything from what I was saying in terms of redaction and image transformation to one that I think will be great: using it to prepare files for ML training.


I’ve actually done some work with training machine learning models, and oftentimes, there’s just little things you have to tweak in the data. Sometimes you get a row that has an extra piece of data in it that you didn’t expect or it’s missing a field, and that causes the training job to fail. So, just being able to kind of cleanse data and get it ready to feed into an ML training model, that seems like a really interesting use case as well.


Corey: Increasingly, it’s starting to seem like S3’s biggest challenge over the past 15 years of evolution has been that it was poorly named, because it’s easy to look at it now and come away with the idea that it’s not simple. And if you take a look at what it does, it’s very clearly not. I mean, the idea of having storage that increases linearly, as far as cost goes—you’re billed for what you use, without having to pre-provision a storage appliance a petabyte at a time and buy a number of shelves. “Ooh, if I add one more, the vendor discount kicks in, so I may as well over-provision there.” “Oh, we’re running low. Now, we have to panic order and get some more in.” I’ve always said that S3 has infinite storage because it does. It turns out, you folks can provision storage added to S3 faster than I can fill it, I suspect because you just get the drives on Amazon.


Kevin: Well, it’s a little bit more complicated than that. I mean, I think, Corey, that’s a place that you rightly call out. When we say ‘simple storage service,’ although there’s so much functionality in S3 today, I think we go back to some of the core tenets of S3 around the simplicity, and scalability, and resiliency, and those are not easy. There’s a lot of time spent within the team just making sure that we have the capacity, managing the supply chain to a deep level; it’s a little bit harder than just clicking ‘buy now.’ But we have teams that focus on that and do a great job, and also just around looking around corners and identifying how we continue to raise the bar for resiliency, and security, and durability of the service.


So, there’s just, yeah, there’s a lot of work that goes into that. But I do think it goes back to that simplicity: being able to scale up and scale down makes it just really nice to build applications. And now with the ability to build serverless applications, where you have the ability to put a little code there in the request path, you don’t have to have complicated business logic in an application. We think that that is still a simple capability. It goes back to: how do we make it easy to build applications that are integrated with storage?


Corey: Does S3 Object Lambda integrate with all different storage tiers? Is it something that only works on standard? Does it work with infrequent access? Does it work with, for example, the one that still exists, but no one ever talks about: Reduced Redundancy Storage? Does it work with Glacier? Just, it sits there and that thing spins for an awfully long time.


Kevin: It will work with all storage classes, yes. With Glacier, you would have to restore an object first, and then it would work. So, you’d issue the restore initially (although the Lambda function itself could also issue the restore), then you would most likely come back with a second request later to retrieve the data from Glacier once it’s been restored. But it does work with S3 Standard, S3 Intelligent-Tiering, S3 Standard-IA, and the other storage classes.
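That restore-then-retrieve pattern for Glacier-class objects might look like the following boto3 sketch. The polling logic and parameter names for `restore_object` and `head_object` match the public S3 API; the bucket and key would be your own.

```python
from typing import Optional


def restore_request(days: int = 7, tier: str = "Standard") -> dict:
    """Parameters for s3.restore_object; Tier is Expedited, Standard, or Bulk."""
    return {"Days": days, "GlacierJobParameters": {"Tier": tier}}


def retrieve_archived(bucket: str, key: str) -> Optional[bytes]:
    # Requires AWS credentials; not executed here.
    import boto3

    s3 = boto3.client("s3")
    head = s3.head_object(Bucket=bucket, Key=key)
    restore_status = head.get("Restore", "")
    if 'ongoing-request="false"' in restore_status:
        # Restore finished: the object can now be read like any other.
        return s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    if not restore_status:
        # Kick off the restore; come back later for the data.
        s3.restore_object(Bucket=bucket, Key=key, RestoreRequest=restore_request())
    return None  # restore still in progress
```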


Corey: I think my favorite part of all of this is that the interaction model for any code that’s accessing stuff in S3 doesn’t change. It is strictly: talk to the endpoint, make a typical S3 GET, and everything that happens on the backend of that is transparent to your application.


Kevin: Exactly. And again, that goes back to the simplicity, to how do we make this simple. We said, “Customers love just that simple API.” It’s a GET API, and how do we make it so that that API continues to work? Applications that know how to use a GET can continue to use a GET and retrieve the data, but the data will be transformed for them before it comes back.
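Concretely, the only thing that changes on the client side is where the GET is pointed. In this hedged sketch, the same `get_object` call shape works against a plain bucket or a (hypothetical) Object Lambda access point ARN; every name here is a placeholder.

```python
from typing import Dict


def get_request(bucket_or_arn: str, key: str) -> Dict[str, str]:
    """Parameters for s3.get_object; identical in shape for a bucket
    or an S3 Object Lambda access point."""
    return {"Bucket": bucket_or_arn, "Key": key}


# Hypothetical names: only the Bucket value differs between the two calls.
plain = get_request("example-bucket", "report.json")
transformed = get_request(
    "arn:aws:s3-object-lambda:us-east-1:123456789012:accesspoint/example-olap",
    "report.json",
)


def fetch(params: Dict[str, str]) -> bytes:
    # Requires AWS credentials; not executed here.
    import boto3

    return boto3.client("s3").get_object(**params)["Body"].read()
```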


Corey: Are there any boundaries around what else that Object Lambda is going to be able to talk to? Is it only able to do internal massaging of the data that it sees? Is it going to be able to call out to other services? How extensible is this?


Kevin: The Lambda can do, essentially, whatever a Lambda function can do, including all the different languages. And then also, yeah, it can call out to DynamoDB, for example. Let’s say you have a CSV file and you want to augment that CSV with an extra piece of data, where you’re looking it up in a DynamoDB table—you can do that. So, you can merge multiple data streams together, you can dip out to an external database to add to that data. It’s pretty flexible there.


Corey: So, at some level, what you’re realistically saying here is that until now, S3 has been able to be configured as a static website hosting facility; now it can also host dynamic websites.


Kevin: Well, S3 Object Lambda today will work with applications that are running within the customer’s account or where they’ve granted access through another account. We don’t support S3 Object Lambda directly as a public website endpoint at this point, so that’s something that we’re definitely listening to feedback from customers on.


Corey: Can I put CloudFront in front of it, and then that can invoke the GET endpoint?


Kevin: Today, you can’t, but that is also something that we’ve heard from a few use cases. But primarily, the use cases that we’re focused on right now are ones where applications are running within the account or within a peer account.


Corey: I was hoping to effectively re-implement WordPress on top of S3. Now, again, not all use cases are valid, or good, or something anyone should do, but that’s most of the ways I tend to approach architecture. I tend to live my life as a warning to others, whenever I get the opportunity.


Kevin: Yeah. [laugh]. I don’t respond to that, Corey. [laugh].


Corey: That’s fine, you don’t need to. So, one thing that was also discussed is that this is the 15-year anniversary, and the service has changed an awful lot during that time. In fact, I will call out, for really no other reason than to be a small, petty man, that the very first AWS service in beta was SQS. Someone’s going to win a bar trivia night on that, someday.


Kevin: That’s right.


Corey: But S3 was the first to general availability because obviously, a message queue was needed before storage. And let’s face it, as well, that most people even if they’re not in the space can instinctively wrap their heads around what storage is; a message queue requires a little bit more explanation. But that’s okay, we will do the revisionist history thing, and that’s fine. But it’s evolved beyond that. It had some features that again, are still supported but not advertised.


The Reduced Redundancy Storage is still available, but not talked about. And there’s no economic incentive for using it, so people should not be, I will make that declaration on my part so you don’t have to. But you can still talk to it using SOAP calls, in the regions where that existed, via XML, which is the One True Data Interchange Format, because I want everyone mad at me. You can still use the—we’ll call it legacy because I don’t believe it supports new regions—BitTorrent interface for S3 data. A lot of these were really neat and futuristic when they came out, and they didn’t pan out for one reason or another, but they’re still there. There’s been no change since launch that I’m aware of that suddenly breaks if you’re using S3 and have just gone on walkabout for the last 15 years. Is that correct?


Kevin: You’re right. There’s functionality that we had from early on in S3 that’s still supported. And I think that speaks to the way we think about the service, which is that when a customer starts adopting something, even a feature like BitTorrent, which certainly is not as widely adopted as most, our philosophy is that we continue fully supporting it and helping those customers with that protocol. And if they are looking to do something different, then we’ll help them find an alternative to it.


But, yeah, the only other thing that I would highlight is just that there have been some changes to the TLS protocols we’ve supported over time, and that’s been something we’ve closely worked with customers to manage those transitions to make sure that we’re hitting the right security benchmarks in terms of the TLS protocol support.


Corey: It’s hard on some level also to talk about S3 without someone going, “Oh, what about that time in 2017 when S3 went down?” Now, I’m going to caveat that before we begin: one, it went down in a single region, not globally. To my understanding, the ability to provision new buckets was impacted during the outage, but things hosted elsewhere would have been fine. Everything depends, inherently, on S3 at some level, and that sort of leads to a cascade effect where other things were super wonky for a while. But since then, AWS has been remarkably public about what changed and how things have changed.


I think you mentioned during the keynote at re:Invent two years ago that there were, at the time, something like 235 microservices that power S3 under the hood, which of course, every startup in the world looked at and said, “Oh, a challenge. We can beat that.” Like they’re somehow Pokemon, and you’ve got to implement at least that many to be a real service. I digress. A lot changed under the hood, to my understanding, almost a complete rewrite, but the customer experience didn’t.


Kevin: Yeah, I think that’s right, Corey. And we are constantly evolving the services that underlie S3. And over the 15 years, maybe the only constant has been the change in those services. And those services change and improve based on the lessons we’ve learned and new bars that we want to hit. And I think one really good example of that is the launch of S3 Strong Consistency in December of last year. And Strong Consistency, for folks who have used S3 for a long time, that was a very significant change.


Corey: Oh, it was a bi-modal distribution, as far as the response to that. The response was either, “What does that even mean, and why would I care?”


Kevin: Right.


Corey: And the other type of response was people dropping their coffee cup in shock when they heard it.


Kevin: It’s a very significant change. And obviously, we delivered that to all requests and all buckets, with no change to performance and no additional cost. So, it was just something that everyone who uses S3—today or in the future—got for free, at essentially no additional charge.


Corey: What does Strong Consistency mean, and why is that important, other than as an impressive feat of technical engineering?


Kevin: Right. So, in the original implementation of S3, you could overwrite one object but still receive the initial version of the object in response to a GET request. So, that’s what we call eventual consistency, where there can be, generally, a short period of time where a subsequent write would not be reflected in a GET request. And so with Strong Consistency, the guarantee we now provide is that as soon as you receive a 200 response on a PUT request, all subsequent GET requests and all subsequent LIST requests will include that most recent object version, the most recent version of the data that you’ve provided for that object.


And that’s just an important change because there are plenty of applications that rely on that idea of: I’ve PUT the data, and now I’m guaranteed to get the exact data that I PUT in response, versus getting an older version of that data.
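The guarantee can be illustrated with a toy in-memory model. This is emphatically not how S3 is implemented (that took formal proofs and a multi-year effort, as discussed below in this episode's original context); it only shows what "read-after-write" means once a PUT is acknowledged.

```python
class ToyStore:
    """In-memory stand-in for a strongly consistent object store."""

    def __init__(self):
        self._objects = {}

    def put(self, key: str, body: bytes) -> int:
        # In S3, the 200 on the PUT is the acknowledgement; after that,
        # no reader may ever observe the older version.
        self._objects[key] = body
        return 200

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def list(self):
        return sorted(self._objects)


store = ToyStore()
store.put("a.txt", b"v1")
assert store.put("a.txt", b"v2") == 200
# Read-after-write: the acknowledged write is always visible,
# to GETs and LISTs alike.
assert store.get("a.txt") == b"v2"
assert "a.txt" in store.list()
```

Under eventual consistency, the second `get` could legally have returned `b"v1"` for a while; Strong Consistency rules that out.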


Corey: There’s a lot that goes into that, and it’s deceptively complicated because someone thinks about it in the context of a single computer writing to disk: “Well, why is that hard? I edit a file. Then I look at that file, and my edits are in that file.” Yeah. Distributed systems don’t quite work that way.


And now imagine this at the scale of S3. It was announced in a blog post at the start of this week that 100 trillion objects are stored in S3. That’s something like 16,000 per person alive today. And that is massive. And part of me does wonder how many of those are people doing absolutely horrifying things, but it’s a—customer use cases are weird. There’s no way around that.


Kevin: That’s right. North of 100 trillion objects. I think, actually, 99 trillion are cat pictures that you’ve uploaded, Corey, but—


Corey: Oh, almost certainly. Then I use them as a database. The mood of the cat is how we wind up doing this. It’s not just for sentiment analysis; it’s sentiment-driven.


Kevin: Yeah, that’s right. That’s right. But yes, S3 is a very large distributed system, and so maintaining consistent state across a large distributed system requires very careful protocols. Actually, one of the things we talked about this week, and I think it’s pretty interesting about the way that internal engineering in S3 has changed over the last few years, is that we’ve been using formal logic and mathematical proofs to prove the correctness of our consistency algorithms. So, the team spent a lot of time engineering the consistency services, and all the services that had to change to make consistency work.


Now, there’s a lot of testing that went into it, kind of traditional engineering testing, but then on top of that, we brought in mathematicians, basically, to do formal proofs of the protocols. And they found edge cases. I mean, some of the most esoteric edge cases you can imagine, but—


Corey: But it’s not just startups that are using this stuff, it’s hospitals. Those edge cases need to not exist if you’re going to make guarantees around things like this.


Kevin: That’s right. And you just have to make sure. And it’s hard; they did painstaking work to test, but with our formal logic, we’re able to simulate billions of combinations of messages and updates and then validate that the correct things are happening relative to consistency. So, there was very significant engineering work; it was a multi-year effort, really, to get Strong Consistency to where it is. But just to go back to your earlier point, that’s just an example of how S3 really has changed under the hood, while the external API is still the external API. So, that’s our north star in all of this work.


Corey: Incidents happen fast, but they don’t come out of nowhere. If they’re watching, your team can catch the sudden shifts in performance, but who has time to constantly check thousands of hosts, services, and containers?
That’s where New Relic Lookout comes in. Part of Full-Stack Observability, it compares current performance to past performance, then displays it in an estate-wide view of your whole system.


Sign up for free at NewRelic.com and start moving faster than ever.


Corey: So, you’ve effectively rebuilt the entire car while hurtling down the freeway at 60—or if you’re like me, 85—but it still works the same way. There are some things, as a result, that you’re not able to change. So, if you woke up in an alternate timeline, knowing then what you know now, how would you change the interface? Or what one-way doors did you go through when building S3 early on in its history that, in hindsight, you would have treated differently?


Kevin: Well, I think that for the customers who used S3 in the very early days, there was originally this idea that S3 buckets would be global, actually global in scope. And we realized pretty early on that what we really wanted was regional isolation. And so today, when you create a bucket, you create it in a specific region, and that’s the only place that data is stored. It’s stored in that region. Of course, it’s stored across three physically diverse data centers within that region to provide durability and availability, but it’s stored entirely within that region.


And I think in hindsight, if we had known initially that we would move to that regional model, we may have thought a little bit differently about how buckets are named, for example. But where we are now, we definitely like the regional resiliency; I think that’s a model that has proven itself time and time again. Having that regional resiliency is critical, and customers really appreciate that.


Corey: Something I want to talk about speaks directly to the heart of that resiliency and the, frankly, ridiculous level of durability and availability the service offers. We’ve had you on stage talking about these things, we’ve had Mai-Lan several times on stage talking about these things, and Jeff Barr writes blog posts on all of these things. I’m going to go out on a limb and guess that there’s more than just the three of you building this.


Kevin: Oh, yeah.


Corey: What’s involved keeping this site up and running? Who are the people that we don’t get to see? What are they doing?


Kevin: Well, there are large engineering teams responsible for S3, of course, and they, I would say, in many ways are the unsung heroes of delivering the services that we do. Of course, you know, we get to be on stage talking about these cool new features, but it’s only with a ton of hard work by the engineering teams, day in and day out. And a lot of it is having the right instrumentation and monitoring the health of the service to an incredibly deep level. It’s down very deep into hardware, of course, very deep into software: getting all those signals and then making sure that every day, we’re doing the right set of things, both in terms of work that has to be done today and project work that will help us deliver step-function improvements, whether it’s adding another degree of availability or strengthening our posture around certain types of data and certain edge cases. There’s constant work to look around corners, and really just to continuously raise the bar for availability, and resiliency, and durability within the service.


Corey: It almost feels, on some level, like the most interesting changes and enhancements come out, almost always without comment, from the strangest moments. I mean, I remember having a meeting with a couple of folks a year or two ago when I kept smacking into a particular challenge; I didn’t understand that there was an owner ACL at the time, and it turned out that there were two challenges there. One was that I didn’t fully understand what I was looking at, so people took my bug report more seriously than it probably deserved. And to be clear, no one was ever anything but professional on this. And we had a conversation, and my understanding dramatically improved. But the second part was, a while later, “Oh, yeah. Now, with S3, you can also set an ACL that determines that any object placed into the bucket has an ownership ID of the bucket owner.”


And I care about that primarily because it directly impacts the cost and usage reports that are what my company spends most of our life staring into. But it made for such an easier time in terms of what we have to deploy to customer accounts and how we wound up thinking about these things. And it was just a quiet release, like many others, with the same lack of fanfare: “Oh, the service you don’t use is now available in a region you’ve never heard of. Have fun.” And there were, I think, almost 3,000 various releases last year; this was one of them that moved the needle.


It’s little things like that, but it’s not so little, because doing anything like this at the scale of something like S3 is massive. People who have worked in very small environments don’t really appreciate it. And the larger the environment you get to work within, the more magical something like this seems.


Kevin: Yeah, I think it’s a good example. You point to the S3 Object Ownership example; I think that’s a great example of the kind of feature that took us quite a bit of work to figure out how to deliver in as simple a fashion as possible. That was actually a feature where, at one point, I think there was a two- or three-dimensional matrix being developed of different ways that we might have to have flags on objects. And we just kept pushing and pushing to say, “It has to be simpler. We have to make this easier to use.” And I think we ended up in a really good spot. And certainly, for customers that have lots of accounts, which I would say almost all of our large customers end up with many, many accounts—


Corey: Well, we’d like to hope so, anyway. There was a time where, “Oh, just one per customer is fine.” And then you got to redefine what a ‘large account’ looked like a few times; it was, “Okay, let’s see how this evolves.” Again, the things you learn from customers as you go.


Kevin: Yeah, exactly. And then there are lots of reasons, different teams, different projects, and so forth, why you end up with lots of accounts. But in any of those kind of large-account or large-organization scenarios, there are almost always cases where you’re writing data across accounts into different buckets. So certainly, that’s a feature that folks who use S3 knew exactly how they were going to use, and they turned it on right away.
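For reference, the Object Ownership setting being discussed is exposed through `put_bucket_ownership_controls`. A hedged sketch, with the bucket name as a placeholder; `BucketOwnerPreferred` is the rule that makes cross-account writes (uploaded with the `bucket-owner-full-control` ACL) owned by the bucket owner.

```python
def ownership_controls(rule: str = "BucketOwnerPreferred") -> dict:
    """Rule payload for s3.put_bucket_ownership_controls.

    BucketOwnerPreferred: objects written by other accounts with the
    bucket-owner-full-control ACL become owned by the bucket owner.
    ObjectWriter: the writing account keeps ownership (the old default).
    """
    return {"Rules": [{"ObjectOwnership": rule}]}


def enable_bucket_owner_preferred(bucket: str) -> None:
    # Requires AWS credentials; not executed here.
    import boto3

    boto3.client("s3").put_bucket_ownership_controls(
        Bucket=bucket, OwnershipControls=ownership_controls()
    )
```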


Corey: It’s the constant quiet source of improvement that is just phenomenal. The argument I’ve always made, and I think one of the most magical parts of cloud that isn’t really talked about, is that if I go ahead and build an environment and put it in AWS, it’s going to be more durable, arguably more secure, and better run and maintained five years later, even if I never touch it again; whereas if I try that in a data center, the raccoons will carry the equipment off into the wilderness right around year three. And that’s something that is generally not widely understood until people have worked extensively with it.


S3 is also one of those things that I find is a very early and very defining moment when companies go through either a cloud migration or a digital transformation, if people will pardon me using the term that I love making fun of. It’s a good metric for how cloud-y, for lack of a better term, your application and your environment are. If everything lives on disks attached to instances, well, not very; you’ve more or less replicated your data center environment into a cloud, which is fine as a step one. It’s not the most efficient, it makes the cloud look a lot more like your data center, and you’re not leveraging a lot of the capability there. Object storage is one of the first things that seems to shift, and one of the big accelerators or drags on adoption always seems to come down to how the staff think about those things. What do you see around that?


Kevin: Yeah. I think that’s right, Corey. I think that it’s super exciting to me working with customers that are looking to transform their business because oftentimes it goes right down to the data in terms of, what data am I collecting? What can I do with that data to make better decisions and make more real-time decisions that actually have meaningful impact on my business? And you talk about modern applications, some of it is about developing new modern applications and maybe even applications that open up new lines of business for a customer.


But then we have other customers who also use data and analytics to reduce costs and to better manage their manufacturing or other facilities. We have one customer who runs paper mills, and they were able to use data in S3, and analytics on top of it, to optimize how fast the paper mills run and to eliminate, or at least reduce, the amount of time that machines are down because they get jammed. And so it’s examples like that where customers, first off, using S3 and using AWS, are able to just store a lot more than they ever thought they could in a traditional on-premises installation, and then on top of that really make better use of that data to drive their business. And I mean, that’s super exciting to me, but I think you’re right as well about the people side of it. I mean, that is, I think, an area that is really underappreciated in terms of the amount of change and the amount of growth that is possible and yet really untapped at this point.


Corey: On some level, it almost shifts into—and again, this is understandable. I’m not criticizing anyone, I want to be clear here. Lord knows I’ve been there myself—where people start to identify the technology that they work with as part of who they are, professionally or, in some cases, personally.


Kevin: Yep.


Corey: And it’s an easy misstep to make. If there were suddenly a giant pile of reasons that everyone should migrate back to data centers, my first instinct would be to resist that, regardless of the merits of that argument because well, I’ve spent the last four years getting super deep into the world of AWS. Well, isn’t that my identity now on some level, so I should absolutely advocate for everything to be in AWS at all times. And that’s just not true; it’s never true, but every time it’s a hard step to make, psychologically.


Kevin: Oh, I agree. I think it is, psychologically, a hard step to make. And I think people get used to working with the technology that they do. And change can always be scary. I mean, certainly for myself as well, just in circumstances, where you say, “Well, I don’t know. It’s uncertain; I don’t know if I’m going to be successful at it.”


But I firmly believe that everyone at their core is interested in growth and developing, and doing more tomorrow than they did yesterday. And sometimes it’s not obvious. Sometimes it can be frightening, as I said, but I do think that fundamentally people like to grow. And so I think with the transformation that’s ongoing in terms of moving towards more cloud environments, and then, again, transforming the business on top of that, to really think about IT differently, think about technology differently. I just think there’s tremendous opportunity for folks to grow; people who are maintaining current systems to grow and develop new skills to maintain cloud systems or to build cloud applications even. So, I just think that’s an incredibly untapped portion of the market in terms of providing the training, and the skills and support to transform the culture and the people to have the skills for tomorrow’s environments.


Corey: Thank you so much for taking the time to speak with me about this dizzying array of things that S3 has been doing. What you’ve been up to for the last 15 years, which is always a weird question. “What have you been up to for the last 15 years, anyway?” But usually in a much more accusatory tone. If people want to learn more about what you’re up to, how you’re thinking about these things, where can they find you?


Kevin: Well, I mean, obviously, they can find the S3 website at aws.amazon.com/S3. But there’s a number of videos on Twitch and YouTube, both from myself and many of the folks within the team. Really, we’re excited to share a lot of new material. This week, with our Pi Week, we decided Pi Day was not enough; we would extend it to be a four-day event. So, all week we’ve been sharing a ton of information, including some deep dives with some of the principal engineers that really help build S3 and deliver on that higher bar for availability, and durability, and security. And so, they’ve been sharing a little bit of behind-the-scenes, as well as just a number of videos on S3 and the innards there. So, really invite folks to check that out. And otherwise, my inbox is always open as well.


Corey: And of course, I would be remiss if I didn’t point out that I just did a quick check, and you have what can only be described as a sarcastic number of job openings within the S3 organization of all kinds of different roles.


Kevin: That’s right. I mean, we’re always hiring software engineers, and then systems development engineers in particular, as well as product management—


Corey: And TPMs, and, you know, of course, I’m assuming naming analysts. Like, “How do we keep it ‘S3,’ but not call it ‘simple’ anymore?” Let me spoil that one for someone: serverless. You call it serverless storage service, and you’re there. Everyone wins. You ride the hype train, everyone’s happy.


Kevin: I’m going to write that up right now, Corey. It’s a good idea.


Corey: Exactly. Well, we find a way to turn that story into six pages, but that’s a separate problem.


Kevin: That’s right.


Corey: Thank you so much for taking the time to speak with me. I really appreciate it.


Kevin: Likewise. It’s been great to chat. Thanks, Corey.


Corey: Kevin Miller, General Manager of Amazon Simple Storage Service, better known as S3. I’m Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you’ve enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you’ve hated this podcast, please leave a five-star review on your podcast platform of choice along with an insulting comment that whenever someone tries to retrieve it, we’ll have an Object Lambda rewrite it as something uplifting and positive.


Announcer: This has been this week’s episode of Screaming in the Cloud. You can also find more Corey at screaminginthecloud.com, or wherever fine snark is sold.


This has been a HumblePod production. Stay humble.