Episode Summary

Join Pete and Jesse as they talk about AWS Storage Day 2020, how AWS events differ from Apple events, how storage plays a major role in virtually every AWS service, how basically no one uses FSx, the S3 Intelligent-Tiering class and why it was Pete’s favorite product announcement from the event, why Pete and Jesse don’t recommend turning on Intelligent-Tiering automatically, all the caveats you need to know about the new product, why you should reach out to your AWS account manager if you’re not sure how to optimize storage, and more.

Episode Show Notes & Transcript


Corey: This episode is sponsored in part by Catchpoint. Look, 80 percent of performance and availability issues don’t occur within your application code in your data center itself. It occurs well outside those boundaries, so it’s difficult to understand what’s actually happening. What Catchpoint does is makes it easier for enterprises to detect, identify, and of course, validate how reachable their application is, and of course, how happy their users are. It helps you get visibility into reachability, availability, performance, reliability, and of course, absorbency, because we’ll throw that one in, too. And it’s used by a bunch of interesting companies you may have heard of, like, you know, Google, Verizon, Oracle—but don’t hold that against them—and many more. To learn more, visit www.catchpoint.com, and tell them Corey sent you; wait for the wince.

Pete: Hello, and welcome to AWS Morning Brief. I am Pete Cheslock. Corey, while being back from his paternity leave, is still not here. We are having too much fun. And by we, I mean I'm joined again with Jesse DeRose. Hey, Jesse. 

Jesse: Thanks as always for having me, Pete. 

Pete: It's so much fun to again chat with people outside of my little family unit, that we've just decided not to give this back to Corey. And luckily, Corey has many other podcasts that he does, he was pretty happy to give it away.

Jesse: I feel like you should never talk about your children that way, but he's got a plethora at this point. So, he's willing to kind of share the wealth.

Pete: Exactly. And if you notice, we have a new theme song that came out, I think it was last week was the first week that we brought in the new theme song, which is I think much in line with a previous episode where we talked about ’80s breakdancing movies that the new theme song kind of has that vibe to it.

Jesse: I hope you're wearing the Members Only jean jacket that I sent you, along with the shades to match the uniform.

Pete: Yeah. I mean, I was born in ’80, so the ’80s for me, I was very young. I'm kind of waiting for the ’90s movies to come around again because I want to rock out my JNCO jeans and my wallet chain. 

Jesse: [laugh]. Yes.

Pete: And all that good stuff.

Jesse: I am ready.

Pete: Exactly. Well, what are we talking about today? Well, earlier this week, AWS Storage Day 2020 happened on Tuesday. If you were a part of that, it was a free online event. As Amazon called it, a full day online event. Except it was only about four hours long, so kind of mailing it in on that one, huh?

Jesse: Can we start discussing that with our boss and say that a full day of work is technically just four hours? Can we just start working with that going forward?

Pete: Yeah, we'll just say it right now. So, hey, Corey, we're done for the day. Put in the old college four.

Jesse: [laugh]. That's what you say, “I put in the old college try. I just did my full day of four hours, according to AWS. So, this has been great. I'll talk to you tomorrow.”

Pete: Exactly. Well, Storage Day this year—it's the second year in a row if I'm remembering it correctly. 2019 was the last year they did that—and I feel like this kind of ties into the fact that there's just so many announcements that happened around re:Invent, that leading up into re:Invent, you have a lot of announcements to maybe soften the blow for a lot of folks. And Storage Day, really is just this whole day—well, four hours worth of a whole day—talking about everything related to storage. And we're talking about things like S3, EBS, EFS, FSx, for the five huge enterprises that probably use FSx. 

Although if you actually do use FSx, I'd be curious to hear about how you like it and what you think of it because we don't really hear a lot of people using it. But these are all the services, plus many more, that Amazon talked about as part of its Storage Day.

Jesse: Yeah, it was a really interesting discussion. I greatly appreciate that AWS broke out this discussion prior to AWS re:Invent, but they dropped a lot of knowledge on us all at once, and in, like, rapid-fire succession, I was really, kind of… not necessarily surprised, but there's a lot of information that they shared all at once. And I have to admit that after sitting through this presentation, I now have a greater appreciation for Apple's slow presentation style. As much as I hate it; as much as I hate sitting for an hour and a half for one announcement while they toot their own horn, I have to say that the buildup and getting me involved in the story and bringing me along with them. It works, it absolutely works. And it was kind of hard for me to pick up on all the things that went on during AWS Storage Day this year because there was a lot of things going on.

Pete: And honestly, the fact they give so much information is really amazing in, I guess, both their ability to tout, in many cases, minor feature changes that most SaaS businesses would just turn on and maybe blog about. But this is—obviously the engine of AWS is so good at discussing their wins. But you're right, it's just a huge amount. On Monday, Jeff Barr of course, wrote the blog post with a lot of these details, linking to countless other blog posts. And I think it really speaks to just how, probably every, or nearly every Amazon service ties into storage in some way. It's a huge, huge part of this ecosystem. 

Jesse: Absolutely. 

Pete: So, as you can imagine, there were so many new features that we're not even going to be able to cover them all throughout the course, but we did want to call out some of the big ones, or at least what we thought were the biggest ones, the most interesting new features, new product announcements that came out, and also just touch on some of the other things that we thought were pretty interesting as well. And yeah, there was a lot of fun stuff. I think the biggest one that was announced was that S3 Intelligent-Tiering, which is a storage class within S3, adds additional levels of archive access. So, if you imagine Intelligent-Tiering, you know, you have the automatic tiering of data from frequently accessed to infrequently accessed as things age out; they essentially automate that for you. So, as things are not accessed, you just start automatically paying less for them. And anything automatic in a cost savings world is going to help you save money. 

If you don't have to think about it and it just does it for you, it's fantastic. Well, Intelligent-Tiering added in these additional tiers, which are Glacier-level tiers. They are additional places that your data can eventually move to as it starts aging out based on a whole series of criteria. But there's caveats. There's more caveats now. 

Before, one of the interesting things that we actually learned as part of this—because it was buried in a pricing page footnote—is that when you store something into Intelligent-Tiering, there is a minimum storage time period that you will get charged for. It's one month; it’s 30 days. So, you don't even want to use Intelligent-Tiering if the life of your data is less than 30 days. Kind of makes sense if you think about it because that's the point. You're not sure how long something might live for, put it there; Intelligent-Tiering will kind of take care of it. 
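To make the 30-day minimum concrete, here is a minimal Python sketch of the billing rule as Pete describes it. The function names are hypothetical, and the 30-day figure is taken from the episode rather than from a current pricing page, so treat it as an assumption.

```python
# Assumption: a 30-day minimum billable storage duration, as described in the episode.
MIN_BILLABLE_DAYS = 30


def billable_days(days_stored: int, minimum_days: int = MIN_BILLABLE_DAYS) -> int:
    """Return the number of days charged for under a minimum-duration rule."""
    if days_stored < 0:
        raise ValueError("days_stored must be non-negative")
    # Short-lived objects are still charged for the full minimum period.
    return max(days_stored, minimum_days)


def overpayment_factor(days_stored: int, minimum_days: int = MIN_BILLABLE_DAYS) -> float:
    """Return how many times more storage is paid for than was actually used."""
    if days_stored <= 0:
        raise ValueError("days_stored must be positive")
    return billable_days(days_stored, minimum_days) / days_stored
```

Under this rule, an object deleted after 10 days is billed as if it had lived for 30, which is why short-lived data is a poor fit for the class.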

But these new archive tiers add another piece of complexity, and that has to do with the speed at which you can recover that data because these additional tiers, Archive and Deep Archive, within Intelligent-Tiering are Glacier-class storage, which means you will eventually wait the same amount of time as you might wait for a Glacier response. So, can your application support waiting (what is it, Jesse?) six hours? 12 hours for a response?

Jesse: Minimum, yeah.

Pete: It's something you have to keep in mind, that your app has to support that wait when you request that object. And you can expedite it; there's a charge for it (there's a charge for everything on Amazon, of course), but you have to really plan a little bit more. It's not as plug-and-play. It's not as, like, flip a switch and magic happens as maybe it felt like when we originally looked at Intelligent-Tiering.

Jesse: I will say that this feels like a logical next step in terms of adding additional storage tiers to Intelligent-Tiering. When you look at the number of storage tiers for S3 in general, you have a number of options, including the Glacier Archive options. So, adding similar functionality to Intelligent-Tiering: it feels right. It feels like a logical next step. But there really are these caveats that, from a business perspective, we don't recommend just turning it on. 

You really need to think about what is the access pattern of my data. Or alternatively, if you don't know the access pattern of your data, at least understand going in, using Intelligent-Tiering, that there are caveats: there are additional charges if your data is stored for less than 30 days, for example, or if your data is stored for more than 30 days and ends up in one of the Archive tiers, or Infrequent Access tiers, and then needs to be restored, there are retrieval times associated with that. So, there's a lot of really, really great features here for companies who are using S3 Intelligent-Tiering and know their S3 data's access patterns. So, if you know how frequently your data is going to be accessed, you've got great, great functionality here to pay less as you store your data long term. But keep in mind that it's not something that we recommend plugging in automatically. We've seen so many companies who are not leveraging the S3 Intelligent-Tiering or leveraging the S3 Infrequent Access functionality correctly and end up being charged more than if they had just kept everything on S3 Standard.

Corey: This episode is sponsored in part by ChaosSearch. Now their name isn’t in all caps, so they’re definitely worth talking to. What is ChaosSearch? A scalable log analysis service that lets you add new workloads in minutes, not days or weeks. Click. Boom. Done. ChaosSearch is for you if you’re trying to get a handle on processing multiple terabytes, or more, of log and event data per day, at a disruptive price. One more thing, for those of you that have been down this path of disappointment before, ChaosSearch is a fully managed solution that isn’t playing marketing games when they say “fully managed.” The data lives within your S3 buckets, and that’s really all you have to care about. No managing of servers, but also no data movement. Check them out at chaossearch.io and tell them Corey sent you. Watch for the wince when you say my name. That’s chaossearch.io.

Pete: Right. There's monitoring charges, right? You have to pay for monitoring your files on there. So, when you think about your files on S3, if you have a large number of sizable files, then Intelligent-Tiering can make a lot of sense. You can't even turn on—you can’t—I say ‘turn on,’ but it's a class so you would move objects into it. 
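The trade-off between the monitoring charge and the storage savings can be sketched in a few lines of Python. The prices below are illustrative assumptions, not quotes from the episode or from a current price list, and the function name is hypothetical; plug in the numbers from your own bill.

```python
def monthly_net_saving(
    object_count: int,
    avg_object_gb: float,
    idle_fraction: float,
    standard_per_gb: float = 0.023,               # assumed frequent-access price, $/GB-month
    infrequent_per_gb: float = 0.0125,            # assumed infrequent-access price, $/GB-month
    monitoring_per_1000_objects: float = 0.0025,  # assumed monitoring fee, $/1,000 objects
) -> float:
    """Estimate the monthly saving (negative means a loss) from automatic tiering.

    idle_fraction is the share of the data that has aged into the cheaper tier.
    """
    if not 0.0 <= idle_fraction <= 1.0:
        raise ValueError("idle_fraction must be between 0 and 1")
    idle_gb = object_count * avg_object_gb * idle_fraction
    storage_saving = idle_gb * (standard_per_gb - infrequent_per_gb)
    # The monitoring fee applies per object, regardless of object size.
    monitoring_fee = object_count / 1000 * monitoring_per_1000_objects
    return storage_saving - monitoring_fee
```

With these assumed prices, a million 1 GB objects come out well ahead, while a million 100 KB objects lose money because the per-object fee outweighs the per-gigabyte saving. That is the "large number of sizable files" point in numbers.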

You would move objects into different classes of S3 storage, but you can't even move objects in there that are below a certain size. I think it's somewhere, like, 128 kB or 256 kB, I can't remember exactly, but there is a minimum size limit to storing objects in there. So, that's a caveat that you have to think about. From the docs, Amazon tells you. 

They're upfront about this; they say, “Listen. Objects in your archive access tier are retrieved in three to five hours.” By ‘retrieved’ they mean moved back to the frequently accessed tier, where they'll have to age out again. That's another caveat you have to think about. If you have things that are constantly moving into those archive tiers, and then back into the frequent, you have to then wait again for them to not be accessed. 
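One way an application might cope with that wait is to check an object's state before trying to read it. The sketch below is a minimal Python example that classifies the dictionary returned by a boto3 `head_object` call. It assumes the response carries `ArchiveStatus` and `Restore` fields in the shapes shown in the comments, which is my understanding of the API rather than something stated in the episode; the function name is hypothetical.

```python
ARCHIVE_TIERS = ("ARCHIVE_ACCESS", "DEEP_ARCHIVE_ACCESS")


def read_readiness(head: dict) -> str:
    """Classify a head_object-style response as 'readable', 'restoring', or 'archived'.

    Assumed shapes (not confirmed by the episode):
      head["ArchiveStatus"] is "ARCHIVE_ACCESS" or "DEEP_ARCHIVE_ACCESS" when archived
      head["Restore"] contains 'ongoing-request="true"' while a restore is running
    """
    if 'ongoing-request="true"' in head.get("Restore", ""):
        # A restore was requested; the caller should retry later, possibly hours later.
        return "restoring"
    if head.get("ArchiveStatus") in ARCHIVE_TIERS:
        # The object must be restored before a normal read can succeed.
        return "archived"
    return "readable"


# Hypothetical usage with boto3 (bucket and key names are placeholders):
#   import boto3
#   head = boto3.client("s3").head_object(Bucket="example-bucket", Key="example/key")
#   state = read_readiness(head)
```

An application that gets back "archived" or "restoring" needs a queue, a retry, or a user-facing "come back later" path, which is the planning Pete is describing.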

Again, those little things that are just caveats. The way a lot of people use S3, though, it's a dumping ground for data. This will probably save you money. And you can turn on some of the really great S3 analytics services within Amazon to start analyzing your S3 usage. Look at those reports and figure out the oldest age data, how frequently are things being accessed, all this stuff exists within Amazon. You can go and turn that on and—groan—QuickSight. Ugh, QuickSight even has some default—like it's a data endpoint you can point QuickSight at to run some additional reports. So, again, you can do a lot of this analysis just on your own. It's great. Of course, you can call me or Jesse, and we'll help you do it as well if you don't have the time. But—

Jesse: Absolutely.

Pete: we're happy to look at that stuff. There are some cool things though, and I want to talk about that because you can automatically pay less than $1 per terabyte per month when your objects go into that Deep Archive when they haven't been accessed for 180 days or more. That is in-credible. S3 Standard? $23 a terabyte, I think? Something like that, depending on volume, usage, everything else, and any sort of discounts that you might have. It's a lot. 20-something dollars a terabyte. 
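As a back-of-the-envelope check on those figures, here is a small Python sketch. It uses the round numbers from the conversation (about $23 per terabyte-month for S3 Standard, and $1 per terabyte-month as an upper bound for Deep Archive) as assumptions, and it deliberately ignores monitoring, retrieval, and request charges.

```python
# Round figures quoted in the episode, treated here as assumptions.
STANDARD_PER_TB_MONTH = 23.0
DEEP_ARCHIVE_PER_TB_MONTH = 1.0  # the episode says "less than $1", so this is an upper bound


def monthly_storage_cost(terabytes: float, dollars_per_tb_month: float) -> float:
    """Storage-only monthly cost; excludes monitoring, retrieval, and request fees."""
    return terabytes * dollars_per_tb_month


def monthly_saving(idle_terabytes: float) -> float:
    """Approximate monthly saving from idle data sitting in Deep Archive instead of Standard."""
    standard = monthly_storage_cost(idle_terabytes, STANDARD_PER_TB_MONTH)
    archived = monthly_storage_cost(idle_terabytes, DEEP_ARCHIVE_PER_TB_MONTH)
    return standard - archived
```

On these assumptions, 100 TB of untouched data would cost roughly $2,300 a month in Standard versus at most $100 in Deep Archive.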

We're talking less than $1 per terabyte. Think about all of those documents that you have that, like, just no one's accessed. That's pretty impressive that you can spend so little for that. The next thing that I thought was really great: setup is very easy to do. It has filtering support, it has tagging support, object tags, or object prefix. 

I mean, that is, admittedly, also really helpful. You don't have to have stuff in certain locations to tier it out. You could have your application essentially writing things into these certain areas to have them be part of these settings. So, that's a really helpful feature that they add for you. 

And finally, I love the fact that you can actually define the number of days to set the aging out of this data. It's not, like, 90 days for Archive and 180 days for Deep Archive; they are classes that you can enable (so you don't even need to use Deep Archive if you don't want to), and you can specify the number of days. So, maybe you say to yourself, “Well, I don't want to use Deep Archive because it doesn't appear that it has an expedited request if I want this data back faster,” which, again, there's a charge for. “It doesn't support that, so I only want to use the Archive tier.” You can do that. 

And maybe you say, “I only want to use Archive tier when things are over, you know, 120 days, 180 days, 300 days.” Who knows. You can make all those settings for you. So, it does give you some flexibility there and you can make it work for your organization and your use case.
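The opt-in tiers, custom day counts, and prefix or tag filters described here map onto a bucket-level configuration. Below is a minimal Python sketch that builds such a configuration in the shape I understand boto3's `put_bucket_intelligent_tiering_configuration` to accept. The helper names, bucket name, and configuration ID are hypothetical, and the sketch does not check whatever limits S3 itself enforces on the day counts.

```python
from typing import Optional, Tuple


def build_tiering_config(
    config_id: str,
    archive_days: Optional[int] = None,
    deep_archive_days: Optional[int] = None,
    prefix: Optional[str] = None,
    tag: Optional[Tuple[str, str]] = None,
) -> dict:
    """Build an Intelligent-Tiering configuration dict with only the tiers requested."""
    tierings = []
    if archive_days is not None:
        tierings.append({"Days": archive_days, "AccessTier": "ARCHIVE_ACCESS"})
    if deep_archive_days is not None:
        tierings.append({"Days": deep_archive_days, "AccessTier": "DEEP_ARCHIVE_ACCESS"})
    if not tierings:
        raise ValueError("enable at least one archive tier")

    config = {"Id": config_id, "Status": "Enabled", "Tierings": tierings}
    # Optional filter: limit the configuration to a key prefix, an object tag, or both.
    if prefix is not None and tag is not None:
        config["Filter"] = {"And": {"Prefix": prefix, "Tags": [{"Key": tag[0], "Value": tag[1]}]}}
    elif prefix is not None:
        config["Filter"] = {"Prefix": prefix}
    elif tag is not None:
        config["Filter"] = {"Tag": {"Key": tag[0], "Value": tag[1]}}
    return config


def apply_tiering_config(bucket: str, config: dict) -> None:
    """Send the configuration to S3. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above works without it

    boto3.client("s3").put_bucket_intelligent_tiering_configuration(
        Bucket=bucket, Id=config["Id"], IntelligentTieringConfiguration=config
    )
```

For example, `build_tiering_config("archive-old-logs", archive_days=120, prefix="logs/")` enables only the Archive tier after 120 days for keys under `logs/`, which mirrors the "I only want to use Archive tier" case above.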

Jesse: Absolutely. I think that's a really important thing to highlight, which is, you get to pick if you want to use these additional cheaper storage tiers. And not only that, you get to pick when objects are transitioned into these storage tiers, which we have not seen as readily available for other S3 storage tiers. So, this is a fantastic opportunity to use these storage tiers, if it fits your business use case. So, I think that's the big asterisk here; whenever you see a commercial for prescription medicine, it always says, “Talk to your doctor if this is right for you.” Talk to your teams, talk to the business, see if these Intelligent-Tiering Archive Access and Deep Archive Access tiers are right for you. They might be, they might not. It's really just additional features that you have the option to use, if they fit what your lifecycle policies are for your S3 data.

Pete: Yeah, exactly. And if you have a great relationship with your Amazon account manager, you should send them a message and say, “Hey—”

Jesse: Absolutely.

Pete: “—we're considering this. We've got a lot of buckets—” because let's say you have hundreds of accounts like a lot of people do; it's very easy to create accounts now with Amazon organizations. So, you've got hundreds of accounts. And within those hundreds of accounts, you have hundreds of buckets or more because now there's no longer the hundred bucket limit. How many people actually remember that one? Do you remember that one, Jesse? Remember the hundred bucket limit in S3?

Jesse: Oh my God, yes. It had plagued my dreams.

Pete: Yeah. I remember being a very early user of Amazon and asking for that to be increased, and it was the first time I ever got a, “No,” from Amazon. [laugh].

Jesse: Yeah. 

Pete: But you have this potential; you could have more than 100 buckets in an account now. You could have countless accounts. How to even begin to understand what you might save from that is a big challenge, and if you are not an expert in analyzing your cost and usage report, or are not working with us at Duckbill, what are your options? Like, how are you going to troll through all of these enumerations of your S3 usage just to figure out if this would be worthwhile for you? And my answer is usually, go reach out to your account manager. Especially if you have a support contract that you pay real money for, this is something that they should absolutely be able to help you out with. 

And they can run you a report. They can do some analysis for you to give you a feel for what you might save especially, again if you're in the petabyte range of S3 usage, or greater—multi-petabyte—this is something you definitely want to be looking at. And these additional archive tiers, too, really could be a huge impact to your bottom line costs. This is just one of the many announcements from AWS Storage Day. Like I said, it's really hard to fit this into one episode, so we're not going to do that. 

And we're actually going to split this into the next episode. So, stay tuned for part two of TBD—probably just two. Two of two—where we'll talk about some of the other really fantastic announcements from AWS Storage Day, more cool stuff with S3, some EBS changes, EFS—because that's still around; people are still using EFS—and a whole bunch of new features for FSx. And really, we'll dive into some of those changes, we'll dive into some of the other announcements from that day, and give you our impression on where all this stuff is going. Which is really, really amazing to see just, again, the level of innovation that is coming out, or maybe just the speed at which all this stuff comes out. 

All right, well, if you enjoyed this podcast, please go to lastweekinaws.com/review and give it a five-star review on your podcast platform of choice, whereas if you hated this podcast, please go to lastweekinaws.com/review and still give it a five-star rating on your podcast platform of choice, and tell us how much you love using FSx because you might be one of the, I don't know, two or three people that are actually using it. Thanks again.

Announcer: This has been a HumblePod production. Stay humble.