S3 Is Not a Backup

I’ve gotten some flak recently for daring to suggest that S3’s native features weren’t a substitute for a thoughtful backup strategy.

I’m not alone in this opinion. Andreas Wittig calls out that S3 Versioning is insufficient for three reasons:

  1. Accidental deletion, since you can delete all versions at the same time,
  2. Malicious deletion, which hits the same problem, and
  3. At scale, recovery is going to suck for you.

I’d also extend that to MFA delete; it makes deleting things you want to get rid of SUPER obnoxious, while still not solving for everything. I suppose Object Lock might work, except that in compliance mode you can’t delete your data until its retention period expires, however badly you want to. That’s expensive and more than a little bit constraining.
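To make the first two points in the list above concrete, here is a minimal sketch of how little it takes to walk every version in a versioned bucket. It assumes a boto3-style S3 client is passed in; the function name `purge_all_versions` and the bucket name in the usage comment are hypothetical, and the function defaults to a dry run that only counts what a caller with delete permissions could remove. On a bucket protected by MFA delete or a compliance-mode Object Lock, I'd expect the delete calls to be rejected, which is exactly the trade-off described above.

```python
def purge_all_versions(s3_client, bucket, dry_run=True):
    """Count every object version and delete marker in a versioned bucket.

    With dry_run=False, this also permanently deletes them. That is the
    point of the illustration: versioning alone does not stop it.
    """
    total = 0
    paginator = s3_client.get_paginator("list_object_versions")
    for page in paginator.paginate(Bucket=bucket):
        # Real versions and delete markers both carry a VersionId.
        entries = page.get("Versions", []) + page.get("DeleteMarkers", [])
        targets = [{"Key": e["Key"], "VersionId": e["VersionId"]} for e in entries]
        # DeleteObjects accepts at most 1,000 keys per request.
        for start in range(0, len(targets), 1000):
            batch = targets[start:start + 1000]
            if not dry_run:
                s3_client.delete_objects(
                    Bucket=bucket,
                    Delete={"Objects": batch, "Quiet": True},
                )
            total += len(batch)
    return total


# Hypothetical usage (requires boto3 and credentials; counts only):
# import boto3
# print(purge_all_versions(boto3.client("s3"), "example-bucket"))
```

Whether the person running this is careless or malicious makes no difference to the bucket, which is why the first two failure modes collapse into the same problem.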

What are you protecting against?

A common failure mode of backups is not having a clear understanding of what scale of disaster you’re attempting to defend your data against. Accidentally overwriting a single file? Losing your entire AWS account? Losing an entire AWS region? All of AWS going down? All three cloud providers deleting your data simultaneously? A determined attacker, possibly internal to your organization?

There’s a reason that AWS Backup for S3 launched despite clownshoes pricing that costs more than the original S3 copy of the data it’s protecting. Yes, AWS offers a sarcastic number of 9s in its durability design targets for S3 (despite its SLAs not coming anywhere near that guarantee), but that’s a design target only! I talk about it in some depth in a post I wrote last year.
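For a sense of what that pile of 9s means in practice, here is a rough back-of-the-envelope sketch. The post doesn't spell out the figure; the code assumes AWS's published design target of 99.999999999% (11 nines) annual durability, and it treats that as an independent per-object loss probability, which is a simplification. The function name is made up for illustration.

```python
from fractions import Fraction


def years_per_expected_loss(object_count, nines=11):
    """Average years between single-object losses implied by a durability
    design target of `nines` nines per object per year (assumed figure)."""
    # 11 nines means an annual loss probability of 1 in 10**11 per object.
    annual_loss_probability = Fraction(1, 10 ** nines)
    expected_losses_per_year = object_count * annual_loss_probability
    return 1 / expected_losses_per_year


# Ten million objects: one expected loss every 10,000 years.
print(years_per_expected_loss(10_000_000))  # prints 10000
```

Note what the arithmetic covers: hardware and storage-layer loss only. It says nothing about a person, a script, or an attacker deleting the object on purpose or by accident.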

People still have it in their heads that hard drives and computers are unlikely to fail at S3’s scale and design–and they’re right! But the same people who claim on Twitter that this obviates the need for backups also keep asking for an “Edit Tweet” button because they make mistakes whilst tweeting, seemingly without spotting the contradiction there.

I can’t speak for anyone else, but I plan to screw things up a lot. I fat-finger buckets, overwrite files and objects with empty versions, and make a medley of other only-funny-to-other-people mistakes. For the stuff that actually matters, I have multiple tiers of backups: Time Machine for on-site backups, Backblaze for off-site, git to manage code (and STILL copying the directory to a .bak version before making sweeping changes just in case!), and more. There are two kinds of people: those who have lost data, and those who will. And nobody is nearly as fanatical about this stuff as a member of the first group. Yes, I’m one of them. Obviously! I know I’m fallible, and I plan for that.

When we start exploring failure modes, I also make it a point to consider what happens when a bad actor (internal or external) gets access to things with ill intent. That’s why my own credentials only get me into some of our AWS accounts, but not all of them. That’s why for stuff that REALLY matters, I offsite to Google Cloud because let’s face it: AWS as a business entity is very much a single point of failure. I’m not concerned about AWS going out of business, but the odds of my credit card being declined and me losing my data are very much non-zero. Plus, it’s easier for me to just do it than to explain to folks down the road why I didn’t.

It’s also important to figure out what data you’re going to miss if you lose it. Frankly, most of my data isn’t that important. We generate a lot of that stuff, and if my S3 access logs go poof someday, I won’t really care. I only care about them now negatively, because they keep showing up on my AWS bill.

The Security Trade-Off

The inverse thing to consider here is that the more places you store backups, the greater your security exposure gets. Those are more systems and more places to guard against unauthorized access. Your backups should absolutely be encrypted–but where do the keys live? Better hope you’re backing those up, too!

One of the reasons that S3 bucket negligence awards have been highlighting so much data that’s so incredibly damaging is that S3 itself is often used as a backup target for other systems that hold incredibly sensitive things. It’s easier to get access to a single S3 bucket than it is to get access to all three of a payroll database, a payments database, and your production API keys.

Let Me Be Clear

I’m not suggesting that S3 loses data. I’m not suggesting that it’s insecure. I’m not even suggesting that it’s badly priced, and you’re probably aware of just how rare that is!

I’m suggesting that keeping multiple copies of important data in different places was a mainstay of good operational practice back in the data center days when we had to worry about individual hard drive failures, and the fact that AWS has come up with a clever way to survive those does nothing to defend against the other threats to our data that we’ve always been living with.

But it does definitely give us a leg up on most ransomware.