Episode 59: Rebuilding AWS S3 in a Weekend with Valentino Volonghi

Episode Summary

Amazon’s AWS offers a tantalizing range of services at incredible prices. While not a panacea for all your cloud computing needs, it has definitely risen fast to become a critical piece of the pie for many companies looking to scale up quickly. Valentino Volonghi is CTO of AdRoll, a company that uses AWS extensively. In this episode, Volonghi relates his years of experience with AWS and all its growing pains. Today? There’s a lot of magic in S3 as well, and Volonghi explains how AdRoll leverages this magic.

Episode Show Notes & Transcript

About Valentino Volonghi

Valentino currently designs and implements AdRoll's globally distributed architecture. He is the President and Founder of the Italian Python Association that runs PyCon Italy. Since 2000, Valentino has specialized in distributed systems and actively worked with several Open Source projects. In his free time, he shows off his biking skills on his Cervelo S2 on 50+ mile rides around the Bay.

Links Referenced: 


Host: Hello and welcome to Screaming in the Cloud with your host cloud economist Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: This episode of Screaming in the Cloud is sponsored by O'Reilly's Velocity 2019 conference. To get ahead today, your organization needs to be cloud-native. The 2019 Velocity program in San Jose from June 10th to 13th is going to cover a lot of topics we've already covered on previous episodes of this show, ranging from Kubernetes and site reliability engineering over to observability and performance.

The idea here is to help you stay on top of the rapidly changing landscape of this zany world called cloud. It's a great place to learn new skills, approaches and of course technologies. But what's also great about almost any conference is going to be the hallway track. Catch up with people who are solving interesting problems, trade stories, learn from them and ideally learn a little bit more than you knew going into it.

There are going to be some great guests, including at least a few people who've been previously on this podcast, including Liz Fong Jones and several more. Listeners to this podcast can get 20% off of most passes with the code cloud20. That's C-L-O-U-D-2-0 during registration. To sign up go to velocityconf.com/cloud. That's velocityconf.com/cloud.

Thank you to Velocity for sponsoring this podcast.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. I'm joined this week by Valentino Volonghi, CTO of AdRoll. Welcome to the show.

Valentino: Hey Corey, thanks for having me on the show.

Corey: No, thanks for being had. Let's start at the very beginning. Who are you and what do you do?

Valentino: Well, I'm CTO of AdRoll Group. What AdRoll Group does is effectively build marketing tools for businesses that want to grow and are looking to make sense of everything that is happening in marketing, especially digital marketing, to help drive more customers to their websites and turn them into profitable customers, effectively.

Corey: Awesome. You've also been a community hero for AWS for the last five years or so?

Valentino: Yeah. I was lucky enough to be included in the first group of community heroes, which I think was started in 2014. It still isn't completely clear to me what exactly community heroes do besides obviously helping the company, or what we did to deserve to be called community heroes. I think lots of people such as yourself are doing a great amount of work to help the community understand the cloud and spread the reasoning behind everything that is happening in the market these days. So maybe you should be a hero as well.

Corey: Unfortunately, my harsh line on no capes winds up being a bit of a nonstarter for that. And I'm told the wardrobe is very explicit.

Valentino: Oh, okay. I didn't know that.

Corey: Exactly. It all comes down to sartorial choices and whatnot. So you've been involved with using AWS from a customer perspective for, I'm betting longer than five years.

Valentino: Yeah, probably longer than a decade, actually.

Corey: And it's amazing watching how that service has just ... I guess how all AWS services have evolved over that time span. Where it's gone from, yeah, it runs some VMs and some storage, and if you want to charitably call it a network you can because latency was all over the map. And it's just amazing watching how that's evolved over a period of time where not only was it iterating rapidly and improving itself, but it seemed like the entire rest of the industry was more or less ignoring it completely as some sort of flash in the pan. I've never understood why they got the headstart that they did.

Valentino: Oh man. Such a long, long time ago. I remember I was still in Europe, before I came over to start up AdRoll, but 2006, I think, was when S3 was first released. And I remember starting to take a look at it and thinking, "Wow, now you can put files on a system out there, and you don't really know where it lives, but I don't need to have my own machines anymore." At the time you used to buy colocation online, and there was a provisioning process for all of it; you needed to choose your memory size and you'd typically get a co-located, co-hosted, shared-host type situation. And it was expensive. And then in 2007, EC2 came out and it felt like magic. At that point in time we were running in a data center out here on Spear Street in San Francisco. And I remember we had two database machines, both RAID 5, and one machine was humming along fine, but the other one was down two drives, two drives that were failing in the RAID 5. And we started ordering the drives on Amazon or NewEgg or whatever. I think they were on back order at that time, and we needed to wait a week or two before they could arrive. At that moment I made the call: that's it, we're not doing this anymore, we are going on AWS. Just give me two weeks and I'll migrate everything, I told the CEO, and then we'll be free from the data center. And I tell you, costs will be exactly the same. And actually, that's exactly what happened.

It took two weeks, I moved all the machines over, the costs were exactly the same, but we had no more need to run to the store and provision extra capacity, or buy extra capacity, or any of that stuff. It also allowed us massive amounts of flexibility. And then very early on it was funny, because I think I lived through all of the stages of disbelief when it comes to AWS, or cloud in general, where the first complaints were, "Well, it's not performant enough." If you wanted to run MapReduce, you couldn't run it inside AWS; there simply wasn't enough I/O performance on the boxes. I was following closely when GitHub was on AWS at first and then moved to Rackspace afterwards because AWS wasn't fast enough, even for them, and they were working through some issues here and there. Some of those things were obviously real, true maturity issues. EBS has gone through a lot of ups and downs, but it's mostly been stable since then.

We're living now in a day and age where the EBS drives that you get from AWS are super stable, but it never used to be like that. You needed to adapt and get used to the fact that an EBS drive could fail, or that an entire region could go down because of EBS drives, which has happened in US East a few times in the past. But yeah, from those very few simple services with very rudimentary, simple APIs, it does feel like they're starting to add more and more, and not only breadth, because obviously that's evident to anybody at this point in time. I don't think anybody can keep up with the number of services that are being released.

But what's really surprising is that for the services where they see value and where customers are showing a lot of adoption and interest, they can go to extreme depth in the functionality they implement, the care with which they implement it, and ultimately how much of it is available. Now you get over 160, I think, different types of instances. It used to be that you only had six or seven, and now 160. Some of them are FPGA instances, which I think only maybe a handful of people in the world can program properly. And they certainly don't work at my company right now.

Corey: Well, that's always the fun question too: do you think that going through those early days, where you were building out an entire ecosystem on ... or sorry, an entire infrastructure on relatively unreliable instances and disks and whatnot, was, I guess, a lesson that to some extent gets lost today? I mean, it taught you early on, at least for me, that any given thing can fail, so architecting accordingly was important. Now you wind up with ultra-reliable things that never seem to fail, until one day they do and everything explodes. Do you think it's leading to less robust infrastructure in the modern era?

Valentino: It's possible. I think if people get on AWS thinking that we're going to run in the cloud so it's never going to fail because Amazon manages it, they're definitely making a real mistake, a very shortsighted assumption right there.

And not just because of failures. A couple of years ago, maybe three years ago, there were all of those Xen vulnerabilities coming out, and Amazon needed to patch, and entire regions needed to be rebooted. What do you do at that point, when your infrastructure is not fully automated and capable of being restored without downtime in user-facing software? You're going to need to pause development for weeks just in order to patch a high-urgency vulnerability in your core infrastructure. That's just an event that is not even the fault of anybody.

It's not even necessarily under the full control of Amazon, and you need to be ready for some of that stuff. There are lots of companies that, especially in their first journey moving stuff inside AWS, tend to just replicate exactly what they have in their own data center and move it inside AWS. I know this because, for example, that's what AdRoll did the first time we migrated into AWS. We first migrated just our boxes, and then we quickly learned that it wasn't always that reliable, and so we needed to figure some of that stuff out for ourselves. You start to realize, in our case back then, that you needed to work around many of those things. But as you said, today it isn't quite that way, and to an extent Amazon almost makes a promise about many of these services not failing, or about taking care of your infrastructure for you.

For example, if you look at Aurora, it's a stupendous, fantastic piece of database software. It's extremely fast. It's always replicated across multiple availability zones and multiple data centers. The failover time is less than a second, I think, at the moment. And when you're tasked with solving a problem, building a service, you're going to choose to build it on top of Aurora, neglecting to think about what happens if Aurora doesn't answer because the network goes off, or what happens if my machines go down because AWS misconfigured them.

Some of the biggest high-profile infrastructure issues of the last years, for example with S3, have been erroneous configuration changes being pushed to production. What do you do at that point? Your system needs to be built in such a way that it's going to be resistant, at least partially, to some of these things. Amazon is trying to build a lot of the tools around that stuff, but I think it still takes a lot of presence of mind from the developers and architects to actually do this in a thoughtful way: use the services you need to use in a thoughtful way, understand the perimeter of your infrastructure, and in particular the assumptions you're making as you're building the infrastructure.

And if you can design a gracefully degrading service, where failure of an entire subsystem is not going to lead to a complete failure to serve the website, where you progressively degrade to a less useful website while still maintaining the core service that you offer, then it improves your infrastructure quite a lot. I think this is where Chaos Monkey, Chaos Gorilla, Chaos Kong, or whatever it's called for region failure, come into play, to try to exercise those muscles. It's obviously important to have them going in production, but I think even a good start would be to have those running as you're prototyping your software, and just see what the failures bring you.

And another trend we've seen recently is the use of TLA+ as a formal specification language, where you can effectively spec your system using these formal languages and then test it using verification software, so that it highlights places where your assumptions don't check out with reality, effectively.

Corey: The challenge that I've always had when looking at, I guess, shall we say, older environments and older architectures is that back in the day, what you just described was very common: you wind up taking an existing on-prem data center app and more or less migrating it wholesale, directly, as a one-to-one migration into the cloud. That was great when you could view the cloud as just a giant pile of, I guess, similar-style resources. But now, with 150-something services in AWS alone, the higher-level services start to unlock and empower different things that weren't possible back then, at least not without a tremendous amount of work. You talked, for example, about not having enough people around who can program FPGAs. Do you think that if you were building AdRoll today, for example, you would focus on higher-level services architecturally? Would you go serverless? Would containers be interesting? Or would you effectively stick to the tried-and-true architecture that got you to where you are?

Valentino: Probably a mix. I think what's important to evaluate when building infrastructure is the skill set of the people you have working on your team, and you certainly need to play to their strengths. Ultimately they are the ones who are building and maintaining your infrastructure, not Amazon, not an external vendor, and most certainly not the open source maintainer of whichever project you use as an alternative. The other aspect is to try to understand, sometimes from subtle indications, which services Amazon is investing most of their energy, or a lot of their energy, in, so that you know they'll continue to grow, continue to receive support, and continue to get bugs and issues fixed, because you know they'll be with you for the rest of your company's life, for example.

But on the other hand, a lot of times you write software just automatically, without really thinking about a better way to write something, just because you're used to it. It's typically not easy to jump out of the habit of getting an instance going to do something. And it might be a good idea at first, but if you develop a good process to test new architectures and new ideas, you might quickly end up realizing: well, actually, I don't need to run a t2.micro or whatever for this particular thing, where every time a file is uploaded to S3 I run some checks on it. You might realize the best thing to do is to play around with a Lambda function instead, and that effectively fixes your entire problem.
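The pattern Valentino describes — a function invoked by S3 upload notifications instead of a polling instance — looks roughly like this. A minimal sketch only: the bucket name and the validation check are hypothetical, and the actual object fetch via boto3 is just indicated in a comment.

```python
# Sketch of an S3-triggered Lambda handler. S3 invokes `handler` with an
# ObjectCreated event; we run a check per uploaded object instead of
# keeping a t2.micro polling the bucket.

def validate(body: bytes) -> bool:
    """Hypothetical check: reject empty uploads."""
    return len(body) > 0

def handler(event, context):
    """Entry point invoked by S3 ObjectCreated notifications."""
    seen = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        # In a real function you would fetch and check the object, e.g.:
        #   body = boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"].read()
        #   assert validate(body)
        seen.append({"bucket": bucket, "key": key})
    return seen
```

The event shape follows the standard S3 notification structure, so the function has no servers to manage and costs nothing between uploads.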

One area, for example, that we've tested, and it's on AdRoll's technical blog, is a globally distributed, eventually consistent counter that uses DynamoDB, Lambda, and S3 together. It's effectively able to aggregate all of the counts happening in each of the remote regions into a single counter in a central region, which can then be synced back to each remote region. This way we can keep track of, in our case, how much money has been spent in each particular region, and be sure that this money is spent efficiently. The only alternative is to set up a fairly complex database of your own, make sure that update latency is fast enough and that all the machines are up and running all the time. And if anything goes down, it's a high-urgency situation, because your controls on the budgets go away.
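A rough sketch of the aggregation at the heart of that counter. The real implementation, described on AdRoll's tech blog, wires this through DynamoDB, Lambda, and S3; here the per-region counts live in a plain dict, and the DynamoDB write is only indicated in a comment — the function names are hypothetical.

```python
# Each remote region periodically publishes its local spend count; a central
# aggregator merges the per-region values into one global total, which is
# then synced back out to every region (eventually consistent).

def aggregate(regional_counts: dict) -> int:
    """Merge per-region counters into a single global total."""
    return sum(regional_counts.values())

def publish(region: str, spend_cents: int, store: dict) -> int:
    """A region reports its latest count; returns the new global total.

    In the real system `store` would be a DynamoDB table, updated with an
    atomic expression rather than a dict assignment, e.g.:
        table.update_item(Key={"counter": "spend"},
                          UpdateExpression="SET #region = :v", ...)
    """
    store[region] = spend_cents  # last-writer-wins per region
    return aggregate(store)
```

Because each region only ever overwrites its own slot, concurrent writers never conflict, which is what makes the eventual consistency tolerable for budget tracking.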

So sometimes it's really useful, especially when dealing with problems in which the communication and the flow of information aren't particularly easy for an engineer to grasp, to be able to remove an entire layer of the problem and rely on someone else to provide the SLA they're promising you. And effectively that's the case for Lambda: there's obviously a particular range of uses in which Lambda makes complete sense, from the point of view of price, of the resources needed, and of the type of computation that runs on it. If you can keep this in mind when you're making decisions, or run some tests, you can actually discover that maybe you can use Lambda and get away with not having to solve quite a challenging problem at the end of the day.

So sometimes it helps to rewrite some infrastructure just as an exercise. What I do at AdRoll as CTO is tend not to have a lot of direct reports. I consider each service team at AdRoll to be my direct report, effectively. And every six weeks they give a short presentation in which they explain the budget they've gone through, whether they have overspent or underspent, and why.

And among the many things, they also talk about their infrastructure. We have diagrams of the infrastructure. We talk about new releases from Amazon and what would be a new way to build the same thing, and they evaluate whether it would save money or not. So you kind of need to have someone in the organization, especially if you're planning to adopt some of the new technologies, whose role is effectively dedicated to staying up to date with what's going on in the world, knowing the infrastructure of your systems, and being able to make suggestions, and then let the team make the decision at that point.

Corey: What's also sometimes hard to reconcile for some people is that these services don't hold still. And I think one of the better services to draw this parallel to is one I know you're passionate about. Let's talk a little bit about S3. Before we started recording this show you mentioned that you thought that it was pretty misunderstood.

Valentino: Yeah, yeah.

Corey: What do you mean by that?

Valentino: Well, S3 has been, in my view, one of the closest things to magic that exists inside AWS. Until not long ago, the maximum amount of data that you could pull from S3 was one gigabit per second on a stream. You were limited in the number of requests per second that you could run against the same shard of S3. There was no way of tagging objects. The latency to first byte, when S3 started, was in the 200 to 300 milliseconds. It was expensive. S3 has probably undergone some of the most cost cutting that you could see out there. And part of the decreasing cost is that the standard storage class has become cheaper, but they have also added significant other storage classes that you can move your data in and out of relatively simply, without having to change a service, effectively.

It's the very same API, but a different cost profile and storage mechanism. When it all started there was just one class, US Standard, and it was pretty expensive to use, both in per-request costs and storage costs. But today, the bandwidth on a single stream is not one gigabit per second anymore; it's at least five gigabits per second. And if you have one of the instances with 100-gig networking inside Amazon, you can get all of those 100 gigs out of S3 just by fetching multiple streams. The latency you get on the first byte is well below 100 milliseconds. Range queries are very well supported, so you can fetch slices of logs inside S3. S3 is turning into almost a database now. S3 Select allows you to run filters directly on your files, decompressing them on the fly and recompressing them afterwards, or simply reading richer formats like Parquet, for example.
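To make the multiple-streams point concrete, here is one way to split an object into byte ranges for parallel fetching — the kind of ranged GETs that let several streams aggregate their bandwidth. A sketch under assumptions: the part size is arbitrary, and the actual fetch via boto3's `Range` parameter is left as a comment.

```python
# Split an S3 object of `size` bytes into HTTP Range header values so that
# several concurrent streams can each download one slice.

def byte_ranges(size: int, part_size: int) -> list:
    """Return Range header values covering [0, size) in part_size chunks."""
    ranges = []
    start = 0
    while start < size:
        end = min(start + part_size, size) - 1  # Range end is inclusive
        ranges.append(f"bytes={start}-{end}")
        start = end + 1
    return ranges

# Each range would then be fetched concurrently, e.g. with boto3:
#   s3.get_object(Bucket=bucket, Key=key, Range=r)["Body"].read()
```

Fan these out over a thread pool and the per-stream limit stops mattering; aggregate throughput scales with the number of ranges until the instance's NIC is the bottleneck.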

It honestly is hard to imagine how you could build everything that we have going on right now at AdRoll without S3. It has gotten to the point where running an HDFS cluster is not really that useful for us. If you look at EMR itself, it has a version of HBase that runs backed by S3. And I know of extremely big companies that have moved from running HBase backed by HDFS to HBase backed by S3, and have had incredible improvements in the performance, and the consistency of the performance, of HBase. HBase is very sensitive to disk performance, because it's a consistency-first database, effectively, and if the server that is currently master for a region is slow, it ends up bringing down that entire region, effectively. It's a service that has grown dramatically. And we have experimented even with using S3 as a file system, through userspace file systems in the kernel; more recent versions of the Linux kernel support these well. If you have limited use for writing, like we do, and you want to treat the file system as write-once, read-many, then S3 becomes actually surprisingly useful as well.

Netflix published an article on their tech blog talking about, for example, how they mount S3 as a local file system in order to use FFmpeg for movie decoding and transcoding. FFmpeg was not created with the idea that S3 was around, and so it needs to have the entire file available on the local system, or at least an entire block available; it doesn't work well with streams. So if you can abstract that part away from the FFmpeg API and move it into the file system, you can suddenly use S3 as some kind of almost-file system. And we've done a similar thing when it comes to processing columnar files or indexed files inside S3, where if you know exactly the range of data that you want to access, you can just read it straight out of S3.

We use it as the communication layer between the map and the reduce stages of our homegrown MapReduce framework. Again, it allows us to cut away thousands of hours per day of waiting time for downloading a file to local disk before processing it there. We can just process it right away and cache it on the box after it's been downloaded. It's quite remarkable: the speed increase, the cost decrease, S3 Select. I think we're going to see in the near future more and more databases that start to use S3 as the actual backend for their storage, without worrying about the limitations of local disks.

And effectively we'll be able to scale in a stateless way adding as many machines as you want and respond to as much traffic as you can without needing to worry about failures either. It's an incredible amount of opportunity and possibility that is coming down in the future that I'm really excited to see become real.


Corey: I think that requires people to update a lot of their understanding of it. I mean, one of the things I've always noticed that's been incredibly frustrating is that people believe it when it says it's the Simple Storage Service. Oh, simple. And you look on Hacker News and that's generally the consensus: well, S3 doesn't sound hard, I could build one of those in a weekend. And you see a bunch of companies trying to spin up alternatives to this, companies no one's ever heard of before. Oh, we're going to do S3, but on the blockchain, is another popular one that makes me roll my eyes back in my head so hard I pass out.

You're right. This is the closest thing to magic that I think you'll see in all of AWS. And people haven't seemed to update their opinion. I think you're right, it's getting closer to a database than almost anything else. But the discussions around it tend to be, well, a little facile, for lack of a better term. Well, there was this outage a couple of years ago and it went down for four hours in a single region, and that's a complete nonstarter, so we can't ever trust it. Who's going to be able to run their own internal data store with better uptime than that? Remarkably few people.


Valentino: Yeah. I mean, AdRoll has used 17 exabytes of bandwidth from S3, just for our business intelligence workload from EC2 to S3, this past month. I don't even know how we would even start to do that ourselves. If a router communicating between S3 and whatever instance we have going goes out, we're out, we're out for good.

S3 has multiple different paths to reach EC2, and they're all redundant. Everything internal to each machine is obviously redundant. Objects are replicated across multiple zones. And all this bandwidth is available across multiple zones, because I'm storing data inside the region, so it's already available in multiple data centers. The number of boxes needed to aggregate to 17 exabytes is quite impressive as well, and we have no people thinking about this. I think we run over 20 billion requests per month on S3. I'm pretty sure that bucket, if it were made public, would be one of the biggest properties on the web in terms of volume. And I just can't see us doing it ourselves: processing 20 billion requests per month, with files that are sometimes significantly big, is going to take a lot of people.
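Back-of-the-envelope math on what self-hosting that bucket would mean: 20 billion requests a month is a sustained rate most home-grown storage clusters never see. A quick check, assuming a 30-day month and perfectly even traffic (real traffic would peak much higher):

```python
# Rough arithmetic: sustained request rate implied by 20 billion
# S3 requests per month.

requests_per_month = 20_000_000_000
seconds_per_month = 30 * 24 * 3600  # 2,592,000 seconds in a 30-day month

sustained_rps = requests_per_month / seconds_per_month
print(round(sustained_rps))  # about 7,716 requests per second, around the clock
```

That's thousands of requests every second of every day, before accounting for peaks, retries, or replication traffic.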

Corey: Exactly. People like to undervalue their own expertise, what their time costs, the opportunity cost of focusing on that more than other stuff. And you still see strange implementations of trying to mount S3 as a FUSE file system. Treating it like that has never worked out for anything I've ever seen, but people keep trying.

Valentino: Yeah. The FUSE file system is an interesting one, I think. Things might change in the future, but it needs to be done with some concept of what you're doing. It really isn't a file system, but it works for a certain subset of the use cases. And we're not even necessarily talking yet about all of the compliance side of things.

So encryption, the ability to rotate your keys, to set permissions on who can or cannot access, tagging each object, building rules for accessing the objects or the prefix based on the tags available on that object using IAM policies and-

Corey: Life cycle transitions, object locks so no one can delete it. Litigation hold options, and then take a look at deep archive. $12,000 a year to store a petabyte? That's who cares money.
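That $12,000 figure checks out roughly against Glacier Deep Archive's published rate of about $0.00099 per GB-month at the time of this episode. A quick sanity check, with that price as the assumption:

```python
# Sanity-check the "petabyte for ~$12K a year" claim at the Deep Archive
# rate of roughly $0.00099 per GB-month (the price when this aired).

price_per_gb_month = 0.00099
petabyte_in_gb = 1024 ** 2  # 1 PB = 1,048,576 GB

annual_cost = petabyte_in_gb * price_per_gb_month * 12
print(f"${annual_cost:,.0f} per year")  # about $12,457
```

Storage alone, of course — retrieval from Deep Archive is billed separately and takes hours, which is the trade-off that makes the rate possible.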

Valentino: Yeah, that's exactly, absolutely right. Plus, it doesn't matter: at a certain point in time, if you're not compliant and you're storing that much data, you just can't, so you're going to have to delete it all. There are a lot of different security regulations. GDPR is incoming, well, GDPR isn't incoming, actually, it came out a year ago. But with GDPR here, and the California privacy law incoming at the end of this year, is your storage system going to help you become compliant? Who's going to build all of the compliance tools on top of your storage system and make sure that you remain compliant until kingdom come? So basically, I mean, it's awesome, and I think it's a healthy exercise for every engineer to always question what value you're getting out of a service, and to try to sketch out and understand the infrastructure, whiteboard it out, and maybe do a quick cost estimation.

But it's never enough to have just the engineer in there. Security is a stakeholder in these kinds of decisions. The operations team is a stakeholder. The business is a stakeholder. The business might not be happy, as you said, to spend $500,000 a year for two engineers to work on an S3 replacement when they can spend $12,000 to store a petabyte inside S3. It's just a lot easier. Twelve grand is really who-cares money.

Corey: Exactly. Especially when you're dealing with what it takes to build and run something that leverages that much data. It becomes almost a side note. And the durability guarantees remain there as well. It feels like one of those things we could go on with for hours and hours.


Valentino: Yeah. And the other aspect that is very important is how close S3 is to the computing power. Because, as I said, 17 exabytes of data just for BI purposes: I cannot do that cross-data-center. There is no way; that would cost the business everything in bandwidth. Many times other vendors approach AdRoll asking us to use their storage solution, but either I need my own data center to deploy it, and then I'm not close to where the compute capacity is, or it's another hosted system where I don't need a data center, but it's not located near my compute capacity. And so I lose the piece of the equation that makes all of the stuff I want to do worthwhile.

To an extent, S3 is the biggest lock-in reason behind EC2. It really is hard to replicate all of the different bits and pieces of technology that are built on top of S3, and in particular its being so close to so many services that are easy to integrate with each other, where using things such as Lambda or EC2 makes it very compelling. Other cloud vendors are obviously always catching up and getting there, but I don't think they're quite at the level of customization, security, compliance, and ease of use that Amazon S3 has. It has some really hard aspects to it as well, but I think by and large it's a huge success story.

Corey: If people are interested in hearing more, I guess of your wise thoughts on the proper application of these various services, where on the Internet can they find you?

Valentino: Oh, the easiest way to find me is to shoot me questions or comments, or follow me, on my Twitter account, @dialtone_. AdRoll also has a tech blog at tech.adroll.com. We usually publish a lot of interesting articles about the goings-on with our infrastructure: things such as the globally distributed, eventually consistent counter that I mentioned earlier, but also our extreme use of the spot market, or our strange use of S3 as a quasi-file system for processing our MapReduce jobs, which is also described on our blog.

And generally speaking, I'm more than happy to answer questions and whatnot at local events. I usually go to as many local events as I can here, either AWS user group events or other meetups, or I go to a random set of other conferences as well.

Corey: Thank you so much for taking the time to speak with me today. I appreciate it.

Valentino: Thank you.

Corey: Valentino Volonghi, CTO of AdRoll. I'm Corey Quinn, and this is Screaming in the Cloud.

Host: This has been this week's episode of Screaming in the Cloud. You can also find more Corey at screaminginthecloud.com, or wherever fine snark is sold.

This has been a HumblePod production. Stay humble.
