Hyperscaler Infrastructure for the Masses with Jessie Frazelle, Steve Tuck, and Bryan Cantrill of Oxide Computing

Episode Summary

Jessie Frazelle, Steve Tuck, and Bryan Cantrill founded the Oxide Computer Company. They’re building a rack-scale server design to deliver cloud hyperscale innovations around density, efficiency, cost, reliability, manageability, and security to everyone running on-premises compute infrastructure. Join Corey, Jessie, Steve, and Bryan as they discuss how and why a lot of companies still run on-premises infrastructure in the cloud era; the last time Corey bought a box; what happens when you rely on myriad vendors for your hardware and software needs; why two-person startups should start in the cloud and when they should move on-prem; how if the cloud is renting a hotel, on-prem is owning a house; musings on tech and sustainability; how the cloud is not a panacea; the current status of Moore’s law; and more.

Episode Show Notes & Transcript

Announcer: Hello and welcome to Screaming in the Cloud with your host, Cloud Economist Corey Quinn. This weekly show features conversations with people doing interesting work in the world of Cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey Quinn: And this episode is sponsored by InfluxData. Influx is most well known for InfluxDB, which is a time-series database that you use if you need a time-series database. Think Amazon Timestream, except actually available for sale and with paying customers. To check out what they're doing both with their SaaS offering as well as their on-premises offerings that you can use yourself because they're open source, visit influxdata.com. My thanks to them for sponsoring this ridiculous podcast.

Welcome to Screaming in the Cloud, I'm Corey Quinn. I'm joined this week by not one guest, but three, because you go ahead and get Jessie Frazelle, Bryan Cantrill, and Steve Tuck together in a room and then tell them no, they can't all speak. Together, they're Oxide Computer. Welcome to the show, folks.

Bryan Cantrill: Thanks for having us.

Steve Tuck: Glad to be here.

Jessie Frazelle: Yeah. Super exciting.

Corey Quinn: Let's start at the beginning. You're called Oxide Computer, so obviously you focus on Rust. What does the company do?

Bryan Cantrill: It is a bit of a tip of the hat to Rust, actually. I did fall in love with Rust, but it's not only about Rust. So-

Corey Quinn: No, at some point, you've got to stop talking about work and actually do work.

Bryan Cantrill: That is true.

Jessie Frazelle: Yeah. I guess our tagline is hyperscaler infrastructure for everyone else. If you think about how Amazon, Google, Microsoft, the hyperscalers, built their internal infrastructure, we are trying to do that for everyone else, with a rack-scale server based off the Open Compute Project, going all the way up the software layer to deploying VMs. You can basically plug in a rack and then you are able to deploy VMs within a few minutes of it booting. That's the goal.

Corey Quinn: I'm sure you've been asked this question once or twice, but I say the quiet part out loud. Why in the world would you build a hardware computer startup in the time of Cloud? Did no one hug any of you enough as children? Perhaps your parents were cousins? What inspired you to do this in the year of our Lord 2019?

Jessie Frazelle: Well, a lot of people still honestly run on-premises. If you talk to companies, a lot of where the hype is, with containers and Kubernetes and a lot of other things like that, is super, super forward-facing. A lot of people run on-premises for good reasons, either strategic, or because the unit-cost economics of them actually running in the Cloud are too expensive. Or if you're a finance company doing high-frequency trading, you need good latency. Or you really care about the security of your infrastructure, because you saw with the Capital One breach what happens to a bank running in the Cloud. It's actually super horrifying if they get popped.

I think that there are really, really good reasons for running on-premises, and a lot of that market has been neglected for so long. I spent a lot of my funemployment talking to a lot of these companies, and honestly, they're in a great deal of pain. We even saw it during the raise: people don't seem to believe these companies exist, but they do. We're here to disrupt that and build them something that is actually nice.

Bryan Cantrill: Corey, this is the life that Steve and I lived. We worked at a Cloud computing company, actually. We're very pro-Cloud, just to be clear. We love the Cloud, but we also know that there are economic reasons in particular that you may not want to actually rent your compute. You may actually want to buy a computer or two, as it turns out, if your Cloud bill is high enough.

Corey Quinn: Well, looking at the Cloud bills of large companies, there's a distinct break from what they talk about on stage at re:Invent where, "Oh, we're going to talk about these machine learning algorithms and all of these container things and Serverless is the future." And you look at the bill and it's all primarily just a giant pile of EC2 instances. The stuff that they talk about on stage is not the stuff that they're actually spending the money on. I'm curious as much as we hear about Cloud, is there still a sizable market running data centers?

Steve Tuck: Yeah, there is. I think your question is one that we've heard quite a bit in the market: wasn't everything going to the Cloud? Having run lots of infrastructure ourselves on-premises and then talking to a lot of the enterprise market, there is still an enormous amount of infrastructure that is being run, and that they are expecting to run for years and years and years. Frustratingly for them, and for ourselves in the past, that market has been neglected, and there has been this false dichotomy that if I'm going to run on-premises, I can't have the kind of utility or elasticity or just ease for developers that I would have in the public Cloud. You're right, 80%, 90% of the bulk of usage for folks today is more like EC2: let me be able to easily spin up and provision a VM via an API, of various sizes, with various software running on it, and store some data from that and connect to that over the network.
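The shape of the API Steve describes is easy to picture. Here is a minimal sketch of building the request body for provisioning a VM of a given size; the endpoint name and every field name are invented for illustration and are not a real Oxide (or any vendor's) API.

```python
import json

# Hypothetical request builder for a rack-scale provisioning API.
# The endpoint shape ("POST /v1/instances") and all field names are
# invented for illustration; they are not a real vendor API.
def provision_request(name, vcpus, memory_gib, image, disk_gib=32):
    """Build the JSON body for a hypothetical POST /v1/instances call."""
    if vcpus < 1 or memory_gib < 1:
        raise ValueError("instance size must be at least 1 vCPU / 1 GiB")
    return json.dumps({
        "name": name,
        "vcpus": vcpus,            # "of various sizes"
        "memory_gib": memory_gib,
        "boot_image": image,       # "with various software running on it"
        "disk_gib": disk_gib,      # "store some data from that"
    })

body = provision_request("web-01", vcpus=4, memory_gib=16, image="ubuntu-18.04")
print(body)
```

The point is less the specific fields than that a single declarative call, rather than racking and cabling, is the developer-facing unit of work.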

Bryan Cantrill: Corey, when was the last time you bought a box? Like a 2U box or whatever, and stood it up?

Corey Quinn: Oh, probably 2012.

Bryan Cantrill: Yeah, right. We're here to tell you that the state of the art hasn't changed that much since 2012 or since-

Steve Tuck: ... Or 2008.

Bryan Cantrill: Yeah, right. Exactly. It's basically still a personal computer. If you are in the unfortunate position where you are actually Cloud-aware, you know what the Cloud is, and then you have to, for one of the reasons Jess mentioned, economic reasons or strategic reasons or so on, actually stand up your own Cloud because you actually want to own your own computers, it's very dismaying to discover that the state of the art has advanced so little. And then, really heightening the level of pain, Google and Facebook and others make it very clear what infrastructure they are on, and it's not the infrastructure that you just bought. In fact, it's much, much, much better, and there's no way you can buy it. This is kind of the fundamental issue. If you want to buy Facebook's Open Compute servers, Tioga Pass or Bryce Canyon, it's a nice box. Or if you want to buy Google's warehouse-sized, data-center-sized computer, they're not for sale. If you're a customer who's buying your own machines, this is really frustrating.

Steve Tuck: Yeah. For six or seven years, there's been innovation going into these hyperscalers' infrastructure to deliver the services we all consume: Amazon, Google, Microsoft, Facebook, et cetera. And none of that innovation has made its way into the larger enterprise market, which is frustrating.

Bryan Cantrill: And then, you're told they don't exist, and they're outraged.

Steve Tuck: ... Because it's all going into the public Cloud.

Jessie Frazelle: Yeah.

Steve Tuck: So what does it matter?

Corey Quinn: Yeah. If I go and buy a whole bunch of servers to put in a cage somewhere at a data center, you're right, my answer hasn't really changed in years. I would call up Dell or HP or Supermicro; I would be angry and upset regardless of which direction I went in, and no one's going to be super happy. Nothing's going to show up on time, and what I get is going to be mostly the same, except for those one or two edge-case boxes that don't work the same way, that slowly cause me to rip the remains of my hair out, and eventually, I presumably have a data center up and running.

Cloud wasn't, for me at least, this thing that was amazing and solved all of the problems around instant-on and on-demand compute. But it got me out of the data center, where I didn't have to deal with things that did nothing other than piss me off. I guess my first question for you then becomes: what makes you different than the rest of those, shall we say, legacy vendors? And then, I want to talk a little bit about how much of what the hyperscalers have done is applicable if you're not building out rows and rows and rows and rows of racks.

Jessie Frazelle: I think honestly, what makes us different is the fact that we have experience with Cloud compute, and we also have experience in making tools that developers love. We also have the experience, with Bryan and Steve, of running on-prem and all that pain. What you get, like you were saying, when you buy a few servers is a kit car, and so what we are giving people is not a kit car. It's like the Death Star, all built together.

Bryan Cantrill: ... Is that in our marketing literature, the one with the sun?

Steve Tuck: We have to update that slide.

Bryan Cantrill: I know, Death Star. It feels a bit like, what's Alderaan in that?

Jessie Frazelle: I know. I should-

Bryan Cantrill: ... We should go to that one.

Jessie Frazelle: ... I should have used that in the raise honestly.

Corey Quinn: You'll get attention.

Bryan Cantrill: Actually, that was not the point I was trying to make. It should be said, Corey, that I'm not sure that Jess believed how bad the existing vendors were. I think she heard Steve and me complain, but then she was still surprised when she talked to customers.

Jessie Frazelle: Yeah. No, this is very true, because they did complain, and then I was like, "Okay, I'm going to go talk to some people." And so, I spent a lot of time actually tracking down a bunch of people and getting on the phone with them. Honestly, then I was like, "Whoa, Dell is really bad. I didn't realize that everyone has the same exact problem." Not only that, you get gaslit by these vendors into thinking this problem is only yours.

Corey Quinn: You bought the wrong rails. No one has ever done that before. You must be simple.

Bryan Cantrill: We've never seen this problem before. You're the only ones to report this problem. I cannot tell you how deeply, to the marrow, frustrating it is. Honestly, Corey, you made this point too about AWS and the Cloud taking this pain away. That is one thing that AWS really does not do: they do not try to tell a customer that they are the only ones seeing a problem. Logically, they can't get away with that, right? They can't get away with saying, no, Frankfurt is only down for you.

Corey Quinn: Yeah, Twitter would disagree.

Bryan Cantrill: Exactly. But, you don't have a way of going to Twitter and understanding, "Hey, is anyone else seeing a level of DIMM failure that seems to be too high?", to take an example. We were seeing a problem, guys. Steve and I saw this in 2011, when we were seeing a pandemic of reliability issues. We were being told we were the only ones seeing this problem, and it's a problem that we couldn't possibly have created, so it made no sense.

I asked a room full of, I don't know how many were in there, like 300 people, "Could you raise your hand if you've ever seen a parity error on your RAID controller?" All of a sudden, 20 or 30 angry hands shot up. It wasn't very many people, but we were not the only ones-

Steve Tuck: ... All these people discovering they weren't the only one. They all have the same message.

Bryan Cantrill: They were not the only ones. I was looking around and I'm seeing, "Wait a minute, my hand isn't the only one that's up," and all of a sudden you have an instant therapy group. That shows you the level of frustration, and that level of frustration has only grown in the last decade.

Steve Tuck: A big part of the frustration stems from the disconnect between hardware and software. To your example, Corey, you go buy a bunch of hardware, and if you can get it all uniform and get it stood up and racked and test the drives, now you have a collection of hardware, but yet now you have to go add software.

Bryan Cantrill: But Steve, we can just run Kubernetes on bare metal, right, Jess?

Steve Tuck: Stop it.

Jessie Frazelle: Don't even. Do not. That was not the third rail that you're supposed to touch by the way.

Bryan Cantrill: I love that third rail. That's one of my favorite third rails to touch.

Steve Tuck: But, tightly integrating hardware and software together is going to be one key differentiation point for us, Corey. And owning both the hardware and the software side so the customer doesn't have to, so they can effectively roll a rack in, plug it in, and expose the APIs and operational tools that their developers love. This is the kind of stuff you would have expected to come out of hardware 20 years ago, 10 years ago, and so we're finally going to do it.

Corey Quinn: I will say that what you're saying resonates, as far as people having anger about what's going on in this, I guess, world of data centers. I gave a talk at SCALE at the beginning of 2019 called The Cloud is a Scam. Apparently, someone ripped it off and put it up on YouTube a few weeks ago, and I just glanced at it: 43,000 views and a bunch of terrible comments, and someone's monetizing it who is not me. Great, awesome. At least the message is getting out there. But the entire talk, in five minutes, is just a litany of failures I had while setting up a series of data centers in one year at an old job. Every single one of those stories happened, and each one resonates with people. The picture of the four different sizes of rack nuts that look exactly the same, that are never actually compatible with one another; a Schrodinger's GBIC, where they send the wrong one every time and there's no way out of it. Just going down this litany of terrible experiences that everyone can relate to.

Steve Tuck: Was that GBIC a Sun GBIC by the way?

Corey Quinn: No, I think it was a Cisco GBIC at that point, just because we all have our own personal scars.

Steve Tuck: Serious GBIC problem. That was a century back.

Corey Quinn: Yeah, and the fact that it resonates as strongly as it does. It's fascinating.

Bryan Cantrill: Yeah. In terms of the kit car, the extant data center offerings are not good. The problem is that you're left buying from all these different vendors. As Steve says, you've got this hardware/software divide, and what we think people are begging for is an iPhone in their DC. They're begging for that fully integrated, vertically integrated experience where, if there is a failure, I know the vendor who is going to actually go proactively deal with it, just as you've got at Google and Facebook and so on. A term that I think Jess coined, or maybe not, but I love it: infrastructure privilege, which those hyperscalers definitely have, and we want to give that infrastructure privilege to everybody.

Corey Quinn: Well, that does lead to the next part of the question, which is, okay, if I want to build a data center like Google, for example. Well, step one, I'm going to have an interview series that is as condescending and insulting to people as humanly possible. Once I wind up getting the very brightest boys, and yes, they're generally boys, who can help build this thing out, super, that's just awesome. But now, most of what it feels like they do requires step one: build or buy an enormous building; step two: cooling, power, et cetera, et cetera. And suddenly, step 35 is: then worry about getting individual optimizations for racks. Then going down a tier even further, talking about specific individual servers. What scale do you need to be operating at before some of those benefits become available to, shall we say, other companies who have old-school things like business models?

Jessie Frazelle: Yeah. Honestly, among the people that I was talking to, and I won't name names, there were numerous companies who contemplated doing that. But, the problem is, of course, obviously, that's not their business. Their business is not being a hyperscaler. You very rarely see companies doing it unless there is this huge gain for them economically. A lot of the value add that we're providing is giving them access to almost all the resources of a hyperscaler without having to hire this massive team and have an entire build-out of a data center for it. You can still host in a colo.

Bryan Cantrill: Yeah, I think the point is that you start with a rack, right? You can do it with a single rack. Especially if that rack represents both hardware and software integrated together, you can get a lot of those advantages. Yes, there are advantages, absolutely, when you're architecting at an entire DC level or an entire region level. But, we think that those advantages can actually start at a much lower level.

Now, we don't think it starts much lower than a rack. If you just want a 1U or 2U server, no, that's going to be too small a scale. The smallest scale that we will go to is a rack. It is a rack-scale design.

Steve Tuck: To your question on what is the applicability, how can people use this? Do they have to reimagine their data centers? Most modern data centers today have sufficient power capacity to support these rack-scale designs. Especially if you're a company that is using one of these sites like Equinix or Iron Mountain or others: those service providers have really come a long way in modernizing their facilities, because they need to help customers who want to reduce PUE, and they need to help customers who are out of space and need to get much better density and utilization out of their space. You can take advantage of benefits on one rack, and these racks can fit in many modern data centers today.

Corey Quinn: Is there a minimum size or is this one of those things where, "Oh cool, I will take one computer please and it's going to be just like my old desktop except now it sounds like a jet engine taking off every time someone annoys me"?

Bryan Cantrill: I think a rack is going to be the minimum buy.

Corey Quinn: Okay. So it needs to be a much larger desk is what I'm hearing?

Steve Tuck: Reinforce the desk.

Bryan Cantrill: Yeah, exactly. Reinforce the desk. It'll fit on a 24-inch floor tile and it will fit in an extant DC, within limits, but a rack is going to be the minimum design point.

Corey Quinn: I've frequently said that multicloud is a stupid best practice, and I stand by that. However, if your customers are in multiple Clouds and you're a platform, you probably want to be where your customers are, unless you enjoy turning down money. An example of that is InfluxData.

InfluxData are the manufacturers of InfluxDB, a time series database that you'll use if you need a time series database. Check them out at influxdb.com.

As you've been talking to, I guess, prospective customers, as you went through the joy of starting up the company, did they all tend to fit into a particular profile across the board? For example, it feels like a two-person company that is trying to launch the next coming of Twitter for Pets is probably not going to be your target market. Who is?

Jessie Frazelle: I really tried to get a diversity of people who I was talking to, so it's a lot of different industries and a lot of different sizes. There are super large enterprises with numerous teams internally, where we'll probably only interact with maybe one of those teams. There are also smaller-scale companies, not to the scale of only two people, but maybe a hundred-person company that is interested as well. It's really a variety of different people. I did that almost on purpose, because you can easily get trapped into making something very specific for a very specific industry.

Bryan Cantrill: In terms of Twitter for Pets, they should start in the Cloud. I would encourage anyone, any two-person software startup: go start in the Cloud; do not start by buying your own machines. But, when you get to a certain point where VCs are now looking not just for growth but for margin, say, and you're approaching that S-1 point, or you're approaching the point where you've got product-market fit and now you need to optimize for COGS, now is the time to consider owning and operating your own machines. That's what we want to make easy.

Jessie Frazelle: Totally. We're not convincing anyone to move away from the Cloud for bad reasons for sure.

Steve Tuck: Well, I think everyone's going to use both. Very few and far between are the companies that we've spoken to so far that are not going to have a large footprint in the public Cloud, for the use cases in the business that have less predictability: new apps they're deploying, seasonality, different usage patterns. But for those more persistent, more predictable workloads that they can see multi-year usage for, a lot of them are contemplating building their own infrastructure on-premises, because there are huge economic wins and you get more control over the infrastructure.

Corey Quinn: There's a lot of opportunity to, I guess, build baseline workloads out in data centers, but the whole beautiful part of the Cloud is I can just pay for whatever I use, and it can vary all the time. Now, excuse me, I have to spend the next three years planning out my compute usage so I can buy reserved instances or savings plans now that are perfectly aligned, so I don't wind up overpaying. There starts to be an idea that maybe there is an on-demand story someday for stuff like this, but I'm not sure the current configuration of the Cloud is really the economical way to get there.

Steve Tuck: Yeah, it is still much more the hotel model. If you're going to stay in the city for a week, a hotel makes sense. But as you get to a month, and then three months, and then a year, it's pretty difficult to get to the right unit economics that are necessary for some of these persistent workloads. I think bandwidth is part of it. Compute, storage, and bandwidth, when used in a persistent fashion, are pretty expensive.

Now, if those work with one's business, I think, when there's not an economic challenge or a need for control or for being able to modify the underlying infrastructure, the Cloud is great. It's just, we're hearing from more and more companies where it is a pretty big spread for the infrastructure that they're running. Even with reserved instances, it's tough to put a three-to-five-year value on it.
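The hotel-versus-house point reduces to simple break-even arithmetic. A sketch with invented numbers, none of which are real cloud or hardware prices:

```python
# Months after which owning hardware beats renting cloud capacity,
# for a steady, predictable workload. All dollar figures are invented.
def months_to_break_even(cloud_monthly, hardware_capex, ops_monthly):
    saving_per_month = cloud_monthly - ops_monthly
    if saving_per_month <= 0:
        return None  # owning never pays off at these rates
    return hardware_capex / saving_per_month

# e.g. a $40k/month cloud bill vs. a $500k rack that costs $15k/month to run
print(months_to_break_even(40_000, 500_000, 15_000))  # 20.0 months
```

The whole argument for "predictable, multi-year workloads" is that the horizon is long enough to get past that break-even point; for a one-week spike, it never is.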

Bryan Cantrill: We've been convinced that this post-Cloud SaaS workload is coming, in part because Steve and I lived it in our previous lives, so we've been convinced it's coming economically. I think it's a little bit surprising the degree to which it's arriving. We've increasingly been surprised by conversations that we've had with folks where we're like, all right, well, this company is going to die in the Cloud; they would never contemplate leaving the Cloud.

And then, we learn that, no, actually, they've got a high-level mandate to get off the Cloud by 2021 or what have you.

Steve Tuck: Or move X percent off.

Bryan Cantrill: Or move X percent off, or what have you. It's surprising. I think it shows that they've got high-level mandates to either get off entirely or to move to that hybrid model that Steve described. Especially when you look at things like bandwidth. Bandwidth, by the way, was on the bingo card that Steve and I had for this little panel.

Steve Tuck: Oh.

Bryan Cantrill: Yeah, I know there's no way you're going to let Corey get through an entire episode without mentioning bandwidth costs because bandwidth costs are outrageous on AWS.

Corey Quinn: The problem isn't even that they're expensive, because you can at least make an argument in favor of being expensive. The problem I have is that they are incompre-freaking-hensible, as far as being able to understand what it's going to cost in advance. The way you find out is the somewhat titillatingly named suck-it-and-see method. Namely, how do you figure out if a power cable in a data center is live? Well, it's dangerous to touch it, so you hand it to an intern to suck on the end of the cable, and if they don't get blown backwards through the rack, it's probably fine. That's what you're doing, except with the bill. If your bill doesn't explode and take your company out of business, okay, maybe that data transfer pattern is okay this month.
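Part of why egress bills are hard to predict in advance is that pricing is tiered, so the bill is a piecewise function of usage. A sketch of the calculation, where the tier boundaries and rates only roughly approximate the classic AWS internet-egress schedule and should not be read as current pricing:

```python
# Tiered egress pricing: (gigabytes in tier, dollars per GB).
# These tiers roughly approximate the classic AWS schedule;
# they are illustrative, not current pricing.
TIERS = [
    (1, 0.00),          # first GB free
    (10_239, 0.09),     # up to 10 TB
    (40_960, 0.085),    # next 40 TB
    (102_400, 0.07),    # next 100 TB
    (float("inf"), 0.05),
]

def egress_cost(gb):
    """Dollar cost of transferring `gb` gigabytes out, walking the tiers."""
    cost, remaining = 0.0, gb
    for tier_gb, rate in TIERS:
        used = min(remaining, tier_gb)
        cost += used * rate
        remaining -= used
        if remaining <= 0:
            break
    return round(cost, 2)

print(egress_cost(1_000))  # roughly $89.91 under these assumed tiers
```

Even with the schedule in hand, you still have to know your transfer pattern in advance, which is exactly the suck-it-and-see problem Corey is describing.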

Bryan Cantrill: That's it, and I've always said that, you know who loves utility billing models? Utilities. People don't love it. It's like... Corey, I know you've got teenagers, but-

Corey Quinn: She's two at the moment, so it's almost a threenager though. We'll see.

Bryan Cantrill: ... but okay, even like you know-

Steve Tuck: Oh, four, five.

Bryan Cantrill: Oh, bandwidth overages do not start at too young an age at this point.

Steve Tuck: Yeah. Five years old [crosstalk 00:24:14].

Bryan Cantrill: Five years old and like, you're scared as hell that they're going to be on the cellular network when they-

Steve Tuck: Watching YouTube channels.

Bryan Cantrill: ... watching, exactly. Being on the Cloud is like having thousands of teenagers that are on YouTube videos all the time and you're praying that they're all on the wifi. They all claim they are but of course half of them are accidentally on the cellular.

Corey Quinn: As you were talking to a variety of VC types, presumably, as you were getting the funds together to launch and come out of stealth mode, what questions did you wind up seeing coming up again and again? They can be good questions, they can be bad questions, they can be hilarious questions. I leave it to you.

Steve Tuck: What a spectrum of questions.

Jessie Frazelle: I actually think what was most fascinating is that nobody really asked the same questions. A lot of people had differing opinions, almost so much so that if you were to combine all the VCs we talked to, they would not align at all.

Bryan Cantrill: Right. You get one kind of super Voltron VC that understands everything and then one just total idiot VC that understands nothing.

Corey Quinn: The thing that was-

Steve Tuck: And everything in the middle.

Corey Quinn: Everything in the middle. The thing that was also interesting, and I don't know if you two felt the same thing, but I felt that the things people would often have the most angst or the most questions for us about were the things they understood the least.

Jessie Frazelle: Yes. There were other VCs who understood those things the most, and then for all the other things, they were like, wait, but what about...? So if you combined them together, you would almost get our ideal scenario, where they understand everything, or you would get the worst scenario, where they understand absolutely nothing.

Steve Tuck: And ask the most questions.

Corey Quinn: Or the best questions, where it is not entirely clear which side of that very large spectrum they're on. Well, what about magnets? You're sitting there trying to figure out: are they actually raising a legitimate concern about some weird magnetic issue that you hadn't considered, or do they have no actual idea how computers work, or is it something smack dab in the freaking middle?

Steve Tuck: Or is it a ploy to try and throw you off?

Bryan Cantrill: We've got to talk about some of the dumbest questions we got.

Jessie Frazelle: Okay. Can we?

Bryan Cantrill: Yes.

Jessie Frazelle: Okay. There was one question where, we were trying to raise enough money to hire enough people to go build this, basically, and a lot of the questions that would come up would be like, "What can you do with, like-

Steve Tuck: A lot less money?"

Jessie Frazelle: A seed thing. Just a seed thing, something like $1 million.

Corey Quinn: Fail mostly.

Jessie Frazelle: One rack, can you build them?

Steve Tuck: Get some proof points and get off.

Jessie Frazelle: One rack?

Bryan Cantrill: Right. There was one VC in particular who had said, "Can you shrink the scope of the problem a bit, so that you can take less investment, prove it out, and then scale the business? What if you were to, say, shrink this down to, like, three racks?"

Jessie Frazelle: Three racks.

Bryan Cantrill: If you just did the three racks, would that then prove things out? Instead of going big and building a lot of racks, what if you just built three?

Steve Tuck: Corey, to clarify, most of what we're building is software so that first rack is actually the expensive one.

Bryan Cantrill: ... 90% of the time.

Steve Tuck: The second rack is a lot cheaper.

Jessie Frazelle: Eventually you cross the chasm, and you have a rack, and then you can go sell that rack to everyone.

Corey Quinn: And then, you go pitch to SoftBank on the other hand like, "Okay, we like your perspective but what can you do with $4 billion?" And the only acceptable answer that gets the money is something monstrous.

Bryan Cantrill: You know what's funny is when we were first starting to raise, we're like, "What would we do if SoftBank came to us?" And then, we realized very shortly after starting our raise like, "SoftBank is not going to be coming to us or anybody else."

Steve Tuck: A long time.

Corey Quinn: They're hiding from an investor who has a bone saw.

Bryan Cantrill: Exactly. [crosstalk 00:27:53] God, that got dark.

Jessie Frazelle: That was really dark.

Steve Tuck: That got really dark. Yeah, it was.

Corey Quinn: This is why they don't invite me back more than once for events like this. But now, it's easy to look from the outside and make fun of spectacular, stumbling failures and whatnot. It's also interesting to see how there are serious concerns, like the one you're addressing, that aren't, shall we say, the exciting things with the potential to revolutionize society. It doesn't necessarily, and correct me if I'm wrong, sound like that's what you're aiming at. It sounds more like you're addressing: I need to build some server racks, and I want to be able to do it in a way that doesn't make me actively hate my own life. So, maybe we can drag them kicking and screaming at least into the early 2000s, but anything out of the '80s works.

Bryan Cantrill: Yes. Actually, I think we do view ourselves as changing things a bit more broadly, in that we are bringing innovations that are really important, clear innovations that the hyperscalers have developed for themselves and have actually very charitably made available to others.

The Open Compute Project that was originally initiated by Facebook was really an attempt to bring these innovations to a much broader demographic. It didn't exactly, or hasn't yet, I should say, hit that broader demographic; it's really been confined to the hyperscalers. These advantages are really important. It was really interesting: we were talking to Amir Michael, who we had on our podcast and had just a fascinating conversation with. Amir was the engineer at Facebook who led the Open Compute Project. A big part of the motivation for Amir was the ecological efficiency that you get from designing these larger-scale machines, rack-scale designs, and driving the PUE down. That was a real motivator for him, a really deep, earnest motivator, and an earnest motivator for Facebook, and the OCP was about allowing other people to appreciate that.

I don't think it's too ridiculous to say that, as we have a changing planet, we're much more mindful about the way we consume energy. Actually delivering these more efficient designs to people is not just about giving them a better technology but a more efficient one as well.

Jessie Frazelle: Yeah, the power savings are actually huge. It's really good.

Steve Tuck: Yeah. Corey, I know you had someone on your podcast comparing Cloud providers and who is focusing on sustainability, both in terms of utilization and also offsetting. You've got large enterprise data centers that are even further behind some of the hyperscalers, that maybe aren't scoring as highly against a Google or others. This gives them the ability not only to increase density and get a much smaller footprint in either their colo or in their data center, but also to get a lot of these power efficiency savings. We don't just want to make it easier for people to go build racks. We actually want to make it easy for folks to just snap new racks in and have usable capacity that is easy to manage.

Bryan Cantrill: Also, by crossing that hardware/software divide, we give people insight into their utilization and allow them to make the right level of purchase, even if that means buying less stuff from us next year, because we know that's going to be a lifelong relationship for us.

Corey Quinn: What is the story, if you can tell me this now, please feel free to tell me you can't, around hardware refreshes? One of the challenges, of course, is not only that this technology gets faster, better, cheaper, et cetera, but it also gets more energy efficient, which from a climate and sustainability perspective is incredibly important. What is the narrative here? Frankly, one of the reasons I like even renting things rather than buying them, in the form of a cell phone purchase plan, is because I don't have to worry about getting rid of the old explodey thing. What is the story with Oxide around upgrading and sustainability?

Steve Tuck: Well, first, I think you've got to take it from a couple of angles. On when one should replace infrastructure, this is something that hardware manufacturers do a very poor job of providing information for one to make that decision. What is the health of my current infrastructure? What is the capacity of my current infrastructure? Basics that should just come out of the box that help you make those decisions. Should I buy new equipment for three years, four years, five years? What do my warranty schedules look like? Being a former operator of thousands and thousands of machines, one of the things that was most frustrating was that the vendors I bought from seemed to treat those 8,000 machines as 8,000 separate instances, 8,000 separate pieces of hardware that I was doing something with, and no cohesive view into that.

Steve Tuck: So number one, make it easier for customers to look at the data and make those decisions. The other element of it is that your utilization is more environmentally impactful in many cases than the length of time or the efficiency of the box itself. How can I be smarter about how I am putting that infrastructure to work? If I'm only using 30% of my capacity but it's all plugged in, drawing power all the time, it's extraordinarily wasteful, and so I think there's a whole lot that can be gained in terms of efficiency of the infrastructure one has. And then yes, there's also the question of when is the right time to buy hardware based on improvements in the underlying components.

Bryan Cantrill: Yeah. I think the other thing that's happening, of course, is that Moore's law is slowing down. What is the actual lifetime of a CPU, for example? How long can a CPU run? We know this the hard way from having machines that were in production long after they should have been taken out of production for various reasons. We know that there's no uptick in mortality even after CPUs have been in production for a decade. At the end of Moore's law, should we be ripping and replacing a CPU every three years? Probably not. We can expect our DRAM densities to level. Obviously our CPU clock frequencies have already leveled, but our transistor densities in the CPUs are going to level.

So then, how do you have a surround that's designed to make that thing run longer? When it does need to be replaced, you want to be sure you're replacing the right thing, and you want to make sure you've got a modular architecture. OCP has got a terrific rack design that allows for individual sleds to be replaced. Certainly, we're going to optimize that as well. But we're also taking the opportunity to rethink the problem.

Corey Quinn: When people look at what you're building and come at it from a relatively naive, or shall we say, Cloud-first perspective, which, let's face it, I'm old enough to admit it, those are the same thing. What do you think-

Bryan Cantrill: Can I just note that "Cloud-first" sounds like a nationalist movement, though? That makes me feel kind of uncomfortable, honestly.

Corey Quinn: It really does feel somewhat nationalist, and then we talk about Cloud native and, oh, that goes nowhere good.

Bryan Cantrill: ... Oh.

Steve Tuck: Oh, God.

Corey Quinn: I'm not apologizing for that. But, what are people-

Steve Tuck: There are more Cloud nativists, is what I understand.

Corey Quinn: Exactly. What are people missing when they're coming from that perspective looking at what you're building?

Bryan Cantrill: I don't think we're going to try to talk people out of that perspective. That's fine. We're actually going to go to the folks that are already in pain, which Jess knows many.

Jessie Frazelle: Yes. Definitely a lot of people in pain. And also, we do understand the Cloud and the usability that comes with the interfaces there. I also think that we can innovate on them, but yeah, I think that we're not opposed.

Bryan Cantrill: Well, just like someone who's using Lambda doesn't necessarily need to be educated about the fact that, well, actually it's not serverless, there's something you're running on. Or someone who's running containers doesn't necessarily have to be educated about a hardware virtual machine and what that means. Someone who's running a hardware virtual machine doesn't necessarily need to be educated about what an actual bare metal machine looks like. We don't feel an obligation to force this upon people for whom this is not a good fit.

Corey Quinn: If they don't want to learn more about the exciting things that Oxide Computer is up to, now that you're post-stealth, where can they learn more about you?

Jessie Frazelle: Head on over to our website oxide.computer and also, we have our own podcast called On The Metal. It's tales from the hardware/software interface, so you can subscribe to that as well.

Steve Tuck: I've got to say, Corey, obviously we love your podcast, but our podcast is awesome. It is so good.

Jessie Frazelle: Don't start a competition right now.

Corey Quinn: ... No, it's not a competition.

Steve Tuck: Come on.

Corey Quinn: It's like we can both be-

Steve Tuck: We aspire to-

Corey Quinn: No, we can be... We're different. We're not in the... One is Screaming in the Cloud. The other is tales from the hardware/software interface. These are very different-

Steve Tuck: They're talking about two different domains.

Jessie Frazelle: And to be fair, we copied Corey's entire podcast setup.

Corey Quinn: Oh, it's absolutely fine. I've made terrible decisions in the hopes that other people will sabotage themselves by repeating them. Deal with it.

Bryan Cantrill: Well, one step ahead.

Steve Tuck: Too late.

Bryan Cantrill: I actually think, and I think we all think, that we made the podcast that we all kind of dreamed of listening to, which is folks who've done interesting work at the hardware/software interface describing some of their adventures. It's amazing. It's a good reminder that no matter how cloudy things are, the hardware/software interface still very much exists.

Steve Tuck: Yeah. Software still has to run on hardware.

Bryan Cantrill: Software still has to run on hardware.

Jessie Frazelle: Computers will always be there. Servers will always exist.

Bryan Cantrill: Corey, you surely must have done an emergency episode when HPE announced their Cloudless initiative. I assume that you had a spot episode on that. I think that would warrant it.

Corey Quinn: Oh, they shut that down so quickly that it wasn't even there for more than a day, which proves that cyberbullying works. It's abhorrent when you do it to an individual, but when you do it to a multibillion-dollar company like HP, it's justified and frankly everyone can feel good about the outcome. Hashtag Cloudless is now gone.

Bryan Cantrill: Yeah, and it did not last long. It feels like Microsoft Tay may have lasted even a bit longer. God, it was within... It's because it's stupid. It's stupid to talk about things that are Cloudless, and even Serverless, we have to be very careful about what that means, because we are still running on physical hardware. That's what that podcast is all about.

Corey Quinn: Well, even there you're defining things by what they're not, and Oxide is definitionally built on something that is no longer the absence of atmosphere. You're now about presence rather than absence. Good work.

Jessie Frazelle: ... Wow. Okay. That is the worst.

Steve Tuck: That was meta.

Bryan Cantrill: That was meta. We've got a lot of good reasons for naming the company Oxide. Oxides make very strong bonds. They're very... Silicon is normally found as its oxide. But that's very meta; I hadn't thought of that one.

Corey Quinn: Oh, yeah. Wait until people start mishearing it as oxhide, something you skin off a buffalo.

Jessie Frazelle: We'll have to cross that bridge when we come to it. But, thanks for planting the seeds.

Bryan Cantrill: Yeah. Why are we selling so poorly in the Great Plains states?

Corey Quinn: Sustainability. Thank you all so much for taking the time to speak with me today. I appreciate it.

Bryan Cantrill: ... Corey, thanks for having us.

Steve Tuck: Thanks, Corey. This was great.

Corey Quinn: Likewise. The team of Oxide Computer. I'm Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this episode, please leave it a five star review on iTunes. If you've hated it, please leave it a five star review on iTunes.

Announcer: This has been this week's episode of Screaming in the Cloud. You can also find more Corey at screaminginthecloud.com or wherever fine snark is sold.

This has been a HumblePod Production. Stay humble.
