Multi-Cloud in Sanity with Simen Svale Skogsrud

Episode Summary

Simen Svale Skogsrud, CTO & Co-Founder of Sanity.io, joins Corey on Screaming in the Cloud to discuss how Sanity.io is simplifying multi-cloud strategy. Simen reveals how they came up with the concept of a Content Lake, as well as the unique approach they’ve taken to architecting workflows for their clients. Simen and Corey discuss the phenomenon of old technology being made new (and hopefully improved) again, and how you can hate a technology while also appreciating the benefits it provides. Simen also describes the best practices Sanity has developed around architecting a multi-cloud approach.

Episode Show Notes & Transcript

About Simen

Ever since he started programming simple games on his 8-bit computer back in the day, Simen has been passionate about how software can deliver powerful experiences. Throughout his career he has been a sought-after creator and collaborator for companies seeking to push the envelope with their digital end-user experiences.

He co-founded Sanity because the state of the art content tools were consistently holding him, his team and his customers back in delivering on their vision. He is now serving as the CTO of Sanity.

Simen loves mountain biking and rock climbing with child-like passion and unwarranted enthusiasm. Over the years he has gotten remarkably good at going over the bars without taking serious damage.

Links Referenced:

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: This episode is brought to us by our friends at Pinecone. They believe that all anyone really wants is to be understood, and that includes your users. AI models combined with the Pinecone vector database let your applications understand and act on what your users want… without making them spell it out. Make your search application find results by meaning instead of just keywords, your personalization system make picks based on relevance instead of just tags, and your security applications match threats by resemblance instead of just regular expressions. Pinecone provides the cloud infrastructure that makes this easy, fast, and scalable. Thanks to my friends at Pinecone for sponsoring this episode. Visit Pinecone.io to understand more.

Corey: This episode is brought to you in part by our friends at Veeam. Do you care about backups? Of course you don’t. Nobody cares about backups. Stop lying to yourselves! You care about restores, usually right after you didn’t care enough about backups. If you’re tired of the vulnerabilities, costs, and slow recoveries when using snapshots to restore your data, assuming you even have them at all living in AWS-land, there is an alternative for you. Check out Veeam, that's V-E-E-A-M for secure, zero-fuss AWS backup that won’t leave you high and dry when it’s time to restore. Stop taking chances with your data. Talk to Veeam. My thanks to them for sponsoring this ridiculous podcast.

Corey: Welcome to Screaming in the Cloud. I’m Corey Quinn. Today’s guest is here to tell a story that I have been actively searching for, for years, and I have picked countless fights in pursuit of it. And until I met today’s guest, I was unconvinced that it actually exists. Simen Svale is the co-founder and CTO of a company called Sanity. Simen, thank you for joining me, what is Sanity? What do you folks do over there?

Simen: Thank you, Corey. Thank you. So, we used to be this creative agency that would, kind of, Black Hawk Down into a company and help them innovate; that would be our thing. And these were usually content-heavy projects, like media companies, corporate communications, these kinds of companies. We would come in and develop some ideas with them, and they would love those ideas, and then invariably we wouldn’t ever be able to do those ideas because we couldn’t change the workflows in their CMS, we couldn’t extend their content models, we couldn’t really do anything meaningful.

So, then we would end up setting up separate tools next to those content tools and they would invariably get lost and never be used after a while. So, we were like, we need to solve this problem, we need to solve it at the source. So, we decided we wanted a new kind of content platform. It would be a content platform consisting of two parts. There will be the, kind of, workspace where you create the content and do the workflows and all that, that will be like an open-source project that you can really customize and build the exact workspace that you need for your company.

And then on the other side, you would have this, kind of, content cloud; we call it the content lake. And the point with this is that very often you bring in several different sources: you have the content that you create specifically for a project, but very often you also have content from an ERP system, availability of products, time schedules. Let’s say you’re a real estate agent; you have data about your properties that comes from other systems. So, this is a system to bring all that together. And then another thing that really frustrated me was that content systems had content APIs, and content APIs are really particular, and specific, about a certain way of using content, whereas we thought content is just data.

It should be data, and the API should be a database query language. So, these are, kind of, the components of Sanity, it’s a very customizable workspace for working with content and running your content workflows. And it’s this content lake, which is this, kind of, cloud for your content.

Corey: The idea of a content lake is fascinating, on some level, where it goes beyond the data lake story, which I’ve always found to be a little on the weird side when cloud companies get up and talk about it. I remember this distinctly a few years ago at a re:Invent keynote: Andy Jassy, then the CEO of AWS, got up and talked about customers’ data lakes, and here are tools for using them. And I mentioned it to one of my clients, and they looked at me like I was a very small, very simple child and said, “Yeah, that would be great, genius, if we had a data lake, but we don’t.” It’s like, “You… you have many petabytes of data hanging out in S3. What do you think that is?” “Oh, that’s just the logs and the assets and stuff.” It’s… yeah.

Simen: [laugh].

Corey: So, it turns out that people don’t think about what they have in the same terms, and meeting customers with their terms is challenging. Do you find that people have an idea of what a content cloud or a content lake is before you talk to them about it?

Simen: I mean, that’s why it took us some time to come up with the term content lake. But our thinking was, the content lake is where you bring all your content to make it queryable and to make it deliverable. So, you should think, as long as I need to present this to end-users, I need to bring it into the content lake. And it’s kind of analogous to a data lake. Of course, if you can’t curate your data in the data lake, it isn’t a data lake, even if you have all the data there. You have to be able to analyze it and deliver it in the format you need.

So, it’s kind of an analogy for the same kind of thinking. And a crux of a content lake is that it gives you one single API that works for all of your content sources. It brings them all together under one umbrella, which is the key here: teams can then leverage that without learning new APIs and without ordering up new APIs from the other teams.

Corey: The story that really got me pointed in your direction is when a mutual friend of ours looked at me and said, “Oh, you haven’t talked to them yet?” Because it was in response to a story I’ve told repeatedly, at length, to anyone who will listen, and by that I include anyone unfortunate enough to share an elevator ride with me. I’ll talk to strangers about this; it doesn’t matter. And my argument has been for a long time that multi-cloud, in the sense of, “Oh yeah, we have this one workload and we can just seamlessly deploy it anywhere,” is something that is like cow tipping, as Ben Kehoe once put it, in that it doesn’t exist and you know it doesn’t exist because there are no videos of it happening on YouTube. There are no keynote stories where someone walks out on stage and says, “Oh, yeah, thanks to this company’s great product, I had my thing that I built entirely on AWS, and I can seamlessly flip a switch, and now it’s running on Google Cloud, and flip the switch again, and now it’s running on Azure.”

And the idea is compelling, but there are very rarely individual workloads that are built from the beginning to be able to run like that; it takes significant engineering work. And in practice, no one ever takes advantage of that optionality in most cases. It is vanishingly rare. And our mutual friend said, “Oh, yeah. You should talk to Simen. He’s done it.”

Simen: [laugh]. Yeah.

Corey: Okay, shenanigans on that, but why not? I’m game. So, let me be very direct. What the hell have you done?

Simen: [laugh]. So, we didn’t know it was hard until I saw his face when I told him. That helps, right? Like, ignorance is bliss. What we wanted was, we were blessed with getting very, very big enterprise customers very early in our startup journey, which is fantastic, but also very demanding.

And one thing we saw was that, either for compliance reasons or for, kind of, strategic partnership reasons, big, big companies wanted to be on specific service providers. And in a sense, we don’t care. Like, we don’t want to care. We want to support whatever makes sense. And we are very, let’s call it, principled architects, so actually the lower levels of Sanity don’t know they are part of Sanity; they don’t even know about customers.

Like, we already had the, kind of, separation of concerns that means the workload-specific systems of Sanity don’t know a lot about what they are doing. They are basically just processing content and CDN requests, and just doing that; no idea about billing or anything like that. So, when we saw the need for that, we thought, okay, that means we have what we call the color charts, which are, kind of, the light bulbs; we have hundreds and hundreds of them and we can just switch them off and the service still works. And then there’s the control plane, which is, kind of, the admin interface that the user uses to administer the resources. We wanted customers to just be able to say, “I want this workload, this content store, to run on Azure, and I want this one on Google Cloud.” I wanted that to feel the same way regions do. Like, you just choose that and we’ll migrate it to wherever you want it. And of course, charge you for that privilege.

Corey: Even that is hard to do because when companies say, “Oh, yeah, we have a multi-cloud strategy here; we have to have this thing on multiple clouds,” it’s, okay, as step one, if you’re on AWS—which is where this conversation usually takes place when I’m having it with people, given the nature of what I do for a living—great, first deploy it to a second AWS region and go active-active between those two. You should—theoretically—have full service and API compatibility between them, which removes a whole bunch of problems. Just go ahead and do that and show us how easy it is. And then, for step two, talk about other cloud providers. And spoiler, there’s never a step two because that stuff is way more difficult than people who have not done it give it credit for being.

How did you build your application in such a way that you aren’t taking individual dependencies on things that only exist in one particular cloud, either in terms of the technology itself or the behaviors? For example, load balancers come up with different inrush times, RDS instances provision databases at different speeds with different guarantees around certain areas across different cloud providers. At some point, it feels like you have to go back to the building blocks of just rolling everything yourself in containers and taking only internal dependencies. How do you square that circle?

Simen: Yeah, I think it’s a good point. Like, my biggest fear in terms of single cloud was just the leverage you give your cloud provider if you use too many of those super-specific services, the ones that only they run. So, our initial architecture was based on the idea that we would be able to migrate; not necessarily multi-cloud, just that if someone really ups the price or behaves terribly, we can say, “Oh, yeah. Then we’ll leave for another cloud provider.” So, we only use super generic services, like queue services and blob services; these are pretty generic across the providers.

And then we use generic databases like Postgres or Elastic, and we run them pretty generically. So, anyone who can provide, like, a Postgres-style API, we can run on that. We don’t use any exotic features. Let’s say picking boring technologies was the most important choice. And then this also goes into our business model because we are a highly integrated database provider.

Like, in one sense, Sanity is a content database with this weird go-to-market. Like, people think of us as a CMS, but it is actually the database we charge for. So also, we can’t use these very highly integrated services because that’s our margin. Like, we want that money, right [laugh]? So, we create that value and then we build it on very simple, very basic building blocks, if that makes sense.

So, when we wanted to move to a different cloud, for everything we needed access to, we could basically build a platform inside Azure that looks exactly like the one we built inside Google, as far as the applications are concerned.
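A rough sketch of the kind of provider-agnostic layer Simen is describing might look something like the following: application services code against a small internal interface, and each cloud’s blob service gets its own adapter behind it. The interface, the in-memory stand-in, and the provider names here are illustrative assumptions, not Sanity’s actual code.

```go
package storage

import (
	"bytes"
	"context"
	"fmt"
	"io"
	"sync"
)

// BlobStore is the only storage contract application services see; they
// never import a cloud SDK directly. Each provider gets a small adapter
// (GCS, Azure Blob, S3, ...) that satisfies this interface.
type BlobStore interface {
	Put(ctx context.Context, key string, r io.Reader) error
	Get(ctx context.Context, key string) (io.ReadCloser, error)
}

// memStore is an in-memory stand-in used here so the sketch compiles;
// a real deployment would select a cloud-backed adapter instead.
type memStore struct {
	mu   sync.Mutex
	data map[string][]byte
}

func (m *memStore) Put(ctx context.Context, key string, r io.Reader) error {
	b, err := io.ReadAll(r)
	if err != nil {
		return err
	}
	m.mu.Lock()
	defer m.mu.Unlock()
	m.data[key] = b
	return nil
}

func (m *memStore) Get(ctx context.Context, key string) (io.ReadCloser, error) {
	m.mu.Lock()
	defer m.mu.Unlock()
	b, ok := m.data[key]
	if !ok {
		return nil, fmt.Errorf("no such key %q", key)
	}
	return io.NopCloser(bytes.NewReader(b)), nil
}

// NewBlobStore picks an adapter from configuration, so the same service
// binary runs unchanged on whichever cloud hosts the workload.
func NewBlobStore(provider string) (BlobStore, error) {
	switch provider {
	case "memory":
		return &memStore{data: map[string][]byte{}}, nil
	// case "gcs", "azure", "s3": return the matching SDK-backed adapter here.
	default:
		return nil, fmt.Errorf("unknown blob provider %q", provider)
	}
}
```

The point of a layer like this is that moving from Google Cloud Storage to Azure Blob Storage becomes a matter of adding one adapter and flipping configuration; nothing above the interface has to change.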

Corey: There is something to be said for the approach of using boring technologies. Of course, there’s also the story of, “Yeah, I use boring technologies.” “Like what?” “Oh, like, Kubernetes,” is one of the things that people love to say. It’s like, “Oh, yes.”

My opinion on Kubernetes historically has not been great. Basically, I look at it as if you want to cosplay working at Google but can’t pass their technical screen, then Kubernetes is the answer for you. And that’s more than a little unfair. And starting early next year, I’m going to be running a production workload myself in Kubernetes, just so I can make fun of it with greater accuracy, honestly, but I’m going to learn things as I go. It is sort of the exact opposite of boring.

Even my early experiments with it so far have been, I guess we’ll call it unsettling as far as some of the non-deterministic behaviors that have emerged and the rest. How did you go about deciding to build on top of Kubernetes in your situation? Or was it one of those things that just sort of happened to you?

Simen: Well, we had been building microservice-based products for a long time internal to our agency, so we kind of knew about all the pains of coordinating, orchestrating, scaling those—

Corey: “We want to go with microservices because we’re tired of being able to find the problem. We want this to be much more of an exciting murder mystery when something goes down.”

Simen: Oh, I’ve heard that. But I think if you carve up the services the right way, every service becomes simple. It’s just so much easier to develop, to reason about. And I’ve been involved in so many monoliths before that, where every refactor is like guts on the table, a month-long kind of ordeal, super high risk. With microservices, everything becomes a simple, manageable affair.

And you can basically rebuild your whole stack service by service. It’s a realistic thing, because all of them are pretty simple. But it’s kind of complicated when they are all running inside instances; there’s crosstalk with configuration, like, you change a library and everything kind of breaks. So, Docker was obvious.

Like, Docker, that kind of isolation, being able to have different images but share the machine resources, was amazing. And then, of course, Kubernetes being about orchestrating that made a lot of sense. But that was also compatible with a few things that we had already discovered. Because workloads in Kubernetes need to be incredibly boring. We talked about boring stuff; for example, in the beginning we had services that start up, do some kind of sanity check, validate their environment, and then go into action.

That in itself breaks the whole experience because what you want a Kubernetes-based service to do is basically just do one thing all the time in the same way, use the same amount of memory, the same amount of resources, and just do that one thing at that rate, always. So, we broke those things apart; even the same service runs in different containers depending on their state. Like, this is the state for doing the sanity check, this is the state for [unintelligible 00:13:05], this is the state for doing mutations. Same service. So, there are ways around that.
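As a loose illustration of that one-boring-thing-per-container idea, the same binary can be shipped in every container, with an environment variable selecting exactly one steady, single-purpose loop per Kubernetes Deployment. The role names and loop bodies below are hypothetical, not Sanity’s service code.

```go
package main

import (
	"log"
	"os"
	"time"
)

// The same image is deployed several times; each Deployment sets ROLE so a
// given pod does exactly one predictable thing at a steady rate, rather than
// one process that checks, indexes, and mutates in phases with very
// different resource profiles.
func main() {
	switch role := os.Getenv("ROLE"); role {
	case "check":
		loop("running consistency checks")
	case "index":
		loop("consuming the change feed and updating indexes")
	case "mutate":
		loop("serving write requests")
	default:
		log.Fatalf("unknown ROLE %q", role)
	}
}

// loop stands in for the real work; each role would have its own steady,
// single-purpose loop with a flat memory and CPU profile.
func loop(what string) {
	for {
		log.Println(what)
		time.Sleep(10 * time.Second)
	}
}
```

Each Deployment sets a different ROLE, so every pod keeps a flat, predictable resource footprint that the scheduler and autoscaler can reason about.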

I absolutely adore the whole thing. Like, I haven’t heard about those pains we used to have in the past ever again. But also, it wasn’t an easy choice for me, because my single SRE at the time said it was either Kubernetes or he’d quit. So, it was a very simple decision.

Corey: Exactly. Resume-driven development is very much a thing. I’m not one to turn up my nose at that; that’s functionally what I’ve done my entire career. How long had your product been running in an environment like that before, “Well, we’re going multi-cloud,” was on the table?

Simen: So, that would be three-and-a-half years, I think, yeah. And then we started building it out in Azure.

Corey: That’s a sizable period of time in the context of trying to understand how something works. If I built something two months ago and now I have to pick it up and move it somewhere else, that is generally a much easier task, as far as migrations go, than if the thing has been sitting there for ten years. Because whenever you leave something in an environment like that, it tends to grow roots and take a number of dependencies, both explicit and implicit, on the environment in which it runs. Like, in the early days of AWS, you sort of knew that local disks on the instances were ephemeral because in the early days, that was the only option you had. So, every application had to be written in such a way that it did not presume that there was going to be local disk persistence forever.

Docker containers take that a significant step further: when that container is gone, it’s gone. There is no persistent disk there without some extra steps. And in the early days of Docker, that wasn’t really a thing either. Did you discover that you’d taken a bunch of implicit dependencies like that on the original cloud that you were building on?

Simen: I’m an old-school developer. I go all the way back to C. And in C, you need to be incredibly, incredibly careful with your dependencies because your whole dependency mapping is happening inside your mind. The language doesn’t help you at all. So, I’m always thinking about my project as, kind of, layers of abstraction.

If someone talks to Postgres during a request that is supposed to be handled from the index, then I’m [laugh] pretty angry. Like, that breaks the whole point. The whole point is that this service doesn’t need to know about Postgres. So, we have been pretty hardcore on not having any crosstalk; we had a clear idea of which services were allowed to talk to which services. And we were using JWT tokens internally to make sure that authentication and rights management were handled at the ingress point and just passed along with requests.

So, no one was able to talk to user stores or authentication services; that all happens at the ingress. So, in essence, it was a very pure, kind of, layered platform already, and then, like I said, built on super boring technologies. So, it wasn’t really a dramatic thing.
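A minimal sketch of that ingress pattern, assuming a plain Go HTTP stack: the edge validates the JWT once and forwards only a trusted identity header, so internal services never talk to the user store or auth service themselves. The header name and the stubbed verifier here are illustrative assumptions.

```go
package main

import (
	"errors"
	"log"
	"net/http"
	"strings"
)

// verifyJWT stands in for real signature and expiry verification (for
// example with a JWT library); it returns the authenticated user ID.
func verifyJWT(token string) (string, error) {
	if token == "" {
		return "", errors.New("missing token")
	}
	// Real code would verify the signature against the issuer's keys here.
	return "user-123", nil
}

// ingressAuth runs only at the edge. It validates the bearer token and
// replaces it with a trusted internal header before passing the request on.
func ingressAuth(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		token := strings.TrimPrefix(r.Header.Get("Authorization"), "Bearer ")
		userID, err := verifyJWT(token)
		if err != nil {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		r.Header.Set("X-Authenticated-User", userID) // what internal services trust
		next.ServeHTTP(w, r)
	})
}

// An internal service never talks to the user store; it just reads the
// identity the ingress already established.
func contentHandler(w http.ResponseWriter, r *http.Request) {
	user := r.Header.Get("X-Authenticated-User")
	w.Write([]byte("serving content for " + user))
}

func main() {
	http.Handle("/content", ingressAuth(http.HandlerFunc(contentHandler)))
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```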

The drama was more that we didn’t, maybe, like [laugh] these sorts of cloud services that much. But as you grow older in this industry, you kind of realize that you just hate the technologies differently. And some of the time, you hate one a little bit less than the others. And that’s just how it goes. That’s fine. So, that was the pain. We didn’t have a lot of pain with our own platform, because of these things.

Corey: It’s so nice watching people who have been around the ecosystem long enough to have made all the classic mistakes and realized, oh, that’s why common wisdom is what common wisdom is: because, generally speaking, that shit works, and you learn it yourself from first principles when you decide—poorly, in most cases—to go and reimplement things. Like, oh, DNS goes down a lot, so we’re just going to rsync around an /etc/hosts file on all of our Linux servers. Yeah, we tried that collectively back in the ’70s. It didn’t work so well then, either. But every once in a while, some startup founder feels the need to speed-run learning those exact same lessons.

What I’m picking up from you is a distinct lack of the traditional startup founder vibe of, “Oh well, the reason that most people don’t do things this way is because most people are idiots. I’m smarter than they are. I know best.” I’m getting the exact opposite of that from you, where you seem to want to stick to things that are tried and true and, as you said earlier, not exciting.

Simen: Yeah, at least for these kinds of [unintelligible 00:17:15]. Like, we had a similar platform for our customers that we, kind of, used internally before we created Sanity, and when we decided to basically redo the whole thing, but as a self-serve thing and make a product, I went around the developer team and just asked them, “In your experience, what systems that we use are you not thinking about, or not having any problems with?” And, like, just make a list of those. And there was a short list of pretty well-known things. And some of them have turned out, at the scale we’re running now, to be pretty problematic still.

So, it’s not like it’s all roses. We picked Elasticsearch for some things, and it can be pretty painful. I’m in the market for a better indexing service, for example. And then, let’s talk about some mistakes. I’m still totally on the microservices train, and if you make sure you design your workloads clearly and have a clear idea about the abstractions and who gets to talk to whom, it works.

But then you can make a wrong split—we had a split between a billing service and a, kind of, user and resource management service that now keep talking back and forth all the time. Like, they have to know about each other. And the thing is, if two services need to know about each other reciprocally, then you’re in trouble; then those should be the same service, in my opinion. Or you can split it some other way. So, this is stuff that we’ve been struggling with.

But you’re right. My last, kind of, rah-rah thing was Rails and Ruby, and then when I weaned off of that, I was like, these technologies work for me. For example, I use Golang a lot. It’s a very ugly language. It’s very, very useful. You can’t argue against the productivity you have in Go, but also the syntax is kind of ugly. And then I realized, like, yeah, I kind of hate everything now, but also, I love the productivity of this.

Corey: This episode is sponsored in part by our friends at Uptycs, because they believe that many of you are looking to bolster your security posture with CNAPP and XDR solutions. They offer both cloud and endpoint security in a single UI and data model. Listeners can get Uptycs for up to 1,000 assets through the end of 2023 (that is next year) for $1. But this offer is only available for a limited time on UptycsSecretMenu.com. That’s U-P-T-Y-C-S Secret Menu dot com.

Corey: There’s something to be said for having been in the industry long enough to watch today’s exciting new thing become tomorrow’s legacy garbage that you’ve got to maintain and support. And I think after a few cycles of that, you wind up becoming almost cynical and burned out on a lot of the things that arise that leave everyone breathless. I am generally one of the last adopters of something. I was very slow to get on virtualization. I was a doomsayer on cloud itself for many years.

I turned my nose up at Docker. I mostly skipped the whole Kubernetes thing and decided to be early to serverless, which does not seem to be taking off the way that I wanted it to, so great. It’s one of those areas where, having been on the operations side particularly, having to run things and fix them at two in the morning when they inevitably break, when some cron job fires off in the middle of the night because no one will be around then to bother: yeah, great plan. It really, at least in my case, makes me cynical and tired, to the point where I got out of running things in anger.

You seem to have gone a different direction, where, oh, you’re still going to build and run things; you’re just going to do it in ways that are a lot more well understood. I think there’s a lot of value to that, and I don’t think that we give enough credit as an industry to people making those decisions.

Simen: You know, I was big into drum and bass back in the ’90s. I just loved that thing. And then it went away, and then something came along called dubstep. It’s the same thing. And it’s just better. It’s a better drum and bass.

Corey: Oh yeah, the part where it goes doof, doof, doof, doof, doof, doof, doof—

Simen: [laugh]. Exactly.

Corey: Has always been—it’s yeah, we call it different things, but the doof, doof, doof, doof, doof music is always there. Yeah.

Simen: Yeah, yeah, yeah. And I think the thing to recognize, you could either be cynical and say, like, you kids, you’re just making the same music we did like 20 years ago, or you can recognize that actually it—

Corey: Kids love that, being told that. It’s their favorite thing, telling them, “Oh yeah, back when I was your age…” that’s how you—that’s a signifier of a story that they’re going to be riveted to and be really interested in hearing.

Simen: [laugh]. Exactly. And I don’t think like that, because I think you need to recognize that this thing came back, and it came back better and stronger. And I think Mark Twain probably didn’t say that history doesn’t repeat itself, it rhymes. And this is a similar thing.

Right now I have to contend with the fact that server-side rendering is coming back as a completely new thing, which was, like, the thing, always, but it also comes back with new abstractions and new ways of thinking about it, and it comes back better, with better tooling. And I think the one thing you can take away from that kind of journey is that you can be stronger by not being excited by shiny new things and not being, kind of, a champion for one specific thing over every other thing. You can just see the utility of it. And then when these things come back and pretend to be new, you can see the, kind of, tradition of it and maybe see it more clearly than most people, but also, like you said, don’t bore the kids, because you should also see how it is new, how it is solving new things, and how these kids, coming back with the same old thing as a new thing, saw it differently, framed it slightly differently, and we are better for it.

Corey: There’s so much in this industry that we take from others. We all stand on the shoulders of giants, and I think that is something that is part of what makes this industry so fantastic in different ways. Some of the original computer scientists who built some of the things that everyone takes for granted these days are still alive. It’s not like the world of physics, for example, where some of the greats wound up discovering these things hundreds of years ago. No, it’s all evolved within living memory.

That means that we can talk to people, we can humanize them, on some level. It’s not some lofty great sitting around and who knows what they would have wanted or how they would have intended this. Now, you have people who helped build the TCP stack stand up and say, “Oh yeah, that was a dumb. We did a dumb. We should not have done it that way.” Oh, great.

It’s a constant humbling experience watching people evolve things. You mentioned that Go was a really neat language. Back when I wound up failing out of school, before I did that, I took a few classes in C and it was challenging and obnoxious. About like you would expect. And at the beginning of this year, I did a deep-dive into learning Go over the course of a couple of days, enough to build a binary that winds up controlling my internet camera in my home office.

And I learned an awful lot about how to do things, and got a lot of things wrong, and it was a really fun language. It was harder to do a lot of the ill-considered things that get people into trouble with C.

Simen: Hmm.

Corey: The idea that people are getting nice things in a way that we didn’t have them back when we were building things the first time around is great. If you’re listening to this, it is imperative—listen to me—it is imperative. Do not email me about Rust. I don’t want to hear it.

Simen: [laugh].

Corey: But I love the fact that our tools are now stuff that we can use in sensible ways. These days, as you look at using sensible tools—and in this iteration, I will absolutely say that using a hyperscale public cloud provider is the right move; that’s the way to go—given that you started off hanging out on Google Cloud and now you’re running workloads everywhere, do you have an affinity for one as your primary cloud, or does everything you’ve built wind up seamlessly flowing back and forth?

Simen: So, of course, we have a management interface that our end-users, kind of, use to monitor, and it has to be—at least has to have a home somewhere, even though the data can be replicated everywhere. So, that’s in Google Cloud because that’s where we started. And also, I think GCP is what our team likes the most. They think it’s the most solid platform.

Corey: Its developer experience is far and away the best of all the major cloud providers. Bar none. I’ve been saying that for a while. When I first started using it, I thought I was going to just be making fun of it, but this is actually really good was my initial impression, and that impression has never faded.

Simen: Yeah. No, it’s terrible as well, but it’s the least terrible platform of them all. But I think we would not make any decisions based on that. As long as it’s solid, as long as it’s stable, and as long as the price is reasonable and the business practices are, kind of, sound, we would work with any provider. And hopefully, we would also work with less… let’s call it less famous, more niche providers in the future; for, let’s say, specific organizations that need very, very specific policies or practices, we would be happy to support that. I want to go there in the future. And that might require some exotic integrations and ways of building things.

Corey: A multi-cloud story that I used to tell—in the broader sense—used PagerDuty as an example because that is the service that does one thing really well, and that is wake you up when something sends the right kind of alert. And they have multiple cloud providers historically that they use. And the story that came out of it was, yeah, as I did some more digging into what they’ve done and how they talked about this, it’s clear that the thing that wakes you up in the middle of the night absolutely has to work across a whole bunch of different providers because if it’s on one, what happens when that’s the one that goes down? We learned that when AWS took an outage in 2011 or 2012, and PagerDuty went down as a result of that. So, the thing that wakes you up absolutely lives in a bunch of different places on a bunch of different providers.

But their marketing site doesn’t have to. Their user control panel doesn’t have to. If there’s an outage in their primary cloud that is sufficiently gruesome, okay, they can have a degraded mode where you’re not able to update and set up new alerts and add new users to your account, because everything’s on fire in those moments anyway; that’s an acceptable trade-off. But the thing that wakes you up absolutely must work all the time. So, it’s the idea that this workload has got to live in a bunch of places, but not every workload looks like that.

As you look across the various services and things you have built that comprise a company, do you find that you’re biasing for running most things in a single provider or do you take that default everywhere approach?

Simen: No, I think that, to us—and that’s work we haven’t done yet—architecturally, it will work fine. Because as long as we serve queries: like, there are components where people write stuff, they create new content, and that needs to be up as much as possible. But of course, when that goes down, if we still serve queries, their properties are still up, right? Their websites or whatever are still serving content.

So, if we were to make things cross-cloud redundant, it would be the CDN—like, the indexes and the Varnish caches—and have those [unintelligible 00:27:23]. But it is a challenge in terms of how you do routing. Let’s say the routing provider is down; how do you deal with that? Like, there have been a number of DNS outages, and I would love to figure out how to get around that. Right now, people would have to manually, kind of, change their—we have backup ingress points with the—yeah, that’s a challenge.

Corey: One of the areas where people get into trouble with multi-cloud as well, I’ve found, is that people do it with the idea of getting rid of single points of failure, which makes a lot of sense. But in practice, what so many of them have done is inadvertently added multiple points of failure, all of which are single-tracked. So okay, now we’re across two cloud providers, so we get exposure to everyone’s outages, is how that winds up looking. I’ve seen companies that have been intentionally avoiding AWS because, great, when it goes down and the internet breaks, we still want our store to be up. Great, but they take a dependency on Stripe, who is primarily on AWS, so depending on the outage, people may very well not be able to check out of their store. So, what did they gain by going to another provider? Because now when that provider goes down, their site is down then too.

Simen: Mmm. Yeah. It’s interesting that anything works at all, actually, seeing how intertwined everything is. But that is, to me, the amazing part: like you said, someone’s marketing site doesn’t have to be moved to the cloud, or maybe some of it does. And I find it interesting that, in the serverless space, even though we have super advanced engineers and we do complex orchestration over cloud services, we don’t run anything else ourselves, right?

Like, all of our, kind of, web properties are run highly integrated, basically on Vercel, mostly, right? Like, we don’t even know which cloud that’s running on, right? And I think that’s how it should be, because most things, like you said, most things are best outsourced to another company, and have them worry—have them worry when things are going down. And that’s how I feel about these things: yes, you cannot be totally protected, but at least you can outsource some of that worry to someone who really knows it. Like, if Stripe goes down, most people don’t have the resources to worry at the level that Stripe would worry, right? So, at least you have that.

Corey: Exactly. Yeah, if you ignore the underlying cloud provider stuff, they do a lot of things I don’t want to have to become an expert in. Effectively, you wind up getting your payment boundary through them; you don’t have to worry about PCI yourself at all; you can hand it off to them. That’s value.

Simen: Exactly. Yeah.

Corey: Like, the infrastructure stuff is just table stakes compared to a lot of the higher up the stack value that companies in that position enjoy. Yeah, I’m not sitting here saying don’t use Stripe. I want to be very clear on that.

Simen: No, no, no. No, I got you. I got you. I just remember—so, we talked about maybe you hailing all the way back to Seattle; so, hail all the way back to having your own servers in a, kind of, place somewhere that you had to drive to, to replace a security card when the hard drive was down. Or, like, oh, you had to scale up, and now you have to buy five servers, you have to set them up and drive them over and put them into the slots.

Like, yes, you can fix any problem yourself. Perfect. But also, you had to fix every problem yourself. I’m so happy to be able to pay Google or AWS or Azure to have that worry for me, to have that kind of redundancy on hand. And clearly, we are down less time now that we have less control [laugh] if that makes sense.

Corey: I really want to thank you for being so generous with your time. If people want to learn more, where’s the best place for them to find you?

Simen: So, I’m @svale—at Svale—on Twitter, and my DMs are open. And we also have a Slack community for Sanity, so if you want to engage with Sanity, you can join our Slack community, and I will be on there as well. You’ll find it in the footer on all of the sanity.io webpages.

Corey: And we will put links to that in the show notes.

Simen: Perfect.

Corey: Thank you so much for being so generous with your time. I really appreciate it.

Simen: Thank you. This was fun.

Corey: Simen Svale, CTO and co-founder at Sanity. I’m Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you’ve enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you’ve hated this podcast, please leave a five-star review on your podcast platform of choice along with an insulting comment, and make sure you put that insulting comment on all of the different podcast platforms that are out there because you have to run everything on every cloud provider.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production. Stay humble.