Networks and Sustainability in Computing with George Porter

Episode Summary

George Porter, a computer science professor at the University of California, San Diego, talks to us about advanced networking and the effects of computing on the environment in this episode of Screaming in the Cloud. George explores the shift toward optical networking in data centers to meet growing bandwidth needs, and discusses the significant carbon footprint associated with computing, from data centers to device production. In addition to providing a look into the future of scalable, sustainable computing systems, George discusses the difficulties and benefits of incorporating cloud computing into academic research.

Episode Video

Episode Show Notes & Transcript

Show Highlights:

(00:00) - Introduction 
(03:15) - The Shift to Optical Networking
(07:50) - The Efficiency of Cloud Networks
(12:06) - Adaptable Networks for Different Uses
(16:19) - Reducing Computing's Carbon Footprint
(20:25) - Highlighting Computing's Environmental Impact Through Art
(26:51) - Cloud Computing Challenges in Academia
(31:18) - The Benefits of Cloud Computing for Academic Research
(34:14) - Closing Thoughts

About George:
A Computer Science Professor at UC San Diego focusing on high-performance and sustainable computer systems

Links:

Transcript

George Porter: In order to build enough switching capacity, you start having to build these really complicated topologies in the data center network, different switches interconnected in different ways. And that drives up costs, and it drives up power, and it becomes a barrier to kind of deploying stuff.

Corey Quinn: Welcome to Screaming in the Cloud. I'm Corey Quinn. My guest today is a little bit off the beaten track for the people I normally wind up speaking to, who are usually doing interesting things, or at least things, in the world of industry. George Porter, instead, took his life in a little bit of a different direction.

He's a professor at the University of California, San Diego, in the Computer Science Department. George, thank you for joining me.

George Porter: Hi, Corey. Thank you for having me on. It's a pleasure. We've, uh, talked on Twitter, so it's nice to talk in person.

Corey Quinn: This is honestly one of the first conversations I'm aware of having with a computer science professor where it wasn't a very different dynamic than when I was in the process of failing out of college 20-some-odd years ago.

I'm surprised. Like, wow, I can't see the chip on your shoulder from here, nor the very thinly disguised drinking problem, as you start trying to shove algorithms down the throat of someone who isn't really equipped to handle it.

George Porter: Oh, well, I'm just a networking professor, so I don't have to worry about algorithms.

We can just write some code and try some stuff out and see if packets get through.

Corey Quinn: That seems like as good a place to start as any, because back in my day, which is a terrifying turn of phrase, but by God, it's become true, networking was always perceived as being a very vocational area. Oh, academia working with networking?

Nonsense. My first job was as a network engineer at Chapman University, without having a degree myself. I was viewed in many ways as akin to various other people on the facilities staff: just make the packets go back and forth, and that was the end of it. But now you're telling me it's become a full-on academic discipline.

George Porter: It has been, I'm afraid to say. All the fun's been drained out, and now we're being very rigorous and creating theories and models and things like that. No, I kid, but I actually started very similarly. My first job, in high school, was at an internet provider in Houston called NeoSoft. This was the mid-90s.

And like you said, there was none of this at-scale cloud, public cloud, private cloud. There was basically just, you know, hey, we finally got a connection to a new website from Coca-Cola. They're on the web now. It was brand new. But the reality today is that for our students who are graduating, pretty much regardless of what they're interested in doing, they need some ability to connect the software they're writing to other nodes, to the cloud, to download things, to update things, to push software updates.

It's just, you know, networking is so important.

Corey Quinn: I've been cited somewhat recently as pointing out that, you know, without the network, all this cloud computing is basically a series of very expensive space heaters. If they can't talk to anything, there's not a lot of value behind it. You've been talking a lot lately, academically, about the idea of optical networking, which on the one hand struck me as, so what?

What's the big deal on that? We've had fiber runs that people have been tripping over and breaking in data centers since longer than I've been involved in that. What's changed in the space?

George Porter: Oh, it's actually very interesting. So, like you said, in a traditional data center, you're going to find fiber all over the place, running to all the racks, running to rows, running to telecom rooms, etc.

What's really changed over the last 15 years or so has been the introduction of optics into the actual switching process. You might think of using fiber to interconnect, let's say, a Broadcom switch to a Cisco switch, or to connect it to a Mellanox NIC, or I guess now an NVIDIA NIC. The fiber is used to carry data from one place to another, but once it actually gets to that switch, to that router, to that end device, it's converted back into electronics, where traditional 1990s-style networking can happen.

We can look at the MAC address, we can look at the IP address, and do something with it. That has become a real bottleneck and a real barrier to deployment when you're building a network at cloud scale that can support thousands of machines, GPUs, TPUs, solid state storage devices, et cetera.

The bandwidth requirements are growing so quickly that actually getting data into and out of those packet switch chips has become a big problem. And so my group at UCSD, and other academic groups dating back about 10 or 15 years, have started looking into how we can actually replace those switch chips, those switch devices, with fully optical devices.

And that was very sci-fi, very far-into-the-future kind of research. And what's been interesting over the last, really just since the pandemic, even the last year or two, has been to see hyperscalers talking about how they've successfully deployed this technology, Google most particularly.

Corey Quinn: I was always stuck in a mindset where fiber is expensive, first off.

Secondly, anyone who has two neurons to bang together can basically crimp down an Ethernet patch cable, as evidenced by the fact that they trusted me to do that once upon a time. Cable testers are worth their weight in gold, but grinding fiber is a very specific, expensive skill set. So I always viewed it as what you do for longer runs, where you start to exceed the spec for whatever generation of Cat cable you're using.

When I started, it was Cat5; now I think it's 7 or 8, who knows. But the speeds continue to increase on the Ethernet side of it. What is the driver behind this? Is it purely bandwidth?

George Porter: It is, in a lot of ways, bandwidth. So, like you mentioned, copper is really convenient, and it's really durable. I've run over some CAT5 with a vacuum cleaner, and it works fine.

So, it's not a big problem. The big issue is that as you go to faster and faster bandwidths, there is a property of copper cables called the skin effect that means you need to put more and more energy into that cable in order to drive those high bandwidths. So when you went from, like, 100 megabit to 1 gigabit, we could stay on Cat5.

With Cat5e, we can go up to maybe 10 gigabits a second, but you start running into these technical limitations of driving that copper. And so when you want to start looking at 100, 200, 400 gigabits a second, terabit Ethernet, all the way to the desktop, all the way to the server, to the device, you really have to go optics. You have to go fiber.
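
A quick aside on the skin effect George mentions: at high frequencies, current crowds into a thin shell at the surface of a copper conductor, so the effective resistance, and with it the energy needed to drive the link, climbs with frequency. The standard skin-depth relation:

```latex
\delta = \sqrt{\frac{2\rho}{\omega\mu}} = \sqrt{\frac{\rho}{\pi f \mu}}
```

Here \rho is the conductor's resistivity, \mu its permeability, and f the signal frequency. Because AC resistance grows roughly as 1/\delta \propto \sqrt{f}, each jump from 1 to 10 to 100 gigabits per second demands disproportionately more transmit energy from a copper link, which is exactly the wall being described.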

Corey Quinn: I remember back in the early days of my own career, you'd look at things like OC-192s, which were basically the backbone of the internet, where they could handle speeds not quite what you're talking about, but darn close, over huge, huge spans. Those were always giant bundles of fiber that were doing those things.

I have no idea what the actual termination of those things looked like, because I assure you, you did not let the relatively wet-behind-the-ears kid who's accident-prone near stuff that wound up costing more, in some cases, than the building it was housed in.

George Porter: Oh, absolutely. And for wide-area applications, where you're running from San Francisco to Los Angeles, or L.A. to San Diego, or something like that, that fiber is extremely expensive, because you've got all the right-of-way and the trenches and all this kind of stuff. When you go into the data center, though, you can pull bundles of fiber. It's super cheap. It's actually quite inexpensive. And then really the endpoints become kind of expensive.

And so the thing that we were addressing, one of the problems we were looking at, was just the fact that as we start driving bandwidth up to the devices, again, in order to build enough switching capacity, you start having to build these really complicated topologies in the data center network.

Different switches interconnected in different ways, and that drives up cost, and it drives up power, and it becomes a barrier to kind of deploying stuff.

Corey Quinn: One of the things that I think is underappreciated about the cloud, and I've been whining about this for ages, is that you can turn a dial, effectively, on anything in AWS.

If you want more RAM, great, turn the dial. You'll spend more money. You'll get more RAM. Oh, that's too expensive. Let me turn it down. Okay, and what happens is what you would expect. Same with disk space, same with CPUs, same with almost everything except the network. Everything on the network side of any of the cloud providers is: you get one tier that is uniformly excellent.

There's no more-cost-effective egress tier where you say, okay, I'd like it to be over there by, I don't know, September. September sounds good. Instead, you're consistently getting that high-cost, high-performance network. And I want to be clear, because this gets overlooked: what they do from a network perspective is just this side of magic.

Because back when I worked in data centers, you had top-of-rack switch congestion. I had a client at one point who built out a private cloud and didn't think about that. And suddenly, you have this massive number of servers in every rack that are supposed to talk everywhere seamlessly. And they were bandwidth-constrained all over the place as the top-of-rack switches started melting and dripping down the rest of the rack.

I don't see any evidence of that in the modern cloud providers. You can get effectively line rate between any two points indefinitely, and that's magic.

George Porter: Yeah, I think that magic is delivered by the fact that a lot of the networking folks at these hyperscalers are pulling their hair out to make that abstraction look like it's real.

It's not, but they're able to make it look like it's real. Just like you said, you can wheel things in; essentially, your ability to scale out in the data center is limited by how big your loading dock is, because you can just start unloading servers and devices and RAM and storage as fast as you want. But like you mentioned, it's the network where everything has to come together, and that has traditionally been something that is difficult to upgrade. Either you upgrade your network first, and it's very expensive up front and you're not going to be able to saturate it, or, alternatively, you upgrade your devices and now your network's the problem. And so trying to figure out how to keep those in parity is a huge challenge that a lot of these operators have.

Corey Quinn: I have to say, there's always an extreme sense of talking to wizards whenever I talk to some of these cloud provider network folks.

AWS's Colm MacCárthaigh is top of mind for a lot of that stuff, and he's always been incredibly kind and incredibly clear in communicating. And you're right, in discussing the fact that none of this stuff actually exists, which I know intellectually, it almost feels like there's a sense, whenever I talk to some of these folks, of, okay, time to go back into the land of make-believe, where we are telling stories to children about how things work within the logical structures and rules of physics bounded here.

And it's got to be weird for folks who see both sides of the snake to figure out what side they're having a given conversation on.

George Porter: Absolutely. And in fact, you know, if you look at AWS, they've built some of their own hardware to do networking. You look at Google, and they have been innovating on building as well.

They have a network called Jupiter that involves a lot of custom chips, custom devices, and things like that. And instead of thinking of this data center as something that has maybe 100,000 or 200,000 machines in it, with a perfect network where you can deploy software anywhere you want and migrate anything you want, I think what we're starting to see is a model where we actually reconfigure the structure of the network for a particular application. And then when it's time to run another application, we can actually change the way the network's built. So we're not trying to build one network for every application.

We're trying to adapt it to the needs of the application.

Corey Quinn: That feels like something you would do in the context of an HPC or supercomputing project, where you have very specific data access patterns that need to happen incredibly quickly at a certain point of scale, where for the next three months, this is what it needs to do.

But that was always done from a perspective of: time to release the hounds, and you would have the network monkeys go and do the reconfiguration. It sounds like you're talking about something that's a lot more dynamic.

George Porter: Absolutely. What's interesting is that, in the research world, we had supercomputing dating back to before you and I were born, and you saw a divergence from that in the early 2000s as public cloud providers, you know, Google, Facebook, etc., well, eventually Facebook,

had needs that were just very different from supercomputing, because they were running lots of applications rather than one application. And what's interesting is, you saw things split into these two different sectors, and now you're starting to see them come back together again, especially with machine learning, AI training, etc.

It's worth it for some of these providers to set up a custom network for 6, 8, 10, 12, 24 hours, and then change it for another training job, or another one of these big, huge, long-running tasks.

Corey Quinn: It was one of those areas where it just feels like dark magic. And you're right, because whenever you talk to academics about large computing projects, it feels like I'm suddenly talking to people from a very different world.

You're right: when I'm talking about massive corporate fleets in industry, yeah, they're all running a whole bunch of different applications, some of which are never allowed to talk to each other because internal NDAs apply or whatnot. But in academic clusters, it's, yeah, this is the project that's going to be running for the foreseeable future, because we got a grant, etc., etc.

And I do the economics on that, and it's a completely different world. I keep looking for people who can say something like, yeah, HPC on public cloud makes perfect sense for high-utilization, steady-state workloads. I just have a hard time making that economic case, because at that point of scale, the on-prem deployment pays for itself in an embarrassingly short period of time.

George Porter: Yeah, this is the interesting thing about whether it makes sense to run on-prem or in a public cloud: I think the organization matters. If you're an academic, you might get a grant to look at a particular problem, and you're not going to be able to keep an on-prem deployment busy for three years in a row,

Corey Quinn: let alone build the on prem deployment on that grant.

Like, okay, put a few zeros on the end of that dollar figure, please.

George Porter: Yeah. And, you know, as much as we'd like to believe equipment can manage itself, it doesn't. You need experts, people on staff, who can manage that, and it becomes quite challenging. In these public cloud environments, one of the things we were just talking about a second ago is that you're seeing these really long jobs for AI, ML, ChatGPT, I'm sure.

And in these particular cases, you saw an evolution where, in the mid-2010s, companies like Google had these optical patch panels where a human could go and put little fiber jumpers around and actually change some of the structure of their network. In other words, think about bringing all those fibers together, but rather than connecting them directly to switches or routers,

you essentially put them into a Lite-Brite set, where you've got a bunch of little things on the back that they're all plugged into, and you can plug stuff into the front. And now you're seeing the evolution of that with these optical switches, where you can do that programmatically. You can actually write code that will change the configuration for the next six hours, let's say.

And so that's kind of something that's quite interesting, I think.
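
A minimal sketch of what programmatic reconfiguration like that could look like. Everything here is hypothetical; the OpticalSwitch class and its API are invented for illustration, not drawn from any real product:

```python
# Hypothetical sketch of programmatically reconfiguring an optical
# circuit switch for a specific job, in the spirit of the
# patch-panel-to-software evolution George describes.

from dataclasses import dataclass

@dataclass(frozen=True)
class CrossConnect:
    ingress_port: int  # fiber arriving from one rack or pod
    egress_port: int   # fiber departing toward another rack or pod

class OpticalSwitch:
    """Toy model of an optical circuit switch: a programmable mapping of
    input fibers to output fibers, with no packet inspection at all."""

    def __init__(self, num_ports: int):
        self.num_ports = num_ports
        self.mapping: dict[int, int] = {}

    def apply(self, circuits: list[CrossConnect]) -> None:
        used_in: set[int] = set()
        used_out: set[int] = set()
        for c in circuits:
            # Each port carries exactly one circuit at a time.
            assert c.ingress_port not in used_in and c.egress_port not in used_out
            used_in.add(c.ingress_port)
            used_out.add(c.egress_port)
        self.mapping = {c.ingress_port: c.egress_port for c in circuits}

def ring_topology_for_job(switch: OpticalSwitch, pods: list[int]) -> None:
    """Wire the pods participating in a job into a ring (a plausible
    shape for all-reduce traffic); tomorrow's job can apply a
    completely different mapping on the same hardware."""
    circuits = [CrossConnect(pods[i], pods[(i + 1) % len(pods)])
                for i in range(len(pods))]
    switch.apply(circuits)

switch = OpticalSwitch(num_ports=128)
ring_topology_for_job(switch, pods=[0, 1, 2, 3, 4, 5, 6, 7])
print(switch.mapping)  # {0: 1, 1: 2, ..., 7: 0}
```

The point of the sketch is the shape of the abstraction: the "switching" is just a port-to-port mapping that software can rewrite per job, which is what separates an optical circuit switch from a packet switch.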

Corey Quinn: The idea of applications reconfiguring themselves like that has been around a long time, but networking has always felt much more manual. The idea of controlling it via an infrastructure-as-code-style approach seems to have come very late to an awful lot of the networking world.

And I get it, because if you screw up a computer, okay, we'll revert that. Screw up the network, you're driving to the data center.

George Porter: Absolutely. And a lot of times, if the network is broken, how do you fix it? Because you need the network to access things in order to fix it. The academic world is often informed by what's going on in industry,

and we're responding, we're looking at trends and roadmaps. But one thing where I think that is reversed is that there's a lot of formal theory actually being brought into network configuration, which is extremely interesting. The idea is basically: imagine you want to specify some properties of your network, and you want to guarantee that all the traffic entering this point goes through this firewall, let's say.

You can actually write software that will ensure that all the configuration changes respect that property. And this is something that's really nice because it gives you more confidence that the network is going to work.
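
A minimal sketch of the kind of invariant checking George describes; the config model and the firewall rule here are invented for illustration, and real systems use far more capable formal tools:

```python
# Check a waypoint invariant ("all traffic entering the edge must
# traverse the firewall") before a proposed config change is applied.
# The trick: every source->dest path passes through the waypoint if
# and only if dest is unreachable once the waypoint is removed.

from collections import deque

def all_paths_hit(topology: dict[str, list[str]],
                  source: str, dest: str, waypoint: str) -> bool:
    """True if every path from source to dest in the directed graph
    passes through `waypoint`."""
    if source == waypoint or dest == waypoint:
        return True
    seen = {source, waypoint}          # pretend the waypoint is deleted
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in topology.get(node, []):
            if nxt == dest:
                return False           # a path exists that skips the firewall
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True

proposed_config = {
    "edge-router": ["firewall", "core-1"],  # oops: a direct link to core-1
    "firewall":    ["core-1"],
    "core-1":      ["servers"],
}

ok = all_paths_hit(proposed_config, "edge-router", "servers", "firewall")
print("safe to apply" if ok else "REJECTED: traffic can bypass the firewall")
```

Running this against the proposed config catches the stray edge-router-to-core-1 link and rejects the change before it ever reaches the network, which is the kind of confidence George is talking about.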

Corey Quinn: Another area that you have been focusing on to a significant degree is the carbon footprint of cloud computing, which I've heard a lot about from some AWS folks, and then from some Google folks who, frankly, showed how disclosure and transparency should be done, relatively speaking.

And I hear about it very occasionally from customers who have a mandate org wide to ask the questions. How are you approaching it?

George Porter: This issue of the carbon footprint of computing, broadly speaking, I think is something that's really important, and something that, as a field, we have to address. For data centers it's particularly important, because you're seeing so much on-prem deployment moving into data centers.

And so, even though there's not a huge number of these public cloud providers, they account for quite a bit of the compute that underpins the websites we go to. If AWS has some sort of load balancer problem, it feels like half the web has failed and half the websites don't work. So you get a sense of what's actually, you know, on there.

Corey Quinn: The concentration risk is no joke.

George Porter: Oh yeah. And so I think, globally, data centers account for maybe two or so percent of the carbon footprint of planet Earth, but that's growing quite dramatically, and you're seeing that especially with AI and ML, Grok, ChatGPT, OpenAI, etc. And a lot of companies had these roadmaps, like you talked about, for carbon neutral, net zero, whatever you want to say.

And it will be an open question how well we keep to those, given that the compute requirements of AI are pushing in the opposite direction. But to answer your question, there are sort of two ways we've been looking at this, not just at UCSD but elsewhere: reducing the amount of energy, and then, I think more importantly, redesigning data centers to support renewable energy, which is a real, massive, generational challenge, in my opinion.

Corey Quinn: There's a significant and consistent series of stories leaking out from various places. I saw one earlier this week from Oregon that was talking about how a small utility has apparently gone from basic obscurity to one of the region's biggest polluters. And apparently it's one of the power utilities supplying a bunch of data centers, specifically Amazon's.

And it's weird, because I remember circa 2016 or so, they said, oh, if you want to run your workload on pure renewable energy, put it in Oregon. Then they stopped talking about that, and now I'm seeing articles like this, and I'm starting to wonder if, you know, things like leadership principles, and building an arena to remind them of their pledge, and all these other things, were just zero-interest-rate phenomena.

It's like, well, you know, we need to make the money printer go brr, so at some point we're just going to basically back away from all of that.

George Porter: That is one potential explanation. I think another one is simply the fact that if you look at low carbon energy sources, the big issue you have is what's called intermittency, meaning that they're not always available.

Solar power is super cheap, and it's gotten really cheap over the last 10 years. But even here in Southern California, it's not sunny 100 percent of the time, it turns out. We have night here as well. And if you look at other sources, like wind, they're intermittent as well. And so the sources that are low carbon and available pretty much all the time are things like hydro.

And I think that was where Oregon came in. I want to say that in The Dalles, Oregon, they had a data center drawing on the Columbia River there, and that's where they were getting a lot of their energy 24/7. Nuclear is an example of a source where you can get power 24/7. But if you look at the grid, I think you're seeing a small amount of power that's available all the time,

and then you have this huge percentage, maybe 80 percent, that is available intermittently. And so as you grow, as you deploy, Amazon gets bigger, AI gets more important, just meeting that need might be a little bit difficult with those always-on low-carbon sources. And so, if you start sourcing from coal or natural gas, it's going to drive your carbon footprint up.

Corey Quinn: You recently did something that I found fascinating. You collaborated with someone on theatrical storytelling about how computing affects the planet. Tell me more about that, please.

George Porter: Well, sure. I figured this might be of interest to you, potentially, given that you have to act, in some sense, several times a week.

Corey Quinn: I am the dancing monkey of cloud computing.

George Porter: It's an esteemed title, I will say. But no, my wife, Dr. Monica Stufft, is a professor at the University of San Diego, which is also here in San Diego. We have three universities that all have the words San Diego in the title. It's very confusing.

Corey Quinn: And one of them happens to be in Los Angeles.

I kid, I kid.

George Porter: It could be, I don't know. We did a collaboration during the pandemic, or at least it started then, where we were saying, okay, the compute sector of the economy has this huge carbon footprint, and in general people really don't have much of an awareness of that.

We understand cars, maybe ride your bike. We understand public transit. There are a lot of aspects of life that people understand. But computing is really very opaque. We talk about the cloud; it sounds very happy, you know, that kind of thing. And so she had a set of theater students work to tell stories about climate change and, in particular, things that intersected with computing.

So it might be the energy needed to run all of this AI stuff; that might be something to try to convey. And it also might be the carbon footprint of making this stuff. If you think of the smartphone in your pocket, the carbon footprint of making that smartphone is really high. It involves a global supply chain.

It involves making all these different chips. And we bring all of that together to build this phone, and a typical person only keeps it for about 20 months. So all of that environmental impact is essentially thrown away at the end of 20 months. And that's why the vast majority of the carbon impact of a laptop or smartphone has already been spent before you even turn it on.

So even if you powered it with zero-carbon energy, a huge percentage of that total lifetime carbon footprint is going to come from just making the device.
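
To make that concrete, here's a back-of-the-envelope comparison. Every number below is an illustrative assumption (labeled as such in the code), not a measurement, though the resulting embodied-carbon share is in the same ballpark as published smartphone lifecycle analyses:

```python
# Back-of-the-envelope illustration: most of a phone's lifetime carbon
# footprint is embodied in manufacturing, not in charging it.
# All numbers are illustrative assumptions, not measurements.

embodied_kg_co2e = 60.0      # assumed manufacturing + supply-chain footprint
charging_kwh_per_year = 7.0  # assumed annual charging energy
grid_kg_co2e_per_kwh = 0.4   # assumed grid carbon intensity
lifetime_months = 20         # replacement cycle cited in the episode

operational = charging_kwh_per_year * (lifetime_months / 12) * grid_kg_co2e_per_kwh
total = embodied_kg_co2e + operational

print(f"operational: {operational:.1f} kg CO2e")          # ~4.7 kg
print(f"embodied share: {embodied_kg_co2e / total:.0%}")  # ~93%
```

Under these assumptions, even doubling the usage emissions barely moves the embodied share, so keeping the device in service longer is the lever that actually matters, which is exactly where the theater project pointed.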

Corey Quinn: Yeah, the smelting facility making the metals for these things is going to have a heck of a larger carbon footprint than whether you decide to power it via a central gas-fired power system or solar or whatnot over the lifetime of that phone.

George Porter: It's really true. And so you look at things like cobalt, which comes from the Congo, graphite, lithium, other elements that all come together to make it happen. So some of her students collaborated with computer science students at UCSD and created a set of performances that highlighted maybe keeping your phone for longer: how do you keep a device in use longer as a way to lower its carbon footprint?

This was, I thought, a really interesting collaboration to try to raise awareness, in some sense, of some of these issues. I will just say, if you turn on, well, it used to be PBS, I think we have a thousand channels now, but the Science Channel, whatever you want to call it, anything on Netflix, there are a million science shows about how black holes and wormholes work, but there's really nothing about how an email gets sent or how a Google search works or anything like that. And so I feel like, in some sense, a lot of students, a lot of people, have a better understanding of relativity and quantum physics than they do of the technology they use every single day.

Corey Quinn: It's also like the old bikeshedding problem, which originally came from a discussion, I think on Usenet, where they said, okay, if you say you're going to build a nuclear reactor, very few people are going to challenge you on the merits of that, because no one knows how to build a nuclear reactor. But if you say you're going to build a bike shed,

well, suddenly everyone is coming out of the woodwork talking about what color you should paint it. And you get nibbled to death by ducks in those stories. And I have to say, the first time I learned how an email gets sent or how a Google search worked, it was, yeah, that can't possibly work. How does it really happen?

It is still magic to me that these things work in spite of themselves, not because of them. And I used to run large-scale email systems. I know for a fact that it works, but it still boggles my mind that it does at all.

George Porter: And it works extremely well, so well, that when something is even slightly not working, we get really upset.

Like, you're on an airplane and you're like, oh, my email's slow or something, and, well, you're on an airplane, sending an email. So making all of that work has been kind of a miracle. Now, I will say that one of the reasons, over the last, say, 20 or 25 years, that everything has been able to work is because we've been able to scale things out.

And like you said, scaling out many hundreds of servers, thousands of servers for a single application, spreading an application across different data centers, etc. And as we leave this zero-percent-interest world, and as we start taking climate change more seriously, we're going to have to re-architect,

I feel, the way we build software so that it can better match things like renewables. So this idea that I have a data center, it's always on, it's got gasoline generators in case the power goes out, it's got UPSs, everything is very perfect, is going to be hard to keep going in a world where you want to be sourcing from wind and solar, for example.

And so I think that's one of the big challenges that we're going to have to face moving forward.
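
One concrete flavor of that re-architecting is carbon-aware scheduling: deferring flexible batch work into the windows when the grid is cleanest. A minimal sketch, with a hypothetical forecast; real deployments would pull these numbers from a grid carbon-intensity API:

```python
# Defer a flexible batch job to the lowest-carbon window in a
# day-ahead forecast. Forecast values below are hypothetical.

def best_start_hour(forecast: list[float], job_hours: int) -> int:
    """Return the start hour that minimizes total grid carbon
    intensity (gCO2e/kWh) summed over the job's duration."""
    windows = range(len(forecast) - job_hours + 1)
    return min(windows, key=lambda h: sum(forecast[h:h + job_hours]))

# Hypothetical 24-hour forecast: dirty overnight, clean at solar midday.
forecast = [430, 420, 410, 400, 390, 370, 320, 250,
            180, 120,  90,  80,  85, 100, 150, 220,
            300, 360, 400, 420, 430, 440, 445, 450]

start = best_start_hour(forecast, job_hours=4)
print(f"run the 4-hour job starting at hour {start}")  # hour 10, the solar trough
```

The same shape of decision, run where and when the energy is cleanest, generalizes to picking regions as well as hours, and it only works for workloads that can tolerate being moved, which is why the always-on, perfectly provisioned data center model is the thing under pressure.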

Corey Quinn: It's just such a big problem space that it's hard to wrap your head around. One last topic I want to get into before we wrap the episode brings it back down to something a lot more prosaic.

One of the reasons I've always found you to be such a great Twitter follow is that you periodically talk about using cloud computing for things. And from what I can tell, in the land of industry it's very different: okay, you're going to have to hire some people past a certain point of scale to handle the care and feeding of the cloud, but that's okay.

You're presumably going to make money. Ha ha ha. This is a zero-interest-rate environment with VC money slopping everywhere. You sure about that one, professor? Yeah, well, that's changing a bit. But in academia, the way the grant process works, you effectively get more or less well-intentioned servitude in the form of, you know, grad students lurking around here and there, but they often don't have the 15 years of experience it generally takes to avoid some of the sharp edges.

What do you see as different in managing the care and feeding of cloud environments and workloads when you're doing it from the perspective of an academic?

George Porter: Yeah, this is a really interesting question, because a lot of my colleagues, a lot of my friends, but myself in particular, think it's really important to give students hands-on practice, hands-on experience building something, deploying something.

Corey Quinn: Yeah, but this is like giving them the hands-on practice of stopping a running chainsaw with their hands.

George Porter: Well, exactly. Now, in order to give them that access, we can do things like give them credentials, root passwords, to, say, AWS. And we've done that in the past. And, you know, our job is to do research.

So we're trying to study some problem. We're trying to deploy something, run an experiment, collect some data and things like that. And the complexity of something like AWS or Google Cloud or Azure is a real benefit because it means that the data that we collect, the experiments that we do are relevant to industry.

And so that helps us with impact. But the challenge is that we cannot staff up, like you said, an organization of people who can manage our cloud resources, look at permissions, give students different access to things, etc. And so this means that occasionally a student or a faculty member will accidentally do something like commit our AWS credentials to GitHub.

And so within a few hours, all of a sudden, a hacker has spun up thousands of VMs running crypto-mining software, and our bill is $20,000 in 12 hours, or something like that. Now, it's hard to have a problem like that when you have an on-prem deployment like we used to, but now it's very easy to just

fat-finger something or type something incorrectly, and suddenly your bill is huge.
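
One cheap guardrail against the leaked-credentials failure mode George describes is scanning staged files for AWS access key IDs before a commit lands. This is a sketch of a pre-commit hook, not a substitute for real secret scanning or short-lived credentials; the "AKIA" prefix for long-lived AWS access key IDs is well documented:

```python
# Pre-commit sketch: refuse to commit files containing what looks
# like an AWS access key ID.

import re
import subprocess
import sys

# Long-lived AWS access key IDs match "AKIA" followed by 16 characters.
AWS_KEY_RE = re.compile(r"\bAKIA[0-9A-Z]{16}\b")

def staged_files() -> list[str]:
    out = subprocess.run(["git", "diff", "--cached", "--name-only"],
                         capture_output=True, text=True, check=True)
    return [f for f in out.stdout.splitlines() if f]

def main() -> int:
    leaked = []
    for path in staged_files():
        try:
            with open(path, encoding="utf-8", errors="ignore") as fh:
                if AWS_KEY_RE.search(fh.read()):
                    leaked.append(path)
        except OSError:
            continue  # deleted or unreadable file; skip it
    for path in leaked:
        print(f"possible AWS access key in {path}; refusing to commit")
    return 1 if leaked else 0

if __name__ == "__main__":
    sys.exit(main())
```

Wired in as .git/hooks/pre-commit, a nonzero exit blocks the commit, which turns the "$20,000 in 12 hours" surprise into a one-line error message.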

Corey Quinn: Let they who have never done that cast the first stone. I mean, people don't usually ask me the question, but, huh, why are you so passionate about AWS bills and the rest? It didn't come from nowhere. I screwed it up and was made to feel like a fool for it.

It's okay. Let's see if I can smack back a bit.

George Porter: Well, you know, you might feel that way, but absolutely you're not, and I want to be clear: I've done this myself, I've had students who do this, and it's not necessarily our fault. It's just the fact that you have a system that is sort of infinitely scalable in every dimension, and even small decisions matter, like, does my VM need a world-accessible IPv4 address or not?

Well, that might change my bill by $500 a month or something like that. Or, you know, I typed n=100 thinking it meant one thing, but actually I started a hundred VMs, or something like that. I thought I was asking for a hundred gigabytes of memory, but actually I made a hundred different copies of my system, or something like that.

And those kinds of things are really challenging. The providers are trying to do their best, I think, to help out academic users, but it is pretty difficult. One thing I will say is that the National Science Foundation, which funds a lot of computer science research, pretty much the main funding body of computer science research in the US, has identified this as a bit of an issue, and so they have started including cloud credits with your grant, which might work on AWS, or might work on Azure, or something like that.

So rather than necessarily giving you money to spend on cloud credits, you can potentially get resources that way. And so this is a little bit of an opportunity to both make the federal tax money go a little bit further and possibly offload a little bit of that complexity from us academic users, who don't spend all day long setting up these cloud-based resources.

Corey Quinn: And it's always a setup-and-teardown story. At least when you're building an on-prem university cluster, great: you build something out, you have people to maintain the care and feeding of it, it gets reused again and again and again, and then you're just getting time on the cluster.

This is effectively building the whole thing from scratch, because, you know, everyone should be able to do that off the top of their head. And while someone did something similar four years ago, and they still have their scripts around, surely those will still work, because it's not like there's a new version of Python or some weird change has happened somewhere that's going to make this work very differently.

Usually not, but sometimes there are. By the time you can wrap your head around all of this, that's its own career.

George Porter: Yeah. Now, the one thing that is advantageous about using the real cloud, instead of some fake cloud or on-prem, is that a former student, let's say, or a friend, or someone, can say, hey, this project looks kind of cool,

I'm just going to grab your code and deploy it in the same environment you tested on. And so this is an opportunity to have impact in a way that, in the old days, meant we'd write a paper, and if you were lucky, maybe someone looked at it, and if you were really lucky, they decided to code it up and run it in their company or something like that.

Here, you know, you can just grab the code from GitHub, deploy it, run it. And you see some of these projects making their way into industry, which is really great.

Corey Quinn: It's neat to see. It sort of answers the question of, when am I going to use this in life? Which is, I think, every academic's least favorite question from the uneducated masses.

George Porter: Yeah, the answer is, you'd be surprised. I mean, as a grad student, I was at Berkeley at the time the Spark project was taking off. So when you hear these things like, when am I going to use this in life? Well, there are companies like Databricks that have a really clear,

you know, ancestry back to these academic projects. But you're seeing this even in things like programming languages. When I was an undergrad, everyone was like, I have to learn C, because that's what they use in industry, and over here you're teaching us Haskell and all these crazy things no one will ever use.

And yet you're seeing, you know, Microsoft and Google deploying code in these other languages and things like that. So it's actually a really exciting time, I think, to be doing academic research, because it's kind of never been easier to deploy stuff you develop in the academic world into industry.

Corey Quinn: That's true. Because a lot of the companies, too, are like, hey, you know what you should do? Basically volunteer for a bunch of universities. Which, yeah, in good times people still look at with suspicion and distrust, and when times get tight, it's like, oh yeah, turns out one of our leadership principles is very much not philanthropy.

So good luck.

George Porter: Yeah, I think that, at least at the cloud level, it's a lot easier for cloud providers to provide credits to academics than, you know, hard dollars. You're seeing a little bit of a mixture of both, but at least as far as we can tell from talking to folks at conferences and things like that, you are seeing this impact go both ways.

And so our students, when they graduate, have, generally speaking, had the opportunity to put some code into multiple data centers, and they can say, you know, I wrote a program that I deployed to Korea, and it also ran in Europe, and it ran in the U.S., and I failed one of the data centers, and it all kept working.

And you never would have been able to do that in the '90s, or the 2000s, or even the early 2010s.

Corey Quinn: Would have required a research team and an enormous budget. Now it just requires a few lines of configuration.

George Porter: It just requires a credit card, and it might cost $100, or it might cost $10,000. That's the question you have to figure out.

Corey Quinn: Or you think it's one, and it's going to be the other by surprise. Yeah, those are always great.

George Porter: It's unlikely that you think it's going to be $10,000 and it's only $100.

Corey Quinn: If people want to learn more, where's the best place for them to go to see what you and your team are up to?

George Porter: We have a website at UCSD called cns.ucsd.edu. It's the Center for Networked Systems, which is a bunch of faculty, staff, and students who are working in this space. And then for our low-carbon work, including the collaborations with the University of San Diego and Monica Stufft, that's on a website called c3lab.net.

Corey Quinn: And we will put links to both of those in the show notes.

Thank you so much for taking the time to speak with me. I really do appreciate it.

George Porter: It's a real pleasure to chat with you, and I hope we can talk on Twitter soon.

Corey Quinn: Oh, I expect it'll be hard to get away from me as that environment continues to contract. Thanks again for making the time. I appreciate it.

George Porter: Thank you so much.

Corey Quinn: George Porter, professor in the Computer Science Department at the University of California, San Diego. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you enjoyed this podcast, please leave a five-star review on your podcast platform of choice. Whereas if you hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry, insulting comment that channels 20 years of aggression over the way a crappy computer science professor made you feel silly back in the early 2000s.
