Episode Show Notes & Transcript
- Machine Learning is a Marvelously Executed Scam: https://www.lastweekinaws.com/blog/machine-learning-is-a-marvelously-executed-scam/
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.
Corey: Tailscale SSH is a new, and arguably better way to SSH. Once you’ve enabled Tailscale SSH on your server and user devices, Tailscale takes care of the rest. So you don’t need to manage, rotate, or distribute new SSH keys every time someone on your team leaves. Pretty cool, right? Tailscale gives each device in your network a node key to connect to your VPN, and uses that same key for SSH authorization and encryption. So basically you’re SSHing the same way that you’re already managing your network.
Corey: Kentik provides Cloud and NetOps teams with complete visibility into hybrid and multi-cloud networks. Ensure an amazing customer experience, reduce cloud and network costs, and optimize performance at scale — from internet to data center to container to cloud. Learn how you can get control of complex cloud networks at www.kentik.com, and see why companies like Zoom, Twitch, New Relic, Box, eBay, Viasat, GoDaddy, booking.com, and many, many more choose Kentik as their network observability platform.
Corey: Welcome to Screaming in the Cloud, I’m Corey Quinn. In a refreshing change of pace, I have decided to emerge from my home office cave studio thing and go to re:Invent and interview people in person. This is something of a challenge for me because it is way easier in person to punch me in the face, so we’ll see how it winds up playing out. My guest today is Diya Wynn, Senior Practice Manager at AWS. Diya, what is a practice manager at AWS? What do you do?
Diya: So, a practice manager, I guess you can think of it just like a manager of a team. I have a practice that’s specifically focused on Responsible AI. And I mean, practices are just like—you could have one in financial services or anything. It’s a department, essentially. But more important than the practice in the title is actually what I get a chance to do, and that’s working directly with our customers that are using and leveraging our AI/ML services to build products.
And we have an opportunity to help them think about how they’re using that technology in ways that improve or benefit individuals and society, while minimizing the risk and the unintended impact or harm. And that’s something that we get to do with customers across any industry as well as globally. And my team and I have been enjoying the opportunity to be able to help them along their Responsible AI journey.
Corey: So, the idea of Responsible AI is… I’m going to sound old and date myself when I say this, but it feels like it’s such a strange concept for me, someone who came up doing systems administration work in physical data centers. The responsible use of a server back when I was hands-on hardware was, “Well, you don’t want to hit your coworker with a server no matter how obnoxious they are.” And it was fairly straightforward. It was clear: yes or no. And now it seems that whenever we talk about AI in society, in popular culture, from a technologist’s point of view, the answer is always a deeply nuanced shade of gray. Help.
Diya: Nuanced shade of gray. That’s interesting. It is a little bit more challenging. I think that it is, you know, in one sense because of the notion of all of the data that we get to leverage, and our machine-learning models are reliant on data that has variations coming from, you know, historical sorts of elements, things that have bias baked in. All of that has to be considered. And I think when we think about some of the challenges and even the ways in which AI is being used, it means that we have to be much more mindful of its context, right?
And these systems are being used in ways that we probably didn’t think about servers being used in the past, but also are in the midst of some high-stakes decisions, right? Whether or not I might be identified or misidentified and inappropriately arrested, or whether I get the appropriate service that I was expecting, or whether or not there are associations related to my gender or my sexual preference. All of that matters, and so it does become much more of a nuanced conversation. Also, depending on the jurisdiction you’re in, the region, what makes sense and what matters might differ slightly. So, it’s a multidisciplinary problem or challenge: we need to think about what the legality of this is.
And we have to think about social science sometimes and there’s an element of ethics. And all of that plays into what becomes responsible, what is the right way in which we use the technology, what are the implications of technology? And so yes, it is a little bit more gray, but there are things that I think we have at our disposal to help us be able to respond to and put in place so that we really are doing the right things with technology.
Corey: I’ve known Amazon across the board to be customer-obsessed, and they tell us that constantly—and I do believe it; I talk to an awful lot of Amazonians—and so much of what the company does comes directly from customer requests. I have to ask, what were customers asking that led to the creation of your group? Because it seems odd to me that you would have someone coming to you and saying, “Well, we built a ‘Hot Dog/Not A Hot Dog’ image recognition app,” and, “Oopsie. It turns out our app is incredibly biased against Canadians. How do we fix this?” Like, that does not seem like a realistic conversation. What were the customer concerns? How are they articulated?
Diya: No, that’s really good. And you’re right. They weren’t asking the question in that way, but over the last five years or so, I would say, there has been an increase in interest in, as well as concern about, how AI is being used and the potential risks or the areas of unintended impact. And with this sort of heightened sensitivity or concern, both with our executives as well as members of common society, right—they’re starting to talk about that more—they started to ask questions: “We’re using services; we want to be responsible in building.”
Now, some customers were saying that. And so, they would ask, “What are other customers doing? What should we be aware of? How do we or are there tools that we can use to make sure that we’re minimizing bias in our systems? Are there things that we can think about in the way of privacy?”
And oftentimes, privacy and security are among those areas that might come up first. And those were the kinds of questions. We actually did a survey, asking a number of our customer-facing resources to find out what customers were asking so that we could begin to respond with a product or service that would actually meet that need. And I think we’ve done a great job in being able to respond to that by providing them assistance. And I think the other thing that we paid attention to was not just the customer requests but also what we’re seeing in the marketplace. Part of our job is not only to respond to the customer need but also sometimes to see the need that they’re going to have ahead of them because of the way in which the industry is moving. And I think we did a pretty good job of being able to see that and then start to provide service and respond to assist them.
Corey: Yeah, it’s almost like a rule that I believe I stole from Scott Hanselman, where the third time that you’re asked the same question, write a blog post, then that way you can do a full deep—
Diya: Did he really say write a post? [laugh].
Corey: Treatment of it. Yes, he did. And the idea is, write a blog post—because his blog is phenomenal—and that way, you have a really in-depth authoritative answer to that question and you don’t have to ad-lib it off the cuff every time someone asks you in the future. And it feels like that’s sort of an expression of what you did. You started off as a customer-facing team where they were asking you the same questions again and again and at some point it’s, okay, we can either spend the rest of our lives scaling this team ad infinitum and winding up just answering the phone all day, or we can build a service that directly addresses and answers the question.
Diya: Absolutely, absolutely. I think that’s the way in which we scale, right, and then we have some consistency and structure in order to be able to respond and meet a need. What we were able to do was—and I think this is sort of the beauty of being at AWS and Amazon; we have this opportunity to create narratives and to see a need, and be able to identify and respond to that. And that’s something that everybody can do; it’s not just reserved for a VP or someone that’s an executive. We all can do that. And that was an opportunity that I had: seeing the need, getting information and data, and being able to respond and say, “We need to come up with something.”
And so, one of our first pieces of work was to actually define a framework. How would we engage? What would be that repeatable process or structure for us—a framework that we can leverage with our customers every time to help them think through, look around corners, understand where there’s risk, and make better-informed decisions about how they were using the technology or what ways they could minimize bias? And so, that framework for us was important. And then we now have tools and services as well that were underway, you know, on our product side, if you will, that complement the work.
So, not only here’s a process, here’s a framework and structure, but also here are tools that in technology you can bring to bear to help you automate, to help you understand performance, or even you know, help you minimize the bias and risk.
Corey: What’s interesting to me—in a very different part of the world than AI, I live in AWS costing because I decided, I don’t know, I should just go and try and be miserable for the rest of my life and look at bills all day. But whenever I talk to clients, they ask the same question: what are other customers doing, as you alluded to a few minutes ago? And that feels like it’s a universal question. I feel like every customer, no matter what discipline or what area they’re in, is firmly convinced that somewhere out there is this utopian, platonic ideal of the perfect company that has figured all of this stuff out, and we’re all constantly searching for them. Like, there’s got to be someone who has solved this problem the right way.
And in several cases, I’ve had to tell clients that you are actually one of the best in the world and furthest advanced at this particular thing. That customer, the closest we’ve got to them is you, so we should be asking you these questions. And for whatever it’s worth, no one ever likes hearing that because, “Like, oh, we’re doing something wild.” It’s like—
Diya: [crosstalk 00:10:15] pioneers.
Corey: —“Well, we got to solve this ourselves? That’s terrible.”
Diya: Well, it’s interesting you say that because it is a common question. I think customers have an expectation that because we are AWS, we’ve seen a lot. And I think that’s true. There are tens of thousands of customers that are using our services, and we have conversations with companies all across the world, so we do have some perspective on what other customers are doing, and that’s certainly something that we can bring to the table. But the other part of this is that this is really a new area. This is a sort of new space that we’re focused on—trustworthy and Responsible AI—and there aren’t a ton of customers that are doing this—or companies at all—that have it entirely answered. You know, we’re all on a journey.
So, these are, I would say, early stages. And we do have the benefit of being large, having a lot of customers, having some experience in building services as well as helping our customers build products, having a team that’s focused on looking at standards and working with standards bodies globally, and having teams that are working on understanding what we’re doing in regulation and public policy. And so, all of that we bring to bear when we start talking about, you know, this with our customers. But we don’t have all the answers; we’re on a journey like them. And I think that’s something that we have to be comfortable with, to some degree—that this is an evolving area and we’re learning. And we’re investing even in research to help us continue to move forward. But there’s a lot that we know, there’s a lot that we can bring to the table, and we can help our customers in that regard.
Corey: Now, this might very well be old news and well understood, and my understanding laughably naive, when this gets released, but as of this recording, a few hours beforehand, you released something called Service Cards. And I have to say, my initial glance at this was honestly one of disappointment when I saw what it was, because what I was hoping for—whenever you see ‘service’ and ‘cards’ together—is that these are going to be printable cardboard, little cards that I can slip into the Monopoly board game I have at home, and game night at home is going to be so traumatic for my kids afterwards. Like, “What’s a Fargate?” says the five-year-old, and there we go. “It means that daddy is not passing go; he’s going directly to jail with you. Have fun.” It’s great. But I don’t think that’s what it is.
Diya: No, not at all. Not at all. So, it is very similar to the context that people might be familiar with around model cards, being able to give definition and understanding of a model that’s being used. For us, we sort of took that concept one step beyond that, in that, you know, just providing a model card isn’t necessarily sufficient, especially when there are multiple services or multiple models being used for any one of our services. But what our Service Cards allow us to do is to provide a better understanding of the intended use of the service, you know, and the model that’s underpinning that, give context for the performance of that service, and give guidelines for our customers to be able to understand how it’s best used and how it best performs.
And that’s a degree of transparency that we’re providing under the hood, for our customers, to really help them as well be much more responsible in how they’re building on top of those. And it gives them clarity because there is a growing interest in the marketplace for customers—or companies—to hold their vendors responsible, right, making sure that they’re doing the right things and covering off: are we building well? Do we have, like, the customer or enough of the demographic covered? What does the performance look like? And this is a really big opportunity for us to be transparent with our customers about how our services are being built and give them a little bit more of that guardrail that we were talking about—guidelines—how to best use it as they look to build upon those.
Corey: Not in any way, shape, or form to besmirch the importance of a lot of the areas that you’re covering on this, but on some level, I’m just envious in that it would be so nice to have that for every AWS service, of this is how it is—
Diya: Uh-oh [laugh].
Corey: —actually intended to be used. Because to me, I look at it and all I see is database, database, really expensive database, probably a database, and, like, none of those are designed to be databases. Like, “You lack imagination,” is my approach. And no, it just turns out I’m terrible at computers, but I’m also enthusiastic, and those are terrible combinations. But I would love to see breakdowns around things like that as far as intended use, potential pitfalls, and—as we start seeing more and more services get machine learning mixed in, for lack of a better term—increasingly we’re going to start to see areas where the ethical implications absolutely are going to be creeping in. Which is a wild thing to say about, I don’t know, a service that recommends how to right-size instances having ethical concerns. But it’s not that unreasonable.
Diya: Well, I can’t make any promises about us having those kinds of instructions or guidelines for some of our other services, but we are certainly committed to being able to provide this transparency across our AI/ML services. And again, that’s something I will say that’s a journey. We’ve released a few today; there are others that are going to come. We’re going to continue to iterate and evolve so that we can get through our services. And there’s a lot of work behind that, right?
It’s not just that we wrote up this document, but it is providing transparency. But it also means that our teams are doing a great bit in terms of the diligence to be able to provide that feedback, to be able to test their models, understand their datasets, you know, provide information about the datasets in public—you know, for the public datasets that are being tested against, and also have the structure for them to train their models appropriately. So, there’s a lot going into the development of those that may not be immediately transparent, but really is core to our commitment to how we’re building our services now.
Corey: It’s a new area in many respects because, to be very direct, if I wind up misusing or being surprised by a bad implementation of something, in most cases in an AWS context, the disaster area looks a lot closer to “I get a big bill.” Which—and this [unintelligible 00:16:35] is going to sound bizarre, but here we are—it’s only money. Money can be fixed. I can cry and sob to support and get that fixed.
With things like machine learning and AI, the stakes are significantly higher because, given some of the use cases and given some of the rapidly emerging technology areas in which these things are being tested and deployed, it hurts people if it goes wrong. And an AWS bill is painful, but not on a damaging-to-populations level. Yet. I’m sure at some point, it becomes so large it becomes its own micro-economy, I guess the way those credits are now, but that’s a very different thing.
Diya: Right. Absolutely. So, I think that’s why our work from a responsibility perspective is important. But I think it’s also valuable for customers to understand, we’re taking a step forward in being able to help them. Very much like what we do with Well-Architected, right? We have a framework, we have best practices and guidance that are being provided so that our customers who are using our cloud services really know what’s best.
This is very much like those Service Cards, right? Here are the best conditions in order to be able to use and get the greatest value out of your cloud investment. The same thing is what we’re doing with this approach in helping our customers in the Responsible AI way. Here are the best practices, guidance, guardrails, and tools that are going to help you make the most out of your investment in AI and minimize those unintended or potential areas of harm that you were describing. And you’re right, there are high-stakes use cases, right, that we want to be able to help and equip our customers to think about more intentionally, and to be prepared with a governance structure, people aligned, processes, and technology to really be able to minimize that, right? We want to reduce the blast radius.
Corey: One thing I want to call out as well is that as much as we love in tech to pretend that we have invented all of these things ourselves—like, we see it all the time; like, “No one really knows how to hire, there’s no real scientific study on this.” “Yes, there are. There are multi-decade longitudinal studies at places like GM and whatnot.” And, “No, no, no tech is different. There’s no way to know this. La la la.”
And that’s great. We have to invent these things ourselves. But bias has been a thing in business decisions, even ones that are not directly caused by humans, for a long time. An easy example is in many cases, credit ratings and decisions whether to grant credit or not. Like, they were not using machine learning in the 90s to do this, but strangely, depending upon a wide variety of factors that are not actually things that are under your control as a person, you are deemed to be a good credit risk versus a bad credit risk.
And as a result, I think one of the best terms I heard in the early days when machine learning started getting big, was just referring to it as bias laundering. Well, we’ve had versions of that for a long time. Now, at least it seems like this shines a light on it if nothing else, and gives us an opportunity to address it.
Diya: Absolutely. Oh, I love that, right? The opportunity to address it. So, one of the things that I often share with folks is we all have bias, right? And so, like you said, we’ve had bias in a number of cases. Now, you know, in some cases, bias is understandable. We all have it. It is the thing that often—we talk about the sort of, like, mental shortcuts, things that we do that help us to respond rapidly in the world amid the vast array of information that we’re taking in all the time. So—
Corey: You’re an Amazonian. You yourself bias for action.
Diya: Exactly. Right? So, we have bias. Now, the intent is that we want to be able to disrupt that so that we don’t make decisions, oftentimes, that could be harmful, right? So, we have proclivities, desires, interests, right, that kind of fold into our bias, but there are other things—our background, where we went to school, you know, experiences that we had, information that we’ve taken in—that also help to drive towards some of those biases.
So, that’s one element, right, understanding that. Human bias gets infiltrated into our systems. And there was a study by AI Now—I think it was in 2019—that talked about that, right, that our systems are often biased by—or the bias is introduced, you know, sometimes by individuals. And part of the necessity for us to be able to eliminate that is understanding that we have bias, doing things to interrupt it, and then also bringing in diversity, right? Because some of our biases are just that we don’t have enough of the right perspectives in the room; we don’t have enough of the right people involved, right?
And so, being able to sort of widen the net, making sure that we’re involving the outliers, I think is important to us being able to eliminate bias as well. And then there are tools that we can use. But then you also bring up something interesting here in terms of the data, right? And education plays a good role in helping us understand things like what you described: institutional biases baked into our data that can also come out in decisions that are now being made. And the more that we use AI in these ways, the more there is risk for that, right?
So, that’s why this effort in Responsible AI, understanding how we mitigate bias, understanding how we invite the right people in, the inclusion of the right perspectives, thinking about the outliers, thinking about whether or not this is the right problem for us to solve with AI is important, right, so that we can minimize those areas where bias is just another thing that we continue to propagate.
Corey: So, a year or two ago, I wrote a blog post titled Machine Learning is a Marvelously Executed Scam. And it was talking about selling digital pickaxes into a data gold rush.
Diya: I [crosstalk 00:22:30] remember this one [laugh].
Corey: And it was a lot of fun. In fact, the Head of Analyst Relations at AWS for Machine Learning responded by sending me a Minecraft pickaxe made out of foam, which is now hung behind me in my home office, and I get a comment on it at least three times a week. It was absolutely genius as far as rebuttals go. And I’ve got to find some way to wind up responding to her in kind one of these days.
But it felt like it was a solution in search of a problem. And I no longer hold so closely to that particular opinion, in no small part due to the fact that, as you’re discussing, this area is fraught; it’s under an awful lot of scrutiny, and large companies who use these things, and whose tools then get it wrong, are basically going to wind up being castigated for it. And yet, they are clearly realizing enough value from machine learning that it is worth the risk. And these are companies whose entire business, start to finish, is managing and mitigating risk. There is something there, or suddenly everyone has taken leave of their senses. I don’t quite buy that second option, so I’m guessing it’s the first.
Diya: So, the question is, is it worth the risk? And I would say, I think some people might or some companies might have started to step into that area thinking that it is, but it’s not. And that’s what we’re saying and that’s what we’re hearing in the industry [unintelligible 00:23:51], that it’s not worth the risk. And you’re hearing from customers, outcries from others, government officials, right, all of them are saying, like, “It’s not worth the risk and we have to pay attention to that.”
But I think that there’s certainly value and we’re seeing that, right? We’re solving previously unattainable problems with AI. We want to be able to continue to do that, but give people the means to be able to sort of minimize where there is risk and recognize that this is not a risk that’s worth us taking. So, the potential for reputational harm and the damage that will do is real, right? When a company is called out for the fact that they’ve discriminated and they’re unfairly evaluating homes, for instance, for people of color in certain communities, right, that’s not something that’s going to be tolerated or accepted.
And so, you have people really calling those things out so that we start to—organizations do the right things and not think that risk is worth the [unintelligible 00:24:52]. It is very well worth the risk to use AI, but we’ve got to do it responsibly. There’s so much value in what we are able to accomplish.
So, we’re seeing, you know, even with Covid, being able to advance, like, the technology around vaccinations and how that was done and accelerated with machine learning, or being able to respond to some of the needs that small businesses and others had, you know, during Covid, being able to continue their service because we didn’t have people in businesses or in offices—a lot of that was advanced during that time as a result of AI. We want to be able to see advances like that and companies be able to continue to innovate, and so we want to be able to do that without the risk, without the sort of impact that we’re talking about, the negative impact. And I think that’s why the work is so important.
Corey: Do you believe that societally we’re closer to striking the right balance?
Diya: We’re on our way. I think this is certainly a journey. There is a lot of attention on this in the right ways. And my hope—and certainly, that’s why I’m in a role like this—is that we can actually invite the right voices into the room. One of the things—and one of my colleagues said this earlier today, and I think it was a really, really great point, right—first of all, we never thought that we would have, like, ethicist roles and sort of Responsible AI folks and chief ethics officers. That was not something that existed in the context of, sort of, machine learning, and that’s something that has evolved in the last, you know, few years.
But the other thing that we’re seeing is that the folks that are sitting in those roles are increasingly diverse and are helping to drive the focus on the inclusion that we need and the value of making sure that those voices are there so that we can build in inclusive and responsible ways. And that’s one of the things that I think is helping us get there, right? We’re not entirely there, but I think that we’re on a path. And the more that we can have conversations like this, the more that companies are starting to pay attention and take intentional action, right, to build ethically and to have the trust in the technology and the products that they build, and to do that in responsible ways, we’ll get there.
Corey: I really want to thank you for taking so much time to talk through what you’re up to with me.
Diya: I am super excited and glad that you were able to have me on. I love talking about this, so it’s great. And I think it’s one of the ways that we get more people aware, and hopefully, it sparks the interest in companies to take their own Responsible AI journey.
Corey: Thank you so much for your time.
Diya: Thanks for having me.
Corey: I appreciate it. Diya Wynn, Senior Practice Manager at AWS. I’m Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you’ve hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry insulting comment, presumably because you’re Canadian.
Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.