When Two Clouds Isn’t Enough with Alan Raison

Episode Summary

Alan Raison is the DevOps Lead at Hitachi Capital, a financial services company headquartered in the U.K. Prior to working there, Alan worked as a developer at Sky, a senior analyst programmer at Hermes, and a software developer at BJSS. He also holds AWS Solutions Architect and AWS Developer certifications.

Join Corey and Alan as they talk about what exactly it is that a DevOps lead does, how there’s a big world out there and not every company is a SaaS startup in the Bay Area that hemorrhages money, how it turns out that running an Oracle database in Oracle Cloud isn’t the worst thing in the world, how everyone is using serverless but nobody’s using it properly, learning curves and how they make it hard to unlock the full power of AWS, why Alan is a big fan of the book Accelerate, and more.

Episode Show Notes & Transcript

About Alan

Developer and DevOps-er; interested in all kinds of cloudy tech, especially deployment pipelines and infrastructure as code. Also building the DevOps capabilities at Hitachi Capital.

Links:

Hitachi Capital UK: https://www.hitachicapital.co.uk/
Accelerate book: https://www.amazon.com/Accelerate-Software-Performing-Technology-Organizations/dp/1942788339
Twitter: https://twitter.com/alanraison
GitHub: https://github.com/alanraison

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Cloud Economist Corey Quinn. This weekly show features conversations with people doing interesting work in the world of Cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: If your mean time to WTF for a security alert is more than a minute, it's time to look at Lacework. Lacework will help you get your security act together for everything from compliance service configurations to container app relationships, all without the need for PhDs in AWS to write the rules. If you're building a secure business on AWS with compliance requirements, you don't really have time to choose between antivirus or firewall companies to help you secure your stack. That's why Lacework is built from the ground up for the Cloud: low effort, high visibility and detection. To learn more, visit lacework.com.

Corey: This episode is sponsored in part by LaunchDarkly. Take a look at what it takes to get your code into production. I’m going to just guess that it’s awful because it’s always awful. No one loves their deployment process. What if launching new features didn’t require you to do a full-on code and possibly infrastructure deploy? What if you could test on a small subset of users and then roll it back immediately if results aren’t what you expect? LaunchDarkly does exactly this. To learn more, visit launchdarkly.com and tell them Corey sent you, and watch for the wince.

Corey: Welcome to Screaming in the Cloud. I’m Corey Quinn. I’m joined this week by Alan Raison, who’s currently a developer meets DevOps lead, slash whatever you want to call yourself, really, over at Hitachi Capital in the UK. Alan, thanks for joining me.

Alan: Pleasure to be here.

Corey: So, what is it you do, exactly? It’s a truism that as long as we continue to improve our titles, it obscures meaning to the point where it, on a long enough timeline, is completely meaningless but has 500 words tied to it. Where do you start? Where do you stop?

Alan: Okay, so I’ve got a background in being a developer. So, I started at Hitachi nearly five years ago as a lead developer. But last year, I really sort of—well, I’ve always been interested in the deployment side, infrastructure as code, things like that. And so the opportunity came about a year ago to try and start up a DevOps team inside it, actually. So, I am now DevOps lead, one of two.

We’ve got a small DevOps team. As such, we work alongside the development teams, the testers, trying to set up their build pipelines, their cloud infrastructure, and any, sort of, bits and pieces along the way. So, we’ve got on-prem infrastructure, as well, that we do a bad job of looking after.

Corey: So, I think it’s fair to say on some level, that Hitachi Capital could be considered a ‘legacy’ company. And I use the term legacy in the way that I always hear it when other people use it, which means, it makes money, which is apparently falling out of fashion. It has one of those old-fashioned business models that our grandparents would have understood of, “Oh, you provide a service or good and you [unintelligible 00:02:12] for more money than it takes to wind up providing that service or good, and that leftover is called a ‘profit.’” and that seems to have not really been absorbed as a lesson by the venture capital set, et cetera. Is that a fair way of casting it?

Alan: Well, yeah, it certainly seems to work as a business model for us. [laugh].

Corey: Yeah, “Well, we’re losing money, but making it up in volume,” never really seems to catch on. But you were a listener of this show and reached out with some fascinating feedback. And you said that, “Well, you’re in a company that’s the antithesis of most of the things that I believe.” And you went through a laundry list of those, and we’ll get to some of them, but what I really appreciated about it was that… yeah, I agree with you; I absolutely have a certain perspective that aligns with a certain type of company, just because that’s what I tend to be surrounded with. It’s where my own career has taken me, it’s a large part of where significant and overly vocal—shall we say—portions of my audience tend to reside.

So, it’s very easy to lose sight of the fact that there’s a big world out there, and it doesn’t always align to a SaaS company in the Bay Area building something that is patently ridiculous. So, a lot of what I said doesn’t necessarily apply to what you’re doing and how you’re doing it. And I think that’s great feedback. More of that, please. But of all the weird positions I’ve taken, let’s start at the top. What offends you, slash you disagree with the most?

Alan: [laugh]. Well, I wouldn’t say anything that you say, offends me. I think, in fact, you’re probably on the right lines objecting to some of these things. But we are where we are, I guess. So, for example, multi-cloud.

As developers, we tend to like AWS. It’s in many ways built for developers. It’s the first one developers tend to go to. But hey, we’ve got a sizable Windows estate of servers and whatever else we do in Windows, including a number of COTS systems. And so it makes sense for us to not only use AWS, but we use Azure as well, and we’ve got some Windows engineers who are putting things into Azure, as well as the developers putting most of their stuff into AWS.

That doesn’t sound too crazy for starters; most people, I would imagine if they’re looking to go multi-cloud, would consider two of the big three—who I consider to be, like, AWS, Azure, and GCP—but we also have data centers. We don’t want data centers, but we do have a large Oracle estate, Oracle databases, Oracle WebLogic application servers. And when it came to putting that into AWS, it looked like it was going to be very expensive. Combine this with those Windows servers that these applications seem to talk to, and it was suggested, by I think it was an architect or maybe from some salesperson who spoke to an architect, that we use Oracle Cloud. And believe you me, me and my boss, we fought this for quite some time.

We thought, you know, [laugh] this can’t be a good idea. Three clouds is quite clearly crazy, and we don’t want to go there. But we did look into it, we looked in the numbers and the Oracle salespeople did their technical demonstrations and things, and it does seem that if you want to run the Oracle Database in the cloud, then Oracle Cloud isn’t actually too bad a platform to do it on.

Corey: Okay. A lot to unpack there. Let’s start at the top. Even though I talk about it otherwise, I fall prey to it the same way anyone else does, which is, it’s easy to talk about an ideal world and how you would do things in that mythical environment. It’s the whiteboard fallacy, almost, where design an architecture that solves for X, Y, and Z.

Well, sure, in a vacuum, it’s easy to do. In practice, there are questions of, “Well, what about the real world being messy?” Plus the whole aspect of we’re very rarely working with anything that’s purely greenfield. There’s got to be support for existing workloads that are not built in a way that align with cloud. So, I think that I need to be more cautious about framing things the way that I do.

So first, thanks for the feedback. I really do appreciate being able to catch these things when they slip out and inadvertently start tainting the ecosystem with, I guess, a too forward-looking SaaS-y startup-style company perspective. I mean that sincerely. It’s great to get feedback in a constructive way.

Alan: Well, I still think there is room for that criticism, and it’s not entirely unfounded. But that’s what works for us, and I’m not offended by your snark at the multi-cloud strategy at all.

Corey: [laugh]. I would also say that multi-cloud does make sense for different workloads that have very few points of interaction, especially if they’re already built to embrace the benefits that one cloud provider offers over another. I don’t really have a problem with that. My anti-multi-cloud stance—which I feel like it gets misinterpreted a fair bit—is, “We’re going to build one workload that we could seamlessly move between cloud providers, even though we either never do it or we’re trying to do it for the wrong reasons.” That’s the multi-cloud that I’ve seen being pushed by a variety of vendors and that’s the thing that upsets me.

But personally, I use GitHub, which is an Azure product or rapidly becoming something like that. I use things that are well under the umbrella of GCP. And I run a lot of infrastructure services on top of AWS. So, from that perspective, we’re all multi-cloud unless we’re doing something hilariously wrong.

Alan: [laugh]. Yep.

Corey: Now, as far as Oracle Cloud goes, I’ve played with them before, and I think I’ve been fairly public about it, in that, from a technical perspective, I like an awful lot about what Oracle Cloud is doing, particularly if you’re already an Oracle customer. From migrating a database that is already an Oracle Database onto a cloud, it’s hard to necessarily push back against some of the value propositions that Oracle comes out with. Even on a technical basis from a pure serverless perspective, Oracle Cloud is still pretty decent, based upon what I’ve seen. My problem has always been, honestly, their business practices, their salespeople, their approach to a lot of things that makes it unpalatable for certain customers to enthusiastically dive in.

Alan: Yeah, and I think that’s what made us nervous to begin with.

Corey: So, something else that you mentioned in your email to me was specifically around the idea of serverless development, and I find that fascinating on a couple of levels. The first is that in your parenthetical, you said—right after serverless development—“Not very well,” which is universally true every time I see someone start to work with serverless. It’s, “Are you using serverless?” “Oh, yeah. But we’re really bad at it.

We’re not using it properly.” And I’ve come to the conclusion that if that’s what people think about their own use of serverless, they’re probably doing it right because everyone feels like it’s unfinished. It’s weird. You’re misusing it in a bunch of different ways. But that’s exactly how everyone uses it. Tell me more about that.

Alan: Yeah, I guess so. I mean, there’s always more services to use in AWS. But yeah, we started as, I guess, most people start in AWS: writing applications that run on EC2, and then decided to tear that up and try it out with containers, and then we decided that that wasn’t good enough, so let’s go and try out Lambda.

Corey: And how did that experience unfold for you?

Alan: Yeah, so it’s a massive learning curve for anyone. So, I think I can pick stuff up quite quickly, but when you’re bringing a whole development team along with that, then there’s always going to be gaps in people’s knowledge and things people know better than other people. And so it’s just really hard to get everyone on the same level. And especially because we did, basically, a change of language. We’re typically Java developers, and now who wants to run Java in a Lambda? I don’t think anyone does. So, we now almost exclusively do TypeScript. So yeah, there’s, as I say, a learning on lots of different levels there.

Corey: One thing that seems to be true across the board with serverless, is that it may be one of the first, shall we say, bleeding-edge futuristic-looking technologies that's seeing greater adoption in the enterprise than it is at smaller scale. Is that something that you’re starting to feel yourself? What drove you folks to start looking at serverless, instead of a number of other approaches, be it containers, or something else?

Alan: I suppose it’s the promise of not having to look after infrastructure. We’ve got a number of infrastructure teams at Hitachi Capital, and it always seems to take a long time, or there’s a lot of roadblocks in server provisioning, and security hardening. And we just wanted to sidestep that, really, and go for the managed offering.

Corey: There really is something to be said, for letting the provider handle these things for you. There’s pushback from more traditional ops side of the world where it’s, “Well, at that point, you’re just letting your cloud provider dictate your availability.” Well, yeah, no kidding. You always are. With serverless, you’re just being a little more upfront about acknowledging that.

Alan: Absolutely. Yeah.

Corey: You said that the learning curve is steep. And you’re right, I keep saying that things like Lambda functions and their equivalent on other providers are inherently platforms that are defined by their limitations rather than by their capabilities. And honestly, I’m kind of on board with that, even though we start to see those limitations relaxing on a year-over-year basis, where it forces you to rethink about how it is you’re approaching these types of things. I find that in my experience dealing with enterprises, one of the first things to get Lambda-fied, for lack of a better term, is a cron job somewhere living on a job server where sometimes that job server fails—did it run? Didn’t it run?—being able to effectively shove that into a Lambda function almost feels like step one, where people start to get their feet wet. Does that align with how you approached it? Or did you come at it from a different angle?

Alan: I guess we came at it from a slightly different angle because we were looking to build applications on top of—so we have APIs that have a Lambda back-end; they’re usually, sort of, fairly dumb Lambdas. They’ve passed through to some third-party back-end somewhere, you know, doing a bit of mapping or transformation, a little bit of business logic, perhaps. But the bulk of it is just listening to a web frontend, calls a back-end API that has some Lambda components.
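
To make the "fairly dumb Lambda" pattern Alan describes more concrete, here is a minimal TypeScript sketch of a handler that passes a request through to a back end and does a little mapping on the way out. Everything in it is an assumption made for illustration: the event and result shapes are simplified stand-ins rather than the full AWS types, and the customer fields, `toCustomer`, and `makeHandler` are hypothetical names that do not come from the conversation.

```typescript
// Simplified, hypothetical shapes for an API Gateway proxy-style event and result.
interface ApiEvent {
  pathParameters?: Record<string, string | undefined>;
}

interface ApiResult {
  statusCode: number;
  body: string;
}

// Hypothetical record shape returned by the third-party back end.
interface BackendCustomer {
  cust_id: string;
  full_nm: string;
  bal_pence: number;
}

// Hypothetical shape the web front end expects.
interface Customer {
  id: string;
  name: string;
  balance: number; // pounds
}

// The back-end call is injected so the handler can be exercised without a network.
type FetchCustomer = (id: string) => Promise<BackendCustomer | undefined>;

// Pure mapping step: rename fields and convert pence to pounds.
function toCustomer(raw: BackendCustomer): Customer {
  return { id: raw.cust_id, name: raw.full_nm, balance: raw.bal_pence / 100 };
}

// Build a Lambda-style handler around a given back-end client.
function makeHandler(fetchCustomer: FetchCustomer) {
  return async (event: ApiEvent): Promise<ApiResult> => {
    const id = event.pathParameters?.id;
    if (!id) {
      return { statusCode: 400, body: JSON.stringify({ error: "missing id" }) };
    }
    const raw = await fetchCustomer(id);
    if (!raw) {
      return { statusCode: 404, body: JSON.stringify({ error: "not found" }) };
    }
    return { statusCode: 200, body: JSON.stringify(toCustomer(raw)) };
  };
}
```

Keeping the mapping in a pure function and injecting the back-end call is one possible design choice; it lets the transformation be unit-tested on its own, which may help a team that is still climbing the serverless learning curve.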

Corey: That’s functionally how I got into it at first, as well. I started off with a shell script that was great and all, to generate my weekly newsletter, and it was, yeah, what happens if I’m not in front of a box that has a console? I wanted to turn into something that vaguely looked like a web app, ideally that I could use from an iPad out on the road. And in time, it was, “Huh, maybe a Lambda function becomes the right answer here.” And it almost certainly wasn’t, but the way I misused it, it became the right answer and it sort of snowballed away from there.

But what you say about feeling like you’re doing it not very well absolutely resonates. Because the entire time I was doing this, I’m sitting here with the, “I am really not using this in the way it was imagined being used.” But I’ve never met anyone who feels differently.

Alan: Yeah, I think the range of different frameworks, as well, is quite daunting. So, you can either use AWS’ SAM framework, or you can just use the APIs directly, or there’s now the serverless framework, or there’s Architects, or a number of different other—

Corey: The CDK, Chalice, Apex, you can keep going on, and on, and on—

Alan: Exactly.

Corey: —and the problem is, if you start following random blog posts, suddenly it’s, “Oh. They’re using a different one. I should start over.” Or you wind up with this ridiculous combination of 15 different things that you’ve gathered from various places, and, “What on holy horror have I built? Well, [laugh] it’s something that works, so here we are.”

Alan: Absolutely. I think we sometimes miss the opportunity to really use some of the power of AWS, so it kind of frustrates me that—because we’ve always done relational databases, we don’t really have the confidence to use DynamoDB, for example, or if we do try and use it, then we kind of hurry back into our comfort zone when things get hard. So, that kind of frustrates me a little bit with serverless applications.

Corey: I can’t shake the feeling that everyone gets confused sooner or later. Speaking of things that are confusing and everyone tends to go in strange directions with, let’s talk about everyone’s favorite tech buzzword in the enterprise world ‘Kubernetes.’ You playing with that at all? Have any strong opinions one way or the other on it?

Alan: Funny you should ask. Yes so, I’ve been actually trying to promote the idea of using Kubernetes in Hitachi. And this may sound odd as well, but as I said before, we’ve got a number of commercial off-the-shelf systems that we use. They’re way too old to actually have heard about Kubernetes. But sometimes you just need somewhere to deploy something.

You know, not everything will run in a serverless platform. Sometimes I just want to deploy a dashboard somewhere or make this application that you can easily download run somewhere in our infrastructure or that includes cloud as well. And I think in that sense, Kubernetes would be a really good fit because we could have a common platform. As developers, they don’t need to care where it’s running, they just point to an API, and then voila, they can see an endpoint and access it without needing to provision hardware, get it certified by InfoSec, and all of that fun stuff.

Corey: This episode is sponsored by ExtraHop. ExtraHop provides threat detection and response for the Enterprise (not the starship). On-prem security doesn’t translate well to cloud or multi-cloud environments, and that’s not even counting IoT. ExtraHop automatically discovers everything inside the perimeter, including your cloud workloads and IoT devices, detects these threats up to 35 percent faster, and helps you act immediately. Ask for a free trial of detection and response for AWS today at extrahop.com/trial.

Corey: I tend to shriek in something approaching horror whenever I see a Kubernetes deployment beginning because invariably it’s aimed at something that is a ‘Hello World’ style of deployment. And it’s, “Oh, dear lord, this is overwrought for what you’re trying to deploy.” The other side of it, though, is that the applications that people are looking to migrate into a Kubernetes environment are also a lot bigger than the Hello World examples that folks use. Are you already running these apps in containers and looking for a different orchestrator, or is containerization part of the Kubernetes experience as you’re approaching it?

Alan: So, we’ve wanted to run a few things in containers for a while, but they’ve never really sat anywhere, so we’ve just got a few services running just in Docker, and if they fail, then well, who knows? Someone will have to go and restart it when they find out. Whereas the behemoth that is Kubernetes would obviously sort that out for us, and hopefully let the support engineers sleep better at night, as long as they don’t have to manage too much of the actual hardware parts of it. So, as I indicated, that’s not where my skills lie; I just let someone else deal with that. And I believe that’s probably where most of the complexity of Kubernetes is. So, maybe that explains my viewpoints.

Corey: I think in the aggregate, my take on it is that if you’re building something greenfield or vaguely close to greenfield, serverless feels like the way to go. But if you’re looking at migrating something pre-existing, then Kubernetes slash containers seems like the path that is the least disruptive to the organization. First, do you agree with that? And secondly, do you think that optimizing for not disrupting the organization is the right path?

Alan: I mostly agree with that. So, I would just add to that, that some things you can’t run in a serverless environment. I know that AWS have managed Grafana and managed Prometheus, but before then, if you wanted to deploy those sorts of services, you had to, sort of, roll it yourself or run some crazy infrastructure to do that. Or just deploy it with a Helm Chart into Kubernetes. So, I think there is going to be a place to package applications that don’t necessarily run in a service or as a service. And sometimes you don’t want to buy things in the cloud as a service. Sometimes it’s more convenient to run them yourselves in your own environment. And I think that’s probably where Kubernetes fits in.

Corey: I think that the cloud providers would shriek and scream at the idea of, “Oh, there’s a workload you think isn’t appropriate for the cloud? Well, you’re wrong, based upon nothing other than what you just said.” And that’s not a [laugh] helpful sentiment. “Cool. I’m not just going to grab petabytes of data and shove it on up there.” Or, “I’m not going to just migrate an entire massive fleet overnight.”

It doesn’t work that way. So yeah, I think there are a number of workloads that, based upon business constraints—internal and external—that you approach from that perspective. I think that there’s tremendous validity to having workloads that are at least capable of running in two places at once, at least during the transition, if not beyond. Now, every time you add a provider to that, it feels like you’re inherently limiting the capability story, whereas if you have something on-premises, great. You probably don’t have an object store on-premises that reacts in quite the same way that S3 does, to give an easy example. Does that align with your understanding, or do you feel differently about it?

Alan: Yeah, sure. So, as you say, we’re not going to have unlimited petabytes of storage in our data center, ever—

Corey: Well, you will, but your budget owner is going to have a minor heart attack [laugh] as soon as you try it the first time.

Alan: [laugh]. Yeah, exactly. So, I think our strategy at Hitachi is cloud-first, but some things we know aren’t suitable for that, so we take a look at where it’s most suited to run.

Corey: So, looking at how you’ve approached it, is there messaging or storytelling that would have made it easier to wind up heading in this direction sooner? Was it a matter of waiting for certain levels of maturity from the provider perspective? Or was it entirely from an internal point of view where there are enough stakeholders that need to get on board with alternate ways of doing things before you can see any real traction?

Alan: So, I think it’s mainly a case of education. So yeah, definitely getting the stakeholders in the company bought into this crazy new way of working, so having this massive AWS bill land every month and not really being sure who’s paying for it, but it gets paid anyway. And also in terms of the developers, obviously, writing their code in a way that’s appropriate to deploy to the cloud, and also the support staff who monitor and look after these applications.

Corey: So, it’s easy for me to sit here and say, “Oh. You should just go ahead and do X, Y, and Z if you want to wind up exploring down this particular path.” But I don’t work at a large enterprise or even a medium enterprise as you describe it, which is still orders of magnitude bigger than my version of big company. So, what are the biggest stumbling blocks that you’ve encountered going down this path, and how would you address them differently if you had to do it all over again?

Alan: Okay. So, one of the biggest issues we’ve had for years is the lack of cloud connectivity. That’s now being addressed, but we’ve had to work around that limitation in quite a few ways. And there’s been various reasons for that. Network infrastructure has changed a bit over the years.

But the lack of cloud connectivity was really a problem in the early days when we were trying to make useful systems run outside of our network, but still, be secure and communicate with the back-end services that we needed in our network.

Corey: So, if you could talk to other folks who are where you were when you started this journey, are there any resources you’d point them to, as far as something that they should understand, something that they should absorb, or just something to help point the way?

Alan: Yeah, okay. So, we’re really trying to change into more of a DevOps culture at Hitachi, and I’m really passionate about how we go about this. And as such, I really love the book Accelerate—which have you had the author on the show? I forget.

Corey: Oh, yes. Dr. Forsgren has been a recurring guest on the show.

Alan: Yeah, I thought so.

Corey: And she’s just spectacular at answering the questions that I wish I had been smart enough to ask in the first place.

Alan: Excellent. Yeah. So, they point out the four key metrics for a high-performing culture, and so I’m always trying to bring those up in team meetings and to the leadership of Hitachi. So: deployment frequency, lead time, change fail percentage, and mean time to restore. So definitely, that book, I think, is really key to understanding why organizations need to look at those metrics, as opposed to measuring lines of code, or hours worked, or anything like that, in terms of getting a high-performance culture.
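
As a rough illustration of the four metrics Alan lists, the TypeScript sketch below computes them from a list of deployment records. The `Deployment` record shape, the `fourKeyMetrics` function name, and the use of simple averages over a fixed period are all assumptions made for this example; they are not taken from the book or from the conversation.

```typescript
// One record per production deployment (hypothetical shape).
interface Deployment {
  committedAt: Date; // when the change was committed
  deployedAt: Date;  // when it reached production
  failed: boolean;   // whether it caused a failure in production
  restoredAt?: Date; // when service was restored, if it failed
}

interface FourKeyMetrics {
  deploymentsPerDay: number;
  meanLeadTimeHours: number;
  changeFailPercentage: number;
  meanTimeToRestoreHours: number;
}

const HOUR_MS = 60 * 60 * 1000;

// Arithmetic mean; returns 0 for an empty list to avoid dividing by zero.
function mean(values: number[]): number {
  if (values.length === 0) return 0;
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

function fourKeyMetrics(deployments: Deployment[], periodDays: number): FourKeyMetrics {
  const leadTimes: number[] = [];
  const restoreTimes: number[] = [];
  let failures = 0;

  for (const d of deployments) {
    // Lead time: commit to running in production.
    leadTimes.push((d.deployedAt.getTime() - d.committedAt.getTime()) / HOUR_MS);
    if (d.failed) {
      failures += 1;
      // Time to restore: failed deployment to service restored.
      if (d.restoredAt) {
        restoreTimes.push((d.restoredAt.getTime() - d.deployedAt.getTime()) / HOUR_MS);
      }
    }
  }

  return {
    deploymentsPerDay: deployments.length / periodDays,
    meanLeadTimeHours: mean(leadTimes),
    changeFailPercentage:
      deployments.length === 0 ? 0 : (failures / deployments.length) * 100,
    meanTimeToRestoreHours: mean(restoreTimes),
  };
}
```

In practice the records would likely come from a build pipeline or an incident tracker, and a team might prefer medians to means so that a single slow release does not dominate the picture.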

Corey: And we’ll, of course, throw a link to the Accelerate book in the [show notes 00:22:29] because we are always fans of what Dr. Forsgren is up to, and any chance we get to wind up putting her in front of more people, we’ll take it.

Alan: [laugh]. Great.

Corey: So, I want to thank you for taking the time to speak with me in as much depth as you have. If people want to wind up learning more about what you’re up to, challenges you’re facing, how you’re overcoming them, where can they find you?

Alan: So, I occasionally Tweet. I’m @alanraison. I’m on the GitHub, also alanraison there. That’s probably about it.

Corey: Excellent. Thank you so much for taking the time to speak with me today. I appreciate it.

Alan: No problem. It’s been a pleasure.

Corey: Alan Raison, DevOps lead at Hitachi Capital UK. I’m Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you’ve enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you hated this podcast, please leave a five-star review on your podcast platform of choice, along with a comment giving me feedback the insulting way.

Announcer: This has been this week’s episode of Screaming in the Cloud. You can also find more Corey at screaminginthecloud.com, or wherever fine snark is sold.

This has been a HumblePod production. Stay humble.
