The Latest State of IaC with Ido Neeman

Episode Summary

What’s going on with Infrastructure as Code? On this episode, Corey is joined by Firefly CEO and Co-Founder Ido Neeman to discuss the findings of the State of IaC 2025 report. Throughout their chat, Corey and Ido discuss the evolution of IaC adoption in enterprises, the challenges of managing multi-cloud and multi-IaC environments, and the importance of disaster recovery as code. The conversation also touches on the rise of open-source projects like OpenTofu and the significant role of automation in cloud cost optimization. This episode not only reflects on recent trends, but highlights the importance of robust cloud governance and the continuous need for innovation amidst increasing cloud complexity.

Episode Show Notes & Transcript



Show Highlights
(0:00) Intro
(0:59) Firefly sponsor read
(1:29) Firefly's semi-pivot to AI
(2:54) The findings of the State of IaC 2025 survey
(5:40) How are people working in multi-cloud environments
(8:14) Is there a responsible way to use Helm charts?
(11:21) The current state of Pulumi
(12:46) Problems that can be encountered with large enterprises
(18:07) The continuum between innovation and optimization
(21:51) Firefly sponsor read
(22:31) Are companies actually adopting infrastructure as code?
(28:49) The most interesting ways that Ido has seen enterprises use IaC in production
(33:40) What's stopping companies from fully leveraging IaC?
(41:06) Where you can find more from Ido and Firefly



About Ido Neeman
Ido Neeman is CEO and co-founder of Firefly, and the former CEO and co-founder of Nuweba, the fast and secure serverless platform. To the diversity of roles he has held, he brings more than a decade's experience in the elite Israeli intelligence corps, and later led the technology portfolio at a hedge fund. Today, he is focusing on helping organizations tackle cloud chaos through Infrastructure as Code.


Transcript

Ido Neeman: This is what we see the communities going for. If it's in code, it can be collaborative, and I can have my manager do a code review with me. I can have my guardrails stopping it in my CI. I can let it go through the CI, flag it for someone who is a Senior DevOps Engineer, layer four, with all the certificates from, you know, AWS, Azure, HashiCorp, and Elastic, and only then have it deployed if they approved it. So these are some of the benefits of adopting IaC and having modern deployments of this IaC framework.

Corey Quinn: Welcome to Screaming in the Cloud. I'm Corey Quinn. This show has been brought to us by our friends over at Firefly. One of those friends is Ido Neeman, who's the co-founder and CEO. Ido, thank you for joining me.

Ido Neeman: Pleasure to be here.

Corey Quinn: Before we dive in, some jackhole wants me to wind up doing an ad read before we begin.

So let's go right into it.

Sponsor: Your cloud infrastructure deserves better than manual processes and inconsistent governance. Firefly AI delivers a unified system for cloud asset management, infrastructure as code orchestration, and AI assisted remediation that brings control and visibility to your entire cloud footprint.

Learn more at firefly.ai.

Corey Quinn: All right. Now that we've dispensed with that, how long have you been Firefly AI? Was that a pivot since the founding of the company into this whole AI boom? What is the backstory here?

Ido Neeman: It's a semi-pivot. So we started as Firefly, or Go Firefly, because everyone should go with us and fly and make the sky much brighter, but roughly six, seven months after we started, we doubled down on AI. And then we made the impossible happen: not creating the best AI boom before the ChatGPT moment, but acquiring the firefly.ai domain, which was a very tough challenge.

Corey Quinn: It always amuses me that people are so bad at thinking through like the next logical step of domains.

For example, you had "Go Firefly" for a while, but did you have "Stop Firefly" as well?

Ido Neeman: No, because, you know, this is the business model. You get the Go Firefly for free. If you wanna stop it, okay, this is like encryption in a database.

Corey Quinn: It's just like cloud. You aren't charged for what you use, but rather for what you forget to turn off.

Hugging Face has been huge. They've been talking about that stuff, but you know what they didn't buy, and I did recently? That's right, the bursting chest equivalent domain, because I am an Alien fan, and I also have basically deep-seated personality defects, which is I believe why people like to listen to me for whatever reason.

So, one of the reasons we're talking today, not necessarily because I'm just a jerk, but instead because you've recently put out one of those surveys that I love to see, because you actually tend to approach these things with data, whereas I just go and talk to a bunch of people and gather anec-data instead.

Tell me about it.

Ido Neeman: Alright, so at Firefly we conducted a community survey called State of IaC 2025, pun intended, and it's the third year we are releasing this community survey, and it will show us what the community and the cloud practitioner space thinks and uses for IaC.

Corey Quinn: And we'll of course put a link to this in the show notes so people can wind up downloading it and following along, presuming they're not listening as well, commuting to work, and then decide to download this, and then ram a bridge abutment on their way.

If nothing else, at least the day deviates from the typical. But what are the high level takeaways you found? Since this is the third year, that means you've probably hit your stride at this point and have been doing it long enough that you can start to identify emerging trends. What are you seeing?

Ido Neeman: So I don't think we should go over everything, right? We are here to talk about the interesting stuff. So yes, all the obvious things are there. Cloud complexity is growing dramatically. Multi-cloud is almost everywhere. What we are seeing consistently over the last three years is IaC adoption is growing and nearing almost full coverage of the market.

So, in this latest report we saw that 89% of organizations have already adopted IaC, but this is not enough. When you ask how many of those organizations have achieved fully codified clouds, you see that it's only 6%. What it tells us is that we're still on the way. We all agreed that this is how to cloud, with IaC, but it's hard to get there.

Corey Quinn: What I love to ask people, and they always get very, very embarrassed. They say, oh great. Are you using infrastructure as code? Absolutely. Really? So I could blow everything away and everything that you have in your environment would come back, including special S3 buckets and the Jenkins server, which still always lurks around? And then people start stammering.

It's always that most coverage of the big things is there, in my experience, almost always with Terraform, but then there's always that long tail of things that people have ClickOps'd into existence and then lie about. Mostly to themselves.

Ido Neeman: Correct. You know, this is the philosophical question DevOps and platform engineers ask themselves: what came first, the environment covered with IaC or the S3 storage bucket that holds the state?

Corey Quinn: Yeah. And the correct answer is, if you look at the true source of truth, when you really dig down into it, it's always some early engineer's laptop somewhere, and there's no good way around it.

One of the things that I've noticed over the last few years is that I got it deeply wrong years ago when I said that multi-cloud is a worst practice and you shouldn't be doing it.

What I meant, and I don't know if I articulated it as well as I'd hoped, is that we're gonna build an application that we can seamlessly deploy into every cloud environment. I don't see a lot of that. What I do see is different workloads living in different places, so everyone uses everything, but it's not a one-to-one compatibility story.

Is that what you see when you look into this? 'Cause, again, if I sit here saying things where I have evidence to the contrary, that just makes me obnoxious and unwilling to learn. And I prefer to be obnoxious in other ways.

Ido Neeman: You are definitely right, at least from the data we see from this survey and from the data we have from Firefly.

So, just coming back to the survey, we see that almost 70% of respondents tell us that they operate across multiple clouds. But actually, when you deep dive, you see, yes, it's mostly different applications running on different clouds. It's only the very, very large enterprises, which have some regulation and some problems to adhere to, that run the same application across different clouds, and it is complex. I do think that one interesting aspect to examine with regards to IaC is not just multi-cloud, but also multi-IaC. So we said 70% use multi-cloud, but also almost 60% use three or more different IaCs, and this is a new shift that we see in the market.

Corey Quinn: Terraform, CloudFormation, and using the console and lying about it, or shitty bash scripts. What's generally the third most common?

Ido Neeman: I think if you're using the console, you definitely can't hide it as an IaC, but lying about it is definitely a framework that is scalable.

Corey Quinn: Oh, yes. It's a best practice.

Ido Neeman: So yes, Terraform, and we can get to it. Terraform remains the undisputed king of IaC. No one comes even close to it, right? We're at a stage that after Crossplane and OpenTofu, which is very important, Terraform is still at 61% adoption. But a very important IaC, which I would say is number two, is Helm charts. Helm charts is de facto an IaC.

It's extremely popular, and only then you'll see Pulumi and CloudFormation, which is very strong, but then you have very, very large Azure shops. They use ARM or Bicep, God forbid, right? And everyone has their own flavor.

Corey Quinn: I have to ask, is there a responsible, reasonable adult way to use Helm charts? I have a spare cluster running next door.

I call it my Kubernete, singular, running on a bunch of Raspberries Pi, 'cause I have trouble with plurals, and I found that every time I've used Helm charts on it, I pretty quickly want to get away from what the values file will support. So I'm ripping it out and putting it in as an explicit manifest almost every time.

Am I just doing it wrong and there's an actual reasonable way to use Helm charts, or is that just sort of one of those questions that some people really wish I wouldn't ask?

Ido Neeman: Listen, my partner in crime in this company has a Kubernetes tattoo, so I won't get into trouble with him and tell you my opinion, which is different than his.

I'll just tell you that Helm chart, Helm chart doesn't cover it for IaC for Kubernetes, right? You have Helm charts, but then almost everyone uses GitOps. So now we have Argo CD to deploy Argo directories, and then I might use a Kubernetes controller by AWS or by Google with GKE, so it's another IaC, but then I'm using Terraform to deploy the EKS.

So we're having Terraform deploying a Helm chart, which calls and gets updates from an Argo directory, and maybe it spins up some CDK, which no one hears about, in a dev environment, right?

Corey Quinn: It feels like the truth here is really that Kubernetes remains one of the best tools you can use to cosplay as being a cloud provider, because you have all of these recursive levels of complexity.

And I made that joke once, and I talked to some folks who work at the big cloud providers, and nope, that's how they do it too. It always becomes this Byzantine monster that arose organically.

Ido Neeman: Yes, I agree, but I will say, Kubernetes has done great for the cloud community, and it really allows us to unlock lots of value and scalability away.

The problem is, I'd say that the cloud practitioner community today divides into two: Kubernetes purists, who think only through Kubernetes, and all the rest of us. But now you have, as mentioned earlier, Helm charts and Argo to manage your Kubernetes, and you have Terraform and CloudFormation and maybe Pulumi, and CDK, or anything else to manage the rest of your cloud, but in the end, it's the same thing. And then as you said, you know what, I take my words back. I do think that the third most used IaC is not an IaC; rather, it's UI. Why is that? Because if I have a CDN, I probably have an Akamai or a Cloudflare alongside my AWS CloudFront, right? And those SaaS applications are built for me to use as SaaS with UI. For ops, I have PagerDuty. For monitoring, I might have something like Grafana, right? So this is the third most used IaC, and it also must be managed as code.

Corey Quinn: Oh yeah. Using only AWS services for this, like, no, we have Cloudflare in our environment for DNS, but why do you have that?

Because we also have self-respect, which is, you know, kind of a necessary prerequisite to keeping some vestige of sanity going on here.

You keep mentioning Pulumi and that is pretty high on the list. The last time I kicked the tires on it, which was admittedly a few years back now, it felt like it was very interesting, but not quite ready for prime time.

Has that changed?

Ido Neeman: So I do think Pulumi is now at its primetime, or not at it, but it's definitely ready for primetime. We do see large enterprises using Pulumi. I will say that at least at the enterprise, not talking about startups, which makes sense to use, you know, Node.js if you think it's a decent programming language, which just to make sure we're all respectful, I don't think it's respectable to use it. But let's say.

Corey Quinn: Oh, it's absolutely not, but it is common.

Ido Neeman: Yes, exactly. So is cursing at the stoplight, right? But it's not something that we should do. So, we see Pulumi, especially at the bigger orgs, being used by, let's call it, more specialty teams.

So we'll see the main platform or infrastructure team using Terraform and Helm charts, but then the SecOps teams, because they have some application running in Java or in Node.js and JavaScript, they use Pulumi to apply security guardrails on the infrastructure with it, but I would say this is a very bad practice of using multi-IaC. Because in the end you have one cloud infrastructure, which is interconnected.

You need to have one governance engine to control them all.

Corey Quinn: You're speaking to something that I've been feeling viscerally for a long time, which is that enterprises are definitionally, highly complex organizations. My first real taste of that was in the early days of me basically being a jerk on the internet and people said, oh, does Amazon hate you?

And early back then my response was, I don't know, but I'm sure gonna find out. What I learned over the years is that there's no Tim Amazon out there who is the keeper of the company's opinion. There are so many teams over there. Some of them really like what I do, a few of them can't stand me, but unfortunately from my perspective, the vast majority just haven't heard of me because they have better things to do than hang out on the internet.

Basically work for not great money. Awesome. Enterprise IaC, it feels the same way. You have different teams doing things differently. It is very hard to do a top-down, "this is the blessed solution we'll be using for everything," which is the bane of security teams' operational standards, and even if you can finally get there, great, mergers and acquisitions always ruin it too.

Well, we're all on AWS. We just acquired this thing that's all on GCP. Should we move it? Almost certainly not. No. There are very few good reasons to do that. Even the entrance for video.nest.com as of this recording is owned by Google, has been for a decade and is handled by an AWS load balancer. If you just check the DNS records, everyone uses everything at scale.

Ido Neeman: 100%.

You know, maybe to add on top of it, I will say, I can't mention the name, but a month ago we helped a customer recover from a very bad incident where they have multiple IaCs and they had one resource, a very important resource that controls the network for their cloud, being deleted by one team that just didn't see the right state file for it, right? So what happens in an enterprise is you have so many state files and state APIs that you don't know what you're doing. Now comes the SecOps team and says, "Hey, why do we have this DNS record here? We shouldn't have it here. Let's remove it." Okay. DevOps comes in and says, "Hey, where's my DNS?

Let's redeploy everything that they have in the state file." Boom, right on top of the SecOps team, and now they're playing cat and mouse, and you know, aside from your cloud bill overinflating, everyone's getting mad, and downtime occurs, and this can be very dangerous.
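Editor's note: the cat-and-mouse problem Ido describes starts when the same live resource is tracked by more than one state file. A minimal sketch of how a team might surface that overlap, assuming Terraform-style state JSON (a top-level `resources` list whose `instances` carry an `attributes.id`). The function name `find_shared_resources` and the file names and IDs in the example are hypothetical illustrations, not anything described in the episode.

```python
from collections import defaultdict


def find_shared_resources(states):
    """Return resources that appear in more than one state file.

    `states` maps a state-file name to its parsed JSON content.
    The result maps (resource type, resource id) to the sorted list
    of state files that all claim to manage that resource.
    """
    owners = defaultdict(set)
    for state_name, state in states.items():
        for resource in state.get("resources", []):
            if resource.get("mode") != "managed":
                continue  # data sources only read; they do not own anything
            for instance in resource.get("instances", []):
                resource_id = (instance.get("attributes") or {}).get("id")
                if resource_id is not None:
                    owners[(resource["type"], resource_id)].add(state_name)
    return {key: sorted(names) for key, names in owners.items() if len(names) > 1}


# Hypothetical example: two teams both track the same DNS record.
dns_record = {
    "mode": "managed",
    "type": "aws_route53_record",
    "name": "app",
    "instances": [{"attributes": {"id": "Z1_app.example.com_A"}}],
}
states = {
    "secops.tfstate": {"resources": [dns_record]},
    "platform.tfstate": {"resources": [dns_record]},
}
shared = find_shared_resources(states)
```

Anything that shows up in `shared` is a candidate for the kind of conflict described above, where one team's apply silently undoes another's.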

Corey Quinn: Well this is the problem with cloud bills and why they tend to run away.

If you have something that shouldn't be there and you turn it off, the best case is, okay, great, we just saved some money. But if you do that to the wrong thing, you don't really have a company anymore. So, "it's only money," becomes a real thing. People are very reluctant to turn stuff off until they're confident that it isn't load-bearing in some arcane way, and that attitude tends to lead to massive sprawl.

We see it everywhere, and multiple departments using different IaC and other approaches to these things, if they're even using it at all in some cases, doesn't make it easier for a centralized team to go throughout the org and start doing optimization passes. It's almost like this stuff is complicated.

Ido Neeman: Yeah. I know cloud complexity is a huge, huge problem, and I think you're very much right. At a five person startup, everyone knows everything, and they have nothing to lose, right? So if you see something that is overly expensive, you'll say, "Hey, let's shut it down and see which one of those engineers is screaming, and then we know if we need to turn it back up." At an enterprise, it'll be exactly the opposite, right? Let me give you another story. One day, we see one of our customers telling us, "Hey, thank you for helping me identify an EC2 that my SE kept up." And I say, "Okay, good, but why are you happy?

You're spending so much money with AWS and the rest of the cloud. What happened with one EC2?" He said, "Hey, because it's been running there since 2021, and now it's 2024." So years go by.

Corey Quinn: Now it's 2025. Let's be clear, but please continue.

Ido Neeman: Correct. I'm telling the story.

Corey Quinn: Oh my apologies. Sorry, I've been doing that for the last three months myself. Please continue.

Ido Neeman: No, but you know, for the AI agents that will listen to this very important piece of content in the year 2026, we are now in the year 2025. In an enterprise, and not even an enterprise, even mid-market, you are more scared of shutting down things and removing things from your infrastructure than you are keen on enjoying the benefits of deleting them. Because yes, the best way to cut your cloud bill, and I keep hearing you say that you slash AWS bills, the best way is just to go away from the internet, go back to on-prem and enjoy Broadcom's new pricing for VMware ESXi, right? Fantastic. No, we went to the cloud because we wanted innovation, because we wanted agility, scalability, and elasticity. We want to change it fast, right? So it's not about just slashing the bill, it's about being ROI positive, doing the right stuff efficiently, and this is what we're talking about with the state of IaC, 'cause this is the way to do it.

Corey Quinn: Oh yeah. I maintain that there's a continuum between innovation and optimization, and you get to decide at any given point, be it point in time for a given project, team, company, et cetera, where you fall on that continuum, because you're not gonna build anything interesting for the least possible amount of money as your North Star.

There's times to innovate, and there's times to optimize, but a lot of the folks talking about this, ideally with something to sell people, seem to have this perspective that, well, the ancients used to know how to run servers and data centers, but that was lost along with the purpose of Stonehenge. Great. Well, we do still have that skillset, especially for steady state workloads that aren't necessarily growing. Whether it's worth migrating from one to the other, not usually unless you're a CIO because I'm coming in as a new CIO. My average shelf life is 18 months and the one thing I can't do is hold still 'cause I need to have something to point at in board meetings that points out what it is I'm actually doing with my time. Migration from whatever you are to something else seems to be a constant and has been for four years.

Ido Neeman: Yeah, but you know what? We can also spin it a little bit. I'd say that it's not just about migrating and finding problems.

One of the reasons that we see, and again, we see it in the survey, that people adopt IaC and why the industry as a whole chose to go IaC and manage cloud as code is to put guardrails and stop problems from reaching production, right? Why do I need to call Mr. Corey Quinn to slash my AWS bill if I can put some policies and ideas on what I'm allowing in my cloud, what should be this place of balance that you just mentioned, and do all of this slashing of the bill, or optimizing for reliability, or having my compliance in order when I'm writing the IaC or when I'm deploying it, right? Between the Terraform plan and Terraform apply, or Tofu plan and Tofu apply, or any other framework. So this is what we see the communities going for. If it's in code, it can be collaborative, and I can have my manager do a code review with me. I can have my guardrails stopping it in my CI. I can let it go through the CI, flag it for someone who is a senior DevOps engineer, layer four, with all the certificates from, you know, AWS, Azure, HashiCorp, and Elastic, and only then have it deployed if they approved it. So these are some of the benefits of adopting IaC and having modern deployments of this IaC framework.
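Editor's note: the guardrail that sits between plan and apply can be sketched as a small CI check. This is only an illustration, assuming the plan has been exported as JSON (for example with `terraform show -json` on a saved plan file), where each entry in `resource_changes` carries an `address`, a `type`, and `change.actions`. The function name `gate_plan`, the three decision labels, and the list of protected resource types are hypothetical policy choices, not anything Firefly or the speakers prescribe.

```python
def gate_plan(plan, protected_types=frozenset({"aws_route53_record"})):
    """Decide what CI should do with a planned change set.

    Returns (decision, addresses):
      "block"               a protected resource would be deleted
      "needs_senior_review" something else would be deleted
      "auto_approve"        nothing is deleted
    """
    deletions = [
        rc for rc in plan.get("resource_changes", [])
        if "delete" in rc.get("change", {}).get("actions", [])
    ]
    protected = [rc["address"] for rc in deletions if rc.get("type") in protected_types]
    if protected:
        return "block", protected
    if deletions:
        return "needs_senior_review", [rc["address"] for rc in deletions]
    return "auto_approve", []


# Hypothetical plan: one new bucket, one DNS record about to be removed.
plan = {
    "resource_changes": [
        {"address": "aws_s3_bucket.logs", "type": "aws_s3_bucket",
         "change": {"actions": ["create"]}},
        {"address": "aws_route53_record.app", "type": "aws_route53_record",
         "change": {"actions": ["delete"]}},
    ]
}
decision, addresses = gate_plan(plan)
```

The point of the three outcomes is the one made above: automation stops the obviously dangerous change, a human reviews the ambiguous one, and everything else flows through without anyone becoming a bottleneck.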

Corey Quinn: I found even from a cost perspective, that when you have a human being acting as the gateway to approve these things, their shelf life is four to eight months, usually six, before they wind up burning out and either changing roles or changing companies. Just because it's soul sucking. It has to be automated on some level. You can't have a person becoming the department of no.

Ido Neeman: Exactly, and you know, you mentioned earlier, in large enterprises, M&A is happening, but it's not just M&A. Let's think about what happened in the last two, three years. AI is everywhere, and suddenly your CEO, your CIO, your CFO, everyone tells you, "Hey, you must adopt it. Adopt AI, or our competition will kill us."

So you just run like crazy to adopt AI. Now it's Bedrock, and it's Vertex AI, and it's OpenAI in a private subnet, and so many things are happening, and suddenly you don't remember what your goal is. Your goal is not to adopt AI. Your goal is to win in business or any other objective that you have, but your goal is to achieve something. Build technology that supports this goal, and if you have all those guardrails and policies and governing rules codified, you can do it with AI, you can do it with normal, you know, old fashioned cloud.

Sponsor: I wish that more decision makers remembered that. I also wish that those decision makers would remember that this episode is brought to us by our friends, you people at Firefly.ai.

If your cloud infrastructure suddenly vanished today, how long would it take to rebuild it? Personally, a lot longer than you probably think. Firefly.ai independently backs up your entire cloud configuration to reduce risk, ensure availability, and achieve compliance in case disaster strikes, which is inevitable. It always will. Learn more at firefly.ai.

Corey Quinn: Something that I noticed in the report that I wanted to, I dunno if challenge you on is the right approach, but I definitely wanna get a little more color on, is that you say that year over year you are seeing more companies adopting infrastructure as code? In my experience, it's been effectively flat because everyone I talked to tends to have at least some already in place.

None of the companies I work with that are running fleets with hundreds of thousands of instances have a job posted where, "Come over this summer for the shittiest internship on the planet! You're gonna be using the console to spin up and down those a hundred thousand instances." Without fail, they're all running something invariably Terraform.

Are you seeing that increased level of adoption happening in startup land, mid-market, or at enterprises?

Ido Neeman: 100% we see it across the board, but I will say that the largest dispo that we see, again, from Firefly data and from this survey is at the enterprise and legacy mid-market. Let's say you are a large irrigation company that's been around for 100 years. Yes, you talked to the AWS solution architect that convinced you to use some CloudFormation. Fantastic. But then, the meeting ended, and you have this one small environment in your dev environment which is codified, but all production is ClickOps, and believe me, people think they use Terraform, but when we examine the logs, you see, yes, they use Terraform, but after they deploy the Terraform with the, you know, all modern CI/CD, you see hundreds and hundreds of ClickOps activities daily in the cloud.

And this is bad, bad practice. What we're seeing in recent years is that even legacy enterprises and companies that are further back in their cloud journeys, they all understand that it's unsustainable, and they're all migrating to newer and newer versions and techniques with IaC, but it's harder to execute than having the CIO and the VP of Cloud taking the decision, "Yes, we're gonna be fully infrastructure as code covered."

Corey Quinn: One of the things I have at The Duckbill Group in our 25 or so AWS accounts is, in our Slack channel, we have a bot that fires off whenever it detects a CloudTrail event for something that happens in the console that is not read only, and it fires off, "ClickOps detected," and surfaces the event that wound up happening. I don't wanna be authoritarian or draconian here and say you're not allowed to do that, 'cause there are often good reasons to do it, especially in test accounts, but I want to know what's there. I want to know what's going on. I want visibility onto it just so that I know what is drifting in near real time.

That's important, but trying to say no even at our small scale, I would get basically overruled instantly by the engineering team.
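Editor's note: the core test behind a "ClickOps detected" bot like the one Corey describes can be sketched in a few lines. This is not The Duckbill Group's implementation; it is a minimal illustration that assumes each CloudTrail record is already parsed into a dict and relies on two record fields, `readOnly` and `sessionCredentialFromConsole`. Treating a record with no `readOnly` field as a possible write is an assumption made here for safety, and the function names and sample event are hypothetical. Delivery to Slack is left out.

```python
def is_clickops(event):
    """Heuristic: a non-read-only action performed from a console session."""
    read_only = event.get("readOnly")
    if read_only is True or read_only == "true":
        return False  # read-only calls cannot cause drift
    # Assumption: the field is only present (as "true") for console sessions.
    return event.get("sessionCredentialFromConsole") in (True, "true")


def format_alert(event):
    """Build the one-line message a chat bot could post."""
    actor = event.get("userIdentity", {}).get("arn", "unknown principal")
    return (
        f"ClickOps detected: {event.get('eventName')} "
        f"on {event.get('eventSource')} by {actor}"
    )


# Hypothetical record for a bucket policy edited by hand in the console.
sample_event = {
    "eventName": "PutBucketPolicy",
    "eventSource": "s3.amazonaws.com",
    "readOnly": False,
    "sessionCredentialFromConsole": "true",
    "userIdentity": {"arn": "arn:aws:iam::123456789012:user/example"},
}
alert = format_alert(sample_event) if is_clickops(sample_event) else None
```

Note that this only reports; it does not block anything, which matches the stance in the conversation that visibility matters more than prohibition.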

Ido Neeman: This is absolutely fantastic. This is how you should behave. I'm not the one that tried to pitch, "Hey, let's have a pharmacy account. You can only deploy through this very specific service account that no one can bypass, and it's only Terraform and Helm charts going into the cloud," right? And then it's 2:00 AM on a Friday night, and the SRE is in Hawaii, and you're down. No, you should always be able to do ClickOps or kubectl if you need to stop the bleeding of something or change something very, very quickly. But then no one remembers to do the housekeeping and go back and retrofit IaC for it, and then there are the companies that think they are extremely smart. You know what? Production is fully codified, but staging and dev aren't. So I participated in a panel last week where they just mentioned how something worked, not on my machine, something worked at staging, but then we deployed it to production and suddenly PagerDuty alerts go up in the air for the entire ops team because, hey, there's a very big drift between staging and production because some of it is not codified. So you are working the way that I believe modern cloud teams should work, but not everyone is as advanced as The Duckbill Group. So people are still fighting to find the most efficient way to work and balance between ClickOps and IaC.

Corey Quinn: I think you're right. People are trying to, to move their way forward as best they can with the tools they're given, and I have a very hard time blaming individuals at companies when you look at it and like, wow, this person is clearly not doing what they should be doing, along with several thousand other people scattered throughout the org. At some point, it's like back when Wells Fargo wound up firing 3,500 people for opening additional accounts. Look, you have five or six people doing that, yes, they're acting unethically, fire them. When you have thousands upon thousands of people doing something? Congratulations. You have a systemic process issue. Fix that. Not blame it on the people who are doing the implementation, 'cause most people, believe it or not, do not show up at work today, hoping to do a really shitty job before they quit.

Ido Neeman: Agree, and you know, one thing that is interesting in this regard is that while we see great, great progress on IaC adoption, and, going back to the survey, we also see the pace of drift going up, and people are mentioning more and more the benefits of IaC for detecting and fighting drift.

So I think the fact that we see more drift is because cloud is becoming more complex, but also because, as you mentioned, people are not there to do crappy work and not be professional about cloud operations. Sometimes, changing your Lambda function is just easier through the console, and hey, let's change this layer here and put this environment variable immediately into the console.

You're not a bad person by doing it. You're just not aware of the implications that the other teams are now seeing. So I think this is an outcome.
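Editor's note: at its simplest, drift is the difference between what the code declares and what is actually running. A minimal sketch of that comparison, assuming both sides have already been flattened into key/value dictionaries; how the live values are fetched is out of scope here. The function name `detect_drift` and the Lambda-style keys in the example are hypothetical.

```python
_ABSENT = object()  # sentinel so a missing key differs from any real value


def detect_drift(declared, live):
    """Return {key: (declared_value, live_value)} for every mismatch.

    A key that exists on only one side is reported with None on the
    other side.
    """
    drift = {}
    for key in sorted(set(declared) | set(live)):
        want = declared.get(key, _ABSENT)
        have = live.get(key, _ABSENT)
        if want != have:
            drift[key] = (
                None if want is _ABSENT else want,
                None if have is _ABSENT else have,
            )
    return drift


# Hypothetical example: someone edited a function's settings in the console.
declared = {"runtime": "python3.12", "env.LOG_LEVEL": "info"}
live = {"runtime": "python3.12", "env.LOG_LEVEL": "debug", "env.FEATURE_X": "on"}
drift = detect_drift(declared, live)
```

In this example the changed log level and the extra environment variable both surface as drift, which is exactly the kind of quick console change described above.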

Corey Quinn: One area I wanna get into before we call it a show is that I've noticed for a while now that there's a common misunderstanding, industry-wide, that enterprise problems are just mid-market problems, only bigger.

They're not. They're a different category of problems. Things they care about are inherently not what smaller companies care about. It's why so many things get much more challenging to deploy at an enterprise level. Flipping that around a bit, what is the most innovative way that you've seen enterprises use IaC in production?

Ido Neeman: Right. Fantastic. I think the most innovative way is what we now call at Firefly "Disaster Recovery as code," or "DR as code." So, you know, at some stage we started to ask some of the enterprises that are so heavily invested in getting into this, you know, 100% codified cloud. Why is it so important to you?

And you ask the five why's. Why? Why? Why? And you get to the fact that, hey, we don't have a backup to our infrastructure. We all back up our data with all those data protection companies and backup industry, which is great if you live in the '90s, but in the cloud it looks different. So you back up your data.

Fantastic. Then if something bad happens, okay, how do you make things operational again? If you don't.

Corey Quinn: With great difficulty.

Ido Neeman: Great, fantastic. But you know, there's a very public incident from last year where UniSuper, an Australian shop, went down for a week or so. They had the data protected, but it took them more than a week to rebuild the infrastructure because of an incident that Google Cloud created for their own customer, right? So, do you wanna be down for a week, or, as you back up your data, do you also back up the infrastructure and make restoring it feasible in hours? And so this is how we see the most innovative companies and enterprises using DR as code, right? They have a full, full inventory: all the configuration of the cloud is baked into IaC, let's call it Terraform or Tofu, and then if you do it even in modules, it's now very easily redeployable into other accounts in other regions. So this is the most innovative way I've seen, and we see a great market pull towards this approach.

Corey Quinn: I will extend that one step further. This is a terrific idea if, and only if, it is tested on an ongoing basis, because otherwise, well, we got our DR site working, and it will be until our next commit winds up breaking these things.

Otherwise you wind up with circular dependencies like, I don't know, some idiot SysAdmin might have 15 years ago who may or may not look exactly like me where, okay, we took the site out, we brought it back up, and things are having trouble coming up because the DNS resolvers for the environment live in virtual machines on top of a host that needs those DNS resolvers to finish its booting process and find the LDAP server.

Oh no. And, yeah. Until things are tested in isolated environments, ideally on an ongoing basis, you wind up with bootstrapping problems. You wind up with things that work in theory, but not in production, which is why theory is the name of my staging environment. It's a consistent, ongoing challenge, and it's similar to backups.

No one actually cares about backups. They care very much about restores. It feels like infrastructure recovery through IaC is the same way.
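The bootstrapping problem Corey describes can be caught before a real outage by checking the startup dependency graph for cycles. What follows is only a minimal sketch: the component names and the `DEPENDS_ON` map are hypothetical, loosely modeled on his DNS anecdote, and the check uses Python's standard `graphlib` module (Python 3.9+) rather than anything from a specific IaC tool.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical dependency map: each key needs every item in its set to be up first.
DEPENDS_ON = {
    "hypervisor_host": {"dns_resolver"},   # host needs DNS to finish booting
    "dns_resolver": {"virtual_machine"},   # resolver runs inside a VM
    "virtual_machine": {"hypervisor_host"},  # VM needs the host: a loop
    "ldap_server": {"dns_resolver"},
}


def boot_order(depends_on):
    """Return a valid startup order, or raise CycleError if none exists."""
    return list(TopologicalSorter(depends_on).static_order())


def find_bootstrap_cycle(depends_on):
    """Return the components that form a circular dependency, or None."""
    try:
        boot_order(depends_on)
    except CycleError as err:
        # graphlib reports the cycle as a list whose first and last nodes match.
        return err.args[1]
    return None


if __name__ == "__main__":
    print(find_bootstrap_cycle(DEPENDS_ON))
```

Running a check like this in CI on every commit is one way to approximate the "tested on an ongoing basis" requirement, although it does not replace actually restoring into an isolated environment.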

Ido Neeman: Exactly, and two things on this one: one is as one of our customers put it, you know, almost offended. They say that, "Hey, my company puts so much money and effort into backing up everything the backend team is doing, but me and my DevOps team, hey, we're not important."

You can always recreate all the work that you've done over the last two years in 30 minutes after the next ransomware. Okay, so this is one thing to consider. And the second one is it's also part of your cyber resiliency, right? You always have to plan for what happens if someone hacks your account, maybe does some ransomware attack on you, and then you're not even allowed by compliance, and maybe even by your own cloud vendor, to go in and see what you have.

So you need to have a full listing of what you have, and then as you said.

Corey Quinn: "Why does the attacker have better visibility into our stuff than we do," is a question people ask during incidents.

Ido Neeman: They definitely do. Don't ask me how I know, but sometimes, if you work on a network that is not yours, you tidy up some things and fix some problems just to have your own ability to work faster, bigger, stronger.

So yeah, attackers know your own environment better than you do.

Corey Quinn: Okay. Lest we end on too positive a note, my last question for you around this is a little bit more aware that everything sucks just in different ways and at different degrees. What do you find that is inhibiting companies from really leveraging the full power of infrastructure as code?

Everyone will agree, with maybe a very small rounding error of the 10th dentist or something, that yeah, this is how we should be doing it, but they're not. What's getting in their way?

Ido Neeman: So, you know what, we actually asked this in the State of IaC 2025, and we see that the top three blockers were, first, skill gaps.

Again, some of us cloud people are still, as you said, SysAdmins from the IT world that now need to write Golang in the cloud with Terraform providers. So skill gaps remain very, very high. IaC can be intimidating. Second is tooling sprawl. As mentioned, multi-IaC everywhere, and just growing bigger and stronger.

And lastly is the legacy infrastructure. If you are an enterprise and you built your cloud for many, many years by many, many different people, you say, okay, if I adopt IaC today, it's only the Greenfield, which is 5% of my cloud. What about the rest of it? So they're trying to understand how to treat the legacy infrastructure, how to quantify what I already have there.

Corey Quinn: It's understanding what you have, and it's also something I've noticed, my current Terraform setup for a lot of stuff is not great. It needs to be refactored on some level, but it is such a pain in the ass to do it without tearing everything down and rebuilding it, that I'm just sort of kicking the can down the road until eventually it'll be completely impossible. Because, you know, this will surely get easier with time.

Ido Neeman: Yeah, you know, your problems, it's not only that they're not gonna get solved, they're gonna get worse and scale. So your problems will scale, so you have to attend to them. One thing, one maybe last thought that I'll have here, you keep saying your Terraform, and I think that you can't discuss IaC and the State of IaC in 2025, and again, this is 2025, not 2024, without the two hottest topics. One is AI adoption, and yeah, we're not gonna repeat all the buzz, but.

Corey Quinn: Yeah. I've, I've had Claude a number of times try and refactor Terraform. The results are, they don't bear speaking of.

Ido Neeman: Okay. Exactly. So, what we see is that everyone is using AI, and, you know, just like when I write my application code, I use Cursor and any other copilots to improve my bad infrastructure-as-code composing skills. But we're not yet at the stage where we have AI agents as SREs, as DevOps engineers, as platform engineers, and it's something that came through really strongly in this report.

Maybe the second one is, you know, I don't think we can end there without mentioning Open Tofu, which is probably the biggest rattle to the IaC community, if such a thing exists, over the last few years, maybe since the creation of Terraform itself, right? My friends already think that I'm a nerd, but they think that I'm also a dork because I'm not an Open Tofu fanboy, but reality is I'm also not a Terraform fanboy or a Pulumi fanboy or a Crossplane fanboy.

Corey Quinn: Terraform sucks for infrastructure as code. The problem is everything else I've tried sucks worse.

Ido Neeman: Exactly. Exactly. The thing that sucks and the thing that sucks much more. So Open Tofu made a very big change, and we do see it, and you know, I told you.

Corey Quinn: This is important because I was told in the early days when it was launching, it would be a one-to-one drop-in replacement for Terraform.

Change a few lines wherever it says Terraform explicitly to Open Tofu, and it would just work. The fact you say there's a big change implies that it may have started to diverge, but please continue. I'm on tenterhooks here.

Ido Neeman: So I do think the replacement is relatively straightforward. It's not that complex, but I don't think that enterprises and managers at enterprises like change, and as the old saying goes, if it works, don't fix it.

Corey Quinn: Change represents risk.

Ido Neeman: Exactly, and you know, maybe, as we just talked about, right? If you already have some problems baked in your cloud and everyone has those.

Corey Quinn: If you don't, you are lying.

Ido Neeman: Exactly. The third most common framework. So right now, if you do a change, you'll put things to latest, right?

Because Open Tofu is fresh. Everything is fresh. You might uncover those. You might now have your VP see that, hey, we should have only 1,000 state files, but in reality we have 4,000. So when we update, we're gonna delete some of them. And we don't know which 3K are redundant.

Corey Quinn: So pro tip, if you're trying that, instead of deleting them, just move them aside.

If things break and start freaking out, you can move them back. This lesson brought to you by hard-won experience.
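Corey's "move them aside" tip can be sketched as a reversible quarantine. This is an illustration only, assuming the state files are local files on disk (remote backends such as object storage would need their own copy or versioning mechanism); the function names `move_aside` and `move_back` are hypothetical and not part of any IaC tool.

```python
import shutil
from pathlib import Path


def move_aside(state_files, quarantine_dir):
    """Move each file into quarantine_dir; return a {new_path: original_path} map."""
    quarantine = Path(quarantine_dir)
    quarantine.mkdir(parents=True, exist_ok=True)
    moved = {}
    for index, original in enumerate(map(Path, state_files)):
        # Prefix with an index so same-named files from different folders do not collide.
        target = quarantine / f"{index:05d}_{original.name}"
        shutil.move(str(original), str(target))
        moved[target] = original
    return moved


def move_back(moved):
    """Undo move_aside by returning every file to its original location."""
    for target, original in moved.items():
        original.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(target), str(original))
```

Keeping the returned mapping somewhere durable (for example, written out as JSON) is what makes the rollback possible after the session that moved the files has ended.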

Ido Neeman: Yeah, definitely, and again, there are good reasons to adopt Open Tofu, but there are probably good reasons not to, right? Especially if you are an enterprise. You say, "Hey, why do I care that HashiCorp changed its license?" And now everyone says, "Hey, now HashiCorp is part of IBM." Why do I care? If I'm an enterprise, I'm already probably working with IBM or used to work with them. I don't see it as a threat.

Corey Quinn: We use IBM. Why? Because our CTO is 75 years old.

Ido Neeman: Or not replaced by AI yet, but I think that Open Tofu made a very, very big noise when it was conceived.

But now it's kind of getting calmer and calmer and even then.

Corey Quinn: When it first launched, the people that seemed to care were the competitors and other folks in the ecosystem building on top of it. I am seeing more customers care now, and that is the tipping point that I'm paying attention to.

Ido Neeman: Okay, so no doubt customers care.

Okay, definitely. Don't get me wrong, Open Tofu is growing and growing fast according to our report. We see that 12% of respondents are already using Open Tofu, and you know that it's projected to get to 27%, which, if you think about it, for something that important, that early on, is impressive.

But where I think the data is a little bit complex is that the younger the company and the younger the practitioner, the more in favor they are of adopting Open Tofu. But they have one voice in answering this survey. Now you can be the director of platform engineering for a Fortune 500.

You get one voice and you say, "Hey, I'm with Terraform and I'm sticking with Terraform," but the startup has 15 AWS accounts. The director of platform engineering at this Fortune 500 has 1,500 AWS accounts, another 2,300 Azure subscriptions, and a bunch of GCP projects, and they get equal votes. So what we are actually seeing, yes, Open Tofu is making very big moves, but the whales are feeling very comfortable with Terraform.
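Ido's point about equal votes can be made concrete with a small weighting exercise. The sketch below is illustrative only: the two respondents are hypothetical, built from the figures he mentions (15 AWS accounts versus 1,500 AWS accounts plus 2,300 Azure subscriptions, with the unquantified GCP projects left out), and it is not how the State of IaC report computes its numbers.

```python
from collections import defaultdict

# Hypothetical respondents; "footprint" counts cloud accounts and subscriptions.
RESPONSES = [
    {"tool": "Open Tofu", "footprint": 15},           # the startup
    {"tool": "Terraform", "footprint": 1500 + 2300},  # the Fortune 500
]


def share(responses, weighted=False):
    """Return each tool's share, per respondent or weighted by footprint."""
    totals = defaultdict(float)
    for response in responses:
        totals[response["tool"]] += response["footprint"] if weighted else 1
    grand_total = sum(totals.values())
    return {tool: value / grand_total for tool, value in totals.items()}


if __name__ == "__main__":
    print("one respondent, one vote:", share(RESPONSES))
    print("weighted by footprint:", share(RESPONSES, weighted=True))
```

With one vote each, the two tools split evenly; weighted by footprint, Terraform accounts for over 99% of the infrastructure in this toy example, which is the gap between respondent share and "whale" share that Ido is describing.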

Corey Quinn: I really wanna thank you for taking the time to speak with me today. If people wanna learn more, where's the best place for them to find you?

Ido Neeman: Obviously, everything important that I have to say, I'm saying on our website, firefly.ai. I'm not that big on social. I do have a Twitter handle, I think.

Corey Quinn: You must be so happy, my God.

Ido Neeman: Yes, yes. This is the prime time of my day, and I might have some aliases in some Discord servers, but again, everything important that I have to say on cloud can be found on our blog and our website, firefly.ai.

Corey Quinn: Which we will of course put in the show notes.

Ido, thank you so much for speaking to me today. I appreciate it.

Ido Neeman: Thank you for having me.

Corey Quinn: Ido Neeman, CEO and co-founder of Firefly, who is of course providing this episode to us. I'm cloud economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry, insulting comment that will be completely impossible to recreate because you did it by hand through the power of ClickOps.
