Episode Summary
Episode Show Notes & Transcript
The discussion also covers Intercom’s volunteer-based on-call system, their pivot to AI agents after ChatGPT’s launch, concerns about the shrinking pipeline of systems engineers, and how companies like PlanetScale and Snowflake are outpacing AWS by delivering superior user experiences.
About Brian: Brian is an engineer based in Intercom’s Dublin office. He fixes problems, builds things, and grows people.
Sponsor
Wiz - Listen to Crying Out Cloud: wiz.io/crying-out-cloud
Transcript
Brian: I've been saying this exact thing to many, many people in Amazon over the last while. My excellent account manager has been setting me up with various leaders. They've been asking for documents, they've been asking for examples. They are hungry for this stuff, so I don't doubt that there's a desire to be the leaders and to really satisfy their customers. But it's execution we care about.
And when it comes down to it, we need excellent databases. We need the best databases to be able to ship world-class product to our customers.
Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn, and I'm here to correct an oversight. Because I have known Brian Scanlon for many years, and somehow he has slipped through the cracks and not been on this show previously, so let's go ahead and fix that. Brian Scanlon is a senior principal engineer at Intercom, where he has been for damn near 11 years at this point.
Brian, welcome to the show.
Brian: Thanks so much, Corey. It's great to finally be here.
Corey: I know. We save the best toward the end. Yeah, exactly. Except we're not ending the show anytime soon, much to various people's chagrin, 'cause most people want me to shut the hell up a lot more than I do.
Brian: Not me. I hope we keep going for a long time.
Corey: Crying Out Cloud is one of the few cloud security podcasts that's actually fun to listen to. Smart conversations, great guests, and zero fluff. If you haven't heard of it, it's a cloud and AI security podcast from Wiz, run by CloudSec pros for CloudSec pros. I was actually one of the first guests on the show, and it's been amazing to watch it grow. Make sure to check them out at wiz.io/crying-out-cloud.
So, Intercom. I have complicated feelings about the company. Originally I hated the thing, 'cause I'm on a website trying to get something done and it pops up like freaking digital Clippy. "Hey, do you want to talk to a person?" God, no. I'm a millennial.
Elder millennial, but still, I don't want to talk to people. But I've come back around, because it turns out the issue is not Intercom. It was bad implementations thereof. Maybe you don't need to pop up "talk to a human" on the front of your landing page, but when you're dealing with a support issue and you want to tag someone in, and suddenly you're talking to a human right there, it's transformative.
So yeah, it turns out that anything can be dumb if you hold it wrong.
Brian: Yeah, I used to introduce Intercom to people by saying, "Yeah, we're one of those chatbots that pop up in the bottom right-hand corner of your website. Except we're the good one, I think." There are a lot of bad implementations. People use it for outreach and for marketing and sales purposes, and they want messages in your face.
But when it comes to basic customer support, actually putting humans in touch with humans, and increasingly humans in touch with AI bots that will answer your questions, I think Intercom's definitely got really good properties in the marketplace.
Corey: It's especially strange in this era that we are in, where everyone is building AI chatbots.
I have a whole laundry list of angry opinions on it, but where I really get annoyed is when they don't admit that they're a chatbot upfront. It irks me because by the time I finally break down and admit I have to talk to a human being, I have exhausted the documentation. It is not going to be something as simple as "have you tried jiggling the handle instead?"
There is now a weird corner case, which I'm very good at blindly stumbling into. I need someone to go ahead and fix a thing on the backend, or let me know that what I want is simply not possible with the platform, or admit that your documentation is rubbish. That's something that AI historically can't do. But forcing me to go through that filter to hit some arbitrary target of fewest customer contacts, fastest resolution possible, and never letting them talk to a human has been maddening.
Where are you right now on that whole gen AI spectrum from bot to human?
Brian: So we are very much all in on gen AI for chatbots. So much so that in October, November 2022, when ChatGPT came out, we reorientated the entire company. We saw that it was the future. We had actually been building ML chatbots
and had them in the marketplace for a good few years prior to the current explosion of ML chatbots. But the lift, or the improvement, that we saw with GPT-3.5 when that came out, that was an "OK, we need to change the entire company." And we've changed the entire company and moved strongly towards what we think is the best AI chatbot in the market.
It needs to work well with humans, and there's a transition as well. There's just a lot of work to be done to get the quality really great so that people don't hate using it. One of the interesting things that we've seen with many of our customers who have good knowledge bases and who get high resolution rates from using the chatbots is that they'll find that customers actually just start asking more questions.
Because if you can get really fast answers, and you don't have to go around documentation sites, and the chatbot's actually useful, people get addicted to this. They're gonna use it all the time, and weirdly enough, that has increased the number of conversations coming in. In some cases, and we're seeing more and more of this, people will ask more questions because they're getting good, fast answers, which is a little bit counterintuitive. But I think we always need to have ways to fall back to humans.
Of course, it depends on the business and the volume and what makes sense for them. But I don't think humans are going away. From what we've seen in the market as well, even for places that have adopted it and are deflecting or answering large numbers of questions, we see that they're not reducing their support team sizes at all.
They're putting their people on better questions, higher-quality work, or just deeper work with customers, as well as feeding the bot with better documentation, that kind of thing. So we're seeing it as actually a net positive for the customer experience. But there are definitely lots of bad implementations out there as well.
Corey: It feels like AI-assisted support is in many cases better than pure AI support, and this is somewhat controversial among people who want to sell bots. But I find that chatbots are not necessarily a great interface, just due to a complete lack of discoverability. It's the Alexa problem or the Siri problem, or whatever robot assistant you want to use.
You ask it a question. Today I was getting out of the shower and I tried it with both Siri and Alexa, and neither one could do it: "What is the partial pressure of oxygen at 10,000 feet above sea level?" They both dropped the thing completely. I have weird shower thoughts; roll with it.
But the question then is, okay, it can't do it. Of course it couldn't. I will never ask that question again, despite the fact that maybe it does know how to do that and just didn't hear me properly, or in two weeks it will be able to answer that question. When you ask a question and it can't answer it, you kind of feel dumb for having thought for a second that it might have been able to.
It's the problem that the Alexa group has had forever, which is that humans use something like 98% of all the features they will ever use on their Alexa device within 90 minutes of setting it up the first time: play a song, set a timer, turn on lights, and that's about it for most people. When it gains a bunch of new features,
how do you tell people about that? Turns out that finishing every sentence with "by the way" and pitching something unrelated just pisses people off.
Brian: I think there will be a change as the products get better, as knowledge bases get better, and as customers (our customers and customers of other chatbots) learn how to work with them better.
The expectations of us as consumers, as users of these things, will improve or change over time, and we won't just assume that these things are as brain-dead as when we started interacting with them the first time around.
Corey: So I'm curious as far as what you've been up to from a technical perspective lately. We've known each other for many years.
You have been to my house for dinner. You're friends with my brother, which I think most listeners will be shocked to realize I have one of those. It's true, I do. He lives in Belgium. But what I found so interesting, and what got me talking to you, is that I was recently talking to some of the fine folks over at PlanetScale, and they have talked about Intercom in general, and you in particular, as being very pleased with their database offering.
Now, I have talked to people in the past where, when I ask them questions about that, the response has been, "Wait, a company's saying what about me now?" You and I go back long enough that I can trust you not to bullshit me on these things. So, okay: is it as good as they're telling me it is? And your response distilled down to "if not better." You are a big champion of
what PlanetScale is up to. Tell me more, please, 'cause I do a lot of things here, and scale is generally not one of them for my own personal shitposting projects.
Brian: Yeah. Short answer is yes, PlanetScale is great, but I'm gonna give you a long answer as well. So, Intercom is a Ruby on Rails monolith, and we really, really like this setup.
We deploy our Ruby on Rails app onto EC2 computers, not stuffed away in Docker containers and using incomprehensible three-letter acronyms for different parts of the setup. And we...
Corey: Oh, you can't tear down and rebuild your stack on top of the latest trendy thing every 18 months, huh? Wow. Almost like you're not based in Silicon Valley.
Brian: Yeah, maybe being in Dublin has its advantages. So yeah, we run really boring infrastructure, and we have stuck with Ruby on Rails, and it's largely been great for us. You have to do a lot of work to scale it out to millions of lines of code and hundreds of developers working in it. But at some stage you have to connect the thing to a database, and database scaling has been a large part of the problems, but also the joy, of scaling Intercom in the 11 years that I've been there.
When Intercom started off, we had a very simple, nice MySQL database. Then, unfortunately, we hit product-market fit, and having explosive growth was very challenging. Even though we were based on RDS (we were cloud-first from the very start), native RDS couldn't deal, or deal easily, with what we were doing with it.
When Aurora came along, it honestly was a game changer for us, and we jumped in very aggressively. We got to work with the Aurora team. We had some of the biggest tables on Aurora at the time. They would do all sorts of custom work for us. It was good fun. And just the Aurora architecture itself: the split between compute and storage, how low-latency the read replicas were, and everything.
This stuff just bought us years and years and years of scalability, up to a certain point. And then we started having to do things like sharding some of our data. We had tables that were so large we couldn't mutate them. We couldn't add new columns or do other database migrations on this data without taking a lot of downtime, because it was changing so fast.
So we had to take action, and we built our own database sharding system, again built on top of Aurora, choosing technologies that we were very comfortable with. This bought us, again, years of scalability. It bought us the ability to do database migrations on our tables, and life was good for another few years. But over time, these different sharding patterns and the way we were using Aurora meant that we had 13 clusters connected to one application.
You get into these unfortunate situations where AWS will say, "Hey, we've got a patch out, and you really need to apply it to all of your clusters."
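[Editor's sketch] To make "application-level sharding" concrete, here is a minimal Python illustration of routing each tenant to one of several database clusters. It is not Intercom's implementation, which the episode does not describe in detail. The `workspace_id` key, the cluster names, and the hash-based placement are all assumptions for the example; the only figure taken from the conversation is the count of 13 clusters.

```python
import hashlib

# Hypothetical cluster names. The episode says only that the application
# ended up connected to 13 Aurora clusters.
CLUSTERS = [f"cluster-{i:02d}" for i in range(13)]


def shard_for(workspace_id: int, clusters: list[str] = CLUSTERS) -> str:
    """Pick a cluster for a tenant using a stable hash of its id.

    A cryptographic hash is used instead of Python's built-in hash()
    so the mapping is identical across processes and restarts.
    """
    digest = hashlib.sha256(str(workspace_id).encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(clusters)
    return clusters[bucket]


if __name__ == "__main__":
    for workspace_id in (1, 2, 3):
        print(workspace_id, "->", shard_for(workspace_id))
```

Plain modulo placement like this moves most tenants when the number of clusters changes, which is one reason real systems tend to use a lookup table or a range-based scheme instead.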
Corey: Yes. "Downtime will be at some point during this broad window that is inconvenient for you and non-deterministic." So you've gotta be able to build a graceful degradation mode into your app if you're using this technology, 'cause one of the things you lose with a managed database offering
is the ability to be very granular around when and what gets applied where.
Brian: Yeah. And even with some of the upgrades that we would do, most of the time the cluster would come back in a minute or two. Not bad. But occasionally something would get stuck inside of a queue inside of Amazon or something, and it would be 20 minutes.
This kind of stuff starts to wear you down, especially when you've got so many clusters connected to your app.
Corey: And you are highly critical as far as customer-facing stuff goes. You're on the front of everyone's website. This is one of those areas where mistakes will show.
Brian: Absolutely. We have loads of customers who have up to thousands of people whose job it is to be using Intercom all day, replying to their customers. And it's definitely no fun when they have their entire teams unavailable, not able to do their job, because we're twiddling our thumbs waiting for an Aurora upgrade to complete.
Corey: "Well, just do it outside of core business hours." I'm in San Francisco. Lots of people here use it. You're in Dublin. Lots of people there use it. You have customers in Australia, Japan, India, around the world. It is core business hours somewhere for someone at any given point of the day. There is no "and now it is nighttime,
the servers can take a nap now."
Brian: Absolutely.
Corey: This is not the DMV in the US, or the Social Security website in the United States, which still blows my mind. It has a six-hour maintenance window every night, like the last person out of the office turns off the mainframe or something. Probably some legacy batch job or whatnot.
And there's good reason for it, but it feels like the servers keep banker's hours.
Brian: So we were aware of Vitess and PlanetScale. The brief introduction to Vitess is that it's a MySQL wrapper as such, or a system, that came out of YouTube about 12 or 13 years ago. It's an open source project.
It's used by other large B2B SaaS providers, people like Slack, HubSpot, and...
Corey: The Slack folks have been huge advocates of this. It makes sense, since they've given talks about this, where effectively all of Slack is basically a giant MySQL database. Sharded heavily, obviously, but yeah.
Every message is a line in a database. So having the database work, and not take a nap for a 20-minute upgrade at random times, is kind of high on their list of "it must do this."
Brian: Yeah. And it's not just downtime. We need to be able to do things like shard a lot of data.
Our customers' data is extremely shardable. We're a multi-tenant application. We have lots and lots of our customers' data, and they don't need to join across their different workspaces or different customers. So we have very, very shardable data. The other thing is that we struggle with connection pool management.
We have hundreds of thousands of Ruby on Rails processes that all need to connect to databases that can only take 16,000 connections at most. So we have to run a layer of ProxySQL connection proxies in between our application and our Aurora database. Sometimes the ProxySQL layer goes wrong, and it's just another layer of complication that we don't want to think about.
So we were aware of Vitess, and it became increasingly clear that PlanetScale were the way to get Vitess. We have no interest in, or we'd rather avoid, running our own infrastructure or running our own high-level services. If we can pay somebody to run a database for us, we will absolutely do that.
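[Editor's sketch] The connection numbers above explain why a proxy layer is needed at all. The Python below works through the arithmetic and shows a toy bounded pool that multiplexes many callers onto a fixed number of backend slots. It is an illustration only, not how ProxySQL or Vitess work internally. The 16,000 limit comes from the conversation; the figure of 300,000 processes is an assumption standing in for "hundreds of thousands", and `BoundedPool` is a hypothetical name.

```python
import math
import queue
from contextlib import contextmanager

MAX_DB_CONNECTIONS = 16_000   # per-database cap quoted in the episode
APP_PROCESSES = 300_000       # assumed; the episode says "hundreds of thousands"


def min_fan_in(processes: int, max_connections: int) -> int:
    """Smallest number of app processes that must share one backend connection."""
    return math.ceil(processes / max_connections)


class BoundedPool:
    """Toy multiplexer: callers borrow one of a fixed number of backend slots."""

    def __init__(self, size: int) -> None:
        self._slots = queue.Queue(maxsize=size)
        for slot in range(size):
            self._slots.put(slot)

    @contextmanager
    def borrow(self, timeout: float = 1.0):
        # Blocks until a slot is free; raises queue.Empty if none frees up in time.
        slot = self._slots.get(timeout=timeout)
        try:
            yield slot
        finally:
            self._slots.put(slot)


if __name__ == "__main__":
    # 300,000 / 16,000 = 18.75, so at least 19 processes share each connection.
    print(min_fan_in(APP_PROCESSES, MAX_DB_CONNECTIONS))
    pool = BoundedPool(size=2)
    with pool.borrow() as slot:
        print("borrowed backend slot", slot)
```

Under these assumed numbers, direct connections are impossible by a wide margin, so some component has to hold the backend connections and hand them out per query or per transaction.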
Corey: Now, I'm going to stop you there, because historically that has been the entire rallying cry of cloud: "Oh great, you don't wanna run servers yourself." To a point where some people have now gone so far around the bend that we view running servers in data centers as a skillset the ancients possessed, but which has since been lost to modern humanity outside of three hyperscalers.
Now, those folks have been the default go-to for a lot of things for a lot of years. But your answer was not to go and yell at the Aurora team to make it better. It was to look somewhere else.
Brian: You know, we did talk to the Aurora team about the problems, and certainly Amazon are going in the right direction with the likes of Aurora Limitless, which does have native sharding.
It is a Postgres setup, but they're thinking about it in the right way. And they do have things like RDS Proxy, which could do some of these proxy things. So they do have these building blocks, and some of the problems we could solve, or maybe swap out, with some Amazon managed services.
But really we were looking for something a bit bigger and better, where actually serving queries, actually serving our customers, is the problem of the provider that we have. We don't just want to be getting a proxy service from one part of the company, with limited insights and no ability to go in and help us out with bad queries or give us insights into what's going wrong.
We really need somebody who's like a partner, who can go deeper into our problems and share our problems, and not just be hands-off with them. At Amazon, due to scale, due to the way they treat security, and a bunch of other reasons, they don't act that way in their day-to-day operations.
You can convince them eventually to get into certain things, but they don't have one small solution that fixes all the problems that we want solved, like connection pooling, sharding, fast failovers, and everything. They're kind of just vending a bunch of building blocks, and maybe it's just because they're two-pizza teams the whole way down. But PlanetScale are
a good, healthy, up-and-coming company who we liked the look of. We liked the way that they were talking about providing managed Vitess to companies like us. I think what we liked about PlanetScale was that they were clearly building for companies like us, using a technology built for
exactly customers like us. And they're kind of a one-stop-shop, white-glove service. You just show up, send your queries at their database, and they'll do the rest, as opposed to needing to assemble a variety of building blocks and hope for the best.
Corey: The one challenge I see coming out of the PlanetScale folks: they have amazing talent there.
Richard Crowley, who I've known for years, is phenomenal. Sam Lambert is the CEO, and he is there. Kidding, kidding. They have a bunch of terrific folks working there, but I find that the way they position the stories they tell is aligned perfectly for folks like you. You are deep in the weeds. You know this stuff cold.
You have been running hyperscale systems for many years. Terrific. There are a lot more people that look like me, by which I mean dumb, out in the universe than there are people like you. So broader mass-market appeal seems like it's not the story they're telling at the moment, which is kind of a shame, because based on the conversations I've had with you and others around this,
they're solving a problem that an awful lot of people have. I will also say that this reinforces a belief I've had for many years, which is that as things move up the stack, the value and the margins increase. Amazon has got the low-level infrastructure stuff on lock.
No one is gonna build a better VM platform than they're doing. Their reliability is untouchable. They have all kinds of great baseline foundational services. But every time they try to move up the stack into applications, or things a little further up the chain, they fail miserably. They've never yet built a good user interface on anything Amazon has ever done.
We all learned to use their website not because it's good, but because we have to. And what we're seeing with things like PlanetScale is that the rest of the industry is starting to erode some of those things, and come further down the stack faster than Amazon is able to go up it. It's not just things like PlanetScale.
We see it with Snowflake, Databricks, and a whole bunch of other folks out there that are doing these things. People are using Confluent instead of running their own Kafka clusters or MSK. Those companies are eroding AWS. They're charging more in some cases, but delivering vastly superior value.
And this tells me that in the future, unless they come out with something I can't foresee, Amazon is going to become the equivalent of the layer-one backbone providers. If NTT goes down, the internet isn't working so well today and we're all having a bad time, but most people don't know what the hell that company is.
All the value rides on top of them, and I think that's Amazon's future, given their course.
Brian: Yeah. This isn't a pattern that we've only seen with the move from Aurora to PlanetScale. We had the exact same with a move from Redshift to Snowflake. And again, we worked closely with Amazon.
We tried to resolve our Redshift stability problems with them. They gave us a bunch of things to do, but ultimately, when we moved to Snowflake, not only was the technology just that bit better, they were also able to ship for us and were a bit more responsive on solving for our needs. They were that bit hungrier, and we didn't feel like we were just one out of a million customers, as we did with Redshift.
With Snowflake, we got stuff turned around quickly, and the thing has been pretty awesome as well, and it just kind of left Amazon behind. I think once things are business-critical for us, and they're higher-level applications, we're at a point now where we'd be considering taking them off Amazon rather than trying to fix them on Amazon.
Something that could become important would be, say, if DDoS became really problematic for us. At the moment, we just use WAF. We use the Amazon stack. It's fine. It's not that big a deal for us. But if we really had to nail the DDoS problem, I'd probably go to Cloudflare and wouldn't stick around with Amazon for too long.
That's on the understanding that I think Amazon probably do a reasonably good job, and they'll take support tickets and whatnot.
Corey: If you're dealing with a DDoS, you definitely need to talk to your provider. There's no real way around that. And their team is excellent, but the customer touch points are not really. I'm sorry, but they aren't.
Brian: And with AWS support, it can be tough to get listened to at times. I've done a lot of on-call and opened a lot of issues with AWS support, and even just knowing how to open a case is pretty difficult. Whereas if I'm opening a support case with some of our providers, like OpenAI or incident.io
or Snowflake or PlanetScale, very often it's little more than a message in a Slack channel, and all of their automation kicks in. You get routed to the right person very quickly, and they're able to tell us very quickly if it's our problem or their problem. As opposed to Amazon, where they're gonna try and catch you out, asking you which region your problem is in.
That can be frustrating at times. Again, it's a problem of scale, and I kind of get it. But the experience is way more tailored to our needs from smaller, hungrier companies, in our experience.
Corey: This episode is sponsored by my own company, the Duckbill Group. Having trouble with your AWS bill?
Perhaps it's time to renegotiate a contract with them. Maybe you're just wondering how to predict what's going on in the wide world of AWS. Well, that's where the Duckbill Group comes in to help. Remember, you can't duck the Duckbill bill, which I am reliably informed by my business partner is absolutely not our motto.
I think that you're right, and it's kind of sad. Also, if I'm reading trends correctly, it feels like Amazon is moving away, on the AWS side at least, from product-led growth and toward speaking explicitly to large enterprises. Okay, maybe it's the right answer for them. Lord knows they have better strategic insight into their customers and their needs and their growth patterns than I do sitting in the cheap seats.
But what attracted me to it was the fact that I could get started with these things for pennies, and so much of what they're coming out with these days has a prerequisite of Enterprise Support, which starts at $180,000 a year and ends nowhere. It never ends. It's an unbounded growth problem, like an AWS bill itself, and that rules out a lot of things that I'd want to kick the tires on, unless I start taking hostages again.
Brian: You know, with the fire hose of AWS updates and launches and stuff, I think the hit rate for me, where I see something that I'm actually gonna try out, or where I'm thinking, "Hey, they're nailing this, they're solving our problems," has gone down over the years. I'm sure they're crying into their money.
I'm sure they've got some pretty good businesses out there. But for the kind of mid-range, tech-first company, it seems like they're not the leaders that they used to be. I think with the likes of Aurora and Redshift, maybe they had early-mover advantage, 'cause they obviously had access to cloud services
before the cloud existed, and they were able to build really great cloud-specific services on top of that. But I think they've been outpaced by hungrier competitors at this point. And it's good for us. We're able to take advantage of these, so I'm kind of happy to do that. But I'm kind of quietly sad for Amazon as well.
Corey: I am too. Everyone seems to think I have an axe to grind against AWS, but it comes from being close to them for so long. I don't hate the company. If I did what I did for a company I hated, that's a pathology, and I need a diagnosis and probably a restraining order. I like what they do. I want them to be better than they are.
I want the offerings to improve over time. I just don't see that that's the direction it's going, in the way that it once was, and it brings me no joy whatsoever to say that.
Brian: I mean, one of the good things about Amazon is that they do want to hear this stuff. I've been saying this exact thing to many, many people in Amazon
over the last while. My excellent account manager has been setting me up with various leaders. They've been asking for documents, they've been asking for examples. They're hungry for this stuff, so I don't doubt that there's a desire to be the leaders and to really satisfy their customers. But it's execution we care about.
And when it comes down to it, we need excellent databases. We need the best databases to be able to ship world-class product to our customers.
Corey: And I think that that's important. It's the need the customers have, and if the cloud provider won't give it to them, they will find ways to meet that need.
It's what they do. A last topic that I want to get into has been a recurring theme throughout the years on this show, which is: where does the next generation come from? Because people like you and people like me came up being support folk in the early days, back when this was all an open field and no one really knew how computers were supposed to work.
Not that we do now, but we lie to ourselves. We gathered experience and came from those places to where we are now. That door has been firmly shut. That is not a path that is open, at least for me. Where do you come from? How did you get to the place that you are now?
Brian : Yeah, so the fun part about my career, I've had just so much luck and fortune and random timing things, uh, that have.
It worked out reasonably well. I don't think my career has been too bad to date, but it all started in 1997 or so when I went to university. And, uh, we had a, what was then called a networking society, which was basically a bunch of students running a few Unix boxes. And this was kind of in the pre-social media age where.
We didn't have WhatsApp for, even Facebook or anything to talk to each other. So the obvious thing that we did back then was we had a large proportion of the people in our university log on to a Unix shell, uh, on a bunch of servers run by, uh, students. Uh, who, who, and we all, we struggled to keep these things online.
It was pretty tough running these kind of Solaris uh, servers. And we had like instant messaging. We had these wrappers around, right? If you're old enough, you might remember, right? And we had like really healthy news groups and IRC and stuff like that. So we had this like super, super. Awesome community of people who are partially of people who are interested in doing cool stuff with tech, learning about Unix, learning about networking.
And we had a lot of users and just like we were the largest society on campus, we two great parties and it was pretty, pretty cool. Um, but totally co uh, totally coincidentally, or like through Fortune. Uh, there were also some people who've ended up being like, really? Notable in the tech community since the likes of say, John Looney, Tanya Riley, Colin McCarthy, and the list goes on of people who kind of started off their careers in technology just tinkering around on these Unix boxes back in college.
And my career, largely up until maybe when I joined Intercom, was all about getting doors opened by knowing people through that community, staying in touch with them, doing things, whether it was our local Linux user group or different activities like that. But really it was having good fortune from meeting a bunch of early, I guess, Unix tinkerers or sysadmins back in the day, who all then kind of grew into working in various places in the industry.
Where I went after tinkering around with Unix in college was into Solaris technical support. Then that moved into real sysadmin work. Then later into building out nationwide broadband networks in Ireland, connecting every school in the state, and building out ISP services. It was a mix of classic sysadmin and a bunch of automation, increasingly automated as things got better in technology.
Then for a while as well, I was in this small bookseller called Amazon prior to Intercom. But I guess I had a bit of a classic, well, what I consider to be a classic move kind of up the stack, from sysadmin help desk to writing more software, maybe a bit of management leadership, and then ultimately into the kind of tech leadership area that I'm in at the moment.
Like, where do people come from? I mean, we certainly don't have the pipeline of lots of people sitting around building ISPs or building hosting providers. I think these are all really solved problems. You don't have that kind of tinkering or just hands-on work that you need to build and use these services.
And so that kind of gateway into running services, infrastructure services, networking and all that isn't obviously there as much at the moment, I think.
Corey: No, that's the problem: the cloud providers abstracted so much of that away that I know a lot of folks at hyperscale, born-in-the-cloud environments like Intercom.
I'll even ask you this, and this is not necessarily "please stab coworkers in the back," but looking at your technical team across the board, what is the depth of networking knowledge at Intercom?
Brian : Oh, I think I am the networking team.
Corey: And you're no slouch at it, I wanna be very clear. But I gave a keynote at NANOG about this last year, where this is a perennial problem.
I was talking to folks at AWS about this, where a lot of your customers do not have a deep bench of networking knowledge, and they make the very reasonable response of, "Well, that's not true. We were talking to a customer this morning and they were as good at this as we are, easily." I'm like, great.
Just outta curiosity, what sector was that customer in? "Oh, they're a telco. Why?" Hmm. Wonder if that has anything to do with it. Hmm. Imagine that. But these born-in-the-cloud startup companies don't do networking, because you don't need to know networking until suddenly you very much need to know networking. But you can go an entire career weaseling your way between the cracks without having to pick it up.
Brian : You know, with some of our recent hires, we're fortunate enough to be close to a fairly large Amazon office, and we've hired a good few people from Amazon support. So maybe it's shifted up the stack. You know, it's no longer people who are building ISPs or hosting providers. It's people who work for larger providers doing kind of entry-level tech roles or support-type roles.
So there's something that maybe looks like the old pipelines. It's not the exact same, definitely a different shape of people. They tend to be actually better at coding than I was back when I was at their level. But with AI coming in as well, it seems like there's gonna be a good bit of change in what skills people use and grow and need in their careers.
You know, there's concerns at the moment that the use of AI in engineering and to write code will remove the need for junior engineers, or will largely benefit senior engineers or people who can guide the agent or LLM coding tools, rather than, you know, working your way up by working on small problems and building and shipping things.
So I think in tech there's a lot of change. Certainly there's many entry paths, like the one that I took, which I think are gone. I've seen some kind of replacements, but I do worry, especially in areas like networking and low-level Unix and stuff, that we're not seeing the kind of depth of knowledge that we used to have. And I don't think I'm just being bitter and old about that. I think it is pretty useful stuff to know.
Corey: Oh it is. Uh, one thing I wanna point out, 'cause this is a recurring theme that I see a lot where you mentioned a few extraordinary names of people who are terrific and in the space that have been formative influences, would it surprise you to know that when I've spoken to multiple of those people, your name comes up in the same context?
People don't realize that we all learn from each other. It's one of those things where, "Oh yeah, those people are smart. I'm just an idiot sitting here." It's a common pattern and I think we internalize it pretty well, but it shows there. There's one other aspect I want to get into about Intercom. I was going to mention it earlier, but we've gone down this conversational path.
It's one of those interesting things about it. I wound up focusing my skillset, which is not that dissimilar to yours, on AWS bills, because I wanted a specific expensive problem eight years ago when I was getting started down this path. Nine years now. My God. And the reason I did it was I was down to this or IAM. Like, did I know a lot about IAM at the time? No.
But I didn't know that much about AWS bills either. Turns out when you focus on things, you can pick up a lot. But the reason I went with bills is because there is never a 2:00 AM billing emergency. I've had enough horrifying on-call experiences in my career that I am effectively done with it. All companies across the board have on-call because they need this stuff to work in various ways.
And you don't have every team have representatives at every hour around the clock in a follow the sun rotation. Intercom takes a unique approach to this, to my understanding. Tell me about it.
Brian : Yeah, so this is one of the things that I'm proudest of at Intercom. To be clear, it's not all my work, and arguably I didn't initiate it, but I was a big influence on it, and I've certainly spent a lot of time writing about it and, more importantly, talking about it in public and taking loads of credit for it.
But we have an on-call system where we use volunteers rather than conscripts. This means that we put people on call out of office hours not because they happen to be on a certain team, or are on a rota, or know something about maybe networks or systems or anything. We ask for people to volunteer to join this rotation. We generally have about six or seven people in this rotation, and we compensate them for their time on call.
So the way we do it is you're on call for a week. Not in office hours: the teams who own the alarms that are firing will get those alarms at that time. But outside of office hours, if you're on call in this volunteer team, you get the page for that. But of course you can't just say these things, like "let's have a volunteer-based on-call," and hope that it works out for the best.
We had to put in place a bunch of things, both on the technical and social side, to make sure that this thing was sustainable, that people would feel like the work was valued, and not just because of the compensation, but that the work was rewarding and you might actually learn something and maybe even enjoy doing on-call work.
So we insisted on all teams writing runbooks for every alarm that can page somebody. Most importantly, we treat every single page like a heart attack, kind of using Charity Majors' quote here. And so this means that, say, the next morning after a page goes off in the middle of the night or whatever, our teams take it seriously.
In fact, they take it more seriously than if they had paged somebody on their own team out of bed. When you're paging somebody you don't know, or who's remote from you, out of bed in the middle of the night because you set up a bad alarm or because your thing fell over, you feel a lot guiltier about that.
Corey: Oh, whenever I page anyone, I start the call with "I'm sorry to wake you, but," because just a little politeness and courtesy can move mountains. But please continue.
Brian : Without too much effort, we just got excellent buy-in from the teams who own these different areas of the product. And, you know, we could be building a lot, and a lot of stuff can go wrong.
But we were able to hold a high bar for pages being actually something that a human needs to do, and then giving that person the tools to actually fix the problem. We have some technical reasons why this stuff is easier for us than it is for other companies, such as having a large Ruby on Rails monolith, as opposed to every single team having their own bespoke tech stack.
So that stuff helps us, but it's more the culture and how we also reward and give shout-outs to people. You know, everybody from the CEO down, at Christmas or whatever kind of time, we always make sure to not just pay the people the money for the time that they spend on call out of hours; it's recognized socially and also in things like promotions and things like that.
It's something that's really valued in the organization. So we've had this in place now for seven or eight years. It's hard to remember exactly how long. It's been sustainable. One of the biggest problems we've had is that so many people want to join, so people actually like it.
And we've made people better operators. We've made people actually enjoy it and learn more about what happens in the company. And it's actually been great long-term recruitment for my own kind of infrastructure-oriented teams, where people get a taste of this kind of work.
They might just be a product engineer from some random part of the business, but then when they see this work and they actually see what's going on under the hood, they ask to join our team full time. There's other stuff we have to do as well. You have to have a way for the person who's on call to bring in an expert.
So we have an Incident Commander program as well, and there's support there so that people don't feel like they're isolated on their own. Out of the PagerDuty playbook, yeah. And when it comes down to it as well, look, not everybody can solve every kind of problem, and we'll just go to the bat phone.
We'll page in as many people as we need to solve a problem, which, even if you had 10 people on call, you might need to do anyway. And so this has been great. Having a single person on call for a business the size of Intercom can be challenging at times. But we've never been at the point where we've been at any risk of things falling apart or of having to put multiple teams and lots of people on call.
Keeping things down to one person on call ruins fewer lives. We all get a better quality of life, and doing this sustainably gives us something where we can really feel like we're making a difference in our work, and that the work isn't just feeding the robots.
It's high quality. It's good. We're learning, not just tolerating low-quality alarms and stuff like that.
Corey: Yeah, that's the important part. If it wakes you up, you're empowered to fix it or turn off the alarm or adjust thresholds or something. It's the human element of it. It's the fact that you are compensated for doing it as a volunteer thing.
It's not "as part of your job responsibilities, yes, I know you have a six-month-old who's having trouble sleeping, get up anyway." None of that. It's a human approach to it, and that is something this industry has lacked historically.
Brian : Yeah. And I've been spreading the good word about this, trying to influence other places to improve things and not just accept the status quo.
The interesting part has been having conversations with different people from different companies who are interested in doing this, but they have all sorts of other issues, whether it's many, many tech stacks or different compliance approaches or just other sociotechnical problems that can make it difficult.
I think we're probably on easy mode in Intercom. We did design it for our culture and our technology stack. Not everyone can do it so easily, but I would encourage everybody to not accept, again, like you said, the status quo around "just because you're on a certain team, you need to carry a pager and be always on call."
I think being on call a lot does reduce the quality of your life, even if you're not being paged. And so being deliberate about that, as well as recognizing the work, I think is very important, and it just gives you a great story that shows that you actually care about the people who work for the company, as opposed to them just being part of some machine that needs to satisfy the computers.
Corey: Which is important. There's a human piece of it, and that's the thing that gets lost. It's not just a technical problem. I wanna thank you for taking the time to speak with me about all of this. If people wanna learn more, where's the best place for 'em to find you?
Brian : I am kind of on X/Twitter, but not really anymore. I mean, I'm not, so good luck. I'm sort of on there. I mean, you can type in Brian Scanlon and I repost work stuff, I guess. I'm on Bluesky, but not as much as I was on Twitter. Again, you can type my name, which
Corey: is probably a healthy thing, but yeah, I hear you.
Brian : Yeah, I don't know. I'm kind of sad about those things. I mean, I'm on LinkedIn, but who uses LinkedIn?
Corey: Oh God. I maintain that LinkedIn remains the world's largest porn site because it's where business people go to pleasure themselves on the internet. That is the best description I've got of it, and I have no tolerance for it.
Brian : So I don't know. Maybe the best place to find me will be if you set up a Unix server and we all just log on and use write to talk to each other.
Corey: I figured you'd put up a personal website. Just have an Intercom "chat with me" box in the corner that pops up, 'cause, you know, it's not like you're doing anything else these days, right?
Brian : That works. You can find me on intercom.com. I am the other person on the side.
Corey: There we go. There we go. I will include links to all of this in the show notes. Brian, thank you so much for taking the time to speak with me. I appreciate it.
Brian : That's been great. We should do it again,
Corey: We should. Brian Scanlon, senior principal engineer at Intercom.
I'm cloud economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five star review on your podcast platform of choice. Whereas if you've hated this podcast, please leave a five star review on your podcast platform of choice along with an angry, insulting comment. But that platform will not be one of Amazon's, 'cause that's way too far up the stack for them to do well.