ChaosSearch and the Evolving World of Data Analytics with Thomas Hazel

Episode Summary

Corey is joined by Thomas Hazel, Founder & CTO of ChaosSearch, or as Corey remembers them, CHAOSSEARCH. Thomas and Corey dive into how ChaosSearch’s messaging has evolved over the years and why their data indexing solution has always made sense regardless of their brand evolution. Corey and Thomas also discuss how to rein in data analytics costs with data indexing, the variance in how different companies use data to respond to security breaches, and the ultimate importance of putting yourself in your customer’s shoes.

Episode Show Notes & Transcript

About Thomas

Thomas Hazel is Founder, CTO, and Chief Scientist of ChaosSearch. He is a serial entrepreneur at the forefront of communication, virtualization, and database technology and the inventor of ChaosSearch’s patented IP. Thomas has also patented several other technologies in the areas of distributed algorithms, virtualization, and database science. He holds a Bachelor of Science in Computer Science from the University of New Hampshire, where he is a Hall of Fame Alumni Inductee, and he founded both student and professional chapters of the Association for Computing Machinery (ACM).

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: This episode is sponsored in part by our friends at AWS AppConfig. Engineers love to solve, and occasionally create, problems. But not when it’s an on-call fire-drill at 4 in the morning. Software problems should drive innovation and collaboration, NOT stress, and sleeplessness, and threats of violence. That’s why so many developers are realizing the value of AWS AppConfig Feature Flags. Feature Flags let developers push code to production, but hide that feature from customers so that the developers can release their feature when it’s ready. This practice allows for safe, fast, and convenient software development. You can seamlessly incorporate AppConfig Feature Flags into your AWS or cloud environment and ship your features with excitement, not trepidation and fear. To get started, go to That’s

Corey: Welcome to Screaming in the Cloud. I’m Corey Quinn. This promoted episode is brought to us by our returning sponsor and friend, ChaosSearch. And once again, the fine folks at ChaosSearch have seen fit to basically subject their CTO and Founder, Thomas Hazel, to my slings and arrows. Thomas, thank you for joining me. It feels like it’s been a hot minute since we last caught up.

Thomas: Yeah, Corey. Great to be on the program again, then. I think it’s been almost a year. So, I look forward to these. They’re fun, they’re interesting, and you know, always a good time.

Corey: It’s always fun to just take a look at companies’ web pages in the Wayback Machine, where you can see snapshots of them at various points in time. Usually, it feels like this is either used for long-gone things and people want to remember the internet of yesteryear, or alternately to deliver sick burns by retorting a “This you?” when someone winds up making an unpopular statement. One of the approaches I like to use it for, which is significantly less nefarious—usually—is looking back in time at companies’ websites, just to see how the positioning of the product evolves over time.

And ChaosSearch has had an interesting evolution in that direction. But before we get into that, assuming that there might actually be people listening who do not know the intimate details of exactly what it is you folks do, what is ChaosSearch, and what might you folks do?

Thomas: Yeah, well said, and I look forward to [laugh] doing the Wayback Time because some of our ideas, way back when, seemed crazy, but now they make a lot of sense. So, what ChaosSearch is all about is transforming customers’ cloud object stores like Amazon S3 into an analytical database that supports search and SQL-type use cases. Now, where’s that apply? In log analytics, observability, security, security data lakes, operational data, particularly at scale, where you just stream your data into your data lake, connect our service, our SaaS service, to that lake and automagically we index it and provide well-known APIs like Elasticsearch and integrate with Kibana or Grafana, and SQL APIs for something like, say, a Superset or Tableau or Looker into your data. So, you stream it in and you get analytics out. And the key thing is the time, cost, and complexity that we all know operational data, particularly at scale, terabytes a day and up, causes, and we all know how much it costs.
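[Editor’s note: for readers who want to picture what “supporting the Elasticsearch API” means in practice, here is a minimal sketch of the kind of query body a Kibana or Grafana panel might send to any Elasticsearch-compatible endpoint. The index, field names, and search term are invented for illustration and are not ChaosSearch specifics.]

```python
import json

# A typical Elasticsearch-style query body, as a Kibana or Grafana panel
# might send it: a full-text match combined with a time-range filter.
# The "message" and "@timestamp" field names and the "timeout" search term
# are illustrative assumptions, not taken from any real deployment.
query = {
    "size": 100,
    "query": {
        "bool": {
            "must": [
                {"match": {"message": "timeout"}},  # full-text search term
            ],
            "filter": [
                # A 90-day lookback window, the retention horizon discussed
                # later in the episode.
                {"range": {"@timestamp": {"gte": "now-90d", "lte": "now"}}},
            ],
        }
    },
    "sort": [{"@timestamp": {"order": "desc"}}],
}

# The body is serialized to JSON and POSTed to the _search endpoint of
# whatever Elasticsearch-compatible service sits behind the tooling.
body = json.dumps(query)
```

Because the query DSL, not the storage engine, is what the tooling depends on, any back end that speaks this API can sit behind an existing Kibana or Grafana setup.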

Corey: They certainly do. One of the things that I found interesting is that, as I’ve mentioned before, when I do consulting work at The Duckbill Group, we have absolutely no partners in the entire space. That includes AWS, incidentally. But it was easy in the beginning because I was well aware of what you folks were up to, and it was great when there was a use case that matched of you’re spending an awful lot of money on Elasticsearch; consider perhaps migrating some of that—if it makes sense—to ChaosSearch. Ironically, when you started sponsoring some of my nonsense, that conversation got slightly trickier where I had to disclose, yeah, our media arm does have sponsorships going on with them, but that has no bearing on what I’m saying.

And if they take their sponsorships away—please don’t—then we would still be recommending them because it’s the right answer, and it’s what we would use if we were in your position. We receive no kickbacks, partner deals, or any sort of reseller arrangement because it just clouds the whole conflict of interest perception. But you folks have been fantastic for a long time in a bunch of different ways.

Thomas: Well, you know, I would say that what you thought made a lot of sense made a lot of sense to us as well. So, the ChaosSearch idea just makes sense. Now, we had to crack some code, solve some problems, invent some technology, and create some new architecture, but the idea that Elasticsearch is a useful solution with all the tooling, the visualization, the wonderful community around it, was a good place to start. But here’s the problem: setting it up, scaling it out, keeping it up when things are happening, things going bump in the night. All those are real challenges, and one of them was just the storing of the data. Well, what if you could make S3 the back-end store? One hundred percent; no SSDs or HDDs. Makes a lot of sense.

And then support the APIs that your tooling uses. So, it just made a lot of sense on what we were trying to do; just no one had thought of it. Now, if you think about the North Star we were talking about, you know, five, six years ago, when I said, transforming cloud storage into an analytical database for search and SQL, people thought that was crazy and mad. Well, now everyone’s using cloud storage, everyone’s using S3 as a data lake. That’s not in question anymore.

But it was a question five, six, you know, years ago. So, when we met up, you’re like, “Well, that makes sense.” It always made sense, but people either didn’t think it was possible, or were worried, you know, “I’ll just try to set up an Elastic cluster and deal with it.” Because that’s what happens when you particularly deal with large-scale implementations. So, you know, to us, we love the Elastic API, the tooling around it, but what we all know is the cost, the time, the complexity to manage it, to scale it out; you just about want to pull your hair out. And so, that’s where we come in: don’t change what you do, just change how you do it.

Corey: Every once in a while, I’ll talk to a client who’s running an Amazon Elasticsearch cluster, and they have nothing but good things to say about it. Which, awesome. On the one hand, part of me wishes that I had some of their secrets, but often what’s happened is that they have this down to a science, they have a data lifecycle that’s clearly defined and implemented, the cluster is relatively static, so resizes aren’t really a thing, and it just works for their use cases. And in those scenarios, like, “Do you care about the bill?” “Not overly. We don’t have to think about it.”

Great. Then why change? If there’s no pain, you’re not going to sell someone something, especially when we’re talking, this tends to be relatively smaller-scale as well. It’s okay, great, they’re spending $5,000 a month on it. It doesn’t necessarily justify the engineering effort to move off.

Now, when you start looking at this, and, “Huh, that’s a quarter million bucks a month we’re spending on this nonsense, and it goes down all the time,” yeah, that’s when it starts to be one of those logical areas to start picking apart and diving into. What’s also muddied the waters since the last time we really went in-depth on any of this was it used to be we would be talking about it exactly like we are right now, about how it’s Elasticsearch-compatible. Technically, these days, we probably should be saying it is OpenSearch-compatible because of the trademark issues between Elastic and AWS and the schism of the OpenSearch fork of the Elasticsearch project. And now it feels like when you start putting random words in front of the word search, ChaosSearch fits right in. It feels like your star is rising.

Thomas: Yeah, no, well said. I appreciate that. You know, it’s funny, when Elastic changed their license, we all didn’t know what was going to happen. We knew something was going to happen, but we didn’t know what. And Amazon, I say ironically, or, more importantly, decided they’d take up the open mantle of keeping an open, free solution.

Now, obviously, they recommend running that in their cloud. Fair enough. But I would say we don’t hear as much Elastic replacement as much as OpenSearch replacement with our solution, because of all the benefits that we talked about. Because the trigger points for when folks have an issue with the OpenSearch or Elastic stack is it got too expensive, or it was changing so much and it was falling over, or the complexity of the schema changing, or all the above. The pipelines were complex, particularly at scale.

That’s both for Elasticsearch, as well as OpenSearch. And so, to us, we want either one to win, but we want to be the replacement because, you know, at scale is where we shine. But we have seen a real trend where we see less Elasticsearch and more OpenSearch because the community is worried about the rules that were changed, right? You see it day in, day out, where you have a community that was built around open and fair and free, and because of business models not working or the big bad so-and-so taking advantage of it, there’s a license change. And that’s a trust change.

And to us, we’re following the OpenSearch path because it’s still open. Amazon is the 600-pound gorilla, or 900-pound gorilla. But they really held the mantle, saying, “We’re going to stay open, we assume for as long as we know, and we’ll follow that path. But again, at that scale, the time, the costs, we’re here to help solve those problems.” Again, whether it’s on Amazon or, you know, Google, et cetera.

Corey: I want to go back to what I mentioned at the start of this with the Wayback Machine and looking at how things wound up unfolding in the fullness of time. The first time that it snapshotted your site was way back in the year 2018, which—

Thomas: Nice. [laugh].

Corey: Some of us may remember, and at that point, like, I wasn’t doing any work with you, and later in time I would make fun of you folks for this, but back then your brand name was in all caps, so I would periodically say things like this episode is sponsored by our friends at [loudly] CHAOSSEARCH.

Thomas: [laugh].

Corey: And once you stopped capitalizing it and that had faded from the common awareness, it just started to look like I had the inability to control the volume of my own voice. Which, fair, but generally not mid-sentence. So, I remember those early days, but the positioning of it was, “The future of log management and analytics,” back in 2018. Skipping forward a year later, you changed this because apparently in 2019, the future was already here. And you were talking about, “Log search analytics, purpose-built for Amazon S3. Store everything, ask anything all on your Amazon S3.”

Which is awesome. You were still—unfortunately—going by the all caps thing, but by 2020, that wound up changing somewhat significantly. You were, at that point, talking about it as, “The data platform for scalable log analytics.” Okay, it’s clearly heading in a log direction, and that made a whole bunch of sense. And now today, you are, “The data lake platform for analytics at scale.” So, good for you, first off. You found a voice?

Thomas: [laugh]. Well, you know, it’s funny, as a product-minded person—I’ll take my marketing hat off—we’ve been building the same solution with the same value points and benefits as we mentioned earlier, but the market resonates with different terminology. When we said something like, “Transforming your cloud object storage like S3 into an analytical database,” people were just, like, blown away. Is that even possible? Right? And so, that got some eyes.

Corey: Oh, anything is a database if you hold that wrong. Absolutely.

Thomas: [laugh]. Yeah, yeah. And then, as you were saying, log analytics really resonated for a few years. Data platform, you know, is broader because we do broader things. And now we see over the last few years, observability, right? How do you fit in the observability viewpoint, the stack, where log analytics is one aspect of it?

Some of our customers use Grafana on us for that lens, and then for the analysis, alerting, dashboarding. You could say Kibana for the hunting aspect, the log aspects. So, you know, to us, we’re going to put a message out there that resonates with what we’re hearing from our customers. For instance, we hear things like, “I need a security data lake. I need that. I need to stream all my data. I need to have all the data because what happens today, I need to know about a week, two weeks, 90 days from now.”

We constantly hear, “I need at least 90 days of forensics on that data.” And it happens time and time again. We hear in the observability stack where, “Hey, I love Datadog, but I can’t afford it for more than a week or two.” Well, that’s where we come in. And we either replace Datadog for the use cases that we support, or we’re auxiliary to it.

Sometimes they have an existing Grafana implementation, and then they store data in us for the long tail. That could be the scenario. So, to us, the message is around what resonates with our customers, but in the end, it’s operational data. Whether you want to call it observability, log analytics, security analytics, or the data lake, to us, it’s just access to your data, all your data, all the time, and supporting the APIs and the tooling that you’re using. And so, to me, it’s the same product, but the market changes with messaging and requirements. And this is why we always felt that having a search and SQL platform is so key, because what you’ll see in Elastic or OpenSearch is, “Well, I only support the Elastic API. I can’t do correlations. I can’t do this. I can’t do that. I’m going to move it over to, say, maybe Athena, but not so much; maybe a Snowflake or something else.”

Corey: “Well, Thomas, it’s very simple. Once you learn our own purpose-built, domain-specific language, specifically for our product, well, why are you still sitting here, go learn that thing.” People aren’t going to do that.

Thomas: And that’s what we hear. It was funny, I won’t say what the company was, a big banking company that we’re talking to, and we hear time and time again, “I only want to do it via the Elastic tooling,” or, “I only want to do it via the BI tooling.” I hear it time and time again. Both of these people are in the same company.

Corey: And that’s legitimate as well because there’s a bunch of pre-existing processes pointing at things, and we’re not going to change 200 different applications and their data model just because you want to replace a back-end system. I also want to correct myself. I was one tab behind. This year’s branding is slightly different: “Search and analyze unlimited log data in your cloud object storage.” Which is, I really like the evolution on this.

Thomas: Yeah, yeah. And I love it. And what was interesting is the moving, the setting up, the doubling of your costs. Let’s say you have—I mean, we deal with some big customers that have petabytes of data; doubling your petabytes means, if your Elastic environment is costing you tens of millions and then you put it into Snowflake, that’s also going to be tens of millions. And with a solution like ours, you have really cost-effective storage, right? Your cloud storage: it’s secure, it’s reliable, it’s elastic, and you attach Chaos to get the well-known APIs that your well-known tooling can analyze.

So, to us, our evolution has really been being the end viewpoint where we started early, where the search and SQL is here today—and you know, in the future, we’ll be coming out with more ML-type tooling—but we have two sides: we have the operational, security, observability side, and a lot of the business side wants access to that data as well. Maybe it’s app data that they need to do analysis on for their shopping cart website, for instance.

Corey: The thing that I find curious is, the entire space has been iterating forward on trying to define observability, generally, as whatever people are already trying to sell in many cases. And that has seemed to be a bit of a stumbling block for a lot of folks. I figured this out somewhat recently because I’ve built the—free for everyone to use—Twitter threading client.

That’s deployed to 20 different AWS regions because the idea is that it should be snappy for people, no matter where they happen to be on the planet, and I use it for conferences when I travel, so great, let’s get ahead of it. But that also means I’ve got 20 different sources of logs. And given that it’s an omnibus Lambda function, it’s very hard to correlate that to users, or user sessions, or even figure out where it’s going. The problem I’ve had is, “Oh, well, this seems like something I could instrument to spray logs somewhere pretty easily, but I don’t want to instrument it for 15 different observability vendors. Why don’t I just use OTel—or OpenTelemetry—and then tell that to throw whatever I care about to various vendors and do a bit of a bake-off?” The problem, of course, is that OpenTelemetry and Lambda seem to pull in just the absolute wrong directions. A lot.

Thomas: So, we see the same trend of OTel coming out, and you know, this is another API that I’m sure we’re going to go all-in on because it’s getting more and more talked about. I won’t say it’s the standard yet, but I think it’s trending that way, to all your points about needing to normalize a process. But as you mentioned, we also need to correlate across the data. And this is where, you know, there are times where search and hunting and alerting is awesome and wonderful and solves all your needs, and sometimes you need correlation. Imagine trying to denormalize all those logs, set up a pipeline, put it into some database, or just do a SELECT *, you know, join this to that to that, and get your answers.

And so, I think both OpenTelemetry and SQL and search all need to be played into one solution, or at least one capability because if you’re not doing that, you’re creating some hodgepodge pipeline to move it around and ultimately get your questions answered. And if it takes weeks—maybe even months, depending on the scale—you may sometimes not choose to do it.

Corey: One other aspect that has always annoyed me about more or less every analytics company out there—and you folks are no exception to this—is the idea of charging per gigabyte ingested, because that inherently sets up a weird dichotomy of, well, this is costing a lot, so I should strive to log less. And that is sort of the exact opposite, not just of the direction you folks want customers to go in, but also of where customers themselves should be going. Where you diverge from an awful lot of those other companies, because of the nature of how you work, is that you don’t charge them again for retention. And the idea that anything stored in ChaosSearch lives in your own S3 buckets, where you can set your own lifecycle policies and do whatever you want to do with that, is a phenomenal benefit, just because I’ve always had a dim view of short-lived retention periods around logs, especially around things like audit logs. And these days, getting rid of audit logging data and application logging data—especially if there’s a correlation story—any sooner than three years feels like borderline malpractice.
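[Editor’s note: owning the bucket and setting your own lifecycle policy, as Corey describes, can be sketched concretely. The rule below follows the shape that boto3’s `put_bucket_lifecycle_configuration` expects; the rule ID, bucket name, and `logs/` prefix are invented for illustration.]

```python
# Sketch of an S3 lifecycle rule: keep logs in S3 Standard for 90 days,
# then transition them to Glacier Deep Archive instead of deleting them.
# The rule ID and "logs/" prefix are illustrative assumptions.
lifecycle_config = {
    "Rules": [
        {
            "ID": "audit-logs-to-deep-archive",
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},
            "Transitions": [
                {"Days": 90, "StorageClass": "DEEP_ARCHIVE"},
            ],
            # Deliberately no Expiration action: the goal here is retention,
            # not deletion.
        }
    ]
}

# Applying it would look roughly like this (needs AWS credentials and a
# real bucket, so it is left commented out):
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="my-log-bucket",
#     LifecycleConfiguration=lifecycle_config,
# )
```

The notable design choice is the absence of an expiration rule: the data only ever gets cheaper to keep, never disappears.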

Thomas: [laugh]. We—how many times—I mean, we’ve heard it time and time again: “I don’t have access to that data because it was too costly.” No one says they don’t want the data. They just can’t afford the data. And one of the key premises is that if you don’t have all the data, you’re at risk, particularly in security—I mean, even audits. I mean, so many times our customers ask us, you know, “Hey, what was this going on? What was that going on?” And because we can so cost-effectively monitor our own service, we can provide that information for them. And we hear this time and time again.

And retention is not a very sexy aspect, but it’s so crucial. Anytime you look at problems with X solution or Y solution, it’s the cost of the data. And this is something that we wanted to address directly. And why we could make it so cost-effective, and free after you ingest it, was because we were using cloud storage. And it was just a great place to land the data cost-effectively, securely.

Now, with that said, there are two types of companies I’ve seen. Everybody needs at least 90 days; I see it time and time again. Sure, maybe within a day or a week they do a lot of their operations, but 90 days is where it lands. But there’s also a bunch of companies that need it for years, for compliance, for audit reasons.

And imagine trying to rehydrate, trying to rebuild—we have one customer—again, I won’t say who—that has two petabytes of data that they rehydrate when they need it. And they say it’s a nightmare. And it’s growing. What if you just had it always alive, always accessible? Now, as we move from search to SQL, there are use cases where, in the log world, they just want to pay upfront, a fixed fee, this many dollars per terabyte, but as we get into the more ad hoc side of it, more and more folks are asking, “Can I pay per query?”

And so, you’ll see, coming out soon, scenarios where we have a different pricing model. For logs, typically, you want to pay a very consistent, you know, predetermined cost structure, but in the case of more security data lakes, where you want to go into the past and not really pay for something until you use it, that’s going to be an option as well, coming out soon. So, I would say you need both in the pricing models, but you need the data either way, right?

Corey: This episode is sponsored in part by our friends at ChaosSearch. You could run Elasticsearch or Elastic Cloud—or OpenSearch as they’re calling it now—or a self-hosted ELK stack. But why? ChaosSearch gives you the same API you’ve come to know and tolerate, along with unlimited data retention and no data movement. Just throw your data into S3 and proceed from there as you would expect. This is great for IT operations folks, for app performance monitoring, cybersecurity. If you’re using Elasticsearch, consider not running Elasticsearch. They’re also available now in the AWS marketplace if you’d prefer not to go direct and have half of whatever you pay them count towards your EDP commitment. Discover what companies like Equifax, Armor Security, and Blackboard already have. To learn more, visit and tell them I sent you just so you can see them facepalm, yet again.

Corey: You’d like to hope. I mean, you could always theoretically wind up just pulling what Ubiquiti apparently did—where this came out in an indictment that was unsealed against an insider—but apparently one of their employees wound up attempting to extort them—which again, that’s not their fault, to be clear—but what came out was that this person then wound up setting the CloudTrail audit log retention to one day, so there were no logs available. And then as a customer, I got an email from them saying there was no evidence that any customer data had been accessed. I mean, yeah, if you want, like, the world’s most horrifyingly devilish best practice, go ahead and set your log retention to nothing, and then you too can confidently state that you have no evidence of anything untoward happening.

Contrast this with what AWS did when there was a vulnerability reported in AWS Glue. Their analysis of it stated explicitly, “We have looked at our audit logs going back to the launch of the service and have conclusively proven that the only time this has ever happened was in the security researcher who reported the vulnerability to us, in their own account.” Yeah, one of those statements breeds an awful lot of confidence. The other one makes me think that you’re basically being run by clowns.

Thomas: You know what? CloudTrail is such a crucial service—particularly on Amazon, right—and because of that, we see it time and time again. And the challenge of CloudTrail is that storing it for a long period of time is costly, and the messiness, the JSON complexity—every company struggles with it. And this is where we’re unique—how we represent information, we can model it in all its permutations—but the key thing is we can store it forever, or you can store it forever. And time and time again, CloudTrail is a key aspect to correlate—to your question—correlate this happened to that. Or do an audit on, two years ago, this happened.

And I’ve got to tell you, to all our listeners out there, please store your CloudTrail data—ideally in ChaosSearch—because you’re going to need it. Everyone always needs that. And I know it’s hard. CloudTrail data is messy, nested JSON data that can explode; I get it. You know, there are tricks to do it manually, although quite painful. But every one of our customers is indexing CloudTrail with us because of stories like that, as well as to correlate across what maybe their application log data is saying.
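[Editor’s note: the “messy, nested JSON” point is easy to see with the standard library alone. Below is a generic flattening sketch; the sample event is invented and trimmed far below what a real CloudTrail record contains, and real records add lists (e.g. a `Records` array) that this sketch leaves untouched.]

```python
import json

def flatten(obj, prefix=""):
    """Recursively flatten nested dicts into dotted keys, e.g. userIdentity.arn.

    Lists and scalar values are left as-is; this is only a sketch of the
    kind of normalization log pipelines end up doing to CloudTrail events.
    """
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{path}."))
        else:
            flat[path] = value
    return flat

# A trimmed, invented CloudTrail-shaped event for illustration only.
event = json.loads("""{
  "eventName": "StopLogging",
  "userIdentity": {"type": "IAMUser", "arn": "arn:aws:iam::123456789012:user/example"},
  "requestParameters": {"name": "management-events"}
}""")

flat = flatten(event)
# Flattened keys like "userIdentity.arn" are what make the event easy to
# index, filter, and correlate against application logs.
```

Real records nest several levels deeper and vary by service, which is why hand-rolled pipelines for this data tend to become, as Thomas says, quite painful.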

Corey: I really have never regretted having extra logs lying around, especially with, to be very direct, the almost ridiculously inexpensive storage classes that S3 offers, especially since you can wind up having some of the offline retrieval stuff as part of a lifecycle policy now with intelligent tiering. I’m a big believer in, again, Glacier Deep Archive at the cost of roughly $1,000 a month per petabyte, with, admittedly, up to 12 hours of retrieval latency. But still, for audit logs and stuff like that, why would I ever want to delete things ever again?
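[Editor’s note: Corey’s “$1,000 a month per petabyte” figure follows from the published per-GB rates. The prices below are approximate us-east-1 list prices, subject to change, and ignore volume tiers, request charges, and retrieval fees.]

```python
# Back-of-the-envelope retention cost per petabyte-month, using approximate
# us-east-1 list prices (assumptions; AWS pricing changes over time and
# volume discounts are ignored).
GB_PER_PB = 1024 ** 2        # 1,048,576 GB in a petabyte (binary units)
S3_STANDARD = 0.023          # $/GB-month, approximate
DEEP_ARCHIVE = 0.00099       # $/GB-month, approximate

standard_cost = GB_PER_PB * S3_STANDARD        # ~ $24,117 per PB-month
deep_archive_cost = GB_PER_PB * DEEP_ARCHIVE   # ~ $1,038 per PB-month

print(f"S3 Standard:          ${standard_cost:,.0f}/PB-month")
print(f"Glacier Deep Archive: ${deep_archive_cost:,.0f}/PB-month")
```

Deep Archive comes out a bit over $1,000 per petabyte-month, matching Corey’s figure, and more than twenty times cheaper than keeping the same data in S3 Standard.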

Thomas: You’re exactly right. And we have a bunch of customers that do exactly that. And we automate the entire process with you. Obviously, it’s your S3 account, but we can manage across those tiers. And it’s just to a point where, why wouldn’t you? It’s so cost-effective.

And the moments where you don’t have that information, you’re at risk, whether it’s internal audits, or you’re providing a service for somebody, it’s critical data. With CloudTrail, it’s critical data. And if you’re not storing it and if you’re not making it accessible through some tool like an Elastic API or Chaos, it’s not worth it. I think, to your point about your story, it’s epically not worth it.

Corey: It’s really not. It’s one of those areas where that is not a place to overly cost-optimize. This is—I mean, we talked earlier about my business and perceptions of conflict of interest. There’s a reason that I only ever charge fixed-fee and not a percentage of savings or whatnot, because at some point, I’d be placed in a position of having to say nonsense like, “Do you really need all of these backups?” That doesn’t make sense at that point.

I do point out things like, you have hourly disk snapshots of your entire web fleet, which has no irreplaceable data on it, dating back five years. Maybe cleaning some of that up might be the right answer. The happy answer is somewhere in between those two, and it’s a business decision around exactly where that line lies. But I’m a believer in never regretting having kept logs almost into perpetuity. Until and unless I start getting more or less pillaged by some particularly rapacious vendor that says, oh yeah, we’re going to charge you not just for ingest, but also for retention, and for how long you want to keep it, we’re going to treat it like we’re carving it into platinum tablets. No. Stop that.

Thomas: [laugh]. Well, you know, it’s funny, when we first came out, we were hearing stories that vendors were telling customers why they didn’t need their data, to your point, like, “Oh, you don’t need that,” or, “Don’t worry about that.” And time and time again, they said, “Well, turns out we did need that.” You know, “Oh, don’t index all your data because you just know what you know.” And the problem is that life doesn’t work out that way; business doesn’t work out that way.

And now what I see in the market is everyone’s got tiering scenarios, but the accessibility of that data takes some time. And these are all workarounds and bandaids to what fundamentally is: if you design an architecture and a solution in such a way, maybe it’s just always hot; maybe it’s just always available. Now, we talked about tiering off to something very, very cheap, where it’s virtually free. But you know, other solutions, whether it’s UltraWarm, or this tiering that takes hours to rehydrate—hours—no one wants to live in that world, right? They just want to say, “Hey, on this date in this year, what was happening? And let me go look, and I want to do it now.”

And it has to be part of the exact same system that I was using already. I didn’t have to call up IT to say, “Hey, can you rehydrate this?” Or, “Can I go back to the archive and look at it?” Although I guess we’re talking about archiving with your website, viewing from days of old, I think that’s kind of funny. I should do that more often myself.

Corey: I really wish that more companies would put themselves in the customers’ shoes. And for what it’s worth, periodically, I’ve spoken to a number of very happy ChaosSearch customers. I haven’t spoken to any angry ones yet, which tells me you’re either terrific at crisis comms, or the product itself functions as intended. So, either way, excellent job. Now, which team of yours is doing that excellent job, of course, is going to depend on which one of those outcomes it is. But I’m pretty good at ferreting out stories on those things.

Thomas: Well, you know, it’s funny, being a company that’s driven by customer ask, it’s so easy to build what the customer wants. And so, we really take every input of what the customer needs and wants—now, there are cases where we replace Splunk. They’re the Cadillac, they have all the bells and whistles, and there are times where we’ll say, “Listen, that’s not what we’re going to do. We’re going to solve these problems in this vector.” But they always keep on asking, right? You know, “I want this, I want that.”

But most of the feedback we get is exactly what we should be building. People need their answers and how they get them. It’s really helped us grow as a company, grow as a product. And I will say, ever since we went live now many, many years ago, all our roadmap—other than our North Star of transforming cloud storage into a search and SQL big data analytics database—has been customer-driven, market- and customer-driven, like what our customers are asking for, whether it’s observability and integrating with Grafana and Kibana or, you know, security data lakes. It’s just a huge theme, and we’re going to make sure that we provide a solution that meets those needs.

So, I love when customers ask for stuff because the product just gets better. I mean, yeah, sometimes you have to have a thick skin, like, “Why don’t you have this?” Or, “Why don’t you have that?” Or we have customers—and not to complain about customers; I love our customers—but they sometimes do crazy things that we have to help them un-crazy-ify. [laugh]. I’ll leave it at that. But customers do silly things and you have to help them out. I hope they remember that, so when they ask for a feature that maybe takes a month to make available, they’re patient with us.

Corey: We sure can hope. I really want to thank you for taking so much time to once again suffer all of my criticisms, slings and arrows, blithe market observations, et cetera, et cetera. If people want to learn more, where’s the best place to find you?

Thomas: Well, of course, There’s tons of material about what we do, use cases, case studies; we just published a big case study with Equifax recently. We’re in Gartner and a whole bunch of Hype Cycles that you can pull down to see how we fit in the market.

Reach out to us. You can set up a trial, kick the tires, again, on your cloud storage like S3. And ChaosSearch is on Twitter, we have a Facebook, we have all the classic social media. But our website is really where all the good content lives, whether you want to learn about the architecture and how we’ve done it, or use cases; it’s for people who want to say, “Hey, I have a problem. How do you solve it? How do I learn more?”

Corey: And we will, of course, put links to that in the show notes. For my own purposes, you could also just search for the term ChaosSearch in your email inbox and find one of their sponsored ads in my newsletter and click that link, but that’s a little self-serving as we do it. I’m kidding. I’m kidding. There’s no need to do that. That is not how we ever evaluate these things. But it is funny to tell that story. Thomas, thank you so much for your time. As always, it’s appreciated.

Thomas: Corey Quinn, I truly enjoyed this time. And I look forward to upcoming re:Invent. I’m assuming it’s going to be live like last year, and this is where we have a lot of fun with the community.

Corey: Oh, I have no doubt that we’re about to go through that particular path very soon. Thank you. It’s been an absolute pleasure.

Thomas: Thank you.

Corey: Thomas Hazel, CTO and Founder of ChaosSearch. I’m Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you’ve enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you’ve hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry, insulting comment that I will then set to have a retention period of one day, and then go on to claim that I have received no negative feedback.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit to get started.

Announcer: This has been a HumblePod production. Stay humble.
