Corey: Welcome to AWS Morning Brief: Whiteboard Confessional. I’m Cloud Economist Corey Quinn. This weekly show exposes the semi-polite lie that is whiteboard architecture diagrams. You see, a child can draw a whiteboard architecture, but the real world is a mess. We discuss the hilariously bad decisions that make it into shipping products, the unfortunate hacks the real world forces us to build, and that the best thing to call your staging environment is “theory”. Because invariably whatever you’ve built works in the theory, but not in production. Let’s get to it.
nOps will help you reduce AWS costs 15 to 50 percent if you do what it tells you. But some people do. For example, watch their webcast, how Uber reduced AWS costs 15 percent in 30 days; that is six figures in 30 days. Rather than a thing you might do, this is something that they actually did. Take a look at it. It's designed for DevOps teams. nOps helps quickly discover the root causes of cost, and correlate that with infrastructure changes. Try it free for 30 days, go to nops.io/snark. That's N-O-P-S dot I-O, slash snark.
Good morning AWS, and welcome to the AWS Morning Brief: Whiteboard Confessional. Today we're going to revisit DNS. Now, now, slow down there, Hasty Pudding. Don't bother turning the podcast off. For once, I'm not talking about using it as a database… this time. As you're probably aware, DNS is what folks use to equate friendly names like twitterforpets.com, or incredibly unfriendly names like Oracle.com, to IP addresses, which is how computers tend to see the world. I'm not going to rehash what DNS does.
Instead, I'm going to talk about a particular kind of DNS problem that befell a place I used to consult for. They're publicly traded now, so I'm not going to name them. An awful lot of shops do something that's called split-horizon DNS. What that means is that if you're on a particular network, a DNS name resolves differently than it does when you're on a different network. For example, admin.twitterforpets.com will resolve to an administrative dashboard if you're on the Twitter For Pets internal network via VPN, but it won't resolve to that dashboard if you're outside the network, or it might resolve nowhere, or it might resolve just back to their main website, www.twitterforpets.com.
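To make the split-horizon idea concrete, here is a minimal sketch in Python. It is not how Route 53 or any real DNS server is implemented; it only models the behavior described above, where the answer depends on which network the question came from. The network range, the record tables, and the `resolve` function are all hypothetical names invented for this illustration.

```python
import ipaddress

# Hypothetical internal (VPN) range and record tables, for illustration only.
INTERNAL_NETWORK = ipaddress.ip_network("10.0.0.0/8")
INTERNAL_VIEW = {"admin.twitterforpets.com": "10.1.2.3"}
EXTERNAL_VIEW = {"www.twitterforpets.com": "203.0.113.10"}  # documentation range


def resolve(name, client_ip):
    """Return the address a split-horizon server might hand this client, or None."""
    client = ipaddress.ip_address(client_ip)
    if client in INTERNAL_NETWORK:
        # Internal clients see internal records first, then fall back to public ones.
        return INTERNAL_VIEW.get(name, EXTERNAL_VIEW.get(name))
    # Everyone else only ever sees the public view.
    return EXTERNAL_VIEW.get(name)
```

In this sketch a client on the VPN gets the private address for the admin dashboard, while the same question asked from outside gets no answer at all, which is one of the outcomes described above.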
And that's fine. Most DNS providers can support this, and Route 53 is, of course, no exception. This is, incidentally, what the Route 53 resolver, that was released in 2018, is designed to do: it bridges private DNS zones to on-premises environments, so your internal zones can then resolve to private IP addresses without having to show your private IP address ranges in public zones to everyone. So, the reason that matters is that this keeps you from broadcasting your architecture or your network layout externally to your company. Some folks consider doing that to be a security problem because it discloses information that an attacker can then leverage to gain further toeholds into your network. Some folks also think that that tends to be a little bit on the extreme side. I'll let you decide because I don't care, and that's not what the story is about.
The point is that split-horizon DNS is controversial, for a few reasons, but in many shops, it is considered the right thing to do because it's what they've been doing. The internal DNS names either don't resolve to anything publicly, or they resolve to a different system that’s configured to reject the request outright. But there is another path you can take; a third option that no one discusses because it's a path that's far darker, because it is oh, so very much dumber. But first…
This episode is sponsored in part by N2WS. Do you know what you care about? Many things, but never backups. At least until right after you really, really, really needed to care about backups. That's what N2WS does for your AWS account. It allows you to cycle backups through different storage tiers; you can back things up cost-effectively, and safely. For a limited time, N2WS is offering you $100 in AWS credits for setting up their free trial, and I encourage you to give it a shot. To learn more visit snark.cloud/n2ws. That's snark.cloud/n2ws.
What I'm about to describe is far too stupid for my made-up startup of Twitter For Pets, so we're going to have to invent a somehow even dumber company, and we're going to call it Uber For Squirrels. It's like regular Uber, except it somehow manages to lose less money. Now, there's a very strong argument among the engineering community inside of Uber For Squirrels. Split-horizon DNS is dangerous: that's what was argued for, and that's what was decided. And that's the proclamation because a misconfiguration could leak records into the wrong places, and theoretically take the entire Uber For Squirrels site down. There are merits to those arguments and you can't dismiss them out of hand, so a bargain was struck.
The external DNS zone was therefore decreed to be uberforsquirrels.com, while the internal zone was configured to be uberforsquirrels.net. The uberforsquirrels.net zone was only accessible inside of the network. From the outside, nobody could query it. Now, this is, in isolation, before I go further, a bad plan all on its own. When you're reading quickly, uberforsquirrels.com and uberforsquirrels.net don't jump out visually to people as being meaningfully different. You're going to typo it in config files constantly without meaning to, and then you're going to have a hell of a time tracking it down because it's not immediately obvious that you're talking to the wrong thing; you might think it's a network problem. Your tab completion out of your known_hosts file is going to break, too: if you have such a thing configured in your environment, you're going to have to hit tab a couple of extra times to cycle through the dot net variants and the dot com variants. It's just a general irritant.
But that's not enough to justify an episode of the show. Because wait, that is still some Twitter For Pets level brokenness. Why do I need to throw Uber For Squirrels under the bus? Well, because it turns out that despite using uberforsquirrels.net everywhere as their internal domain, they didn't actually own uberforsquirrels.net. It wasn't entirely clear who did other than that the registration was in another country, so it probably wasn't something that the CEO registered and then forgot about in his random domain list of things he acquired for companies he was going to start one day. And that zone itself was wildcarded to a single IP address. And what that means is that no matter what you typed in, admin.uberforsquirrels.net, vpn.uberforsquirrels.net, payments.uberforsquirrels.net, it all landed on the same place on a single server.
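A wildcarded zone like that is something you can probe for. The sketch below is one rough heuristic, not a definitive test: it asks for a few random names that nobody would ever have created on purpose, and if every one of them resolves, the zone is probably wildcarded. The function name `looks_wildcarded` and its parameters are made up for this example; the default resolver is the standard library's `socket.gethostbyname`, and it can be swapped out so the logic can be exercised without touching the network.

```python
import socket
import uuid


def looks_wildcarded(domain, resolve=socket.gethostbyname, probes=3):
    """Heuristic: True if several random, never-created names under `domain` all resolve."""
    for _ in range(probes):
        # A random 32-character hex label; nobody registered this on purpose.
        label = uuid.uuid4().hex
        try:
            resolve(f"{label}.{domain}")
        except OSError:
            # socket.gaierror is a subclass of OSError: the name did not resolve,
            # so at least this probe was not caught by a wildcard.
            return False
    return True
```

Calling `looks_wildcarded("uberforsquirrels.net")` from outside the company network is the kind of check that would have flagged the situation described here, assuming the wildcard answered for every label.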
And that server had some interesting things configured on it. It would listen on HTTP, HTTPS, and SSH, and many other listeners were hanging out on that server. It just sat there listening on basically every port to every protocol. It would silently wait for connections and then let you send it whatever it is you wanted. So, if you weren't on the VPN when you thought you were, boom, you just typed your credentials into some rando's web server. They even had a wildcarded email server set up. Anything emailed to any username at all at uberforsquirrels.net would go through, and it was never clear what happened to it afterwards.
“Damn it, I hit the dot net, again!” was the rallying cry in the Uber For Squirrels engineering halls. And then, it was time for yet another credential rotation. Now, maybe this person who set this up had no idea what chaos they caused. Maybe they did it intentionally. Maybe they were a disgruntled former employee; I don't know. What I do know is that one day, the domain was transferred to the company by way of an escrow service. So, I can only assume that that person was in turn given an eye-wateringly large check. Good for them. I mean, that is the type of mistake that was easily avoided, if only someone had been paying attention. By the time people realized the trouble that they were in, it was too late because changing all of your systems to use a different DNS zone entirely is non-trivial.
Now, I try not to fill this podcast with stories of things that broke once in a weird way that couldn't possibly ever recur again. There should, ideally, be a moral to every episode, something you can take with you. And the idea is that there's a takeaway here, something that you can do to make your own environment better. So, here you go. This episode is no exception to that general trend. It is imperative that you own all of the domains you use, regardless of whether they're internal or external. And that includes domains that don't exist. For a long time, there was a finite list of publicly resolvable top-level domains, so folks would take liberties with the rest. Internal domains would be set to companyname.corp, development domains would be companyname.dev, and production domains would be dot prod.
And then, the chuckle-fucks at ICANN—that’s I-C-A-N-N—the group that regulates all of these things, decided that they like money a lot. And they put up a system by which anyone could get their own top-level domain if they make a good enough argument for it and cut an $80,000 check to ICANN. For example, dot aws is now a domain. You go to amazon.aws and that will resolve.
Chime.aws is a domain, but they refuse to give me lastweekin.aws. It's sad, and if you're listening and can help with that, please reach out. But this entire problem was then made even worse by the chuckle-fucks at Google because they did something right. Namely, they bought the dot prod domain, and then they're sitting on it so it doesn't resolve, so people aren't going to be sending company secrets all over the place, but they also bought dot dev. And they opened up dot dev to anyone who wanted to register any domain, which means that if your company uses yourcompanyname.dev as an internal testing domain, understand that if someone registers that domain, they can set up the exact same listening problem I've just described. Don't make the same mistake. Check your internal domains, check your testing domains, and make sure you own them. Then point them to something that you control, so you don't have to wonder who just sent company secrets to the wrong place. Domains are not expensive; data breaches very much are.
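As a starting point for that check, here is a small sketch that sorts a list of domains by where they resolve when asked from the public internet. The function name `audit_domains`, the three labels it returns, and the example address ranges are all assumptions made for illustration. It only looks at resolution; it cannot tell you who holds the registration, so your registrar account is still the place to confirm ownership.

```python
import ipaddress
import socket


def audit_domains(domains, owned_networks, resolve=socket.gethostbyname):
    """Label each domain 'unresolved', 'ours', or 'foreign' based on where it resolves."""
    nets = [ipaddress.ip_network(n) for n in owned_networks]
    report = {}
    for domain in domains:
        try:
            addr = ipaddress.ip_address(resolve(domain))
        except OSError:
            # Did not resolve. That is NOT proof of safety: an unregistered
            # name can still be registered by someone else tomorrow.
            report[domain] = "unresolved"
            continue
        # 'foreign' means it answers from an address outside the ranges you listed.
        report[domain] = "ours" if any(addr in n for n in nets) else "foreign"
    return report
```

Anything that comes back as 'foreign' is answering from a machine outside the ranges you listed, which is the situation worth investigating first. Anything 'unresolved' is worth checking at a registrar to see whether it is still available to buy.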
This has been another episode of the AWS Morning Brief: Whiteboard Confessional. I am Cloud Economist Corey Quinn, and if you've enjoyed this episode, please do me a personal favor and leave a five-star review on Apple Podcasts. Whereas if you've hated this, please leave a five-star review on Apple Podcasts, and then send your complaint to lastweekinaws.net.
Thank you for joining us on Whiteboard Confessional. If you have terrifying ideas, please reach out to me on Twitter at @quinnypig and let me know what I should talk about next time.
Announcer: This has been a HumblePod production. Stay humble.