I recently revisited Last Tweet in AWS, my free, dedicated tweet-threading Twitter client, to help users of that ridiculous shitposting app solve some latency issues when they were far away from its home region.
The client has always been pretty zippy for me, but it’s been hosted out of us-west-2 in Oregon, and I’m in San Francisco. When some friends went to conferences in Europe, I immediately facepalmed; of course it was going to be slow for them.
A careful study showed that AWS does, in fact, offer services in more than one region. I concocted the harebrained idea to not only revisit and slightly modernize the application stack, but also to deploy this thing to every commercial AWS region simultaneously.
It turns out that this was something of a Bad Idea.
Kicking off a multi-region CDK extravaganza
It made sense to deploy Last Tweet in AWS to multiple regions. After all, the Twitter client is completely stateless aside from a couple of API credentials that get set as Lambda environment variables at deploy-time. It’s just a DNS record, an API Gateway, and an omnibus Lambda function housing an Express application (thus Lambda@Edge was a complete nonstarter, which is fine because that service is complete garbage; if you disagree then ask yourself why AWS launched CloudFront Functions instead of fixing something that irredeemably awful).
Historically, I built and deployed this thing via the Serverless Framework. After some users found Serverless playing fast and loose with AWS credentials and not telling them about it, I resolved that maybe I didn’t want to use its offering anymore.
The replacement I landed on was the AWS CDK. Despite the often-challenging obstacles the CDK threw my way, I enjoyed the experience enough that it's likely to be my tool of choice for future AWS projects. You can tell that I succeeded: if you visit Last Tweet in AWS, the footer lists which region it's being served out of. I'm kinda proud of that.
That said, this was still an AWS project, and as a result, there were some incredibly sharp edges. I will now enumerate them.
The woes of using AWS
1. Like some of the best people, regional service availability is nonbinary
If you go to the AWS Regional Services List, it says that API Gateway is available in basically all regions, including both Jakarta, Indonesia, and Osaka, Japan.
Given that I was modernizing this entire stack away from the Serverless framework, it made sense to go with AWS’s new v2 HTTP API offering. In addition to being absolutely impossible to Google for as a term, an “HTTP API” is not supported in either the Jakarta or Osaka regions. At the very least, this deserves an asterisk in the regional services listing, before I have to discover it myself mid-project.
Very similarly, I originally wanted this Lambda function to run under AWS’s Graviton2 Arm architecture. The joke’s on me: Arm-based Lambda functions are only available in a subset of regions. To figure out which ones, you get to go spelunking into old blog posts, as AWS loves to play hide-and-seek with this kind of caveat.
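If you want a single stack definition that deploys everywhere, the workaround looks something like this: keep an allowlist of Arm-capable regions and fall back to x86_64 everywhere else. The region list below is a hypothetical snapshot reconstructed from those blog posts, not gospel; check AWS's current documentation before trusting it.

```typescript
// Hypothetical allowlist of regions where Graviton2 (arm64) Lambda
// functions were available at the time; verify against current AWS docs.
const ARM_REGIONS = new Set([
  "us-east-1", "us-east-2", "us-west-2",
  "eu-west-1", "eu-west-2", "eu-central-1",
  "ap-south-1", "ap-southeast-1", "ap-southeast-2", "ap-northeast-1",
]);

// Fall back to x86_64 wherever arm64 isn't offered, so one stack
// definition can deploy to every commercial region without exploding.
function lambdaArchitecture(region: string): "arm64" | "x86_64" {
  return ARM_REGIONS.has(region) ? "arm64" : "x86_64";
}

console.log(lambdaArchitecture("us-west-2"));      // "arm64"
console.log(lambdaArchitecture("ap-northeast-3")); // "x86_64"
```

The returned string maps directly onto the architecture setting you'd feed a CDK Lambda construct per-region.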
2. I won’t tell you my secrets
There are two API credentials I wanted the application to have, and I didn’t want to bake them into the code for obvious reasons. Parameter Store’s SecureString values are unsupported in CloudFormation without some gnarly caveats, and all of the documentation pushes you toward using AWS Secrets Manager instead. With two secrets in 20 regions at 40¢ apiece, I was somewhat unwilling to pay AWS $16 every month just to juggle two secrets for me. So I decided “to hell with this” and built out an S3 bucket containing an object featuring these credentials stored in JSON.
Getting TypeScript to retrieve these properly as a part of the CDK stack was not only challenging, it was also the single biggest obstacle to overcome. “Hey, that’s a promise object, not a string” led to a variety of documentation that presupposed I knew what the hell those words meant, because of course I was already conversant with the joy that is TypeScript and its ecosystem or I wouldn’t be here, right? “Guess, check, scream with rage, depend upon a community member named Eric Tucker to be unreasonably generous with his time, and eventually solve it via brute force” was the single most off-putting part of the entire process.
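For anyone else who hits the same wall, here's a minimal sketch of the shape of the fix, with a hypothetical stand-in function in place of the real S3 `GetObject` call: resolve the promise up front in an async entry point, then hand plain strings to the stack.

```typescript
// Hypothetical stand-in for an SDK call such as S3's GetObject; the
// real AWS SDK v3 call returns a Promise in exactly the same way.
async function fetchSecretJson(): Promise<string> {
  return JSON.stringify({
    consumerKey: "example-key",
    consumerSecret: "example-secret",
  });
}

// Broken: this assigns the Promise itself, not the string inside it.
// const secrets = fetchSecretJson(); // type is Promise<string>, not string

// Working: resolve the promise *before* synthesizing the CDK app,
// then pass plain strings into the stack (e.g. as Lambda env vars).
async function main(): Promise<void> {
  const secrets = JSON.parse(await fetchSecretJson());
  // new MyStack(app, "MyStack", { consumerKey: secrets.consumerKey, ... });
  console.log(secrets.consumerKey);
}

main();
```

The key design point: the async/await dance happens in the entry point, so the stack constructor itself never has to know a promise was involved.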
The CDK clearly needs a better story around semi-sensitive secrets storage. The far easier path would have been to just hardcode the creds into code itself. Yuck.
3. The documentation needs love
The documentation is rich and deep, but it was clearly written by folks who are themselves very familiar with both the TypeScript language and the CDK itself. You wouldn’t think that this would be an issue, but the problem that invariably creeps in is the implicit assumption that the reader knows as much as the documentation author. If that were the case, would I be reading the docs?
Here’s a good rule of thumb: After you write documentation, have someone who’s intelligent yet unfamiliar with the problem domain read it to make sure what you’ve authored is understandable for folks who aren’t you. Do you really think that this piece wasn’t reviewed by a professional editor before I published it, just to make sure that it’s coherent? (Hi, Alysha!)
4. AWS sure talks about regions a lot for not working well across them
AWS loves talking about regions. But it became incredibly clear that despite having 20 or so usable regions for this project, nothing AWS builds ever really envisions a customer using even a majority of them. There’s no support and basically no community documentation for deploying an application to every AWS region, because it’s apparently something that nobody does without a whole mess of either homegrown or third-party tooling to assist them.
I felt like I was breaking an awful lot of new ground here — and that’s usually a red flag that you’re doing something wrong. In this case, I was fully aware of my trailblazing across regions, but it was still unsettling.
5. Route 53 latency records hide in plain sight
The way my ridiculous application does load balancing is ludicrously simple. It uses an Amazon Route 53 latency-based routing policy to return the region that’s the closest to the user requesting it. A thorough search throughout the CDK and CloudFormation documentation revealed so few references to latency records that I started to wonder if I’d imagined the entire thing.
The joke’s on me: if you specify everything a latency record needs (a region and a set identifier) but don’t explicitly choose one of the other routing policies, latency-based is the default. I’m not sure I could’ve made this less obvious if you’d paid me to obfuscate it. Yowza. I will call out something rather well-designed about Route 53 latency records: each stack controls its own record, instead of each regional stack sharing a single omnibus record and somehow juggling state for failed stack deploys and whatnot. That was remarkably forward-looking and a pleasant surprise to encounter!
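Reconstructed in raw CloudFormation terms, the trick looks roughly like this (the zone IDs and names below are placeholders, not real values): give the record a `Region` and a `SetIdentifier`, omit every other routing-policy property, and latency-based routing is what falls out.

```yaml
# Hypothetical per-region template fragment. The record becomes a latency
# record because Region/SetIdentifier are present and no weighted,
# geolocation, or failover properties are specified.
ApiAliasRecord:
  Type: AWS::Route53::RecordSet
  Properties:
    HostedZoneId: ZHOSTEDZONEID          # placeholder hosted zone
    Name: lasttweetinaws.com.
    Type: A
    SetIdentifier: us-west-2             # unique per regional stack
    Region: us-west-2                    # this alone implies latency routing
    AliasTarget:
      DNSName: d-example.execute-api.us-west-2.amazonaws.com.
      HostedZoneId: ZAPIGWZONEID         # regional API Gateway zone ID, placeholder
```

Since each regional stack owns its own `SetIdentifier`, the stacks never fight over a shared record, which is what makes the per-stack design praised above work.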
6. GitHub Actions are awesome
AWS has an entire CDK pipeline model available. I looked into this, had a hearty laugh, and of course went with GitHub Actions for my auto-deployments instead.
CDK Pipelines was too far up the stack of Developer Experience for me to have any confidence whatsoever that AWS’s offerings weren’t going to cut me to ribbons with edge cases. A quick straw poll in the cdk.dev Slack reaffirmed this position: The clear consensus was “use GitHub Actions or prepare for pain.”
GitHub Actions’ matrix jobs made this an absolute pleasure to build out. I provided a list of AWS regions in the GitHub workflow YAML file, referenced it in the build job, and it Just Worked(tm). Until it crosses over 3,000 minutes a month of build time, it’s also completely free — and the code, of course, lives in GitHub already.
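A trimmed-down sketch of what that workflow looks like, with the region list abbreviated and the action versions being whatever is current for you:

```yaml
# Hypothetical, abbreviated deploy workflow; one job fans out per region.
jobs:
  deploy:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        region: [us-east-1, us-west-2, eu-west-1]   # ...the full commercial list
    steps:
      - uses: actions/checkout@v3
      - uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ matrix.region }}
      - run: npm ci
      - run: npx cdk deploy --require-approval never
```

Each entry in `matrix.region` spawns its own parallel job, which is the whole "Just Worked(tm)" part.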
One problem I noticed is that once I use up those 3,000 monthly free minutes within GitHub Actions, every deploy costs another 36¢. That’s not a lot of money, to be sure. But it means that sloppy one-line typo commits (because I have no idea how testing works) start to add up.
I attempted to run a self-hosted runner on my Oracle Free Tier VM, but it turns out that 21 concurrent Node processes beat the snot out of most VMs. I then shoved a Docker image into a Lambda function to act as the runner, and that mostly worked. But after that was done, I discovered that CloudSnorkel’s implementation via the CDK is way more elegant than my monstrosity and even supports Fargate — so if you’re using this post as some sort of horrible how-to guide, go use CloudSnorkel instead.
7. The SEO rot is real
This one isn’t entirely on AWS; I need to smack Google around a bit as well.
Whenever I searched for a particular problem, the CDK documentation was one of the first results. This is good.
Unfortunately, I was greeted at the top of every page with a warning that I was looking at the v1 documentation, which went out of support as of June 1, 2022. This is bad.
The documentation is close but not identical, and it becomes low-key frustrating to have to consistently click an extra link to get to the Actual Answers instead of the Plausibly Correct But Subtly Wrong Options.
8. Pin your versions or abandon hope
Speaking of version issues, if you have a CDK construct or library that’s a different version than the CDK itself, all hell will break loose. I’m not talking about v1 vs. v2, but rather 2.24.0 vs. 2.25.0.
Remove the “^” in your package.json file if you value your sanity, because the version bumps come fairly frequently. I dread having to maintain this thing after not touching it for 18 months or so, but that’s Future Corey’s problem.
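Concretely, that means exact versions in package.json, with no caret or tilde prefix so npm can't "helpfully" drift your CDK library ahead of your CLI (the version numbers below are examples, not a recommendation):

```json
{
  "dependencies": {
    "aws-cdk-lib": "2.25.0",
    "constructs": "10.0.0"
  }
}
```

With `"^2.25.0"`, npm is free to install anything below 3.0.0; with the bare `"2.25.0"`, you get exactly that version until you deliberately bump it.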
9. CDK best practices aren’t AWS-driven
AWS’s own CDK best practices guide starts by talking about how to set up an organizationwide Cloud Center of Excellence. That is complete poppycock.
People look for best practices to answer one key question: “What should I avoid doing now that’s an easy decision to make but will haunt me if I don’t get it right the first time?” People reaching for a tool like the CDK for the first time aren’t thinking about sweeping transformational change. They’re trying to figure out what their project structure should look like, what idioms exist that they can take advantage of, and where to go to find help.
The best practices page avoids giving most of those answers, opting instead to share advice like “infrastructure and application code live in the same package,” in an attempt to undo the entire last decade of DevOps evolution and once again breach the divide between what’s inside of the container and how you manage it externally. No! Stop it! Stay in your overly complicated lane with bad naming conventions!
No AWS pain, no shitposting gain
In fairness, I think the CDK is a great interface for a certain subset of developers, and I believe that type of developer is very much the future of cloud. Gold star on that one!
But on balance, it’s now way easier for me to quickly shitpost, regardless of where on the planet I am. And isn’t that what really matters?