
Why Your AWS Bill is Likely a Product of 2-Pizza Teams

09.15.2021

How AWS makes its billions — $13 billion in operating income in 2020 alone — to support the bulk of Amazon’s net income is a question that stumps the public. To shine some light on the topic, CNBC’s Jordan Novet published an article on AWS’ estimated margins on a per-service basis.

Amazon obviously isn’t publishing its margins and billing structure for all us plebeians to pick apart, and for some unfathomable reason they haven’t opened their books to me for casual inspection, but I’ve got some good guesses about how it works. I chatted with Novet at length as he researched this article (and he did cite me!). There’s still a lot of nuance to the question of how AWS bills for its services and realizes profits that doesn’t neatly fit into an article intended for a mass-market audience. Luckily, that doesn’t describe you, dear reader.

Which AWS line items are the money-makers

First, my assertion that EC2 represents a majority share of the spend is relatively uncontroversial. If you run ECS or EKS without Fargate, you’re paying either nothing or a small management fee, respectively, to run what the bill lists as a bunch of EC2 instances. A number of services take this model. One outlier is Amazon EMR, which charges a per-instance management fee on top of the — you guessed it! — EC2 fee.

After that, big drivers are S3, RDS, Data Transfer, and Elastic Block Store. Incidentally, that last one shows up on the AWS bill under the disturbingly broad catch-all category of “EC2-Other.”

There’s one other point raised in the article that I think is worth addressing: that AWS Lambda can wind up charging almost double the price of EC2 instances. It’s one of those claims that is mathematically true on paper but makes little if any sense in practice. Yes, you can absolutely beat the economics of Lambda if you have perfect foreknowledge of your EC2 usage so you can buy three-year Savings Plans or Reserved Instances. The trouble is, predicting usage patterns like that is hard, and few workloads are consistent minute by minute, hour by hour, day by day. Lambda gets you an awful lot of it-looks-like-but-don’t-call-it-that autoscaling as well. Lambda@Edge’s economics are significantly worse than all of this and should be excluded from the conversation completely, as well as your production environment and AWS’ entire cloud offering.
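To make the "true on paper" point concrete, here is a minimal sketch of the break-even arithmetic. It assumes Lambda costs a fixed multiple of EC2 for the same busy hour (2.0 below, standing in for the "almost double" figure) and that an EC2 instance bills for every hour whether or not it does useful work. The function names and the normalized price of 1.0 are illustrative, not real AWS rates.

```python
def ec2_cost_per_useful_hour(ec2_hourly_price: float, utilization: float) -> float:
    """Effective price of one hour of real work on an instance
    that is only busy for `utilization` of the time it is billed."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return ec2_hourly_price / utilization


def break_even_utilization(lambda_premium: float) -> float:
    """Utilization below which Lambda is the cheaper option, given how many
    times more Lambda costs than EC2 for the same busy hour."""
    if lambda_premium <= 0:
        raise ValueError("lambda_premium must be positive")
    return 1 / lambda_premium


# Hypothetical, normalized numbers: EC2 costs 1.0 per billed hour,
# Lambda costs 2.0 per hour of actual execution.
EC2_PRICE = 1.0
LAMBDA_PREMIUM = 2.0

if __name__ == "__main__":
    lambda_price = EC2_PRICE * LAMBDA_PREMIUM
    for utilization in (0.25, 0.5, 1.0):
        ec2_effective = ec2_cost_per_useful_hour(EC2_PRICE, utilization)
        cheaper = "Lambda" if lambda_price < ec2_effective else "EC2 (or a tie)"
        print(f"utilization={utilization:.0%}: EC2 effective={ec2_effective:.2f}, "
              f"Lambda={lambda_price:.2f}, cheaper: {cheaper}")
```

Under these assumptions, an instance has to stay busy more than half the time before the "double the price" comparison starts working in EC2's favor, and that is before accounting for the commitment risk of a three-year purchase.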

How AWS approaches product margins and profits

AWS is clearly a company built out of small teams; Amazon talks about this constantly in the context of “two-pizza teams, no larger than you could feed with two large pizzas.” We’ll set aside entirely how laughable a concept it is that a company with 17 ways to run containers could ever agree on pizza toppings at all. The two-pizza philosophy clearly leads to what presents as a per-service, if not a per-feature, margin target. I find what this suggests to be relatively abhorrent: AWS is giving each service team margin targets rather than viewing profitability as a cohesive cross-org strategic metric.

We know that the AWS billing system is an exabyte-scale problem from conference talks, public statements, and a variety of job postings talking about the team’s work. Despite a fairly sizable infrastructure required to generate those bills, AWS doesn’t present customers with a separate charge for the billing itself (“here is your bill, please pay it, and then pay me for billing you”). The cost of that system is clearly subsidized by the other services that customers are directly charged for. This is as it should be.

But by not applying this model to other areas of the business, AWS is getting it wrong while actively incentivizing certain customer profiles to avoid AWS entirely. Nowhere is this more evident than data transfer.

Here’s an example: If cross-AZ data transfer were free instead of costing $20,000 per petabyte transferred, what would happen? Customers would have more architectures that span multiple availability zones, meaning that there would be more storage and compute provisioned. Traffic over the (presumably) dedicated fiber between AZs would increase — but seeing as AWS almost certainly owns everything between the two AZs, it would manifest as a potential one-time upgrade investment. Then, of course, AWS would take a hit to its next quarterly earnings as that change manifested across its customers. Yes, I do believe that charge to be significant enough to materially change quarterly earnings.

That said, think of the goodwill earned by not surprising customers who discover that two-cent-per-GB charge the hard way. It would start to feel a bit more like AWS were partnering with customers to build out their durable applications rather than preying upon them. If Amazon truly is a company with a longer-term view than quarterly numbers, this strikes me as a relatively easy win.
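The two figures quoted for cross-AZ transfer, two cents per GB and $20,000 per petabyte, are the same number at different scales. A quick sketch of the arithmetic, assuming decimal units (1 PB = 1,000,000 GB) and a flat rate with no tiering; the 50 TB workload at the end is a made-up example:

```python
from decimal import Decimal

GB_PER_PB = 1_000_000                     # decimal units (assumption)
CROSS_AZ_RATE_PER_GB = Decimal("0.02")    # two cents per GB, as quoted in the text


def cross_az_cost(gigabytes: int, rate_per_gb: Decimal = CROSS_AZ_RATE_PER_GB) -> Decimal:
    """Dollar cost of moving `gigabytes` between availability zones at a flat rate."""
    return gigabytes * rate_per_gb


if __name__ == "__main__":
    # One petabyte at two cents per GB.
    print(f"1 PB cross-AZ: ${cross_az_cost(GB_PER_PB):,.2f}")

    # Hypothetical workload: 50 TB per month of chatter between AZs.
    print(f"50 TB cross-AZ: ${cross_az_cost(50_000):,.2f} per month")
```

The per-GB number looks like a rounding error; the per-petabyte number does not, which is roughly how customers end up discovering it the hard way.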

Here’s another example, stepping away entirely from data transfer (and the dreaded Managed NAT Gateway, the usurious egress/free ingress charges, etc.). CloudWatch Logs charges 50¢ per GB ingested into the system (and then 3¢ per GB per month to store it). The customer takeaway here is to log less, lest you be charged for it. By default, CloudWatch log groups never expire, so this is an area of unbounded growth that’s usually small enough to go overlooked until, one day, a customer discovers they’ve been paying a “tax on going about their business” for the last decade and configures some rotation or expiry rules.
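The "unbounded growth" part is easier to see with numbers. The sketch below is a simplified model, not AWS' actual billing logic: it uses the 50¢ ingest and 3¢ storage rates from the text, assumes the storage rate applies per GB per month to whatever is held at the end of each month, ignores compression and free tiers, and treats the 100 GB per month volume as a made-up workload.

```python
from decimal import Decimal
from typing import Optional

INGEST_PER_GB = Decimal("0.50")         # ingestion rate quoted in the text
STORAGE_PER_GB_MONTH = Decimal("0.03")  # storage rate, assumed per GB per month


def cumulative_log_cost(
    gb_per_month: int,
    months: int,
    retention_months: Optional[int] = None,
) -> Decimal:
    """Total spend after `months` of steady logging.

    retention_months=None models the default of never expiring logs;
    an integer caps stored data at that many months of ingest.
    """
    total = Decimal("0")
    for month in range(1, months + 1):
        # Months of data still held at the end of this month.
        held = month if retention_months is None else min(month, retention_months)
        stored_gb = gb_per_month * held
        total += gb_per_month * INGEST_PER_GB + stored_gb * STORAGE_PER_GB_MONTH
    return total


if __name__ == "__main__":
    decade = 120  # months
    never_expire = cumulative_log_cost(100, decade)
    three_month = cumulative_log_cost(100, decade, retention_months=3)
    print(f"No expiry:         ${never_expire:,.2f}")
    print(f"3-month retention: ${three_month:,.2f}")
```

In this model the ingest charge is identical in both cases. The entire difference is storage for logs nobody has read in years, which grows quadratically with time when nothing ever expires.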

Lastly, observe the fact that SageMaker Savings Plans are completely distinct from the Compute Savings Plans for EC2, Lambda, and Fargate. This smacks of each business unit needing to protect, demonstrate, and defend its own margins, at the cost of doing what’s right for customers. Cloud cost and cloud architecture are the same thing. When AWS incentivizes behavior that puts team margins ahead of the business’ goals, customers are often pressured into sub-optimal architectures by the nature of their commitments. It’s needless; the money is going to go to AWS either way! What possible value to the customer is there in forcing it to flow to one service team instead of another?

Simple predictability is all we’re asking for

At times it’s easy to mistake my position on AWS bills as saying that they’re too high. Let me be very clear: I have no objection to paying an AWS bill that’s higher than the one I pay today, provided I’m able to easily predict and understand the drivers of that bill. Most customers would say the same, if given that choice.

The CNBC article left industry insiders with a big point for discussion: What should AWS’ approach to product margins be? For me, it’s not the AWS margins that I begrudge; it’s the complexity foisted upon its customers. That complexity is encouraged in part by what appears to be each service team needing to hit margin targets instead of delighting customers.

So much of what’s hidden in an AWS bill isn’t an expense that customers would refuse to pay if they were informed about it upfront and left to consider the trade-offs. Rather, it’s the result of such stupendous complexity that customers did not realize they were paying for it.

In the platonic ideal of AWS billing, The Duckbill Group should have a much smaller market than it does. We should be focusing on helping companies negotiate their contracts with AWS far more than on cost optimization projects. I don’t know how we get there from here, but I do know that AWS’ apparent current model of per-team margin targets leaves us all the poorer for it as customers.
