DynamoDB vs. MongoDB: A Comparison and How to Choose

DynamoDB and MongoDB are both wildly successful modern replacements for traditional database systems, such as MySQL and PostgreSQL (and maybe even Route53!).

The primary use case for both is storing and querying data in “documents”: not Word documents that real people use, but lumps of JSON passed back and forth by computers serving sites like twitterforpets.com.

These systems tend to outperform relational databases when solving scaling problems. Most technology comparison guides compare a bunch of launch features and don’t look at how either database is used in the real world. This article compares the two databases in areas that matter, and chooses an overall winner because that makes these comparisons exciting!

Actually, don’t compare them

The default answer when considering which of DynamoDB or MongoDB to use is to use neither.

Building a service using a relational database is a better-understood pattern: it’s a lot easier to hire relational database expertise and to Google for solutions when you get stuck. If you do need to choose between DynamoDB and MongoDB because of concerns around scale, the next best answer is this: Use the one you’re already successfully using.

Introducing new technologies can look great on resumes. But making your production environment more complex because of one project is usually a bad idea.

With that out of the way, let’s start the competition.

Ease of use

MongoDB is more flexible than DynamoDB. You can use larger documents. There are more data types available, and it’s good at working with geospatial data. You can query any attribute in a document. You can run MongoDB anywhere: your laptop, continuous integration environment, and even your overly complex Kubernetes cluster that is destroying the productivity of your engineering organization.

MongoDB has a well-respected hosted offering, Atlas, that will soon become the majority of the company’s revenue.

MongoDB doesn’t strictly enforce query rate or complexity limits. If you send a massive volume of queries at it, it will try to do its best to answer the requests, slowly strangling your application and annoying your users. Do the same with DynamoDB, and it will just quickly lock your application out through rate limits and/or massively increase your bill, depending on the provisioning model you chose for your table.

This means MongoDB is better suited for bursty workloads and applications without strict latency requirements. For large workloads needing to scale beyond a single server’s capacity, sharding can be used, but this comes with the baggage of having to select an appropriate shard key.
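To see why shard key selection is baggage, here is a toy sketch. It is not MongoDB’s actual chunk balancing: the four-shard cluster, the fixed key ranges, and both helper functions are assumptions made up for illustration. It compares where the most recent writes land when a monotonically increasing key is split by range versus hashed first.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4  # hypothetical cluster size


def ranged_shard(key: int, max_key: int) -> int:
    """Toy range sharding: split the key space into equal contiguous ranges."""
    return min(key * NUM_SHARDS // max_key, NUM_SHARDS - 1)


def hashed_shard(key: int) -> int:
    """Toy hashed sharding: hash the key first so neighbouring keys scatter."""
    digest = hashlib.sha256(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


# The latest 1,000 inserts of an ever-increasing key (think timestamp or counter).
max_key = 100_000
recent_keys = range(99_000, 100_000)

ranged = Counter(ranged_shard(k, max_key) for k in recent_keys)
hashed = Counter(hashed_shard(k) for k in recent_keys)

print("ranged:", dict(ranged))                  # every recent write hits the last shard
print("hashed:", dict(sorted(hashed.items())))  # writes spread over all shards
```

The ranged version turns one shard into a hot spot for new writes, while the hashed version spreads the load at the cost of making range queries on that key touch every shard.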

DynamoDB’s performance is impressively consistent. Its data access APIs are relatively simple to understand. However, data modeling and design can be deceptively difficult, especially for those who are used to relational databases. Global and local secondary indexes are awkward to work with and need to be fully understood upfront.
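A rough illustration of why the modeling has to happen upfront: the sketch below is an in-memory stand-in for a table keyed by partition key and sort key, not the DynamoDB API. The key formats (`CUSTOMER#42`, `ORDER#...`) and the helper names are hypothetical.

```python
from collections import defaultdict

# Toy in-memory table: partition key -> sort key -> item.
table = defaultdict(dict)


def put_item(pk: str, sk: str, **attrs) -> None:
    table[pk][sk] = {"pk": pk, "sk": sk, **attrs}


def query(pk: str, sk_prefix: str = "") -> list:
    """Cheap path: one partition, sort keys filtered by prefix."""
    items = table.get(pk, {})
    return [items[sk] for sk in sorted(items) if sk.startswith(sk_prefix)]


def scan(**filters) -> list:
    """Expensive path: visits every item because no key covers the filter."""
    return [
        item
        for items in table.values()
        for item in items.values()
        if all(item.get(k) == v for k, v in filters.items())
    ]


put_item("CUSTOMER#42", "PROFILE", name="Rex")
put_item("CUSTOMER#42", "ORDER#2021-01-05", status="shipped")
put_item("CUSTOMER#42", "ORDER#2021-02-11", status="pending")
put_item("CUSTOMER#7", "ORDER#2021-02-12", status="pending")

orders_for_42 = query("CUSTOMER#42", "ORDER#")  # served by the key design
pending_everywhere = scan(status="pending")     # needs an index planned upfront
```

“All orders for a customer” falls out of the key design for free. “All pending orders” does not, and in DynamoDB terms that second question is the one that sends you off to design a secondary index.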

Winner: MongoDB’s early go-to-market strategy was all about ease of adoption by developers, and it shows here.

Operations

DynamoDB is a truly serverless, maintenance-free service. As detailed at re:Invent 2018, it’s a huge distributed system that automatically provisions, scales, and heals itself. DynamoDB backups are simple. Point-in-time restores and timed snapshots are trivial to configure. Policies can be configured alongside other datastores with AWS Backup. Restores used to be frustratingly slow, but they’re a lot faster now.

MongoDB can be deployed any way you like. This can often mean “badly” outside of the Atlas service or other managed options, adding a general element of danger. Replica set failovers due to upgrades, hardware failure, etc. can be noticeable from your application, even when using Atlas and/or Sharded MongoDB. Backup and restore services are provided through Atlas or MongoDB Ops Manager for your on-prem installations.

Winner: DynamoDB by a mile.

Ecosystem

These days, MongoDB has a lot more going on than a document database. You can query data stored directly in S3 using MongoDB Atlas Data Lake, similar to Amazon Athena but with the drawback of your data being transferred out of your AWS account for processing. MongoDB Atlas Search provides full-text search similar to Elasticsearch. Realm offers APIs and SDKs for developers to build with, such as JavaScript functions triggered by document update/insertion, the processing of incoming webhooks, and a trendy GraphQL API. It is clear that MongoDB’s future is building more hosted services and developer solutions that move up the stack.

DynamoDB can stream changes into DynamoDB Streams and Kinesis Data Streams. This functionality is the gateway to implementing the Change Data Capture pattern, allowing for code to be executed asynchronously against DynamoDB updates/inserts—kind of like stored procedures. Data in DynamoDB can also be synced in real time into services like Amazon Elasticsearch and Amazon Redshift using Kinesis Data Firehose, another fully managed solution.
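As a sketch of what that asynchronous code can look like, here is a minimal Lambda-style handler for a batch of DynamoDB Streams records. The `pet_id` key, the `name` attribute, and the function name are hypothetical. It also assumes the stream is configured to include new images; otherwise `NewImage` will be missing.

```python
def handle_stream_event(event: dict, context: object = None) -> list:
    """Summarise each change record in a DynamoDB Streams batch."""
    summaries = []
    for record in event.get("Records", []):
        action = record["eventName"]  # "INSERT", "MODIFY" or "REMOVE"
        change = record["dynamodb"]
        # Values arrive in DynamoDB's typed JSON, e.g. {"S": "Rex"} for a string.
        pet_id = change["Keys"]["pet_id"]["S"]  # hypothetical key name
        new_image = change.get("NewImage")      # absent for REMOVE events
        if action == "REMOVE":
            summaries.append(f"{pet_id}: deleted")
        else:
            name = new_image["name"]["S"] if new_image else "?"
            summaries.append(f"{pet_id}: {action.lower()} name={name}")
    return summaries


sample_event = {
    "Records": [
        {
            "eventName": "INSERT",
            "dynamodb": {
                "Keys": {"pet_id": {"S": "p1"}},
                "NewImage": {"pet_id": {"S": "p1"}, "name": {"S": "Rex"}},
            },
        },
        {"eventName": "REMOVE", "dynamodb": {"Keys": {"pet_id": {"S": "p1"}}}},
    ]
}

print(handle_stream_event(sample_event))
```

A real handler would do something more useful than build strings, such as updating a search index or sending a notification, but the shape of the loop stays the same.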

Winner: DynamoDB for its integration into other AWS services. MongoDB’s Realm provides similar functionality but is less powerful.

Security

DynamoDB is a battle-hardened armoured tank of a service. All data is encrypted at rest and in transit, which means you can answer yes when responding to annoying questions in security surveys written in the 2000s. DynamoDB integrates with AWS security services, allowing you to bring your own encryption keys using KMS while keeping your traffic private using VPC Endpoints.

MongoDB still runs with defaults that permit direct access to data without any authentication. These permissive defaults have led to real loss of data. Much functionality is still missing from the community edition of MongoDB, such as LDAP and Kerberos authentication and System Event auditing.

Winner: DynamoDB is secure by default. MongoDB can be secured, and managed MongoDB services are a lot better in this respect, but the default configuration is still questionable.

Cost

Any serious DynamoDB shop will end up spending a fair amount of time on DynamoDB capacity management. Table capacity is measured using read and write capacity units (RCUs and WCUs). The amount of data a request reads or writes, along with the read consistency it asks for, dictates how many capacity units it uses. Autoscaling provisioned capacity works well for large tables with relatively predictable usage patterns. Significant discounts can be achieved by reserving RCUs and WCUs, but unlike other services, AWS provides no tooling or recommendations to help you manage your purchases. For small tables or those with very bursty and unpredictable traffic patterns, it is frequently cheaper to use On Demand, even though it cannot be reserved and costs a lot more per request than provisioned capacity.
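For a feel of the arithmetic, here is a back-of-the-envelope sketch for single-item reads and writes. It uses the published unit sizes: one RCU covers a strongly consistent read per second of up to 4 KB (eventually consistent reads cost half), and one WCU covers a write per second of up to 1 KB. The function names and example workload are made up, and it ignores transactions, queries that sum item sizes, and index writes.

```python
import math

READ_UNIT_BYTES = 4 * 1024   # one RCU: a strongly consistent read/second of up to 4 KB
WRITE_UNIT_BYTES = 1 * 1024  # one WCU: a write/second of up to 1 KB


def rcus_needed(item_bytes: int, reads_per_second: int,
                strongly_consistent: bool = True) -> float:
    """Provisioned read capacity for single-item reads of a given size."""
    units_per_read = max(1, math.ceil(item_bytes / READ_UNIT_BYTES))
    if not strongly_consistent:
        units_per_read /= 2  # eventually consistent reads are billed at half rate
    return units_per_read * reads_per_second


def wcus_needed(item_bytes: int, writes_per_second: int) -> int:
    """Provisioned write capacity for single-item writes of a given size."""
    units_per_write = max(1, math.ceil(item_bytes / WRITE_UNIT_BYTES))
    return units_per_write * writes_per_second


item_size = 6 * 1024  # a 6 KB item
print(rcus_needed(item_size, 100))                             # 200
print(rcus_needed(item_size, 100, strongly_consistent=False))  # 100.0
print(wcus_needed(item_size, 50))                              # 300
```

Note the rounding: a 6 KB item costs as much to read as an 8 KB one, which is one reason item size deserves attention during data modeling.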

Outside of *CUs, DynamoDB charges in many dimensions, such as the sum of data stored, replicated data, streamed data, backups, and restores.

If you have some compute capacity sitting around, MongoDB can be inexpensive to use and get started with. Any production-worthy or hosted environment will incur dedicated hosting costs. Most workloads will work out cheaper on MongoDB clusters than on DynamoDB, but those clusters can’t scale to massive levels with a single API call. Atlas does have autoscaling, but it’s a bit awkward to use.

Winner: MongoDB for true adherence to frugality.

Lock-in

MongoDB is independent of any particular cloud provider and has Atlas available at GCP, Azure, and AWS. There are copy-cat protocol-mostly-compatible services such as Amazon DocumentDB and Azure Cosmos DB that provide a MongoDB-esque service if you want to use the protocol but somehow not MongoDB’s implementation.

A lift-and-shift approach between cloud providers, on-prem etc. is way easier for applications built on MongoDB than DynamoDB. However, chances are you’ll never have to move your application, and the value of such a move would be low due to the complexity of moving from one provider to another.

Winner: MongoDB is cloud-agnostic, but that isn’t as big a deal as you might think it is.

Will it eat my data?

DynamoDB’s consistency and durability guarantees are robust and simple to understand. MongoDB, on the other hand, well…

You can choose to trade certainty for speed: depending on the settings you pick, you may not be sure that the data you just wrote got stored or that it was successfully replicated to another host in a replica set.

In many situations—such as in development or when the data is trivial to recreate—these settings are fine. But when you need to seriously store data, these settings are not fine.
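The settings in question are MongoDB’s write concern options. As a minimal sketch, the helper below builds connection strings using the standard `w`, `journal`, and `wtimeoutMS` connection string options. The helper itself, the hosts, and the database name are hypothetical.

```python
from urllib.parse import urlencode


def mongo_uri(hosts: list, database: str, **options: str) -> str:
    """Build a MongoDB connection string with the given query-string options."""
    query = f"?{urlencode(options)}" if options else ""
    return f"mongodb://{','.join(hosts)}/{database}{query}"


hosts = ["db1.example.com:27017", "db2.example.com:27017"]  # hypothetical hosts

# Fast but risky: do not wait for any acknowledgement of the write.
fire_and_forget = mongo_uri(hosts, "pets", w="0")

# Slower but safer: wait for a majority of replica set members and for the
# write to reach the on-disk journal, giving up after five seconds.
durable = mongo_uri(hosts, "pets", w="majority", journal="true", wtimeoutMS="5000")

print(fire_and_forget)
print(durable)
```

The first string is the kind of thing that is fine on a laptop. The second is closer to what you want when losing a write would ruin someone’s day.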

In fairness to MongoDB, they have made many improvements here and did engage the services of one of the world’s best database botherers to help improve things. But the outcome hasn’t always been great to date.

Winner: DynamoDB will never eat your data unless you tell it to.

Overall

Both DynamoDB and MongoDB are impressive systems that have made a serious dent in the data storage landscape over the past decade. It is no secret that DynamoDB is used as a foundational building block for both Amazon retail’s business and other AWS services, and it shows. DynamoDB is reliable and scales to mind-bogglingly high levels of throughput and database size.

MongoDB is developer-friendly, and the company deserves credit for building a world-class cloud offering.

Often grouped together, they are each most effective when solving different problems: DynamoDB for building applications on AWS that require massive scale and MongoDB for building applications rapidly, anywhere.

Winner: DynamoDB, as it does exactly what it says on the tin.