What kind of DBaaS do you really need?

The Database-as-a-Service (DBaaS) model offers a way to run your databases in the public cloud. With DBaaS, you can automate various management tasks, get products out to market faster, and, thanks to the ability to scale up or down with demand, lower your capital expenditures.

The three major public cloud providers (Amazon, Microsoft, and Google) boast a variety of DBaaS offerings. Using a vehicle analogy to categorize these offerings, we can say the solutions range from Corollas (your classic relational database management system in the cloud) to container ships (big-data systems that carry everything; think “Hadoop-as-a-service”). In between are the Formula One offerings, purpose-built to rapidly ingest and query data, and the 18-wheelers, better classed as data warehouses because they’re designed to store and query massive amounts of data.

But how do you know what kind of DBaaS you’ll need? In this blog, we’ll help you narrow down your options by taking a quick look at the DBaaS offerings from Amazon, Microsoft, and Google.

If you’re new to DBaaS, Amazon’s Relational Database Service (RDS) is a great way to get started because using it is remarkably similar to the on-premises database experience, which makes data migration easier. All you need to do is provision an RDS instance, which works a lot like provisioning one of Amazon’s virtual machines. For applications that have known query patterns, ingest tons of data, and don’t require complex transactions, DynamoDB, Amazon’s Formula One offering, is a good bet. It’s essentially a NoSQL key-value and document store with a flexible schema.
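To make the idea of a key-value table with a flexible schema a little more concrete, here is a minimal in-memory sketch in Python. It is only an illustration of the data model, not DynamoDB's API: the `KeyValueTable` class, its method names, and the sample attributes are all hypothetical.

```python
from typing import Any, Dict, Optional


class KeyValueTable:
    """Toy in-memory table: one required key attribute, everything else optional.

    Hypothetical illustration of a flexible-schema key-value store;
    this is not how any cloud provider's client library looks.
    """

    def __init__(self, key_name: str) -> None:
        self.key_name = key_name
        self._items: Dict[Any, Dict[str, Any]] = {}

    def put_item(self, item: Dict[str, Any]) -> None:
        # The only schema rule: every item must carry the key attribute.
        if self.key_name not in item:
            raise KeyError(f"item is missing key attribute {self.key_name!r}")
        self._items[item[self.key_name]] = dict(item)

    def get_item(self, key: Any) -> Optional[Dict[str, Any]]:
        # Known query pattern: look an item up directly by its key.
        return self._items.get(key)


# Two items in the same table with different sets of attributes.
readings = KeyValueTable("device_id")
readings.put_item({"device_id": "sensor-1", "temperature_c": 21.5})
readings.put_item({"device_id": "sensor-2", "humidity": 40, "location": "lab"})
```

The point of the sketch is that items sharing a table need not share columns, and that reads are cheap when you already know the key you are looking for.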

Amazon’s other DBaaS options are the best fit if you’re already heavily invested in Amazon Web Services (AWS). Redshift, a cloud-based data warehouse, lets you manage specific node configurations, giving you the flexibility to choose how many cores and how much memory each node has. Elastic MapReduce, Amazon’s managed Hadoop framework, lends itself to fast, cost-effective distribution and processing of vast amounts of data. This container ship also supports other popular frameworks such as Spark and Presto, and can interact with data stored in other AWS services.

Flexibility and scalability make Microsoft Azure SQL Database an excellent DBaaS platform. Need more capacity? Simply add more databases in the cloud. The service is also easy to manage, with Microsoft providing code for pooling resources and performing elastic queries or job executions. Microsoft also offers Azure DocumentDB, a NoSQL document database ideal for JSON-based storage. It supports SQL-style queries, which could eliminate the need to learn a new query language. Also in the catalog is Azure SQL Data Warehouse, a fully relational platform that features independent compute and storage scaling, which sets it apart from Redshift. You can also pause compute completely—for example, over the weekend when there’s not as much load on your data warehouse—for maximum savings.
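As a rough illustration of why pausing compute matters, the sketch below works through the arithmetic in Python. It assumes compute is billed per hour only while it is running and that storage is billed separately; the function names and the hourly rate are hypothetical and are not actual Azure pricing.

```python
HOURS_PER_WEEK = 7 * 24  # 168


def weekly_compute_cost(hourly_rate: float, paused_hours: float = 0.0) -> float:
    """Weekly compute cost, assuming no compute charge while paused."""
    if not 0 <= paused_hours <= HOURS_PER_WEEK:
        raise ValueError("paused_hours must be between 0 and 168")
    return hourly_rate * (HOURS_PER_WEEK - paused_hours)


def savings_fraction(paused_hours: float) -> float:
    """Share of the weekly compute bill avoided by pausing."""
    if not 0 <= paused_hours <= HOURS_PER_WEEK:
        raise ValueError("paused_hours must be between 0 and 168")
    return paused_hours / HOURS_PER_WEEK


# Hypothetical rate of 10 currency units per hour, paused for a 48-hour weekend.
always_on = weekly_compute_cost(10.0)
weekend_paused = weekly_compute_cost(10.0, paused_hours=48)
```

Under these assumptions, pausing for a 48-hour weekend out of the week's 168 hours trims roughly 29 percent off the weekly compute bill, while storage charges continue as before.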

At the container ship end, Microsoft comes in with two offerings. Azure HDInsight is an Apache Hadoop distribution able to process unstructured or semi-structured data, and supports C#, Java, and .NET. The other, Azure Data Lake, features separate storage and analytics. Storage is limitless and the analytics service runs large queries on demand.

Google Cloud SQL offers an interesting billing model, with pay-per-use pricing and automatic sustained-use discounts. Otherwise, Google’s Corolla is similar to Amazon’s: using it is as simple as selecting an instance and deploying it. The service also supports backup, replication, patch, and update automation, as well as automatic failover.

Google has two offerings in the Formula One category. Google Cloud Datastore is the company’s take on a NoSQL cloud service. It features out-of-the-box transaction support and encryption at rest, and is tightly integrated with other Google Cloud Platform services. Google Cloud Bigtable, its NoSQL big-data database service, is ideal for operational and analytical applications such as Internet of Things use cases and user analytics.

BigQuery, a serverless analytics data warehouse, represents Google’s 18-wheeler. Here, Google favors a standards-compliant SQL dialect, enabling advanced query planning and optimization. For batch processing, querying, streaming, and machine learning using open-source data tools, Google offers Cloud Dataproc, which supports Google Cloud Storage integration as an alternative to the Hadoop Distributed File System.

To know for sure which DBaaS best suits your needs, it’s essential you understand all of the options out there. If you’d like more in-depth information on the DBaaS offerings from Amazon, Microsoft, and Google, check out Pythian’s Choosing a Database-as-a-Service white paper. It includes technical specifics, pricing model details, and tips for planning and deploying each DBaaS offering.

If you need more guidance, we’re always ready to help answer any questions you may have on choosing the right DBaaS.


About the Author

Lynda Partner is a self-professed data addict who understands how transformational data can be for organizations. In her role as EVP of Data and Analytics, Lynda focuses on Pythian’s services that help customers harness the power of data and analytics and holistically manage their data estate.
