Monolithic Databases Fail in a Cloud Native World — Horizontally Scalable Distributed Cloud-Native Databases

Mehmet Ozkaya
8 min read · Apr 24, 2023

In today’s fast-paced world of technology, organizations are increasingly adopting cloud-native architectures to build and deploy modern applications.

[Figure: CNCF Cloud-Native Databases]

With the rise of microservices and Kubernetes, there is a growing need for databases that can scale horizontally and provide continuous availability. In this article, we’ll discuss why monolithic databases fall short in a cloud-native world and explore the features and capabilities of distributed cloud-native databases that can meet the demands of modern applications.

Monolithic Databases Fail in a Cloud Native World

Traditional monolithic databases were designed for a different era, where applications were less distributed, and scaling was limited to vertical scaling (adding more resources to a single machine). However, in a cloud-native world, applications are increasingly built using microservices and deployed on container orchestration platforms like Kubernetes. Monolithic databases struggle to provide the horizontal scalability, resiliency, and flexibility required by these modern applications.

Microservices and Kubernetes Deserve a Cloud Native Database

As organizations embrace microservices and Kubernetes for their applications, it’s essential to have a database that can match the flexibility and scalability of these modern architectures. A cloud-native database should be designed to work seamlessly with Kubernetes, support horizontal scalability, and provide powerful RDBMS and NoSQL capabilities. Accelerating app development and creating meaningful value for your customers depends on having a flexible, modern stack from infrastructure and the data layer to the application itself. To fully take advantage of cloud-native benefits, such as productivity, agility, performance, and simplicity, a modern database is crucial.

Developers strive to deliver impactful solutions and focus on adding value through coding, rather than grappling with technical debt or temporary workarounds.

  • Quickly start with a database that is compatible with Postgres and Cassandra
  • Set up your cloud-native database in a matter of hours, not weeks
  • Decrease architectural complexity by utilizing a unified, distributed database

Customers’ expectations have never been higher, but offering the right application experiences, such as quick performance and adaptability to meet their needs, can significantly improve customer loyalty and revenue growth.

  • Improve satisfaction with high performance, always-on availability, reliability, and low latency
  • Deliver consistent, low-latency performance to customers regardless of location
  • Operate across various geographic locations with minimal impact on performance

The Time is Now for Stateful Distributed SQL on Kubernetes for Cloud-Native Apps

With the rise of cloud-native applications and the increasing importance of distributed systems, now is the perfect time to adopt stateful distributed SQL databases for your Kubernetes deployments. These databases can help you build and scale your applications while maintaining the ACID guarantees and SQL support you need.

Thanks to their adaptable, active/active architecture, cloud-native databases are stateful applications that fit naturally on Kubernetes. They can be deployed on various platforms, such as Amazon EKS, Google Kubernetes Engine (GKE), Rancher, Red Hat OpenShift, and VMware Tanzu, to name just a few options.

This flexible, active/active architecture suits Kubernetes because it can handle the complex requirements of stateful applications. Kubernetes, which was initially designed for stateless applications, has evolved (for example, through StatefulSets and persistent volumes) to better handle stateful applications like databases. Running a cloud-native database on Kubernetes therefore gives businesses a modern, distributed SQL database that can provide high availability, fault tolerance, and global data distribution.

What Should a Cloud-Native Database Look Like?

The core characteristics of a horizontally scalable, distributed cloud-native database for cloud-native architectures are:

Geo-distributed, multi-cloud

A cloud-native database should be deployable in public clouds and natively inside Kubernetes. It should support deployments that span three or more fault domains, such as multi-zone, multi-region, and multi-cloud deployments.
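Why three fault domains rather than two? A rough sketch of the arithmetic, assuming the database replicates data with a majority-quorum protocol and places one replica per fault domain (both are assumptions made for illustration, and the function names are hypothetical):

```shell
# Majority-quorum arithmetic for a replication group of n replicas.
# Assumption: one replica per fault domain and a majority-based protocol.

quorum_size() {
  # Votes needed to commit a write: floor(n/2) + 1
  echo $(( $1 / 2 + 1 ))
}

tolerated_failures() {
  # Replicas that can be lost while a majority survives: floor((n-1)/2)
  echo $(( ($1 - 1) / 2 ))
}

for n in 1 2 3 5; do
  echo "replicas=$n quorum=$(quorum_size "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

With two replicas the group tolerates no more failures than with one, which is why three fault domains is the practical minimum under this assumption.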

Horizontal scalability

A cloud-native database should be a high-performance, distributed SQL database that aims to support all relational and NoSQL features. It should be best suited for cloud-native OLTP (i.e., real-time, business-critical) applications that need absolute data correctness and require at least one of the following: scalability, high tolerance to failures, or globally-distributed deployments.

Powerful RDBMS and NoSQL capabilities

A cloud-native database should combine the best features of traditional SQL databases and NoSQL databases. These databases should provide ACID (Atomicity, Consistency, Isolation, Durability) guarantees and support SQL queries like traditional RDBMS (Relational Database Management System), while also offering the scalability, performance, and flexibility of NoSQL databases.

Continuous availability

A cloud-native database should be extremely resilient to common outages with native failover and repair.

Distributed transactions

Strong consistency of writes should be achieved by using Raft consensus for replication and cluster-wide distributed ACID transactions using hybrid logical clocks. Snapshot, serializable, and read-committed isolation levels should be supported.
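To give a feel for what a hybrid logical clock does, here is a minimal sketch of the commonly described update rules. It is not taken from any particular database: the variable and function names are hypothetical, and physical time is passed in as a plain integer so the behavior is easy to follow.

```shell
# Hybrid logical clock (HLC) state: HLC_L tracks the largest physical time
# seen so far, HLC_C is a counter that orders events sharing the same HLC_L.
HLC_L=0
HLC_C=0

hlc_max() {
  if [ "$1" -ge "$2" ]; then echo "$1"; else echo "$2"; fi
}

hlc_tick() {
  # Local or send event. $1 = current physical time on this node.
  prev=$HLC_L
  HLC_L=$(hlc_max "$prev" "$1")
  if [ "$HLC_L" -eq "$prev" ]; then
    HLC_C=$(( HLC_C + 1 ))   # physical clock did not advance: bump the counter
  else
    HLC_C=0                  # physical clock moved forward: reset the counter
  fi
}

hlc_recv() {
  # Receive event. $1 = local physical time, $2/$3 = sender's (l, c) timestamp.
  prev=$HLC_L
  HLC_L=$(hlc_max "$(hlc_max "$prev" "$2")" "$1")
  if [ "$HLC_L" -eq "$prev" ] && [ "$HLC_L" -eq "$2" ]; then
    HLC_C=$(( $(hlc_max "$HLC_C" "$3") + 1 ))
  elif [ "$HLC_L" -eq "$prev" ]; then
    HLC_C=$(( HLC_C + 1 ))
  elif [ "$HLC_L" -eq "$2" ]; then
    HLC_C=$(( $3 + 1 ))
  else
    HLC_C=0
  fi
}

hlc_tick 100;        echo "after local event:   $HLC_L.$HLC_C"
hlc_tick 100;        echo "same physical time:  $HLC_L.$HLC_C"
hlc_recv 100 105 3;  echo "after message 105.3: $HLC_L.$HLC_C"
```

The point of the design is that a timestamp never moves backwards and always ends up later than any timestamp the node has received, even when the nodes' physical clocks disagree slightly.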

Multi API design

A cloud-native database should expose extensible APIs and support both relational and NoSQL distributed SQL APIs, such as:

  • a fully relational API that reuses the query layer of MySQL or PostgreSQL, and
  • a semi-relational, SQL-like API with document and indexing support and roots in the Apache Cassandra Query Language (CQL).

100% open source

The database should be fully open source so that it can benefit from community support and contributions.

Horizontally scalable distributed Cloud-native Databases

Horizontally scalable distributed cloud-native databases are databases designed to work seamlessly with cloud-native architectures and scale out by adding more nodes to the system, rather than scaling up by increasing resources in a single node. These databases are built to provide high availability, fault tolerance, and consistency across multiple nodes, making them an excellent fit for microservices and other cloud-native applications. Some examples of such databases include Vitess, TiDB, CockroachDB, and YugabyteDB:

Vitess

Vitess is an open-source database clustering system for horizontal scaling of MySQL. It combines important MySQL features with the scalability of a NoSQL database. Vitess is used by companies like Slack, Square, and GitHub to manage their database infrastructure at scale. It provides features like connection pooling, query routing, and automatic failover to ensure high availability and performance.

TiDB

TiDB is an open-source distributed NewSQL database that is MySQL-compatible and supports Hybrid Transactional/Analytical Processing (HTAP) workloads. TiDB provides horizontal scalability, strong consistency, and high availability. It is designed to work with cloud-native environments and can be easily deployed and managed using Kubernetes.

CockroachDB

CockroachDB is a source-available, distributed SQL database built on a transactional and strongly consistent key-value store. It is designed for horizontal scalability and provides features like geo-replication, automated data distribution, and automatic rebalancing of data across nodes. CockroachDB offers compatibility with the PostgreSQL wire protocol, which makes it easy to integrate with existing tools and applications.

YugabyteDB

YugabyteDB is an open-source, high-performance, distributed SQL database designed for global, internet-scale applications. It offers two APIs: the PostgreSQL-compatible Yugabyte SQL (YSQL) API and the Yugabyte Cloud Query Language (YCQL) API, which has its roots in the Apache Cassandra Query Language. YugabyteDB provides horizontal scalability, strong consistency, and resilience to failures, making it suitable for cloud-native deployments.

These horizontally scalable distributed cloud-native databases help address the challenges of scaling and managing data in modern, cloud-based applications. They are designed to work with containerized and orchestrated environments, such as Kubernetes deployments, and provide the necessary features for high availability, fault tolerance, and performance at scale.
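To make "scaling out by adding nodes" concrete, here is a deliberately naive sketch of hash-based shard routing. It is an illustration only and not how any of the databases above route rows; the function name and the sample keys are made up.

```shell
# Naive hash-based shard routing: hash the key, then take it modulo the
# number of shards. Assumption: illustration only. Production systems
# typically use range-based sharding or consistent hashing so that adding
# a node moves far fewer keys than plain modulo does.

shard_for() {
  # $1 = row key, $2 = number of shards; prints a shard index in [0, $2)
  hash=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
  echo $(( hash % $2 ))
}

for sku in SKU-1001 SKU-1002 SKU-1003 SKU-1004; do
  echo "$sku -> shard $(shard_for "$sku" 2) of 2, shard $(shard_for "$sku" 3) of 3"
done
```

Running it with two and then three shards shows the idea and the catch at once: the same key always lands on the same shard for a given shard count, but changing the count can reassign it, which is the data movement a distributed database has to manage when it rebalances.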

Hands-on: Deploying Vitess on a Kubernetes Cluster with Minikube

This section walks through deploying Vitess, a cloud-native, horizontally scalable database solution, on a Kubernetes cluster using Minikube. Vitess is designed to run on Kubernetes and provides features such as horizontal scalability, sharding, and compatibility with the MySQL protocol, so existing MySQL clients can connect to it.

Prerequisites

  1. Install Minikube: Follow the official Minikube installation guide for your operating system.
  2. Install kubectl: Ensure you have kubectl installed and configured to work with your Kubernetes cluster.
  3. Install Helm: Helm is a package manager for Kubernetes. Follow the official Helm installation guide to install Helm on your system.

Start a Minikube cluster with the following command:

minikube start

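The Vitess Operator provides an easy way to deploy and manage Vitess clusters on Kubernetes. The steps below assume the operator is published as a Helm chart at the repository URL shown; if that repository cannot be added, the official Vitess getting-started guide instead applies the operator.yaml manifest from the examples/operator directory of the vitessio/vitess repository. First, add the Vitess Operator Helm repository: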

helm repo add vitess-operator https://vitess-operator.github.io/vitess-helm/
helm repo update

Install the Vitess Operator in your Kubernetes cluster:

kubectl create namespace vitess
helm install vitess vitess-operator/vitess --namespace vitess

Deploy a Vitess Cluster

Create a vitess-cluster.yaml file with the following contents:

apiVersion: planetscale.com/v2
kind: VitessCluster
metadata:
  name: example
spec:
  images:
    vtctld: vitess/lite:v11.0.0
    vtgate: vitess/lite:v11.0.0
    vttablet: vitess/lite:v11.0.0
    mysqld:
      mysql80Compatible: vitess/lite:v11.0.0
  cells:
  - name: zone1
  globalLockserver:
    etcd:
      resources:
        limits:
          cpu: "1"
          memory: 1Gi
        requests:
          cpu: 100m
          memory: 100Mi
  keyspaces:
  - name: commerce
    turndownPolicy: Immediate
    replication:
      enforceSemiSync: true
      replicas: 1
      rdonly: 1
    schema:
      initial: |
        create table product (
          sku varbinary(128),
          description varbinary(128),
          primary key(sku)
        ) comment 'cache';
    vschema:
      initial: |
        {
          "sharded": false,
          "tables": {
            "product": {}
          }
        }

Apply the configuration file:

kubectl apply -f vitess-cluster.yaml --namespace vitess
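The cluster takes a little while to come up after the manifest is applied. One possible way to block until the pods report Ready is kubectl wait; the sketch below wraps it in two hypothetical helpers (run and wait_for_pods are names invented here) with a dry-run switch so the command can be inspected before it touches a cluster.

```shell
# Print the command instead of running it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

wait_for_pods() {
  # $1 = namespace, $2 = timeout (defaults to 300s)
  run kubectl wait --for=condition=Ready pods --all \
    --namespace "$1" --timeout="${2:-300s}"
}

DRY_RUN=1             # set to 0 to actually wait on the cluster
wait_for_pods vitess
```

Note that kubectl wait reports an error if no pods exist yet in the namespace, so it may need to be re-run a few seconds after the apply.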

Expose the Vitess cluster using a Kubernetes Service:

kubectl expose deployment/vitess-example-zone1-vtgate --type=LoadBalancer --port 3306 --target-port 3306 --name=vtgate-service --namespace vitess

Get the URL that Minikube assigns to the Vitess service:

minikube service vtgate-service --namespace vitess --url

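You should see a URL like http://192.168.49.2:<port> as the output, where <port> is usually a NodePort in the 30000-32767 range chosen by Kubernetes rather than 3306 itself.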
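The MySQL client needs the host and the port as separate arguments, so it can help to split the printed URL. A small sketch using only POSIX parameter expansion follows; the helper names and the sample URL are made up for illustration, and whatever port the command actually prints is the one to use.

```shell
# Split a URL such as http://HOST:PORT into its host and port parts.
url_host() {
  hostport=${1#*://}        # drop the scheme
  hostport=${hostport%%/*}  # drop any path
  echo "${hostport%:*}"     # keep what precedes the last colon
}

url_port() {
  hostport=${1#*://}
  hostport=${hostport%%/*}
  echo "${hostport##*:}"    # keep what follows the last colon
}

# Sample value for illustration; in practice use the URL printed by
#   minikube service vtgate-service --namespace vitess --url
VTGATE_URL="http://192.168.49.2:31234"
VTGATE_HOST=$(url_host "$VTGATE_URL")
VTGATE_PORT=$(url_port "$VTGATE_URL")
echo "host=$VTGATE_HOST port=$VTGATE_PORT"
```

With some Minikube drivers the service command stays in the foreground to keep a tunnel open; in that case, copy the URL by hand instead of capturing it in a script.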

Connect to the Vitess Cluster

Now that you have exposed the Vitess cluster, you can connect to it using a MySQL client. In this example, we’ll use the mysql command-line tool.

First, install the MySQL client on your system if you haven’t already:

  • On macOS: brew install mysql-client
  • On Ubuntu: sudo apt-get install mysql-client

Next, retrieve the credentials for the vtgate service:

kubectl get secret vitess-example-zone1-vtgate-auth -o jsonpath='{.data.username}' --namespace vitess | base64 --decode
kubectl get secret vitess-example-zone1-vtgate-auth -o jsonpath='{.data.password}' --namespace vitess | base64 --decode
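Both commands print the decoded value without a trailing newline, so the output tends to run into the shell prompt. A small sketch that captures a decoded value in a variable instead is shown below; decode_b64 is a hypothetical helper and the encoded string is a made-up sample, not a real credential.

```shell
# Decode one base64-encoded value, the form in which Kubernetes stores
# secret data.
decode_b64() {
  printf '%s' "$1" | base64 --decode
}

# Sample encoded value for illustration only. A real one would come from
# the jsonpath query against the secret, without the "| base64 --decode" part.
ENCODED_USER="dXNlcg=="
VT_USER=$(decode_b64 "$ENCODED_USER")
echo "username=$VT_USER"
```

Keeping the password in a variable rather than echoing it also keeps it out of the terminal scrollback.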

Connect to the Vitess cluster using the mysql command and the credentials obtained above:

mysql -u <username> -p<password> -h <IP> -P <PORT>

Replace <username> and <password> with the values you retrieved from the Kubernetes secret, and <IP> and <PORT> with the host and port from the URL that minikube service printed for vtgate-service. Note that there should be no space between -p and the password.

You should now be connected to the Vitess cluster and can issue SQL commands:

USE `commerce`;
SHOW TABLES;
SELECT * FROM product;

Clean Up

When you’re done experimenting with the Vitess cluster, you can delete the resources you created.

Delete the Vitess cluster:

kubectl delete -f vitess-cluster.yaml --namespace vitess

Delete the Vitess Operator:

helm uninstall vitess --namespace vitess

Delete the Vitess namespace:

kubectl delete namespace vitess

Finally, stop the Minikube cluster:

minikube stop

We’ve learned how to deploy a Vitess cluster on a Kubernetes cluster using Minikube. We’ve also seen how to expose the Vitess cluster, connect to it using a MySQL client, and issue SQL commands.

Conclusion

Adopting horizontally scalable distributed cloud-native databases for cloud-native architectures is crucial in today’s modern application landscape. As organizations continue to embrace microservices and Kubernetes, having a database that can scale horizontally, provide continuous availability, and offer powerful RDBMS and NoSQL capabilities is essential. By using a cloud-native database that supports geo-distribution, multi-cloud deployments, and distributed transactions, you can ensure that your applications can meet the ever-growing demands of the cloud-native world.

I have just published a new course — Design Microservices Architecture with Patterns & Principles.

In this course, we're going to learn how to design microservices architecture using design patterns, principles, and best practices. We will move step by step from a monolith to event-driven microservices, using the right architecture design patterns and techniques.

Mehmet Ozkaya

Software Architect | Udemy Instructor | AWS Community Builder | Cloud-Native and Serverless Event-driven Microservices https://github.com/mehmetozkaya