How to Scale MongoDB in Kubernetes Using a StatefulSet and a Headless Service

Introduction

When deploying MongoDB on Kubernetes (K8s), choosing the right workload object is crucial for data consistency and availability. The Deployment object is a straightforward way to manage stateless applications, but it is a poor fit for databases like MongoDB that need stable network identities and persistent storage. This is where StatefulSets and Headless Services shine. A StatefulSet gives each pod a unique, ordered identity and its own persistent storage, so every MongoDB instance keeps its identity and data across restarts and scaling operations. A Headless Service lets clients reach individual MongoDB pods directly through their fully qualified domain names (FQDNs), which is what replica set members need in order to discover each other.

Combining the two gives a MongoDB deployment with stable storage, predictable network names, and a foundation for high availability through MongoDB's own replication. In this blog, we will look at the benefits and intricacies of using StatefulSets and Headless Services to deploy MongoDB on Kubernetes.

Visual Representation

Before starting anything else, we need a visual understanding of what we are going to build. So here is an Excalidraw image.

What is StatefulSet?

For stateful applications like databases, message brokers, or anything else that stores data on persistent disk, a Deployment might not be a good choice. When you scale a Deployment, it creates replicas of your pod that all share the same persistent storage, which can lead to data inconsistency when two pods write to that storage at the same time. A StatefulSet helps us deploy stateful applications in Kubernetes. It gives each pod its own persistent storage, and unlike a Deployment it does not create and delete pods in random order: pods are created in ascending ordinal order and deleted in reverse, so the pod order is predictable and stable. A StatefulSet does not replicate data by itself. For MongoDB, consistency comes from running a primary-secondary (master-slave) architecture on top of it, where a single replica, the primary, is responsible for managing and updating data, and the rest of the replicas, the secondaries, copy that data and can only serve reads.

Why Headless Service?

Using a headless service with a StatefulSet in Kubernetes enables direct communication between individual pods while maintaining stable network identities and DNS-based service discovery. This is particularly useful for stateful applications that require predictable scaling, ordered communication, and direct connections between pods, such as databases or distributed systems. Unlike a regular service, which load-balances across pod instances behind a single cluster IP, a headless service does not perform load balancing: a DNS lookup returns the addresses of the pods themselves.
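To make the DNS naming concrete, here is a minimal sketch of how each pod's stable name is put together. The helper `pod_fqdn` is hypothetical and only formats a string; the namespace `default` and the cluster domain `cluster.local` are assumed defaults that may differ in your cluster.

```shell
# Print the stable DNS name a headless Service gives a StatefulSet pod.
# Pattern: <statefulset>-<ordinal>.<service>.<namespace>.svc.<cluster-domain>
# Namespace and cluster domain default to "default" and "cluster.local" (assumptions).
pod_fqdn() {
  sts="$1"; ordinal="$2"; svc="$3"; ns="${4:-default}"; domain="${5:-cluster.local}"
  printf '%s-%s.%s.%s.svc.%s\n' "$sts" "$ordinal" "$svc" "$ns" "$domain"
}

pod_fqdn mongo 0 mongo   # prints mongo-0.mongo.default.svc.cluster.local
```

These are the same host names that the replica set configuration later in this post uses for its members.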

Steps to deploy MongoDB in K8s

Before diving into the practical implementation, a few key points need to be clear. We will often refer to a "MongoDB Replica Set", which is distinct from the ReplicaSet object in Kubernetes. Kubernetes itself doesn't manage data replication. MongoDB uses its own replication mechanism, the Replica Set, to replicate and synchronize data among nodes. The StatefulSet, in turn, ensures that each MongoDB node keeps a stable network identity, persistent storage, and an ordered scaling process. These features are vital for upholding the integrity of the replica set.

Create ConfigMap

First, we will create a ConfigMap object to store the MongoDB configuration.

apiVersion: v1
kind: ConfigMap
metadata:
  name: mongodb-config
immutable: false  # Indicates that the ConfigMap can be modified
data:
  username: admin1  # Sets the value of the "username" key to "admin1"
  mongodb.conf: |  # Defines the content of the MongoDB configuration file
    storage:
      dbPath: /data/db  # Sets the MongoDB data storage path
    replication:
      replSetName: "rs0"  # Specifies the name of the MongoDB replica set

Now apply the ConfigMap manifest:

kubectl apply -f mongo_configmap.yml

Create Secret

Next, create a Secret object to store the MongoDB password.

apiVersion: v1
kind: Secret
metadata:
  name: mongodb-secret
immutable: false
type: Opaque
stringData:
  password: password_for_mongodb  # stringData takes plain text; values under "data" must be base64-encoded

Now apply the Secret manifest:

kubectl apply -f mongo_secret.yml
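Keep in mind that values placed under a Secret's `data` field must be base64-encoded, while the `stringData` field accepts plain text. If you want to use `data`, a minimal sketch for producing the encoded value looks like this. The helper name `encode_secret_value` is hypothetical.

```shell
# Encode a plain-text value for use under a Secret's "data" field.
# printf is used instead of echo so no trailing newline ends up in the password.
encode_secret_value() {
  printf '%s' "$1" | base64
}

encode_secret_value 'password_for_mongodb'
```

Paste the printed string as the value of `password` under `data`.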

Create Headless Service

apiVersion: v1
kind: Service
metadata:
  name: mongo  # Specifies the name of the Service
spec:
  ports:
    - name: mongo  # Specifies the name of the port
      port: 27017  # Specifies the port number
      targetPort: 27017  # Specifies the target port on the pods
  clusterIP: None  # Specifies that no cluster IP is assigned (Headless Service)
  selector:
    app: mongo  # Selects pods with the label "app: mongo"

Now apply the headless Service manifest:

kubectl apply -f mongo_headless.yml

Create Storage Class

The YAML below defines a Kubernetes StorageClass named "demo-storage" for dynamically provisioning storage volumes using the "k8s.io/minikube-hostpath" provisioner. It binds a volume as soon as a claim is created (volumeBindingMode: Immediate) and deletes the volume when its claim is deleted (reclaimPolicy: Delete).

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: demo-storage  # Specifies the name of the StorageClass
provisioner: k8s.io/minikube-hostpath  # Specifies the provisioner used for volume provisioning
volumeBindingMode: Immediate  # Specifies that volumes should be immediately bound when requested
reclaimPolicy: Delete  # Specifies the reclaim policy for the volumes (Delete in this case)

Now apply the StorageClass manifest:

kubectl apply -f mongo_storageclass.yml

Create StatefulSet

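The YAML below defines a Kubernetes StatefulSet named "mongo" that deploys three MongoDB pods for the replica set. It also includes startup, liveness, and readiness probes for health checks, environment variable definitions, volume mounts, and a volume claim template that gives each pod its own 1Gi volume. Note that setting command replaces the image's entrypoint script, so the MONGO_INITDB_ROOT_* variables are passed to the container but should not be acted on by the image's initialization logic; this setup runs mongod without authentication.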

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongo
spec:
  selector:
    matchLabels:
      app: mongo
  serviceName: "mongo"
  replicas: 3
  template:
    metadata:
      labels:
        app: mongo
    spec:
      containers:
        - name: mongo
          image: mongo:4.0.8
          startupProbe:
            exec:
              command:
                - mongo
                - --eval
                - "db.adminCommand('ping')"
            initialDelaySeconds: 1
            periodSeconds: 10
            timeoutSeconds: 5
            successThreshold: 1
            failureThreshold: 2
          livenessProbe:
            exec:
              command:
                - mongo
                - --eval
                - "db.adminCommand('ping')"
            initialDelaySeconds: 1
            periodSeconds: 10
            timeoutSeconds: 5
            successThreshold: 1
            failureThreshold: 2
          readinessProbe:
            exec:
              command:
                - mongo
                - --eval
                - "db.adminCommand('ping')"
            initialDelaySeconds: 1
            periodSeconds: 10
            timeoutSeconds: 5
            successThreshold: 1
            failureThreshold: 2
          env:
            - name: MONGO_INITDB_ROOT_USERNAME
              valueFrom:
                configMapKeyRef:
                  key: username
                  name: mongodb-config
            - name: MONGO_INITDB_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  key: password
                  name: mongodb-secret
          command:
            - mongod
            - "--bind_ip_all"
            - --config=/etc/mongo/mongodb.conf
          volumeMounts:
            - name: mongo-volume
              mountPath: /data/db
            - name: mongodb-config
              mountPath: /etc/mongo
      volumes:
        - name: mongodb-config
          configMap:
            name: mongodb-config
            items:
              - key: mongodb.conf
                path: mongodb.conf
  volumeClaimTemplates:
    - metadata:
        name: mongo-volume
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: demo-storage
        resources:
          requests:
            storage: 1Gi

Now apply the StatefulSet manifest:

kubectl apply -f mongo_sts.yml

Checking resource behavior inside the StatefulSet

Checking pods

  kubectl get po

After running this command, notice that the pods are created one at a time, in order: mongo-0 first, then mongo-1, then mongo-2.

Increasing replicas

  kubectl scale sts mongo --replicas=9

If we now scale back down, the pods are deleted in reverse order: first mongo-n, then mongo-(n-1), and so on.

To check the PersistentVolumeClaims:

  kubectl get pvc

Even after decreasing the number of pods, the PersistentVolumeClaims and their volumes remain the same, so the data is still there if you scale back up.
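Each claim is named after the pod it belongs to, following the pattern `<volumeClaimTemplate name>-<StatefulSet name>-<ordinal>`, which is how a re-created pod finds its old volume again. A minimal sketch that prints the names you should expect in the `kubectl get pvc` output (the helper `pvc_names` is hypothetical and only formats strings):

```shell
# List the PVC names a StatefulSet creates for its first N pods.
# Pattern: <volumeClaimTemplate>-<statefulset>-<ordinal>
pvc_names() {
  template="$1"; sts="$2"; replicas="$3"
  i=0
  while [ "$i" -lt "$replicas" ]; do
    printf '%s-%s-%s\n' "$template" "$sts" "$i"
    i=$((i + 1))
  done
}

pvc_names mongo-volume mongo 3
```

For the StatefulSet in this post, that gives mongo-volume-mongo-0, mongo-volume-mongo-1, and mongo-volume-mongo-2.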

To check the PersistentVolumes and the claims bound to them:

  kubectl get pv

MongoDB Replica Set config

A StatefulSet doesn't provide replication, but MongoDB supports it internally. To get a primary-secondary (master-slave) architecture, we need to configure a MongoDB replica set ourselves.

First, open the mongo shell inside one replica:

 kubectl exec -it mongo-0 -- mongo

Once inside the shell, run the code given below.

  rs.initiate({
    "_id": "rs0",
    "members": [
      { "_id": 0, "host": "mongo-0.mongo.default.svc.cluster.local:27017" },
      { "_id": 1, "host": "mongo-1.mongo.default.svc.cluster.local:27017" },
      { "_id": 2, "host": "mongo-2.mongo.default.svc.cluster.local:27017" }
    ]
  })

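After a short election, one member becomes the primary instance, which handles both read and write operations, and the others become secondaries. The primary is typically mongo-0, the pod where rs.initiate() was run.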

After this, run the following command to check the replica set status:

rs.status()

If you take a close look at the members field in the result, you will see one entry per pod, each with its current state (PRIMARY or SECONDARY).

Congrats, you have successfully implemented replication in MongoDB using a replica set.
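One thing to keep in mind when scaling: pods added to the StatefulSet later are not part of the MongoDB replica set until they are added with `rs.add()` on the primary. The sketch below prints the `rs.add()` calls for a range of pod ordinals. The helper `rs_add_commands` is hypothetical, and it assumes the StatefulSet and headless Service are both named "mongo" in the "default" namespace, as in the manifests above.

```shell
# Print rs.add() calls for pods with ordinals FROM..TO (inclusive).
# Assumes StatefulSet "mongo", Service "mongo", namespace "default".
rs_add_commands() {
  from="$1"; to="$2"
  i="$from"
  while [ "$i" -le "$to" ]; do
    printf 'rs.add("mongo-%s.mongo.default.svc.cluster.local:27017")\n' "$i"
    i=$((i + 1))
  done
}

rs_add_commands 3 4
```

Paste the printed lines into the mongo shell on the primary. Note that a MongoDB replica set allows at most 7 voting members, so larger sets need some members configured as non-voting.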

But things aren't done yet. By default, you cannot perform read operations on secondary nodes.

Set up a secondary node to enable reading

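By default, a secondary only takes part in replication. To allow read operations on a secondary (slave) node, run this command inside the mongo shell of a node that is not the primary. The setting applies to that shell session only, and newer versions of the shell rename the helper to rs.secondaryOk().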

  rs.slaveOk()

So your connection string URI will be:

mongodb://mongo-0.mongo:27017,mongo-1.mongo:27017,mongo-2.mongo:27017/?replicaSet=rs0
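The `replicaSet` parameter in the URI has to match the `replSetName` from the ConfigMap (`rs0`), and the host list should name the pods that are members of the set. A minimal sketch that builds the URI for a given number of members follows. The helper `mongo_uri` is hypothetical, and the short host names (`mongo-0.mongo`) assume the client runs in the same namespace as the pods.

```shell
# Build a replica-set connection URI from the member count and set name.
# Short names like mongo-0.mongo resolve only from inside the same namespace.
mongo_uri() {
  replicas="$1"; rs_name="$2"
  hosts=""
  i=0
  while [ "$i" -lt "$replicas" ]; do
    # ${hosts:+,} inserts a comma only when hosts is already non-empty
    hosts="${hosts}${hosts:+,}mongo-${i}.mongo:27017"
    i=$((i + 1))
  done
  printf 'mongodb://%s/?replicaSet=%s\n' "$hosts" "$rs_name"
}

mongo_uri 3 rs0
```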
