Dynamically provisioned PersistentVolumes using StatefulSet

kubernetes PersistentVolume Deployment StatefulSet

6 min read | by Jordi Prats

The basic idea behind a StatefulSet is to manage stateful workloads on Kubernetes. Unlike a Deployment, it gives each Pod a unique, stable identity while still using a common spec.

With this in mind we might just copy the Pod's template from a Deployment to a StatefulSet object to make it stateful, but it's not always quite that simple.

You can find all the object definitions in the pet2cattle/kubernetes-statefullset-vs-deployment repository on GitHub.

Given the following Deployment:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: deploy-test
    spec:
      replicas: 1
      selector:
        matchLabels:
          component: deploy-test
      template:
        metadata:
          labels:
            component: deploy-test
        spec:
          volumes:
          - name: empty-dir
            emptyDir: {}
          containers:
          - name: file-generator
            image: "alpine:latest"
            command:
            - sleep
            - '24h'
            volumeMounts:
            - mountPath: /test
              name: empty-dir

Let's blindly copy its spec.template into a StatefulSet, adding a volumeClaimTemplates section:

    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: sts-test
    spec:
      serviceName: default
      replicas: 1
      selector:
        matchLabels:
          component: sts-test
      volumeClaimTemplates:
      - metadata:
          name: sts-test
          labels:
            component: sts-test
        spec:
          accessModes: [ "ReadWriteOnce" ]
          resources:
            requests:
              storage: 10Gi
      template:
        metadata:
          labels:
            component: sts-test
        spec:
          volumes:
          - name: empty-dir
            emptyDir: {}
          containers:
          - name: file-generator
            image: "alpine:latest"
            command:
            - sleep
            - '24h'
            volumeMounts:
            - mountPath: /test
              name: empty-dir

If we deploy both objects, we can see that the Deployment creates a Pod with a hash in its name. Meanwhile, the StatefulSet gives its Pod a friendlier name, made of the StatefulSet name and an ordinal:

    $ kubectl get pods
    NAME                           READY   STATUS    RESTARTS   AGE
    deploy-test-57bb4d58bf-c67ck   1/1     Running   0          79s
    sts-test-0                     1/1     Running   0          37s

The friendlier name doesn't really mean anything by itself. What's important to note is that with a Deployment, Pods are meant to be interchangeable (cattle). With a StatefulSet, on the other hand, each Pod has its own identity (pets).

Another important difference is that the StatefulSet creates a PersistentVolume. Note that it comes from the volumeClaimTemplates section, not from the emptyDir volume that the container mounts:

    $ kubectl get pv | grep sts
    pvc-30131c25-2c6c-4883-9f30-58793c72b442   10Gi   RWO   Delete   Bound   test/sts-test-sts-test-0   ebs-gp2   68s
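The claim name in the output above, sts-test-sts-test-0, follows the pattern the StatefulSet controller uses for claims created from volumeClaimTemplates: the template name, then the StatefulSet name, then the Pod ordinal. A minimal sketch of that naming rule; the helper name pvc_name is hypothetical and is not part of kubectl:

```shell
# Hypothetical helper (not a kubectl feature): build the PVC name that a
# StatefulSet generates from one of its volumeClaimTemplates.
#   $1 = volumeClaimTemplate name
#   $2 = StatefulSet name
#   $3 = Pod ordinal
pvc_name() {
  printf '%s-%s-%s\n' "$1" "$2" "$3"
}

# Template "sts-test" in StatefulSet "sts-test", first replica
pvc_name sts-test sts-test 0   # prints sts-test-sts-test-0
```

With more replicas, each Pod gets its own claim, so the second replica of the same StatefulSet would use sts-test-sts-test-1.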

Does this mean that the data written to /test is persistent? Actually, it's not.

Let's try to write some data on the volume using the Pod created by the Deployment and then delete the Pod:

    $ kubectl get pods
    NAME                           READY   STATUS    RESTARTS   AGE
    deploy-test-57bb4d58bf-c67ck   1/1     Running   0          2m14s
    $ kubectl exec -it deploy-test-57bb4d58bf-c67ck -- ls -l /test
    total 0
    $ kubectl exec -it deploy-test-57bb4d58bf-c67ck -- touch /test/persistence
    $ kubectl exec -it deploy-test-57bb4d58bf-c67ck -- ls -l /test
    total 0
    -rw-r--r--    1 root     root             0 Feb 16 22:31 persistence
    $ kubectl delete pod deploy-test-57bb4d58bf-c67ck
    pod "deploy-test-57bb4d58bf-c67ck" deleted

As expected, if we check the volume again from the replacement Pod, it is empty:

    $ kubectl get pods
    NAME                           READY   STATUS    RESTARTS   AGE
    deploy-test-57bb4d58bf-878x2   1/1     Running   0          57s
    sts-test-0                     1/1     Running   0          4m35s
    $ kubectl exec -it deploy-test-57bb4d58bf-878x2 -- ls -l /test
    total 0

How is this going to be handled by a StatefulSet? A PersistentVolume has been provisioned, but the container still mounts the emptyDir at /test, so its data is lost every time the Pod is recreated. We can repeat the same test on the Pod created by the StatefulSet:

    $ kubectl exec -it sts-test-0 -- ls -l /test
    total 0
    $ kubectl exec -it sts-test-0 -- touch /test/persistence
    $ kubectl exec -it sts-test-0 -- ls -l /test
    total 0
    -rw-r--r--    1 root     root             0 Feb 16 22:34 persistence
    $ kubectl delete pod sts-test-0
    pod "sts-test-0" deleted
    $ kubectl get pods
    NAME                           READY   STATUS    RESTARTS   AGE
    deploy-test-57bb4d58bf-878x2   1/1     Running   0          3m27s
    sts-test-0                     1/1     Running   0          17s
    $ kubectl exec -it sts-test-0 -- ls -l /test
    total 0

The data is not lost only because we deleted the Pod by hand; with a rollout restart we get the same result:

    $ kubectl exec -it sts-test-0 -- ls -l /test
    total 0
    $ kubectl exec -it sts-test-0 -- touch /test/persistence
    $ kubectl exec -it sts-test-0 -- ls -l /test
    total 0
    -rw-r--r--    1 root     root             0 Feb 16 22:39 persistence
    $ kubectl rollout restart sts sts-test
    statefulset.apps/sts-test restarted
    $ kubectl get pods
    NAME         READY   STATUS        RESTARTS   AGE
    sts-test-0   1/1     Terminating   0          4m33s
    $ kubectl get pods
    NAME         READY   STATUS    RESTARTS   AGE
    sts-test-0   1/1     Running   0          34s
    $ kubectl exec -it sts-test-0 -- ls -l /test
    total 0

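The volume is wiped because the container mounts an emptyDir, and the contents of an emptyDir are deleted permanently every time its Pod is removed (deleted by hand, replaced by a rollout restart, ...). A crash of the container alone does not clear it, but any replacement Pod starts with an empty directory: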

    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: sts-test
    spec:
    (...)
      template:
        metadata:
          labels:
            component: sts-test
        spec:
          volumes:
          - name: empty-dir
            emptyDir: {}
          containers:
          - name: file-generator
            image: "alpine:latest"
            command:
            - sleep
            - '24h'
            volumeMounts:
            - mountPath: /test
              name: empty-dir
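To confirm which volume actually backs /test, one option is to compare the volumes the Pod declares with the ones its container mounts. This is a minimal sketch using kubectl's JSONPath output. The helper names mounted_volumes and declared_volumes are hypothetical, and it assumes the Pod has a single container, as in the examples above:

```shell
# Hypothetical helper: print the names of the volumes mounted by the
# first (and here only) container of the given Pod.
mounted_volumes() {
  kubectl get pod "$1" -o jsonpath='{.spec.containers[0].volumeMounts[*].name}'
}

# Hypothetical helper: print the names of every volume declared in the
# Pod spec, whether or not a container mounts it.
declared_volumes() {
  kubectl get pod "$1" -o jsonpath='{.spec.volumes[*].name}'
}

# Usage against the Pod from this post (requires access to the cluster):
#   mounted_volumes sts-test-0
#   declared_volumes sts-test-0
```

For sts-test-0 we would expect mounted_volumes to list empty-dir but not sts-test, possibly next to volumes the cluster injects on its own, such as a service account token. That would show the claim exists but is never mounted by the container.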

If we want data to be persistent across restarts, what we really want is a dynamically provisioned PersistentVolume (one for each replica). To accomplish this we can use volumeClaimTemplates. We only need to make sure the template's name matches the name of the volume we are mounting using volumeMounts (there is no need to declare the volume under spec.template.spec.volumes):

    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: sts-vt-test
    spec:
      serviceName: default
      replicas: 1
      selector:
        matchLabels:
          component: sts-vt-test
      volumeClaimTemplates:
      - metadata:
          name: sts-vt-test
          labels:
            component: sts-vt-test
        spec:
          accessModes: [ "ReadWriteOnce" ]
          resources:
            requests:
              storage: 10Gi
      template:
        metadata:
          labels:
            component: sts-vt-test
        spec:
          containers:
          - name: file-generator
            image: "alpine:latest"
            command:
            - sleep
            - '24h'
            volumeMounts:
            - mountPath: /test
              name: sts-vt-test

Once this new StatefulSet object is deployed, we can create a test file and then delete the Pod as we did previously, but this time the file is still there:

    $ kubectl apply -f sts-volume-template.yaml
    statefulset.apps/sts-vt-test created
    $ kubectl get pv | grep sts
    pvc-1bde765e-cd2b-4de9-a1f5-92095dccc11a   10Gi   RWO   Delete   Bound   test/sts-vt-test-sts-vt-test-0   ebs-gp2   12s
    $ kubectl exec -it sts-vt-test-0 -- ls -l /test
    total 16
    drwx------    2 root     root         16384 Feb 16 23:54 lost+found
    $ kubectl exec -it sts-vt-test-0 -- touch /test/persistence
    $ kubectl exec -it sts-vt-test-0 -- ls -l /test
    total 16
    drwx------    2 root     root         16384 Feb 16 23:54 lost+found
    -rw-r--r--    1 root     root             0 Feb 16 23:55 persistence
    $ kubectl delete pod sts-vt-test-0
    pod "sts-vt-test-0" deleted
    $ kubectl exec -it sts-vt-test-0 -- ls -l /test
    total 16
    drwx------    2 root     root         16384 Feb 16 23:54 lost+found
    -rw-r--r--    1 root     root             0 Feb 16 23:55 persistence
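The last exec in the session above runs right after the Pod is deleted, so in practice it may fail for a few seconds while the replacement Pod is scheduled and the volume is reattached. A minimal sketch of a retry wrapper for that step; the helper name retry_exec, the 30 attempts and the 2 second pause are all assumptions made for illustration:

```shell
# Hypothetical helper: run a command in a Pod, retrying while the Pod is
# still being recreated.
#   $1   = Pod name
#   $2.. = command to run inside the Pod
retry_exec() {
  pod=$1
  shift
  attempts=0
  # kubectl exec returns non-zero while the Pod or container is not ready
  until kubectl exec "$pod" -- "$@"; do
    attempts=$((attempts + 1))
    if [ "$attempts" -ge 30 ]; then
      return 1   # give up after 30 failed attempts
    fi
    sleep 2
  done
}

# Usage (requires access to the cluster):
#   kubectl delete pod sts-vt-test-0
#   retry_exec sts-vt-test-0 ls -l /test
```

Because the StatefulSet recreates the Pod under the same name, the same command works before and after the deletion, which is not the case for the Pod created by the Deployment.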

Posted on 21/02/2022