Many applications hosted in a Docker container need a volume to store data on or to read data from. The data can’t be stored in the Docker container itself, because it will be lost after a restart or when the container crashes. Persistent storage has a lifecycle that is independent of the Pod. This blog post shows the most commonly used options for persistent storage with Kubernetes on Azure.
Let’s first see how to use a volume using Docker on your Windows PC.
Docker Volume
We begin without Kubernetes. Let’s first see what we want to accomplish on our own local machine.
- Here on my github account you can find a simple .NET Core project. This Console application writes the date and time every 5 seconds to a file and to the Console. This file should not be stored inside the Docker container, because when the container is removed, we will lose the file. This is because Docker containers store data ephemerally. The file should be written to the filesystem of the host instead. In this case, Windows. Clone this Github project.
- Publish the project to directory obj/Docker/publish
- In the command prompt, in the directory of the project, run
docker build . -t pnimages.azurecr.io/storagedemo
- Create directory c:\storage
- Now run the Docker container and mount the host directory c:\storage into the container at /storage:
docker run -d -it -v c:/storage:/storage pnimages.azurecr.io/storagedemo
Every 5 seconds the date and time will be written to c:\storage\log.txt. When we stop the container, the file stays in its location. So we have persistent storage. This is what we want using Kubernetes.
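To make the behaviour of the demo application concrete, here is a minimal Python sketch of the same idea. It is not the .NET Core project from the repository, only an illustration: the function names, the `iterations` parameter and the default path are my own assumptions.

```python
import time
from datetime import datetime
from pathlib import Path
from typing import Optional


def append_timestamp(log_file: Path) -> str:
    """Append the current date and time to log_file and echo it to the console."""
    line = datetime.now().isoformat(timespec="seconds")
    # Create the target directory if the mounted volume is still empty.
    log_file.parent.mkdir(parents=True, exist_ok=True)
    with log_file.open("a", encoding="utf-8") as handle:
        handle.write(line + "\n")
    print(line)
    return line


def run(log_file: Path, interval_seconds: float = 5.0,
        iterations: Optional[int] = None) -> None:
    """Write a timestamp every interval_seconds; loop forever if iterations is None."""
    count = 0
    while iterations is None or count < iterations:
        append_timestamp(log_file)
        count += 1
        if iterations is None or count < iterations:
            time.sleep(interval_seconds)
```

Calling `run(Path("/storage/log.txt"))` inside the container would keep appending to the file on the mounted volume, so the file outlives the container.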
Prerequisite for Kubernetes
To use the Storagedemo Docker image in Kubernetes, the Docker image needs to be available in a Docker registry. I’ve pushed the image to my Azure Container Registry (ACR). My Container Registry is named pnimages. Replace this name with the name of your ACR.
- Add an extra tag to the image with a version number:
docker tag pnimages.azurecr.io/storagedemo pnimages.azurecr.io/storagedemo:1
- Log in to Azure Container Registry:
docker login pnimages.azurecr.io -u pnimages -p [password]
- Push the image to the registry:
docker push pnimages.azurecr.io/storagedemo
docker push pnimages.azurecr.io/storagedemo:1
- Create a secret in Kubernetes with the credentials of the ACR:
kubectl create secret docker-registry pnimages --docker-server=https://pnimages.azurecr.io --docker-username=pnimages --docker-password=[password] --docker-email=pnaber@xpirit.com
Kubernetes Volume
For Azure there are 2 kinds of volumes available in Kubernetes: AzureDisk and AzureFile. AzureDisk maps to a VHD. AzureFile maps to a directory in a file share of an Azure Storage Account.
Kubernetes Volume with AzureFile
What we will create looks like this:
The steps to use AzureFile:
- Create an Azure Storage Account in any resourcegroup you like, but in the same location as your Kubernetes cluster.
- Create a File Share with the name storage
- Get the name and the key of the storage account, and base64 encode both values. I’ve used https://www.base64encode.org/ to do this.
- Create a secret in Kubernetes. Replace the value of azurestorageaccountname and azurestorageaccountkey with the values you obtained in step 3.
apiVersion: v1
kind: Secret
metadata:
  name: volume-azurefile-storage-secret
type: Opaque
data:
  azurestorageaccountname: [base64encodedStorageAccountName]
  azurestorageaccountkey: [base64encodedStorageAccountKey]
kubectl apply -f secret.yaml
- Deploy the Docker image with a Deployment of a pod:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: volume-azurefile-storage-deployment
spec:
  template:
    metadata:
      labels:
        app: storagedemo
    spec:
      containers:
      - name: storagedemo
        image: pnimages.azurecr.io/storagedemo:1
        volumeMounts:
        - name: azurefileshare
          mountPath: /storage
      imagePullSecrets:
      - name: pnimages
      volumes:
      - name: azurefileshare
        azureFile:
          secretName: volume-azurefile-storage-secret
          shareName: storage
          readOnly: false
kubectl apply -f deployment.yaml
- The Docker container will start immediately, and when you take a look in the file share of the Azure Storage Account, you will see that a log.txt file is being written.
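If you would rather not paste a storage account key into a website, the base64 encoding can also be done locally. The sketch below builds the same Secret as a Python dictionary. The helper names and the example values are hypothetical; the Secret name and the two data keys are the ones used in the steps above.

```python
import base64
import json


def b64(value: str) -> str:
    """Return the base64 encoding of a UTF-8 string, as Kubernetes expects in Secret data."""
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


def build_storage_secret(secret_name: str, account_name: str, account_key: str) -> dict:
    """Build an Opaque Secret manifest holding the storage account name and key."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": secret_name},
        "type": "Opaque",
        "data": {
            "azurestorageaccountname": b64(account_name),
            "azurestorageaccountkey": b64(account_key),
        },
    }


# Hypothetical values; replace them with your own account name and key.
secret = build_storage_secret(
    "volume-azurefile-storage-secret", "mystorageaccount", "not-a-real-key"
)
manifest_json = json.dumps(secret, indent=2)
```

Because kubectl accepts JSON as well as YAML, `manifest_json` can be saved to a file and applied with kubectl apply -f.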
Advantages and disadvantages:
+ You can use a storage account in any resource group you like (it must be in the same region though)
– You have to do some work to create the secret with the base64 encoded values
– A Pod is directly coupled with a Volume, so the Pod has knowledge about the underlying cloud.
Kubernetes Persistent Storage
Kubernetes has a concept called StorageClass. With StorageClasses, administrators can offer profiles for storage. For example, a profile named slow-storage to store data on an HDD, or a profile named fast-storage to store data on an SSD. The kind of storage is determined by the provisioner. For Azure there are 2 kinds of provisioners: AzureFile and AzureDisk.
The big difference between AzureFile and AzureDisk is the AccessMode. There are 3 AccessModes.
ReadWriteOnce – the volume can be mounted as read-write by a single node
ReadOnlyMany – the volume can be mounted read-only by many nodes
ReadWriteMany – the volume can be mounted as read-write by many nodes
AzureFile supports all three. AzureDisk supports ReadWriteOnce only.
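The access mode support described above can be written down as a small lookup. This is only a sketch of the statements in this post, not an API of Kubernetes or Azure; the dictionary and function names are made up for the example.

```python
from typing import Dict, List, Set

# Access modes per Azure volume type, as described in this post.
SUPPORTED_ACCESS_MODES: Dict[str, Set[str]] = {
    "AzureFile": {"ReadWriteOnce", "ReadOnlyMany", "ReadWriteMany"},
    "AzureDisk": {"ReadWriteOnce"},
}


def volume_types_for(access_mode: str) -> List[str]:
    """Return the Azure volume types that support the given access mode, sorted by name."""
    return sorted(
        name for name, modes in SUPPORTED_ACCESS_MODES.items() if access_mode in modes
    )
```

For example, `volume_types_for("ReadWriteMany")` only leaves AzureFile, while `volume_types_for("ReadWriteOnce")` returns both types.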
If a pod that uses a volume to read or write data is restarted on a different host for whatever reason, you can’t use AzureDisk.
With AzureDisk, a VHD is attached to the node. The number of VHDs that can be attached depends on the size of the node. As a general rule you can have twice as many attached disks as CPU cores.
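As a quick worked example of this rule of thumb: a node with 4 cores could hold roughly 8 attached disks. The sketch below only encodes the "twice as many disks as cores" rule from this post; the real limit depends on the VM size, so treat the function (whose name is my own) as an estimate.

```python
def estimated_max_attached_disks(cpu_cores: int) -> int:
    """Estimate the number of attachable disks using the 2-disks-per-core rule of thumb."""
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    return 2 * cpu_cores


def fits_on_node(cpu_cores: int, requested_disks: int) -> bool:
    """Check whether the requested number of disks stays within the estimate."""
    return requested_disks <= estimated_max_attached_disks(cpu_cores)
```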
AzureFile uses SMB shares. Because of its performance, it is not suitable for storing the data of databases, for example.
If you have multiple Pods of the same image (multiple replicas) that work together as a group, and you want to mount a volume per Pod to write data to, you may want to use a StatefulSet. This is a subject for another blog post.
Azure Container Service with Kubernetes offers StorageClasses out of the box:
kubectl get storageclass
AKS has a different number of out of the box StorageClasses.
As you can see, all 3 StorageClasses in ACS are of the azure-disk type. There is a standard StorageClass that stores to HDD and a premium one that stores to SSD. There is also a default StorageClass. This one stores to HDD as well and is marked as the default StorageClass. It is possible to mark another StorageClass as default.
There is no name of a Storage Account configured in the StorageClasses above. When one of the StorageClasses is used, a Storage Account with a generated name is provisioned and used.
To use a specific storageAccount you can create a StorageClass yourself. This StorageClass can optionally reference a specific StorageAccount and uses the azure-file provisioner to use fileshares.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: azurefilestorage
provisioner: kubernetes.io/azure-file
parameters:
  storageAccount: pnstandarddatastorage
Kubernetes Dynamic Persistent Volume (with non-default StorageClass)
We are only going to provision a PersistentVolumeClaim and a Pod. The PersistentVolume and the Secret are created dynamically. Because we are not using the default StorageClass, we also need to create a custom StorageClass. This StorageClass can be reused for multiple deployments.
- Make sure you have created the StorageAccount. This StorageAccount must be located in the same ResourceGroup as the Cluster!
- Create the StorageClass and reference the name of the existing StorageAccount
- A PersistentVolumeClaim (PVC) can request a size and specify an accessmode.
Deploy the PersistentVolumeClaim. The created StorageClass is referenced.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-persistence-volume-claim
spec:
  accessModes:
  - ReadWriteMany
  storageClassName: azurefilestorage
  resources:
    requests:
      storage: 5Gi
When we deploy a PersistentVolumeClaim without the line storageClassName: azurefilestorage the default StorageClass is used! A StorageAccount is provisioned in the same ResourceGroup as the cluster.
After creating this PersistentVolumeClaim (PVC), Kubernetes creates a new Secret with the access parameters of the StorageAccount. This is the same kind of Secret we created ourselves earlier with the base64 encoded values. The name is azure-storage-account–secret
Besides the PVC, Kubernetes also creates a PersistentVolume. The name is pvc-.
A FileShare with a generated name is also created in the StorageAccount.
- Now we are ready to deploy our Docker container:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: dynamic-persistence-volume-deployment
spec:
  template:
    metadata:
      labels:
        app: storagedemo
    spec:
      containers:
      - name: storagedemo
        image: pnimages.azurecr.io/storagedemo:1
        volumeMounts:
        - name: azurefileshare
          mountPath: /storage
      imagePullSecrets:
      - name: pnimages
      volumes:
      - name: azurefileshare
        persistentVolumeClaim:
          claimName: dynamic-persistence-volume-claim
If you want to use the default StorageClass, just leave out the line storageClassName: azurefilestorage from the PersistentVolumeClaim.
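The difference between a claim for a custom StorageClass and a claim for the default StorageClass is a single field. The sketch below makes that explicit by building the claim as a Python dictionary. The function and parameter names are hypothetical; the field names follow the PersistentVolumeClaim manifest used in this post.

```python
import json
from typing import Optional


def build_pvc(name: str, size: str, access_mode: str = "ReadWriteMany",
              storage_class: Optional[str] = None) -> dict:
    """Build a PersistentVolumeClaim manifest.

    When storage_class is None, storageClassName is left out, so the
    cluster falls back to its default StorageClass.
    """
    spec = {
        "accessModes": [access_mode],
        "resources": {"requests": {"storage": size}},
    }
    if storage_class is not None:
        spec["storageClassName"] = storage_class
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": spec,
    }


custom_claim = build_pvc("dynamic-persistence-volume-claim", "5Gi",
                         storage_class="azurefilestorage")
default_claim = build_pvc("dynamic-persistence-volume-claim", "5Gi")
custom_claim_json = json.dumps(custom_claim, indent=2)
```

Both variants can be written to a file as JSON and applied with kubectl apply -f, just like the YAML versions.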
Advantages and disadvantages:
+ The secret is generated. No manual base64 encoding needed.
+ You decide the name of the StorageAccount
+ The StorageClass is created once and can be reused. For every deployment you only create a Pod and a PersistentVolumeClaim
+ Deployment of the Pod is Cloud agnostic, the Pod doesn’t know anything about the underlying cloud.
– The Azure Storage Account has to be in the same ResourceGroup as the cluster
– The name of the FileShare in the Storage Account can’t be configured. Its name is generated.
– No reuse of PersistentVolumes. Every Deployment results not only in a PersistentVolumeClaim, but also in a new PersistentVolume. This is because the PersistentVolume is created with exactly the size requested in the Claim and never with a larger size, so a new PersistentVolume is always created for a new Claim. (It’s technically possible to reference the same PersistentVolumeClaim from the Pod)
Static Persistent Volume
When static persistence is used, a PersistentVolume is created manually. This way the administrator of the cluster can make storage available to developers of Pods in a managed way.
- Make sure you have created the StorageAccount. This StorageAccount must be located in the same ResourceGroup as the Cluster!
- Create the StorageClass and reference the name of the existing StorageAccount. This is commonly done by the cluster administrator.
- Create a PersistentVolume. This is commonly done by the cluster administrator.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: static-persistence-volume
  labels:
    storage: slow
spec:
  capacity:
    storage: 50Gi
  accessModes:
  - ReadWriteMany
  storageClassName: azurefilestorage
  azureFile:
    secretName: static-persistence-secret
    shareName: storage
    readOnly: false
- Create the secret that is referenced in the PersistentVolume
Both the azurestorageaccountname and azurestorageaccountkey should be base64 encoded.
apiVersion: v1
kind: Secret
metadata:
  name: static-persistence-secret
type: Opaque
data:
  azurestorageaccountname: [base64encodedStorageAccountName]
  azurestorageaccountkey: [base64encodedStorageAccountKey]
- Now the steps look similar to the dynamic way.
Create a PersistentVolumeClaim. Note that the storageClassName matches the storageClassName in the PersistentVolume.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: static-persistence-volume-claim
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
  storageClassName: azurefilestorage
- Create the Pod:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: static-persistence-volume-deployment
spec:
  template:
    metadata:
      labels:
        app: storagedemo
    spec:
      containers:
      - name: storagedemo
        image: pnimages.azurecr.io/storagedemo:1
        volumeMounts:
        - name: azurefileshare
          mountPath: /storage
      imagePullSecrets:
      - name: pnimages
      volumes:
      - name: azurefileshare
        persistentVolumeClaim:
          claimName: static-persistence-volume-claim
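With static persistence, the claim only binds to a manually created PersistentVolume when the two fit together. The sketch below is a deliberately simplified model of that matching: same storageClassName, requested access modes offered by the volume, and enough capacity. It is not the Kubernetes controller logic, which considers more criteria, and the flat dictionaries and function names are assumptions for the example.

```python
# Binary size suffixes used in this post's manifests (for example 5Gi, 50Gi).
_UNITS = {"Mi": 2 ** 20, "Gi": 2 ** 30, "Ti": 2 ** 40}


def parse_quantity(quantity: str) -> int:
    """Convert a size such as '5Gi' to bytes; only Mi, Gi and Ti are handled here."""
    for suffix, factor in _UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    raise ValueError("unsupported quantity: " + quantity)


def can_bind(volume: dict, claim: dict) -> bool:
    """Simplified check whether a claim could bind to a volume."""
    same_class = volume["storageClassName"] == claim["storageClassName"]
    modes_offered = set(claim["accessModes"]) <= set(volume["accessModes"])
    big_enough = parse_quantity(volume["capacity"]) >= parse_quantity(claim["request"])
    return same_class and modes_offered and big_enough


volume = {"storageClassName": "azurefilestorage",
          "accessModes": ["ReadWriteMany"], "capacity": "50Gi"}
claim = {"storageClassName": "azurefilestorage",
         "accessModes": ["ReadWriteMany"], "request": "5Gi"}
typo_claim = dict(claim, storageClassName="azurefilestorag")
```

In this model `can_bind(volume, claim)` holds, while the claim with the misspelled StorageClass name does not match the volume.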
Advantages and disadvantages:
+ You control all naming, both for the FileShare and for the StorageAccount
+ It is possible to apply separation of concerns regarding roles: cluster administrator vs. developers
+ Deployment of the Pod is Cloud agnostic, the Pod doesn’t know anything about the underlying cloud.
– The Azure Storage Account has to be in the same ResourceGroup as the cluster
– Relatively a lot of work to configure everything, including the Secret.
– If you make a mistake in naming and no PersistentVolume can be matched by the PersistentVolumeClaim, a new PersistentVolume is generated.
All files can be found on this github repo.