4 min read | by Jordi Prats
While trying to deploy Pods, we might notice in the Events section that a Pod cannot be scheduled due to a volume node affinity conflict:
$ kubectl describe pod website-365-flask-ampa2-ha-member-1 -n website-365
Name:           website-365-flask-ampa2-ha-member-1
Namespace:      website-365
Priority:       0
Node:           <none>
Labels:         (...)
Annotations:    (...)
Status:         Pending
IP:
IPs:            <none>
Controlled By:  StatefulSet/website-365-flask-ampa2-ha-member
Init Containers:
  (...)
Containers:
  (...)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  volume:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  volume-website-365-flask-ampa2-ha-member-1
    ReadOnly:   false
(...)
Events:
  Type     Reason             Age                       From                Message
  ----     ------             ----                      ----                -------
  Normal   NotTriggerScaleUp  31m (x20835 over 7d19h)   cluster-autoscaler  pod didn't trigger scale-up: 2 node(s) had taint {pti/role: system}, that the pod didn't tolerate, 1 node(s) had volume node affinity conflict
  Normal   NotTriggerScaleUp  95s (x46144 over 7d19h)   cluster-autoscaler  pod didn't trigger scale-up: 1 node(s) had volume node affinity conflict, 2 node(s) had taint {pti/role: system}, that the pod didn't tolerate
  Warning  FailedScheduling   64s (x2401 over 43h)      default-scheduler   0/4 nodes are available: 2 node(s) had taint {pti/role: system}, that the pod didn't tolerate, 2 node(s) had volume node affinity conflict.
This message means that the node sits in a different availability zone than the volume the Pod is trying to use, so the Pod cannot be scheduled on that node: it wouldn't be able to mount the requested volume.
We can check it by looking at the Volumes section:
$ kubectl describe pod website-365-flask-ampa2-ha-member-1 -n website-365
Name:           website-365-flask-ampa2-ha-member-1
Namespace:      website-365
Priority:       0
Node:           <none>
(...)
Volumes:
  volume:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  volume-website-365-flask-ampa2-ha-member-1
    ReadOnly:   false
(...)
We'll need to check the PVC first to retrieve the actual volume it is using:
$ kubectl get pvc -n website-365
NAME                                          STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
data-website-365-postgresql-0                 Bound    pvc-dc818c5c-2677-4bc0-aa32-e141e0ac1516   200Gi      RWO            ebs-gp2        41d
volume-website-365-flask-ampa2-ha-member-0    Bound    pvc-710b454f-c06b-4367-b8da-1ec5a3d78a00   200Gi      RWO            ebs-gp2        41d
volume-website-365-flask-ampa2-ha-member-1    Bound    pvc-a0cb18a4-b471-4169-b408-699aedaed33d   200Gi      RWO            ebs-gp2        41d
volume-website-365-flask-ampa2-ha-primary-0   Bound    pvc-7d4ea83f-da45-44bd-88eb-801950abb8de   200Gi      RWO            ebs-gp2        41d
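When only one claim matters, the bound volume name can be pulled out directly instead of reading it off the table. The following is a minimal sketch: the function name pv_for_pvc is made up for illustration, and it assumes the PVC is already Bound (otherwise the output is empty).

```shell
# Hypothetical helper (not part of kubectl): print the name of the
# PersistentVolume bound to a given PVC.
# Usage: pv_for_pvc <namespace> <pvc-name>
pv_for_pvc() {
  # .spec.volumeName holds the name of the PV once the claim is bound
  kubectl get pvc "$2" -n "$1" -o jsonpath='{.spec.volumeName}'
}
```

For the Pod above, it would be called as: pv_for_pvc website-365 volume-website-365-flask-ampa2-ha-member-1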
If we describe the PersistentVolume, we'll be able to see which availability zone it is in:
$ kubectl describe pv pvc-a0cb18a4-b471-4169-b408-699aedaed33d
Name:              pvc-a0cb18a4-b471-4169-b408-699aedaed33d
Labels:            <none>
Annotations:       pv.kubernetes.io/provisioned-by: ebs.csi.aws.com
Finalizers:        [kubernetes.io/pv-protection external-attacher/ebs-csi-aws-com]
StorageClass:      ebs-gp2
Status:            Bound
Claim:             website-365/volume-website-365-flask-ampa2-ha-member-1
Reclaim Policy:    Delete
Access Modes:      RWO
VolumeMode:        Filesystem
Capacity:          200Gi
Node Affinity:
  Required Terms:
    Term 0:        topology.ebs.csi.aws.com/zone in [eu-west-1b]
Message:
Source:
    Type:              CSI (a Container Storage Interface (CSI) volume source)
    Driver:            ebs.csi.aws.com
    FSType:            ext4
    VolumeHandle:      vol-09923383c7c9af32f
    ReadOnly:          false
    VolumeAttributes:      storage.kubernetes.io/csiProvisionerIdentity=1633054440112-8081-ebs.csi.aws.com
Events:                <none>
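The Node Affinity block is the part that matters here, and it can be extracted on its own with a JSONPath query. This is only a sketch: pv_zones is an illustrative name, and it assumes the volume's affinity terms contain nothing but zone expressions, as in the output above. A volume with other kinds of match expressions would print those values too.

```shell
# Hypothetical helper: print the values of every node affinity match
# expression of a PersistentVolume (the zone, for the EBS volume above).
# Usage: pv_zones <pv-name>
pv_zones() {
  kubectl get pv "$1" \
    -o jsonpath='{.spec.nodeAffinity.required.nodeSelectorTerms[*].matchExpressions[*].values[*]}'
}
```

For the volume above, it would be called as: pv_zones pvc-a0cb18a4-b471-4169-b408-699aedaed33d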
Now it's just a matter of checking the availability zone of each of the nodes:
$ kubectl get nodes
NAME                                           STATUS   ROLES    AGE     VERSION
ip-10-120-194-190.eu-west-1.compute.internal   Ready    <none>   7d22h   v1.21.4-eks-033ce7e
ip-10-120-194-235.eu-west-1.compute.internal   Ready    <none>   37d     v1.21.4-eks-033ce7e
ip-10-120-195-8.eu-west-1.compute.internal     Ready    <none>   8m28s   v1.21.4-eks-033ce7e
ip-10-120-197-126.eu-west-1.compute.internal   Ready    <none>   14h     v1.21.4-eks-033ce7e

$ kubectl describe node ip-10-120-195-8.eu-west-1.compute.internal
Name:               ip-10-120-195-8.eu-west-1.compute.internal
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=m5a.xlarge
                    beta.kubernetes.io/os=linux
                    failure-domain.beta.kubernetes.io/region=eu-west-1
                    failure-domain.beta.kubernetes.io/zone=eu-west-1a
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=ip-10-120-195-8.eu-west-1.compute.internal
                    kubernetes.io/os=linux
                    node.kubernetes.io/instance-type=m5a.xlarge
                    pti/eks-workers-group-name=default
                    pti/lifecycle=spot
                    topology.ebs.csi.aws.com/zone=eu-west-1a
                    topology.kubernetes.io/region=eu-west-1
                    topology.kubernetes.io/zone=eu-west-1a
                    vpc.amazonaws.com/has-trunk-attached=true
Annotations:        csi.volume.kubernetes.io/nodeid: {"ebs.csi.aws.com":"i-0e34bcb1ab40300fb"}
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
(...)
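Describing every node one by one gets tedious on larger clusters. A label selector can list just the nodes in the zone the volume requires. The sketch below uses the topology.ebs.csi.aws.com/zone label because that is the key shown in the volume's node affinity; the function name nodes_in_zone is made up for illustration.

```shell
# Hypothetical helper: list the nodes whose EBS CSI zone label matches
# the given availability zone. An empty result means no current node
# can mount a volume that lives in that zone.
# Usage: nodes_in_zone <zone>
nodes_in_zone() {
  kubectl get nodes -l "topology.ebs.csi.aws.com/zone=$1" -o name
}
```

For the volume above, it would be called as: nodes_in_zone eu-west-1b. To see the zone of every node at once, kubectl get nodes -L topology.kubernetes.io/zone adds the zone as an extra column.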
Depending on how the cluster is configured, this can be handled in different ways. Usually the Cluster Autoscaler or Karpenter will provision new nodes in the appropriate availability zone. If, after some time, they don't, we'll have to check why: the most likely reason is that the node group has already reached its maximum number of nodes.
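As a starting point for that investigation, the sketch below shows two places to look when running the Cluster Autoscaler. Both function names are illustrative. The second one assumes the autoscaler runs in kube-system and writes its status to the default cluster-autoscaler-status ConfigMap; both the name and the namespace can be changed in the autoscaler's configuration, and none of this applies to Karpenter.

```shell
# Hypothetical helper: show the NotTriggerScaleUp events recorded for a Pod.
# Usage: scaleup_events <namespace> <pod-name>
scaleup_events() {
  kubectl get events -n "$1" \
    --field-selector "involvedObject.name=$2,reason=NotTriggerScaleUp"
}

# Hypothetical helper: dump the status ConfigMap maintained by the
# Cluster Autoscaler (assumed default name and namespace).
autoscaler_status() {
  kubectl get configmap cluster-autoscaler-status -n kube-system -o yaml
}
```

For the Pod above, the first one would be called as: scaleup_events website-365 website-365-flask-ampa2-ha-member-1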
Posted on 27/04/2022