3 min read | by Jordi Prats
One of the advantages of using AWS Karpenter is that it makes using spot instances straightforward. But how do we handle termination notices coming from AWS?
AWS Karpenter is not supposed to handle termination notices. If we want to drain the node so that its workloads are gracefully relocated before the instance is terminated, we will have to install the AWS Node Termination Handler.
Suppose we have configured Karpenter to be able to use spot instances by setting the key karpenter.sh/capacity-type as follows:
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: pet2cattle-workers
spec:
  ttlSecondsUntilExpired: 2592000
  ttlSecondsAfterEmpty: 30
  labels:
    nodelabel: example
  requirements:
    - key: "node.kubernetes.io/instance-type"
      operator: In
      values: ["m5a.large", "m5a.xlarge", "m5a.2xlarge"]
    - key: "topology.kubernetes.io/zone"
      operator: In
      values: ["eu-west-1a", "eu-west-1b", "eu-west-1c"]
    - key: "kubernetes.io/arch"
      operator: In
      values: ["arm64", "amd64"]
    - key: "karpenter.sh/capacity-type"
      operator: In
      values: ["spot", "on-demand"]
  provider:
    instanceProfile: 'eks_pet2cattle_worker-instance-profile'
    securityGroupSelector:
      Name: 'eks_pet2cattle-worker'
    tags:
      exampleTag: TagValue
  limits:
    resources:
      cpu: 1000
We can take advantage of the fact that AWS Karpenter, by default, adds the karpenter.sh/capacity-type label to the nodes, specifying whether each one is a spot instance or an on-demand instance:
$ kubectl describe node ip-10-12-16-11.eu-west-1.compute.internal
Name:               ip-10-12-16-11.eu-west-1.compute.internal
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=m5a.xlarge
                    beta.kubernetes.io/os=linux
                    failure-domain.beta.kubernetes.io/region=eu-west-1
                    failure-domain.beta.kubernetes.io/zone=eu-west-1a
                    karpenter.sh/capacity-type=spot
                    karpenter.sh/provisioner-name=workers-nodeprovisioner
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=ip-10-12-16-11.eu-west-1.compute.internal
                    kubernetes.io/os=linux
                    node.kubernetes.io/instance-type=m5a.xlarge
                    topology.ebs.csi.aws.com/zone=eu-west-1a
                    topology.kubernetes.io/region=eu-west-1
                    topology.kubernetes.io/zone=eu-west-1a
                    vpc.amazonaws.com/has-trunk-attached=true
Annotations:        csi.volume.kubernetes.io/nodeid: {"ebs.csi.aws.com":"i-0caac9adebadda005"}
(...)
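Instead of describing nodes one by one, the same label can be used as a selector to list every spot node at once. This is a minimal sketch that assumes kubectl is already configured for the cluster; the SPOT_SELECTOR variable name is just an illustrative choice, and the commands are skipped when no cluster is reachable:

```shell
# Label that Karpenter sets on the nodes it launches (see the output above)
SPOT_SELECTOR="karpenter.sh/capacity-type=spot"

# Only query the API server when a cluster is actually reachable
if kubectl cluster-info >/dev/null 2>&1; then
  # List only the nodes running on spot capacity
  kubectl get nodes -l "$SPOT_SELECTOR"
  # List all nodes, showing the capacity type as an extra column
  kubectl get nodes -L karpenter.sh/capacity-type
fi
```

The second command is handy to check the spot/on-demand mix that Karpenter has provisioned.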
We can use this label to select the nodes where we want to schedule the termination handler DaemonSet. To do so, we can install the termination handler with helm using the following settings:
helm repo add eks https://aws.github.io/eks-charts
helm upgrade --install aws-node-termination-handler --namespace termination-handler \
  --set enableSpotInterruptionDraining=true \
  --set 'nodeSelector.karpenter\.sh/capacity-type=spot' \
  eks/aws-node-termination-handler
If we already have the termination handler installed, we'll have to modify its values.yaml to set the following options:
enableSpotInterruptionDraining: "true"
nodeSelector:
  karpenter.sh/capacity-type: "spot"
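As a sketch of how these options could be applied to an existing release, we can write them to a values file and pass it to helm with -f. The file name nth-values.yaml is an arbitrary choice, and the release name, namespace and chart repository are assumed to be the same ones used in the install command above:

```shell
# Write the two overrides to a local values file (hypothetical file name)
cat > nth-values.yaml <<'EOF'
enableSpotInterruptionDraining: "true"
nodeSelector:
  karpenter.sh/capacity-type: "spot"
EOF

# Apply the values and check the result only when helm and a cluster are available
if command -v helm >/dev/null 2>&1 && kubectl cluster-info >/dev/null 2>&1; then
  helm upgrade --install aws-node-termination-handler eks/aws-node-termination-handler \
    --namespace termination-handler \
    -f nth-values.yaml

  # The NODE column should only show nodes labeled as spot capacity
  kubectl get pods --namespace termination-handler -o wide
fi
```

Using a values file avoids having to escape the dot in the karpenter.sh/capacity-type key, which is needed when the same key is passed on the command line with --set.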
With both Karpenter and the termination handler in place, we are covering the spot instance lifecycle: once we receive the notification from AWS that the instance is going to be terminated, the node is cordoned and drained so that its Pods can, as gracefully as possible, be rescheduled on another node (or on a new one).
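To get a feeling for what happens to the node when the notice arrives, this is roughly the manual equivalent using kubectl. It is only an illustration, not the termination handler's actual implementation: the drain_node function name is hypothetical, and the --delete-emptydir-data flag assumes a reasonably recent kubectl (1.20 or later):

```shell
# Manual equivalent of a graceful node drain; takes the node name as its only argument
drain_node() {
  node="$1"
  # Mark the node as unschedulable so no new Pods land on it
  kubectl cordon "$node"
  # Evict the running Pods; DaemonSet Pods are skipped and emptyDir data is discarded
  kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data
}
```

For example, drain_node ip-10-12-16-11.eu-west-1.compute.internal would evict the Pods from the spot node shown earlier, letting the scheduler place them elsewhere.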
Posted on 21/01/2022