Using AWS Karpenter with spot instances

3 min read | by Jordi Prats

One of the advantages of using AWS Karpenter is that it makes using spot instances straightforward. But how do we handle termination notices coming from AWS?

AWS Karpenter is not supposed to handle termination notices. If we want to drain the node to gracefully relocate its workloads before the instance is terminated, we will have to install the AWS Node Termination Handler.

Suppose we have configured Karpenter to be able to use spot instances by setting the key karpenter.sh/capacity-type as follows:

apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: pet2cattle-workers
spec:
  ttlSecondsUntilExpired: 2592000
  ttlSecondsAfterEmpty: 30
  labels:
    nodelabel: example
  requirements:
    - key: "node.kubernetes.io/instance-type"
      operator: In
      values: ["m5a.large", "m5a.xlarge", "m5a.2xlarge"]
    - key: "topology.kubernetes.io/zone"
      operator: In
      values: ["eu-west-1a", "eu-west-1b", "eu-west-1c"]
    - key: "kubernetes.io/arch"
      operator: In
      values: ["arm64", "amd64"]
    - key: "karpenter.sh/capacity-type"
      operator: In
      values: ["spot", "on-demand"]
  provider:
    instanceProfile: 'eks_pet2cattle_worker-instance-profile'
    securityGroupSelector:
      Name: 'eks_pet2cattle-worker'
    tags:
      exampleTag: TagValue
  limits:
    resources:
      cpu: 1000

We can take advantage of the fact that AWS Karpenter, by default, adds the karpenter.sh/capacity-type label to the nodes, specifying whether each one is a spot instance or an on-demand instance:

$ kubectl describe node ip-10-12-16-11.eu-west-1.compute.internal
Name:               ip-10-12-16-11.eu-west-1.compute.internal
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=m5a.xlarge
                    beta.kubernetes.io/os=linux
                    failure-domain.beta.kubernetes.io/region=eu-west-1
                    failure-domain.beta.kubernetes.io/zone=eu-west-1a
                    karpenter.sh/capacity-type=spot
                    karpenter.sh/provisioner-name=workers-nodeprovisioner
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=ip-10-12-16-11.eu-west-1.compute.internal
                    kubernetes.io/os=linux
                    node.kubernetes.io/instance-type=m5a.xlarge
                    topology.ebs.csi.aws.com/zone=eu-west-1a
                    topology.kubernetes.io/region=eu-west-1
                    topology.kubernetes.io/zone=eu-west-1a
                    vpc.amazonaws.com/has-trunk-attached=true
Annotations:        csi.volume.kubernetes.io/nodeid: {"ebs.csi.aws.com":"i-0caac9adebadda005"}
(...)
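Describing nodes one by one gets tedious, so the same label can also be used as a selector or shown as an extra column. This is just a minimal sketch, assuming kubectl is already pointing at the cluster; the function names are made up for illustration, only the kubectl flags are real:

```shell
# Hypothetical helper names; -l (label selector) and -L (label column)
# are standard kubectl flags.

# List only the nodes that Karpenter labelled as spot capacity.
list_spot_nodes() {
  kubectl get nodes -l karpenter.sh/capacity-type=spot
}

# List every node, adding its capacity type as an extra column.
list_nodes_with_capacity_type() {
  kubectl get nodes -L karpenter.sh/capacity-type
}
```

Calling list_nodes_with_capacity_type is a quick way to check the spot/on-demand mix that Karpenter has provisioned so far.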

We can use this label to select the nodes where we want to schedule the termination handler DaemonSet. To do so, we can install the termination handler with helm using the following settings:

helm repo add eks https://aws.github.io/eks-charts
# The dot in the label key is escaped so that Helm does not treat it as a nesting separator
helm upgrade --install aws-node-termination-handler --namespace termination-handler \
  --set enableSpotInterruptionDraining=true \
  --set nodeSelector."karpenter\.sh/capacity-type"=spot \
  eks/aws-node-termination-handler

If we already have the termination handler installed, we'll have to modify its values.yaml to set the following options:

enableSpotInterruptionDraining: "true"
nodeSelector:
  karpenter.sh/capacity-type: "spot"
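Once the file is updated, the change still has to be rolled out and checked. The following is only a sketch under a few assumptions: the release name and namespace are the ones used in the install command, the edited settings live in a local file called values.yaml, and the function names are hypothetical:

```shell
# Assumed: release "aws-node-termination-handler" in namespace
# "termination-handler", settings stored in ./values.yaml.
apply_nth_values() {
  helm upgrade aws-node-termination-handler eks/aws-node-termination-handler \
    --namespace termination-handler \
    -f values.yaml
}

# Show the DaemonSet and the node each of its pods landed on.
check_nth_pods() {
  kubectl get daemonset --namespace termination-handler
  kubectl get pods --namespace termination-handler -o wide
}
```

If the nodeSelector is applied as intended, check_nth_pods should only show pods on nodes labelled karpenter.sh/capacity-type=spot.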

With both Karpenter and the termination handler in place, the whole spot instance lifecycle is covered: once AWS notifies us that the instance is going to be terminated, the node is drained so that its Pods can be rescheduled, as gracefully as possible, on another node (or on a new one).
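To get an idea of where that notification comes from, we can query the EC2 instance metadata service ourselves from a spot node. This is a rough sketch of the kind of lookup involved, not the termination handler's actual code; it assumes curl is available, that it runs on the instance itself, and the function name is hypothetical:

```shell
# Only meaningful from inside an EC2 instance (hypothetical helper name).
spot_interruption_notice() {
  imds="http://169.254.169.254/latest"
  # IMDSv2: request a short-lived session token first.
  token=$(curl -s -X PUT "$imds/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 60")
  # -f makes curl fail on the 404 returned while no interruption is scheduled.
  curl -sf -H "X-aws-ec2-metadata-token: $token" \
    "$imds/meta-data/spot/instance-action"
}
```

While nothing is scheduled the function exits with a non-zero status; once AWS decides to reclaim the instance it prints a small JSON document with the action and the time it will happen.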


Posted on 21/01/2022