How risky is it really to run a Pod with privileged: true?

3 min read | by Jordi Prats

When running containers, by default there is isolation between the host and the running container: the container cannot access the host’s resources. But when you run a Pod with the privileged flag, you are effectively disabling this isolation, making it equivalent to running that process as root on the host server.

Running a privileged Pod means that the Pod can access the host’s resources and kernel capabilities. This might be needed in some scenarios, such as running GPU-enabled containers in a Kubernetes cluster, so that the GPU can be accessed directly from the container.

We can test it out using the following Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-privileged
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
        securityContext:
          privileged: true

Now if we list the files under /dev, we can see all the devices that the host can see:

$ kubectl --context arm64 exec -it test-privileged-8468dbb5c7-6m7mk -- ls /dev
autofs           loop6           rpivid-hevcmem   tty20  tty49      vcs3
bsg              loop7           rpivid-intcmem   tty21  tty5       vcs4
btrfs-control    mapper          rpivid-vp9mem    tty22  tty50      vcs5
bus              mem             sda              tty23  tty51      vcs6
cachefiles       mqueue          sda1             tty24  tty52      vcs7
cec0             net             sda2             tty25  tty53      vcsa
cec1             null            sda3             tty26  tty54      vcsa1
cpu_dma_latency  port            sdb              tty27  tty55      vcsa2
cuse             ppp             sg0              tty28  tty56      vcsa3
dma_heap         ptmx            sg1              tty29  tty57      vcsa4
dri              pts             sg2              tty3   tty58      vcsa5
fd               ram0            shm              tty30  tty59      vcsa6
full             ram1            snd              tty31  tty6       vcsa7
fuse             ram10           stderr           tty32  tty60      vcsu
gpiochip0        ram11           stdin            tty33  tty61      vcsu1
gpiochip1        ram12           stdout           tty34  tty62      vcsu2
gpiomem          ram13           termination-log  tty35  tty63      vcsu3
hwrng            ram14           tty              tty36  tty7       vcsu4
i2c-11           ram15           tty0             tty37  tty8       vcsu5
i2c-12           ram2            tty1             tty38  tty9       vcsu6
input            ram3            tty10            tty39  ttyAMA0    vcsu7
kmsg             ram4            tty11            tty4   ttyprintk  vga_arbiter
kvm              ram5            tty12            tty40  uhid       vhci
longhorn         ram6            tty13            tty41  uinput     vhost-net
loop-control     ram7            tty14            tty42  urandom    watchdog
loop0            ram8            tty15            tty43  vc-mem     watchdog0
loop1            ram9            tty16            tty44  vchiq      zero
loop2            random          tty17            tty45  vcio
loop3            raw             tty18            tty46  vcs
loop4            rfkill          tty19            tty47  vcs1
loop5            rpivid-h264mem  tty2             tty48  vcs2

On a non-privileged Pod we wouldn't be able to see all the devices:

$ kubectl exec -it pet2cattle-7f9775bbd8-klkfd -c pet2cattle -- ls /dev
fd      ptmx    stderr           tty
full    pts     stdin            urandom
mqueue  random  stdout           zero
null    shm     termination-log

So, as soon as we have access to all the devices (including the disks), we can do whatever we want to the Kubernetes node by mounting the relevant filesystems and writing any change we want.

We must bear in mind that privileged: true is shorthand for ALL PRIVILEGES. If we only need some extra capabilities, we can configure permissions in a more granular way. You can check the SecurityContext reference for a comprehensive list.
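As a rough sketch of what the more granular approach could look like, the Deployment below reuses the earlier manifest but replaces the privileged flag with an explicit list of added capabilities. The name test-capabilities and the choice of NET_ADMIN and SYS_TIME are assumptions made only for illustration; the capabilities you actually need depend on your workload.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-capabilities   # hypothetical name, not from the original example
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
        securityContext:
          # No privileged flag: the container keeps the runtime's default
          # capability set plus only the ones listed under "add".
          privileged: false
          capabilities:
            add:
            - NET_ADMIN   # illustrative choice: network configuration
            - SYS_TIME    # illustrative choice: setting the system clock
```

With a manifest like this, listing /dev inside the container should still show only the reduced set of devices seen on a non-privileged Pod, since adding capabilities does not expose the host's devices.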

Posted on 22/12/2021
