Here is what I learned while debugging a CrashLoopBackOff in Kubernetes when a pod wasn't starting.

I wanted to deploy the Jenkins Docker image in the cluster. As mentioned in the Jenkins Docker repo, I wanted to mount an external drive, which in my case is an AWS EBS volume. Here is my deployment yaml.

[gist id="98aa6da98ebb9b7e3d7f996c8ef2cb38"]
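In case the gist doesn't render, here is a minimal sketch of the kind of deployment I mean. Treat it as an approximation: the image tag and the EBS volume ID are placeholders, not my exact values.

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: jenkins
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: jenkins
    spec:
      containers:
      - name: naveen            # my container name, grepped for later
        image: jenkins:latest   # placeholder tag
        ports:
        - containerPort: 8080
        volumeMounts:
        # Jenkins keeps all of its state under /var/jenkins_home
        - name: jenkins-home
          mountPath: /var/jenkins_home
      volumes:
      # the EBS volume must already exist in the same AZ as the node
      - name: jenkins-home
        awsElasticBlockStore:
          volumeID: vol-0123456789abcdef0   # placeholder volume ID
          fsType: ext4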

After starting the deployment the pod never came up, and this is the output of kubectl get pod:

NAME                      READY     STATUS             RESTARTS   AGE
jenkins-3317895845-x84u3  0/1       CrashLoopBackOff   10         27m

The next step was to issue kubectl describe pod jenkins-3317895845-x84u3:

[gist id="7a79af84b53fc923aa609997fdb4fcfd"]

From the describe output I could make out that the image was being pulled correctly but the container was failing on StartContainer. Based on this information, the next step was to get the Docker logs. But the Docker logs aren't accessible from my box, because the containers run on a node managed by Kubernetes; the only way I knew to get at them was to actually ssh into that node.

But which box should I ssh into? The describe output has the node information: Node: ip-172-20-0-29.us-west-2.compute.internal/172.20.0.29. That is the private IP of the box in AWS, but with it you can figure out the public IP to ssh to.
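If you have the AWS CLI configured, one way to do that lookup (the EC2 console works just as well) is to filter instances by the private IP:

aws ec2 describe-instances \
  --filters "Name=private-ip-address,Values=172.20.0.29" \
  --query "Reservations[].Instances[].PublicIpAddress" \
  --output text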

Now I knew I had to ssh, but where is the key for this server? If you brought the cluster up with kube-up.sh, the key is stored at ~/.ssh/kube_aws_rsa:

ssh -i ~/.ssh/kube_aws_rsa admin@public-ip-of-the-above-node

After sshing into the box I issued sudo docker ps -a | grep naveen (the -a flag because the container could already have been stopped, and naveen because that was my container name). This gave me the id of a container that had exited with status 1.

And this was the output of the docker logs command:

admin@ip-172-20-0-29:~$ sudo docker logs c98a338268a1
touch: cannot touch '/var/jenkins_home/copy_reference_file.log': Permission denied
Can not write to /var/jenkins_home/copy_reference_file.log. Wrong volume permissions?

That identified the problem: /var/jenkins_home, which was mounted on the AWS EBS volume, was not writable by the jenkins user (see https://github.com/kubernetes/kubernetes/issues/2630).
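One common fix, sketched below as a fragment to merge into the deployment's pod template, is to set an fsGroup so Kubernetes makes the mounted volume group-writable by gid 1000, which the official Jenkins image uses for the jenkins user. This assumes your cluster version honors fsGroup for EBS volumes; alternatively, you can ssh in once and chown the mount to uid 1000.

spec:
  template:
    spec:
      securityContext:
        # Kubernetes applies gid 1000 (the jenkins group in the official
        # image) to the mounted volume, so the jenkins user can write to it
        fsGroup: 1000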

And after doing all of this I realized I could simply have run kubectl logs jenkins-3317895845-x84u3, which would have given the same output without having to ssh into the box. But knowing the long way is handy, because when things go really wrong we need to be able to dig down to the root cause.
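One last tip: in a crash loop the container may already have been restarted by the time you look, and plain kubectl logs shows the current instance. The standard --previous flag fetches the logs of the crashed one instead:

kubectl logs jenkins-3317895845-x84u3 --previous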