Use context: kubectl config use-context k8s-c3-CCC

There seems to be an issue with the kubelet not running on cluster3-node1. Fix it and confirm that the cluster has node cluster3-node1 available in the Ready state afterwards. You should be able to schedule a Pod on cluster3-node1 afterwards.

Write the reason for the issue into /opt/course/18/reason.txt.

Troubleshooting a Non-Responsive Kubernetes Node: A Step-by-Step Guide

In a Kubernetes cluster, ensuring that all nodes are operational is crucial for the stability and performance of your applications. When a node becomes unresponsive or enters a “NotReady” state, it can cause disruptions. In this guide, we’ll walk through the process of troubleshooting a non-responsive node, identifying issues with the kubelet service, and resolving them.

Step 1: Checking Node Status

The first step in troubleshooting a non-responsive node is to check its status using kubectl get node:


kubectl get node


Example output:



NAME                     STATUS     ROLES           AGE   VERSION
cluster3-controlplane1   Ready      control-plane   14d   v1.30.1
cluster3-node1           NotReady   <none>          14d   v1.30.1


Here, cluster3-node1 is in the NotReady state, which means the control plane is not receiving a healthy status from that node's kubelet.
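
On a larger cluster it can help to filter the listing down to the problem nodes. The following is a minimal sketch rather than part of the original task: not_ready_nodes is a hypothetical helper name, and it assumes the default table layout of kubectl get node, where STATUS is the second column.

```shell
# Hypothetical helper: reads `kubectl get node` table output on stdin and
# prints the name of every node whose STATUS column is not exactly "Ready".
# Assumes the default column order (NAME first, STATUS second).
not_ready_nodes() {
  # NR > 1 skips the header row.
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Usage against a live cluster:
#   kubectl get node | not_ready_nodes
```

Because the comparison is exact, a cordoned node reported as Ready,SchedulingDisabled would also be listed, which may or may not be what you want.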

Step 2: Checking the Kubelet Service

The kubelet is the primary “node agent” that runs on each node in the cluster. If the kubelet service is not running, the node will not be able to communicate with the control plane. First, SSH into the problematic node and check if the kubelet is running:


ssh cluster3-node1

ps aux | grep kubelet
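
One caveat with the pipeline above is that the grep command can show up in its own results, which makes a stopped kubelet look like a match. A small sketch of a common workaround follows; kubelet_procs is a hypothetical name used only for illustration.

```shell
# Hypothetical helper: filters a process listing on stdin for kubelet.
# The pattern [k]ubelet matches the text "kubelet", but the grep command
# line itself contains "[k]ubelet", which the pattern does not match.
kubelet_procs() {
  grep '[k]ubelet'
}

# Usage on the node:
#   ps aux | kubelet_procs
```

An empty result then really does mean that no kubelet process is running.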


If the kubelet is not running, check its status using systemd:


service kubelet status


Example output:



● kubelet.service - kubelet: The Kubernetes Node Agent
     Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
    Drop-In: /usr/lib/systemd/system/kubelet.service.d
             └─10-kubeadm.conf
     Active: inactive (dead) (Result: exit-code) since Thu 2024-01-04 13:12:54 UTC; 1h 23min ago
       Docs: https://kubernetes.io/docs/


In this case, the kubelet service is inactive, so we try to start it:


service kubelet start


Step 3: Analyzing Kubelet Startup Issues

If the kubelet fails to start, check the output of the service status command for errors. One common issue is a misconfigured path to the kubelet binary. You can manually attempt to run the kubelet binary to verify its location:


/usr/local/bin/kubelet
-bash: /usr/local/bin/kubelet: No such file or directory

whereis kubelet
kubelet: /usr/bin/kubelet


In this case, the service configuration points to /usr/local/bin/kubelet, which does not exist. The binary is actually installed at /usr/bin/kubelet.
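
The same check can be made directly against the service configuration instead of typing the path by hand. This is a sketch under assumptions: execstart_binary and check_unit_binary are hypothetical helper names, and the code assumes the binary path is the first word on a non-empty ExecStart= line, as is usual for systemd unit and drop-in files.

```shell
# Hypothetical helper: prints the binary path from each non-empty
# ExecStart= line in the given unit or drop-in file. Leading systemd
# prefix characters (-, @, +, !, :) are skipped, and a bare "ExecStart="
# reset line produces no output.
execstart_binary() {
  sed -n 's/^ExecStart=[-@+!:]*\([^[:space:]]\{1,\}\).*/\1/p' "$1"
}

# Hypothetical helper: reports whether each referenced binary exists
# and is executable.
check_unit_binary() {
  execstart_binary "$1" | while read -r bin; do
    if [ -x "$bin" ]; then
      echo "ok: $bin"
    else
      echo "missing: $bin"
    fi
  done
}

# Usage on the node:
#   check_unit_binary /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
```

A line starting with missing: then names the path that needs to be corrected.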

Step 4: Correcting the Kubelet Service Configuration

To fix the issue, edit the kubelet service configuration file and correct the path to the kubelet binary, which systemd reads from the ExecStart= line:


vim /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
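
If you prefer a non-interactive edit, the path can also be replaced with sed. This is only a sketch: fix_kubelet_path is a hypothetical helper name, and it assumes the wrong path appears literally as /usr/local/bin/kubelet in the file, as found in Step 3.

```shell
# Hypothetical helper: rewrites the wrong kubelet path in the given file.
# -i.bak edits in place and keeps the original as <file>.bak.
fix_kubelet_path() {
  sed -i.bak 's|/usr/local/bin/kubelet|/usr/bin/kubelet|' "$1"
}

# Usage on the node (as root):
#   fix_kubelet_path /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
```

systemd only reads drop-in files whose names end in .conf, so the .bak copy left next to the file should not be picked up. Remove it once the node is healthy again.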


After updating the path, reload the systemd daemon and restart the kubelet service:


systemctl daemon-reload
service kubelet restart

service kubelet status  # Check if it's now running


The kubelet should now be running correctly, and the node should return to a Ready state after a few moments.

Step 5: Verifying Node Status

After fixing the kubelet service, check the status of the node again to ensure it’s back to normal:


kubectl get node


Expected output:



NAME                     STATUS   ROLES           AGE   VERSION
cluster3-controlplane1   Ready    control-plane   14d   v1.30.1
cluster3-node1           Ready    <none>          14d   v1.30.1


The node cluster3-node1 should now be in the Ready state, indicating that it has successfully rejoined the cluster.
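
Since the node can take a short while to report Ready after the kubelet starts, you may want to poll instead of re-running the command by hand. The sketch below uses two hypothetical helpers, wait_until and node_ready. It assumes STATUS is the second column of kubectl get node output. kubectl wait --for=condition=Ready node/cluster3-node1 is a built-in alternative.

```shell
# Hypothetical helper: runs a command up to <tries> times, sleeping <delay>
# seconds after each failure. Returns 0 as soon as the command succeeds,
# or 1 if every attempt fails.
wait_until() {
  tries=$1
  delay=$2
  shift 2
  while [ "$tries" -gt 0 ]; do
    "$@" && return 0
    tries=$((tries - 1))
    sleep "$delay"
  done
  return 1
}

# Hypothetical helper: succeeds only if the named node reports exactly "Ready".
node_ready() {
  [ "$(kubectl get node "$1" --no-headers 2>/dev/null | awk '{ print $2 }')" = "Ready" ]
}

# Usage: check every 5 seconds, for up to 2 minutes.
#   wait_until 24 5 node_ready cluster3-node1
```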

Step 6: Documenting the Issue

Finally, it’s important to document the cause of the issue and the steps taken to resolve it. This information can be valuable for future reference or for other team members:


# /opt/course/18/reason.txt
wrong path to kubelet binary specified in service config
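
The listing above shows the file's contents, not a command. One way to create the file is sketched below. write_reason is a hypothetical helper name, and the sketch assumes you run it on the machine where /opt/course/18 is expected, with permission to write there.

```shell
# Hypothetical helper: writes a one-line reason to the given file,
# creating the parent directory first if it does not exist.
write_reason() {
  mkdir -p "$(dirname "$1")" && printf '%s\n' "$2" > "$1"
}

# Usage:
#   write_reason /opt/course/18/reason.txt "wrong path to kubelet binary specified in service config"
```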


Conclusion

In this guide, we’ve walked through troubleshooting a non-responsive Kubernetes node by checking the kubelet service, identifying a misconfiguration, and resolving the issue. Ensuring that all nodes are in a Ready state is essential for maintaining the health and performance of your Kubernetes cluster.
