Repairing pod communication problems with the join network
While converting a production region from its original Kubespray installation to Kube-OVN managed by Helm chart, as detailed in the Genestack doc
docs/k8s-cni-kube-ovn-helm-conversion.md,
we encountered an issue where some kube-ovn-cni pods restarted unable to communicate with the join network and ended up crash
looping or stuck in a pending state. We found that rebooting the pod's node generally solved this problem, but we also discovered a method that could repair
connectivity with the join network without rebooting the node.
We saw errors like the following in the output of kubectl logs for the pod:
cni-server W0610 19:39:31.974287 476942 ovs.go:35 100.64.0.118 network not ready after 195 ping to gateway 100.64.0.1
cni-server W0610 19:39:34.973713 476942 ovs.go:35 100.64.0.118 network not ready after 198 ping to gateway 100.64.0.1
cni-server W0610 19:39:36.991756 476942 ovs.go:45 100.64.0.118 network not ready after 200 ping, gw 100.64.0.1
cni-server E0610 19:39:36.991800 476942 ovs_linux.go:488 failed to init ovn0 check: no packets received from gateway 100.64.0.1
cni-server E0610 19:39:36.991841 476942 klog.go:10 "failed to initialize node gateway" err="no packets received from gateway 100.64.0.1"
In this case, IP 100.64.0.1 belonged to our Kube-OVN join network.
The fix described below likely serves as a general method of resolving pod communication problems with the join network without rebooting the affected node.
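On a cluster with this problem, a quick way to spot these messages looks roughly like the following; the pod name here is only a placeholder for whichever kube-ovn-cni pod is misbehaving:

# Look for the "network not ready" messages in a suspect kube-ovn-cni pod
kubectl -n kube-system logs kube-ovn-cni-xxxxx | grep -i 'network not ready'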
The Kube-OVN documentation says the following of the join network:
The join network
Note
In the Kubernetes network specification, it is required that Nodes can communicate directly with all Pods. To achieve this in Overlay network mode, Kube-OVN creates a join Subnet and creates a virtual NIC ovn0 at each node that connect to the join subnet, through which the nodes and Pods can communicate with each other.
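To make this concrete, you can look at the ovn0 NIC described above directly on a node, and at the join subnet object in the cluster. This is only an illustration; the interface name comes from the quoted description and the subnet name from the kubectl get subnet output shown later in this document:

# On the node itself: ovn0 carries the node's address on the join subnet
ip addr show ovn0

# From the cluster: the join subnet and its gateway
kubectl get subnet join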
kubectl ko plugin for OVS commands
The example below uses the ko plugin for kubectl, which allows issuing OVS commands without explicitly
using kubectl exec to run commands in the ovs-ovn pod, amongst other things. The Genestack
documentation shows installation of this tool in
docs/k8s-tools.md under the heading "Install the ko plugin".
As an alternative to using kubectl ko, you can run something like
kubectl -n kube-system exec -it <ovs-ovn pod for the node> -- bash
and then run, for example, ovs-vsctl directly instead of kubectl ko vsctl <node>, and so on.
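For example, the following two invocations should be roughly equivalent; the ovs-ovn pod name is a placeholder for the pod running on the node you care about:

# Using the ko plugin: run ovs-vsctl show against a specific node
kubectl ko vsctl hyperconverged-1.cluster.local show

# Roughly equivalent: exec into that node's ovs-ovn pod and run ovs-vsctl there
kubectl -n kube-system exec -it ovs-ovn-xxxxx -- ovs-vsctl show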
Resolving the issue
We rebooted one node to resolve this issue, but we had the problem on 5 or 6 nodes, so we kept trying to resolve it without rebooting. We managed to do so as follows.
As an overview, this simply involved deleting the node's ovn0 port and
restarting the node's kube-ovn-cni pod, but a more detailed example
follows.
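As a condensed sketch of those two actions, the commands look roughly like this. This assumes ovn0 hangs off br-int, as the port listing later in this document suggests, and uses the node and pod names from the lab example below:

# Delete the node's ovn0 port from br-int (kube-ovn-cni usually recreates it)
kubectl ko vsctl hyperconverged-1.cluster.local del-port br-int ovn0

# Restart the node's kube-ovn-cni pod by deleting it, if it did not already restart itself
kubectl -n kube-system delete pod kube-ovn-cni-mj2hx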
- Get the node name for the pod with the problem

  Running kubectl get pod -o wide in the kube-system namespace will clearly show the node name for each kube-ovn-cni pod (an example of this command with grep filters appears in a later step).

  I will use a hyperconverged lab as created by Genestack's scripts/hyperconverged-lab.sh for this example, with node hyperconverged-1.cluster.local and pod kube-ovn-cni-mj2hx.
- Check the logs

  For a kube-ovn-cni pod with this problem, you see log messages like those shown above:

  W0610 21:36:34.676655 514300 ovs.go:35] 100.64.0.12 network not ready after 33 ping to gateway 100.64.0.1
  W0610 21:36:37.677050 514300 ovs.go:35] 100.64.0.12 network not ready after 36 ping to gateway 100.64.0.1
  W0610 21:36:40.676623 514300 ovs.go:35] 100.64.0.12 network not ready after 39 ping to gateway 100.64.0.1
- Confirm the affected network as the join network

  You can run kubectl get subnet, confirm that the problem is with the join network or subnet, and see the gateway IP:

  root@hyperconverged-0:~# kubectl get subnet
  NAME          PROVIDER   VPC           PROTOCOL   CIDR            PRIVATE   NAT     DEFAULT   GATEWAYTYPE   V4USED   V4AVAILABLE   V6USED   V6AVAILABLE   EXCLUDEIPS       U2OINTERCONNECTIONIP
  join          ovn        ovn-cluster   IPv4       100.64.0.0/16   false     false   false     distributed   3        65530         0        0             ["100.64.0.1"]
  ovn-default   ovn        ovn-cluster   IPv4       10.236.0.0/14   false     true    true      distributed   112      262029        0        0             ["10.236.0.1"]

  The example output shows the join network with the gateway 100.64.0.1.
- Try the ping

  Generally, the Kube-OVN image has the ping command. This should also work from any pod with the ping command and serves as a way to confirm the problem (the example after this list shows one way to run such a ping from a pod). However, some pods could restrict this with k8s security contexts and so forth, but most pods don't run in a way that prevents ping.

  PING 100.64.0.1 (100.64.0.1): 56 data bytes
  64 bytes from 100.64.0.1: icmp_seq=0 ttl=254 time=0.272 ms
  64 bytes from 100.64.0.1: icmp_seq=1 ttl=254 time=0.496 ms
  64 bytes from 100.64.0.1: icmp_seq=2 ttl=254 time=0.278 ms
  64 bytes from 100.64.0.1: icmp_seq=3 ttl=254 time=0.304 ms
  64 bytes from 100.64.0.1: icmp_seq=4 ttl=254 time=0.317 ms
  --- 100.64.0.1 ping statistics ---
  5 packets transmitted, 5 packets received, 0% packet loss
  round-trip min/avg/max/stddev = 0.272/0.333/0.496/0.083 ms
  root@hyperconverged-0:~#

  In this example, the ping works and gives output as shown, but for the example, I will go ahead with deleting the ovn0 port in later steps as if it had not worked and I wanted to apply this fix.
- Confirm the port exists

  root@hyperconverged-0:~# kubectl ko vsctl hyperconverged-1.cluster.local show | grep ovn0
          Port ovn0
              Interface ovn0

  Obviously, you could use many more commands to check this, such as kubectl ko vsctl hyperconverged-1.cluster.local list-ports br-int | grep ovn0, but that doesn't matter for this example.
- Delete the port

  Delete the node's ovn0 port from br-int; the sketch in the overview above shows one way to do this. This does have a tendency to cause the kube-ovn-cni pod to restart by itself, and the ovn0 port to get recreated automatically. If you delay in confirming that the port got deleted, as in the previous step, you may find that the port already exists again and that the kube-ovn-cni container has already restarted.
- Restart the kube-ovn-cni pod for the affected node

  kubectl -n kube-system get pod -o wide | grep kube-ovn-cni | grep hyperconverged-1.cluster.local
  kube-ovn-cni-mj2hx   1/1   Running   4 (2m50s ago)   17h   192.168.100.231   hyperconverged-1.cluster.local   <none>   <none>

  As mentioned above, you may find this pod automatically got restarted. If so, you can check the logs and possibly go ahead with the delete. If not, you can simply proceed to delete the pod, as shown in the example below.
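For the delete and a quick check afterwards, something like the following should work. The pod name matches the lab example above, the replacement pod name is a placeholder, and pinging from inside the pod assumes the image's ping command is available as noted earlier:

# Delete the pod so it gets recreated
kubectl -n kube-system delete pod kube-ovn-cni-mj2hx

# Once the replacement pod shows Running, confirm it can reach the join gateway
kubectl -n kube-system get pod -o wide | grep kube-ovn-cni | grep hyperconverged-1.cluster.local
kubectl -n kube-system exec -it <new kube-ovn-cni pod> -- ping -c 5 100.64.0.1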
The above steps generally resolved this problem. You can follow up by reading the logs of the kube-ovn-cni container
and running the ping again to see whether the pod can now actually ping the gateway for the join network.
In a few cases, we had problems deleting the kube-ovn-cni pod, so a section on that follows.
Problems deleting the kube-ovn-cni pod
As mentioned, in some cases we had problems deleting the kube-ovn-cni pod. Generally, when this happened, we could proceed by running
the kubectl delete pod with the --force switch. Then, in some cases after using --force, kubectl logs would show errors
referring to sandbox errors, and/or an error binding to a port already in use when k8s tried to recreate the pod. When that happened,
we could generally resolve it by SSHing to the affected node and killing the kube-ovn-daemon process, as shown in the example below:
ps fauwxx | grep -i '[k]ube-ovn-daemon'
root 676519 0.5 0.4 1307276 79068 ? Sl 16:35 0:43 \_ ./kube-ovn-daemon --ovs-socket=/run/openvswitch/db.sock --bind-socket=/run/openvswitch/kube-ovn-daemon.sock --enable-mirror=false --mirror-iface=mirror0 --node-switch=join --encap-checksum=true --service-cluster-ip-range=10.233.0.0/18 --iface=enp3s0 --dpdk-tunnel-iface=br-phy --network-type=geneve --default-interface-name=enp3s0 --cni-conf-dir=/etc/cni/net.d --cni-conf-file=/kube-ovn/01-kube-ovn.conflist --cni-conf-name=01-kube-ovn.conflist --logtostderr=false --alsologtostderr=true --log_file=/var/log/kube-ovn/kube-ovn-cni.log --log_file_max_size=200 --enable-metrics=true --kubelet-dir=/var/lib/kubelet --enable-tproxy=false --ovs-vsctl-concurrency=100 --secure-serving=false
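The kill itself can then be done by PID from that ps output; this is a hedged example using the PID shown above:

# Kill the kube-ovn-daemon process found above (PID taken from the ps output)
kill 676519

# If the process lingers, a stronger signal can be used
kill -9 676519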
Then you can delete the kube-ovn-cni pod (if it doesn't simply restart by
itself from killing this process, which it tends to do, as also tends to happen
when deleting the ovn0 port), and the pod should generally reach a
normal Running state.
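As a final check, a hedged example of confirming the pod state and reviewing its logs for the node; the pod placeholder follows the convention used earlier in this document:

# The pod should show 1/1 Running, as in the earlier example output
kubectl -n kube-system get pod -o wide | grep kube-ovn-cni | grep hyperconverged-1.cluster.local

# The recent logs should no longer show "network not ready" messages
kubectl -n kube-system logs <kube-ovn-cni pod for the node> | tail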