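The deeper I get into Kubernetes, the more problems I run into. When I hit one, I search Baidu and Google. At this stage I am mostly relaying what other articles say; the priority is to get things working first.
This post covers the following problems:
Pods cannot reach a ClusterIP
DNS lookups from busybox
Pod-to-pod access and access to external networks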
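Problem 1: pods cannot reach a ClusterIP
This one bothered me for several days. In the end, switching kube-proxy to IPVS fixed it.
step0 - kube-proxy logs
The kube-proxy log contains the line Unknown proxy mode "", assuming iptables proxy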
    $ kubectl logs -n kube-system kube-proxy-5n29r | more
    W0720 03:22:47.942827 1 server_others.go:559] Unknown proxy mode "", assuming iptables proxy
    I0720 03:22:48.245820 1 node.go:136] Successfully retrieved node IP: 10.160.18.183
    I0720 03:22:48.245876 1 server_others.go:186] Using iptables Proxier.
    I0720 03:22:48.246253 1 server.go:583] Version: v1.18.3
    I0720 03:22:48.395170 1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_max' to 131072
    I0720 03:22:48.395210 1 conntrack.go:52] Setting nf_conntrack_max to 131072
    I0720 03:22:48.395578 1 conntrack.go:83] Setting conntrack hashsize to 32768
    I0720 03:22:48.414004 1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_established' to 86400
    I0720 03:22:48.414067 1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_close_wait' to 3600
    I0720 03:22:48.533638 1 config.go:315] Starting service config controller
    I0720 03:22:48.533673 1 shared_informer.go:223] Waiting for caches to sync for service config
    I0720 03:22:48.533997 1 config.go:133] Starting endpoints config controller
    I0720 03:22:48.534016 1 shared_informer.go:223] Waiting for caches to sync for endpoints config
step1 - install the required packages
    $ apt install ipset ipvsadm
step2 - load the kernel modules
    $ modprobe -- ip_vs
    $ modprobe -- ip_vs_rr
    $ modprobe -- ip_vs_wrr
    $ modprobe -- ip_vs_sh
    $ modprobe -- nf_conntrack

    $ lsmod | grep -e ipvs -e nf_conntrack
    nf_conntrack_netlink 45056 0
    nfnetlink 16384 3 nf_conntrack_netlink,ip_set
    nf_conntrack 139264 5 xt_conntrack,nf_nat,nf_conntrack_netlink,xt_MASQUERADE,ip_vs
    nf_defrag_ipv6 24576 2 nf_conntrack,ip_vs
    nf_defrag_ipv4 16384 1 nf_conntrack
    libcrc32c 16384 3 nf_conntrack,nf_nat,ip_vs
Note: if the IPVS modules are not loaded by default, you need a script that loads them again after every reboot.
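One way to handle the reboot case, assuming a systemd-based distribution (an assumption about the host, not something stated above), is to list the modules in a file under /etc/modules-load.d/, which systemd-modules-load reads at boot. The helper name write_ipvs_modules_conf and the file name ipvs.conf are made up for this sketch.

```shell
# Hypothetical helper: write the module list from step2 into a
# modules-load.d style file, one module name per line.
# Usage (as root): write_ipvs_modules_conf            # writes /etc/modules-load.d/ipvs.conf
#                  write_ipvs_modules_conf /some/dir   # writes /some/dir/ipvs.conf
write_ipvs_modules_conf() {
    conf_dir="${1:-/etc/modules-load.d}"
    # Same modules that step2 loads by hand with modprobe.
    printf '%s\n' ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack > "$conf_dir/ipvs.conf"
}
```

The directory is a parameter only so the helper can be tried against a scratch directory before touching /etc.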
step3 - edit the kube-proxy configuration
Set the mode field in the kube-proxy ConfigMap to ipvs
    $ kubectl edit configmap kube-proxy -n kube-system
    ...
    kind: KubeProxyConfiguration
    metricsBindAddress: ""
    mode: "ipvs"
    nodePortAddresses: null
    ...
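If you would rather not edit the ConfigMap interactively, the same change can be sketched as a small filter. This assumes the field currently reads mode: "" (which is what the "Unknown proxy mode" warning in step0 suggests); the function name set_ipvs_mode is hypothetical.

```shell
# Hypothetical filter: read YAML on stdin and rewrite an empty
# `mode: ""` line to `mode: "ipvs"`, keeping the indentation.
# Lines that do not match exactly are passed through unchanged.
# Possible usage against a live cluster:
#   kubectl get configmap kube-proxy -n kube-system -o yaml \
#     | set_ipvs_mode | kubectl apply -f -
set_ipvs_mode() {
    sed -e 's/^\([[:space:]]*\)mode: ""$/\1mode: "ipvs"/'
}
```

If mode already holds some other value, the filter leaves it alone, so check the result before applying it.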
step4 - restart kube-proxy
You can delete the kube-proxy pods one at a time and let k8s recreate them, or delete them all in one go
    $ kubectl get pod -n kube-system | grep kube-proxy | awk '{system("kubectl delete pod "$1" -n kube-system")}'
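A shorter variant of the bulk delete uses a label selector instead of grep and awk. This is only a sketch: it assumes the kube-proxy pods carry the label k8s-app=kube-proxy, which is what kubeadm-deployed clusters normally use, but check with kubectl get pod -n kube-system --show-labels first. The function name restart_kube_proxy is made up.

```shell
# Hypothetical helper: delete every kube-proxy pod by label so the
# DaemonSet recreates them with the new configuration.
# Assumption: the pods are labelled k8s-app=kube-proxy.
restart_kube_proxy() {
    kubectl delete pod -n kube-system -l k8s-app=kube-proxy
}
```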
Check the kube-proxy log again
    $ kubectl logs -n kube-system kube-proxy-44zw5
    I0720 05:37:30.026304 1 node.go:136] Successfully retrieved node IP: 10.160.18.181
    I0720 05:37:30.026349 1 server_others.go:259] Using ipvs Proxier.
    W0720 05:37:30.026600 1 proxier.go:429] IPVS scheduler not specified, use rr by default
    I0720 05:37:30.026814 1 server.go:583] Version: v1.18.3
    I0720 05:37:30.027200 1 conntrack.go:52] Setting nf_conntrack_max to 131072
    I0720 05:37:30.027452 1 config.go:133] Starting endpoints config controller
    I0720 05:37:30.027474 1 shared_informer.go:223] Waiting for caches to sync for endpoints config
    I0720 05:37:30.027507 1 config.go:315] Starting service config controller
    I0720 05:37:30.027529 1 shared_informer.go:223] Waiting for caches to sync for service config
    I0720 05:37:30.127736 1 shared_informer.go:230] Caches are synced for endpoints config
    I0720 05:37:30.127790 1 shared_informer.go:230] Caches are synced for service config
The line Using ipvs Proxier. shows that IPVS is now enabled.
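With several nodes it gets tedious to read each log by eye. A rough sketch of a filter that reports which proxier a log mentions is below; it only matches the two message strings seen in the v1.18.3 logs in this post, and other kube-proxy versions may word them differently. The function name detect_proxy_mode is hypothetical.

```shell
# Hypothetical helper: read a kube-proxy log on stdin and print
# "ipvs", "iptables", or "unknown" depending on which "Using ...
# Proxier" message it contains (strings taken from the v1.18.3 logs above).
# Possible usage:
#   kubectl logs -n kube-system kube-proxy-44zw5 | detect_proxy_mode
detect_proxy_mode() {
    log="$(cat)"
    if printf '%s\n' "$log" | grep -q 'Using ipvs Proxier'; then
        echo ipvs
    elif printf '%s\n' "$log" | grep -q 'Using iptables Proxier'; then
        echo iptables
    else
        echo unknown
    fi
}
```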
Now you can start a busybox container and ping the ClusterIP of coredns.
Problem 2: DNS lookups from busybox fail
step0 - symptoms
After fixing the problem of pods not reaching the DNS ClusterIP, busybox still could not resolve the IP of a service
    $ kubectl run -i --tty --image busybox dns-test --restart=Never --rm /bin/sh
    If you don't see a command prompt, try pressing enter.
    / # nslookup web-0.nginx
    ;; connection timed out; no servers could be reached
Searching online turned up the cause: busybox images newer than 1.28.4 all have this problem
step1 - fix
Run the DNS lookup from the busybox 1.28.4 image
    $ kubectl run -i --tty --image busybox:1.28.4 dns-test --restart=Never --rm /bin/sh
    If you don't see a command prompt, try pressing enter.
    / # nslookup web-0.nginx
    Server: 10.96.0.10
    Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local

    Name: web-0.nginx
    Address 1: 172.1.2.175 web-0.nginx.default.svc.cluster.local
    / # ping web-0.nginx
    PING web-0.nginx (172.1.2.175): 56 data bytes
    64 bytes from 172.1.2.175: seq=0 ttl=62 time=1.050 ms
    64 bytes from 172.1.2.175: seq=1 ttl=62 time=0.432 ms
Problem 3: pods cannot reach each other or external networks
Cause: iptables
Fix:
Run the following on every node
    $ iptables -P INPUT ACCEPT
    $ iptables -P FORWARD ACCEPT
    $ iptables -F
    $ iptables -L -n

    $ iptables -t nat -I POSTROUTING -s 172.1.2.0/24 -j MASQUERADE
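Running the MASQUERADE command more than once inserts a duplicate rule each time. A minimal sketch of an idempotent version is below, using iptables -C to check for the rule before inserting it. The function name ensure_masquerade is made up, and the CIDR is passed as a parameter because 172.1.2.0/24 is the value used in this post and the pod subnet on your nodes may differ.

```shell
# Hypothetical helper: insert the POSTROUTING MASQUERADE rule for a
# pod subnet only if an identical rule is not already present.
# Usage (as root): ensure_masquerade 172.1.2.0/24
ensure_masquerade() {
    cidr="$1"
    # -C returns non-zero when no matching rule exists.
    if ! iptables -t nat -C POSTROUTING -s "$cidr" -j MASQUERADE 2>/dev/null; then
        iptables -t nat -I POSTROUTING -s "$cidr" -j MASQUERADE
    fi
}
```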
Test
    $ kubectl run -i --tty --image busybox:1.28.4 connect-test --restart=Never --rm -- /bin/sh
    If you don't see a command prompt, try pressing enter.
    / # ping 10.96.0.10 -c 1
    PING 10.96.0.10 (10.96.0.10): 56 data bytes
    64 bytes from 10.96.0.10: seq=0 ttl=64 time=0.082 ms

    --- 10.96.0.10 ping statistics ---
    1 packets transmitted, 1 packets received, 0% packet loss
    round-trip min/avg/max = 0.082/0.082/0.082 ms
    / # ping 10.96.0.1 -c 1
    PING 10.96.0.1 (10.96.0.1): 56 data bytes
    64 bytes from 10.96.0.1: seq=0 ttl=64 time=0.072 ms

    --- 10.96.0.1 ping statistics ---
    1 packets transmitted, 1 packets received, 0% packet loss
    round-trip min/avg/max = 0.072/0.072/0.072 ms
    / # ping 172.1.1.193 -c 1
    PING 172.1.1.193 (172.1.1.193): 56 data bytes
    64 bytes from 172.1.1.193: seq=0 ttl=64 time=0.108 ms

    --- 172.1.1.193 ping statistics ---
    1 packets transmitted, 1 packets received, 0% packet loss
    round-trip min/avg/max = 0.108/0.108/0.108 ms
    / # ping 223.5.5.5 -c 1
    PING 223.5.5.5 (223.5.5.5): 56 data bytes
    64 bytes from 223.5.5.5: seq=0 ttl=114 time=5.659 ms

    --- 223.5.5.5 ping statistics ---
    1 packets transmitted, 1 packets received, 0% packet loss
    round-trip min/avg/max = 5.659/5.659/5.659 ms
    / #
    # Note: 10.96.0.1 is the ClusterIP of kube-apiserver
    #       10.96.0.10 is the ClusterIP of coredns
    #       172.1.1.193 is the IP of a pod
Make the iptables rules survive a reboot
Run the following on every node:
    $ iptables-save > /etc/iptables.up.rules
    $ echo -e '#!/bin/bash\n/sbin/iptables-restore < /etc/iptables.up.rules' > /etc/network/if-pre-up.d/iptables
    $ chmod +x /etc/network/if-pre-up.d/iptables