K8s Learning Notes: DaemonSet

Working through Geek Time's course 《深入剖析Kubernetes》 (Kubernetes in Depth).

In the spirit of "reading it a thousand times is no match for doing it once," I run the examples by hand and record the results.

Corresponding chapter: 21 | 容器化守护进程的意义:DaemonSet (The Meaning of Containerized Daemons: DaemonSet)

nodeAffinity

The nodeSelector/nodeAffinity YAML from the article stayed in the Pending state after kubectl apply. So, cleverly, I took a look at the flannel pod's spec and noticed that where the article uses matchExpressions, flannel uses matchFields.

apiVersion: v1
kind: Pod
metadata:
  name: node-affinity-pod
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchFields:
          - key: metadata.name
            operator: In
            values:
            - node2
  containers:
  - name: busybox
    image: busybox
    imagePullPolicy: IfNotPresent
    stdin: true
    tty: true

With this, the pod is pinned to node2:

$ kubectl get pod node-affinity-pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
node-affinity-pod 1/1 Running 0 23s 172.1.1.100 node2 <none> <none>
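
For comparison, the same pinning can also be written with matchExpressions against the built-in kubernetes.io/hostname node label (a sketch of just the affinity fragment; node names come from my environment):

```yaml
# Hypothetical alternative: select the node by its well-known hostname label
# instead of the metadata.name field.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - node2
```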

Of course, this only pins which node the pod is scheduled to; it isn't exclusive. For example, if I change the name in the YAML above and create another pod, it also comes up fine:

$ kubectl get pod node-affinity-pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
node-affinity-pod 1/1 Running 0 23s 172.1.1.100 node2 <none> <none>
node-affinity-pod2 1/1 Running 0 14s 172.1.1.101 node2 <none> <none>

But if I change the node to node1 (node1 is the master node in my environment):

apiVersion: v1
kind: Pod
metadata:
  name: node-affinity-pod
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchFields:
          - key: metadata.name
            operator: In
            values:
            - node1
  containers:
  - name: busybox
    image: busybox
    imagePullPolicy: IfNotPresent
    stdin: true
    tty: true
$ kubectl get pod node-affinity-pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
node-affinity-pod 0/1 Pending 0 96s <none> <none> <none> <none>
$ kubectl describe pod node-affinity-pod
Name: node-affinity-pod
Namespace: default
Priority: 0
Node: <none>
Labels: <none>
Annotations:
Status: Pending
...
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 36s (x3 over 2m) default-scheduler 0/4 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 3 node(s) didn't match node selector.

1 node(s) had taint {node-role.kubernetes.io/master: }: because the master node doesn't allow ordinary pods to be scheduled onto it, the pod stays Pending.
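
The taint that blocks scheduling can be seen on the node object itself (a sketch of the relevant excerpt; node1 is the master in my environment):

```yaml
# From `kubectl get node node1 -o yaml` (excerpt)
spec:
  taints:
  - key: node-role.kubernetes.io/master
    effect: NoSchedule
```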

Taints

apiVersion: v1
kind: Pod
metadata:
  name: toleration-pod
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchFields:
          - key: metadata.name
            operator: In
            values:
            - node1
  tolerations:
  - key: node-role.kubernetes.io/master
    effect: NoSchedule
  containers:
  - name: busybox
    image: busybox
    imagePullPolicy: IfNotPresent
    stdin: true
    tty: true

This reworks the pod above by adding a toleration for node-role.kubernetes.io/master.

$ kubectl get pod toleration-pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
toleration-pod 1/1 Running 0 1s 172.1.0.89 node1 <none> <none>

The pod has now been scheduled onto node1.

This solves the problem of pods not being schedulable onto the master node. In the same vein, the course mentions tolerating the unschedulable taint:

The DaemonSet automatically adds this special Toleration to the Pods it manages, so these Pods can ignore that restriction, which in turn guarantees that a Pod is scheduled onto every node.
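
For reference, the toleration the course refers to would look like this in a pod template (a sketch):

```yaml
# Tolerate nodes marked unschedulable (e.g. after `kubectl cordon`);
# the DaemonSet controller adds this automatically.
tolerations:
- key: node.kubernetes.io/unschedulable
  operator: Exists
  effect: NoSchedule
```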

DaemonSet

Creating the DaemonSet

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: test-ds
spec:
  selector:
    matchLabels:
      name: my-test
  template:
    metadata:
      labels:
        name: my-test
    spec:
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: my-test-busybox
        image: busybox
        imagePullPolicy: IfNotPresent
        stdin: true
        tty: true

This creates a DaemonSet named test-ds. In the tolerations section it tolerates the master taint, so its pods can also be scheduled onto the master node.

Checking the result:

$ kubectl get ds
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
test-ds 4 4 4 4 4 <none> 3m7s
$ kubectl get pods -l name=my-test -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
test-ds-5nxj9 1/1 Running 0 4m33s 172.1.1.103 node2 <none> <none>
test-ds-bc9jx 1/1 Running 0 4m33s 172.1.2.54 bqi-k8s-node3 <none> <none>
test-ds-kgxm5 1/1 Running 0 4m33s 172.1.3.9 k8s-node4 <none> <none>
test-ds-wvhm2 1/1 Running 0 4m33s 172.1.0.90 node1 <none> <none>
$ kubectl describe pod test-ds-kgxm5
Name: test-ds-kgxm5
Namespace: default
Priority: 0
Node: k8s-node4/10.160.18.184
Start Time: Fri, 31 Jul 2020 11:50:29 +0800
Labels: controller-revision-hash=7cdb9f7c5c
name=my-test
pod-template-generation=1
Annotations: <none>
Status: Running
IP: 172.1.3.9
IPs:
IP: 172.1.3.9
Controlled By: DaemonSet/test-ds
...
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/disk-pressure:NoSchedule
node.kubernetes.io/memory-pressure:NoSchedule
node.kubernetes.io/not-ready:NoExecute
node.kubernetes.io/pid-pressure:NoSchedule
node.kubernetes.io/unreachable:NoExecute
node.kubernetes.io/unschedulable:NoSchedule
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 4m55s default-scheduler Successfully assigned default/test-ds-kgxm5 to k8s-node4
Normal Pulling 4m54s kubelet, k8s-node4 Pulling image "busybox"
Normal Pulled 4m53s kubelet, k8s-node4 Successfully pulled image "busybox"
Normal Created 4m53s kubelet, k8s-node4 Created container my-test-busybox
Normal Started 4m52s kubelet, k8s-node4 Started container my-test-busybox

As you can see, a pod was created on every node, and in the Tolerations field, besides node-role.kubernetes.io/master:NoSchedule, many tolerations were added automatically.

Killing a pod

$ kubectl delete pod test-ds-kgxm5
pod "test-ds-kgxm5" deleted
$ kubectl get pods -l name=my-test -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
test-ds-5nxj9 1/1 Running 0 9m9s 172.1.1.103 node2 <none> <none>
test-ds-bc9jx 1/1 Running 0 9m9s 172.1.2.54 bqi-k8s-node3 <none> <none>
test-ds-dckg2 1/1 Running 0 5s 172.1.3.10 k8s-node4 <none> <none>
test-ds-wvhm2 1/1 Running 0 9m9s 172.1.0.90 node1 <none> <none>

Updates

  • Try updating the image to a broken one
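
One way to trigger this (a sketch; busybox:no-such-tag is a hypothetical bad tag) is to point the pod template at an image tag that doesn't exist and re-apply:

```yaml
      containers:
      - name: my-test-busybox
        image: busybox:no-such-tag   # hypothetical non-existent tag
```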
$ kubectl get pods -l name=my-test -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
test-ds-5nxj9 1/1 Running 0 5h27m 172.1.1.103 node2 <none> <none>
test-ds-dckg2 1/1 Running 0 5h18m 172.1.3.10 k8s-node4 <none> <none>
test-ds-sbsbq 0/1 ContainerCreating 0 23s <none> bqi-k8s-node3 <none> <none>
test-ds-wvhm2 1/1 Running 0 5h27m 172.1.0.90 node1 <none> <none>
$ kubectl get pods -l name=my-test -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
test-ds-5nxj9 1/1 Running 0 5h28m 172.1.1.103 node2 <none> <none>
test-ds-dckg2 1/1 Running 0 5h19m 172.1.3.10 k8s-node4 <none> <none>
test-ds-sbsbq 0/1 ImagePullBackOff 0 42s 172.1.2.55 bqi-k8s-node3 <none> <none>
test-ds-wvhm2 1/1 Running 0 5h28m 172.1.0.90 node1 <none> <none>

As you can see, the DaemonSet controller picks one pod to update first; when that update fails, the rollout stops.
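
This one-pod-at-a-time behavior follows from the DaemonSet's default update strategy, which, written out explicitly, looks like this (a sketch of the defaults):

```yaml
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1   # update at most one node's pod at a time
```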

$ kubectl describe pod test-ds-5nxj9
Name: test-ds-5nxj9
Namespace: default
Priority: 0
Node: node2/10.160.18.181
Start Time: Fri, 31 Jul 2020 11:50:29 +0800
Labels: controller-revision-hash=7cdb9f7c5c
name=my-test
pod-template-generation=1
...
$ kubectl describe pod test-ds-dckg2
Name: test-ds-dckg2
Namespace: default
Priority: 0
Node: k8s-node4/10.160.18.184
Start Time: Fri, 31 Jul 2020 11:59:33 +0800
Labels: controller-revision-hash=7cdb9f7c5c
name=my-test
pod-template-generation=1
...
$ kubectl describe pod test-ds-sbsbq
Name: test-ds-sbsbq
Namespace: default
Priority: 0
Node: bqi-k8s-node3/10.160.18.183
Start Time: Fri, 31 Jul 2020 17:18:02 +0800
Labels: controller-revision-hash=6755d9c956
name=my-test
pod-template-generation=2

You can also see in the new pod's labels:

  • controller-revision-hash has changed to a new value
  • pod-template-generation has been bumped to 2

Now, change the image to one that works:

$ kubectl get pods -l name=my-test -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
test-ds-dckg2 1/1 Terminating 0 5h25m 172.1.3.10 k8s-node4 <none> <none>
test-ds-jd99w 1/1 Running 0 47s 172.1.2.56 bqi-k8s-node3 <none> <none>
test-ds-nw5lk 1/1 Running 0 9s 172.1.1.104 node2 <none> <none>
test-ds-wvhm2 1/1 Running 0 5h34m 172.1.0.90 node1 <none> <none>
$ kubectl get pods -l name=my-test -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
test-ds-72z5v 1/1 Running 0 40s 172.1.3.11 k8s-node4 <none> <none>
test-ds-jd99w 1/1 Running 0 118s 172.1.2.56 bqi-k8s-node3 <none> <none>
test-ds-nw5lk 1/1 Running 0 80s 172.1.1.104 node2 <none> <none>
test-ds-wvhm2 1/1 Terminating 0 5h35m 172.1.0.90 node1 <none> <none>

As you can see, once the update succeeds, the remaining pods are replaced one by one.

$ kubectl describe pod test-ds-72z5v
Name: test-ds-72z5v
Namespace: default
Priority: 0
Node: k8s-node4/10.160.18.184
Start Time: Fri, 31 Jul 2020 17:25:23 +0800
Labels: controller-revision-hash=86b8bf4df4
name=my-test
pod-template-generation=3

For the updated pod:

  • controller-revision-hash has again changed to a new value
  • pod-template-generation has increased to 3

Summary

A DaemonSet combines iterating over nodes to create pods with tolerations, guaranteeing that one of its pods gets created on every node.

Through nodeAffinity and Toleration, two small scheduler features, it ensures that each node runs exactly one such Pod.

Meanwhile, revisions are managed through ControllerRevision objects.
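
If you want to inspect those revisions, kubectl exposes them directly (a sketch; output omitted since it depends on the cluster state):

```shell
# Show the DaemonSet's recorded revisions
kubectl rollout history daemonset test-ds
# The revisions themselves are stored as ControllerRevision objects
kubectl get controllerrevision
```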