Installing k8s on Ubuntu using Chinese mirrors

  • Aliyun (Alibaba Cloud) mirrors
  • Ubuntu 20.04
  • one master + one slave node

Install Docker

Install docker-ce on both nodes:

$ apt-get update 
$ apt-get -y install apt-transport-https ca-certificates curl software-properties-common
$ curl -fsSL http://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo apt-key add -
$ add-apt-repository "deb [arch=amd64] http://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable"
$ apt update
$ apt-get -y install docker-ce

Install kubeadm, kubectl, and kubelet

Install them on both nodes:

$ apt-get update && apt-get install -y apt-transport-https
$ curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -
$ cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF
$ apt-get update
$ apt-get install -y kubelet kubeadm kubectl

Initialize the master node

$ kubeadm init --pod-network-cidr=172.172.0.0/16 --image-repository registry.aliyuncs.com/google_containers

On success, you will see output like this:

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 10.160.18.180:6443 --token 5xxosi.3du1z15pevcvnyyx \
    --discovery-token-ca-cert-hash sha256:4cc4977482e04ac0ca845bf3520a6a5fa8a0cf6ac8233e734a47e0250c259f73

As prompted, run:

$ mkdir -p $HOME/.kube
$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
$ sudo chown $(id -u):$(id -g) $HOME/.kube/config

Install flannel

Reference: https://github.com/coreos/flannel

$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
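
One pitfall worth noting: kubeadm init above was run with --pod-network-cidr=172.172.0.0/16, while the upstream kube-flannel.yml defaults its Network to 10.244.0.0/16, and the two should match. A minimal sketch of patching the manifest, demonstrated on a sample of the net-conf.json fragment (on a real cluster, download kube-flannel.yml, patch it, then kubectl apply -f the edited file):

```shell
# Sample of the net-conf.json section from kube-flannel.yml (default CIDR);
# written to a temp file purely for illustration.
manifest=$(mktemp)
cat > "$manifest" <<'EOF'
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
EOF

# Rewrite the default CIDR to the one passed to kubeadm init.
sed -i 's#10.244.0.0/16#172.172.0.0/16#' "$manifest"
grep '"Network"' "$manifest"
```

If the two CIDRs disagree, pods can be assigned addresses outside the cluster CIDR and cross-node networking breaks.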

Install the dashboard

Reference: https://github.com/kubernetes/dashboard

$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0/aio/deploy/recommended.yaml

Modify the dashboard configuration

Change the type field under spec to NodePort:

$ kubectl -n kubernetes-dashboard edit service kubernetes-dashboard
.......
spec:
  clusterIP: 10.101.212.193
  externalTrafficPolicy: Cluster
  ports:
  - nodePort: 32609
    port: 443
    protocol: TCP
    targetPort: 8443
  selector:
    k8s-app: kubernetes-dashboard
  sessionAffinity: None
  type: NodePort

After the change, check the port info:

$ kubectl -n kubernetes-dashboard get service kubernetes-dashboard
NAME                   TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)         AGE
kubernetes-dashboard   NodePort   10.101.212.193   <none>        443:32609/TCP   27m

Now the dashboard can be accessed at https://<master-ip>:<NodePort> (here the port is 32609).
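
If you need the NodePort in a script (for example to print the dashboard URL), it can be cut out of the PORT(S) column. A sketch using the output line above as sample input; on a live cluster you would pipe the real kubectl command instead:

```shell
# Sample line from the `kubectl get service` output above; in practice:
# kubectl -n kubernetes-dashboard get service kubernetes-dashboard | tail -1
svc_line='kubernetes-dashboard   NodePort   10.101.212.193   <none>   443:32609/TCP   27m'

# PORT(S) is the 5th column, formatted <port>:<nodePort>/<proto>;
# keep only the digits between ':' and '/'.
node_port=$(echo "$svc_line" | awk '{print $5}' | sed 's#.*:\([0-9]*\)/.*#\1#')
echo "https://<master-ip>:$node_port"
```

With a recent kubectl, `-o jsonpath='{.spec.ports[0].nodePort}'` fetches the same value directly without text parsing.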

Although the page now loads, you need a token or a kubeconfig to sign in.

Create a sample user

Create the service account

Create dashboard-adminuser.yaml with the following content:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard

Run:

$ kubectl apply -f dashboard-adminuser.yaml

Create the ClusterRoleBinding

Create cluster-role-binding.yaml with the following content:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin-user
  namespace: kubernetes-dashboard

Run:

$ kubectl apply -f cluster-role-binding.yaml

Get the token

$ kubectl -n kubernetes-dashboard describe secret $(kubectl -n kubernetes-dashboard get secret | grep admin-user | awk '{print $1}')
Name:         admin-user-token-jmggp
Namespace:    kubernetes-dashboard
Labels:       <none>
Annotations:  kubernetes.io/service-account.name: admin-user
              kubernetes.io/service-account.uid: 58210c16-0fac-438c-8867-d0a3e7b950b9

Type: kubernetes.io/service-account-token

Data
====
ca.crt: 1025 bytes
namespace: 20 bytes
token: eyJhbGciOiJSUzI1NiIsImtpZCI6IlhTSnlXMUhXTlNnUmd4MlVMTzdtbm14YVdiSzNUdjk4UnVoZ3RRbUFXZGsifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlcm5ldGVzLWRhc2hib2FyZCIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJhZG1pbi11c2VyLXRva2VuLWptZ2dwIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQubmFtZSI6ImFkbWluLXVzZXIiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC51aWQiOiI1ODIxMGMxNi0wZmFjLTQzOGMtODg2Ny1kMGEzZTdiOTUwYjkiLCJzdWIiOiJzeXN0ZW06c2VydmljZWFjY291bnQ6a3ViZXJuZXRlcy1kYXNoYm9hcmQ6YWRtaW4tdXNlciJ9.F4TKNO_6Guu-vcLUtELUOhRI2dGMcZ3V1et2evono_a6f-TvCR9c4pbyYCnRdCG6_MumTmyE5W1g3zHioVnb5TgnGwfmAfIWLltwwLEOxOdLfO7oqM8zrYfzZnIH16SoOZQYMU7xIk5MhE5WN265n8Q2kpDMraf0L06_nqNy1pq8h9eaX0QIntosl4fmf9KVew0geLCKbknEwpnzGGfSCcKLLgE7a45ACWwStJiL29t69gcKJ6ze33MXpA5_irk2nKkavXbKEk7ejapgYK66nOxJnDKgbNVDcBP47xHrPjGeeupB6bw6uUMWxA6z4kJUTVRepk6yTMGVDPzB9Muicw

Now you can use this token to log in to the Dashboard.
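
For scripting, the long describe output can be reduced to just the token field. A sketch run against a trimmed sample of the output above, with a dummy token value standing in for the real one; on a live cluster, pipe the real describe command instead:

```shell
# Trimmed sample of the `kubectl describe secret` output above,
# with a dummy token value for illustration.
describe_output='ca.crt: 1025 bytes
namespace: 20 bytes
token: dummy-token-value'

# The token is the second field of the "token:" line.
token=$(echo "$describe_output" | awk '/^token:/ {print $2}')
echo "$token"
```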

Join the slave node to the cluster

$ kubeadm join 10.160.18.180:6443 --token 5xxosi.3du1z15pevcvnyyx \
    --discovery-token-ca-cert-hash sha256:4cc4977482e04ac0ca845bf3520a6a5fa8a0cf6ac8233e734a47e0250c259f73

Problems and Solutions

Docker cgroup driver problem

Problem log

[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/

Solution

  1. Create daemon.json under /etc/docker/:
cat > /etc/docker/daemon.json <<EOF
{
"exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
  2. Restart the Docker daemon:
$ systemctl restart docker
$ systemctl status docker
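
A syntax error in daemon.json will keep dockerd from starting at all, so it can be worth validating the file before the restart. A sketch, assuming python3 is available, demonstrated against a temp copy; point it at /etc/docker/daemon.json on a real host:

```shell
# Write the same config to a temp file for illustration;
# on a real host, validate /etc/docker/daemon.json instead.
conf=$(mktemp)
cat > "$conf" <<'EOF'
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF

# json.tool exits non-zero (and prints the error) on invalid JSON.
python3 -m json.tool "$conf" > /dev/null && echo "daemon.json: valid JSON"
```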

Swap problem

Problem log

[ERROR Swap]: running with swap on is not supported. Please disable swap

Solution

$ swapoff -a
# run this on all nodes

However, this only disables swap temporarily; it will be enabled again after the node reboots. To make the change permanent, edit /etc/fstab and comment out the swap line with a #:

#UUID=4eeb5155-41f9-4478-a420-2beb4290a721 none            swap    sw              0       0
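
Commenting out the swap line can also be scripted. A GNU sed sketch, demonstrated on a temp copy of an fstab; on a real node, edit /etc/fstab itself (ideally after backing it up):

```shell
# Temp copy with an uncommented swap entry, for illustration only.
fstab=$(mktemp)
cat > "$fstab" <<'EOF'
UUID=4eeb5155-41f9-4478-a420-2beb4290a721 none swap sw 0 0
EOF

# Prefix '#' to every uncommented line whose filesystem type column is swap.
sed -ri 's/^([^#].*[[:space:]]swap[[:space:]].*)$/#\1/' "$fstab"
cat "$fstab"
```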

Node stuck in NotReady state

There are many possible reasons for a node to be NotReady; work through them step by step.

$ kubectl get nodes
NAME    STATUS     ROLES    AGE     VERSION
node1   Ready      master   4h56m   v1.18.2
node2   NotReady   <none>   4h6m    v1.18.2
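
To spot NotReady nodes from a script, the STATUS column can be filtered. A sketch run over the sample output above; in practice, pipe the real kubectl get nodes:

```shell
# Sample `kubectl get nodes` output from above; pipe the real command in practice.
nodes='NAME    STATUS     ROLES    AGE     VERSION
node1   Ready      master   4h56m   v1.18.2
node2   NotReady   <none>   4h6m    v1.18.2'

# Skip the header row; print the name of every node whose STATUS is not Ready.
not_ready=$(echo "$nodes" | awk 'NR > 1 && $2 != "Ready" {print $1}')
echo "$not_ready"
```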

First, look for the cause of the error:

$ kubectl get pod -n kube-system
NAME                            READY   STATUS                  RESTARTS   AGE
coredns-7ff77c879f-2k7rw        1/1     Running                 1          4h47m
coredns-7ff77c879f-q76jr        1/1     Running                 1          4h47m
etcd-node1                      1/1     Running                 2          4h47m
kube-apiserver-node1            1/1     Running                 2          4h47m
kube-controller-manager-node1   1/1     Running                 2          4h47m
kube-flannel-ds-amd64-2jn8n     0/1     Init:ImagePullBackOff   0          3h49m
kube-flannel-ds-amd64-ftpxl     1/1     Running                 1          3h49m
kube-proxy-5q8wp                1/1     Running                 2          4h47m
kube-proxy-wfcjq                0/1     ContainerCreating       0          5m46s
kube-scheduler-node1            1/1     Running                 2          4h47m

Some k8s services run on every node, such as kube-proxy and flannel here.

$ kubectl describe pod -n kube-system kube-flannel-ds-amd64-2jn8n
.....
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Pulling 6m51s kubelet, node2 Pulling image "quay.io/coreos/flannel:v0.12.0-amd64"
Warning Failed 5m48s kubelet, node2 Failed to pull image "quay.io/coreos/flannel:v0.12.0-amd64": rpc error: code = Unknown desc = Error response from daemon: Get https://quay.io/v2/coreos/flannel/manifests/v0.12.0-amd64: received unexpected HTTP status: 500 Internal Server Error
Warning Failed 5m48s kubelet, node2 Error: ErrImagePull
Normal BackOff 5m47s kubelet, node2 Back-off pulling image "quay.io/coreos/flannel:v0.12.0-amd64"
Warning Failed 5m47s kubelet, node2 Error: ImagePullBackOff
Normal Pulling 5m36s (x2 over 5m51s) kubelet, node2 Pulling image "quay.io/coreos/flannel:v0.12.0-amd64"

The most common cause is an image pull failure, for example when the image pulls fine on the master node but fails on the other nodes.

Solutions

1. Pull the image manually on the slave node:
$ docker pull quay.io/coreos/flannel:v0.12.0-amd64
2. Export the image from the master node and import it on the slave node
  • List the images on the master node:
(master)$ docker images
REPOSITORY                                                        TAG             IMAGE ID       CREATED        SIZE
kubernetesui/dashboard                                            v2.0.0          8b32422733b3   3 weeks ago    222MB
registry.aliyuncs.com/google_containers/kube-proxy                v1.18.2         0d40868643c6   4 weeks ago    117MB
registry.aliyuncs.com/google_containers/kube-scheduler            v1.18.2         a3099161e137   4 weeks ago    95.3MB
registry.aliyuncs.com/google_containers/kube-apiserver            v1.18.2         6ed75ad404bd   4 weeks ago    173MB
registry.aliyuncs.com/google_containers/kube-controller-manager   v1.18.2         ace0a8c17ba9   4 weeks ago    162MB
kubernetesui/metrics-scraper                                      v1.0.4          86262685d9ab   7 weeks ago    36.9MB
quay.io/coreos/flannel                                            v0.12.0-amd64   4e9f801d2217   2 months ago   52.8MB
registry.aliyuncs.com/google_containers/pause                     3.2             80d28bedfe5d   3 months ago   683kB
registry.aliyuncs.com/google_containers/coredns                   1.6.7           67da37a9a360   3 months ago   43.8MB
registry.aliyuncs.com/google_containers/etcd                      3.4.3-0         303ce5db0e90   6 months ago   288MB
  • Save the image to a file:
(master)$ docker save quay.io/coreos/flannel  > flannel.tar
  • Transfer the file to the slave node (e.g. with scp)
  • Load the image on the slave node:
(slave)$ docker load < flannel.tar
256a7af3acb1: Loading layer [==================================================>] 5.844MB/5.844MB
d572e5d9d39b: Loading layer [==================================================>] 10.37MB/10.37MB
57c10be5852f: Loading layer [==================================================>] 2.249MB/2.249MB
7412f8eefb77: Loading layer [==================================================>] 35.26MB/35.26MB
05116c9ff7bf: Loading layer [==================================================>] 5.12kB/5.12kB
Loaded image: quay.io/coreos/flannel:v0.12.0-amd64

Now everything is OK, and the node should move to Ready.