- Resource scheduling (nodeSelector, nodeAffinity, Taints, Tolerations)
- 1. nodeSelector
- 2. nodeAffinity
- 3. Taints and Tolerations
nodeSelector is the simplest way to constrain scheduling; it is a field of pod.spec.
Use --show-labels to view the labels on a given node:
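As a minimal sketch of where the field sits (the pod name and image here are illustrative, not from this lab):

```yaml
# Hypothetical Pod: nodeSelector sits directly under spec
apiVersion: v1
kind: Pod
metadata:
  name: nginx-ssd          # illustrative name
spec:
  nodeSelector:
    disktype: ssd          # only nodes carrying this label are candidates
  containers:
  - name: nginx
    image: nginx
```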
[root@master haproxy]# kubectl get node node1 --show-labels
NAME    STATUS   ROLES    AGE     VERSION   LABELS
node1   Ready    <none>   4d12h   v1.20.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node1,kubernetes.io/os=linux
[root@master haproxy]#
If no extra labels have been added, only the default labels shown above appear. Labels can be added to a node with kubectl label node:
[root@master haproxy]# kubectl label node node1 disktype=ssd
node/node1 labeled
[root@master haproxy]# kubectl get node node1 --show-labels    // the new label now shows up
NAME    STATUS   ROLES    AGE     VERSION   LABELS
node1   Ready    <none>   4d12h   v1.20.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,disktype=ssd,kubernetes.io/arch=amd64,kubernetes.io/hostname=node1,kubernetes.io/os=linux
[root@master haproxy]#
Labels can likewise be removed with kubectl label node:
[root@master haproxy]# kubectl label node node1 disktype-
node/node1 labeled
[root@master haproxy]# kubectl get node node1 --show-labels
NAME    STATUS   ROLES    AGE     VERSION   LABELS
node1   Ready    <none>   4d12h   v1.20.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node1,kubernetes.io/os=linux
[root@master haproxy]#
Create a pod and bind it to a node with the nodeSelector field:
[root@master haproxy]# kubectl label node node1 disktype=ssd
node/node1 labeled
[root@master haproxy]# kubectl get node node1 --show-labels
NAME    STATUS   ROLES    AGE     VERSION   LABELS
node1   Ready    <none>   4d12h   v1.20.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,disktype=ssd,kubernetes.io/arch=amd64,kubernetes.io/hostname=node1,kubernetes.io/os=linux
[root@master haproxy]#
[root@master haproxy]# cat test.yml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: httpd2
  name: httpd2
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpd2
  template:
    metadata:
      labels:
        app: httpd2
    spec:
      nodeSelector:
        disktype: ssd        # bind the pod to nodes labeled disktype=ssd
      containers:
      - image: 3199560936/httpd:v0.4
        name: httpd2
---
apiVersion: v1
kind: Service
metadata:
  name: httpd2
spec:
  ports:
  - port: 80
    targetPort: 80
  selector:
    app: httpd2
[root@master haproxy]# kubectl create -f test.yml
deployment.apps/httpd2 created
service/httpd2 created
[root@master haproxy]#
Check which node the pod was scheduled to:
[root@master haproxy]# kubectl get pod -o wide
NAME                     READY   STATUS    RESTARTS   AGE   IP            NODE    NOMINATED NODE   READINESS GATES
httpd2-fd86fb676-l7gk4   1/1     Running   0          61s   10.244.1.45   node1   <none>           <none>
[root@master haproxy]#
As you can see, the pod was forced onto the node carrying the disktype=ssd label.
2. nodeAffinity
nodeAffinity means node-affinity scheduling; it is the newer policy intended to replace nodeSelector. Two kinds of node affinity are currently supported:
- requiredDuringSchedulingIgnoredDuringExecution: the specified rules must be satisfied before a pod can be scheduled onto the node; a hard constraint.
- preferredDuringSchedulingIgnoredDuringExecution: the scheduler tries to place the pod on a node that satisfies the rules but does not insist; a soft constraint. Multiple preference rules can be weighted to define their order of precedence.
IgnoredDuringExecution means: if the labels of a pod's node change while the pod is running, so that the node no longer satisfies the pod's affinity rules, the system ignores the label change and the pod keeps running on that node.
The operators supported by nodeAffinity are:
- In: the label's value is in the given list
- NotIn: the label's value is not in the given list
- Exists: the label exists
- DoesNotExist: the label does not exist
- Gt: the label's value is greater than the given value
- Lt: the label's value is less than the given value
Notes on nodeAffinity rules:
- If both nodeSelector and nodeAffinity are defined, both conditions must be met before the pod can run on the node.
- If nodeAffinity specifies multiple nodeSelectorTerms, matching any one of them is sufficient.
- If a nodeSelectorTerms entry contains multiple matchExpressions, a node must satisfy all of them to run the pod.
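A hedged sketch of those last two rules (the label keys zone and gpu are made up for illustration): the two nodeSelectorTerms below are ORed, while the matchExpressions inside the first term are ANDed:

```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:        # term 1: disktype=ssd AND zone not in (zone-c)
        - key: disktype
          operator: In
          values: ["ssd"]
        - key: zone
          operator: NotIn
          values: ["zone-c"]
      - matchExpressions:        # term 2, ORed with term 1: a gpu label merely exists
        - key: gpu
          operator: Exists
```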
[root@master haproxy]# cat test.yml
apiVersion: v1
kind: Pod
metadata:
  name: test1
  labels:
    app: nginx
spec:
  containers:
  - name: test1
    image: nginx
    imagePullPolicy: IfNotPresent
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: disktype
            operator: In
            values:
            - ssd
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 10
        preference:
          matchExpressions:
          - key: name
            operator: In
            values:
            - test
[root@master haproxy]#
Give node2 the disktype=ssd label as well:
[root@master haproxy]# kubectl label node node2 disktype=ssd
node/node2 labeled
[root@master haproxy]# kubectl get node node2 --show-labels
NAME    STATUS   ROLES    AGE     VERSION   LABELS
node2   Ready    <none>   4d12h   v1.20.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,disktype=ssd,kubernetes.io/arch=amd64,kubernetes.io/hostname=node2,kubernetes.io/os=linux
[root@master haproxy]#
Test
Label node1 with name=sb:
[root@master ~]# kubectl label node node1 name=sb
node/node1 labeled
[root@master ~]# kubectl get node node1 --show-labels
NAME    STATUS   ROLES    AGE     VERSION   LABELS
node1   Ready    <none>   4d12h   v1.20.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,disktype=ssd,kubernetes.io/arch=amd64,kubernetes.io/hostname=node1,kubernetes.io/os=linux,name=sb
[root@master ~]#
Create the pod and check the result:
[root@master haproxy]# cat httpd.yml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: httpd2
  name: httpd2
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpd2
  template:
    metadata:
      labels:
        app: httpd2
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: disktype          # hard requirement: disktype=ssd
                operator: In
                values:
                - ssd
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 10
            preference:
              matchExpressions:
              - key: name              # soft preference: name=sb
                operator: In
                values:
                - sb
      containers:
      - image: 3199560936/httpd:v0.4
        name: httpd2
---
apiVersion: v1
kind: Service
metadata:
  name: httpd2
spec:
  ports:
  - port: 80
    targetPort: 80
  selector:
    app: httpd2
[root@master haproxy]#
[root@master haproxy]# kubectl apply -f httpd.yml
deployment.apps/httpd2 created
service/httpd2 created
[root@master haproxy]#
[root@master haproxy]# kubectl get pod -o wide
NAME                     READY   STATUS    RESTARTS   AGE   IP            NODE    NOMINATED NODE   READINESS GATES
httpd2-fd86fb676-b8pqx   1/1     Running   0          13s   10.244.1.46   node1   <none>           <none>
[root@master haproxy]#
Remove the name=sb label and test again:
[root@master haproxy]# kubectl label node node1 name-
node/node1 labeled
[root@master haproxy]# kubectl get node node1 --show-labels
NAME    STATUS   ROLES    AGE     VERSION   LABELS
node1   Ready    <none>   4d12h   v1.20.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,disktype=ssd,kubernetes.io/arch=amd64,kubernetes.io/hostname=node1,kubernetes.io/os=linux
[root@master haproxy]#
[root@master haproxy]# kubectl apply -f haproxy.yml
deployment.apps/haproxy created
service/haproxy created
[root@master haproxy]# kubectl get pod -o wide
NAME                       READY   STATUS              RESTARTS   AGE   IP            NODE    NOMINATED NODE   READINESS GATES
haproxy-74f8f5c6cf-6pf9w   0/1     ContainerCreating   0          8s    <none>        node2   <none>           <none>
httpd2-fd86fb676-xggxk     1/1     Running             0          65s   10.244.1.47   node1   <none>           <none>
[root@master haproxy]#
The pod above must run on a node carrying the disktype=ssd label; if several nodes carry that label, the node also labeled name=sb is preferred.
3. Taints and Tolerations
Taints: keep pods away from particular nodes.
Tolerations: allow pods to be scheduled onto nodes that carry taints.
Typical use cases:
- Dedicated nodes: group nodes by business line; nothing is scheduled there by default, and only pods that tolerate the taint may be placed on them.
- Nodes with special hardware: some nodes have SSDs or GPUs; nothing is scheduled there by default, and only pods that tolerate the taint may be placed on them.
- Taint-based eviction.
The effect field
A taint's effect can take one of the following values:
- NoSchedule: a pod that does not declare a toleration for the taint will not be scheduled onto a node carrying it.
- PreferNoSchedule: the soft version of NoSchedule; the scheduler tries to avoid placing a non-tolerating pod on the node, but this is not guaranteed.
- NoExecute: defines eviction behavior, for example in response to node failure. A NoExecute taint affects pods already running on the node as follows:
  - pods without a matching toleration are evicted immediately
  - pods with a matching toleration but no tolerationSeconds value stay on the node indefinitely
  - pods with a matching toleration and a tolerationSeconds value are evicted after that many seconds
- Starting with Kubernetes 1.6, an alpha feature marks node problems as taints (currently only node unreachable and node not ready, i.e. the NodeCondition "Ready" being Unknown or False). With TaintBasedEvictions enabled (add TaintBasedEvictions=true to the --feature-gates parameter), the NodeController sets these taints automatically, and the previous eviction logic for nodes whose "Ready" condition changes is disabled. Note that to preserve the existing rate limits on pod eviction during node failures, the system taints nodes gradually in a rate-limited fashion; this prevents mass evictions in scenarios such as a temporarily unreachable master. The feature works together with tolerationSeconds, which lets a pod define how long it remains on a failed node before being evicted.
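That combination can be sketched as a pod-spec fragment (a hedged example, not taken from this lab): the pod tolerates the not-ready taint for five minutes before the NoExecute effect evicts it:

```yaml
tolerations:
- key: node.kubernetes.io/not-ready
  operator: Exists
  effect: NoExecute
  tolerationSeconds: 300   # evicted 300s after the node goes NotReady
```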
Taint
[root@master ~]# kubectl describe node master
...(truncated)...
Annotations:        volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Sat, 18 Dec 2021 22:07:52 -0500
Taints:             node-role.kubernetes.io/master:NoSchedule   // keeps ordinary pods off this node
Unschedulable:      false
Tolerations
[root@master ~]# kubectl describe pod httpd2-fd86fb676-xnrcc
...(truncated)...
Tolerations:  node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
              node.kubernetes.io/unreachable:NoExecute op=Exists for 300s   // the pod may stay on a tainted node for 300s
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  57s   default-scheduler  Successfully assigned default/httpd2-fd86fb676-xnrcc to node1
  Normal  Pulled     56s   kubelet            Container image "3199560936/httpd:v0.4" already present on machine
  Normal  Created    56s   kubelet            Created container httpd2
  Normal  Started    56s   kubelet            Started container httpd2
[root@master ~]#
Adding a taint to a node
Syntax: kubectl taint node [node] key=value:[effect]
where [effect] can be:
- NoSchedule: pods will never be scheduled here
- PreferNoSchedule: scheduling here is avoided when possible, without strictly requiring a toleration
- NoExecute: nothing new is scheduled here, and pods already running on the node are evicted
Add a tolerations field to the Pod configuration.
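The source does not show this step, so here is a hedged sketch of what the fragment could look like in a pod template, assuming the key-only taint disktype:NoSchedule that is added below:

```yaml
tolerations:
- key: disktype
  operator: Exists     # the taint carries no value, so match on key existence alone
  effect: NoSchedule
```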
Add a disktype taint:
[root@master ~]# kubectl taint node node1 disktype:NoSchedule
node/node1 tainted
[root@master ~]#
Check:
[root@master ~]# kubectl describe node node1
...(truncated)...
Annotations:        volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Sat, 18 Dec 2021 22:10:36 -0500
Taints:             disktype:NoSchedule   // the taint was added successfully
Unschedulable:      false
...(truncated)...
[root@master ~]#
Create a container to test:
[root@master haproxy]# cat haproxy.yml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: haproxy
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: haproxy
  template:
    metadata:
      labels:
        app: haproxy
    spec:
      containers:
      - image: 93quan/haproxy:v1-alpine
        imagePullPolicy: Always
        env:
        - name: RSIP
          value: "10.106.56.19 10.96.149.182"
        name: haproxy
        ports:
        - containerPort: 80
          hostPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: haproxy
  namespace: default
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: haproxy
  type: NodePort
[root@master haproxy]#
[root@master haproxy]# kubectl create -f haproxy.yml
deployment.apps/haproxy created
service/haproxy created
Check:
[root@master haproxy]# kubectl get pods -o wide   # node1 is tainted, so the new pod lands on node2
NAME                       READY   STATUS              RESTARTS   AGE     IP            NODE    NOMINATED NODE   READINESS GATES
haproxy-74f8f5c6cf-k8867   0/1     ContainerCreating   0          2m47s   <none>        node2   <none>           <none>
httpd2-fd86fb676-xnrcc     1/1     Running             0          17m     10.244.1.44   node1   <none>           <none>
[root@master haproxy]#
Removing a taint
Syntax: kubectl taint node [node] key:[effect]-
[root@master haproxy]# kubectl taint node node1 disktype-
node/node1 untainted
[root@master haproxy]#
Check:
[root@master haproxy]# kubectl describe node node1
...(truncated)...
Annotations:        volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Sat, 18 Dec 2021 22:10:36 -0500
Taints:             <none>   // the taint has been removed
Unschedulable:      false



