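A ReplicaSet in Kubernetes keeps a set of Pod replicas running: it guarantees that a specified number of Pods is running in the cluster. It continuously watches the state of those Pods and starts new replicas whenever Pods fail and the number of running replicas drops below the desired count.
This article explains how a ReplicaSet works, including how it is created in Kubernetes, how it creates and owns Pods, and how it replaces them when something goes wrong.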
Overview
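Before looking at how ReplicaSet is implemented, let's briefly cover how it is used. As with other Kubernetes objects, a new ReplicaSet is created in the cluster from a YAML file.
Apart from the usual `apiVersion`, `kind`, and `metadata` attributes, the spec in such a file contains three important parts: the number of Pod replicas `replicas`, the selector `selector`, and the Pod template `template`. Together these three parts define the specification of the ReplicaSet.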
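To make the three parts concrete, here is a minimal Go sketch. The types `PodTemplate` and `ReplicaSetSpec`, the helper `selectorMatches`, and the label and image values are all hypothetical simplifications for illustration; they are not the types from the Kubernetes API packages.

```go
package main

import "fmt"

// PodTemplate is a hypothetical, reduced stand-in for a Pod template.
type PodTemplate struct {
	Labels map[string]string // labels stamped on every Pod created from the template
	Image  string            // container image, for illustration only
}

// ReplicaSetSpec models the three parts of the spec named in the text.
type ReplicaSetSpec struct {
	Replicas int               // desired number of Pod replicas
	Selector map[string]string // labels a Pod must carry to be selected
	Template PodTemplate       // blueprint for newly created Pods
}

// selectorMatches reports whether labels contains every key/value pair of
// selector (equality-based matching only).
func selectorMatches(selector, labels map[string]string) bool {
	for key, want := range selector {
		if got, ok := labels[key]; !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	spec := ReplicaSetSpec{
		Replicas: 3,
		Selector: map[string]string{"tier": "frontend"},
		Template: PodTemplate{
			Labels: map[string]string{"app": "guestbook", "tier": "frontend"},
			Image:  "example.invalid/frontend:v1", // placeholder image name
		},
	}
	// Pods created from the template must be selectable, otherwise the
	// ReplicaSet could never count its own Pods.
	fmt.Println("template matches selector:", selectorMatches(spec.Selector, spec.Template.Labels))
	fmt.Println("desired replicas:", spec.Replicas)
}
```

The check in `main` reflects why the template labels and the selector belong together: a Pod created from the template has to carry the labels the selector looks for.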
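A ReplicaSet uses the definition in its `selector` to find the Pod objects it owns in the cluster, picking up every Pod whose labels match. The diagram below shows the topology of a ReplicaSet that owns three Pods: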
[Figure: topology of a ReplicaSet that owns three Pods]
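Every Pod owned by a ReplicaSet carries a `metadata.ownerReferences` entry that points to that ReplicaSet and identifies it as the Pod's owner. This reference is used mainly by the cluster's garbage collector to clean up Pod objects that have lost their owner.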
Implementation
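Every create, update, and delete of a ReplicaSet object is handled by the `ReplicaSetController`. The controller uses `Informer`s to watch change events for ReplicaSets and Pods and adds the affected ReplicaSets to the pending queue it holds: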
[Figure: ReplicaSet and Pod change events flowing through Informers into the controller's queue]
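The `queue` in `ReplicaSetController` is essentially a pool of ReplicaSets waiting to be processed, each identified by its key. The Goroutines run by the controller take items off the queue and process them. The figure above shows how an event flows from the moment it occurs to the moment it is handled; next we walk through the common synchronization process of a ReplicaSet.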
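The queue-and-workers pattern can be sketched with a plain Go channel. This is a simplified illustration, not the controller's code: `runWorkers` and `syncedKeys` are hypothetical names, and a buffered channel stands in for the controller's work queue, so this sketch does not deduplicate or retry keys.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// runWorkers starts the given number of goroutines that drain keys from the
// queue and call syncFn for each one. It returns once the queue has been
// closed and every key has been handled.
func runWorkers(queue <-chan string, workers int, syncFn func(key string)) {
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for key := range queue {
				syncFn(key)
			}
		}()
	}
	wg.Wait()
}

// syncedKeys plays the role of the event handlers: it enqueues one key per
// event, lets the workers process them, and returns the handled keys sorted.
func syncedKeys(events []string, workers int) []string {
	queue := make(chan string, len(events))
	for _, key := range events {
		queue <- key
	}
	close(queue)

	var mu sync.Mutex
	var handled []string
	runWorkers(queue, workers, func(key string) {
		mu.Lock()
		defer mu.Unlock()
		handled = append(handled, key)
	})
	sort.Strings(handled) // worker scheduling is nondeterministic, so sort for display
	return handled
}

func main() {
	events := []string{"default/web", "default/api", "kube-system/dns"}
	fmt.Println(syncedKeys(events, 2))
}
```

Passing keys rather than whole objects through the queue means a worker always looks up the current state of the ReplicaSet when it gets around to processing it.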
Synchronization
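The Goroutines started by `ReplicaSetController` take pending tasks off the queue and call `syncReplicaSet` to synchronize each one. Using the `key` passed in, this method looks up the ReplicaSet object (in the upstream controller this read goes through the Informer's local cache rather than directly to etcd) and then collects all active Pods.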
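A minimal sketch of these two steps, under stated assumptions: the `Pod` struct, `splitKey`, `isActive`, and `filterActive` are hypothetical helpers written for this example, the key is assumed to have the form `namespace/name`, and "active" is taken to mean a Pod that has neither finished nor been marked for deletion.

```go
package main

import (
	"fmt"
	"strings"
)

// Pod is a hypothetical, reduced view of a Pod for this sketch.
type Pod struct {
	Name     string
	Phase    string // "Pending", "Running", "Succeeded" or "Failed"
	Deleting bool   // true once the Pod has been marked for deletion
}

// splitKey splits a queue key of the assumed form "namespace/name".
func splitKey(key string) (namespace, name string, err error) {
	parts := strings.Split(key, "/")
	if len(parts) != 2 {
		return "", "", fmt.Errorf("unexpected key format: %q", key)
	}
	return parts[0], parts[1], nil
}

// isActive reports whether a Pod still counts towards the replica total:
// it has not finished and is not being deleted.
func isActive(p Pod) bool {
	return p.Phase != "Succeeded" && p.Phase != "Failed" && !p.Deleting
}

// filterActive returns only the active Pods, preserving their order.
func filterActive(pods []Pod) []Pod {
	var active []Pod
	for _, p := range pods {
		if isActive(p) {
			active = append(active, p)
		}
	}
	return active
}

func main() {
	namespace, name, err := splitKey("default/frontend")
	if err != nil {
		panic(err)
	}
	pods := []Pod{
		{Name: "frontend-a", Phase: "Running"},
		{Name: "frontend-b", Phase: "Failed"},
		{Name: "frontend-c", Phase: "Running", Deleting: true},
		{Name: "frontend-d", Phase: "Pending"},
	}
	fmt.Println(namespace, name, len(filterActive(pods)), "active pods")
}
```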
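`ClaimPods`, which runs next, takes ownership of a set of Pods. If a Pod matches the ReplicaSet's selector, an ownership relation is established; otherwise the controller releases a Pod it currently owns or simply ignores an unrelated one. Ownership is established and released by `AdoptPod` and `ReleasePod`. `AdoptPod` sets the `metadata.ownerReferences` field of the target object.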
`ReleasePod`, in turn, sends JSON data that removes the ReplicaSet's entry from the target Pod's `metadata.ownerReferences`.
Both adopting and releasing are driven by the ReplicaSet's selector: the controller acts differently depending on whether a Pod's labels match.
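The decision described above can be written as a small pure function. This is a sketch under simplifying assumptions: `Pod`, `claim`, and `selectorMatches` are hypothetical names, ownership is reduced to a single owner UID string, and deletion timestamps are left out.

```go
package main

import "fmt"

// Pod is a hypothetical, reduced view of a Pod for this sketch.
type Pod struct {
	Name     string
	Labels   map[string]string
	OwnerUID string // UID of the controlling owner, "" for an orphan
}

// selectorMatches reports whether labels contains every pair in selector.
func selectorMatches(selector, labels map[string]string) bool {
	for key, want := range selector {
		if got, ok := labels[key]; !ok || got != want {
			return false
		}
	}
	return true
}

// claim decides what a ReplicaSet with the given UID and selector does with
// a Pod: "keep", "release", "adopt" or "ignore".
func claim(rsUID string, selector map[string]string, p Pod) string {
	matches := selectorMatches(selector, p.Labels)
	switch {
	case p.OwnerUID == rsUID && matches:
		return "keep" // already owned and still selected
	case p.OwnerUID == rsUID:
		return "release" // owned, but the labels no longer match
	case p.OwnerUID == "" && matches:
		return "adopt" // orphan that the selector picks up
	default:
		return "ignore" // owned by someone else, or unrelated
	}
}

func main() {
	selector := map[string]string{"tier": "frontend"}
	pods := []Pod{
		{Name: "owned", Labels: map[string]string{"tier": "frontend"}, OwnerUID: "rs-1"},
		{Name: "relabeled", Labels: map[string]string{"tier": "debug"}, OwnerUID: "rs-1"},
		{Name: "orphan", Labels: map[string]string{"tier": "frontend"}},
		{Name: "foreign", Labels: map[string]string{"tier": "frontend"}, OwnerUID: "rs-2"},
	}
	for _, p := range pods {
		fmt.Println(p.Name, "->", claim("rs-1", selector, p))
	}
}
```

Note that a matching Pod owned by a different controller is ignored rather than taken over, so two controllers do not fight over the same Pod.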
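Once the existing Pods have been updated, `manageReplicas` checks and updates the replicas owned by the ReplicaSet. If fewer Pods exist than the ReplicaSet expects, it creates new Pods from the template and establishes ownership of them. Creation goes through `slowStartBatch`, which creates Pods in groups so that a persistent error produces fewer failed requests.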
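The batching idea can be sketched as follows. This is a self-contained approximation modeled on the slow-start pattern, not a copy of the upstream function: batches start small and double after each fully successful batch, and the first batch that contains a failure stops the process. `minInt`, `errQuota`, and `failingAfter` are hypothetical helpers added so the example runs on its own.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errQuota = errors.New("quota exceeded (simulated)")

func minInt(a, b int) int {
	if a < b {
		return a
	}
	return b
}

// slowStartBatch calls fn count times in batches of growing size. Calls
// inside a batch run concurrently; the batch size doubles after every batch
// without errors. It returns the number of successful calls and the first
// error seen in the failing batch, if any.
func slowStartBatch(count, initialBatchSize int, fn func() error) (int, error) {
	remaining := count
	successes := 0
	for batch := minInt(remaining, initialBatchSize); batch > 0; batch = minInt(2*batch, remaining) {
		errCh := make(chan error, batch) // buffered so no goroutine blocks
		var wg sync.WaitGroup
		wg.Add(batch)
		for i := 0; i < batch; i++ {
			go func() {
				defer wg.Done()
				if err := fn(); err != nil {
					errCh <- err
				}
			}()
		}
		wg.Wait()
		successes += batch - len(errCh)
		if len(errCh) > 0 {
			return successes, <-errCh
		}
		remaining -= batch
	}
	return successes, nil
}

// failingAfter returns a create function that succeeds n times and then
// fails, plus a function reporting how often it was called.
func failingAfter(n int) (func() error, func() int) {
	var mu sync.Mutex
	total := 0
	fn := func() error {
		mu.Lock()
		defer mu.Unlock()
		total++
		if total > n {
			return errQuota
		}
		return nil
	}
	calls := func() int {
		mu.Lock()
		defer mu.Unlock()
		return total
	}
	return fn, calls
}

func main() {
	create, calls := failingAfter(3)
	created, err := slowStartBatch(10, 1, create)
	fmt.Println("created:", created, "attempts:", calls(), "error:", err)
}
```

With ten Pods requested and an initial batch size of one, the batches are 1, 2, and 4; the third batch fails, so the remaining three Pods are never attempted in this round.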
Deletion, by contrast, runs fully concurrently: the code uses a `WaitGroup` and returns only after every delete task has finished.
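A minimal sketch of that pattern, assuming a caller-supplied `deleteFn`. The names `deleteAll` and `errNotFound` are hypothetical, and unlike the controller this sketch collects every error instead of reporting a single one.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errNotFound = errors.New("pod not found (simulated)")

// deleteAll starts one goroutine per Pod name, waits for all of them with a
// WaitGroup, and returns the errors that occurred.
func deleteAll(names []string, deleteFn func(name string) error) []error {
	errCh := make(chan error, len(names)) // buffered so no goroutine blocks
	var wg sync.WaitGroup
	wg.Add(len(names))
	for _, name := range names {
		go func(target string) {
			defer wg.Done()
			if err := deleteFn(target); err != nil {
				errCh <- err
			}
		}(name)
	}
	wg.Wait() // return only after every delete task has finished
	close(errCh)

	var errs []error
	for err := range errCh {
		errs = append(errs, err)
	}
	return errs
}

func main() {
	errs := deleteAll([]string{"pod-a", "pod-b", "pod-c"}, func(name string) error {
		if name == "pod-b" {
			return errNotFound
		}
		return nil
	})
	fmt.Println("failed deletions:", len(errs))
}
```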
If every Pod has to be deleted, the incoming `filteredPods` is not sorted; otherwise the Pods are sorted along three dimensions:
NotReady < Ready
Unscheduled < Scheduled
Pending < Running
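The three rules above can be expressed as a comparison function. In this sketch the `Pod` struct and the helpers `deletionOrder` and `podsToDelete` are hypothetical, and the precedence among the three rules (scheduling first, then phase, then readiness) is an assumption made for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// Pod is a hypothetical, reduced view of a Pod for this sketch.
type Pod struct {
	Name      string
	Scheduled bool // assigned to a node
	Running   bool // false means still Pending
	Ready     bool
}

// deletionOrder returns a copy of pods sorted so that Pods in earlier
// stages come first: Unscheduled < Scheduled, Pending < Running,
// NotReady < Ready.
func deletionOrder(pods []Pod) []Pod {
	sorted := append([]Pod(nil), pods...)
	sort.SliceStable(sorted, func(i, j int) bool {
		a, b := sorted[i], sorted[j]
		if a.Scheduled != b.Scheduled {
			return !a.Scheduled
		}
		if a.Running != b.Running {
			return !a.Running
		}
		if a.Ready != b.Ready {
			return !a.Ready
		}
		return false
	})
	return sorted
}

// podsToDelete picks diff Pods to remove. Sorting is skipped when every Pod
// has to go anyway.
func podsToDelete(pods []Pod, diff int) []Pod {
	if diff > len(pods) {
		diff = len(pods)
	}
	if diff < len(pods) {
		pods = deletionOrder(pods)
	}
	return pods[:diff]
}

func main() {
	pods := []Pod{
		{Name: "a", Scheduled: true, Running: true, Ready: true},
		{Name: "b"},
		{Name: "c", Scheduled: true},
		{Name: "d", Scheduled: true, Running: true},
	}
	for _, p := range podsToDelete(pods, 2) {
		fmt.Println("delete", p.Name)
	}
}
```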
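Sorting by these rules ensures that Pods in the earlier stages of their lifecycle are deleted first. To summarize: after establishing ownership of the existing Pods, `manageReplicas` compares the number of owned Pods with the desired number and then creates Pods from the template or deletes Pods accordingly.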
That completes the main work of processing a ReplicaSet. The remaining code in `syncReplicaSet` updates the ReplicaSet's status and ends the synchronization.
Deletion
If we delete a Pod owned by a ReplicaSet, the controller resynchronizes the ReplicaSet and starts a new Pod. If we delete the ReplicaSet itself, however, all of its Pods are deleted as well.
Deleting those Pods is not the job of `ReplicaSetController`; it is done by the cluster's garbage collector, the `GarbageCollector`.
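The Kubernetes garbage collector deletes objects that once had an owner but no longer have one. The `metadata.ownerReferences` attribute identifies an object's owner; when the garbage collector finds that the owner has been deleted, it automatically deletes the now-useless dependents. This is why the Pods owned by a ReplicaSet are removed automatically. The garbage collector section covers how it works in detail.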
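The core rule can be illustrated with a few lines of Go. This is only a sketch of the idea, not of the actual `GarbageCollector`: the `Object` struct and the `orphaned` function are hypothetical, and each object is assumed to have at most one owner.

```go
package main

import "fmt"

// Object is a hypothetical, reduced view of a cluster object.
type Object struct {
	UID      string
	Name     string
	OwnerUID string // "" if the object never had an owner
}

// orphaned returns the names of objects whose owner reference points at a
// UID that no longer exists among the given objects.
func orphaned(objects []Object) []string {
	existing := make(map[string]bool, len(objects))
	for _, o := range objects {
		existing[o.UID] = true
	}
	var names []string
	for _, o := range objects {
		if o.OwnerUID != "" && !existing[o.OwnerUID] {
			names = append(names, o.Name)
		}
	}
	return names
}

func main() {
	// The ReplicaSet with UID "rs-1" has been deleted; "rs-2" still exists.
	objects := []Object{
		{UID: "rs-2", Name: "backend"},
		{UID: "p-1", Name: "frontend-a", OwnerUID: "rs-1"},
		{UID: "p-2", Name: "frontend-b", OwnerUID: "rs-1"},
		{UID: "p-3", Name: "backend-a", OwnerUID: "rs-2"},
		{UID: "p-4", Name: "standalone"},
	}
	fmt.Println("to be collected:", orphaned(objects))
}
```

A Pod that never had an owner, like `standalone` here, is left alone; only dependents whose owner has disappeared are collected.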
Summary
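ReplicaSet is not an object engineers often work with directly. The commonly used Deployment builds many complex features, such as rolling updates, on top of ReplicaSet. Although as users we rarely deal with ReplicaSet itself, custom development on Kubernetes may combine ReplicaSet with other objects to implement more complex functionality.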
Reference
This article is reposted from the Draveness technical blog.
Original link: https://draveness.me/kubernetes-replicaset