如何使 Kubernetes 集群自动扩容？Cluster Autoscaler 全面解析( 二 ) _小知识

random：随机选择一个 NodeGroup 。如果未指定，则默认为此策略。

most-pods：选择能够调度最多 Pod 的 NodeGroup，比如有的 Pod 未调度是因为 nodeSelector，此策略会优先选择能满足的 NodeGroup 来保证大多数的 Pod 可以被调度。

least-waste：为避免浪费，此策略会优先选择能满足 Pod 需求资源的最小资源类型的 NodeGroup 。

price：根据 CloudProvider 提供的价格模型，选择最省钱的 NodeGroup 。

priority：通过配置优先级来进行选择，用起来比较麻烦，需要额外的配置，可以看文档(https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/expander/priority/readme.md) 。

如果有需要，也可以平衡相似 NodeGroup 中的 Node 数量，避免 NodeGroup 达到 MaxSize 而导致无法加入新 Node 。通过
--balance-similar-node-groups 选项配置，默认为 false 。
在经过一系列的操作后，最终计算出要扩容的 Node 数量及 NodeGroup，使用 CloudProvider 执行 IncreaseSize 操作，增加云厂商的伸缩组大小，从而完成扩容操作。
文字表达能力不足，如果有不清晰的地方，可以参考下面的 ScaleUP 源码解析。
Scale Down
缩容是一个可选的功能，通过 --scale-down-enabled 选项配置，默认为 true 。
在 Cluster Autoscaler 监控 Node 资源时，如果发现有 Node 满足以下三个条件时，就会标记这个 Node 为 unneeded：

Node 上运行的所有的 Pod 的 Cpu 和内存之和小于该 Node 可分配容量的 50% 。可通过 --scale-down-utilization-threshold 选项改变这个配置。

Node 上所有的 Pod 都可以被调度到其他节点。

Node 没有表示不可缩容的 annotaition 。

如果一个 Node 被标记为 unneeded 超过 10 分钟（可通过
--scale-down-unneeded-time 选项配置），则使用 CloudProvider 执行 DeleteNodes 操作将其删除。一次最多删除一个 unneeded Node，但空 Node 可以批量删除，每次最多删除 10 个（通过 ----max-empty-bulk-delete 选项配置）。
实际上并不是只有这一个判定条件，还会有其他的条件来阻止删除这个 Node，比如 NodeGroup 已达到 MinSize，或在过去的 10 分钟内有过一次 Scale UP 操作（通过
--scale-down-delay-after-add 选项配置）等等，更详细可查看文档(
https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#how-does-scale-down-work) 。
Cluster Autoscaler 的工作机制很复杂，但其中大部分都能通过 flags 进行配置，如果有需要，请详细阅读文档：
https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md
如何实现 CloudProvider如果使用上述中已实现接入的云厂商，只需要通过 --cloud-provider 选项指定来自哪个云厂商就可以，如果想要对接自己的 IaaS 或有特定的业务逻辑，就需要自己实现 CloudProvider Interface 与 NodeGroupInterface 。并将其注册到 builder 中，用于通过 --cloud-provider 参数指定。
builder 在 cloudprovider/builder 中的 builder_all.go (
https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/builder/builder_all.go) 中注册，也可以在其中新建一个自己的 build，通过 go 文件的 +build 编译参数来指定使用的 CloudProvider 。
CloudProvider 接口与 NodeGroup 接口在 cloud_provider.go (
https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/cloud_provider.go) 中定义，其中需要注意的是 Refresh 方法，它会在每一次循环（默认 10 秒）的开始时调用，可在此时请求接口并刷新 NodeGroup 状态，通常的做法是增加一个 manager 用于管理状态。有不理解的部分可参考其他 CloudProvider 的实现。

type CloudProvider interface {	// Name returns name of the cloud provider.	Name() string	// NodeGroups returns all node groups configured for this cloud provider.	// 会在一次循环中多次调用此方法，所以不适合每次都请求云厂商服务，可以在 Refresh 时存储状态	NodeGroups() []NodeGroup	// NodeGroupForNode returns the node group for the given node, nil if the node	// should not be processed by cluster autoscaler, or non-nil error if such	// occurred. Must be implemented.	// 同上	NodeGroupForNode(*apiv1.Node) (NodeGroup, error)	// Pricing returns pricing model for this cloud provider or error if not available.	// Implementation optional.	// 如果不使用 price expander 就可以不实现此方法	Pricing() (PricingModel, errors.AutoscalerError)	// GetAvailablemachineTypes get all machine types that can be requested from the cloud provider.	// Implementation optional.	// 没用，不需要实现	GetAvailableMachineTypes() ([]string, error)	// NewNodeGroup builds a theoretical node group based on the node definition provided. The node group is not automatically	// created on the cloud provider side. The node group is not returned by NodeGroups() until it is created.	// Implementation optional.	// 通常情况下，不需要实现此方法，但如果你需要 ClusterAutoscaler 创建一个默认的 NodeGroup 的话，也可以实现 。// 但其实更好的做法是将默认 NodeGroup 写入云端的伸缩组	NewNodeGroup(machineType string, labels map[string]string, systemLabels map[string]string,taints []apiv1.Taint, extraResources map[string]resource.Quantity) (NodeGroup, error)	// GetResourceLimiter returns struct containing limits (max, min) for resources (cores, memory etc.).	// 资源限制对象，会在 build 时传入，通常情况下不需要更改，除非在云端有显示的提示用户更改的地方，否则使用时会迷惑用户	GetResourceLimiter() (*ResourceLimiter, error)	// GPULabel returns the label added to nodes with GPU resource.	// GPU 相关，如果集群中有使用 GPU 资源，需要返回对应内容 。hack: we assume anything which is not cpu/memory to be a gpu.	GPULabel() string	// GetAvailableGPUTypes return all available GPU types cloud provider supports.	// 同上	GetAvailableGPUTypes() map[string]struct{}	// Cleanup cleans up open resources before the cloud provider is destroyed, i.e. go routines etc.	// CloudProvider 只会在启动时被初始化一次，如果每次循环后有需要清除的内容，在这里处理	Cleanup() error	// Refresh is called before every main loop and can be used to dynamically update cloud provider state.	// In particular the list of node groups returned by NodeGroups can change as a result of CloudProvider.Refresh().	// 会在 StaticAutoscaler RunOnce 中被调用	Refresh() error}// NodeGroup contains configuration info and functions to control a set// of nodes that have the same capacity and set of labels.type NodeGroup interface {	// MaxSize returns maximum size of the node group.	MaxSize() int	// MinSize returns minimum size of the node group.	MinSize() int	// TargetSize returns the current target size of the node group. It is possible that the	// number of nodes in Kubernetes is different at the moment but should be equal	// to Size() once everything stabilizes (new nodes finish startup and registration or	// removed nodes are deleted completely). Implementation required.	// 响应的是伸缩组的节点数，并不一定与 kubernetes 中的节点数保持一致	TargetSize() (int, error)	// IncreaseSize increases the size of the node group. To delete a node you need	// to explicitly name it and use DeleteNode. This function should wait until	// node group size is updated. Implementation required.	// 扩容的方法，增加伸缩组的节点数	IncreaseSize(delta int) error	// DeleteNodes deletes nodes from this node group. Error is returned either on	// failure or if the given node doesn't belong to this node group. This function	// should wait until node group size is updated. Implementation required.	// 删除的节点一定要在该节点组中	DeleteNodes([]*apiv1.Node) error	// DecreaseTargetSize decreases the target size of the node group. This function	// doesn't permit to delete any existing node and can be used only to reduce the	// request for new nodes that have not been yet fulfilled. Delta should be negative.	// It is assumed that cloud provider will not delete the existing nodes when there	// is an option to just decrease the target. Implementation required.	// 当 ClusterAutoscaler 发现 kubernetes 节点数与伸缩组的节点数长时间不一致，会调用此方法来调整	DecreaseTargetSize(delta int) error	// Id returns an unique identifier of the node group.	Id() string	// Debug returns a string containing all information regarding this node group.	Debug() string	// Nodes returns a list of all nodes that belong to this node group.	// It is required that Instance objects returned by this method have Id field set.	// Other fields are optional.	// This list should include also instances that might have not become a kubernetes node yet.	// 返回伸缩组中的所有节点，哪怕它还没有成为 kubernetes 的节点	Nodes() ([]Instance, error)	// TemplateNodeInfo returns a schedulernodeinfo.NodeInfo structure of an empty	// (as if just started) node. This will be used in scale-up simulations to	// predict what would a new node look like if a node group was expanded. The returned	// NodeInfo is expected to have a fully populated Node object, with all of the labels,	// capacity and allocatable information as well as all pods that are started on	// the node by default, using manifest (most likely only kube-proxy). Implementation optional.	// ClusterAutoscaler 会将节点信息与节点组对应，来判断资源条件，如果是一个空的节点组，那么就会通过此方法来虚拟一个节点信息 。TemplateNodeInfo() (*schedulernodeinfo.NodeInfo, error)	// Exist checks if the node group really exists on the cloud provider side. Allows to tell the	// theoretical node group from the real one. Implementation required.	Exist() bool	// Create creates the node group on the cloud provider side. Implementation optional.	// 与 CloudProvider.NewNodeGroup 配合使用	Create() (NodeGroup, error)	// Delete deletes the node group on the cloud provider side.	// This will be executed only for autoprovisioned node groups, once their size drops to 0.	// Implementation optional.	Delete() error	// Autoprovisioned returns true if the node group is autoprovisioned. An autoprovisioned group	// was created by CA and can be deleted when scaled to 0.	Autoprovisioned() bool}
上一页
1
2
3
4
下一页
		  	





























推荐阅读

           
                  
              
                  基金|建发股份：拟出资3037.5万元认购安科基金 
                
                   
                
              
            

                  
              
                  樱桃奶球|盘点五部鬼片可以当作搞笑片看的电影 
                
                   
                
              
            

                  
              
                  c罗|第79分钟，球王C罗射丢第10脚攻门，亲手送别大奖，留下无奈苦笑 
                
                   
                
              
            

                  
              
                  封面新闻|30秒｜韩星金贤重抢救昏迷者，事后回应“只是做了该做的事” 
                
                   
                
              
            

                  
              
                  对于六安瓜片，中国工程院院士有话说 
                
                   
                
              
            

                  
              
                  「中国特色社会主义」北京市发布全国文化中心建设未来15年规划 
                
                   
                
              
            

                  
              
                  怎样把皮肤变细腻白嫩 
                
                   
                
              
            

                  
              
                  挖贝网|营业成本同比减少，华电国际2020年上半年净利23.86亿增长43.49% 
                
                   
                
              
            

                  
              
                  法律博士试点!对美国法学博士的了解 
                
                   
                
              
            

                  
              
                  借款连累朋友，很难过 
                
                   
                
              
            

                  
              
                  春季谨防感冒发生 糖尿病患者巧用“香菜”可治病 
                
                   
                
              
            

                  
              
                  乐而雅零触感卫生巾安全吗 
                
                   
                
              
            

                  
              
                  怎样才能快速祛斑美白？ 快速美白祛斑 
                
                   
                
              
            

                  
              
                  酷玩四方|我的世界：最安全的火柴盒在高处？4个极具创意的MC建筑灵感 
                
                   
                
              
            

                  
              
                  音乐响起|一开就是大半年，沾土就活，新手也能养的好！，聪明人都养3种花 
                
                   
                
              
            

                  
              
                  安徽未来将有大发展的城市，不是芜湖和马鞍山，是你家乡吗 
                
                   
                
              
            

                  
              
                  国家经济|印度前国安顾问：中国远未被孤立 
                
                   
                
              
            

                  
              
                  怪咖搞笑|幽默笑话：几天前 因工作失误被老板训了半小时 
                
                   
                
              
            

                  
              
                  居民|儿童青少年超重肥胖问题凸显 
                
                   
                
              
            

                  
              
                  浙报融媒体|8个镇迎来发展契机，“萧山南花园”即将被串联 
                
                   
                
              
            

          

在Linux系统上安装和使用dig和nslookup命令 

Linux下如何通过挂载iso文件安装文件 

淘宝店如何获得流量 淘宝店铺老店新开怎么获取流量 

淘宝如何做到月销几千 淘宝新店怎么做基础销量 

如何让淘宝店铺排名靠前 淘宝网店排名提升 

茶器的实用性与美感,关于紫砂壶的实用性使用功能 

砚台收藏如何辨别真假 

淘宝怎么经营 淘宝开网店如何运营 

淘宝开店优势 淘宝店铺如何优化 

茶毫会影响茶叶的品质,如何通过茶叶的条索