Autoscaling

Horizontal Pod Autoscaling (HPA) 可以根据 CPU 使用率或应用自定义 metrics 自动扩展 Pod 数量（支持 replication controller、deployment 和 replica set ）。

控制管理器每隔 15s（可以通过 --horizontal-pod-autoscaler-sync-period 修改）查询 metrics 的资源使用情况
支持三种 metrics 类型
- 预定义 metrics（比如 Pod 的 CPU）以利用率的方式计算
- 自定义的 Pod metrics，以原始值（raw value）的方式计算
- 自定义的 object metrics
支持两种 metrics 查询方式：Heapster 和自定义的 REST API
支持多 metrics

注意：

本章是关于 Pod 的自动扩展，而 Node 的自动扩展请参考 Cluster AutoScaler。
在使用 HPA 之前需要 确保已部署好 metrics-server。

API 版本对照表

Kubernetes 版本	autoscaling API 版本	支持的 metrics
v1.5+	autoscaling/v1	CPU
v1.6+	autoscaling/v2beta1	Memory及自定义

示例

# 创建 pod 和 service
$ kubectl run php-apache --image=k8s.gcr.io/hpa-example --requests=cpu=200m --expose --port=80
service "php-apache" created
deployment "php-apache" created

# 创建 autoscaler
$ kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
deployment "php-apache" autoscaled

$ kubectl get hpa
NAME         REFERENCE                     TARGET    MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache/scale   0% / 50%  1         10        1          18s

# 增加负载
$ kubectl run -i --tty load-generator --image=busybox /bin/sh
Hit enter for command prompt
$ while true; do wget -q -O- http://php-apache.default.svc.cluster.local; done

# 过一会就可以看到负载升高了
$ kubectl get hpa
NAME         REFERENCE                     TARGET      CURRENT   MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache/scale   305% / 50%  305%      1         10        1          3m

# autoscaler 将这个 deployment 扩展为 7 个 pod
$ kubectl get deployment php-apache
NAME         DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
php-apache   7         7         7            7           19m

# 删除刚才创建的负载增加 pod 后会发现负载降低，并且 pod 数量也自动降回 1 个
$ kubectl get hpa
NAME         REFERENCE                     TARGET       MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache/scale   0% / 50%     1         10        1          11m

$ kubectl get deployment php-apache
NAME         DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
php-apache   1         1         1            1           27m

自定义 metrics

使用方法

控制管理器开启 --horizontal-pod-autoscaler-use-rest-clients
控制管理器配置的 --master 或者 --kubeconfig
在 API Server Aggregator 中注册自定义的 metrics API，如 https://github.com/kubernetes-incubator/custom-metrics-apiserver 和 https://github.com/kubernetes/metrics

注：可以参考 k8s.io/metics 开发自定义的 metrics API server。

比如 HorizontalPodAutoscaler 保证每个 Pod 占用 50% CPU、1000pps 以及 10000 请求 / s：

HPA 示例

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 50
  - type: Pods
    pods:
      metricName: packets-per-second
      targetAverageValue: 1k
  - type: Object
    object:
      metricName: requests-per-second
      target:
        apiVersion: extensions/v1beta1
        kind: Ingress
        name: main-route
      targetValue: 10k
status:
  observedGeneration: 1
  lastScaleTime: <some-time>
  currentReplicas: 1
  desiredReplicas: 1
  currentMetrics:
  - type: Resource
    resource:
      name: cpu
      currentAverageUtilization: 0
      currentAverageValue: 0

状态条件

v1.7+ 可以在客户端中看到 Kubernetes 为 HorizontalPodAutoscaler 设置的状态条件 status.conditions，用来判断 HorizontalPodAutoscaler 是否可以扩展（AbleToScale）、是否开启扩展（ScalingActive）以及是否受到限制（ScalingLimitted）。

$ kubectl describe hpa cm-test
Name:                           cm-test
Namespace:                      prom
Labels:                         <none>
Annotations:                    <none>
CreationTimestamp:              Fri, 16 Jun 2017 18:09:22 +0000
Reference:                      ReplicationController/cm-test
Metrics:                        (current / target)
  "http_requests" on pods:      66m / 500m
Min replicas:                   1
Max replicas:                   4
ReplicationController pods:     1 current / 1 desired
Conditions:
  Type                  Status  Reason                  Message
  ----                  ------  ------                  -------
  AbleToScale           True    ReadyForNewScale        the last scale time was sufficiently old as to warrant a new scale
  ScalingActive         True    ValidMetricFound        the HPA was able to successfully calculate a replica count from pods metric http_requests
  ScalingLimited        False   DesiredWithinRange      the desired replica count is within the acceptable range
Events:

HPA 最佳实践

为容器配置 CPU Requests
HPA 目标设置恰当，如设置 70% 给容器和应用预留 30% 的余量
保持 Pods 和 Nodes 健康（避免 Pod 频繁重建）
保证用户请求的负载均衡
使用 kubectl top node 和 kubectl top pod 查看资源使用情况

参考文档

Ensure High Availability and Uptime With Kubernetes Horizontal Pod Autoscaler and Prometheus

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

autoscaling.md

autoscaling.md

Autoscaling

API 版本对照表

示例

自定义 metrics

状态条件

HPA 最佳实践

参考文档

Files

autoscaling.md

Latest commit

History

autoscaling.md

File metadata and controls

Autoscaling

API 版本对照表

示例

自定义 metrics

状态条件

HPA 最佳实践

参考文档