With Horizontal Pod Autoscaling, Kubernetes automatically scales the number of pods in a replication controller, deployment, or replica set based on observed CPU utilization (or, with alpha support, on some other application-provided metrics).
The HorizontalPodAutoscaler `autoscaling/v2` stable API moved to GA in 1.23. The previous stable version, which only includes support for CPU autoscaling, can be found in the `autoscaling/v1` API version. The beta version, which includes support for scaling on memory and custom metrics, can be found in `autoscaling/v2beta1` in 1.8 - 1.24 (and `autoscaling/v2beta2` in 1.12 - 1.25).
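For illustration, a minimal HorizontalPodAutoscaler manifest using the GA `autoscaling/v2` API might look like the sketch below; the Deployment name and the 80% CPU target are placeholder values, not something kOps generates for you.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app            # hypothetical HPA name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app          # hypothetical Deployment to scale
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80   # add replicas when average CPU utilization exceeds 80%
```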
kOps sets up HPA out of the box. Relevant reading to go through:
- [Extending the Kubernetes API with the aggregation layer](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/apiserver-aggregation/)
- [Configure The Aggregation Layer](https://kubernetes.io/docs/tasks/extend-kubernetes/configure-aggregation-layer/)
- [Horizontal Pod Autoscaling](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/)
While the above links go into details on how Kubernetes needs to be configured to work with HPA, the work is already done for you by kOps. Specifically:
- Enable the Aggregation Layer via the following kube-apiserver flags:
    - `--requestheader-client-ca-file=<path to aggregator CA cert>`
    - `--requestheader-allowed-names=aggregator`
    - `--requestheader-extra-headers-prefix=X-Remote-Extra-`
    - `--requestheader-group-headers=X-Remote-Group`
    - `--requestheader-username-headers=X-Remote-User`
    - `--proxy-client-cert-file=<path to aggregator proxy cert>`
    - `--proxy-client-key-file=<path to aggregator proxy key>`
- Enable Horizontal Pod Scaling by setting the appropriate flags for `kube-controller-manager`:
    - `--kubeconfig <path-to-kubeconfig>`
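For reference, aggregated APIs such as the resource metrics API are exposed to the kube-apiserver through APIService objects. The sketch below shows roughly what the metrics API registration looks like; metrics-server's own manifests create and manage this object, so it is illustrative only.

```yaml
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.metrics.k8s.io
spec:
  group: metrics.k8s.io
  version: v1beta1
  service:
    name: metrics-server       # the in-cluster Service backing the aggregated API
    namespace: kube-system
  insecureSkipTLSVerify: true  # typical for default metrics-server deployments
  groupPriorityMinimum: 100
  versionPriority: 100
```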
To enable the resource metrics API for scaling on CPU and memory, enable metrics-server by setting `spec.metricsServer.enabled=true` in the Cluster spec (a minimal example follows the compatibility matrix below). The compatibility matrix is as follows:
| Metrics Server | Metrics API group/version | Supported Kubernetes version |
| -------------- | ------------------------- | ---------------------------- |
| 0.3.x          | metrics.k8s.io/v1beta1    | 1.8+                         |
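As referenced above, a minimal sketch of the relevant Cluster spec fragment; the cluster name is a placeholder and other required Cluster fields are omitted for brevity:

```yaml
apiVersion: kops.k8s.io/v1alpha2
kind: Cluster
metadata:
  name: example.k8s.local   # hypothetical cluster name
spec:
  metricsServer:
    enabled: true            # have kOps deploy metrics-server for the resource metrics API
```

After editing the spec, running `kops update cluster` (with `--yes` to apply) rolls out the change.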
To enable the custom metrics API, register it via the API aggregation layer. If you're using Prometheus, check out the custom metrics adapter for Prometheus.
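Once an adapter exposes custom metrics through the aggregation layer, an HPA can target them. A hedged sketch, assuming the adapter publishes a per-pod metric named `http_requests_per_second` (the name depends entirely on your adapter configuration):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 15
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second   # hypothetical metric exposed by the adapter
      target:
        type: AverageValue
        averageValue: "100"              # aim for ~100 requests/s per pod on average
```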