KEP-2702: Graduate HPA v2beta2 API to GA

Graduate v2beta2 Autoscaling API to GA

Summary
Motivation
- Goals
- Non-Goals
Proposal
- Implementation Details
- Risks and Mitigations
Design Details
Production Readiness Review Questionnaire
Implementation History
Drawbacks
Alternatives

Summary

This document outlines the steps required to graduate the autoscaling v2beta2 API to GA.

Motivation

The HPA v2 series APIs were first introduced in November, 2016 (5 years ago). The primary feature of the v2 series is adding support for multiple and custom metrics. The structure was improved slightly in the v2beta2 API which became available in May 2018 and has remained largely unchanged since then. The v2beta2 API has been used extensively and informally treated as stable. The motivation for this KEP is to push it over the line to make it formally so.

Goals

Promote all of v2beta2 to stable
HPA behavior and container resource targets have E2E tests in order to meet stable requirements.
Deprecate v2beta2 as soon as v2 stable is landed
Deprecate v2beta1 immediately
Container Resource Targets
- Rename Resource to PodResource to match new ContainerResource target
Behavior
- Rename behavior select policy values from Min to MinChange and Max to MaxChange

Non-Goals

Promoting the scale-to-zero feature is not part of this effort. It requires additional work to deprecate the special meaning of replicas=0 on the scaling subresource (disable autoscaling). Progress has been made, but it is not part of the HPA v2 stable effort since APIs cannot introduce breaking changes.

Proposal

Implementation Details

Risks and Mitigations

v1-v2 conversion loss of multiple CPU targets
v2beta1 carries a significant amount of boilerplate and overhead in maintaining conversion routines for multiple public APIs.

Design Details

Renames

Rename Min and Max to MinChange and MaxChange, respectively, in v2 stable to eliminate confusion with the SelectPolicy enumeration
Rename the value Disabled to ScalingDisabled for better understanding
For container resource targets, rename Resource to PodResource to match the new ContainerResource target
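The select policy renames above amount to a plain value mapping. A minimal sketch in Go, assuming the names land exactly as proposed here; `toStableSelectPolicy` is a hypothetical helper for illustration and not part of any Kubernetes package:

```go
package main

import "fmt"

// toStableSelectPolicy maps a v2beta2 select policy value to the name
// proposed for v2 stable in this document. Hypothetical helper.
func toStableSelectPolicy(v2beta2Value string) (string, error) {
	switch v2beta2Value {
	case "Min":
		return "MinChange", nil
	case "Max":
		return "MaxChange", nil
	case "Disabled":
		return "ScalingDisabled", nil
	default:
		return "", fmt.Errorf("unknown select policy %q", v2beta2Value)
	}
}

func main() {
	for _, v := range []string{"Min", "Max", "Disabled"} {
		renamed, err := toStableSelectPolicy(v)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", v, renamed)
	}
}
```

Returning an error for unknown values, rather than passing them through, keeps a typo in an old manifest from silently becoming a new policy name.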

Test Plan

Add e2e tests for HPA behavior
The KEP test plan includes unit tests
Add e2e tests for container resource targets
Add conformance tests

Graduation Criteria

The following code changes must be made for graduating to GA

Move API objects to v2 and support conversion internally
Add behavior and container target E2E tests.

Version Skew Strategy

Upgrade/Downgrade Strategy

All HPA API versions to date convert forward and backward without loss by serializing all unsupported fields to annotations. HPA v2 stable will behave the same way, verified by unit tests.
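The annotation-based round trip can be sketched as follows. This is a simplified illustration, not the real conversion code: the types `hpaV1`, `hpaV2` and `metricSpec` are stand-ins, and the annotation key `example.invalid/metrics` is a hypothetical placeholder chosen for this sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// metricSpec is a simplified stand-in for a v2 metric target.
type metricSpec struct {
	Type   string `json:"type"`
	Name   string `json:"name"`
	Target int32  `json:"target"`
}

// hpaV2 is a simplified stand-in for the v2 view of an HPA.
type hpaV2 struct{ Metrics []metricSpec }

// hpaV1 is a simplified stand-in for the v1 view: one CPU target plus annotations.
type hpaV1 struct {
	TargetCPUUtilization *int32
	Annotations          map[string]string
}

// metricsAnnotation is a hypothetical key used only in this sketch.
const metricsAnnotation = "example.invalid/metrics"

// toV1 keeps the first CPU target in the native v1 field and serializes
// every other metric into an annotation so nothing is dropped.
func toV1(in hpaV2) (hpaV1, error) {
	out := hpaV1{}
	var rest []metricSpec
	for _, m := range in.Metrics {
		if out.TargetCPUUtilization == nil && m.Type == "Resource" && m.Name == "cpu" {
			t := m.Target
			out.TargetCPUUtilization = &t
			continue
		}
		rest = append(rest, m)
	}
	if len(rest) > 0 {
		raw, err := json.Marshal(rest)
		if err != nil {
			return hpaV1{}, err
		}
		out.Annotations = map[string]string{metricsAnnotation: string(raw)}
	}
	return out, nil
}

// toV2 rebuilds the metric list: the v1 CPU target first, then the
// metrics recovered from the annotation.
func toV2(in hpaV1) (hpaV2, error) {
	out := hpaV2{}
	if in.TargetCPUUtilization != nil {
		out.Metrics = append(out.Metrics, metricSpec{Type: "Resource", Name: "cpu", Target: *in.TargetCPUUtilization})
	}
	if raw, ok := in.Annotations[metricsAnnotation]; ok {
		var rest []metricSpec
		if err := json.Unmarshal([]byte(raw), &rest); err != nil {
			return hpaV2{}, err
		}
		out.Metrics = append(out.Metrics, rest...)
	}
	return out, nil
}

// roundTrips reports whether v2 -> v1 -> v2 reproduces the input exactly.
func roundTrips(in hpaV2) bool {
	stored, err := toV1(in)
	if err != nil {
		return false
	}
	restored, err := toV2(stored)
	return err == nil && reflect.DeepEqual(in, restored)
}

func main() {
	original := hpaV2{Metrics: []metricSpec{
		{Type: "Resource", Name: "cpu", Target: 80},
		{Type: "Pods", Name: "requests_per_second", Target: 100},
	}}
	fmt.Println("lossless:", roundTrips(original))
}
```

In this sketch the metric order survives only when the CPU target is listed first, because the v1 field is always restored ahead of the annotated metrics.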

Production Readiness Review Questionnaire

Requirements for migration

All HPA objects are losslessly converted between API versions, which are just views of the data on disk. Neither the deprecation of v2beta1 nor the addition of v2 requires any changes or conversion on the server side. Objects will continue to be stored on disk in v1 format as always, with new v2 fields serialized to annotations.

However any HPA objects stored in the user’s code repository (all your YAML files) must stop using the v2beta1 format. You should migrate all your HPA objects to the v2 format. See the types.go files or just run kubectl get hpa.v2.autoscaling -oyaml to see your objects in the v2 format.

Feature Enablement and Rollback

N/A

Does enabling the feature change any default behavior?

Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?

The feature can be enabled by adding autoscaling/v2 to the --runtime-config flag: https://github.com/kubernetes/kubernetes/blob/ea0764452222146c47ec826977f49d7001b0ea8c/staging/src/k8s.io/apiserver/pkg/server/options/api_enablement.go#L45

Adding api/all will also include autoscaling/v2.

The feature can be disabled by removing the --runtime-config entry.
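How these entries combine can be sketched with a small model in Go. This is a simplified illustration and not the API server's parsing code: it assumes a bare key means `key=true`, that an explicit group/version entry takes precedence over `api/all`, and that the caller supplies the default. The function name `groupVersionEnabled` is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// groupVersionEnabled reports whether groupVersion would be served for a
// given --runtime-config value under this sketch's simplified rules:
// an explicit entry wins, then api/all, then the supplied default.
func groupVersionEnabled(runtimeConfig, groupVersion string, defaultOn bool) bool {
	settings := map[string]bool{}
	for _, entry := range strings.Split(runtimeConfig, ",") {
		entry = strings.TrimSpace(entry)
		if entry == "" {
			continue
		}
		key, value, hasValue := strings.Cut(entry, "=")
		// Assumption: a bare key is treated like "key=true".
		settings[key] = !hasValue || value == "true"
	}
	if on, ok := settings[groupVersion]; ok {
		return on
	}
	if on, ok := settings["api/all"]; ok {
		return on
	}
	return defaultOn
}

func main() {
	for _, cfg := range []string{"autoscaling/v2=true", "api/all=true", ""} {
		fmt.Printf("%q -> %v\n", cfg, groupVersionEnabled(cfg, "autoscaling/v2", false))
	}
}
```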

What happens if we reenable the feature if it was previously rolled back?

Are there any tests for feature enablement/disablement?

Rollout, Upgrade and Rollback Planning

How can a rollout or rollback fail? Can it impact already running workloads?

What specific metrics should inform a rollback?

Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?

Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?

Monitoring Requirements

The HPA requires the metrics.k8s.io APIs to be available in the cluster to operate. This API is served by the Metrics Server. An operator can verify the Metrics Server is available to provide resource metrics to the HPA by running the command kubectl get apiservices and looking for the status of v1beta1.metrics.k8s.io (version subject to change). Operators should take care to make sure Metrics Server is up and running to maintain resource autoscaling.

The v2 HPA also requires the custom.metrics.k8s.io and external.metrics.k8s.io APIs to retrieve custom and external metrics. There is no default implementation of these APIs, and cluster operators must install an “adapter” for their metrics backend (e.g. Prometheus).

An operator can verify the adapter is working properly by running the same kubectl get apiservices command and looking for the v1beta1.custom.metrics.k8s.io and v1beta1.external.metrics.k8s.io APIs (usually served by the same adapter). Care should be taken to ensure the adapter and the specific metrics backend are available to maintain custom metric autoscaling.

How can an operator determine if the feature is in use by workloads?

All HPA objects are stored in v1 format on disk. They are up-converted to the requested version and down-converted upon update. The document on how to run HPA includes background, algorithm details, and some good operator notes.

How can someone using this feature know that it is working for their instance?

[ x ] Events
- Event Reason: The event type Normal, reason SuccessfulRescale, note New size: N; reason: FOO indicates autoscaling is operating normally. Abnormal events of type Warning include reasons such as FailedRescale and FailedComputeMetricsReplicas and will include details about the error in the note.
[ x ] API .status
- Condition name: There are three condition types which indicate the operating status of the HPA. They are ScalingActive, AbleToScale and ScalingLimited (see type comments). Under normal operating circumstances ScalingActive and AbleToScale should have status true, indicating the HPA is successfully reconciling the scale. ScalingLimited indicates user configuration is limiting the “ideal” scale with a minimum, maximum, rate or delay. Which limit is the cause will be indicated in the message. It’s normal for this to be true or false periodically.
- Other field:
[ x ] Other (treat as last resort)
- Details: The HPA status includes the current observed metric values, one for each given target. Using these values an operator can verify the HPA is maintaining the desired target for the dominant metric. The operator can also see the number of pods the HPA observed under status.currentReplicas and the most recent recommendation under status.desiredReplicas. The latest observed generation is echoed back in status so an operator can verify the HPA is keeping up-to-date with configuration changes.
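The condition check described above can be sketched in Go. This is a minimal illustration with a local `condition` type rather than the real API types, and `notTrue` is a hypothetical helper. The required condition names are passed in by the caller; the example uses `ScalingActive`, which is how the autoscaling API types spell the first condition.

```go
package main

import "fmt"

// condition is a simplified stand-in for an HPA status condition.
type condition struct {
	Type    string
	Status  string // "True", "False" or "Unknown"
	Message string
}

// notTrue returns the required condition types that are missing or whose
// status is anything other than "True".
func notTrue(conds []condition, required []string) []string {
	status := map[string]string{}
	for _, c := range conds {
		status[c.Type] = c.Status
	}
	var problems []string
	for _, t := range required {
		if status[t] != "True" {
			problems = append(problems, t)
		}
	}
	return problems
}

func main() {
	conds := []condition{
		{Type: "AbleToScale", Status: "True"},
		{Type: "ScalingActive", Status: "False", Message: "metrics unavailable"},
		// ScalingLimited may flip between True and False in normal
		// operation, so it is deliberately not in the required list.
		{Type: "ScalingLimited", Status: "True", Message: "at maximum replicas"},
	}
	fmt.Println("needs attention:", notTrue(conds, []string{"ScalingActive", "AbleToScale"}))
}
```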

What are the reasonable SLOs (Service Level Objectives) for the enhancement?

Are there any missing metrics that would be useful to have to improve observability of this feature?

Dependencies

Does this feature depend on any specific services running in the cluster?

The HPA requires the metrics.k8s.io APIs to be available in the cluster to operate. This API is served by the Metrics Server; without Metrics Server, autoscaling on resource metrics will not work. Without a custom metrics adapter and the backing metric store running, custom and external metrics will not work. If there are multiple metrics defined and one is not available, scale up will continue but scale down will not (for safety).

Scalability

The HPA v2 APIs allow users to configure multiple metrics, each with a separate target. A recommendation is calculated for each metric and the largest recommendation is used. The more metrics are added to a given HPA, the longer it will take to reconcile. The HPA is single-threaded and processes recommendations one at a time. The default reconciliation period is 15 seconds. If there is too much work to do, reconciliation will slow down and happen less frequently than every 15 seconds. This will cause autoscaling to be less responsive at high scale.
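The "largest recommendation wins" rule can be sketched in Go. This is a simplified illustration, not the controller's code: it uses the ratio rule from the HPA documentation, ceil(currentReplicas × current / target), and leaves out tolerance, readiness handling and min/max bounds. The type `metricReading` and both function names are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// metricReading is a simplified stand-in for one observed metric and its target.
type metricReading struct {
	Name    string
	Current float64
	Target  float64
}

// recommend applies the ratio rule: ceil(currentReplicas * current / target).
func recommend(currentReplicas int32, m metricReading) int32 {
	return int32(math.Ceil(float64(currentReplicas) * m.Current / m.Target))
}

// desiredReplicas returns the largest per-metric recommendation and the
// name of the metric that produced it. With no metrics it returns 0 and "".
func desiredReplicas(currentReplicas int32, metrics []metricReading) (int32, string) {
	best, dominant := int32(0), ""
	for _, m := range metrics {
		if r := recommend(currentReplicas, m); r > best {
			best, dominant = r, m.Name
		}
	}
	return best, dominant
}

func main() {
	metrics := []metricReading{
		{Name: "cpu", Current: 90, Target: 60},                   // 4 * 90/60   = 6
		{Name: "requests_per_second", Current: 250, Target: 100}, // 4 * 250/100 = 10
	}
	replicas, dominant := desiredReplicas(4, metrics)
	fmt.Printf("desired=%d dominant=%s\n", replicas, dominant)
}
```

Each metric adds one pass through the loop, which is the linear cost per metric that this section describes.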

Previously v1 scaled along two dimensions: the number of HPAs and the number of pods selected by each HPA (linearly). Now it will scale with the number of metrics defined in HPAs and the number of pods selected by each metric (linearly).

Additionally, v2 adds a behavior structure which allows the user to configure the rate and delay of scaling up and down. Enforcing these constraints requires storing previous recommendations and scaling events in memory. The longer the configured interval, the more memory is used. The maximum window allowed is 60 minutes (code), so with the default 15-second reconciliation period at most 240 recommendations / events are kept per configured metric. Each recommendation is an int32 and a time.Time. Each scaling event is an int32, a time.Time and a bool (code), so the memory footprint is relatively small. It will scale linearly with the number of metrics defined and the size of the HPA’s configured window.
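The 240-entry figure and a rough per-metric footprint can be worked out with a short Go sketch. The struct and field names below are hypothetical stand-ins that only mirror the field types listed above, and the byte counts are shallow sizes on whatever platform runs the code.

```go
package main

import (
	"fmt"
	"time"
	"unsafe"
)

// recommendation mirrors the described shape: an int32 and a time.Time.
type recommendation struct {
	replicas  int32
	timestamp time.Time
}

// scaleEvent mirrors the described shape: an int32, a time.Time and a bool.
type scaleEvent struct {
	replicaChange int32
	timestamp     time.Time
	outdated      bool
}

const (
	maxWindow       = 60 * time.Minute // maximum window stated in the text
	reconcilePeriod = 15 * time.Second // default reconciliation period
)

// entriesPerWindow returns how many reconciliations fit into one window.
func entriesPerWindow(window, period time.Duration) int {
	return int(window / period)
}

func main() {
	n := entriesPerWindow(maxWindow, reconcilePeriod)
	recBytes := n * int(unsafe.Sizeof(recommendation{}))
	evBytes := n * int(unsafe.Sizeof(scaleEvent{}))
	fmt.Printf("entries per metric: %d\n", n)
	fmt.Printf("shallow bytes per metric: %d (recommendations) + %d (events)\n", recBytes, evBytes)
}
```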

Will enabling / using this feature result in any new API calls?

No, not in comparison to using the existing v2beta2 APIs, but of course using HPA results in new API calls as described above.

Will enabling / using this feature result in introducing new API types?

Yes. It will introduce the new autoscaling/v2 API types.

Will enabling / using this feature result in any new calls to the cloud provider?

Configuring custom metrics (the difference from v1 to v2) will result in API calls to the installed custom metrics adapter and the backing metrics store (which might be hosted in the cloud provider). These calls will happen every 15 seconds for each configured metric. Targets of type Value will retrieve a single metric. Targets of type AverageValue will retrieve a metric for each pod.

Will enabling / using this feature result in increasing size or count of the existing API objects?

No. Data on disk remains as-is, in v1 format.

Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?

Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, …) in any components?

Troubleshooting

How does this feature react if the API server and/or etcd is unavailable?

If the API server or etcd is not available, the HPA will not reconcile the scale subresource to the target metrics. This feature also depends on APIs served not from etcd but by Metrics Server and custom metrics adapters; the Monitoring Requirements section describes how to keep them available. When one of several metrics is unavailable (e.g. a custom metric alongside a resource metric), the HPA will continue to scale up if another metric indicates it should. However, it will not scale down, in case the unavailable metric would have prevented the scale down. Both behaviors are for safety.
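The scale-up-but-not-down rule for partially available metrics can be sketched in Go. This is a simplified model of the behavior described above, not the controller's code; the function `decide` and its parameters are hypothetical.

```go
package main

import "fmt"

// decide picks the replica count to apply. available holds the
// recommendations from metrics that could be read; unavailable is the
// number of configured metrics that could not be read.
func decide(current int32, available []int32, unavailable int) int32 {
	if len(available) == 0 {
		return current // nothing to act on, hold the current scale
	}
	best := available[0]
	for _, r := range available[1:] {
		if r > best {
			best = r
		}
	}
	if unavailable > 0 && best < current {
		// A missing metric might have asked for more replicas,
		// so a scale down is not safe.
		return current
	}
	return best
}

func main() {
	fmt.Println(decide(5, []int32{8}, 1)) // scale up still happens
	fmt.Println(decide(5, []int32{3}, 1)) // scale down is held back
	fmt.Println(decide(5, []int32{3}, 0)) // all metrics present, scale down allowed
}
```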

What are other known failure modes?

Implementation History

HPA v1
- HPA v1 proposal merged on Aug 13, 2015.
  - Design
- Graduated to beta on Oct 15, 2015
- Graduated to stable on Feb 2, 2016 as v1
HPA v2
- HPA v2 addition on Nov 2, 2016 for v2alpha1
  - Design
- Graduated to beta on Aug 15, 2017 as v2beta1
- Released second beta version v2beta2 on May 21, 2018
  - Design
Scale-to-zero
- scale-to-zero addition to external metrics on Jul 16, 2019 as an alpha feature
HPA Controls
- HPA behavior controls addition on Dec 11, 2019 for v2beta2 API
Container Resource Targets
- Proposed on Mar 30, 2020