KEP-4633: Only allow anonymous auth for configured endpoints

Release Signoff Checklist
Summary
Motivation
- Goals
- Non-Goals
Proposal
Design Details
Production Readiness Review Questionnaire
Implementation History
Drawbacks
Alternatives
Infrastructure Needed (Optional)
Possible Future Improvements

Release Signoff Checklist

Items marked with (R) are required prior to targeting to a milestone / release.

(R) Enhancement issue in release milestone, which links to KEP dir in kubernetes/enhancements (not the initial KEP PR)
(R) KEP approvers have approved the KEP status as implementable
(R) Design details are appropriately documented
(R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
- e2e Tests for all Beta API Operations (endpoints)
- (R) Ensure GA e2e tests meet requirements for Conformance Tests
- (R) Minimum Two Week Window for GA e2e tests to prove flake free
(R) Graduation criteria is in place
- (R) all GA Endpoints must be hit by Conformance Tests
(R) Production readiness review completed
(R) Production readiness review approved
“Implementation History” section is up-to-date for milestone
User-facing documentation has been created in kubernetes/website , for publication to kubernetes.io
Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes

Summary

By default, requests to the kube-apiserver that are not rejected by configured authentication methods are treated as anonymous requests, and given a username of system:anonymous and a group of system:unauthenticated.

This behavior can be toggled on or off using the --anonymous-auth boolean flag. By default, --anonymous-auth is set to true.

We propose that kubernetes should allow users to configure which endpoints can be accessed anonymously and disable anonymous auth for all other endpoints.

Motivation

Many kubernetes users still misconfigure their cluster by creating RoleBindings and ClusterRoleBindings to powerful rules in their cluster. This KubeCon talk covers a real world example of a production kubernetes cluster where a ClusterRoleBinding was created that bound the cluster-admin ClusterRole to the system:anonymous user, allowing for full cluster takeover.

One mitigation would be to disable anonymous authentication by setting the kube-apiserver flag --anonymous-auth to false, but this is not possible for many deployments that depend on unauthenticated requests (from clients like load balancers or the kubelet) to check the health endpoints of a kubernetes cluster (healthz, livez and readyz). To allow these health checks, a cluster admin has to enable anonymous requests, which opens the door to misconfigurations.

Goals

Add a way to disable anonymous authentication for all endpoints except a set of configured endpoints.

Non-Goals

Disable anonymous authentication for all endpoints.
Change kubernetes default behavior around anonymous authentication.

Proposal

We propose that anonymous-auth have 3 states:

Disabled: Authentication fails for any anonymous requests
Enabled: Authentication succeeds for anonymous requests
Enabled for certain endpoints: Authentication succeeds for anonymous requests only for the configured endpoints.

User Stories (Optional)

Story 1

As a security conscious cluster admin I want to disable anonymous authentication but still allow unauthenticated access to cluster health endpoints so that I don’t have to reconfigure external services like load balancers. This means that even if a RoleBinding or ClusterRoleBinding is added by a user that targets system:anonymous or system:unauthenticated then access to that resource/endpoint would not be possible.

Story 2

kubeadm requires anonymous access to the cluster-info ConfigMap in the kube-public namespace during cluster bootstrapping. As a security conscious kubeadm user I would like to configure just /api/v1/namespaces/kube-public/configmaps/cluster-info to be accessed anonymously.

Notes/Constraints/Caveats (Optional)

Risks and Mitigations

N/A

Design Details

We will update the kube-apiserver AuthenticationConfiguration with a field that allows a user to configure anonymous authentication as shown below.

type AuthenticationConfiguration struct {
     JWT []JWTAuthenticator `json:"jwt"`
+
+    // If present --anonymous-auth must not be set.
+    Anonymous *AnonymousConfig `json:"anonymous,omitempty"`
}
+
+type AnonymousConfig struct {
+    Enabled bool `json:"enabled"`
+
+    // If set, anonymous auth is only allowed for requests whose path exactly matches one of the entries.
+    // This can only be set when enabled is true.
+    Conditions []AnonymousAuthCondition `json:"conditions,omitempty"`
+}
+
+type AnonymousAuthCondition struct {
+    // Path for which anonymous auth is allowed.
+    Path string `json:"path"`
+}

Using the structure described above a user will be able to do the following:

Disable Anonymous Auth.

apiVersion: apiserver.config.k8s.io/v1alpha1
kind: AuthenticationConfiguration
anonymous:
  enabled: false

Note: This is the same as setting --anonymous-auth flag to false.

Enable Anonymous Auth.

apiVersion: apiserver.config.k8s.io/v1alpha1
kind: AuthenticationConfiguration
anonymous:
  enabled: true

Note: This is the same as setting --anonymous-auth flag to true.

Allow Anonymous Auth for certain endpoints only.
```
apiVersion: apiserver.config.k8s.io/v1alpha1
kind: AuthenticationConfiguration
anonymous:
  enabled: true
  conditions:
  - path: "/healthz"
  - path: "/readyz"
  - path: "/livez"
```
Note: The path must be an exact, case-sensitive match. We do not intend to support globbing of paths, to keep the surface area here as small as possible. Globbing wasn’t required for the use cases presented so far, and the intent of this feature is to constrain anonymous auth to a well-known set of endpoints.
Note: We expect anyone using this feature to have a small number (1-3) of entries that are very explicit about the paths anonymous auth can be used for. Users who want to allow anonymous access to a wider or more complicated set of endpoints should lean on authorization policy.
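
The exact-match rule in the notes above can be illustrated with a minimal sketch. The helper name anonymousAllowed is hypothetical and this is not the kube-apiserver implementation; it only assumes that the request path is compared byte-for-byte against each configured entry and that an empty list means no path restriction.

```go
package main

import "fmt"

// AnonymousAuthCondition mirrors the proposed config type: one exact path.
type AnonymousAuthCondition struct {
	Path string
}

// anonymousAllowed is a hypothetical helper for illustration only.
// It reports whether an otherwise unauthenticated request to path may be
// treated as system:anonymous under the given conditions.
func anonymousAllowed(path string, conditions []AnonymousAuthCondition) bool {
	// No conditions configured: anonymous auth is not restricted by path.
	if len(conditions) == 0 {
		return true
	}
	for _, c := range conditions {
		// Exact, case-sensitive comparison; no globbing or prefix matching.
		if path == c.Path {
			return true
		}
	}
	return false
}

func main() {
	conditions := []AnonymousAuthCondition{
		{Path: "/healthz"},
		{Path: "/readyz"},
		{Path: "/livez"},
	}
	for _, p := range []string{"/healthz", "/Healthz", "/healthz/etcd", "/api/v1/pods"} {
		fmt.Printf("%-14s anonymous allowed: %v\n", p, anonymousAllowed(p, conditions))
	}
}
```

Under these assumptions only /healthz is accepted from the sample paths: /Healthz differs in case and /healthz/etcd is a sub-path, so neither is an exact match.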

A user will be able to set either --anonymous-auth or the Anonymous field in the AuthenticationConfiguration, but not both. If neither --anonymous-auth nor the Anonymous field in the AuthenticationConfiguration is set, then the kubernetes default behavior of anonymous auth being enabled will be observed.
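
As a rough sketch of how the effective setting could be resolved from these two sources, the hypothetical helper effectiveAnonymous below assumes that validation has already rejected the case where both the flag and the config field are set. The field names follow the proposed types; everything else is illustrative.

```go
package main

import "fmt"

// AnonymousAuthCondition and AnonymousConfig mirror the proposed config types.
type AnonymousAuthCondition struct {
	Path string
}

type AnonymousConfig struct {
	Enabled    bool
	Conditions []AnonymousAuthCondition
}

// effectiveAnonymous is a hypothetical helper for illustration only.
// flag is nil when --anonymous-auth was not passed on the command line;
// cfg is nil when the config file has no anonymous field.
// It assumes the "both set" case was rejected earlier.
func effectiveAnonymous(flag *bool, cfg *AnonymousConfig) AnonymousConfig {
	switch {
	case cfg != nil:
		return *cfg
	case flag != nil:
		return AnonymousConfig{Enabled: *flag}
	default:
		// Neither source is set: keep the default of anonymous auth enabled.
		return AnonymousConfig{Enabled: true}
	}
}

func main() {
	off := false
	fmt.Println("neither set, enabled:", effectiveAnonymous(nil, nil).Enabled)
	fmt.Println("flag=false, enabled:", effectiveAnonymous(&off, nil).Enabled)

	cfg := &AnonymousConfig{
		Enabled:    true,
		Conditions: []AnonymousAuthCondition{{Path: "/livez"}},
	}
	got := effectiveAnonymous(nil, cfg)
	fmt.Println("config set, enabled:", got.Enabled, "paths:", len(got.Conditions))
}
```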

We will gate the ability for a user to configure anonymous auth using the AuthenticationConfiguration behind a feature gate called AnonymousAuthConfigurableEndpoints.

When a user configures AuthenticationConfiguration.Anonymous the following behavior should be observed:

If AuthenticationConfiguration.Anonymous is non-nil and AnonymousAuthConfigurableEndpoints is not set to true then kube-apiserver should fail to start with an appropriate error guiding the user to enable the feature gate.
If AuthenticationConfiguration.Anonymous is non-nil and --anonymous-auth flag is set then kube-apiserver should fail to start with an appropriate error guiding the user to either use --anonymous-auth or use AuthenticationConfiguration.Anonymous.
If AuthenticationConfiguration.Anonymous.Enabled is false but AuthenticationConfiguration.Anonymous.Conditions is not empty then kube-apiserver should fail to start with an appropriate error guiding the user to set AuthenticationConfiguration.Anonymous.Enabled to true.
If AuthenticationConfiguration.Anonymous.Enabled is true but AuthenticationConfiguration.Anonymous.Conditions is empty then anonymous requests should be able to authenticate for any path.
If AuthenticationConfiguration.Anonymous.Enabled is true and AuthenticationConfiguration.Anonymous.Conditions is not empty then anonymous requests should be able to authenticate only for the paths specified in AuthenticationConfiguration.Anonymous.Conditions.
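
The startup checks listed above could be sketched roughly as follows. The function validateAnonymous, its parameters, and the error strings are hypothetical and are not taken from the kube-apiserver source; the sketch only encodes the three failure cases and the two accepted cases described in the list.

```go
package main

import (
	"errors"
	"fmt"
)

// AnonymousAuthCondition and AnonymousConfig mirror the proposed config types.
type AnonymousAuthCondition struct {
	Path string
}

type AnonymousConfig struct {
	Enabled    bool
	Conditions []AnonymousAuthCondition
}

// validateAnonymous is a hypothetical helper for illustration only.
// anonymous is the stanza from the config file (nil if absent), flagSet
// reports whether --anonymous-auth was passed explicitly, and gateEnabled
// reports whether AnonymousAuthConfigurableEndpoints is on.
func validateAnonymous(anonymous *AnonymousConfig, flagSet, gateEnabled bool) error {
	if anonymous == nil {
		// Nothing configured in the file; the flag (or the default) applies.
		return nil
	}
	if !gateEnabled {
		return errors.New("anonymous is set: enable the AnonymousAuthConfigurableEndpoints feature gate")
	}
	if flagSet {
		return errors.New("use either --anonymous-auth or AuthenticationConfiguration.Anonymous, not both")
	}
	if !anonymous.Enabled && len(anonymous.Conditions) > 0 {
		return errors.New("anonymous.conditions requires anonymous.enabled to be true")
	}
	// Enabled with no conditions: any path. Enabled with conditions: only those paths.
	return nil
}

func main() {
	restricted := &AnonymousConfig{
		Enabled:    true,
		Conditions: []AnonymousAuthCondition{{Path: "/healthz"}},
	}
	fmt.Println("gate off:       ", validateAnonymous(restricted, false, false))
	fmt.Println("flag also set:  ", validateAnonymous(restricted, true, true))
	fmt.Println("valid config:   ", validateAnonymous(restricted, false, true))
}
```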

Note: Today the authentication config file is dynamically reloaded when the jwt field is updated. However, for the proposed anonymous field we plan not to support dynamic reloading. This behavior is consistent with built-in authorizers (like Node, RBAC etc.) that are also not reloaded during the dynamic reloading of the authorization config file. To make this clear to users we will update the relevant documentation for authentication config.

Test Plan

[X] I/we understand the owners of the involved components may require updates to existing tests to make this code solid enough prior to committing the changes necessary to implement this enhancement.

Prerequisite testing updates

None.

Unit tests

We will add unit tests for the following scenarios:

Validation of the authentication configuration.
Making sure that the flag and the config are mutually exclusive.
Behavior of the path restricted anonymous authenticator.

Unit tests were added to the following:

pkg/kubeapiserver/options/authentication_test.go
staging/src/k8s.io/apiserver/pkg/authentication/request/anonymous/anonymous_test.go

Integration tests

We will add integration tests that exercise each of the following config file based authentication scenarios:

anonymous auth disabled in the auth-config file.
anonymous auth enabled and unrestricted in the auth-config file.
anonymous auth enabled and restricted to certain paths in the auth-config file.

The following integration tests were added:

test/integration/apiserver/anonymous/anonymous_test.go

e2e tests

We believe that all scenarios will be sufficiently covered by the unit and integration tests so we will not need any additional e2e tests.

Graduation Criteria

Alpha

Feature implemented behind a feature flag
Full unit and integration test coverage

Beta

Feature gate set to true by default

GA

Examples of real-world usage
- GKE and AWS are using this feature to limit anonymous access to /healthz, /readyz and /livez endpoints.

Upgrade / Downgrade Strategy

When the feature-gate is enabled none of the defaults or current settings regarding anonymous auth are changed. The feature-gate enables the ability for users to set the anonymous field using the AuthenticationConfiguration file.

Version Skew Strategy

This feature only impacts kube-apiserver and does not introduce any changes that would be impacted by version skews. All changes are local to kube-apiserver and are controlled by the AuthenticationConfiguration file passed to kube-apiserver as a parameter.

Production Readiness Review Questionnaire

Feature Enablement and Rollback

How can this feature be enabled / disabled in a live cluster?

Feature gate (also fill in values in kep.yaml)
- Feature gate name: AnonymousAuthConfigurableEndpoints
- Components depending on the feature gate: kube-apiserver
Other
- Describe the mechanism:
- Will enabling / disabling the feature require downtime of the control plane?
- Will enabling / disabling the feature require downtime or reprovisioning of a node?

Does enabling the feature change any default behavior?

Enabling the feature gate does not change the default behavior unless the user also changes the value of --anonymous-auth flag or updates the AuthenticationConfiguration.

Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?

Yes.

What happens if we reenable the feature if it was previously rolled back?

Nothing, unless the user also changes the value of --anonymous-auth flag or updates the AuthenticationConfiguration.

Are there any tests for feature enablement/disablement?

Yes, we will add the following tests:

When AnonymousAuthConfigurableEndpoints feature gate is disabled:
- If AuthenticationConfiguration contains the Anonymous stanza then kube-apiserver should fail to start with an appropriate error guiding the user to enable the feature gate
- Users should be able to set --anonymous-auth to false
- Users should be able to set --anonymous-auth to true
When AnonymousAuthConfigurableEndpoints feature gate is enabled:
- If AuthenticationConfiguration contains the Anonymous stanza then --anonymous-auth flag cannot be set
- If AuthenticationConfiguration does not contain the Anonymous stanza then --anonymous-auth flag can be set to true
- If AuthenticationConfiguration does not contain the Anonymous stanza then --anonymous-auth flag can be set to false

Rollout, Upgrade and Rollback Planning

How can a rollout or rollback fail? Can it impact already running workloads?

Enabling the feature flag alone does not change kube-apiserver defaults. However if different API servers have different AuthenticationConfiguration for Anonymous then some requests that would be denied by one API server could be allowed by another.

What specific metrics should inform a rollback?

kube-apiserver fails to start when the AuthenticationConfiguration file has the anonymous field set.

Audit logs indicate that endpoints other than the ones configured in the anonymous.conditions field of the AuthenticationConfiguration file are reachable by anonymous users.

Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?

N/A

Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?

Monitoring Requirements

How can an operator determine if the feature is in use by workloads?

N/A

How can someone using this feature know that it is working for their instance?

If a user provides an AuthenticationConfiguration file that sets anonymous.enabled to true and sets anonymous.conditions to allow only certain endpoints, then they can check that the feature is working by:

making an anonymous request to an endpoint that is not in the list of endpoints they allowed. Such a request should fail with HTTP status code 401.
making an anonymous request to an endpoint that is in the list of endpoints they allowed. Such a request should either succeed with HTTP status code 200 (if authz is configured to allow access to that endpoint) or fail with HTTP status code 403 (if authz is not configured to allow access to that endpoint).
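
The two checks above can be sketched as a small probe. The helpers statusMatches, probe, and newStandInServer are hypothetical names, and the stand-in server only imitates the expected responses so that the example runs on its own. A real check would target the kube-apiserver address and would need TLS configuration for the cluster CA, which is omitted here.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// statusMatches reports whether the status of an anonymous request is
// consistent with the expected behavior: 401 for a path that is not allowed,
// and 200 or 403 (depending on authorization) for a path that is allowed.
func statusMatches(status int, pathAllowed bool) bool {
	if !pathAllowed {
		return status == http.StatusUnauthorized
	}
	return status == http.StatusOK || status == http.StatusForbidden
}

// probe sends a request without credentials and returns the HTTP status code.
func probe(baseURL, path string) (int, error) {
	resp, err := http.Get(baseURL + path)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

// newStandInServer imitates an API server for illustration only: it answers
// 200 on the allowed paths and 401 everywhere else.
func newStandInServer(allowed ...string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		for _, p := range allowed {
			if r.URL.Path == p {
				w.WriteHeader(http.StatusOK)
				return
			}
		}
		w.WriteHeader(http.StatusUnauthorized)
	}))
}

func main() {
	srv := newStandInServer("/healthz")
	defer srv.Close()

	checks := []struct {
		path    string
		allowed bool
	}{
		{"/healthz", true},
		{"/api/v1/pods", false},
	}
	for _, c := range checks {
		status, err := probe(srv.URL, c.path)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s -> %d, consistent with config: %v\n", c.path, status, statusMatches(status, c.allowed))
	}
}
```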

What are the reasonable SLOs (Service Level Objectives) for the enhancement?

SLOs for actual requests should not change in any way compared to the flag-based Anonymous configuration.

What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?

Metrics
- Metric name:
- [Optional] Aggregation method:
- Components exposing the metric:
Other (treat as last resort)
- Details:

N/A

Are there any missing metrics that would be useful to have to improve observability of this feature?

N/A

Dependencies

Does this feature depend on any specific services running in the cluster?

No.

Scalability

Will enabling / using this feature result in any new API calls?

No.

Will enabling / using this feature result in introducing new API types?

No.

Will enabling / using this feature result in any new calls to the cloud provider?

No.

Will enabling / using this feature result in increasing size or count of the existing API objects?

No.

Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?

No.

Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, …) in any components?

No.

Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?

No.

Troubleshooting

How does this feature react if the API server and/or etcd is unavailable?

This feature is about how the API server handles authentication for anonymous requests. If the API server is unavailable then this feature is also unavailable.

What are other known failure modes?

What steps should be taken if SLOs are not being met to determine the problem?

After observing an issue (e.g. uptick in denied authentication requests), kube-apiserver logs from the authenticator may be used to debug.

Additionally, manually attempting to exercise the affected codepaths would surface information that’d aid debugging. For example, attempting to issue an anonymous request to an endpoint that is allowed or disallowed based on the constraints set in the anonymous config in the AuthenticationConfiguration file.

Implementation History

2024-05-13 - KEP introduced
2024-06-07 - KEP Accepted as implementable
2024-06-27 - Alpha implementation merged https://github.com/kubernetes/kubernetes/pull/124917
2024-07-15 - Integration tests merged https://github.com/kubernetes/kubernetes/pull/125967
2024-08-13 - First release (1.31) when feature available
2024-08-16 - Targeting beta in 1.32

Drawbacks

Alternatives

A sidecar proxy could have handled this but would push complexity into all consumers who are not running sidecar proxies today, and the complexity of allowing the restriction in-tree is minimal.
A deny authorizer could be added that does this but we think this approach is better for the following reasons:
- a deny authorizer is a lot more complex to implement than restricting the anonymous authenticator
- having two decoupled subsystems where a later phase is responsible for locking down over-granted requests from the first phase is not ideal. We already have this with authz/admission, and we don’t want to repeat that pattern if we don’t have to.

Infrastructure Needed (Optional)

Possible Future Improvements

We decided not to apply any restrictions here to anonymous userInfo that comes back after all authenticators and impersonation have run because we think that the scope of this KEP is to provide cluster admins with a way to restrict actual anonymous requests. A request that was considered authenticated and was permitted to impersonate system:anonymous is not actually anonymous.

If we want to give cluster admins the ability to add such restrictions, we think it's better to give them the capability to configure webhook authenticators and add userValidationRules capabilities. But doing so would expand the scope of this KEP and it should likely be a separate effort.