KEP-2862: Fine-grained Kubelet API Authorization

Implementation History
STABLE Implementable
Created 2024-07-11
Latest v1.36
Milestones
Alpha v1.32
Beta v1.33
Stable v1.36
Ownership
Owning SIG
SIG Node
Participating SIGs
Primary Authors

Release Signoff Checklist

Items marked with (R) are required prior to targeting to a milestone / release.

  • (R) Enhancement issue in release milestone, which links to KEP dir in kubernetes/enhancements (not the initial KEP PR)
  • (R) KEP approvers have approved the KEP status as implementable
  • (R) Design details are appropriately documented
  • (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
    • e2e Tests for all Beta API Operations (endpoints)
    • (R) Ensure GA e2e tests meet requirements for Conformance Tests
    • (R) Minimum Two Week Window for GA e2e tests to prove flake free
  • (R) Graduation criteria is in place
  • (R) Production readiness review completed
  • (R) Production readiness review approved
  • “Implementation History” section is up-to-date for milestone
  • User-facing documentation has been created in kubernetes/website, for publication to kubernetes.io
  • Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes

Summary

We propose a change to the Kubelet API authorization to provide more fine-grained control so that logging and monitoring agents that interact with the Kubelet directly can do so while adhering to the least privilege principle. Currently, the Kubelet API authorization uses a coarse authorization scheme, where actions like reading health status and the ability to exec into a pod require the same RBAC permissions.

We also propose documenting the kubelet API endpoints for which we are adding fine-grained authorization, as these were previously undocumented.

Motivation

Historically, the /healthz and /pods endpoints were available on the unauthenticated kubelet read-only port (10255). That port is now disabled by default, and enabling it is considered a poor security practice. As a result, many monitoring and logging agents have switched to the authenticated kubelet port (10250). However, the Kubelet API on the authenticated port currently uses a very coarse scheme for authorizing requests. For example, reading /healthz and calling /exec/… (i.e. executing arbitrary code) both require the same proxy subresource. So if an application needs to, say, list pods on the node, as many monitoring and logging agents do, it must be granted the proxy subresource in RBAC. Doing so also grants the agent many other powerful permissions, which an attacker could exploit to escalate privilege.

Goals

  • Introduce new authorization subresources for the /configz, /healthz and /pods/ kubelet endpoints, allowing for more granular authorization.
  • Officially document the /configz, /healthz and /pods/ endpoints.
  • These changes should be backwards compatible.
  • These changes should not break users on upgrade.

Non-Goals

  • Create a new Kubelet API.
  • Remove existing authorization subresources.
  • Make the kubelet API node-restricted.

Proposal

Add new authorization subresources for the configz, healthz and pods endpoints while continuing to support the old coarse-grained proxy authorization subresource, as shown in the table below.

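| Request path      | Existing Subresource | Proposed Subresource(s) |
|-------------------|----------------------|-------------------------|
| /configz          | proxy                | proxy, configz          |
| /healthz          | proxy                | proxy, healthz          |
| /healthz/log      | proxy                | proxy, healthz          |
| /healthz/ping     | proxy                | proxy, healthz          |
| /healthz/syncloop | proxy                | proxy, healthz          |
| /pods/            | proxy                | proxy, pods             |
| /runningpods/     | proxy                | proxy, pods             |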

This way users who were previously using the proxy authorization subresource to grant access to the /healthz kubelet endpoint can update their ClusterRole or Role by replacing the proxy subresource with the healthz subresource.
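As an illustration of that replacement, a ClusterRole limited to the health endpoints might look like the sketch below. The role name kubelet-healthz-reader is hypothetical, and the rule is only honored by kubelets that have the fine-grained authorization feature enabled.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kubelet-healthz-reader   # hypothetical name, for illustration only
rules:
- apiGroups: [""]
  # Previously this entry had to be "nodes/proxy", which also covers
  # endpoints such as /exec/ and /run/.
  resources: ["nodes/healthz"]
  verbs: ["get"]
```

A subject bound to this role can read /healthz and its sub-paths but is not authorized for any other kubelet endpoint.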

User Stories (Optional)

Story 1

As a security-conscious owner of a node monitoring agent, I want the agent to list pods on the node through the kubelet API without being granted the proxy authorization subresource, so that I can follow least privilege and grant only the exact permissions the agent needs, such as pods or healthz.

Notes

If this change were to be implemented as proposed, the proxy authorization subresource would still cover /attach/, /exec/, /run/ and /debug/. Readers might wonder why we did not break these permissions out into their own authorization subresources.

We did not break out every endpoint under proxy into its own subresource because we could not identify a case where, once one of these permissions is granted, also holding any of the others would make things meaningfully worse. If a caller already has /exec/, then also having /run/ or /attach/ gives an attacker nothing more.

We picked the /configz, /healthz and /pods endpoints because they are read-only.

Risks and Mitigations

Since we are adding a second SubjectAccessReview request the latency for some requests to the Kubelet API will increase. However, Kubelet has a SubjectAccessReview response cache, so subsequent requests should result in cache hits.

The SubjectAccessReview QPS will also increase but this will also be mitigated by the cache.

Design Details

To determine if the caller has the required permissions for a particular request made to the Kubelet API, the Kubelet creates a SubjectAccessReview request to kube-apiserver. The SubjectAccessReviews are currently populated as follows:

apiVersion: authorization.k8s.io/v1
kind: SubjectAccessReview
spec:
  user: user1
  uid: 64167384-10aa-4361-bcef-526ab51d1e4d
  groups:
  - groups1
  resourceAttributes:
    group: ""
    version: "v1"
    resource: "nodes"
    namespace: ""
    name: "node-1"
    subresource: "proxy"
    verb: "get"

The following information is passed into the SubjectAccessReview request:

  • Requesting user (generally from a TokenReviewRequest or client certificate authentication)
  • Request verb
  • A resource request with:
    • APIGroup: ""
    • APIVersion: "v1"
    • Resource: "nodes"
    • Name: nodeName (the name of the node the request was against)
    • Subresource: which, before this KEP, can be one of: proxy, log, metrics, spec, stats

The subresource is determined by the path of the Kubelet API request. kube-apiserver upon receiving this request will check if the user in the SubjectAccessReview is authorized to access the relevant resource and subresource. For example, if a cluster is using RBAC then this ClusterRole and ClusterRoleBinding might allow access:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: nodes-proxy
rules:
- apiGroups: [""]
  resources: ["nodes/proxy"]
  verbs: ["get", "create"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: nodes-proxy-global
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: user1
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: nodes-proxy

We will add a feature-gate called KubeletFineGrainedAuthz that will default to false in alpha. When this feature-gate is set to true, Kubelet will first send a SubjectAccessReview specifically for the configz, healthz or pods subresource, based on the path of the request made to the Kubelet. If that request is not authorized, Kubelet will retry with the coarse-grained subresource (proxy).
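The two-step check can be pictured as the pair of SubjectAccessReview specs below, here for a GET request to /healthz on node-1. This is only a sketch: it reuses the illustrative user and node from the earlier example and assumes the HTTP method is expressed as the lowercase API verb get, matching the RBAC rules.

```yaml
# First attempt: fine-grained subresource derived from the request path.
apiVersion: authorization.k8s.io/v1
kind: SubjectAccessReview
spec:
  user: user1
  resourceAttributes:
    group: ""
    version: "v1"
    resource: "nodes"
    name: "node-1"
    subresource: "healthz"
    verb: "get"
---
# Fallback, sent only if the first review is not allowed:
# the existing coarse-grained proxy subresource.
apiVersion: authorization.k8s.io/v1
kind: SubjectAccessReview
spec:
  user: user1
  resourceAttributes:
    group: ""
    version: "v1"
    resource: "nodes"
    name: "node-1"
    subresource: "proxy"
    verb: "get"
```

The request is authorized if either review is allowed, which is what keeps existing nodes/proxy grants working.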

When kube-apiserver communicates with the Kubelet in a cluster with RBAC enabled, its user is bound to the system:kubelet-api-admin ClusterRole. For a cluster with the KubeletFineGrainedAuthz feature gate enabled, the system:kubelet-api-admin ClusterRole could be changed to look like this (added lines are marked with +):

  apiVersion: rbac.authorization.k8s.io/v1
  kind: ClusterRole
  metadata:
    annotations:
      rbac.authorization.kubernetes.io/autoupdate: "true"
    labels:
      kubernetes.io/bootstrapping: rbac-defaults
    name: system:kubelet-api-admin
  rules:
  - apiGroups:
    - ""
    resources:
    - nodes
    verbs:
    - get
    - list
    - watch
  - apiGroups:
    - ""
    resources:
    - nodes
    verbs:
    - proxy

  - apiGroups:
    - ""
    resources:
    - nodes/log
    - nodes/metrics
    - nodes/proxy
    - nodes/stats
+   - nodes/healthz
+   - nodes/pods
+   - nodes/configz
    verbs:
    - '*'

Note: Kubelet uses the DelegatingAuthorizerConfig which already implements a cache for allowed and denied requests. We will rely on this cache to prevent sending duplicate requests.

Note: We thought about making this behavior controlled via the KubeletConfiguration by adding a field called mode to the authorization.webhook field of type KubeletWebhookAuthorization. But we could not find a reasonable use case in which someone would not want fine-grained authorization. We also did not want to support more than one behavior when webhook authorization is enabled for the Kubelet API.

Test Plan

[x] I/we understand the owners of the involved components may require updates to existing tests to make this code solid enough prior to committing the changes necessary to implement this enhancement.

Prerequisite testing updates
Unit tests
  • The following unit tests will be added to nodeAuthorizerAttributesGetter
    • When feature gate is disabled for /configz, /healthz and /pods/ only 1 attribute with proxy subresource should be returned
    • When feature gate is enabled for /configz, /healthz and /pods/ 2 attributes should be returned and the first should use subresource configz, healthz or pods and the second should use the proxy subresource.
Integration tests

We’ll add the following integration tests:

  • check that when the feature-gate is enabled a request by a user with nodes/proxy permission to kubelet /configz endpoint authorizes successfully
  • check that when the feature-gate is enabled a request by a user with nodes/configz permission to kubelet /configz endpoint authorizes successfully
  • check that when the feature-gate is enabled a request by a user with nodes/proxy permission to kubelet /healthz endpoint authorizes successfully
  • check that when the feature-gate is enabled a request by a user with nodes/healthz permission to kubelet /healthz endpoint authorizes successfully
  • check that when the feature-gate is enabled a request by a user with nodes/proxy permission to kubelet /pods/ endpoint authorizes successfully
  • check that when the feature-gate is enabled a request by a user with nodes/pods permission to kubelet /pods/ endpoint authorizes successfully
  • check that when the feature-gate is disabled a request by a user with nodes/proxy permission to kubelet /configz endpoint authorizes successfully
  • check that when the feature-gate is disabled a request by a user with nodes/configz permission to kubelet /configz endpoint fails authorization
  • check that when the feature-gate is disabled a request by a user with nodes/proxy permission to kubelet /healthz endpoint authorizes successfully
  • check that when the feature-gate is disabled a request by a user with nodes/healthz permission to kubelet /healthz endpoint fails authorization
  • check that when the feature-gate is disabled a request by a user with nodes/proxy permission to kubelet /pods/ endpoint authorizes successfully
  • check that when the feature-gate is disabled a request by a user with nodes/pods permission to kubelet /pods/ endpoint fails authorization
e2e tests

Unit tests and integration tests should sufficiently cover these changes without having to introduce new or update existing e2e tests.

Graduation Criteria

Alpha

  • Feature implemented behind a feature flag
  • Initial unit and integration tests completed and enabled

Beta

  • Feature gate set to true by default
  • e2e tests added

GA

  • Examples of real-world usage

Upgrade / Downgrade Strategy

If workloads are using coarse-grained (nodes/proxy) permissions:

| Scenario | Result |
|----------|--------|
| Upgrade both kubelet and kube-apiserver so that feature gate is enabled in both. | workloads and kube-apiserver are able to reach kubelet |
| Upgrade only kubelet to enable the feature-gate | workloads and kube-apiserver are able to reach kubelet |
| Upgrade only kube-apiserver to enable the feature-gate | workloads and kube-apiserver are able to reach kubelet |
| Rollback both kubelet and kube-apiserver so that feature gate is disabled in both. | workloads and kube-apiserver are able to reach kubelet |
| Rollback only kubelet to disable the feature-gate | workloads and kube-apiserver are able to reach kubelet |
| Rollback only kube-apiserver to disable the feature-gate | workloads and kube-apiserver are able to reach kubelet |

If workloads are using fine-grained permissions.

| Scenario | Result |
|----------|--------|
| Upgrade both kubelet and kube-apiserver so that feature gate is enabled in both. | workloads and kube-apiserver are able to reach kubelet |
| Upgrade only kubelet to enable the feature-gate | workloads and kube-apiserver are able to reach kubelet |
| Upgrade only kube-apiserver to enable the feature-gate | workloads won’t be able to reach kubelet unless they revert to using coarse-grained permissions but kube-apiserver is able to reach kubelet |
| Rollback both kubelet and kube-apiserver so that feature gate is disabled in both. | workloads won’t be authorized by kubelet unless coarse-grained permissions are also granted, but kube-apiserver is able to reach kubelet |
| Rollback only kubelet to disable the feature-gate | workloads won’t be authorized by kubelet unless coarse-grained permissions are also granted, but kube-apiserver is able to reach kubelet |
| Rollback only kube-apiserver to disable the feature-gate | workloads and kube-apiserver are able to reach kubelet |

Version Skew Strategy

If webhook authorization is not enabled for the kubelet, nothing changes. If webhook authorization is enabled for the kubelet API, older (n-3) kubelets that do not support this feature will check only the coarse-grained subresource, so workload authors need to make sure that they also grant the coarse-grained permission to workloads that communicate with these kubelets. Workloads that already hold the coarse-grained permission, rather than only the new fine-grained permissions, need no changes to communicate with older kubelets.
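For a cluster that temporarily mixes kubelets with and without the feature, one possible arrangement is a role that carries both kinds of permission until every node is upgraded. The sketch below uses a hypothetical role name and assumes the workload only needs the pods and healthz endpoints.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: node-agent-mixed-versions   # hypothetical name, for illustration only
rules:
- apiGroups: [""]
  resources:
  - nodes/pods      # honored by kubelets with KubeletFineGrainedAuthz enabled
  - nodes/healthz   # honored by kubelets with KubeletFineGrainedAuthz enabled
  - nodes/proxy     # still required by older kubelets; drop once all nodes are upgraded
  verbs: ["get"]
```

While nodes/proxy remains in the role, the workload keeps the broad access this KEP aims to avoid, so the entry is best treated as transitional.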

Production Readiness Review Questionnaire

Feature Enablement and Rollback

How can this feature be enabled / disabled in a live cluster?
  • Feature gate (also fill in values in kep.yaml)
    • Feature gate name: KubeletFineGrainedAuthz
    • Components depending on the feature gate:
      • kubelet
      • kube-apiserver
  • Other
    • Describe the mechanism:
    • Will enabling / disabling the feature require downtime of the control plane?
    • Will enabling / disabling the feature require downtime or reprovisioning of a node?
Does enabling the feature change any default behavior?

While there will be no change to user-facing behavior, as we will still send a SubjectAccessReview with the proxy authorization subresource, we will be sending an extra SubjectAccessReview request.

Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?

Yes, but a workload that is only authorized to use the new authorization subresource will lose access and will need to update its RBAC Role to use nodes/proxy.

Having the feature-gate enabled in kubelet and disabled in kube-apiserver, or vice versa, will not impact kube-apiserver’s ability to talk to the kubelet API. This is because, whether the feature-gate is enabled or disabled, kube-apiserver will always have nodes/proxy permissions in its RBAC. So either the first or the second SubjectAccessReview request will authorize kube-apiserver.

What happens if we reenable the feature if it was previously rolled back?

If the kubelet feature-gate is re-enabled then kubelet will again start sending 2 SubjectAccessReview requests.

If the kube-apiserver feature-gate is re-enabled then the ClusterRole system:kubelet-api-admin will be updated as described in the Design Details section.

Readers might wonder whether the order in which the feature-gate is disabled matters. It does not, because in every state kube-apiserver will always have nodes/proxy permissions in its RBAC.

Are there any tests for feature enablement/disablement?

Yes.

Rollout, Upgrade and Rollback Planning

How can a rollout or rollback fail? Can it impact already running workloads?

We have designed a fallback mechanism that prevents failed rollouts or rollbacks from impacting an already running workload’s ability to interact with the kubelet API.

Please see the Design Details section for more information.

What specific metrics should inform a rollback?

Increase in failed requests to kubelet API from workloads.

Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?

We have tested the following upgrade scenarios manually:

| Scenario | Result |
|----------|--------|
| Upgrade both kubelet and kube-apiserver so that feature gate is enabled in both. | workloads and kube-apiserver are able to reach kubelet |
| Upgrade only kubelet to enable the feature-gate | workloads and kube-apiserver are able to reach kubelet |
| Upgrade only kube-apiserver to enable the feature-gate | workloads and kube-apiserver are able to reach kubelet |

We have tested the following rollback scenarios manually:

| Scenario | Result |
|----------|--------|
| Rollback both kubelet and kube-apiserver so that feature gate is disabled in both. | workloads and kube-apiserver are able to reach kubelet |
| Rollback only kubelet to disable the feature-gate | workloads and kube-apiserver are able to reach kubelet |
| Rollback only kube-apiserver to disable the feature-gate | workloads and kube-apiserver are able to reach kubelet |
Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?

No.

Monitoring Requirements

How can an operator determine if the feature is in use by workloads?

Users can check if this feature is enabled in kube-apiserver by running the following command:

kubectl get --raw /metrics | grep kubernetes_feature_enabled | grep KubeletFineGrainedAuthz

Users can check if this feature is enabled in the kubelet by running the following command in a pod that is running on the node:

If readonly port is enabled:

curl http://$MY_NODE_IP:10255/metrics | grep kubernetes_feature_enabled | grep KubeletFineGrainedAuthz

If readonly port is not enabled:

curl -k -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" https://$MY_NODE_IP:10250/metrics | grep kubernetes_feature_enabled | grep KubeletFineGrainedAuthz

NOTE: for port 10250 the pod will need to have the right RBAC bindings (if RBAC is enabled) to view the metrics.

How can someone using this feature know that it is working for their instance?
  • Events
    • Event Reason:
  • API .status
    • Condition name:
    • Other field:
  • Other (treat as last resort)
    • Details: By replacing nodes/proxy permission in RBAC with the fine-grained permissions required by the workload such as nodes/metrics, nodes/pods etc. and then confirming that the requests to kubelet succeed and don’t encounter authorization errors.
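
One way to double-check the RBAC side of this is to submit a SubjectAccessReview for the workload's identity and look at status.allowed in the response. The sketch below assumes a hypothetical ServiceAccount named node-agent in a monitoring namespace and a node named node-1. It shows only what kube-apiserver would answer, not whether the kubelet has the feature gate enabled.

```yaml
# Submit with: kubectl create -f <this-file> -o yaml
# then inspect status.allowed in the returned object.
apiVersion: authorization.k8s.io/v1
kind: SubjectAccessReview
spec:
  user: system:serviceaccount:monitoring:node-agent   # hypothetical identity
  resourceAttributes:
    group: ""
    version: "v1"
    resource: "nodes"
    name: "node-1"          # hypothetical node name
    subresource: "pods"     # fine-grained subresource under test
    verb: "get"
```
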
What are the reasonable SLOs (Service Level Objectives) for the enhancement?

Same SLOs as the kubelet API currently offers.

What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
  • Metrics
    • Metric name:
    • [Optional] Aggregation method:
    • Components exposing the metric:
  • Other (treat as last resort)
    • Details:

Same SLIs as the kubelet API currently offers.

Are there any missing metrics that would be useful to have to improve observability of this feature?

No.

Dependencies

Does this feature depend on any specific services running in the cluster?

This feature only comes into play if the kubelet authorization mode is set to Webhook.

Scalability

Will enabling / using this feature result in any new API calls?

For some requests, the Kubelet will perform an additional SubjectAccessReview for the proxy authorization subresource when the first request with the fine-grained authorization subresource wasn’t authorized.

Will enabling / using this feature result in introducing new API types?

No.

Will enabling / using this feature result in any new calls to the cloud provider?

No.

Will enabling / using this feature result in increasing size or count of the existing API objects?

The count of SubjectAccessReviews by Kubelet may double if a SubjectAccessReview request is not cached previously.

Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?

Kubelet API is not covered by existing SLIs/SLOs.

The count of SubjectAccessReviews by Kubelet may double if a SubjectAccessReview request is not cached previously, which means that the time to authorize a request to Kubelet may also double.

Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, …) in any components?

No.

Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?

No.

Troubleshooting

How does this feature react if the API server and/or etcd is unavailable?

Not any different from how it would affect kubelet without this feature. If kube-apiserver is unavailable any SAR from kubelet will fail.

What are other known failure modes?

If requests to the kubelet API start failing due to authorization issues, users can disable the feature-gate.

Users can check the Kubernetes audit logs for SubjectAccessReview requests created by system:nodes:* and check the reason they failed.

What steps should be taken if SLOs are not being met to determine the problem?
  1. Check that the feature gate is enabled in kube-apiserver and kubelet.
  2. Check that the workload has the right permissions. Requests are expected to fail if you are using fine-grained subresources but the feature gate is not enabled in kubelet.
  3. Check the audit logs for SubjectAccessReview requests created by system:nodes:* and check the reason these requests failed.
  4. Check kubelet logs.

Implementation History

2024-09-28: KEP-2862 merged as implementable and PRR approved for ALPHA.

2024-10-17: Alpha Code implementation PR merged.

2024-10-22: Alpha Documentation PR merged.

2025-01-22: KEP graduated to BETA PR

2025-01-27: BETA code implementation PR merged.

2025-03-29: BETA documentation PR merged.

Drawbacks

Alternatives

Infrastructure Needed (Optional)