KEP-6060: API Server Authentication to Admission Webhooks
- Release Signoff Checklist
- Summary
- Motivation
- Terms
- Proposal
- Design Details
- Production Readiness Review Questionnaire
- Implementation History
- Drawbacks
- Alternatives
Release Signoff Checklist
Items marked with (R) are required prior to targeting to a milestone / release.
- (R) Enhancement issue in release milestone, which links to KEP dir in kubernetes/enhancements (not the initial KEP PR)
- (R) KEP approvers have approved the KEP status as implementable
- (R) Design details are appropriately documented
- (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
- e2e Tests for all Beta API Operations (endpoints)
- (R) Ensure GA e2e tests meet requirements for Conformance Tests
- (R) Minimum Two Week Window for GA e2e tests to prove flake free
- (R) Graduation criteria is in place
- (R) all GA Endpoints must be hit by Conformance Tests within one minor version of promotion to GA
- (R) Production readiness review completed
- (R) Production readiness review approved
- “Implementation History” section is up-to-date for milestone
- User-facing documentation has been created in kubernetes/website, for publication to kubernetes.io
- Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
Summary
Today, kube-apiserver does not authenticate itself to admission
webhooks by default. Any entity with service network access can
send requests to a webhook endpoint and impersonate kube-apiserver.
CVE-2025-1974 demonstrated real-world consequences of this class of vulnerability.
The introduction of the capability to authenticate API Servers
consists of three main additions. First, webhook authentication clients
will be updated to request a service
account token from kube-apiserver, and to present the credential to the
admission webhook. Second, kube-apiserver will be updated to dispense
those tokens to authenticated and authorized principals. Third, a token
verification library will be introduced for use by webhook maintainers.
This KEP augments the TokenRequest API to allow the client to request
claims in the JWT to which kube-apiserver attests. In addition, tokens
intended for admission webhooks will be bound to a webhook configuration
object (either validating or mutating). The specifics of these changes are
discussed in detail in the design details section.
A list of terms is provided below for disambiguation, and to prevent awkward sentence constructions.
Motivation
Any entity with service network access can send requests to an admission webhook endpoint. If the webhook does not authenticate the caller, an attacker can probe for policy information, trigger unintended side effects, or exploit the webhook’s own privileges within the cluster.
Opt-in mechanisms for authenticating the kube-apiserver to webhooks exist
(client certs, bearer tokens, or basic auth via a kubeconfig file configured
through --admission-control-config-file), but they require manual credential
management and an API server restart to change. That opt-in mechanism is
unopinionated as to the method of authentication (mTLS / token / basic auth),
creating a large burden on webhook maintainers to support verification of
client identity by all three methods. The burden is greatest
when the actor setting up the API Server (or aggregated API server) and the
actor setting up the webhook are not the same, as is usually the case with
“off-the-shelf”, community OSS webhooks.
There are existing out-of-tree solutions, such as generic-admission-server. However, they require manual setup. This KEP posits that an opinionated, on-by-default solution is needed to reduce the friction to adoption. It is designed to make it possible to transition in phases. First, webhook authentication client libraries are configured to obtain and present tokens by default (except in cases where it would break an existing authentication setup). At this stage, webhooks may not yet have been updated to verify the tokens. Webhooks can instead silently ignore them. In the second phase, once credential issuance is GA and webhook maintainers can reasonably expect a credential to be present, webhook maintainers can use the provided library to opt in to token verification. Over time, we expect this to make the landscape as a whole more secure.
In addition to kube-apiserver, aggregated API servers often need to contact
webhooks. Yet, they should not have broad access to ask arbitrary
questions. A design is needed to make it easy for an aggregated API server to
query webhooks about resources it controls, but which prevents a malicious
or compromised aggregated API server from requesting policy information
about resources it does not control.
The scope of this KEP is limited to authenticating to admission webhooks.
Authentication webhooks, authorization webhooks, and audit webhooks do not
share the same practical barriers to authentication experienced by admission
webhooks. Those webhooks are not dynamically deployed at runtime, and don’t
have the same logistical barriers as admission webhooks: the actor setting
up kube-apiserver and the actor setting up the webhook are the same in the
vast majority of cases. Therefore, it is much more reasonable to expect that
such an actor would use the already available solution: they are in control
of both the method of authentication used by the client and the verification
methods used by the webhook. This leaves a slight gap for audit webhooks,
requiring that all deployed aggregated API servers that communicate with these
webhooks must also have access to the necessary credentials. To close this gap,
aggregated API servers can deploy their own audit webhook. TokenReview and
SubjectAccessReview make this a non-issue for everything but audit webhooks.
Conversion webhooks are likewise out of scope because they pertain to CRDs,
which will be handled exclusively by kube-apiserver and do not share the
same set of complications caused by allowing access from aggregated API servers.
Credentials issued for authentication to webhooks must be reviewable by
kube-apiserver, but webhooks should be able to verify them independently. No
API calls should be required of webhooks to verify tokens.
Goals
- kube-apiserver authenticates itself to admission webhooks by default, without requiring manual credential configuration.
- Aggregated API servers can authenticate themselves to admission webhooks using the same mechanism.
- Minimal manual setup involved, both for webhook maintainers and cluster administrators. The KEP authors believe firmly that friction prevents adoption.
- The default behavior of webhook authentication clients is to procure a token and provide it to webhooks.
- The design does not break webhooks that have not yet adopted token verification.
- kube-apiserver authorizes webhook authentication clients at token issuance, and will refuse to provide a token to unauthorized principals.
- Setting up the requisite permissions for token acquisition should be simple and easy.
- The token is scoped to a subset of resources about which its bearer may contact the webhook.
- Tokens may alternatively be scoped per-webhook (by audience).
- The design is backward compatible: existing kubeconfig-based webhook authentication setups continue to work without modification.
- Defining the webhook-side verification Go library.
- Supporting both client-go and controller-runtime.
- Tokens must be reviewable by TokenReview, although the webhook library WILL NOT use TokenReview by default to avoid unnecessary round trips.
Non-Goals
- Authentication to non-admission webhooks (authentication webhooks, authorization webhooks, audit webhooks).
- Requiring kube-apiserver to cache and refresh massive numbers of narrowly-scoped tokens.
- Requiring webhooks to perform TokenReview or SubjectAccessReview requests to kube-apiserver.
- Permitting aggregated API servers to have broad access to webhooks.
Terms
Token
Unless otherwise indicated, the term token will be used exclusively to
refer to service account tokens with two qualities: a webhook binding and
a claim indicating which APIGroups the token’s bearer may inquire about.
When discussing any other token, or service account tokens without both of these
two qualities, they will be clearly distinguished by the surrounding context.
Token Acquisition Service Account
The service account named in tokens for webhook authentication will be termed
the Token Acquisition Service Account. This is distinct from the identity
that the principal requesting the token uses to authenticate itself to the
Kubernetes API Server (which may or may not be a service account). The Token
Acquisition Service Account must have attest permissions on the APIGroup
named in the TokenRequest.
Webhook Authentication Client
Because both kube-apiserver and aggregated API servers will attempt
to authenticate to webhooks, the term webhook authentication client
will be used throughout this document as a generic term to refer to
both types of client when distinguishing between them is not important. The
overall flow for both kube-apiserver and aggregated API servers is mostly
the same, but with a few subtle differences.
Aggregated API Servers and kube-apiserver
When referring specifically to the Kubernetes API Server, the
terms kube-apiserver and Kubernetes API Server will be used
interchangeably. When referring specifically to an Aggregated API Server,
the full term will always be used.
Proposal
Webhook authentication clients may request
service account tokens with a narrow scope, indicating to the webhook that
it is only valid for its audience and for AdmissionReview requests about
resources with a particular APIGroup. Because the number of per-webhook,
per-APIGroup tokens can quickly get out of hand for kube-apiserver, tokens
may alternatively be requested that are valid per-webhook, but which authorize
the bearer to ask questions about any resource. Because the authorization
scope of such tokens is larger, broader permissions are required to obtain
them. Tokens that authorize the bearer to make AdmissionReview requests about
any resource are intended for use only with kube-apiserver, although
there is no enforcement of this recommendation. Tokens restricted by APIGroup
are intended for use by aggregated API Servers to prevent giving them more
access than is needed. The permission model for token
acquisition is described more fully in the design details section.
The TokenRequest API will be expanded to accommodate the scoping of
service account tokens to a particular usage. A brief description of
those expansions is in order. To obtain a token, the webhook
authentication client will make a TokenRequest on a service account. For a
client to obtain a token, it must meet four conditions. First, it must request
claims indicating which APIGroups it intends to query the webhook about;
these claims must be coherent with the resources the webhook expects. Further
details are described in the design details section. Second, it must
specify either a ValidatingWebhookConfiguration
or a MutatingWebhookConfiguration as the BoundObjectRef. Third, it must
specify an audience that is coherent for that *WebhookConfiguration. The
exact specification of the derivation of the audience is deferred until
implementation time, and is at the moment subject to change. Fourth and
finally, the ServiceAccount for which the TokenRequest is being made
must have sufficient permission to obtain the token. This is accomplished by
means of a synthetic authorization check at token issuance, and is described
in greater detail in the design details section.
When the TokenRequest caller wants a token authorizing the bearer to inquire
about resources in any APIGroup, it will request that kube-apiserver attest
to a claim on the "*" APIGroup. This requires that the token acquisition
service account have broader permissions, described further in the design
details section.
This broad permission should only be granted to kube-apiserver, and its use by
principals representing aggregated API servers is strongly discouraged. Instead,
aggregated API servers should request that kube-apiserver attest to the
APIGroup corresponding to the server’s APIService(s) (there may be multiple
APIServices to express multiple APIVersions of a single APIGroup). This
indicates to the webhook that it should deny AdmissionReview requests
that do not pertain to objects within that APIGroup. This is recommended
to prevent a potentially malicious aggregated API server from exposing a
webhook’s policy information or compromising it in some other way.
The TokenRequest handler will be updated to accommodate the new kinds
of bound object. When one of them is used, it will trigger additional
authorization checks (described below), and checks
on the existence of the bound objects. Likewise, it will be augmented to
perform authorization checks on the permissions of the service account for
which the token is requested.
Webhook libraries will be updated to optionally (and eventually always) require a bearer token. The webhook then verifies these tokens by taking the following steps:
- Verify the token’s signature via the OIDC discovery endpoint.
- Verify that the token’s audience matches the expected audience. This audience is derived deterministically from the webhook configuration. Several alternatives have been discussed including the url, but the tradeoffs are still being evaluated.
- Verify that the resource named in the AdmissionReview request body is a member of the APIGroup(s) named in the private claims. The * APIGroup is a superset of all APIGroups. When the APIGroup is *, this check will always succeed.
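The audience and APIGroup checks above can be sketched in Go as follows. This is a minimal illustration under stated assumptions, not the proposed library: the `tokenClaims` type and `verifyClaims` function are hypothetical names, the claims are assumed to have been extracted from a token whose signature was already verified, and the expected audience is supplied by the caller because its derivation is still undecided.

```go
package main

import (
	"errors"
	"fmt"
)

// tokenClaims holds the two claims the webhook inspects after the
// signature check. Hypothetical type, for illustration only.
type tokenClaims struct {
	Audiences        []string // values of the "aud" claim
	AllowedAPIGroups []string // value of the allowedAPIGroup attestation claim
}

var (
	errAudience = errors.New("token audience does not match this webhook")
	errAPIGroup = errors.New("token does not cover the requested APIGroup")
)

// verifyClaims applies the audience check, then the APIGroup check.
// requestGroup is the group of the resource named in the AdmissionReview
// body; "" denotes the core group.
func verifyClaims(c tokenClaims, expectedAudience, requestGroup string) error {
	audienceOK := false
	for _, a := range c.Audiences {
		if a == expectedAudience {
			audienceOK = true
			break
		}
	}
	if !audienceOK {
		return errAudience
	}
	for _, g := range c.AllowedAPIGroups {
		// "*" is a superset of every APIGroup, so it always matches.
		if g == "*" || g == requestGroup {
			return nil
		}
	}
	return errAPIGroup
}

func main() {
	aud := "https://splinter-validate.default.svc/admission/review"
	c := tokenClaims{
		Audiences:        []string{aud},
		AllowedAPIGroups: []string{"ninja.turtles.ai"},
	}
	fmt.Println(verifyClaims(c, aud, "ninja.turtles.ai")) // accepted
	fmt.Println(verifyClaims(c, aud, ""))                 // rejected: core group not covered
}
```

Returning distinct errors is a choice made for readability of the sketch; a real webhook might prefer a single opaque rejection.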
Sequence Diagrams
The following examples (with diagrams) illustrate the flows for token request, k8s authorization, issuance, and webhook authentication. There are two successful flows and one flow that fails at the time of webhook token verification.
Flow 1: kube-apiserver as Webhook Client
In this example, a user attempts to create a deployment named
“ninja-turtles” (user request omitted from diagram). This requires
review by the mutating admission webhook “mutagen-capsule”. In order to
authenticate itself to the webhook, kube-apiserver makes a TokenRequest
for its dedicated service account kube-system:webhook-auth, bound to
the MutatingWebhookConfiguration for the “mutagen-capsule” webhook. Note
that the name for this dedicated service account remains an open question
(details). It requests the
"*" APIGroup, rather than the "apps" APIGroup, to avoid the burden of
maintaining too many tokens, and because kube-apiserver is a privileged
actor. The kube-system:webhook-auth service account has "attest"
permissions on the synthetic APIGroup resource with the "*" name.
sequenceDiagram
participant TokenReq as TokenRequest Handler<br/>(kube-apiserver)
participant KAS as kube-apiserver<br/>(as AdmissionReview client)
participant Webhook as Admission Webhook<br/>(mutagen-capsule)
KAS->>KAS: Check token cache
alt Cache miss or token expired
KAS->>TokenReq: TokenRequest for "kube-system:webhook-auth"<br/>BoundObjectRef: MutatingWebhookConfiguration "mutagen-capsule"<br/>Audience: "https://mutagen-capsule.default.svc/admission/review"<br/>AttestationClaims: { "webhook-authentication.k8s.io/allowedAPIGroup": ["*"] }
TokenReq->>TokenReq: Authenticate client
Note over TokenReq: Authorization
TokenReq->>TokenReq: Can TokenRequest client principal (`kube-system` as client)<br/>"create" serviceaccounts/token for kube-system:webhook-auth?
TokenReq->>TokenReq: Does MutatingWebhookConfiguration<br/>"mutagen-capsule" exist?
TokenReq->>TokenReq: Does kube-system:webhook-auth have<br/>"attest" on APIGroup "*"?
TokenReq-->>KAS: Token issued (see payload below)
KAS->>KAS: Cache JWT
end
Note over KAS,Webhook: AdmissionReview
KAS->>Webhook: AdmissionReview (Deployment)<br/>+ Authorization: Bearer <token><br/>(see payload below)
Note over Webhook: Verify JWT
Webhook->>Webhook: Check signature<br/>(OIDC discovery)<br/>
Webhook->>Webhook: Check audience
Webhook->>Webhook: Check private claims<br/>Authorized for `"apps" APIGroup`?
Webhook-->>KAS: AdmissionReview response
JWT payload
{
  "sub": "system:serviceaccount:kube-system:webhook-auth",
  "aud": "https://mutagen-capsule.default.svc/admission/review",
  <...>
  "kubernetes.io": {
    "mutatingWebhookConfiguration": {
      "name": "mutagen-capsule",
      "uid": "<uid>"
    },
    "attestationClaims": {
      "webhook-authentication.k8s.io/allowedAPIGroup": ["*"]
    }
  }
}
Flow 2: kube-apiserver denies a suspicious request
In this example, a compromised aggregated API server attempts to probe
a validating webhook, “splinter-validate”, for policy information. It
wants to spam the webhook with AdmissionReview requests in an attempt
to find principals that can write Secrets. To do so, it requests a JWT
bound to the “splinter-validate” ValidatingWebhookConfiguration,
requesting that kube-apiserver attest to the claim
{"webhook-authentication.k8s.io/allowedAPIGroup": ["*"]}.
sequenceDiagram
participant KAS as kube-apiserver
participant AAS as Aggregated API Server<br/>(v1.ninja.turtles.ai)
participant Webhook as Validating Webhook<br/>(splinter-validate)
KAS->>AAS: Delegate request to aggregated API server
AAS->>AAS: Needs admission review
AAS->>KAS: List webhooks
KAS-->>AAS: #%20;
AAS->>AAS: Check JWT cache
alt Cache miss
Note over KAS,AAS: Token Acquisition
AAS->>KAS: TokenRequest for "turtles-webhook-auth"<br/>BoundObjectRef: ValidatingWebhookConfiguration "splinter-validate"<br/>Audience: https://splinter-validate.default.svc/admission/review<br/>AttestationClaims: { "webhook-authentication.k8s.io/allowedAPIGroup": ["*"] }
KAS->>KAS: Authenticate client
Note over KAS: Authorization
KAS->>KAS: Can AAS principal "create"<br/>serviceaccounts/token for turtles-webhook-auth?
KAS->>KAS: Does ValidatingWebhookConfiguration<br/>"splinter-validate" exist?
KAS->>KAS: Does turtles-webhook-auth<br/>have "attest" on APIGroup "*"?
KAS-->>AAS: 403 Forbidden
end
alt possibly
Note over AAS,Webhook: AdmissionReview
AAS->>Webhook: AdmissionReview request<br/>(no Authorization header)
Webhook-->>AAS: 401 Unauthorized
end
Flow 3: Aggregated API Server denied/accepted by webhook
In this example, an untrusted aggregated API server attempts to make an
AdmissionReview request to the validating webhook, “splinter-validate”. It
makes two requests about two different resources; it makes these requests
with the same token, which is bound to the ValidatingWebhookConfiguration
for that webhook. In the first request, which is denied, it makes an
AdmissionReview request about Secrets in the "" APIGroup (i.e. no
API group). In the second request, which is successful, it is asking about
NinjaTurtles in the "ninja.turtles.ai" APIGroup.
sequenceDiagram
participant KAS as kube-apiserver<br/>(issuer, authorizer)
participant AAS as Compromised AAS
participant Webhook as Admission Webhook<br/>(splinter-validate)
KAS->>AAS: Delegate request to aggregated API server
AAS->>AAS: Needs admission review
AAS->>KAS: List webhooks
KAS-->>AAS: #%20;
AAS->>AAS: Check JWT cache
alt Cache miss
Note over KAS,AAS: Token Acquisition
AAS->>KAS: TokenRequest<br/>SA:"turtles-webhook-auth"<br/>BoundObjectRef: ValidatingWebhookConfiguration "splinter-validate"<br/>Audience:"https://splinter-validate.default.svc/admission/review"<br/>AttestationClaims: { "webhook-authentication.k8s.io/allowedAPIGroup": ["ninja.turtles.ai"] }
Note over KAS: Authorization
KAS->>KAS: Can AAS principal "create"<br/>tokens for turtles-webhook-auth?
KAS->>KAS: Is there a registered APIService with <br/>APIGroup "ninja.turtles.ai"?
KAS->>KAS: Does turtles-webhook-auth have<br/>"attest" on APIGroup "ninja.turtles.ai"?
KAS-->>AAS: Issue JWT
AAS->>AAS: Cache JWT
end
Note over AAS,Webhook: AdmissionReview
alt Disallowed AdmissionReview resource
AAS->>Webhook: AdmissionReview<br/>Authorization: Bearer <JWT><br/>AdmissionReview: can "ben" write Secrets (APIGroup "")?
Note over Webhook: Verify JWT
Webhook->>Webhook: Signature (OIDC discovery)? - OK
Webhook->>Webhook: Audience? - OK
Webhook-->>Webhook: APIGroup/AdmissionReview coherence?<br/>"ninja.turtles.ai" ≠ "" - NO
Webhook-->>AAS: 403 Forbidden
end
alt Successful flow
AAS->>Webhook: AdmissionReview<br/>Authorization: Bearer <JWT><br/>AdmissionReview: can "ben" create NinjaTurtles (APIGroup "ninja.turtles.ai")?
Note over Webhook: Verify Token
Webhook->>Webhook: Check signature<br/>(OIDC discovery)
Webhook->>Webhook: Check audience
Webhook->>Webhook: Check allowedAPIGroup claim<br/>is AdmissionReview about ninja.turtles.ai?
Webhook-->>AAS: 200 OK
end
AAS-->>KAS: 200 OK
JWT payload (both requests)
{
  "sub": "system:serviceaccount:turtles:turtles-webhook-auth",
  "aud": "https://splinter-validate.default.svc/admission/review",
  <...>
  "kubernetes.io": {
    "validatingWebhookConfiguration": {
      "name": "splinter-validate",
      "uid": "<uid>"
    },
    "attestationClaims": {
      "webhook-authentication.k8s.io/allowedAPIGroup": ["ninja.turtles.ai"]
    }
  }
}
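To make the two outcomes of Flow 3 concrete, the following Go sketch decodes a payload shaped like the one above and evaluates both AdmissionReview requests against it. The type and function names (`payload`, `k8sClaims`, `admits`) are hypothetical, the `uid` is a placeholder, the elided claims are simply left out, and `aud` is modeled as a single string only because that is how the example payload shows it. Signature and audience verification are assumed to have happened already.

```go
package main

import (
	"encoding/json"
	"fmt"
)

const allowedAPIGroupKey = "webhook-authentication.k8s.io/allowedAPIGroup"

// boundWebhook mirrors the bound-object entry in the example payload.
type boundWebhook struct {
	Name string `json:"name"`
	UID  string `json:"uid"`
}

// k8sClaims mirrors the "kubernetes.io" private claims in the example payload.
type k8sClaims struct {
	ValidatingWebhookConfiguration *boundWebhook       `json:"validatingWebhookConfiguration,omitempty"`
	MutatingWebhookConfiguration   *boundWebhook       `json:"mutatingWebhookConfiguration,omitempty"`
	AttestationClaims              map[string][]string `json:"attestationClaims"`
}

type payload struct {
	Subject  string    `json:"sub"`
	Audience string    `json:"aud"`
	K8s      k8sClaims `json:"kubernetes.io"`
}

// admits reports whether the token payload covers an AdmissionReview
// about a resource in the given APIGroup ("" is the core group).
func admits(raw []byte, group string) (bool, error) {
	var p payload
	if err := json.Unmarshal(raw, &p); err != nil {
		return false, err
	}
	for _, g := range p.K8s.AttestationClaims[allowedAPIGroupKey] {
		if g == "*" || g == group {
			return true, nil
		}
	}
	return false, nil
}

// flow3Payload is the example payload with the elided claims omitted and
// a placeholder uid.
const flow3Payload = `{
  "sub": "system:serviceaccount:turtles:turtles-webhook-auth",
  "aud": "https://splinter-validate.default.svc/admission/review",
  "kubernetes.io": {
    "validatingWebhookConfiguration": {"name": "splinter-validate", "uid": "placeholder-uid"},
    "attestationClaims": {
      "webhook-authentication.k8s.io/allowedAPIGroup": ["ninja.turtles.ai"]
    }
  }
}`

func main() {
	// First request asks about Secrets (core group), second about NinjaTurtles.
	for _, group := range []string{"", "ninja.turtles.ai"} {
		ok, err := admits([]byte(flow3Payload), group)
		fmt.Printf("group %q admitted=%v err=%v\n", group, ok, err)
	}
}
```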
Risks and Mitigations
Broad access to webhooks
This is, unfortunately, the reality of most admission webhooks deployed today. This KEP addresses this by requiring explicit permission on a service account to create tokens for use in authenticating to webhooks. Therefore, only specially authorized service accounts may create webhook authentication tokens.
Token replay across webhooks
A JWT obtained for one webhook could be presented to another webhook if they serve overlapping resources. The per-webhook audience scoping prevents this: each token is only valid for the specific webhook audience for which it was minted.
Token replay across API groups
A token with claims to one APIGroup could be presented when admitting
a resource from a different API group. The webhook’s verification of the
APIGroup claims against the AdmissionReview body prevents this: the
groups must match.
Service account compromise
If a service account is compromised, an attacker could request tokens and
impersonate an aggregated API server to webhooks. The dedicated-SA-per-server
model limits the impact of such a compromise. The attest permission,
properly applied to only the resources controlled by the aggregated API
server, prevents the service account from even obtaining a token for other uses.
Increased authorization load
Each TokenRequest triggers an additional authorization check (the attest
verification). This is mitigated by caching: tokens are cached for their
lifetime, so the authorization check is amortized over many webhook calls.
Potential Deadlock
If an admission webhook is configured to intercept TokenRequests, and
the webhook requires an authentication token, there will be a deadlock with
no way to proceed. In this case, we accept the deadlock until the webhook
configuration is fixed.
Design Details
Changes to TokenRequest API
Changes to TokenRequestSpec
TokenRequestSpec will be augmented to allow for the specification of
AttestationClaims, with a type of map[string][]string. This is modeled
after the UserInfo.Extra field, which has served well for its purpose and
proven flexible.
The keys to this map represent the names of claims to which the client wishes
kube-apiserver to attest. The values represent the specifics of the claim. The
keys will be well-known. Arbitrary keys may not be used. Keys not recognized
by kube-apiserver will result in a rejection of the TokenRequest. The
addition of keys recognized by kube-apiserver will be subject to API review.
The meaning of the value or values will be well-defined. The same value
may change meaning when it corresponds to a different key, but the meaning
within the purview of each key (claim name) must be well-defined.
// TokenRequestSpec contains client provided parameters of a token request.
type TokenRequestSpec struct {
	// NOTE: Newly added: AttestationClaims (described above)
	AttestationClaims map[string][]string
	// Audiences are the intended audiences of the token. A recipient of a
	// token must identify themself with an identifier in the list of
	// audiences of the token, and otherwise should reject the token. A
	// token issued for multiple audiences may be used to authenticate
	// against any of the audiences listed but implies a high degree of
	// trust between the target audiences.
	Audiences []string
	// ExpirationSeconds is the requested duration of validity of the request. The
	// token issuer may return a token with a different validity duration so a
	// client needs to check the 'expiration' field in a response.
	ExpirationSeconds int64
	// BoundObjectRef is a reference to an object that the token will be bound to.
	// The token will only be valid for as long as the bound object exists.
	// NOTE: The API server's TokenReview endpoint will validate the
	// BoundObjectRef, but other audiences may not. Keep ExpirationSeconds
	// small if you want prompt revocation.
	BoundObjectRef *BoundObjectReference
}
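The rule that only well-known keys are accepted could be enforced along the following lines. This Go sketch is illustrative only: `knownClaimKeys` and `validateClaimKeys` are hypothetical names, and the set contains just the one key this KEP defines. It checks key recognition only; whether a given key is required is a separate check.

```go
package main

import (
	"fmt"
	"sort"
)

// knownClaimKeys is the set of claim names kube-apiserver would recognize.
// Only the key introduced by this KEP is listed.
var knownClaimKeys = map[string]struct{}{
	"webhook-authentication.k8s.io/allowedAPIGroup": {},
}

// validateClaimKeys rejects a request that names any unrecognized key.
func validateClaimKeys(claims map[string][]string) error {
	var unknown []string
	for k := range claims {
		if _, ok := knownClaimKeys[k]; !ok {
			unknown = append(unknown, k)
		}
	}
	if len(unknown) > 0 {
		sort.Strings(unknown) // deterministic error text
		return fmt.Errorf("unrecognized attestation claim keys: %v", unknown)
	}
	return nil
}

func main() {
	good := map[string][]string{"webhook-authentication.k8s.io/allowedAPIGroup": {"*"}}
	bad := map[string][]string{"example.com/made-up-claim": {"x"}}
	fmt.Println(validateClaimKeys(good))
	fmt.Println(validateClaimKeys(bad))
}
```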
Changes to TokenRequest handler
In addition, the handler for TokenRequest will be updated to recognize
ValidatingWebhookConfiguration and MutatingWebhookConfiguration
as valid BoundObjectRefs. When one of these bindings is used, the
"webhook-authentication.k8s.io/allowedAPIGroup" AttestationClaim must
be provided. Omitting it is an error, and the request will
be rejected.
The handler will be updated to perform authorization checks to ensure the
service account for which the TokenRequest is made has the requisite
permissions. These checks are described in detail in another section below.
Added claim: "webhook-authentication.k8s.io/allowedAPIGroup"
The first valid key (claim name) for the AttestationClaims field will
be a claim named "webhook-authentication.k8s.io/allowedAPIGroup". It is
intended that only one APIGroup be specified. In other words, the value
must be a string slice of length 1; else, the request will be considered
improperly formed.
The special wildcard value "*" is recognized here to mean the token will be
valid for all APIGroups. Any other value represents a single APIGroup
for which the resulting token will be valid. When the value is the empty
string, "", it is understood to mean “no APIGroup” (as is the case for
the "v1." local APIService). This is understood to be distinct from a nil
map, empty map, or empty slice; all of those are considered invalid when the
BoundObjectRef is a webhook configuration object.
Claims are subject to authorization checks on the service account for
which the TokenRequest is made. These are described in detail in a later
section.
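The well-formedness rules for this claim can be sketched in Go as below. The function name `parseAllowedAPIGroup` is hypothetical. The sketch shows that the empty string is a valid value (the core group), while a nil map, a missing key, an empty slice, and a multi-valued slice are all rejected.

```go
package main

import (
	"errors"
	"fmt"
)

const allowedAPIGroupKey = "webhook-authentication.k8s.io/allowedAPIGroup"

// parseAllowedAPIGroup returns the single APIGroup named by the claim.
// "" means the core group and "*" means all groups; both are valid values.
func parseAllowedAPIGroup(claims map[string][]string) (string, error) {
	values, ok := claims[allowedAPIGroupKey] // a nil map yields ok == false
	if !ok {
		return "", errors.New("allowedAPIGroup claim is required for webhook-bound tokens")
	}
	if len(values) != 1 {
		return "", fmt.Errorf("allowedAPIGroup must have exactly one value, got %d", len(values))
	}
	return values[0], nil
}

func main() {
	cases := []map[string][]string{
		nil,                              // invalid: no claims at all
		{allowedAPIGroupKey: {}},         // invalid: empty slice
		{allowedAPIGroupKey: {""}},       // valid: core group
		{allowedAPIGroupKey: {"*"}},      // valid: wildcard
		{allowedAPIGroupKey: {"a", "b"}}, // invalid: more than one group
	}
	for _, c := range cases {
		group, err := parseAllowedAPIGroup(c)
		fmt.Printf("group=%q err=%v\n", group, err)
	}
}
```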
Token Acquisition (from the client perspective)
All webhook authentication clients:
When a webhook authentication client needs
to call an admission webhook about a given resource, it issues a TokenRequest
for its token acquisition service account
to the Kubernetes API Server. The request includes:
- A BoundObjectRef pointing to either a ValidatingWebhookConfiguration or MutatingWebhookConfiguration object.
- The name of a token acquisition service account with attest permission on the APIGroup named in the required corresponding AttestationClaim.
- An audience derived from the webhook’s url.
- An AttestationClaim with the key "webhook-authentication.k8s.io/allowedAPIGroup", and a value indicating which APIGroups the resulting token should authorize its bearer to ask webhooks about.
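As a rough illustration of how a client might assemble such a request, consider the Go sketch below. It uses local stand-in types that mirror only the fields discussed here, not the real `k8s.io/api` types (which do not yet have an `AttestationClaims` field). The helper `newWebhookTokenRequestSpec` is hypothetical, and the audience is passed in by the caller because its derivation from the webhook configuration is not yet specified. The service account itself is named by the request path rather than the spec, so it does not appear here.

```go
package main

import "fmt"

const allowedAPIGroupKey = "webhook-authentication.k8s.io/allowedAPIGroup"

// BoundObjectReference and TokenRequestSpec are local stand-ins that
// mirror only the fields relevant to this sketch.
type BoundObjectReference struct {
	Kind       string
	APIVersion string
	Name       string
}

type TokenRequestSpec struct {
	AttestationClaims map[string][]string
	Audiences         []string
	BoundObjectRef    *BoundObjectReference
}

// newWebhookTokenRequestSpec builds a spec bound to a webhook
// configuration and scoped to a single APIGroup ("*" for all groups).
func newWebhookTokenRequestSpec(kind, name, audience, apiGroup string) (TokenRequestSpec, error) {
	switch kind {
	case "ValidatingWebhookConfiguration", "MutatingWebhookConfiguration":
		// supported bound object kinds
	default:
		return TokenRequestSpec{}, fmt.Errorf("unsupported bound object kind %q", kind)
	}
	return TokenRequestSpec{
		AttestationClaims: map[string][]string{allowedAPIGroupKey: {apiGroup}},
		Audiences:         []string{audience},
		BoundObjectRef: &BoundObjectReference{
			Kind:       kind,
			APIVersion: "admissionregistration.k8s.io/v1",
			Name:       name,
		},
	}, nil
}

func main() {
	spec, err := newWebhookTokenRequestSpec(
		"ValidatingWebhookConfiguration",
		"splinter-validate",
		"https://splinter-validate.default.svc/admission/review",
		"ninja.turtles.ai",
	)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(spec.BoundObjectRef.Kind, spec.Audiences, spec.AttestationClaims)
}
```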
kube-apiserver:
Dedicated service account for token acquisition
In the case of kube-apiserver, the token acquisition service account
will be a service account
with a well-known name, kube-system:webhook-auth, which is automatically
created in the bootstrapping process, and automatically recreated on deletion.
Note that the name of this service account, and whether it may be used in
the future to identify kube-apiserver in other ways, remains an open question.
Other details
When kube-apiserver needs to call an admission webhook, it will be
doing so for a resource (or custom resource) it serves directly. The
BoundObjectRef in the TokenRequest must be the one corresponding to the
ValidatingWebhookConfiguration or MutatingWebhookConfiguration of the
webhook it seeks to consult. The audience must be coherent with the bound
object. Because this is kube-apiserver, this is a request for a token that
is valid for a specific webhook but for all APIGroups. As such,
it should make the claim "webhook-authentication.k8s.io/allowedAPIGroup": ["*"].
The requester will only receive the JWT token when the authorization checks
(described below) succeed. When the principal is kube-apiserver, this
will always succeed unless the request happens to occur after the deletion
of its service account but before it can be recreated.
Aggregated API Servers:
When an aggregated API server needs to call an admission webhook, it requests a service account token from the Kubernetes API Server. Each aggregated API server should have a dedicated service account for this purpose, as it will be part of the resource path for the token request. The request flow for aggregated API servers is:
- The aggregated API server authenticates to kube-apiserver using whatever credential it is configured with (which may or may not be a service account). That principal must be authorized to create serviceaccounts/token in the relevant namespace, with the appropriate resource name (i.e. that of the token acquisition service account).
- It sends a TokenRequest for its dedicated service account, with all of the following:
  a. A BoundObjectRef pointing to the ValidatingWebhookConfiguration or MutatingWebhookConfiguration for the webhook it wishes to contact.
  b. An audience coherent with the webhook’s configuration.
  c. The "webhook-authentication.k8s.io/allowedAPIGroup" AttestationClaim with a value of length 1, containing as its first and only element the name of the APIGroup containing the resource that the AdmissionReview request is about. (In normal circumstances, this should match the APIGroup field of an APIService object corresponding to this aggregated API server).
- kube-apiserver performs authorization checks (described in a later section) and issues the service account token.
- The aggregated API server presents the token to the webhook in its Authorization header.
The token will be received only when the authorization checks succeed. These are described in the next section.
We expect each aggregated API server to have its own dedicated service
account for obtaining tokens it will use to authenticate to webhooks. Reuse
of these service accounts across multiple aggregated API servers is strongly
discouraged. Reuse of kube-apiserver’s service account by aggregated API
servers is not only discouraged; it is furthermore condemned.
Authorization Checks
When kube-apiserver receives a TokenRequest with one of the webhook
authentication bound object types as the BoundObjectRef, it performs the following checks:
- Does the principal making the TokenRequest have create on serviceaccounts/token for the service account named in the request?
- Does the bound object (which may be a ValidatingWebhookConfiguration or a MutatingWebhookConfiguration) actually exist? If not, the request is rejected.
- Are the required claims present, and are they well-formed? (with fast failure):
  a. Is there a "webhook-authentication.k8s.io/allowedAPIGroup" AttestationClaim?
  b. Does the claim’s value have length 1?
  c. If the first and only element of the claim’s value is not "*", is there at least one existing, non-deleted APIService with an APIGroup field that matches this value?
- Does the service account for which the TokenRequest was made have "attest" permissions on the synthetic resource "webhook-authentication.k8s.io/apigroup" with name exactly equal to the value of the "webhook-authentication.k8s.io/allowedAPIGroup" claim?
To prevent cluster state from leaking, error messages should not expose any information about the existence or nonexistence of objects in the cluster.
The authorization checks are performed via an authorizer.Authorize() call
against the token acquisition service account’s identity. An authorizer will
be added to perform these checks.
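The order of these checks can be sketched as follows. This is a minimal illustration in Python, not the proposed implementation: ClusterState, authorize_token_request, and the in-memory sets standing in for the authorizer and the object stores are hypothetical names introduced here. In keeping with the guidance on error messages, the sketch returns a single undifferentiated denial no matter which check failed.

```python
from dataclasses import dataclass, field

CLAIM = "webhook-authentication.k8s.io/allowedAPIGroup"
WEBHOOK_KINDS = {"ValidatingWebhookConfiguration", "MutatingWebhookConfiguration"}


@dataclass
class ClusterState:
    """Hypothetical in-memory stand-in for the authorizer and object stores."""
    # (principal, service account) pairs allowed to create serviceaccounts/token
    token_creators: set = field(default_factory=set)
    # (kind, name) pairs of existing webhook configurations
    webhook_configs: set = field(default_factory=set)
    # API groups served by at least one existing, non-deleted APIService
    api_service_groups: set = field(default_factory=set)
    # (service account, resource name) pairs granted "attest"
    attest_grants: set = field(default_factory=set)


def authorize_token_request(state, principal, service_account, bound_ref, claims):
    """Return True only if every check passes, stopping at the first failure."""
    kind, name = bound_ref
    # 1. The requester may create tokens for the named service account.
    if (principal, service_account) not in state.token_creators:
        return False
    # 2. The bound webhook configuration exists.
    if kind not in WEBHOOK_KINDS or (kind, name) not in state.webhook_configs:
        return False
    # 3. The claim is present, has exactly one element, and (unless that
    #    element is "*") names a group served by some APIService.
    value = claims.get(CLAIM)
    if value is None or len(value) != 1:
        return False
    group = value[0]
    if group != "*" and group not in state.api_service_groups:
        return False
    # 4. The service account holds "attest" on a resource name that is
    #    exactly equal to the claim value.
    return (service_account, group) in state.attest_grants
```

Note that in this sketch a grant on "jungle.panda" does not permit a claim of "*": the name comparison is exact, as the last check requires.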
RBAC Example
To illustrate the permission model, the following RBAC configurations are given as an example.
# Role permitting an identity to create tokens for its dedicated service
# account.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: let-me-create-webhook-authentication-tokens
  namespace: in-the-relevant-namespace
rules:
- apiGroups: [""]
  resources: ["serviceaccounts/token"]
  resourceNames: ["webhook-token-acquisition-service-account"]
  verbs: ["create"]
---
# The aggregated API server or `kube-apiserver`'s identity is bound to the
# above role, giving it access to make `TokenRequest`s for its dedicated
# service account.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: binding-to-let-you-create-serviceaccount-tokens
  namespace: in-the-relevant-namespace
subjects:
- name: principal-requesting-a-token
  apiGroup: rbac.authorization.k8s.io # For User or Group; omit for ServiceAccount
  kind: # Could be any of ServiceAccount | User | Group
roleRef:
  kind: Role
  name: let-me-create-webhook-authentication-tokens
  apiGroup: rbac.authorization.k8s.io
---
# ClusterRole permitting an identity to obtain tokens valid for a single
# specific APIGroup
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: let-the-webhook-token-acquisition-service-account-request-tokens-bound-to-an-api-service
rules:
- apiGroups: ["webhook-authentication.k8s.io"]
  resources: ["apigroups"]
  resourceNames: ["jungle.panda"]
  verbs: ["attest"]
---
# Binding granting the token acquisition service account the permissions of
# the above ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cluster-binding-to-let-you-get-webhook-authentication-tokens
subjects:
- kind: ServiceAccount
  name: webhook-token-acquisition-service-account
  namespace: in-the-relevant-namespace
roleRef:
  kind: ClusterRole
  name: let-the-webhook-token-acquisition-service-account-request-tokens-bound-to-an-api-service
  apiGroup: rbac.authorization.k8s.io
Synthetic resource for authorization checks
When a requester makes a TokenRequest bound to one of the
WebhookConfiguration types, the service account for which the
token is requested must have "attest" on the synthetic resource
"webhook-authentication.k8s.io/apigroups" with a name that matches exactly
the value of the "webhook-authentication.k8s.io/allowedAPIGroup" claim in
the request.
Recommendations for permissions
The KEP authors recommend that only kube-apiserver’s
service account be granted "attest" on the name "*" of the
"webhook-authentication.k8s.io/apigroups" resource. Aggregated API servers
should instead be granted "attest" on only those APIGroups over which they
have control.
New types of BoundObjectRef
The TokenRequest API’s BoundObjectRef is extended to accept
ValidatingWebhookConfiguration and MutatingWebhookConfiguration as
valid object reference kinds. The token becomes invalid if the referenced
*WebhookConfiguration is deleted. The presence of these bound objects
triggers authorization checks, described above. Also required is a claim
with name "webhook-authentication.k8s.io/allowedAPIGroup".
Audience
The token’s audience is derived from the webhook configuration. Client and
kube-apiserver will perform the same derivation, and both will derive the
same value. The exact format of the value has not yet been determined and
various alternatives are being weighed.
The webhook verifies that the token’s aud claim matches its configured
identity before accepting the request.
New JWT Private Claims
Tokens intended for authenticating to webhooks include the
following new fields in the kubernetes.io private claims of the
JWT. "{mutating,validating}WebhookConfiguration" are typical bound
objects. The newly-added attestation claims described above are nested as a
map[string][]string under "attestationClaims":
{
  "kubernetes.io": {
    "mutatingWebhookConfiguration": {
      "name": "mutagen-capsule",
      "uid": "44e818f2-2ad0-4432-9816-3a649ca9945c"
    },
    "attestationClaims": {
      "webhook-authentication.k8s.io/allowedAPIGroup": ["jungle.panda"]
    }
  }
}
or
{
  "kubernetes.io": {
    "validatingWebhookConfiguration": {
      "name": "splinter-validate",
      "uid": "b0f1b456-6f90-4546-b72c-d9000e5dead1"
    },
    "attestationClaims": {
      "webhook-authentication.k8s.io/allowedAPIGroup": ["jungle.panda"]
    }
  }
}
Where the mutatingWebhookConfiguration or validatingWebhookConfiguration
does not match the type of webhook receiving the token, the request may
be rejected.
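As an illustration of how a receiver might consume these private claims, the following Python sketch checks a decoded JWT payload of the shape shown above. It assumes the signature has already been verified against keys published at the OIDC discovery endpoint. The function name verify_claims and its parameters (including webhook_kind) are hypothetical and not part of this KEP. The length-1 check on the claim value is an added defensive assumption, and the sketch chooses to reject a bound-object mismatch even though the text says only that such a request may be rejected.

```python
CLAIM = "webhook-authentication.k8s.io/allowedAPIGroup"
BOUND_KEYS = ("validatingWebhookConfiguration", "mutatingWebhookConfiguration")


def verify_claims(payload, expected_audience, webhook_kind, request_api_group):
    """Check audience, bound object, and allowedAPIGroup on a decoded payload.

    webhook_kind is the private-claim key matching the receiving webhook's
    type, e.g. "validatingWebhookConfiguration" (hypothetical parameter).
    """
    # Per RFC 7519, "aud" may be a single string or a list of strings.
    aud = payload.get("aud", [])
    audiences = [aud] if isinstance(aud, str) else list(aud)
    if expected_audience not in audiences:
        return False

    private = payload.get("kubernetes.io", {})
    # The token must be bound to exactly one webhook configuration kind,
    # and that kind must match the webhook receiving the token.
    present = [key for key in BOUND_KEYS if key in private]
    if present != [webhook_kind]:
        return False

    # Assumption: treat anything other than a one-element list as malformed.
    value = private.get("attestationClaims", {}).get(CLAIM)
    if not isinstance(value, list) or len(value) != 1:
        return False
    # Accept an exact match for the request's APIGroup, or the wildcard "*".
    return value[0] in ("*", request_api_group)
```

A validating webhook handling a request for a resource in the jungle.panda group would call verify_claims(payload, audience, "validatingWebhookConfiguration", "jungle.panda") and reject the AdmissionReview if it returns False.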
Token Verification
The webhook may verify these tokens by taking the following steps:
- Verify the token’s signature via the OIDC discovery endpoint.
- Verify that the token’s audience matches the expected audience. This audience is derived deterministically from the webhook’s configuration.
- Verify that the JWT is bound to either a ValidatingWebhookConfiguration or a MutatingWebhookConfiguration, but not both.
- Verify that the value of the "webhook-authentication.k8s.io/allowedAPIGroup" attestation claim is either:
  a. An exact match for the APIGroup of the resource in the AdmissionReview request, or
  b. The exact string "*".
Token Review
The webhook library will be designed so that it does not perform
TokenReview. Nevertheless, it must be possible for bearers or receivers
of these specialized service account tokens to perform token validation via
TokenReview. Therefore, object-existence checks will need to be added for
the cases when the private claims contain a "mutatingWebhookConfiguration"
or "validatingWebhookConfiguration" field.
TokenReview will also validate the existence of at least one APIService
whose APIGroup field matches exactly the value corresponding to the
"webhook-authentication.k8s.io/allowedAPIGroup" key in the JWT’s
"attestationClaims" private claim.
Token Caching and Rotation
Service account tokens for authentication to webhooks are cached per
combination of webhook and APIGroup. When a cached token has expired, the
next webhook call for that combination triggers a new TokenRequest. Tokens
will expire after 10 minutes. Anything less is considered a validation error,
and anything more will be silently shortened to 10 minutes.
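The caching and expiration rules above can be sketched in Python as follows. This is an illustrative model only: WebhookTokenCache, clamp_expiration, and the request_token callable (standing in for a TokenRequest call) are hypothetical names, and the injectable clock exists purely to make the example testable.

```python
import time

TOKEN_LIFETIME_SECONDS = 600  # 10 minutes, per the text above


def clamp_expiration(requested_seconds):
    """Reject lifetimes under 10 minutes; shorten longer ones to 10 minutes."""
    if requested_seconds < TOKEN_LIFETIME_SECONDS:
        raise ValueError("requested lifetime is shorter than 10 minutes")
    return min(requested_seconds, TOKEN_LIFETIME_SECONDS)


class WebhookTokenCache:
    """Caches one token per (webhook, APIGroup) combination until it expires."""

    def __init__(self, request_token, clock=time.monotonic):
        # request_token(webhook, api_group) is a hypothetical callable that
        # performs the TokenRequest and returns the issued token.
        self._request_token = request_token
        self._clock = clock
        self._entries = {}  # (webhook, api_group) -> (token, expires_at)

    def get(self, webhook, api_group):
        key = (webhook, api_group)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[1]:
            return entry[0]  # cached token is still valid
        # Missing or expired: the next call for this combination re-requests.
        token = self._request_token(webhook, api_group)
        self._entries[key] = (token, now + TOKEN_LIFETIME_SECONDS)
        return token
```

An implementation might prefer to refresh shortly before expiry so that a token does not lapse while a webhook call is in flight; the sketch follows the text literally and refreshes only after expiry.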
Test Plan
[x] I/we understand the owners of the involved components may require updates to existing tests to make this code solid enough prior to committing the changes necessary to implement this enhancement.
Prerequisite testing updates
None identified at this time.
Unit tests
Unit tests will cover:
- TokenRequest with AttestationClaims field issues correct private claims.
- The attest authorization check is performed and enforced.
- TokenRequest validation is performed with BoundObjectRefs pointing to ValidatingWebhookConfiguration or MutatingWebhookConfiguration.
- The webhook dispatch path attaches the token as a bearer token when the feature gate is enabled.
- The webhook dispatch path does not attach a token when the feature gate is disabled.
Integration tests
- Token issuance and webhook dispatch end-to-end with a test webhook that verifies token claims.
- Rejection when the SA lacks attest permission.
- Rejection when the APIService for an APIGroup does not exist.
- Cache behavior: verify that a cached token is reused and that a new token is requested on expiry.
- Feature gate toggling: verify behavior with the gate on and off.
- A webhook rejects a request where the bound object does not match the webhook type.
e2e tests
- An aggregated API server authenticates to an admission webhook using a JWT bound to each of the two new bound object types.
- A webhook rejects a request where the APIGroup claims do not match the resource in the AdmissionReview body.
Graduation Criteria
Alpha
- Feature implemented behind feature gates.
- Webhook verification library.
- Webhook token issuance and webhook presentation functional for kube-apiserver.
- Webhook token issuance and webhook presentation functional for aggregated API servers.
- Initial unit and integration tests completed and enabled.
Beta
- All unit, integration, and e2e tests passing and stable.
- Feedback from early adopters incorporated.
- All known issues and gaps resolved.
GA
- At least two releases since beta with no regressions.
- Conformance tests added.
Upgrade / Downgrade Strategy
On upgrade to a version that enables the feature:
- kube-apiserver begins presenting bearer tokens to admission webhooks. Webhooks that do not verify bearer tokens are unaffected, since the token is presented as an Authorization header that the webhook can ignore.
- Existing kubeconfig-based authentication setups continue to function.
On downgrade or feature disablement:
- kube-apiserver stops presenting bearer tokens. Webhooks that have been configured to require token verification will reject requests. Operators must either re-enable the feature or reconfigure their webhooks.
Version Skew Strategy
This feature does not involve coordination between the control plane and
nodes. It is contained entirely within kube-apiserver and aggregated
API servers.
In a multi-replica HA cluster during rolling upgrade, some kube-apiserver
replicas may present bearer tokens to webhooks while others do not. Webhooks
that require token verification may see intermittent failures during the
rollout window. Webhooks should be configured to require bearer tokens only after
all replicas have been upgraded.
Production Readiness Review Questionnaire
Feature Enablement and Rollback
How can this feature be enabled / disabled in a live cluster?
- Feature gate (also fill in values in kep.yaml)
  - Feature gate name: APIServerWebhookAuthenticationToken
  - Components depending on the feature gate: kube-apiserver
Does enabling the feature change any default behavior?
Yes. When the issuance feature gate is enabled, kube-apiserver
will request a service account token (from itself) with a
"webhook-authentication.k8s.io/allowedAPIGroup" AttestationClaim whose
only value is "*". kube-apiserver will present the bearer token to the webhook. Webhooks
that do not inspect the Authorization header will be unaffected. Webhooks
configured to accept bearer tokens of a different format may error upon
receipt of this token.
This KEP is scoped to admission webhooks only. Other webhook types are out of scope.
Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?
Yes. Disabling APIServerWebhookAuthenticationToken and restarting
kube-apiserver will revert to the previous behavior. Webhooks that have
been configured to require the bearer token will begin rejecting requests,
since the API server will no longer present a token.
What happens if we reenable the feature if it was previously rolled back?
The feature will resume working as expected. No data migration or cleanup is required.
Are there any tests for feature enablement/disablement?
Unit tests will verify that when the feature gate is enabled, the webhook dispatch path presents a token. When the feature gate is disabled, no token is presented. Integration tests will exercise the full webhook call path with the feature gate toggled on and off.
Rollout, Upgrade and Rollback Planning
How can a rollout or rollback fail? Can it impact already running workloads?
During rollout in a multi-replica HA cluster, some replicas may present tokens while others do not. Webhooks that require tokens may see intermittent failures from replicas that have not yet been upgraded. This does not affect already running workloads directly, but it affects admission of new or modified objects during the rollout window.
On rollback, webhooks that were configured to require bearer tokens will reject all requests. Operators should reconfigure webhooks before or immediately after rollback.
What specific metrics should inform a rollback?
An increase in apiserver_admission_webhook_rejection_count with rejection
codes indicating authentication failure (401, 403) after enabling the feature
would indicate a problem. An increase in
apiserver_admission_webhook_fail_open_count would indicate that webhooks are
failing and the fail-open policy is being invoked more frequently than
expected.
Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
Integration tests will cover feature gate enablement and disablement. Manual testing of the upgrade->downgrade->upgrade path will be performed before beta promotion.
Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
No. The existing kubeconfig-based webhook authentication mechanism is not deprecated.
Monitoring Requirements
How can an operator determine if the feature is in use by workloads?
The feature is not workload-facing. It is a control plane behavior. An operator can determine the feature is active by checking the kube-apiserver feature gate configuration and by observing JWT-related metrics (see below).
How can someone using this feature know that it is working for their instance?
- Other (treat as last resort)
  - Details: A webhook operator can verify the feature is working by checking the Authorization header on incoming requests for a valid JWT with the expected audience and private claims. The kube-apiserver metrics below confirm that tokens are being issued and presented.
What are the reasonable SLOs (Service Level Objectives) for the enhancement?
Use of this feature should not change existing API SLOs. The additional latency from token issuance is amortized by caching.
What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
- Metrics
  - Metric name: apiserver_admission_webhook_admission_duration_seconds (existing)
  - Aggregation method: 99th percentile
  - Components exposing the metric: kube-apiserver
- Metrics
  - Metric name: apiserver_admission_webhook_rejection_count (existing)
  - Components exposing the metric: kube-apiserver
Are there any missing metrics that would be useful to have to improve observability of this feature?
New metrics to add:
- apiserver_webhook_authentication_token_request_total: counter of TokenRequests containing a BoundObjectRef to one of the webhook authentication bound object types.
- apiserver_webhook_authentication_token_request_duration_seconds: histogram of token request latency, for tokens used to authenticate to webhooks.
- apiserver_webhook_authentication_token_create_calls_total: counter of cache hits and misses when looking up cached webhook authentication tokens.
Dependencies
Does this feature depend on any specific services running in the cluster?
No new dependencies. The feature uses the existing TokenRequest API and
OIDC discovery endpoint, both of which are part of kube-apiserver.
Scalability
Will enabling / using this feature result in any new API calls?
Yes. kube-apiserver will make a TokenRequest API call (create serviceaccounts/token) prior to calling a webhook, when no valid cached
token exists. Each request with a ValidatingWebhookConfiguration or
MutatingWebhookConfiguration BoundObjectRef triggers an additional
authorization check (the attest verification). The webhook configuration
object is also fetched to verify it exists.
Aggregated API servers will make the same calls to kube-apiserver.
This additional load is mitigated by caching tokens for their lifetime. Once
a token is cached for a given webhook+APIGroup combination, no new API
calls are needed until the token expires.
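To make the amortization concrete, here is a rough back-of-the-envelope estimate in Python. It assumes the 10-minute token lifetime specified in this KEP, that every webhook+APIGroup combination is in continuous use, and that each token is re-requested only once it has expired. The function name and the example figures (50 webhooks, 3 API groups) are illustrative, not measurements.

```python
TOKEN_LIFETIME_SECONDS = 600  # 10-minute lifetime, per this KEP


def token_requests_per_hour(webhooks, api_groups):
    """Approximate steady-state TokenRequests per hour for one server.

    Each webhook+APIGroup combination needs about one new token per token
    lifetime, so the hourly rate is the number of combinations times the
    number of lifetimes in an hour.
    """
    combinations = webhooks * api_groups
    return combinations * 3600 // TOKEN_LIFETIME_SECONDS


# kube-apiserver uses the single "*" claim, so it has one combination
# per webhook.
print(token_requests_per_hour(webhooks=50, api_groups=1))
# A hypothetical aggregated API server serving 3 API groups.
print(token_requests_per_hour(webhooks=50, api_groups=3))
```

Under these assumptions the figures are per server instance, so a multi-replica control plane would multiply them by the number of replicas unless tokens are shared.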
Will enabling / using this feature result in introducing new API types?
No. However, the attest verb is introduced for use with the synthetic
"webhook-authentication.k8s.io/apigroups" resource.
JWT private claims will also be augmented.
Will enabling / using this feature result in any new calls to the cloud provider?
If service account token signing has been offloaded to an external signer,
there will be an increase in signing requests proportional to the number
of unique webhook+APIGroup combinations.
Will enabling / using this feature result in increasing size or count of the existing API objects?
Yes. Each aggregated API server will have a dedicated service account for
token requests. kube-apiserver will have an additional service account
for the same purpose. Additional RBAC roles and bindings will be needed.
Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?
On the first webhook call for a given webhook+APIGroup combination, there
will be additional latency from the TokenRequest and authorization check.
Subsequent calls use the cached token and incur no additional latency. The
cost is amortized over the token’s lifetime.
Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, …) in any components?
Minimal increase in memory for the token cache (one JWT per webhook for
kube-apiserver, one JWT per webhook+APIGroup combination for aggregated API
servers). CPU impact from token signing is negligible and amortized by caching.
Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?
No. This feature does not affect nodes.
Troubleshooting
How does this feature react if the API server and/or etcd is unavailable?
If kube-apiserver is unavailable, it makes no webhook calls and the feature
is moot for kube-apiserver itself. If etcd is unavailable, the dedicated
service account and webhook configuration objects cannot be read, and token
issuance will fail. An aggregated API server also cannot obtain the list of
webhooks while kube-apiserver is unavailable, so it will not request
a token unless it already has the list. Webhook calls will proceed without
a token, but will be rejected if the webhook has opted in to requiring one.
Additionally, aggregated API servers that cannot obtain a webhook
authentication token from kube-apiserver will be unable to call webhooks
that require one. This is mitigated somewhat by caching each token until
its expiration (at most ten minutes).
What are other known failure modes?
- kube-apiserver token service account is deleted
  - Detection: apiserver_webhook_authentication_token_request_total with failure label increases.
  - Mitigations: kube-apiserver will need to be restarted, in order for the service account to be recreated as part of the bootstrapping process.
  - Diagnostics: kube-apiserver logs will show token request failures.
  - Testing: Integration tests cover SA deletion and recreation.
- Token acquisition service account lacks attest permission
  - Detection: apiserver_webhook_authentication_token_request_total with failure label increases. Webhook calls proceed without authentication or fail, depending on webhook configuration.
  - Mitigations: Grant the attest permission.
  - Diagnostics: kube-apiserver logs will show authorization denial for the attest check.
  - Testing: Integration tests cover missing attest permission.
- Webhook rejects token due to claims mismatch
  - Detection: apiserver_admission_webhook_rejection_count increases.
  - Mitigations: Verify that the webhook is correctly matching the allowedAPIGroup claim against the resource in the AdmissionReview body.
  - Diagnostics: Webhook server logs will show the specific claim mismatch.
  - Testing: e2e tests cover claims mismatch rejection.
What steps should be taken if SLOs are not being met to determine the problem?
- Check apiserver_webhook_authentication_token_request_total for token request failures.
- Check apiserver_admission_webhook_rejection_count for webhook rejections.
- Check apiserver_admission_webhook_admission_duration_seconds for increased latency.
- Verify the dedicated SA exists and has the correct permissions (possibly RBAC):
  kubectl auth can-i attest "apigroups.webhook-authentication.k8s.io/porpoise.ai" --as=system:serviceaccount:dolphin:educator
- If the problem cannot be resolved, disable the APIServerWebhookAuthenticationToken feature gate and restart kube-apiserver.
Implementation History
Drawbacks
- Additional authorization checks on each token request add some overhead, though this is mitigated by caching.
- Webhook authors need to implement token verification to benefit from the feature, though a verification library will be provided.
- The feature introduces a new use of the attest verb, extends BoundObjectRef to support two new types, and adds a new AttestationClaims field, all of which add surface area to the TokenRequest API.
Alternatives
APIService as BoundObjectRef
This design was very similar to the KEP’s final design, and was the initially
proposed design. A TokenRequest was bound to an APIService instead of the
webhook configuration. An intermediate design handled both cases. The former was
unsatisfactory due to the token maintenance burden placed on kube-apiserver,
which serves potentially thousands of different APIServices. The latter
was complex, fiddly, and harder to reason about than the current design.
Client Certificates (mTLS)
kube-apiserver could authenticate to webhooks using client certificates
(e.g., the existing front-proxy cert). This was considered but has drawbacks:
L7 proxies terminate TLS and strip client certs, making this unreliable in
common deployment topologies (service meshes, cloud load balancers, ingress
controllers). Bearer tokens survive L7 proxies because they are HTTP headers.
Designated ServiceAccount (“Magic SA”)
A well-known service account name could represent the API server’s identity. This was considered but rejected because it expands the semantic meaning of ServiceAccount from “workload identity” to “control-plane identity” and relies on a magic name convention rather than explicit authorization.
ServiceAccount Token with Identity in Private Claims
Any service account token could carry a special claim indicating API server identity, gated by a synthetic subresource authorization check. This was considered but rejected in favor of binding to the webhook configuration object and attesting an allowed APIGroup, which provides a more semantically precise identity (the caller is authorized for a specific API group, not just “is an API server”).
AdmissionReview Delegation
Aggregated API servers could delegate admission to kube-apiserver via a
new AdmissionReview REST API. This was considered but rejected due to its
large scope (requiring its own KEP and significant API surface) and because
it would not address kube-apiserver’s own authentication to webhooks.