KEP-3488: CEL for Admission Control

Release Signoff Checklist
Summary
Motivation
- Goals
- Non-Goals
Background
Considerations
Proposal
Design Details
GA
- Upgrade / Downgrade Strategy
- Version Skew Strategy
Production Readiness Review Questionnaire
Implementation History
Drawbacks
Future Work
Alternatives
Infrastructure Needed (Optional)

Release Signoff Checklist

Items marked with (R) are required prior to targeting to a milestone / release.

(R) Enhancement issue in release milestone, which links to KEP dir in kubernetes/enhancements (not the initial KEP PR)
(R) KEP approvers have approved the KEP status as implementable
(R) Design details are appropriately documented
(R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
- e2e Tests for all Beta API Operations (endpoints)
- (R) Ensure GA e2e tests meet requirements for Conformance Tests
- (R) Minimum Two Week Window for GA e2e tests to prove flake free
(R) Graduation criteria is in place
- (R) all GA Endpoints must be hit by Conformance Tests
(R) Production readiness review completed
(R) Production readiness review approved
“Implementation History” section is up-to-date for milestone
User-facing documentation has been created in kubernetes/website, for publication to kubernetes.io
Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes

Summary

This is a proposal for customizable, in-process validation of requests to the Kubernetes API server as an alternative to validating admission webhooks.

This proposal builds on the capabilities of the CRD Validation Rules feature that graduated to beta in 1.25, but with a focus on the policy enforcement capabilities of validating admission control.

Motivation

This KEP will lower the infrastructure barrier to enforcing customizable policies and provide primitives that help the community establish and adhere to best practices for both K8s and its extensions.

Currently, custom policies are enforced via admission webhooks. Admission webhooks are extremely flexible, but have a few drawbacks compared to in-process policy enforcement:

They require building infrastructure to host the admission webhook.
They contribute to latency by requiring another network hop.
Due to the extra infrastructure dependencies, webhooks are inherently less reliable than in-process policy enforcement. This forces cluster operators to choose between failing closed, which reduces the availability of the cluster as a whole, and failing open, which limits the efficacy of webhooks for enforcing policy.
Webhooks are operationally burdensome for cluster administrators to manage. They must take responsibility for the observability, security and the release/rollout/rollback plans for the webhook.

Taking a view of the K8s ecosystem as a whole, it is clear that there is demand for opinionated policy frameworks.

Pod Security Policies provided this for pods, but encountered a number of issues. One was that it was hard to keep up with community demand for more control surfaces, and delivery of those control surfaces was delayed by K8s’ rollout period.

Pod Security Admission is a similar solution, but does not attempt to duplicate the control granularity that PSP provided.

There are numerous in-tree embedded controllers.

The existence of security regimes like the CIS Kubernetes Benchmarks highlights the value of standardized controls. Automating their enforcement, where possible, will make it easier for users to lock down their clusters.

With the advent of CRDs, and the drive to make the resources they define first-class entities, the footprint of Kubernetes extensions is set to grow for the foreseeable future. This KEP allows authors of such extensions to provide policy primitives similar to PSP or PSA, putting them on equal footing with in-tree functionality.

With the reduced infrastructure footprint and demonstrated demand for a customizable, built-in mechanism for extensible policy, this KEP fills a community need. It is not intended to replace validating admission webhooks altogether, however, since these can support functionality that may not make sense to provide in-tree.

Goals

Provide an alternative to webhooks for the vast majority of validating admission use cases.
Provide the in-tree extensions needed to build policy frameworks for Kubernetes, again without requiring webhooks for the vast majority of use cases.
Make good use of CEL type checking. This becomes complicated when considering that CRD schemas can be changed at any time and that not all fields of built-in types exist in older Kubernetes versions.
Provide a polyfill implementation that is supported by the Kubernetes org to provide this enhancement functionality to Kubernetes versions where this enhancement is not available.
Provide core functionality as a library so that use cases like GitOps, CI/CD pipelines, and auditing can run the same CEL validation checks that the API server does.

Non-Goals

Build a comprehensive in-tree policy framework. We believe the ecosystem is best equipped to explore and develop policy frameworks. We’re focusing on building an extensible enforcement point into admission control that can be used to build policy frameworks.
- Examples of what policy frameworks might do beyond this enhancement:
  - Auditing already written resources
  - Building out libraries for code reuse
  - Validating Kubernetes resource YAML files adhere to a policy in a CI/CD pipeline
Mutation support. While we believe this enhancement could be extended in the future to support mutations, we believe it is best handled as a separate enhancement. That said, we need to keep mutation in mind in this KEP to ensure we design it in such a way that we don’t obviously paint ourselves into a corner where mutation would be difficult to introduce later.
Full feature parity with validating admission webhooks. For example, this enhancement is not expected to ever support making requests to external systems.
Replace the admission controllers compiled into the API server.
Static or on-initialization specification of admission config. This is a needed feature but should be solved in a general way and not in this KEP (xref: https://github.com/kubernetes/enhancements/issues/1872).

Background

This is not a new idea. Tristan Swadell (@TristonianJones) explored policy for Kubernetes using CEL with cel-policy-templates-go, and Jordan Liggitt (@liggitt) prototyped using CEL for in-process admission control in Kubernetes in 2020.

CRD Validation Rules were implemented as a more constrained subset of this problem and addressed how to integrate the Kubernetes type system with CEL.

Considerations

Admission Webhook Parity

Users of the Kubernetes API are already familiar with ValidatingWebhookConfigurations. We should strive for consistency with this API unless there is a good reason to diverge from it. As a concrete example, we should provide access to all the information that webhooks have access to (see AdmissionRequest), and if we provide access to additional information, we should extend AdmissionRequest to include it. (But we need to be careful not to make AdmissionRequest significantly larger, as this will impact the performance/latency of existing webhooks that do not use the additional information. We should also be careful about providing webhooks with access to cross-object information, like namespace objects, since it can be stale.)

Configurability

Consider an admission rule that disallows requests based on a blocklist.

While it is possible to inline a blocklist directly into a CEL expression as a data literal (!(object.metadata.name in ['blocked1', 'blocked2'])), this quickly becomes problematic:

Long blocklists become unwieldy quickly in CEL expressions
A blocklist per scope (e.g. namespace) is inconvenient to express and maintain

The need to configure admission rules is common enough (see the use cases below) that we propose configuration be a first-class concept in the API.

Since all the policy frameworks we have surveyed have configurability as a first-class concept, omitting it would result in the policy frameworks either not adopting this enhancement (and sticking with webhooks) or somehow bypassing the limitation. One possible approach would be to generate a CEL expression with the configuration data embedded as a data literal. We would strongly prefer that policy frameworks not do this: CEL is designed for evaluation over structured input, such as configuration data, and generating a CEL expression for each possible configuration would be sub-optimal from an evaluation and maintenance standpoint.
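
To illustrate the point about structured input, here is a minimal Python sketch (it is not CEL and not part of the proposed API). It assumes a hypothetical parameter field named blocked and hypothetical namespaces, and models one fixed check evaluated against different parameter objects instead of a separately generated expression per configuration.

```python
# Illustrative model only. The "blocked" parameter field and the per-namespace
# parameter objects below are hypothetical examples, not part of the proposed API.

def name_allowed(obj: dict, params: dict) -> bool:
    # Plays the role of a CEL expression such as:
    #   !(object.metadata.name in params.blocked)
    return obj["metadata"]["name"] not in params["blocked"]


# One check, many configurations: only the data changes per scope.
params_by_namespace = {
    "team-a": {"blocked": ["blocked1", "blocked2"]},
    "team-b": {"blocked": ["legacy-db"]},
}

request_object = {"metadata": {"name": "blocked1"}}
for namespace, params in params_by_namespace.items():
    print(namespace, name_allowed(request_object, params))
```

The check itself never changes when a blocklist grows or a new scope is added; only the parameter data does.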

Migration

With webhooks already in large scale use in the Kubernetes ecosystem, we intend to prioritize capabilities that ease migration. As a concrete example, when migrating, having fine grained control of what validation messages are returned and how they are formatted can make a migration far more seamless.

Compliance

In-process admission control has fundamental advantages over webhooks: it is far safer to use in a “fail closed” mode because it removes the network as a possible failure domain. With webhooks, using “fail closed” can negatively impact cluster availability. But “fail closed” is very valuable when enforcing compliance (and security). We intend to prioritize capabilities that make “fail closed” a safe mode of operation. As a concrete example, only allowing CEL expressions that pass compilation and type checking significantly reduces the opportunities for runtime errors.

Also, making it possible (and convenient) to declare “zero trust” policies is important to compliance. By “zero trust”, we mean policy rules that apply the principle of least privilege to newly created resources (e.g. namespaces) where the policy is initially set to the most restrictive state and can be made less restrictive via configuration.

Proposal

Introduce a new ValidatingAdmissionPolicy kind to the admissionregistration.k8s.io group. (suggestions welcome on exact name to use for kind)

At a high level, the API will support:

Request matching (similar to the match rules of admission webhooks, RBAC, priority & fairness and Audit)
CEL rule evaluation (similar to CRD Validation Rules, but with access to the data in AdmissionRequest)
Version conversion support (similar to admission webhook’s MatchPolicy)
Access to the old object (similar to transition rules and oldObject in AdmissionRequest)
Configurability, as motivated above.

There are also lots of additional capabilities (response message formatting, failure policies, type safety, advanced matching rules, …) that will be discussed in detail further on in this proposal.

We have divided this proposal into phases, all of which must be completed before this feature graduates to beta. Our goal is to size the phases so that each can be completed in a single Kubernetes release cycle.

Phase 1

API Shape

Before getting into all the individual fields and capabilities, let’s look at the general “shape” of the API.

This API separates policy definition from policy configuration by splitting responsibilities across resources. The resources involved are:

Policy definitions (ValidatingAdmissionPolicy)
Policy bindings (ValidatingAdmissionPolicyBinding)
Policy param resources (custom resources or config maps)

Relationships between policy resources

This separation has already been demonstrated successfully by multiple policy frameworks (see the survey further down in this KEP). It has a few key properties:

Reduces total amount of resource data needed to manage policies:
- Params can be shared across multiple policies instead of copied. For example, multiple policies can enforce different aspects of a “no external connections” requirement while all sharing the same configuration.
- Policies can be configured in different ways for different use cases without having to copy the policy definition.
- Rollouts and canary-ing can be managed largely via bindings without having to copy policy definitions or params.
Ownership of resources aligns well with typical separation of roles for policy management.
Existing policy frameworks can leverage this design far more easily because it aligns with how separation of concerns is expressed by most policy frameworks.

Each ValidatingAdmissionPolicy resource defines an admission control policy. The resource contains the CEL expressions to validate the admission policy and declares how the admission policy may be configured for use.

For example:

# Policy definition
apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicy
metadata:
  name: "replicalimit-policy.example.com"
spec:
  paramKind:
    group: rules.example.com
    kind: ReplicaLimit
    version: v1
  matchConstraints:
    resourceRules:
    - apiGroups:   ["apps"]
      apiVersions: ["v1"]
      operations:  ["CREATE", "UPDATE"]
      resources:   ["deployments"]
  validations:
    - name: max-replicas
      expression: "object.spec.replicas <= params.maxReplicas"
      messageExpression: "'object.spec.replicas must be no greater than %d'.format([params.maxReplicas])"
      reason: Invalid
      # ...other rule related fields here...

The spec.paramKind field of the ValidatingAdmissionPolicy specifies the kind of resources used to parameterize this policy. For this example, it is configured by ReplicaLimit custom resources. Note in this example how the CEL expression refers to the parameters via the CEL params variable, e.g. params.maxReplicas.

spec.matchConstraints specifies what resources this policy is designed to validate. This also guides type-checking, see the “Informational type checking” section for details.

The spec.validations fields contain CEL expressions. If an expression evaluates to false, the validation check is enforced according to the enforcement field.

This is a “Bring Your Own CRD” design. The admission policy definition author is responsible for providing the ReplicaLimit parameter CRD.

To configure an admission policy for use in a cluster, a binding and parameter resource are created. For example:

# Policy binding
apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: "replicalimit-binding-test.example.com"
spec:
  policyName: "replicalimit-policy.example.com"
  paramRef:
   name: "replica-limit-test.example.com"
   namespace: "default"
  matchResources:
    namespaceSelectors:
    - key: environment
      operator: In
      values: ["test"]

# Policy parameters
apiVersion: rules.example.com/v1
kind: ReplicaLimit
metadata:
  name: "replica-limit-test.example.com"
maxReplicas: 3

This policy parameter resource limits deployments to a max of 3 replicas in all namespaces in the test environment.

An admission policy may have multiple bindings. To give all other environments a maxReplicas limit of 100, create another ValidatingAdmissionPolicyBinding:

apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: "replicalimit-binding-nontest"
spec:
  policyName: "replicalimit-policy.example.com"
  paramRef:
    name: "replica-limit-clusterwide.example.com"
    namespace: "default"
  matchResources:
    namespaceSelectors:
    - key: environment
      operator: NotIn
      values: ["test"]

apiVersion: rules.example.com/v1
kind: ReplicaLimit
metadata:
  name: "replica-limit-clusterwide.example.com"
maxReplicas: 100

Bindings can have overlapping match criteria. The policy is evaluated for each matching binding. In the above example, the “nontest” policy binding could instead have been defined as a global policy:

apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: "replicalimit-binding-global"
spec:
  policyName: "replicalimit-policy.example.com"
  paramRef:
   name: "replica-limit-clusterwide.example.com"
  matchResources:
    namespaceSelectors:
    - key: environment
      operator: Exists

With this binding, the test and global policy bindings overlap. Resources admitted to the test environment would then be checked against both policy configurations.
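
As a rough illustration of "the policy is evaluated for each matching binding", the following Python sketch models the test and global bindings from the examples above. It is a simplified model under stated assumptions, not the API server implementation: the Binding class and the evaluate function are hypothetical names, and namespace selection is reduced to a predicate over namespace labels.

```python
# Illustrative model only; Binding and evaluate are hypothetical names,
# not part of the proposed API or the API server implementation.
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Binding:
    name: str
    matches: Callable[[Dict[str, str]], bool]  # predicate over namespace labels
    params: Dict[str, int]


def evaluate(replicas: int, ns_labels: Dict[str, str], bindings: List[Binding]) -> List[str]:
    """Return the names of the matching bindings whose params are violated."""
    violations = []
    for binding in bindings:
        if not binding.matches(ns_labels):
            continue  # this binding does not apply to the namespace
        # Mirrors the expression: object.spec.replicas <= params.maxReplicas
        if not replicas <= binding.params["maxReplicas"]:
            violations.append(binding.name)
    return violations


bindings = [
    Binding("replicalimit-binding-test.example.com",
            lambda labels: labels.get("environment") == "test",
            {"maxReplicas": 3}),
    Binding("replicalimit-binding-global",
            lambda labels: "environment" in labels,
            {"maxReplicas": 100}),
]
```

In this model a deployment with 5 replicas in a namespace labeled environment=test violates only the test binding, while one with 101 replicas violates both.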

Policy Definitions

Policy definitions are responsible for:

Defining what validations the policy enforces and how violations are reported
Defining how a policy may be configured

Each ValidatingAdmissionPolicy resource contains a spec.matchConstraints to declare what resources it validates. This field is required.

spec.matchConstraints constrains which resources this policy can be applied to. Policy bindings each have match rules which further narrow this constraint, but cannot expand it. This allows the CEL expressions to make safe assumptions. E.g. a CEL expression that is constrained to CREATE and UPDATE operations is guaranteed that the root object variable is never null, but a CEL expression that might need to evaluate a DELETE must handle the root object variable being null. See the “Match Criteria” section below for how match criteria are described.
spec.matchConstraints guides CEL expression type checking, see the below “Type safety” section for more details.
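
The "narrow but not expand" property can be pictured as an intersection of two predicates. The Python sketch below is a simplified model under that assumption; Request, policy_constraints and binding_match are hypothetical stand-ins for an admission request, spec.matchConstraints and a binding's spec.matchResources.

```python
# Illustrative model only; all names here are hypothetical stand-ins.
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    resource: str
    operation: str


def policy_constraints(req: Request) -> bool:
    # Policy definition: CREATE and UPDATE of deployments only.
    return req.resource == "deployments" and req.operation in ("CREATE", "UPDATE")


def binding_match(req: Request) -> bool:
    # A binding that tries to match every operation on deployments.
    return req.resource == "deployments"


def policy_applies(req: Request) -> bool:
    # The effective match is the intersection of both predicates, so a binding
    # can narrow the policy's constraints but can never widen them.
    return policy_constraints(req) and binding_match(req)
```

Even though the binding in this sketch matches DELETE, the policy is never evaluated for a DELETE, so its expressions can rely on object being present.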

CEL expressions have access to the contents of the AdmissionReview type, organized into CEL variables as well as some other useful variables:

‘object’
‘oldObject’
‘request’
- ‘requestResource’ (GVR)
- ‘resource’ (GVR)
- ‘name’
- ‘namespace’
- ‘operation’
- ‘userInfo’
- ‘dryRun’
- ‘options’
‘params’ - the referenced params object; may be null if no object is referenced

See below “Decisions and Enforcement” for more detail about how the spec.validations field works and how violations are reported.

Policy Configuration

ValidatingAdmissionPolicyBinding resources and parameter CRDs together define how cluster administrators configure policies for clusters.

Each ValidatingAdmissionPolicyBinding contains:

spec.policyName - A reference to the policy being configured
spec.matchResources - Match criteria for which resources the policy should validate
spec.paramRef - Reference to the custom resource containing the params to use when validating resources

Example:

apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: "xyzlimit-scale.example.com"
spec:
  policyName: xyzlimit-scale.example.com
  paramRef:
    name: xyzlimit-scale-settings.example.com
  matchResources:
    namespaceSelectors:
    - key: environment
      operator: Exists

Each parameter CRD defines the custom resources that are referenced by the spec.paramRef field of PolicyBinding resources. Example:

apiVersion: rules.example.com/v1
kind: XyzLimit
metadata:
  name: "xyzlimit-scale-settings.example.com"
allowed: ["a", "b", "c"]
banned: ["x", "y", "z"]
xyz:
  fuzzFactor: 0.8
  reticulate: true

We will recommend a tag / label / annotation on every CRD that is used for policy params, so that these CRDs can be easily identified and queried.

Note that this API design simplifies well to basic cases:

For policies that require no parameterization, only the PolicyBinding is needed.
For policies that are global, see the below “Default Validations” section and “singleton policies” for how a policy can be created using a single resource.

See “Alternatives considered” section for rejected alternatives. This design was selected because:

The param CRD schema is owned entirely by the policy author.
Matching criteria is fully defined and validated in the builtin PolicyBinding type.
Type checking is straightforward.
Policy parameterization is separated from the policy binding, allowing well-abstracted parameterization types to be applied in different ways by multiple policies.
Some rollouts become easier, e.g. adding a new validation rule to a policy.

Details:

Namespace Collisions:
- This design is similar to RoleBindings, which use a roleRef. If we are to support both cluster and namespace scoped policy definitions, we need the same structure as roleRef for policyRef.
- To address this, we will require parameter CR names of the form <identifier>.<resourceName>.<apiGroup>
- Fix: Require the name to include the parameter type’s group and resource.
Invalid Configurations:
- With 4 different resources involved in each validating admission policy check (policy, binding, parameter CRD, parameter CR), there are many combinations of the states of these resources that are invalid. E.g.:
  - parameter CRD does not exist
  - binding refers to policy or parameter resource that does not exist
  - policy CEL expressions references fields that do not exist in parameter CRD
- If a policy binding is in any of these states, it is identified as “invalid” and the failure policy is applied.
Privileges to access policy bindings implies control of policy configurations:
- To address this, binding resources will have an extra authorization check to verify that anyone modifying the binding is also permitted to modify the parameters resource.
- Open question: should we use a verb for a secondary authz check that the policy binding editor has policy parameter edit roles? (tallclair suggested this; it has nice properties.)

API details:

Name this ClusterPolicyBinding so we can add PolicyBinding (namespace scoped) in the future?

Match Criteria

During admission, the Kubernetes API server will validate the resource being admitted against all policy configuration resources that match the resource.

While webhook match rules give a good sense of what types of capabilities might be needed, they serve a slightly different purpose. Webhook match rules make it possible to avoid webhook requests, which incur latency and impact availability, for resources that don’t need to be evaluated by the webhook. CEL expressions have a comparatively small impact on latency and are in-process (and so do not have the same impact on availability).

For CEL expressions, the primary benefits of match criteria are:

Match criteria establish bounds on what sort of admission requests the CEL expressions must consider. The CEL expressions can be written knowing that the match criteria have filtered out requests that they do not need to consider.
Match criteria is available on policy bindings and allows the binding author to further constrain what resources the particular binding applies to.

We did consider not having any “YAML matching” for this feature and instead pushing all matching into CEL. The main deciding factors were:

Kubernetes already has resource matching as a well established concept
Match criteria can be indexed, accelerated, or built into decision trees
Match criteria can evaluate only to true/false. There is no ’error’ case to consider
Match criteria can be used to guide static typing. If the match is for v1.Deployment, we know ahead of runtime what type the object variable is

Matching is performed in quite a few systems across Kubernetes:

Match type	Usages in existing matchers	Support?
namespace	Audit, P&F	phase 1
namespace label selectors	WH	phase 1
label selectors	WH	phase 1
annotations		No
apiGroup + resource	WH/Audit/P&F/RBAC	phase 1
apiVersion	WH	phase 1
resource name	Audit/RBAC	phase 1
scope (cluster|namespace)	WH/P&F	phase 1
operation (HTTP verb)	WH/Audit/P&F	phase 1
exclude	Audit (level=None)	phase 1
apiVersion + kind		phase 1
NonResourceURLs	Audit/RBAC/P&F	No
user/userGroup	Audit	phase 1 - request.userInfo.groups
user.Extra	(in WH AdmissionReview)	phase 1 - request.userInfo.extra
permissions (RBAC verb)	RBAC	phase 2 see “Secondary Authz” section

WH = Admission webhooks, P&F = Priority and Fairness

Match criteria must be declared in the spec.matchResources field of PolicyBinding resources (see ReplicaLimit in the above example) and will be declared with API types in a format similar to admission webhooks, P&F, RBAC and Audit, but with improved support for exclude matching.

Excluding:

Exclude support makes it possible to do things like validate all namespaces except kube-system. This is difficult to support without direct exclude support, particularly for 3rd party policy enforcement systems that cannot assume permission to set labels on kube-system.

Exclude matching will be offered by adding exclude<fieldname> for match fields where exclude is appropriate. Fields such as namespaceSelector that already offer exclusion (e.g. via the NotIn operator) do not need a corresponding exclude<fieldname>.

E.g.:

  excludeNamespaces: ["kube-system"] # excludeNamespaces and namespaces are mutually exclusive
  excludePermissions: ["all-the-superpowers"]
  namespaceSelector: # Already has exclude support via NotIn and DoesNotExist
  - key: "xyz"
    operator: NotIn
    values: ["1"]
  excludeResourceRules: # excludeResourceRules takes precedence over resourceRules
  - apiGroups:        ["apps"]
    apiVersions:      ["*"]
    operations:       ["*"]
    resources:        ["deployments"]
  resourceRules:
  - apiGroups:        ["apps"]
    apiVersions:      ["*"]
    operations:       ["*"]
    resources:        ["*"]
  ...
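
The precedence of excludeResourceRules over resourceRules can be sketched in Python as follows. This is a simplified model, not the proposed implementation: the function names are hypothetical, rules are plain dictionaries mirroring the YAML above, and "*" is treated as a simple match-anything wildcard.

```python
# Illustrative model only; function names are hypothetical and wildcard
# handling is deliberately simplified.
from typing import Dict, List

Rule = Dict[str, List[str]]


def _field_matches(allowed: List[str], value: str) -> bool:
    return "*" in allowed or value in allowed


def rule_matches(rule: Rule, group: str, version: str, operation: str, resource: str) -> bool:
    return (_field_matches(rule["apiGroups"], group)
            and _field_matches(rule["apiVersions"], version)
            and _field_matches(rule["operations"], operation)
            and _field_matches(rule["resources"], resource))


def matches(resource_rules: List[Rule], exclude_rules: List[Rule],
            group: str, version: str, operation: str, resource: str) -> bool:
    request = (group, version, operation, resource)
    # Exclusions are checked first, so they win over any include rule.
    if any(rule_matches(rule, *request) for rule in exclude_rules):
        return False
    return any(rule_matches(rule, *request) for rule in resource_rules)


# The rules from the example above.
exclude_rules = [{"apiGroups": ["apps"], "apiVersions": ["*"],
                  "operations": ["*"], "resources": ["deployments"]}]
resource_rules = [{"apiGroups": ["apps"], "apiVersions": ["*"],
                   "operations": ["*"], "resources": ["*"]}]
```

With these rules, everything in the apps group matches except deployments.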

Special case: apiGroup + resource + operation matching

For admission webhooks, at least one spec.rules must be declared to state which apiGroup + resource + operations the webhook operates on. To configure a webhook to match everything (which is a very bad idea), a match rule would need to be written to state that, e.g.:

spec:
  rules:
  - apiGroups:   ["*"]
    apiVersions: ["*"]
    operations:  ["*"]
    resources:   ["*"]

This forces the webhook configuration author to explicitly declare what they intend to match.

The same principle applies here, but with one major difference: if a policy definition has match rules for apiGroup + resource + operation, then all bindings of that policy are already constrained to that apiGroup + resource + operation match.

Take, for example:

apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicy
...
spec:
  paramKind: ...
  matchConstraints:
    resourceRules:
    - apiGroups:   ["apps"]
      apiVersions: ["v1"]
      operations:  ["CREATE", "UPDATE"]
      resources:   ["deployments"]

Since this policy is constrained to create/update of deployments, policy bindings don’t need to repeat this constraint. This would be sufficient:

apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicyBinding
...
spec:
  matchResources:
    namespaceSelectors:
    - key: environment
      operator: Exists
 ...

We can enforce this by requiring:

Policy definition match rules are validated to match one or more apiGroup + resource + operation.
Policy bindings are not required to match apiGroup + resource + operation since the policy definition is already required to do this. But they may further narrow down the match. This is useful when rolling out policies. E.g. when transitioning a policy from Warn to Deny, being able to narrow down the resource match allows for more fine grained rollout steps.

This encourages policy definition authors to consider the assumptions that the CEL expressions make. If the expressions unconditionally access object without a has(object) check, the expression will only ever work on CREATE and UPDATE and would fail at runtime on a DELETE. It is quite difficult to write CEL expressions that handle all admission requests well, so we want to guide policy authors toward matching only the requests that they intend to support. The Kubernetes admission chain also scales better, and is more resilient, when matching is precise and the validation expressions don’t need to do any post-matching checks that could have been handled by matching.

Note that if a policy binding is to be applied as broadly as possible (i.e. everywhere allowed by the policy definition) it must do so by using a wildcard match rule.

Match Policy:

MatchPolicy will work the same as for admission webhooks. It will default to Equivalent but may be set to Exact. See “Use Case: Multiple policy definitions for different versions of CRD” for an explanation of why we need MatchPolicy.

Decisions and Enforcement

This section focuses on how policies make decisions, how those decisions are enforced, and how decisions are reported back to the client.

Goals:

Feature parity with AdmissionReview
- Support allow/deny result, warnings, audit annotations
- Support for reasons/codes
Ability to format message strings

Policy definitions:

Each validation may define a message:
- message - plain string message
- messageExpression: "<cel expression>" (mutually exclusive with message)
  - As part of the KEP update to add expression composition, expressions defined under variables will be accessible from messageExpression
  - messageExpression is a CEL expression and thus factors into the runtime cost limit. If the runtime cost limit is exceeded during messageExpression execution, then this is logged. Whether or not the action is admitted after that depends upon failure policy.
- If message and messageExpression are absent, expression and name will be included in the failure message
- If messageExpression results in an error: expression and name will be included in the failure message plus the arg evaluation failure
- reason and/or code - these fields have the same semantics as in admission review; the reason clarifies the code but does not override it. If the reason is well known (e.g. “Unauthorized” is well known to be code 401), then the code will be inferred from the reason and use of a different code will not be allowed.
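
The message and code rules above can be summarized in a short Python sketch. This is only a model of the selection logic: failure_message and resolve_code are hypothetical names, the exact wording of the fallback message is an assumption, messageExpression is represented by a Python callable, and the well-known reason table contains only the single pairing stated above.

```python
# Illustrative model only; names and message wording are hypothetical.
from typing import Callable, Optional

# Only the pairing named in the text; a real table would contain more entries.
WELL_KNOWN_REASON_CODES = {"Unauthorized": 401}


def failure_message(name: str, expression: str,
                    message: Optional[str] = None,
                    message_expression: Optional[Callable[[], str]] = None) -> str:
    if message is not None:
        return message  # plain string message
    fallback = f"validation {name!r} failed: {expression}"
    if message_expression is not None:
        try:
            return message_expression()
        except Exception as err:
            # Fall back to name and expression, plus the evaluation failure.
            return f"{fallback} (messageExpression error: {err})"
    return fallback


def resolve_code(reason: Optional[str], code: Optional[int]) -> Optional[int]:
    inferred = WELL_KNOWN_REASON_CODES.get(reason)
    if inferred is None:
        return code  # reason is not well known; keep the declared code
    if code is not None and code != inferred:
        raise ValueError(f"reason {reason!r} implies code {inferred}, got {code}")
    return inferred
```

In this model message and messageExpression are treated as mutually exclusive by the caller; the sketch simply prefers message if both are passed.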

Example policy definition:

# Policy definition
apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicy
metadata:
  name: "validate-xyz.example.com"
spec:
  ...
  validations:
    - expression: "self.name.startsWith('xyz-')"
      name: name-prefix
      message: "self.name must start with xyz-"
      reason: Unauthorized
    - expression: "self.name.contains('bad')"
      name: bad-name
      message: "name contains 'bad' which is discouraged due to ..."
      code: 400
      reason: Invalid
    - expression: "self.name.contains('suspicious')"
      name: suspicious-name
      message: "'self.name contains suspicious'"
      code: 400
      reason: Invalid

xref:

https://open-policy-agent.github.io/gatekeeper/website/docs/next/violations/

Failure Policy

Because failure policy is most often selected based on the need to guarantee enforcement, we will default failure policy to “fail” and allow it to be configured on a per-policy basis:

apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicy
spec:
  ...
  failurePolicy: Ignore # The default is "Fail"
  validations:
    - expression: "object.spec.xyz == params.x"
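
The Python sketch below models how the two failure policy values might differ. It assumes, as with admission webhooks, that failure policy governs evaluation errors (for example a runtime error in an expression) rather than expressions that evaluate to false; the admit function is a hypothetical name and each validation is represented by a Python callable.

```python
# Illustrative model only; admit is a hypothetical name and the error
# semantics are an assumption borrowed from admission webhooks.
from typing import Callable, Iterable


def admit(validations: Iterable[Callable[[], bool]], failure_policy: str = "Fail") -> bool:
    for check in validations:
        try:
            passed = check()
        except Exception:
            if failure_policy == "Ignore":
                continue  # evaluation error is skipped; keep checking
            return False  # "Fail": an evaluation error rejects the request
        if not passed:
            return False  # a false result is a violation under either policy
    return True
```

Under this assumption, Ignore only changes what happens when an expression cannot be evaluated; a validation that evaluates cleanly to false still rejects the request.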

Safety measures

To prevent clusters from being put into an unusable state that cannot be recovered from via the API, admission webhooks are not allowed to match ValidatingWebhookConfiguration and MutatingWebhookConfiguration kinds.

We will extend this approach:

ValidatingAdmissionPolicy cannot match ValidatingAdmissionPolicy/PolicyBinding/param resources.
ValidatingWebhookConfiguration cannot match MutatingWebhookConfiguration or ValidatingAdmissionPolicy/PolicyBinding/param resources.

Note that this does allow ValidatingAdmissionPolicy to match ValidatingWebhookConfiguration.

Note: In the future we may further loosen this up and allow admission configuration to intercept/guard writes to admission configuration while preventing deadlock; see “Add feature to configure a set of webhooks to intercept other webhooks” (https://github.com/kubernetes/kubernetes/issues/101794).

Alternative considered: Each ValidatingAdmissionPolicy has a “level”, a ValidatingAdmissionPolicy can match another ValidatingAdmissionPolicy of a higher level. This could be added later.

Singleton Policies

For simple policies that do not refer to a param, a policy can be authored using a single ValidatingAdmissionPolicy resource without a paramKind field.

This is only available for cases where there is no need to have multiple bindings, and where all params can be inlined in CEL.

A “singleton” (aka standalone) policy can be defined as:

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
...
spec:
  # no paramKind
  matchConstraints: ...
  validations:
  - expression: "object.spec.replicas < 100"

Note that:

spec.paramKind must be absent
validation expressions may not refer to params

Safety features:

This field may only be set when the policy is created. It may not be set on existing policies.
Any bindings assigned to a singleton policy are considered “misconfigured” and apply the FailurePolicy.

Reporting/debugging/analysis implications:

Violations for non-singleton policies will always be reported using a {policy definition, policy binding} identifier pair. To be consistent with this, singleton policies can use a {policy definition, policy definition} identifier pair. This is a bit verbose but keeps reporting consistent and makes tracing a violation back to the resources that produced it straightforward.

Limits

We will put limits on:

Max policy bindings per policy definition
Max lengths for all lists in match criteria (resourceRules, namespaceSelectors, labelSelectors, …)
Per expression CEL cost limits
Per policy CEL evaluation cost limits

Phase 2

All these capabilities are required before Beta, but will not be implemented in the first alpha release of this enhancement due to the size and complexity of this enhancement.

Informational type checking

Some advantages of strongly typed objects and expressions over treating everything as unstructured are:

Checks against missing/misspelled fields. The user may write an expression that refers to a missing/misspelled field that does not exist in their test cases but appears later. A type check can detect this kind of error while an evaluation-time check may not.
Checks against type confusions. Similarly, the user may confuse the type of field but their test cases never touch wrongly typed fields.
Guard against short-circuit evaluation. The user may make one of the mistakes mentioned above, but the affected code path is never covered in their test cases.
Support Kubernetes extensions. For example, IntOrString and map lists.

However, enforcing types for every expression and object is not feasible because of:

Version skew
CRDs
Aggregated API servers

Problem examples:

Problem	Summary
version skew: ephemeralContainers case	New pod field; need to be able to validate it in the same way as containers and initContainers if the field exists and is populated
version skew: Migration from annotation to field	Need to be able to validate annotation (if present) or field (if it exists and is populated)
CRD is deleted	Nothing to type check against, but also means there are no corresponding custom resources
CRD is in multiple clusters, but schema differs	If policy author is aware of the schema variations, can they write policies that work for all the variations?
Validation of an aggregated API server type	Main API server does not have type definitions

Until the design is extended to handle these situations, the type checking will remain informational.

Informational type checking will be performed against all expressions where a GVK can be resolved to type check against. The result of type checking will be part of the status of the performed policy.

For example, accessing an unknown field will result in a warning like this:

...
status:
  expressionWarnings:
    - expression: "object.replicas > 1" # should be "object.spec.replicas > 1"
      warning: "no such field 'replicas'"
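The informational nature of the check can be illustrated with a minimal Python sketch. It assumes a heavily simplified schema made of nested dictionaries; `DEPLOYMENT_SCHEMA` and `check_field_path` are hypothetical names used only for illustration and are not part of this proposal. An unknown field produces a warning string rather than a rejection:

```python
# Hypothetical, simplified stand-in for a resolved type: nested dicts are
# objects, None marks a scalar leaf. Real type checking would use the
# OpenAPI schema resolved from the GVK.
DEPLOYMENT_SCHEMA = {
    "metadata": {"name": None, "namespace": None},
    "spec": {"replicas": None},
}


def check_field_path(schema, path):
    """Return a list of warnings for a dotted path such as 'object.replicas'."""
    parts = path.split(".")
    if parts[0] != "object":
        return []  # this sketch only checks paths rooted at `object`
    node = schema
    for field in parts[1:]:
        if not isinstance(node, dict) or field not in node:
            return ["no such field '%s'" % field]
        node = node[field]
    return []


# Warnings are collected for reporting; they never block admission here.
warnings = check_field_path(DEPLOYMENT_SCHEMA, "object.replicas")
```

In this sketch the misspelled path yields a single warning, while `object.spec.replicas` yields none.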

Enforcement Actions

ValidatingAdmissionPolicyBinding resources may control how admission is enforced. This is performed using a single field. E.g.:

apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicyBinding
...
spec:
  validationActions: [Warn, Audit] # required field

The enum options will be:

Deny: Validation failures result in a denied request.
Warn: Validation failures are reported as warnings to the client. (xref: Admission Webhook Warnings)
Audit: Validation failures are published as audit events (see the Audit Annotations section below for details).

If, in the future, ValidatingAdmissionPolicy also introduces enforcement action fields, the effective enforcement will be set to the intersection of the policy enforcement actions and the binding enforcement actions.
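A minimal Python sketch of how the actions could combine, under the assumptions stated here: `effective_actions` and `apply_actions` are hypothetical helper names, and the policy-level action list is the possible future field mentioned above, not something this proposal defines today.

```python
VALID_ACTIONS = {"Deny", "Warn", "Audit"}


def effective_actions(binding_actions, policy_actions=None):
    """Intersect binding actions with (hypothetical, future) policy actions."""
    binding = set(binding_actions)
    unknown = binding - VALID_ACTIONS
    if unknown:
        raise ValueError("unknown validation actions: %s" % sorted(unknown))
    if policy_actions is None:
        return binding  # no policy-level actions: the binding decides alone
    return binding & set(policy_actions)


def apply_actions(failure_messages, actions):
    """Return (allowed, warnings, audit_entries) for a list of failures."""
    if not failure_messages:
        return True, [], []
    allowed = "Deny" not in actions
    warnings = list(failure_messages) if "Warn" in actions else []
    audit_entries = list(failure_messages) if "Audit" in actions else []
    return allowed, warnings, audit_entries
```

With `validationActions: [Warn, Audit]`, a failing validation is surfaced to the client and to the audit log, but the request is still admitted.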

Systems that need to aggregate validation failures may implement an audit webhook backend. See “Audit Events” below for details.

For singleton policies, since there is no separate binding resource, the validationActions field will be set on the policy definition in the same way that other binding fields are.

Metrics will include validation action so that cluster administrators can monitor the validation failures of a binding before setting validationActions to Deny.

This enables the following use cases:

A policy framework captures enforcement violations during dry run and aggregates them. (E.g. When in DryRun mode, OPA Gatekeeper aggregates violations and records them to the status of the constraint resource). Including validation failures in audit events makes this possible to do using an audit webhook backend.
Cluster admin would like to roll out policies, sometimes in bulk, without knowing all the details of the policies. During rollout the cluster admin needs a state where the policies being rolled out cannot result in admission rejection. With the enforcement field on bindings, cluster admins can decide which initial actions to enable and then add actions until Deny is enabled. The cluster admin may monitor metrics, warnings and audit events along the way.
A policy framework needs different enforcement actions at different enforcement points. Since this API defines the behavior of only the admission enforcement point, higher level constructs can map to the actions of this enforcement point as needed.

Future work:

ValidatingAdmissionPolicy resources might, in the future, add a warnings field adjacent to the validations and auditAnnotations fields to declare expressions that only ever result in warnings. This would allow ValidatingAdmissionPolicy authors to declare an expression as non-enforcing regardless of validationActions.
ValidatingAdmissionPolicy resources might, in the future, offer per-expression enforcement actions (instead of a separate warnings field) and combine these enforcement actions with the ValidatingAdmissionPolicyBinding enforcement action to determine the effective enforcement. This would be designed to simplify the workflow required to add or update expressions on an existing ValidatingAdmissionPolicy.

Audit Annotations

ValidatingAdmissionPolicy may declare Audit annotations in the policy definition. E.g.:

apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicy
...
spec:
  ...
  validations:
    - expression: <expression>
  auditAnnotations:
    - key: "my-audit-key"
      valueExpression: <expression that evaluates to a string (and is recorded) or null (and is not recorded)>

auditAnnotations are independent of validations. A ValidatingAdmissionPolicy may contain only validations, only auditAnnotations or both.

Audit annotations are recorded regardless of whether a ValidatingAdmissionPolicyBinding’s validationActions include Audit.

The published annotation key will be of the form <ValidatingPolicyDefinition name>/<auditAnnotation key> and will be validated as a QualifiedName.

The validation rule will be: len(key) < QualifiedName.maxLength - len(policy name) - 1 to accommodate the <ValidatingPolicyDefinition name>/<auditAnnotation key> audit annotation key format.

If valueExpression returns a string, the audit annotation is published. If valueExpression returns null, the audit annotation is omitted. No other return types will be supported.
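The key-length rule and the handling of valueExpression results can be sketched in Python as follows. This is an illustration only: `key_fits` and `published_annotation` are hypothetical helpers, and the maximum length is passed in as a parameter because this sketch does not assume a particular value for it.

```python
def key_fits(policy_name, key, max_length):
    """Apply the rule len(key) < max_length - len(policy name) - 1."""
    return len(key) < max_length - len(policy_name) - 1


def published_annotation(policy_name, key, value):
    """Map a valueExpression result to the published (key, value) pair.

    A string is published, None (CEL null) is omitted, and any other
    type is treated as unsupported.
    """
    if value is None:
        return None
    if not isinstance(value, str):
        raise TypeError("valueExpression must return a string or null")
    return ("%s/%s" % (policy_name, key), value)
```

For example, `published_annotation("mypolicy", "my-audit-key", "v")` yields the pair `("mypolicy/my-audit-key", "v")`, while a null result yields nothing to publish.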

Audit Events

All audit event keys are prefixed by <ValidatingPolicyDefinition name>/.

At Metadata audit level or higher, when a validation fails for a binding whose validationActions includes Audit, the details of each failing validation expression are included in the audit annotations of the audit event under the key validation_failures. E.g.:

# the audit event recorded
{
    "kind": "Event",
    "apiVersion": "audit.k8s.io/v1",
    "annotations": {
        "ValidatingAdmissionPolicy/mypolicy.mygroup.example.com/validation_failure": "{\"expression\": 1, \"message\": \"x must be greater than y\", \"enforcement\": \"Deny\", \"binding\": \"mybinding.mygroup.example.com\"}"
        # other annotations
        ...
    }
    # other fields
    ...
}

Also, at Metadata audit level or higher, any audit annotations declared by the policy definition are included with the key provided. E.g.:

# the audit event recorded
{
    "kind": "Event",
    "apiVersion": "audit.k8s.io/v1",
    "annotations": {
        "ValidatingAdmissionPolicy/mypolicy.mygroup.example.com/myauditkey": "my audit value"
        # other annotations
        ...
    }
    # other fields
    ...
}
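As a rough illustration of how an audit webhook backend might aggregate these annotations, the Python sketch below groups validation failures by annotation key prefix. It assumes the key layout and JSON-encoded value shown in the example events above; `collect_failures`, `FAILURE_SUFFIX` and `SAMPLE_EVENT` are hypothetical names.

```python
import json
from collections import defaultdict

# Key suffix as it appears in the example audit event above (an assumption
# of this sketch, not a statement about the final key format).
FAILURE_SUFFIX = "/validation_failure"

SAMPLE_EVENT = {
    "kind": "Event",
    "apiVersion": "audit.k8s.io/v1",
    "annotations": {
        "ValidatingAdmissionPolicy/mypolicy.mygroup.example.com/validation_failure":
            '{"expression": 1, "message": "x must be greater than y", '
            '"enforcement": "Deny", "binding": "mybinding.mygroup.example.com"}',
        "ValidatingAdmissionPolicy/mypolicy.mygroup.example.com/myauditkey":
            "my audit value",
    },
}


def collect_failures(audit_events):
    """Group decoded validation failures by the policy prefix of their key."""
    failures = defaultdict(list)
    for event in audit_events:
        for key, value in event.get("annotations", {}).items():
            if not key.endswith(FAILURE_SUFFIX):
                continue  # policy-declared audit annotations are skipped
            policy_key = key[: -len(FAILURE_SUFFIX)]
            failures[policy_key].append(json.loads(value))
    return dict(failures)


by_policy = collect_failures([SAMPLE_EVENT])
```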

Per namespace policy params

Validating admission policies and bindings are cluster scoped.

We want to enable clusters to parameterize a policy on a per-namespace basis using a resource contained in the namespace.

(Thanks for the input from @deads2k)

The goal is to enable:

A cluster-admin to write a single policy to say, “this is the policy I want in all my namespaces”.
Namespace admins who can read the param resources, but not write them, to understand the limitations they currently have.
A single lenient cluster policy and binding to enforce a minimum constraint, and a single cluster policy and binding pointing to a namespace level params to further restrict the policy for a particular namespace.

To implement this, a new optional field namespaceParamRef will be added to ValidatingAdmissionPolicyBinding:

apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: "demo-binding-test.example.com"
spec:
  policyName: "demo-policy.example.com"
  namespaceParamRef:
    name: "param-resource.example.com"
    failAction: "Allow"
  validationActions: [Deny]

The namespaceParamRef may either specify an exact name, or may specify a label selector to locate the param resource in a namespace. For example:

apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: "demo-binding-test.example.com"
spec:
  policyName: "demo-policy.example.com"
  namespaceParamRef:
    selector:
      matchLabels:
        policy: demo-policy 
    failAction: "Allow"
  validationActions: [Deny]

Note that with a label selector, multiple param resources may match, in which case the policy is evaluated for each param resource; the admission request must be allowed by the policy for all the param resources to be admitted.

Implementation details:

namespaceParamRef and paramRef are members of a union; if one of the fields is set, the other must be unset.
failAction defines the behavior when the param resource cannot be found in a namespace. Set to Allow to admit all requests when no param resources are found; in that case the policy is not evaluated. Set to Deny to fail admission if no param resources are found. (Note from jpbetz: this could be implemented without introducing a new field. For “allow”, add a params != null matchCondition; for “deny”, add params != null as the first expression.)
If the paramKind of the policy referred to by policyName is cluster scoped and namespaceParamRef is set, the binding is considered misconfigured, and the failure policy applies.
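The per-namespace evaluation described above can be sketched in Python. The helper name `evaluate_with_namespace_params` and the `validate` callback are hypothetical; the sketch only models the “every matched param must allow the request” rule and the failAction behavior when no param resource is found.

```python
def evaluate_with_namespace_params(request, params, fail_action, validate):
    """Decide admission for a request given the params found in its namespace.

    params:      list of param resources matched by name or label selector
    fail_action: "Allow" or "Deny", used only when no params were found
    validate:    callable (request, param) -> bool, standing in for the
                 policy's validation expressions
    """
    if not params:
        if fail_action == "Allow":
            return True  # policy is not evaluated at all
        if fail_action == "Deny":
            return False
        raise ValueError("failAction must be 'Allow' or 'Deny'")
    # The request must be allowed for every matched param resource.
    return all(validate(request, param) for param in params)


def replica_limit(request, param):
    """Example validation: object.spec.replicas <= params.maxReplicas."""
    return request["replicas"] <= param["maxReplicas"]
```

With two matching param resources that set maxReplicas to 10 and 3, a request for 5 replicas is rejected because the stricter of the two does not allow it.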

Match Conditions

Note that the syntax of the matchConditions resource is intended to align with the Admission Webhook Match Conditions KEP #3716, so that KEP should be controlling with regard to deviations in the schema. This section is focused specifically on how the matchConditions concept can be applied to in-process admission.

The match criteria in bindings are not expected to be able to cover all possible ways users may want to scope their policies. For example, there is no way to match off of kind, only resource. To provide extensibility for the match criteria without requiring modifying every validation rule individually, a global predicate system is needed. These predicates contain CEL statements that must be satisfied, otherwise the policy will be ignored. In order to keep bindings language-agnostic and to support singleton policies, the logic should live in the policy definition resource. To enable customization per-binding, the CEL statements should have access to the parameter resource.

Here is an example of a policy definition using match conditions (under the matchConditions field):

apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicy
metadata:
  name: "replicalimit-policy.example.com"
spec:
  failurePolicy: Fail
  paramKind:
    apiVersion: rules.example.com/v1
    kind: ReplicaLimit
  matchConstraints:
    resourceRules:
    - apiGroups:   ["apps"]
      apiVersions: ["v1"]
      operations:  ["CREATE", "UPDATE"]
      resources:   ["deployments"]
  matchConditions:
    - name: 'is-deployment'
      expression: 'metadata.kind == "Deployment"'
    - name: 'not-in-excluded-namespaces'
      expression: '!(metadata.namespace in params.excludedNamespaces)'
  validations:
    - expression: "object.spec.replicas <= params.maxReplicas"
      reason: Invalid

For demonstration purposes, we assume match has no support for excludedNamespaces.

Note that matchConditions and validations look similar, but matchConditions entries have only the name and expression fields: their only function is to gate whether the expressions in validations are evaluated.

matchConditions has the following behaviors:

Only the request object and parameters are accessible (no referential lookup)
All match conditions must be satisfied (evaluate to true) before validations are tested
If there is an error executing a match condition, the failure policy for the (definition, binding) tuple is invoked
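The gating behavior listed above can be sketched in Python as follows. This is a simplified model, not the implementation: CEL expressions are represented by plain callables, `evaluate_policy` is a hypothetical name, and the choice to route validation errors through the failure policy as well is an assumption of the sketch.

```python
def evaluate_policy(match_conditions, validations, failure_policy, env):
    """Return "allow" or "deny" for one (definition, binding) tuple.

    match_conditions, validations: lists of callables env -> bool that
    stand in for compiled CEL expressions.
    """
    on_error = "deny" if failure_policy == "Fail" else "allow"
    try:
        matched = all(condition(env) for condition in match_conditions)
    except Exception:
        return on_error  # error in a match condition: failure policy applies
    if not matched:
        return "allow"  # policy is ignored for this request
    try:
        passed = all(validation(env) for validation in validations)
    except Exception:
        return on_error
    return "allow" if passed else "deny"


# Callables mirroring the example policy above.
CONDITIONS = [
    lambda env: env["kind"] == "Deployment",
    lambda env: env["namespace"] not in env["params"]["excludedNamespaces"],
]
VALIDATIONS = [lambda env: env["replicas"] <= env["params"]["maxReplicas"]]
```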

CEL Expression Composition

Expression Composition is a technique to define a set of variables, each from a given expression, and allow validation expressions to refer to the variables.

Use Cases

Reusing/memoizing an expensive computation and lazy evaluation

For a CEL expression that takes significant time to evaluate, especially those that cost O(N^2) time or worse, it would be nice to run it only once and only when necessary. If multiple validation expressions use the same expression, that expression can be refactored out into a variable. Because the evaluation of a composited variable is lazy, a variable whose value does not affect the result of an expression due to boolean short-circuit evaluation is never evaluated and incurs no runtime cost.

Code re-use for complicated expressions

A CEL expression may not be computationally expensive, but could still be intricate enough that copy-pasting could prove to be a bad decision later on in time. For example, a sub-expression is likely used both in the validation and to format messageExpression. If a sufficiently complex expression ended up copy-pasted everywhere, and then needs to be updated somehow, it will need that update in every place it was copy-pasted. A variable, on the other hand, will only need to be updated in one place.

Variables

Each CEL “program” is a single expression. There is no support for variable assignment. This can result in redundant code to traverse maps/arrays or dereference particular fields.

We can support this in much the same way as cel-policy-template terms. These can be lazily evaluated when they are resolved during the evaluation of the main expression (cel-policy-template does this).

The policy spec now has an additional variables section. This is an array containing one or more name and expression pairs, which can be used/re-used by the policy’s validation expressions. These results are memoized on a per-validation basis, so if multiple expressions use the same spec variables, the expression that calculates the variable’s value will only run once.

The variables can be accessed as members of variables, which is an object that is exposed to CEL expressions (both validation expressions and other variables).

For example:

  variables:
    - name: metadataList
      expression: "spec.list.map(x, x.metadata)"
    - name: itemMetadataNames
      expression: "variables.metadataList.map(m, m.name)"
  validations:
    - expression: "variables.itemMetadataNames.all(name, name.startsWith('xyz-'))"
    - expression: "variables.itemMetadataNames.exists(name, name == 'required')"

Variable names must be valid CEL names and must be unique among all variables. What constitutes a valid CEL name can be found in the CEL language definition under IDENT. This validity is checked when the policy is being created or updated.

For per-policy runtime cost limit purposes, variables count towards the runtime cost limit once per policy. The cost of each variable is computed when it is first evaluated in an expression, mirroring how the cost limit would be calculated if the variable’s expression was embedded verbatim. If the runtime cost limit is exceeded in the process, then evaluation halts. No individual variable or expression will be listed as the cause in the resulting message. Whether the request actually fails depends on the failure policy, however. For subsequent uses, inclusion of the variable has zero effect on the runtime cost limit. If the variable evaluates to an array or some other iterable, and some expression iterates on it, that of course contributes to the cost limit, but simply including the variable does not add the underlying expression’s cost again.

Variables are also subject to the per-expression runtime cost limit. Exceeding the per-expression runtime cost limit is always attributed to the variable, unlike the per-policy limit.

Variables can only reference other variables that have been previously defined in the variables section. This will have a side effect of making the order of the variable definitions matter but prevents circular reference.

Both the result and potential error of a variable evaluation are memoized. If an error occurs during variable evaluation, then every validation expression that caused it to be evaluated (since variables are always lazily evaluated) also finishes with an error, and the variable evaluation will not be retried.
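The lazy, memoized evaluation described in this section can be sketched in Python. `LazyVariables` is a hypothetical class, variable expressions are modeled as callables, and the data in `obj` is invented for illustration. The sketch does not model cost accounting or the compile-time rule that a variable may only reference previously defined variables.

```python
class LazyVariables:
    """Evaluate each variable at most once, on first use, caching errors too."""

    def __init__(self, definitions):
        # definitions: ordered list of (name, fn), where fn(variables) -> value
        self._fns = {}
        for name, fn in definitions:
            if name in self._fns:
                raise ValueError("duplicate variable name: %s" % name)
            self._fns[name] = fn
        self._cache = {}  # name -> ("ok", value) or ("err", exception)
        self.evaluations = {name: 0 for name in self._fns}

    def get(self, name):
        if name not in self._cache:
            self.evaluations[name] += 1
            try:
                self._cache[name] = ("ok", self._fns[name](self))
            except Exception as exc:
                self._cache[name] = ("err", exc)  # never retried
        kind, payload = self._cache[name]
        if kind == "err":
            raise payload
        return payload


# Mirrors the metadataList / itemMetadataNames example above.
obj = {"spec": {"list": [{"metadata": {"name": "xyz-a"}},
                         {"metadata": {"name": "required"}}]}}
variables = LazyVariables([
    ("metadataList", lambda v: [x["metadata"] for x in obj["spec"]["list"]]),
    ("itemMetadataNames", lambda v: [m["name"] for m in v.get("metadataList")]),
])
all_prefixed = all(n.startswith("xyz-") for n in variables.get("itemMetadataNames"))
has_required = any(n == "required" for n in variables.get("itemMetadataNames"))
```

Both validations read `itemMetadataNames`, yet each underlying variable expression runs only once; `variables.evaluations` records the count.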

Secondary Authz

We will support admission control use cases requiring permission checks:

Validate that only a user with a specific permission can set a particular field.
Validate that only a controller responsible for a finalizer can remove it from the finalizers field.

To depend on an authz decision, validation expressions can use the authorizer variable, which performs authz checks for the admission request user (the same user as identified by request.userInfo) by default, and which will be bound at evaluation time to an Authorizer object supporting receiver-style function overloads:

Symbol	Type	Description
serviceAccount	Authorizer.(namespace string, name string) -> Authorizer	Returns an authorizer whose subject is the named serviceaccount (instead of admission request user)
path	Authorizer.(path string) -> PathCheck	Defines a check for a non-resource request path (e.g. /healthz)
check	PathCheck.(httpRequestVerb string) -> Decision	Checks if the user is authorized for the HTTP request verb on the path
group	Authorizer.(group string) -> GroupCheck	Defines a check for API resources within a group
resource	GroupCheck.(resource string) -> ResourceCheck	Specifies the resource to be checked within the group
subresource	ResourceCheck.(subresource string) -> ResourceCheck	Specifies that the check is for a subresource
namespace	ResourceCheck.(namespace string) -> ResourceCheck	Specifies that the check is for a namespace (if not called, the check is for the cluster scope)
name	ResourceCheck.(name string) -> ResourceCheck	Specifies that the check is for a specific resource name
check	ResourceCheck.(apiVerb string) -> Decision	Checks if the subject is authorized for the API verb on the resource
allowed	Decision.() -> bool	Is the subject authorized?
reason	Decision.() -> string	Returns a human-readable explanation of why this decision was made
errored	Decision.() -> bool	Returns true if and only if an error occurred while making this decision
error	Decision.() -> string	Returns the text of the error that occurred. If no error occurred, returns the empty string

xref: https://kubernetes.io/docs/reference/access-authn-authz/authorization/#review-your-request-attributes for details on authorization attributes.

Example expressions using authorizer:

authorizer.group('certificates.k8s.io').resource('signers').name(oldObject.spec.signerName).check('approve').allowed()
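To make the receiver-style chaining in the table above concrete, here is a small Python mock. It is an illustration under stated assumptions: the rule format (dictionaries with "*" wildcards) is invented for the sketch, only the group, resource, subresource, namespace, name, check, allowed and reason functions from the table are modeled, and serviceAccount and path checks are omitted.

```python
class Decision:
    def __init__(self, allowed, reason):
        self._allowed = allowed
        self._reason = reason

    def allowed(self):
        return self._allowed

    def reason(self):
        return self._reason


class ResourceCheck:
    def __init__(self, rules, attrs):
        self._rules = rules
        self._attrs = attrs

    def _with(self, **changes):
        # Each builder step returns a new check; earlier steps stay unchanged.
        return ResourceCheck(self._rules, {**self._attrs, **changes})

    def subresource(self, value):
        return self._with(subresource=value)

    def namespace(self, value):
        return self._with(namespace=value)

    def name(self, value):
        return self._with(name=value)

    def check(self, verb):
        request = {**self._attrs, "verb": verb}
        for rule in self._rules:
            # A rule attribute matches if it is absent, "*", or equal.
            if all(rule.get(k, "*") in ("*", v) for k, v in request.items()):
                return Decision(True, "matched rule")
        return Decision(False, "no rule matched")


class GroupCheck:
    def __init__(self, rules, group):
        self._rules = rules
        self._group = group

    def resource(self, resource):
        attrs = {"group": self._group, "resource": resource,
                 "subresource": "", "namespace": "", "name": ""}
        return ResourceCheck(self._rules, attrs)


class Authorizer:
    def __init__(self, rules):
        self._rules = rules  # hypothetical allow-list standing in for RBAC

    def group(self, group):
        return GroupCheck(self._rules, group)


authorizer = Authorizer([{"group": "certificates.k8s.io", "resource": "signers",
                          "name": "example.com/my-signer", "verb": "approve"}])
decision = (authorizer.group("certificates.k8s.io").resource("signers")
            .name("example.com/my-signer").check("approve"))
```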

Note that this API:

Produces errors at compilation time when parameters are missing or improperly combined (e.g. subresource with non-resource path, missing path or resource).
Is open to the addition of future knobs (e.g. impersonation).
Will have a limit on the number of authz checks generated during expression evaluation. This will be enforced by assigning each authz check a CEL evaluation cost that is high enough to serve as an effective limit on the total number of authz checks that can be performed. The limit will be picked empirically by evaluating the resource costs (primarily CPU cost) of authz checks. It will need to be set high enough to handle policies like “You must be authorized to read every secret your pod mounts”.

Other considerations:

Since authorization decisions depend on an apiserver’s authorizer, and the ValidatingAdmissionPolicy plugin is supported on aggregated API servers, the evaluation result of any validation expression involving secondary authz inherently depends on which apiserver evaluates it.
A validation expression could be written such that it allows malicious clients to probe permissions. Policy authors are responsible for careful use of client-provided inputs when constructing authz checks.
Decisions can be stored either as a variable or as part of the implementation of the object bound to the exposed authorizer object (i.e. caching decisions for identical checks within a single policy evaluation) in order to prevent duplicate checks across multiple validation expressions.
Authorization checks that require information from resources other than the resource being admitted are possible but will be limited by eventual consistency. Information from other resources can be accumulated by a controller and written to a custom resource, which can then be referenced by the paramSource of a policy binding and accessed in CEL expressions via the params variable.

CertificateApproval use case:

apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicy
metadata:
  name: "certificate-approval-policy.example.com"
spec:
  matchConstraints:
    resourceRules:
    - apiGroups:   ["certificates.k8s.io"]
      apiVersions: ["*"]
      operations:  ["UPDATE"]
      resources:   ["certificatesigningrequests/approval"]
  validations:
  - expression: "authorizer.group('certificates.k8s.io').resource('signers').name(oldObject.spec.signerName).check('approve').allowed() || authorizer.group('certificates.k8s.io').resource('signers').name([oldObject.spec.signerName.split('/')[0], '*'].join('/')).check('approve').allowed()"
    reason: Forbidden
    messageExpression: "'user not permitted to approve requests with signerName %s'.format([oldObject.spec.signerName])"

Other use cases in existing admission plugins:

PodSecurityPolicy (kube)
CertificateSigning (kube)
OwnerReferencesPermissionEnforcement (kube)
network.openshift.io/ExternalIPRanger
route.openshift.io/IngressAdmission
scheduling.openshift.io/PodNodeConstraints
network.openshift.io/RestrictedEndpointsAdmission
security.openshift.io/SecurityContextConstraint
security.openshift.io/SCCExecRestrictions

Restricted Service Account use case (from deads2k):

Note that user.Extra in AdmissionReview has pod claims, which are valuable.

sig-auth has previously talked about trying to find a way to restrict access from a daemonset pod to a customresource/foo that has Foo.spec.NodeName set to the Node.metadata.name of the pod bound to the particular SA token. This is tantalizingly close because user.Extra contains authentication.kubernetes.io/pod-uid, which can be used to locate a pod and determine its Pod.spec.NodeName.

A built-in that does that may be well received and unlock many use cases. Exploring the idea may be useful. If most also require controlled read permission, then it’s probably better to create something specifically for the purpose.

This enhancement makes request.userInfo.extra['authentication.kubernetes.io/pod-uid'] available to admission policies. Looking up the pod (or any other additional resources) is what makes this use case challenging. A controller that accumulates a pod-uid -> node-name map by watching all pods could make the mapping available in a custom resource for the admission policy to consume using paramSource.

This would result in a CEL expression like:

object.spec.NodeName == params.nodeNamebyPodUID[request.userInfo.extra['authentication.kubernetes.io/pod-uid']]

But this would not scale well to clusters with large pod counts.

If we were to offer a way to look up arbitrary other resources, or even if we provided selective access to just some resources, this might become easier. This can be explored as future work.

Access to namespace

We have general agreement to grant CEL expressions access to the admission object’s namespace through a newly added CEL variable namespaceObject. If the resource is cluster scoped, namespaceObject will be null.

namespaceObject will provide access to all existing fields under namespace metadata, namespace spec and namespace status except for metadata.managedFields and metadata.ownerReferences. The fields can be accessed directly through the namespaceObject variable, e.g. namespaceObject.metadata.name or namespaceObject.status.phase.

Namespace labels and annotations are the most commonly needed fields not already available in the resource being validated. Labels and annotations can be accessed through namespaceObject.metadata.labels and namespaceObject.metadata.annotations, for example namespaceObject.metadata.labels.env. We recommend checking that the specific label or annotation exists before validating it: 'env' in namespaceObject.metadata.labels.

Transition rules

Will provide access to “object” and “oldObject” in CEL expressions. These will be the same as in AdmissionReview.
On CREATE, “oldObject” will be null.
On DELETE, “object” will be null.

If we add “CEL expression scoping” (see above section), we will also need to consider how scoped fields are handled for create/update/delete. Note that CRD validation rules have transition rules which are only evaluated when both “self” and “oldSelf” are present.

Resource constraints

We will leverage the design and implementation of CRD Validation Rules Resource Constraints, which provides:

CEL estimated cost limits
CEL runtime cost limits
Go context cancellation as a way of halting CEL execution if the request context is canceled for any reason.

Estimated cost is, unfortunately, not something we can offer for admission with any kind of guarantee, due to the already listed issues that have prevented us from enforcing type safety. We could instead compute estimated cost for the same cases where we provide informational type checking, in which case we can report any cost limit violations in the same way we report type checking violations. Note that for built-in types, where max{Length,Items,Properties} value validations are not available, estimated cost calculations will not be nearly as helpful or actionable. We do not plan to enforce any estimated cost calculations on ValidatingAdmissionPolicy.

Runtime cost limits should be established and enforced. Exceeding the cost limit will trigger the FailurePolicy, so this will need to be documented, but unlike webhooks, runtime cost is deterministic (it is purely a function of the input data and the CEL expression and is independent of underlying hardware or system load), making it less of a concern for control plane availability than webhook timeouts.
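A minimal Python sketch of how runtime cost limits could interact with the failure policy. The numeric limits, the `CostBudget` class and the `admit_under_budget` helper are hypothetical; the sketch takes per-expression costs as given numbers and does not model how CEL computes them, nor context cancellation.

```python
class CostLimitExceeded(Exception):
    """Raised when an evaluation exceeds a runtime cost limit."""


class CostBudget:
    def __init__(self, per_expression_limit, per_policy_limit):
        self.per_expression_limit = per_expression_limit
        self.per_policy_limit = per_policy_limit
        self.policy_spent = 0

    def charge(self, expression_cost):
        """Account for one expression's runtime cost against both limits."""
        if expression_cost > self.per_expression_limit:
            raise CostLimitExceeded("per-expression limit exceeded")
        self.policy_spent += expression_cost
        if self.policy_spent > self.per_policy_limit:
            raise CostLimitExceeded("per-policy limit exceeded")


def admit_under_budget(expression_costs, budget, failure_policy):
    """Return True if the request is admitted as far as cost is concerned."""
    try:
        for cost in expression_costs:
            budget.charge(cost)
    except CostLimitExceeded:
        # Exceeding a limit triggers the failure policy.
        return failure_policy != "Fail"
    return True
```

Because the cost depends only on the input data and the expression, repeating the same evaluation yields the same outcome regardless of system load.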

The request’s Go context will be passed in to all CEL evaluations such that cancellation halts CEL evaluation, if, for any reason, the context is canceled.

Safety Features

Additional safety features we should consider:

Configurable admission blocking write requests made internally in kube-apiserver during server startup (like RBAC default policy reconciliation) making it impossible for a server to start up healthy. (This is not specific to CEL?)
Ability to skip specific resource types - Admission Controller Webhook configuration rule cannot exclude specific resources: https://github.com/kubernetes/kubernetes/issues/92157

Aggregated API servers

Main complications (provided by @liggitt):

The API server validating/persisting the ValidatingAdmissionPolicy instances isn’t the same one serving aggregated types, so wouldn’t necessarily have schema info to check type safety.
The aggregated API server is responsible for enforcing admission on its custom types, so the implementation that reads ValidatingAdmissionPolicy instances and enforces them would have to live in k8s.io/apiserver and be active in aggregated API servers to enforce admission on aggregated types effectively (same as admission webhooks today).

Plan:

Do not offer type checking for aggregated types.
Support ValidatingAdmissionPolicy in aggregated API servers.

CEL function library

To consider:

labelSelector evaluation functions or other match evaluator functions (original comment thread)

To implement:

string.format into CEL upstream (tracking PR) (TODO @DangerOnTheRanger: add tracking cel-go issue once available)

Audit Annotations

To consider: Would audit support in this enhancement become redundant if Audit were also extended to support CEL? If so, which should we invest in?

Admission webhooks are able to include an associative array of audit annotations in a review response. If we intend to provide parity with webhooks we would also want to support audit.

Rough plan:

Each validation has a name. If the enforcement is Audit the name can be used as the audit annotation key.
Can add an audit option next to the deny and warn enforcement options.

Client visibility

In order to make DryRun more visible to clients we will add a client visibility option to policy bindings.

This is largely focused on making deployment/rollout more manageable.

It might be generalized to control visibility of enforced violations.

Metrics

Goals:

Parity with admission webhook metrics
- Should include counter of deny, warn and audit violations
- Label by {policy, policy binding, validation expression} identifiers
Counters for number of policy definitions and policy bindings in cluster
- Label by state (active vs. error), enforcement action (deny, warn)
Counters for Variable Composition
- Should include counter of variable resolutions to measure time saved.
- Label by policy identifier

Granularity:

Latency metrics should be per {policy, validation expression}. The next level of granularity would be {policy, binding, validation expression}, but the number of bindings can become quite large, so let’s limit it to {policy, validation expression} for now.

xref: Metrics Provided by OPA Gatekeeper
xref: Admission Webhook Metrics

Future Plan

Namespace scoped policy binding

Note: The namespace scoped policy binding will require a new API. It will be planned separately and will not affect the existing ValidatingAdmissionPolicy behavior.

For phase 1, policy bindings were only allowed to be cluster scoped. We can support namespace scoped policy bindings as follows:

Add a NamespacePolicyBinding resource.
If the parameter resource is namespace scoped, it implicitly matches resources only in the namespace it is in, but may further constrain what resources it matches with additional match criteria.

Benefits: Allows policy of a namespace to be controlled from within the namespace. As an example, ResourceQuota works this way.

Details to consider:

Should a policy support both cluster scoped and namespace scoped binding? If so how? It would need two different parameter CRDs (since a CRD must either be cluster scoped or namespace scoped, not both).

User Stories

In addition to “User Stories”, see “Potential Applications” below for a list of known applications and their use case requirements.

Use Case: Singleton Policy

User wishes to define a simple policy that requires no parameters. They don’t want to create a parameter CRD since what they’re doing can be expressed quite simply in a single CEL expression.

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: "validate-xyz.example.com"
spec:
  singletonPolicy: true
  match:
    resourceRules:
    - apiGroups:   ["apps"]
      apiVersions: ["v1"]
      operations:  ["CREATE", "UPDATE"]
      resources:   ["deployments"]
  defaultValidations:
  - expression: "object.spec.replicas < 100"

Use Case: Shared Parameter Resource

User wishes to define a CRD for a list of banned words that may not be used in any of a wide range of identifiers in the cluster (resource names, container names, …).

Parameter CRD is defined to hold the list of banned words.
Multiple policies are defined for different resources. The policies all reference the same parameter CRD.
A single custom resource is defined with the list of banned words (but has no matching rules of its own).

apiVersion: rules.example.com/v1
kind: BannedWords
metadata:
  name: "banned-words.example.com"
spec:
  bannedWords:
  - glitter
  - rainbow

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: "policy1.example.com"
spec:
  match:
    resourceRules:
    - apiGroups:   ["*"]
      apiVersions: ["*"]
      operations:  ["CREATE", "UPDATE"]
      resources:   ["pods"]
  validations:
  - expression: "!(object.name in params.bannedWords)"

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: "policy2.example.com"
spec:
  match:
    resourceRules:
    - apiGroups:   [""]
      apiVersions: ["v1"]
      operations:  ["CREATE", "UPDATE"]
      resources:   ["pods"]
  validations:
  - expression: "!object.spec.containers.exists(c, c.name in params.bannedWords)"
  - expression: "!object.spec.initContainers.exists(c, c.name in params.bannedWords)"

Both policies can use a trivial PolicyBinding to enable the same parameter resource for both policies.
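As a rough illustration of what the two kinds of banned-word checks compute, here is a plain-Python sketch. It does not use CEL or any Kubernetes API; the function names and the dictionary shapes are assumptions made only for this example.

```python
def name_allowed(name, banned_words):
    """Pass when the resource name is not one of the banned words."""
    return name not in banned_words


def containers_allowed(containers, banned_words):
    """Pass when no container name is one of the banned words."""
    return not any(c["name"] in banned_words for c in containers)


# Mirrors the bannedWords list of the example custom resource.
banned = ["glitter", "rainbow"]

pod_containers = [{"name": "web"}, {"name": "rainbow"}]
ok_name = name_allowed("frontend", banned)
bad_name = name_allowed("glitter", banned)
ok_containers = containers_allowed([{"name": "web"}], banned)
bad_containers = containers_allowed(pod_containers, banned)
```

Both helpers read the same list, which is the point of the use case: one parameter resource, several policies.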

Similar Use Case: A cluster administrator wishes to use a single policy configuration to manage a network policy that must be enforced across multiple Kubernetes kinds that contain relevant networking fields. This can be implemented with multiple ValidatingAdmissionPolicy resources that all reference the same spec.params CRD but that each enforce the policy for a different Kubernetes network kind.

Use Case: Principle of least privilege policy

A cluster administrator would like to disallow the use of a list of reserved labels by default, but allow use of the labels in specific namespaces so long as the label values are valid.

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
...
spec:
  paramSource:
    group: rules.example.com
    kind: ReservedLabels
    version: v1
  match:
    ...
  validations:
  - expression: "['reserved1', ...].exists(r, object.metadata.labels.contains(r) && !params.allowedLabels.contains(r))"
  defaultValidations:
  - expression: "['reserved1', ...].exists(r, object.metadata.labels.contains(r))"

Use Case: Validating native type with new field (version skew case)

Policy author wants to write a policy that validates a property of all containers in a pod, including ephemeralContainers for versions of Kubernetes where ephemeralContainers are available.

  validations:
    - expression: "object.spec.containers.all(c, c.name.startsWith('xyz-'))"
    - expression: "!has(object.spec.initContainers) || object.spec.initContainers.all(c, c.name.startsWith('xyz-'))"
    - expression: "!has(object.spec.ephemeralContainers) || object.spec.ephemeralContainers.all(c, c.name.startsWith('xyz-'))"

This does not work if type checking is strict. The ephemeralContainers field will be reported as unrecognized.

Note: This is the sort of policy where cluster managers would ideally be able to register the policy to validate ephemeralContainers before upgrading to the version of Kubernetes where ephemeralContainers are available for use.
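The guard pattern used in the expressions above (skip the check when the field is absent, otherwise require every element to pass) can be sketched in plain Python. This only models dynamic evaluation against the object data; it says nothing about how strict type checking would treat the unknown field, and the helper name is hypothetical.

```python
def all_names_prefixed(pod_spec, field, prefix="xyz-"):
    """Model of: !has(object.spec.<field>) || object.spec.<field>.all(c, c.name.startsWith(prefix))."""
    if field not in pod_spec:
        # Field is absent (for example on an older Kubernetes version): nothing to validate.
        return True
    return all(c["name"].startswith(prefix) for c in pod_spec[field])


old_cluster_spec = {"containers": [{"name": "xyz-app"}]}
new_cluster_spec = {
    "containers": [{"name": "xyz-app"}],
    "ephemeralContainers": [{"name": "debugger"}],
}

old_result = all_names_prefixed(old_cluster_spec, "ephemeralContainers")
new_result = all_names_prefixed(new_cluster_spec, "ephemeralContainers")
```

The same policy passes trivially where the field does not exist and starts enforcing once it does.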

Annotation to field migration example: https://github.com/open-policy-agent/gatekeeper-library/blob/master/library/pod-security-policy/seccomp/template.yaml

Use Case: Multiple policy definitions for different versions of CRD

While version conversion allows for a single policy definition, there are cases that call for multiple policy definitions:

A policy author wishes to write a policy for both the v1 and v2 of a CRD because they wish to avoid incurring a CRD conversion webhook request, which would happen if they only offered a single policy (at either version).

A policy author wishes to write a policy such that it can be evaluated “shift-left” in a pre-submit check.

Proposed solution:

Use matchPolicy: Exact for the v1 policy.
Use matchPolicy: Equivalent and an exclude match rule for v1 for the policy that handles v2+. This way, if a v3 is added in the future, the policy for v2 applies via version conversion by default.

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: "policy1.example.com"
spec:
  match:
    resourceRules:
    - apiGroups:   ["example.com"]
      apiVersions: ["v1"]
      operations:  ["CREATE", "UPDATE"]
      resources:   ["myCRD"]
    matchPolicy: Exact
  validations:
  - expression: "object.v1fieldname == 'xyz'"

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: "policy2.example.com"
spec:
  match:
    excludeResourceRules:
    - apiGroups:   ["example.com"]
      apiVersions: ["v1"]
      operations:  ["CREATE", "UPDATE"]
      resources:   ["myCRD"]
    resourceRules:
    - apiGroups:   ["example.com"]
      apiVersions: ["*"]
      operations:  ["CREATE", "UPDATE"]
      resources:   ["myCRD"]
    matchPolicy: Equivalent # is the default
  validations:
  - expression: "object.v2fieldname == 'xyz'"

Use Case: Prevent admission webhooks from matching a reserved namespace

Cluster administrator wishes to prevent admission webhooks from matching requests to specific namespaces. E.g. kube-system or some other namespace that is critical to the cluster.

Let’s assume for this example that the namespace is kube-system.

One approach would be to require that kube-system contain a special label and that the 1st match rule of an admission webhook use a namespaceSelector:

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: "policy1.example.com"
spec:
  match:
    resourceRules:
    - apiGroups:   ["admissionregistration.k8s.io"]
      operations:  ["CREATE", "UPDATE"]
      resources:   ["ValidatingAdmissionWebhook", "MutatingAdmissionWebhook"]
  validations:
  - expression: >
      has(object.namespaceSelectors) && object.namespaceSelectors.size() > 0 &&
      object.namespaceSelectors[0].key == 'webhook-restricted' &&
      object.namespaceSelectors[0].namespaceSelector.operator == 'In' &&
      object.namespaceSelectors[0].namespaceSelector.values == ['true']
    message: "The 1st namespaceSelector of ValidatingAdmissionWebhook and MutatingAdmissionWebhooks must be: {key: webhook-restricted, operator: In, values: ['true']}"
    reason: Forbidden

This approach would pair well with an admission mutation that adds the rule to exclude kube-system to all admission webhooks. This would require CEL mutating admission support.

More general types of validations like this would benefit from CEL support for functions like labelSelector.match().

Use Case: Fine grained control of enforcement

Policy author wishes to define a policy where the cluster administrator is able to configure how a policy is enforced by defining a series of progressively stricter levels.

Multiple copies of the same expression can be used, each guarded by a params check:

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: "policy1.example.com"
spec:
  match: ...
  validations:
  - expression: "!(params.enforceLevel > 2) || <cel expression>"
    reason: Invalid
  - expression: "!(params.enforceLevel > 1) || <cel expression>"
    reason: Invalid
  - expression: "<cel expression>"
    reason: Invalid
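The guarded-expression pattern can be modelled in a few lines of Python. This is only a sketch under assumptions: the thresholds follow the example above, while the replica limits standing in for "<cel expression>" are invented so that each level has something different to check.

```python
def failed_validations(validations, enforce_level, obj):
    """Return the indices of validations that fail for obj.

    Each validation is (threshold, check). A threshold of None means the
    validation is always enforced; otherwise it is enforced only when
    enforce_level > threshold, i.e. "!(enforceLevel > threshold) || check".
    """
    failures = []
    for index, (threshold, check) in enumerate(validations):
        enforced = threshold is None or enforce_level > threshold
        if enforced and not check(obj):
            failures.append(index)
    return failures


# Hypothetical checks, strictest first.
validations = [
    (2, lambda o: o["replicas"] <= 10),      # only when enforceLevel > 2
    (1, lambda o: o["replicas"] <= 50),      # only when enforceLevel > 1
    (None, lambda o: o["replicas"] <= 100),  # always enforced
]

relaxed = failed_validations(validations, 1, {"replicas": 30})
strict = failed_validations(validations, 3, {"replicas": 30})
```

Raising enforceLevel in the parameter resource turns on additional validations without touching the policy definition.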

Use Case: Migrating from validating webhook to validation policy

Steps:

Webhook is configured and in-use.
ValidatingAdmissionPolicy is created with failurePolicy: Ignore
ValidatingAdmissionPolicy is monitored to ensure it behaves the same as the webhook (logs or audit annotations can be used)
ValidatingAdmissionPolicy is updated to failurePolicy: Fail
Verify the webhook never denies any requests. If the admission policy is equivalent, then the policy will be run first and deny the request before webhooks are called.
Webhook is configured with failurePolicy: Ignore (optional)
Webhook configuration is deleted

Use Case: Pre-existing Deployment triggers rollout long after Pod policy is changed

User creates a Deployment
User uses the Deployment to roll out a ReplicaSet
A validation policy is introduced for Pods; it is set to Warn
Because deployments can be infrequent, a Deployment that would create pods violating the policy is not noticed
Validation policy is set to Deny
The problem Deployment is used to roll out a new ReplicaSet; its pods fail policy validation

Note that even if all existing objects in the system are checked against the new policy, the Deployment will not be noticed to violate it unless the PodTemplate is checked. And if the policy is for Pods specifically, there is no automatic checking of the PodTemplate.

xref: https://kyverno.io/docs/writing-policies/autogen/

Use Case: Rollout of a new validation expression to an existing policy

Policy definition A exists in cluster with policy bindings X1..Xn
“temporary” policy definition B is created with the new validation, it has the same settings as policy definition A otherwise (e.g. it uses the same param CR)
Policy bindings X1..Xn are replicated as Y1..Yn but modified to use policy definition B and set enforcement to warning only
Cluster administrators observe violations (via metrics, audit logs or logged warnings)
Cluster administrator determines new validation is safe
Policy bindings Y1..Yn have enforcement set to deny
If anything goes wrong, revert enforcement back to warning only
Policy definition A is updated to include the new validation
Policy definition B and policy bindings Y1..Yn are deleted

Use Case: Canary-ing a policy

New policy definition is created
Any needed param CRs are created
Policy bindings are created with enforcement set to warning only
Cluster administrators observe violations (via metrics, audit logs or logged warnings)
Cluster administrator determines new policy is safe
Policy bindings have enforcement set to deny

Potential Applications

As mentioned earlier, we aim to provide a customizable, in-process validation of requests to the Kubernetes API server as an alternative to validating admission webhooks. The current policy enforcement is mainly done through:

Built-in admission controllers like PSA
External admission controllers in the ecosystem like K-Rail, Kyverno, Kubewarden and OPA/Gatekeeper
Self developed validating admission webhooks

Use Case: Built-in admission controllers

Extending to security use cases beyond what PodSecurityAdmission (replacement of PSP) provides.

Use cases for extending Pod Security admission:

Further limitations on CSI volumes
Limitations on seccomp and AppArmor localhost profiles
Additional limitations on which UIDs can be used
Application or namespace specific SELinux restrictions
Restricting privileged namespaces

Use Case: Kubewarden

Kubewarden is a policy engine for Kubernetes. It helps with keeping the Kubernetes clusters secure and compliant. Kubewarden policies can be written using regular programming languages or Domain Specific Languages (DSL). Policies are compiled into WebAssembly modules that are then distributed using traditional container registries.

Policy hub for ready to use policies: https://hub.kubewarden.io/
Policy examples: https://github.com/topics/kubewarden-policy

Use Case: OPA/Gatekeeper

Gatekeeper uses the OPA Constraint Framework to describe and enforce policy. A community-owned library of policies for the OPA Gatekeeper project: https://github.com/open-policy-agent/gatekeeper-library

Use Case: K-Rail

k-rail is a workload policy enforcement tool for Kubernetes. Policy violation examples: https://github.com/cruise-automation/k-rail#viewing-policy-violations

Use Case: Kyverno

Kyverno is a policy engine designed for Kubernetes. It can validate, mutate, and generate configurations using admission controls and background scans. Kyverno policy examples: https://kyverno.io/policies/

Use Case: Cloud Provider Extensions

PVL Admission controller (which is deprecated) is being replaced by a webhook (issue, KEP) - requires mutation.

Notes/Constraints/Caveats (Optional)

Risks and Mitigations

Design Details

CEL Integration with Kubernetes native types

While implementing CRD Validation Rules, CEL was integrated with CRD structural schemas and the “unstructured” data representation. For admission control, we also need CEL to be integrated with the Kubernetes Go structs used to represent native API types, both for type checking and for runtime data access.

Writing to Status

This enhancement proposes using status of ValidatingAdmissionPolicy to communicate type-checking errors and any other misconfigurations such as CRD not found errors.

As mentioned in https://github.com/kubernetes/enhancements/pull/3492#discussion_r964841045 , status on API server configuration objects has been tricky to design in the past, because of the following:

multiple active kube-apiservers (sometimes at identical versions, sometimes skewed by one version during upgrade)
multiple active non-kube-apiserver servers (aggregated servers)

As a concrete example, the CRD NonStructural status field takes advantage of the metadata generation field (https://github.com/kubernetes/kubernetes/commit/2cfc3c69dc7c17b2711af0168f39ed7f515675c2). This avoids repeated updates for the same generation and potential fights between API servers in HA environments, even without leader elected controllers.

We will use a similar approach. The complication here is that for this case we must consider up to 3 resources:

ValidatingAdmissionPolicy resource
Parameter CRD
The CRD for the kind-under-test, if it is a CRD (and not a built-in type)

In order to be able to know how old the three resources were when the status was last written, we must track additional information in the status:

apiVersion: "admissionregistration.k8s.io/v1alpha1"
kind: "ValidatingAdmissionPolicy"
metadata:
  name: "myPolicy"
  generation: 2
  ...
status:
  paramKind:
    apiVersion: "example.com/v1"
    kind: "fooLimits"
    generation: 5
    resourceVersion: 10100
  matchedCustomResource:
    apiVersion: "example.com/v1"
    kind: "foo"
    generation: 100
    resourceVersion: 10200

Whenever an apiserver is performing a sync (of any of these three resources), it takes the latest state it has of all three resources and checks if they represent forward progress compared to a consistent read of the resource.

Forward progress is (1) no resource state is older than used for the last status update (2) at least one resource state is newer.

For the policy's own metadata.generation, this is trivial: just compare the observed generation with that of the current state.

For referenced CRDs, it is more involved:

Forward progress requires comparing both the apiVersion/kind and the generation/resourceVersion.
If the apiVersion/kind does not match the CRD from the spec, it can safely be considered older without checking the generation. Goal is to converge with what is in the spec.
Generation and resourceVersion comparison:
- observed resourceVersion < existing status resourceVersion: older.
- observed resourceVersion > existing status resourceVersion && observed generation == status generation: same.
- observed resourceVersion > existing status resourceVersion && observed generation > status generation: newer.
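The forward-progress rule can be sketched as follows. This is an illustration under stated assumptions, not the implementation: it treats resourceVersion as a comparable integer (as in the example status above), reads "older" as an observed resourceVersion lower than the recorded one, treats every combination not listed above as "same", and uses hypothetical names (CRDState, compare_crd, is_forward_progress).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CRDState:
    api_version: str
    kind: str
    generation: int
    resource_version: int


def compare_crd(observed, recorded, spec_api_version, spec_kind):
    """Classify an observed CRD state relative to the state recorded in status."""
    if (observed.api_version, observed.kind) != (spec_api_version, spec_kind):
        # Does not match what the spec references: considered older.
        return "older"
    if observed.resource_version < recorded.resource_version:
        return "older"
    if (observed.resource_version > recorded.resource_version
            and observed.generation > recorded.generation):
        return "newer"
    return "same"


def is_forward_progress(observed_policy_gen, recorded_policy_gen, crd_results):
    """Forward progress: nothing is older and at least one resource is newer."""
    results = list(crd_results)
    if observed_policy_gen < recorded_policy_gen:
        results.append("older")
    elif observed_policy_gen > recorded_policy_gen:
        results.append("newer")
    else:
        results.append("same")
    return "older" not in results and "newer" in results


recorded = CRDState("example.com/v1", "fooLimits", 5, 10100)
newer = compare_crd(CRDState("example.com/v1", "fooLimits", 6, 10150),
                    recorded, "example.com/v1", "fooLimits")
progress = is_forward_progress(2, 2, [newer, "same"])
```

An apiserver whose view yields no forward progress would skip the status write, which is what keeps skewed or lagging servers from overwriting newer status.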

If the controller has observed forward progress it updates the entire status, including any conditions and error information:

status:
  ...
  conditions:
    - type: "Available" # TODO: pick an appropriate type for broken policies
      status: "False"
      reason: Misconfigured
      message: "Validation expressions contain errors. Param custom resource definition not found."
      ...
  validationErrors:
    - expression: "object.baz > params.min"
      errors:
        - "illegal ..."
        - "no such field ..."
  paramSourceErrors:
    - "paramSource custom resource definition not found"

Note that write conflicts do not require a retry since the write that caused the conflict will result in another sync once it is observed.

Alternative Considered: Use Leader election

Pro:

Reconciliation loop becomes noticeably simpler

Con:

Implementation difficulty: I suspect an entire KEP could be dedicated to using leader election for this purpose.

Versioning

Policy Definition Versioning

As a built-in type, ValidatingAdmissionPolicy follows Kubernetes API guidelines.

Parameter CRD Versioning

A parameter CRD may offer a new version using the existing CRD schema versioning and version conversion support. The policy definition can then migrate from reading the old version to the new version.

Test Plan

[X] I/we understand the owners of the involved components may require updates to existing tests to make this code solid enough prior to committing the changes necessary to implement this enhancement.

Prerequisite testing updates

N/A

Unit tests

<package>: <date> - <test coverage>

Integration tests

In the first alpha phase, the integration tests are expected to be added for:

The behavior with the feature gate and API turned on/off, in all combinations
The happy path with everything configured and validation proceeded successfully
Validation with different enforcement policies
Validation with different failure policies
Validation with different Match Criteria
Validation violations for different reasons including type checking failures, misconfiguration, failed validation, etc and formatted messages
Singleton policy
Validation limit check including cost limit, max policy binding limit, max length, etc.

e2e tests

We will test the edge cases mostly in integration tests and unit tests. We may add e2e tests to spot check the presence of the feature.

Graduation Criteria

Alpha

Feature implemented behind a feature flag
Ensure proper tests are in place.

Beta

benchmark and resolve optimization issues, including:
- add tests which register a validation policy for everything, iterate through all groups/versions/resources/subresources, and ensure they get intercepted and work properly with a CEL validation policy (comment)
- setting paramKind in a ValidatingAdmissionPolicy results in starting a new informer that watches all instances of that object using a new unstructured informer, which is inefficient (comment)
- switch to a lock-free implementation to address the lock having to wait for all existing admission evaluations to complete and blocking all new admission evaluations until this completes (comment1, comment2)
- Perform minimal possible number of conversions when evaluating multiple admission policies for a request resource. If multiple admission policies require the same conversion, convert only once. From @liggitt: “webhook code loops up one level, first accumulates all the validation webhooks we’ll run, then converts to the versions needed by those webhooks then evaluates in parallel”
authz check to the specific resource referenced in the policy’s paramKind. (comment )
complete feature of access to namespace metadata
add controlled rollout strategy to support future CEL library/function/variable changes
Quantity support from CEL expression and tested properly
support the list of features mentioned under phase 2
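The "convert only once" item in the list above can be sketched as a two-pass loop: first collect the versions needed by all policies, then convert once per version and evaluate. The function and parameter names are hypothetical, the convert callback stands in for real version conversion, and the sketch evaluates sequentially rather than in parallel.

```python
def evaluate_policies(request_obj, policies, convert):
    """Evaluate (required_version, validate) pairs with one conversion per version.

    convert(obj, version) is a caller-supplied stand-in for version conversion.
    """
    converted = {}
    # Pass 1: convert the request object once per distinct required version.
    for version, _ in policies:
        if version not in converted:
            converted[version] = convert(request_obj, version)
    # Pass 2: run every validation against the already converted object.
    return [validate(converted[version]) for version, validate in policies]


conversions = []


def fake_convert(obj, version):
    # Records each call so the number of conversions can be inspected.
    conversions.append(version)
    return {**obj, "apiVersion": version}


results = evaluate_policies(
    {"replicas": 3},
    [
        ("v1", lambda o: o["replicas"] < 5),
        ("v2", lambda o: o["apiVersion"] == "v2"),
        ("v1", lambda o: o["replicas"] > 5),
    ],
    fake_convert,
)
```

Three policies are evaluated here, but only two conversions take place because two of the policies share v1.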

GA

Complete type check for CRD and aggregated types
Scalability evaluation. Evaluate ValidatingAdmissionPolicy scalability including how many validators it could run, runtime cost evaluation, evaluating the reasonable ResourceQuota for ValidatingAdmissionPolicy, how much faster it is compared with webhooks, the scale target, etc.
Fix the known issues along the way (variables should keep the type info, variables should mark omitempty, properly escape, reconciliation with api-server restart)
Get agreement on the excluded resources in this issue.
Define the mechanism for the CEL library update policy after GA (version skew policy, the policy to update CEL library, etc). Sync with Declarative Validation on CEL libraries to prevent unexpected libraries from being added after GA. Provide documentation on library change.
Adoption: at least two organizations have demonstrated it is useful for major use cases.
Have enough test coverage for common use cases and corner use cases.

Upgrade / Downgrade Strategy

In alpha, no changes are required to maintain previous behavior. To make use of the enhancement, the feature gate and the required API can be turned on.

Version Skew Strategy

N/A

Production Readiness Review Questionnaire

Feature Enablement and Rollback

How can this feature be enabled / disabled in a live cluster?

Feature gate (also fill in values in kep.yaml)
- Feature gate name: ValidatingAdmissionPolicy
- Components depending on the feature gate: kube-apiserver

Does enabling the feature change any default behavior?

No, default behavior is the same.

Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?

Yes, disabling the feature will result in validation expressions being ignored.

What happens if we reenable the feature if it was previously rolled back?

ValidatingAdmissionPolicy resources will be enforced again.

Are there any tests for feature enablement/disablement?

Unit tests and integration tests will be introduced in the alpha implementation.

Rollout, Upgrade and Rollback Planning

How can a rollout or rollback fail? Can it impact already running workloads?

The existing workload could potentially fail the validation and cause unexpected failures if the validation is misconfigured.

During rollout, the cluster administrator could configure the feature with enforcement: [Warn, Audit] or similar, and wait until they are comfortable switching the enforcement level to Deny. This minimizes the effect on running workloads.

Note that if an object was admitted while the feature was off, then once the feature is turned back on the validation policy might prevent that object from being edited if validation rules apply to it.

What specific metrics should inform a rollback?

On a cluster that has not yet opted into ValidatingAdmissionPolicy, non-zero counts for either of the following metrics mean the feature is not working as expected:

cel_admission_validation_total
cel_admission_validation_errors

Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?

Upgrade and rollback will be tested before the feature goes to Beta.

Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?

No.

Monitoring Requirements

How can an operator determine if the feature is in use by workloads?

The following metrics could be used to see if the feature is in use:

validating_admission_policy/check_total
validating_admission_policy/definition_total

How can someone using this feature know that it is working for their instance?

Metrics like validating_admission_policy/check_total can be used to check how many validations were applied in total
Audit mode can be used to check audit events, following this documentation
ValidatingAdmissionPolicy.Status can be used to see if type checking performed as expected
Users can also verify that the admission request is rejected or a warning is shown as expected, based on how validationAction is set.

What are the reasonable SLOs (Service Level Objectives) for the enhancement?

No impact on admission request latency when no ValidatingAdmissionPolicy resources are present.

Performance when ValidatingAdmissionPolicy resources are in use will need to be measured and optimized.

What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?

The Metrics below could be used:
- validating_admission_policy/check_total
- validating_admission_policy/definition_total
- validating_admission_policy/check_duration_seconds

Are there any missing metrics that would be useful to have to improve observability of this feature?

No. We are open to input.

Dependencies

Does this feature depend on any specific services running in the cluster?

No.

Scalability

Will enabling / using this feature result in any new API calls?

Yes. A new API group is introduced which will be used for this feature.

Will enabling / using this feature result in introducing new API types?

Yes. We introduced two new kinds for this feature: ValidatingAdmissionPolicy and ValidatingAdmissionPolicyBinding as described in this doc

Will enabling / using this feature result in any new calls to the cloud provider?

No.

Will enabling / using this feature result in increasing size or count of the existing API objects?

No.

Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?

The existing admission request latency might be affected when the feature is used. We expect this to be negligible and will measure it before GA.

Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, …) in any components?

We don’t expect it to. Compared with the existing methods of achieving the same goal, using this feature will not result in a non-negligible increase of resource usage.

Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?

No.

Troubleshooting

How does this feature react if the API server and/or etcd is unavailable?

Same as without this feature.

What are other known failure modes?

N/A

What steps should be taken if SLOs are not being met to determine the problem?

If the performance impact is not tolerable, the feature can be disabled by disabling the API or setting the feature gate to false.
Try running the validations separately to see which rule is slow.
Remove the problematic rules or update the rules to meet the requirement.

Implementation History

Drawbacks

Future Work

cel-policy-template range or equivalent.
Default validations?
Short circuiting of validation (right now all are always evaluated)?
CEL based matching support?
kubectl support for this feature could show information about a policy and how it is applied? could be really useful pre-GA to help users

Alternatives

Type checking alternatives

Alternatives are summarized here and discussed in more detail below:

Design Alternative	Summary
Typesafe CEL expressions and scopes	Expressions and the schema paths of expression scopes are fully typechecked, any type errors trigger the `failureMode` of the policy
Typesafe CEL expressions, dynamic scopes	Expressions are typesafe, but scopes are dynamically typed, easing version skew cases
Informational type checking	Expressions and scopes are typechecked, but only to report warnings, evaluation is dynamic
No typechecking	Expressions and scopes are evaluated dynamically

Alternative: Typesafe CEL expressions and scopes

To keep failure policy easy to reason about, and to continue to use CEL in a type-safe way we propose:

If a ValidatingAdmissionPolicy has a spec.match that matches a single GVK, the CEL expression is allowed access to the full object in a typesafe way. Otherwise, the CEL expression is allowed access to the metadata only.
If there are any type checking errors (or if the CRD for the matched GVK does not exist):
- When a ValidatingAdmissionPolicy is created/updated: Any type check errors against Kubernetes built-in types result in the create/update request failing validation with the type error.
- When any CRD a ValidatingAdmissionPolicy needs for type checking is created/updated: The type check errors are detected by a control loop watching the CRDs with an informer in the API server, and reported in the status of the ValidatingAdmissionPolicy. The policy toggles to a “misconfigured” state where all admission requests matching any of the policy configurations of the policy fail according to the FailureMode.
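The rule that decides between full-object access and metadata-only access can be sketched as a check over the match rules. This is a deliberate simplification with hypothetical function names: it works on resource rules rather than resolved GVKs, ignores subresources and exclude rules, and treats any wildcard as matching multiple types.

```python
def matches_single_type(resource_rules):
    """True when the rules name exactly one group, one version and one resource."""
    if len(resource_rules) != 1:
        return False
    rule = resource_rules[0]
    for key in ("apiGroups", "apiVersions", "resources"):
        values = rule.get(key, [])
        if len(values) != 1 or "*" in values[0]:
            return False
    return True


def accessible_scope(resource_rules):
    """Return which part of the object expressions may access in a typesafe way."""
    return "object" if matches_single_type(resource_rules) else "metadata"


deployments_only = [{
    "apiGroups": ["apps"], "apiVersions": ["v1"],
    "operations": ["CREATE", "UPDATE"], "resources": ["deployments"],
}]
everything = [{
    "apiGroups": ["*"], "apiVersions": ["*"],
    "operations": ["CREATE", "UPDATE"], "resources": ["*"],
}]

narrow_scope = accessible_scope(deployments_only)
wide_scope = accessible_scope(everything)
```

The two rule sets correspond to the two examples that follow: the first may read fields such as spec.replicas, the second only metadata.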

Example: Typesafe access to object

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: "validate-xyz.example.com"
spec:
  match:
    resourceRules:
    - apiGroups:   ["apps"]
      apiVersions: ["v1"]
      operations:  ["CREATE", "UPDATE"]
      resources:   ["deployments"]
  validations:
    # replicas is accessible because this resource matches only v1 deployments
  - expression: "object.spec.replicas < 100"

Example: Typesafe access only to metadata

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: "validate-xyz.example.com"
spec:
  match:
    resourceRules:
    - apiGroups:   ["*"]
      apiVersions: ["*"]
      operations:  ["CREATE", "UPDATE"]
      resources:   ["*"]
  validations:
    # minReadySeconds is not accessible because this resource matches multiple types
  - expression: "object.spec.minReadySeconds > 60" # ERROR! No such field "minReadySeconds".
    # metadata is always accessible
  - expression: "object.name.startsWith('xyz')"

Pros:

All CEL expressions are type checked.

Cons:

Does not support all above cases, in particular: version skew cases, validation of an aggregated API server type.
Typechecking happens quite late for some operations, lots of failure modes to reason through (policy created before CRD, CRD updated/deleted/recreated, CRD schema differs across clusters, incompatible CRD change, version skew, …)

Alternative Considered: Typesafe CEL expressions, dynamic scopes

Idea is to use “CEL expression scoping” (see section below) in such a way that missing schema fields due to version skew or CRD changes/inconsistencies can be tolerated.

Scope schema paths are typechecked, but if any fields are missing:

the scoped expression skips validation
warnings are reported, but no error states are triggered

Example usage:

# Policy definition
apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicy
metadata:
  name: "validate-xyz.example.com"
spec:
  paramKind:
    group: rules.example.com
    kind: ReplicaLimit
    version: v1
  match:
    resourceRules:
    - apiGroups:   ["apps"]
      apiVersions: ["v1"]
      operations:  ["CREATE", "UPDATE"]
      resources:   ["deployments"]
  validations:
    - expression: "self.name.startsWith('xyz-')"
      scopes: ["spec.containers[*]", "spec.initContainers[*]", "spec.ephemeralContainers[*]"]
status:
  expressionWarnings:
    - expression: "self.name.startsWith('xyz-')"
      scope: "spec.ephemeralContainers[*]"
      # For Kubernetes versions that pre-date the ephemeralContainers field:
      warnings: ["spec.ephemeralContainers[*] is not a valid schema path"]
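The skip-and-warn behaviour of dynamic scopes can be sketched as below. Note the assumptions: the sketch resolves scope paths against the object data itself, whereas the alternative describes checking them against the schema, it only understands dotted paths with an optional trailing [*], and all function names are hypothetical.

```python
def resolve_scope(obj, scope):
    """Return the items selected by a scope path, or None if a field is missing."""
    is_list_scope = scope.endswith("[*]")
    path = scope[:-3] if is_list_scope else scope
    current = obj
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current if is_list_scope else [current]


def validate_scopes(obj, scopes, check):
    """Run check on every scoped item; missing scopes are skipped with a warning."""
    failures, warnings = [], []
    for scope in scopes:
        items = resolve_scope(obj, scope)
        if items is None:
            warnings.append(f"{scope} is not a valid schema path")
            continue  # the scoped expression skips validation
        failures.extend((scope, item) for item in items if not check(item))
    return failures, warnings


pod = {"spec": {"containers": [{"name": "xyz-a"}, {"name": "bad"}]}}
failures, warnings = validate_scopes(
    pod,
    ["spec.containers[*]", "spec.ephemeralContainers[*]"],
    lambda c: c["name"].startswith("xyz-"),
)
```

The missing ephemeralContainers scope produces a warning only, while the violation under spec.containers is still reported.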

Pros:

Retains typechecking of CEL expressions while still supporting version skew cases and CRD changes/inconsistencies via the dynamic evaluation of expression scopes.
Possible to have a policy definition suppress expression scope warnings. E.g. suppressWarning: { type: MissingField, field: spec.ephemeralContainers, reason: 'Field is only available in Kubernetes 1.x+' }

Cons:

Does not handle aggregated API server case.
Strange mix of type safety and dynamic typing. Difficult to explain, document, justify.

Alternative: Informational type checking

All CEL expressions are evaluated dynamically.

Type checking is still performed for all expressions where a GVK can be matched to type check against, resulting in warnings, e.g.:

...
status:
  expressionWarnings:
    - expression: "object.foo"
      warning: "no such field 'foo'"

Pros:

Can handle all use cases listed.
Does not depend on implementing “CEL expression scoping” to support listed use cases.
Policy definition authors can still opt-in to take full advantage of type checking at development time.
Cluster administrators can check if a policy passes type checking before enabling it.
Possible to have a policy definition suppress warnings. E.g. suppressWarning: { type: MissingField, field: spec.ephemeralContainers, reason: 'Field is only available in Kubernetes 1.x+' }

Cons:

Type errors that would have prevented production issues can be ignored.
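A rough sketch of how informational type checking with warning suppression might fit together is shown below. It assumes a simplified schema dictionary, treats each expression as a list of dotted field paths (a real checker would work on the CEL AST), and uses hypothetical helper names; the `type` and `field` keys mirror the suppressWarning example above.

```python
def missing_field_warnings(field_paths, schema):
    """Return one warning per field path that the schema does not declare.

    `schema` is None when no GVK could be matched; nothing is checked in that
    case and the expression is simply evaluated dynamically.
    """
    if schema is None:
        return []
    warnings = []
    for path in field_paths:
        node = schema
        for name in path.split("."):
            node = node.get("properties", {}).get(name)
            if node is None:
                warnings.append({
                    "type": "MissingField",
                    "field": path,
                    "warning": f"no such field '{name}'",
                })
                break
    return warnings


def apply_suppressions(warnings, suppressions):
    """Drop warnings that a policy author has explicitly suppressed."""
    suppressed = {(s["type"], s["field"]) for s in suppressions}
    return [w for w in warnings if (w["type"], w["field"]) not in suppressed]


schema = {"properties": {"spec": {"properties": {"replicas": {"type": "integer"}}}}}
found = missing_field_warnings(
    ["spec.replicas", "foo", "spec.ephemeralContainers"], schema
)
shown = apply_suppressions(
    found, [{"type": "MissingField", "field": "spec.ephemeralContainers"}]
)
```

Warnings never block admission in this model; they are only surfaced in the policy status, which is why a type error can still reach production if nobody reads them.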

Alternative Considered: No typechecking

Pros:

Possible to handle all cases dynamically.

Cons:

No opportunity to benefit from type checking.

Policy definition and configuration separation alternatives

Alternative: Duck Typed CRDs

This is the alternative shown in the initial examples of this KEP.

Policy authors write a CRD to define how each policy is configured. E.g.:

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  # CRD names must have the form <plural>.<group>.
  name: replicalimits.rules.example.com
  annotations:
    admission.kubernetes.io/is-policy-configuration-definition: "true"
spec:
  group: rules.example.com
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                config:
                  type: object
                  properties:
                    maxReplicas:
                      type: integer
  scope: Cluster
  names:
    kind: ReplicaLimit
    plural: replicalimits

The admission.kubernetes.io/is-policy-configuration-definition annotation means “inject the correct .spec.match during admission and keep it up to date with a controller”. (Suggested by deads2k). This minimizes version skew if new match criteria are added to spec.match, and it also minimizes development effort by removing the need to manually declare the fields in CRDs.

The main challenge with this alternative is dealing with mismatches between how CRDs declare the spec.match schema and what the apiserver expects. Even with a controller keeping the schema in sync, it can briefly fall out of sync. For this, our plan is:

When consuming configuration CRDs OR policy configuration resources used for policy configuration:
- If any unrecognized fields, missing required fields, or incorrectly typed fields are found under spec.match:
  - For configuration CRDs:
    - Set the ValidatingAdmissionPolicy state to “misconfigured” in the status (via a Condition, I believe).
    - Trigger the FailurePolicy on all admission validations.
    - Add a detailed error in the status of the ValidatingAdmissionPolicy.
  - For policy configuration resources:
    - Track in the status of ValidatingAdmissionPolicy that some policy configurations are misconfigured. (also via a Condition?).
    - Add a detailed error in the status of the ValidatingAdmissionPolicy.
    - Trigger the FailurePolicy on admission for resources that match the policy configuration.
- If the CRD is deleted:
  - Set state to “misconfigured”.
  - Trigger FailurePolicy on all admission validations.
  - Add a detailed error in the status of the ValidatingAdmissionPolicy.

A partial spec.match schema (subset of the full schema) is okay so long as only optional fields are omitted. But any unrecognized field in the spec.match would not be allowed.
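The partial-schema rule can be illustrated with a small check. The field names and their required/optional split below are placeholders chosen for the example, not the actual spec.match schema, and `check_match_schema` is a hypothetical helper.

```python
def check_match_schema(declared, expected):
    """Compare a CRD's declared spec.match fields with what the apiserver expects.

    `declared` is the set of field names the CRD declares under spec.match.
    `expected` maps each known field name to True if it is required.
    Returns a list of error strings; an empty list means the schema is acceptable.
    """
    errors = []
    for name in sorted(declared - set(expected)):
        errors.append(f"unrecognized field: {name}")
    for name, required in sorted(expected.items()):
        if required and name not in declared:
            errors.append(f"missing required field: {name}")
    return errors


# Placeholder expectations: one required field and two optional ones.
EXPECTED = {"resourceRules": True, "namespaceSelector": False, "objectSelector": False}

ok = check_match_schema({"resourceRules"}, EXPECTED)
typo = check_match_schema({"resourceRules", "namespaceSelektor"}, EXPECTED)
incomplete = check_match_schema({"namespaceSelector"}, EXPECTED)
```

A non-empty result would correspond to the “misconfigured” state described in the plan above.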

Proposed annotation:

Example: admission.kubernetes.io/is-policy-configuration-definition: "true"
Used on: CustomResourceDefinition
When a CustomResourceDefinition has the annotation set to “true”, the OpenAPIv3 schema of all versions of this resource is modified and then kept in sync by a controller to always contain the expected schema fields of admission policy configuration resources.

xref: https://kubernetes.io/docs/reference/labels-annotations-taints/

Pros:

Concise. A single resource configures both match criteria and configuration params.
If later Kubernetes OpenAPIv3 supports $ref, there is a migration path from this approach to the $ref approach (below)

Cons:

A single resource for both configuration params and matching rules is a problem when using the same configuration with multiple policies, each of which needs different matching rules.
API server must check for a wide range of error conditions and define how exactly it handles each of them.
If the spec.match schema is incorrectly defined, CRD author might not realize it since they need to check the status of the corresponding ValidatingAdmissionPolicy for any errors.
Changing this schema in the future could be extremely difficult. CRD schemas are atomic from a server-side-apply perspective (spec.versions on down).

Alternative: OpenAPIv3 `$ref` in CRDs

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  # CRD names must have the form <plural>.<group>.
  name: replicalimits.rules.example.com
spec:
  group: rules.example.com
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                config:
                  type: object
                  properties:
                    maxReplicas:
                      type: integer
                match:
                  $ref: "#/components/schemas/matchrules"
  scope: Cluster
  names:
    kind: ReplicaLimit
    plural: replicalimits

Pros:

Match rule schema is owned by Kubernetes, so as it evolves, CRDs automatically pick up changes.

Cons:

CRDs do not yet support OpenAPIv3 $refs, so support would need to be added, presumably via a separate KEP.

Alternative: `/matchRules` subresource

The idea of this alternative is to require a configuration CRD to declare that it provides match criteria by exposing a /matchRules subresource:

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: replicalimit.example.com
spec:
  group: example.com
  versions:
    ... # configuration schema(s) go here
  subresources:
    matchRules:
      matchRulesPath: .spec.match

Pros:

CRDs explicitly opt in to providing match criteria in a structured way.
Follows the pattern used by the scale subresource to provide polymorphism across CRDs.

Cons:

One of the primary purposes of subresources is accessing/modifying a portion of a resource independently, which is not what we’re trying to achieve.
From a Kubernetes development/maintenance perspective, subresources, particularly for CRDs, are expensive to introduce and maintain.

Alternative: `PolicyConfiguration` kind with config embedded

apiVersion: admissionregistration.k8s.io/v1
kind: PolicyConfiguration
metadata:
  name: "replica-limit-prod.example.com"
spec:
  match:
    namespaceSelectors:
    - key: environment
      operator: NotIn
      values: ["test"]
  config:
    apiVersion: rules.example.com/v1
    kind: ReplicaLimit
    spec:
      maxReplicas: 100

Pros:

Matching criteria is fully defined and validated in a builtin type

Cons:

Embedded config needs the same treatment as a custom resource, but reimplementing it all on an embedded resource is, at best, highly impractical in the apiserver as it exists today. E.g., there is no automatic validation of the embedded resource.

Alternative: Generate CRDs

The ValidatingAdmissionPolicy contains an OpenAPIv3 schema defining how the policy is configured. The schema need only contain the policy-specific configuration (it does not need to contain match rules or anything else that is standard).

apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicy
metadata:
  name: "validate-xyz.example.com"
spec:
  config:
    group: rules.example.com
    kind: ReplicaLimit
    version: v1
    openAPIV3Schema:
      type: object
      properties:
        spec:
          type: object
          properties:
            maxReplicas:
              type: integer

This allows the apiserver to combine the configuration schema with the match schema and then generate a CRD. The apiserver then can run a control loop to always keep the CRD match stanzas in sync with what is expected.
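As a rough sketch of the generation step, the function below merges a policy's config stanza with a match schema and emits a CRD manifest as a dictionary. `MATCH_SCHEMA` is a hypothetical stand-in for the Kubernetes-owned match schema, and `generate_crd` and its `plural` argument are assumptions for illustration only.

```python
import copy

# Hypothetical stand-in for the match schema that Kubernetes would own.
MATCH_SCHEMA = {
    "type": "object",
    "properties": {
        "resourceRules": {"type": "array", "items": {"type": "object"}},
    },
}


def generate_crd(config, plural):
    """Build a CRD manifest (as a dict) from a policy's config stanza."""
    # Copy so the policy's own schema is never modified.
    schema = copy.deepcopy(config["openAPIV3Schema"])
    spec = schema.setdefault("properties", {}).setdefault("spec", {"type": "object"})
    # Inject the match stanza next to the policy-specific fields.
    spec.setdefault("properties", {})["match"] = copy.deepcopy(MATCH_SCHEMA)
    return {
        "apiVersion": "apiextensions.k8s.io/v1",
        "kind": "CustomResourceDefinition",
        "metadata": {"name": f"{plural}.{config['group']}"},
        "spec": {
            "group": config["group"],
            "scope": "Cluster",
            "names": {"kind": config["kind"], "plural": plural},
            "versions": [{
                "name": config["version"],
                "served": True,
                "storage": True,
                "schema": {"openAPIV3Schema": schema},
            }],
        },
    }


config = {
    "group": "rules.example.com",
    "kind": "ReplicaLimit",
    "version": "v1",
    "openAPIV3Schema": {
        "type": "object",
        "properties": {
            "spec": {
                "type": "object",
                "properties": {"maxReplicas": {"type": "integer"}},
            }
        },
    },
}
crd = generate_crd(config, "replicalimits")
```

A control loop could rerun this function whenever the match schema changes and update the CRD if the result differs from what is stored.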

This could alternatively be kept separate from the policy definition by having something like:

apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionConfigurationDefinition
metadata:
  name: "validate-configuration-xyz.example.com"
spec:
  configSchema:
    group: rules.example.com
    kind: ReplicaLimit
    version: v1
    openAPIV3Schema:
      type: object
      properties:
        spec:
          type: object
          properties:
            maxReplicas:
              type: integer

Pros:

Policy author doesn’t need to define the CRD. It can instead be generated for them.

Cons:

Most of the cons of “Duck Typed CRDs” alternative apply, since ultimately a CRD is created and behaves the same.
This implies that the policy owns the configuration. But we have use cases where multiple policies share the same configuration. For this model it seems misleading and potentially problematic to have both policies attempting to define the same configuration.

CEL variables alternatives

Alternative: Scopes

Imagine that a policy validation uses a CEL expression to find an invalid value in a list nested somewhere in a resource. E.g. spec.initContainers.<listitem>.name.

How do we:

Include the invalid value in a message?
- Using a second expression to build a message that then must again find the validation error in the list duplicates a lot of code, and is inefficient.
Include the field path of the problem field in the message?
- It can be hard coded in the message string for basic cases, but for more complex cases (where there are map keys or array indices involved) it becomes messy and complicated to reconstruct.
Traverse across fields safely? CEL offers the all() macro, so traversals like spec.initContainers.all(c, c.name) are possible, but can be subtly incorrect because of optional fields (initContainers in this case), which require has() checks, e.g.: !has(spec.initContainers) || spec.initContainers.all(c, c.name).

CRD Validation Rules solved these problems by allowing validation rules to be attached to any schema in the OpenAPIv3. The validation rules are scoped to whatever location in the OpenAPIv3 they are attached.

We propose using a simple schema path format. The purpose of the path is to uniquely identify a schema from the root of a CRD's OpenAPIv3 schema.

Example:

spec.initContainers         # Schema of initContainers array
spec.initContainers[*]      # Schema of the items of the initContainers array
spec.initContainers[*].name # Schema of the name of initContainers

For example, to validate all containers:

  validations:
    - scope: "spec.containers[*]"
      expression: "scope.name.startsWith('xyz-')"
      message: "scope.name does not start with 'xyz'"

To make it possible to access the path information in the scope, we can offer a way to bind variables to the map and list indices in the path, e.g.:

spec.x[xKey].y[yIndex].field

  validations:
    - scope: "spec.x[xKey].y[yIndex].field"
      expression: "scope.startsWith('xyz-')"
      messageExpression: "'%s, %d: some problem'.format([scopePath.xKey, scopePath.yIndex])"
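To make the binding idea concrete, here is a minimal sketch of walking an object along a scope path while recording map keys and list indices. The function name `iter_scope`, the segment grammar, and the sample object are assumptions for illustration; in the proposal the scoped value and bindings would be exposed to CEL as `scope` and `scopePath`.

```python
import re

SEGMENT = re.compile(r"(\w+)(?:\[(\w+|\*)\])?")


def iter_scope(obj, scope):
    """Yield (bindings, value) for every value a scope path selects in obj.

    A segment such as "y[yIndex]" steps into each entry of the list or map at
    "y" and records the index or key under the name "yIndex"; "[*]" steps in
    without binding. Absent fields select nothing instead of raising, so
    optional fields need no explicit has() guard.
    """
    def walk(value, segments, bindings):
        if not segments:
            yield bindings, value
            return
        match = SEGMENT.fullmatch(segments[0])
        if match is None:
            raise ValueError(f"invalid scope segment: {segments[0]!r}")
        field, var = match.groups()
        if not isinstance(value, dict) or field not in value:
            return
        child = value[field]
        if var is None:
            yield from walk(child, segments[1:], bindings)
            return
        if isinstance(child, dict):
            entries = child.items()
        elif isinstance(child, list):
            entries = enumerate(child)
        else:
            return
        for key, item in entries:
            bound = bindings if var == "*" else {**bindings, var: key}
            yield from walk(item, segments[1:], bound)

    yield from walk(obj, scope.split("."), {})


obj = {"spec": {"x": {
    "a": {"y": [{"field": "xyz-1"}, {"field": "bad"}]},
    "b": {"y": [{"field": "xyz-2"}]},
}}}

failures = [
    "%s, %d: some problem" % (b["xKey"], b["yIndex"])
    for b, value in iter_scope(obj, "spec.x[xKey].y[yIndex].field")
    if not value.startswith("xyz-")
]
```

Because the bindings travel with each selected value, the message can name the exact key and index of the offending entry without a second traversal.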

Prior art:

cel-policy-templates offers a range feature that allows a CEL expression to be scoped to each entry of a map or item of an array. Multiple ranges can be combined to traverse a complex object.
Kyverno foreach declarations use JMESPath to query for the elements that are then validated.

Note: We considered extending to a list of scopes, e.g.:

  validations:
    - scopes: ["spec.containers[*]", "spec.initContainers[*]", "spec.ephemeralContainers[*]"]
      expression: "scope.name.startsWith('xyz-')"
      message: "scope.name does not start with 'xyz'"

But the feedback was that this is significantly more difficult to understand.

Message formatting alternatives

Alternative: CEL args

- expression: "..."
  message: "{1} is less than {2}"
  messageArgs: ["spec.value", "spec.max"]

Cons:

How all types are converted to string becomes the responsibility of this API. It is hard to please everyone, and this may end up needing to reimplement fmt.Sprintf, in which case formatting is probably best handled from within CEL.

Alternative: Inline CEL expressions

Single message field but it supports templating, e.g.:

"{{object.int1}} is less than {{object.int2}}"

Cons:

Must define escaping rules in the string for including {{ or }} as a literal
CEL expressions must be properly escaped
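The escaping concern can be demonstrated with a naive placeholder extractor. This is only an illustration of the failure mode, using a hypothetical `extract_expressions` helper: an expression that itself contains `}}` (here, a nested CEL map literal) is cut short unless escaping rules are defined.

```python
import re

# Naive rule: everything between "{{" and the next "}}" is an expression.
PLACEHOLDER = re.compile(r"\{\{(.*?)\}\}")


def extract_expressions(message):
    """Return the expressions embedded in a templated message string."""
    return [expr.strip() for expr in PLACEHOLDER.findall(message)]


simple = extract_expressions("{{object.int1}} is less than {{object.int2}}")

# The nested map literal ends with "}}", which the naive rule mistakes for
# the end of the placeholder, so the extracted expression is truncated.
broken = extract_expressions("{{ {'a': {'b': 1}}.a.b }}")
```

Here `simple` holds both expressions intact, while `broken` holds a single truncated fragment that would not compile.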

Alternative: Inline JSON path

"{{.object.int1}} is less than {{.object.int2}}"

Cons:

(Same as above “Inline CEL expressions”)
Author must switch between using CEL and JSON Path in adjacent fields
JSON Path is less expressive than CEL (both a pro and a con)

Alternative: CEL expressions, separate args from format string

- expression: "..."
  message: "{1} is less than {2}"
  messageArgs: ["object.int1", "object.int2"]

Note “%s is less than %s” is also viable, but CEL can always preformat and emit a string for cases where developer needs more control.

Cons:

Slightly more verbose format (but avoids all the escaping problems).
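A minimal sketch of the separate-args approach follows. It assumes 1-based `{n}` placeholders as in the example above and uses a plain dotted-path lookup as a stand-in for CEL evaluation; `lookup` and `format_message` are hypothetical names.

```python
import re


def lookup(variables, path):
    """Stand-in for evaluating a CEL expression: a plain dotted field lookup."""
    value = variables
    for name in path.split("."):
        value = value[name]
    return value


def format_message(message, message_args, variables):
    """Substitute 1-based "{n}" placeholders with the evaluated messageArgs."""
    values = [str(lookup(variables, arg)) for arg in message_args]
    # Single pass: substituted values are never rescanned for placeholders.
    return re.sub(r"\{(\d+)\}", lambda m: values[int(m.group(1)) - 1], message)


variables = {"object": {"int1": 3, "int2": 5, "label": "{2}"}}
text = format_message(
    "{1} is less than {2}", ["object.int1", "object.int2"], variables
)
literal = format_message("label is {1}", ["object.label"], variables)
```

Because expressions live in their own list, they never need escaping inside the format string, and a value that happens to look like a placeholder is emitted as-is. The `str()` call is where the string-conversion question noted for the “CEL args” alternative would resurface.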