KEP-4595: CEL for CRD AdditionalPrinterColumns

Release Signoff Checklist
Summary
Motivation
- Goals
- Non-Goals
Proposal
Design Details
Production Readiness Review Questionnaire
Implementation History
Drawbacks
Alternatives
Infrastructure Needed (Optional)

Release Signoff Checklist

Items marked with (R) are required prior to targeting to a milestone / release.

(R) Enhancement issue in release milestone, which links to KEP dir in kubernetes/enhancements (not the initial KEP PR)
(R) KEP approvers have approved the KEP status as implementable
(R) Design details are appropriately documented
(R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
- e2e Tests for all Beta API Operations (endpoints)
- (R) Ensure GA e2e tests meet requirements for Conformance Tests
- (R) Minimum Two Week Window for GA e2e tests to prove flake free
(R) Graduation criteria is in place
- (R) all GA Endpoints must be hit by Conformance Tests
(R) Production readiness review completed
(R) Production readiness review approved
“Implementation History” section is up-to-date for milestone
User-facing documentation has been created in kubernetes/website, for publication to kubernetes.io
Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes

Summary

This enhancement proposes letting users define human-readable printer columns for CustomResourceDefinitions using CEL expressions as an alternative to JSONPath.

Motivation

Currently, when creating a CustomResourceDefinition you can define a list of additionalPrinterColumns that are displayed when querying the custom resources with kubectl. Each entry in this list is defined using a JSONPath. For example, if your CustomResourceDefinition defines the following columns, running kubectl get mycrd myresource yields the output shown after them.

additionalPrinterColumns:
- name: Desired
  type: integer
  jsonPath: .spec.replicas
- name: Current
  type: integer
  jsonPath: .status.replicas
- name: Age
  type: date
  jsonPath: .metadata.creationTimestamp

NAME                 DESIRED    CURRENT     AGE
myresource           1          1           7s

This approach has several limitations: limited support for arrays, no support for conditionals, no way to compute a column value from multiple fields, and difficulty formatting a date as a duration relative to another timestamp.

With the advent of CEL, we can provide an alternative input for additionalPrinterColumns to represent the value in CEL for more complicated table readings. This would be added along with the existing JSON path and users can define additionalPrinterColumns for their CRDs in either JSON path or as a CEL expression.

Goals

Enable support for defining additionalPrinterColumns using CEL expressions in Custom Resource Definitions (CRD).
Ensure each column uses only one method—either a CEL expression or JSONPath, not both.
Allow CRDs to define a mix of columns, with some using CEL and others using JSONPath.

Non-Goals

Modifying, replacing, or phasing out JSONPath-based column definitions.
Expanding CEL’s access scope beyond the current design constraints (e.g., no access to arbitrary metadata.* fields beyond name and generateName). Refer to the Notes/Constraints/Caveats section for context.
Requiring changes to kubectl or other clients.

Proposal

This KEP proposes a new, mutually exclusive sibling field to additionalPrinterColumns[].jsonPath called additionalPrinterColumns[].expression. This field allows defining printer column values using CEL (Common Expression Language) expressions that evaluate to strings.

To support this, the CustomResourceColumnDefinition struct will be extended to accept CEL expressions for printer columns, and the API server will evaluate these expressions dynamically when responding to Table requests (e.g., kubectl get), producing richer, computed, or combined column outputs.

Example

Given this CRD snippet:

additionalPrinterColumns:
- name: Replicas
  type: string
  expression: '"%d/%d".format([self.status.replicas, self.spec.replicas])'
- name: Age
  type: date
  jsonPath: .metadata.creationTimestamp
- name: Status
  type: string
  expression: 'self.status.replicas == self.spec.replicas ? "READY" : "WAITING"'

The kubectl get output might look like:

NAME                 REPLICAS     AGE      STATUS
myresource           1/1          7s       READY
myresource2          0/1          2s       WAITING

This enhancement enables flexible, human-friendly column formatting and logic in kubectl get outputs without requiring external tooling or complex JSONPath workarounds.

User Stories (Optional)

Story 1

As a Kubernetes user, I want to define additionalPrinterColumns that correctly aggregate and display all nested arrays within my CRD, so that kubectl get outputs the full list of hosts instead of only showing the first array. Current JSONPath-based columns only print the first matching array, resulting in incomplete data.

Using CEL expressions for additionalPrinterColumns allows combining all nested arrays into a single flattened list, providing complete and accurate output in kubectl get.

additionalPrinterColumns:
- name: hosts
  jsonPath: .spec.servers[*].hosts
  type: string
- name: hosts cel
  type: string
  description: "All hosts from all servers"
  expression: "self.spec.servers.map(s, s.hosts)"

Output:

NAME   HOSTS                                   HOSTS CEL
foo0   ["foo.example.com","bar.example.com"]   [[foo.example.com, bar.example.com], [baz.example.com]]

In the above example:

spec.servers is mapped to extract each hosts array.
The resulting list of all hosts is displayed in the column output.

Once we support the CEL flatten() macro in the Kubernetes CEL environment, we can get the exact output with (self.spec.servers.map(s, s.hosts)).flatten().
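To make the difference between the nested and the flattened output concrete, here is a minimal sketch in plain Go rather than CEL. The server type and both helper functions are hypothetical names used only for illustration; they mirror what map(s, s.hosts) produces and what an additional flatten step would change.

```go
package main

import "fmt"

// server is a hypothetical stand-in for one entry of spec.servers.
type server struct {
	Hosts []string
}

// mapHosts mirrors self.spec.servers.map(s, s.hosts):
// the result holds one inner list per server.
func mapHosts(servers []server) [][]string {
	out := make([][]string, 0, len(servers))
	for _, s := range servers {
		out = append(out, s.Hosts)
	}
	return out
}

// flatten concatenates the inner lists into a single list,
// which is the extra step a flatten() call would perform.
func flatten(nested [][]string) []string {
	var out []string
	for _, inner := range nested {
		out = append(out, inner...)
	}
	return out
}

func main() {
	servers := []server{
		{Hosts: []string{"foo.example.com", "bar.example.com"}},
		{Hosts: []string{"baz.example.com"}},
	}
	nested := mapHosts(servers)
	fmt.Println(nested)          // list of lists, one per server
	fmt.Println(flatten(nested)) // single list of all hosts
}
```

The nested form corresponds to the HOSTS CEL column shown above; the flattened form is what the column would show once a flatten step is available.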

References:

Story 2

As a Kubernetes user, I want to display the status of a specific condition (e.g., the “Ready” condition) from a list of status conditions in a human-readable column when using kubectl get. Currently, jsonPath based additionalPrinterColumns cannot directly extract and display a single condition’s status from an array of conditions, which limits usability and clarity.

With CEL based additionalPrinterColumns, I can define a column using an expression that filters and selects the relevant condition, making the output more meaningful.

Example:

Using the following CRD snippet, I define a READY column that uses a CEL expression to extract the status of the “Ready” condition:

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
...
spec:
  ...
  versions:
    ...
    schema:
      openAPIV3Schema:
        type: object
        properties:
          status:
            type: object
            properties:
              conditions:
                type: array
                items:
                  type: object
                  properties:
                    type:
                      type: string
                    status:
                      type: string
  ...
  additionalPrinterColumns:
    - name: READY
      type: string
      description: 'Status of the Ready condition'
      expression: 'self.status.conditions.exists(c, c.type == "Ready") ? self.status.conditions.filter(c, c.type == "Ready")[0].status : "Unknown"'

Output:

NAME                READY
example-resource    True
example-resource2   Unknown

This expression checks if a condition with type == "Ready" exists. If so, it returns its status; otherwise, it returns "Unknown". This approach enables clear, user-friendly status reporting for conditions stored as arrays in the CRD.
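The selection logic of that expression can be restated as a short Go sketch. It is only an illustration of the intended semantics, not the evaluation code; the condition type and readyStatus function are hypothetical names.

```go
package main

import "fmt"

// condition is a hypothetical stand-in for one entry of status.conditions.
type condition struct {
	Type   string
	Status string
}

// readyStatus returns the status of the first condition whose type is
// "Ready", or "Unknown" when no such condition exists. This matches the
// exists(...) ? filter(...)[0].status : "Unknown" shape of the expression.
func readyStatus(conditions []condition) string {
	for _, c := range conditions {
		if c.Type == "Ready" {
			return c.Status
		}
	}
	return "Unknown"
}

func main() {
	withReady := []condition{
		{Type: "Progressing", Status: "False"},
		{Type: "Ready", Status: "True"},
	}
	fmt.Println(readyStatus(withReady)) // resource that reports a Ready condition
	fmt.Println(readyStatus(nil))       // resource without any conditions
}
```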

References:

https://github.com/kubernetes/kubernetes/issues/67268

Story 3

As a Kubernetes user, I want to define an additional printer column that combines multiple fields from a sub-resource into a single human-readable string. The additionalPrinterColumns defined using jsonPath can’t concatenate fields, so the output is either limited or unclear.

With CEL expressions in additionalPrinterColumns, it is possible to format and combine multiple fields cleanly for better readability.

For example, in a CRD with .spec.sub.foo and .spec.sub.bar, this column defined using CEL expression combines the two fields with a slash:

additionalPrinterColumns:
- name: "Combined"
  type: string
  description: "Combined Foo and Bar values"
  expression: '"%s/%s".format([self.spec.sub.foo, self.spec.sub.bar])'

Output:

NAME          COMBINED     AGE
myresource    foo/bar      7s

This shows output like val1/val2 in kubectl get columns, improving clarity.

References:

https://github.com/operator-framework/operator-sdk/issues/3872

Story 4

As a Kubernetes user, I want to format dates as relative durations (e.g., “5m ago” instead of absolute timestamps) in printer columns, making it easier to understand resource age or timing at a glance.

Example:

additionalPrinterColumns:
  - name: Duration
    type: string
    description: Duration between start and completion
    expression: 'timestamp(self.status.completionTimestamp) - timestamp(self.status.startTimestamp)'

Output:

NAME         DURATION
sample-job   24h7m10s

This would allow kubectl get to display the elapsed time between start and completion timestamps as a formatted duration.
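As a rough illustration of the arithmetic behind such a column, the following Go sketch subtracts two RFC 3339 timestamps. The elapsed function and the sample timestamps are assumptions made for this example; Go's time.Duration happens to print in the same shape as the sample output above, which says nothing about how the API server would format a CEL duration.

```go
package main

import (
	"fmt"
	"time"
)

// elapsed parses two RFC 3339 timestamps and returns completion minus start,
// formatted with time.Duration's default String method.
func elapsed(start, completion string) (string, error) {
	s, err := time.Parse(time.RFC3339, start)
	if err != nil {
		return "", fmt.Errorf("parsing start timestamp: %w", err)
	}
	c, err := time.Parse(time.RFC3339, completion)
	if err != nil {
		return "", fmt.Errorf("parsing completion timestamp: %w", err)
	}
	return c.Sub(s).String(), nil
}

func main() {
	// Hypothetical values for status.startTimestamp and status.completionTimestamp.
	d, err := elapsed("2025-01-01T00:00:00Z", "2025-01-02T00:07:10Z")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(d)
}
```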

Reference:

https://stackoverflow.com/questions/70557581/kubernetes-crd-show-durations-in-additionalprintercolumns

Notes/Constraints/Caveats (Optional)

As of this writing, when defining additionalPrinterColumns using CEL expressions, access to fields under metadata is limited. Only metadata.name and metadata.generateName are accessible, as per the current design decision.

This makes CEL-based columns less flexible than those defined using JSONPath, because columns defined using JSONPath can access additional metadata fields such as creationTimestamp, labels, and ownerReferences.

For example, the following jsonPath-based columns (defined in the Cluster API project) are valid:

additionalPrinterColumns:
- name: Age
  type: date
  description: Time since creation
  jsonPath: .metadata.creationTimestamp
- name: Cluster
  type: string
  description: Associated Cluster
  jsonPath: .metadata.labels['cluster\.x-k8s\.io/cluster-name']
- name: Machine
  type: string
  description: Owning Machine
  jsonPath: .metadata.ownerReferences[?(@.kind=="Machine")].name

But when attempting to define the same columns using CEL expressions, it fails because any field under metadata (except metadata.name and metadata.generateName) is dropped during the conversion of the CRD structural schema to a CEL declaration:

additionalPrinterColumns:
- name: Age
  type: date
  description: Time since creation
  expression: self.metadata.creationTimestamp

Error:

The CustomResourceDefinition "jobs.example.com" is invalid: spec.additionalPrinterColumns[1]: Internal error: CEL compilation failed for self.metadata.creationTimestamp rules: compilation failed: ERROR: <input>:1:14: undefined field 'creationTimestamp'
 | self.metadata.creationTimestamp
 | .............^

There’s a similar ongoing discussion here – https://github.com/kubernetes/kubernetes/issues/122163

Risks and Mitigations

Complex CEL expressions may impact compilation performance

With CEL-based additionalPrinterColumns, users may write highly complex expressions to fulfill specific use cases. These expressions can lead to longer compilation times or excessive compute cost during CRD creation.

Mitigation:

A finite CEL cost model is enforced, as is standard with other CEL-enabled features in Kubernetes. This model limits the computational cost during expression compilation. If a CEL expression exceeds the allowed cost, the compilation will time out and fail gracefully.

For expressions that are within the cost limits but still slow due to complexity, the responsibility lies with the CRD author to keep or drop them.

Runtime evaluation errors despite successful compilation

CEL expressions are compiled during CRD creation but evaluated later during API usage, such as kubectl get <resource>. As a result, runtime data inconsistencies can cause evaluation errors even if compilation was successful.

For example, if a CEL expression references fields not present in a given Custom Resource instance—due to missing data, schema changes, or optional fields—the evaluation may fail.

Mitigation:

This behavior is aligned with how jsonPath based additionalPrinterColumns currently function. If a jsonPath evaluation fails, an empty value is printed in the column.

The same strategy will be applied for CEL: evaluation failures will result in an empty column, and the underlying error will be logged. This ensures user experience remains consistent and resilient to partial data issues.

Example:

openAPIV3Schema:
  type: object
  properties:
    status:
      type: object
      properties:
        startTimestamp:
          type: string
          format: date-time # Incorrect format
        completionTimestamp:
          type: string
          format: date-time # Incorrect format
        duration:
          type: string
additionalPrinterColumns:
- name: Duration
  type: string
  description: Duration between start and completion
  expression: 'timestamp(self.status.completionTimestamp) - timestamp(self.status.startTimestamp)'

In the above example, the format of the timestamp fields is incorrect, but the CEL expression is valid. This results in the CEL program returning an error during evaluation at runtime. This happens because the format we’ve defined, date-time, is incorrect; the correct format defined in supportedFormats is datetime. The above example would give us the following output:

NAME         DURATION
sample-job   no such overload: timestamp(string)

Design Details

Today, CRD additionalPrinterColumns support only JSONPath. This is implemented by a TableConvertor that converts objects to metav1.Table. When a CRD is created, a new TableConvertor object is created along with it. The TableConvertor is what processes the output for additionalPrinterColumns when we query for custom resources. The JSONPath is validated during CRD validation and is parsed when the TableConvertor is created. We propose extending the CRD API as well as the TableConvertor logic to handle CEL expressions alongside the existing JSONPath logic, without changing any of the current behaviour.

API Changes

We extend the CustomResourceColumnDefinition type by adding an Expression field which takes CEL expressions as a string.

type CustomResourceColumnDefinition struct {
  ...
  JSONPath   string

+ Expression string
}

type CustomResourceColumnDefinition struct {
  // ...
  JSONPath string `json:"jsonPath,omitempty" protobuf:"bytes,6,opt,name=jsonPath"`
	
+ Expression string `json:"expression,omitempty" protobuf:"bytes,7,opt,name=expression"`
}

Proposed flow of CEL additionalPrinterColumns

This CEL expression would then be compiled twice:

First, during CRD validation.
Then again, during TableConvertor creation.

The compiled CEL program would later be evaluated at runtime when printing columns during resource listing.

CEL Compilation

To handle the CEL compilation, we add a new CompileColumn() function to the apiextensions-apiserver/pkg/apiserver/schema/cel package which would be called during both CRD Validation and from inside the TableConvertor.New() function.

func CompileColumn(expr string, s *schema.Structural, declType *apiservercel.DeclType, perCallLimit uint64, baseEnvSet *environment.EnvSet, envLoader EnvLoader) ColumnCompilationResult {
  ...
}

Validation

We expect each entry in the additionalPrinterColumns of a CustomResourceDefinition to have either a jsonPath or an expression field, but not both. Currently, additionalPrinterColumns are validated by the ValidateCustomResourceColumnDefinition function. Once we add the new expression field, we compile the CEL expression here using the cel.CompileColumn() function. If the CEL compilation fails at validation, the CRD is not applied.

func ValidateCustomResourceColumnDefinition(col *apiextensions.CustomResourceColumnDefinition, fldPath *field.Path) field.ErrorList {
  // ...
  // At least one of jsonPath or expression must be set.
  if len(col.JSONPath) == 0 && len(col.Expression) == 0 {
    allErrs = append(allErrs, field.Required(fldPath.Child("JSONPath or expression"), "either JSONPath or CEL expression must be specified"))
  }

  if len(col.JSONPath) != 0 {
    if errs := validateSimpleJSONPath(col.JSONPath, fldPath.Child("jsonPath")); len(errs) > 0 {
      allErrs = append(allErrs, errs...)
    }
  }

+ if len(col.Expression) != 0 {
+   // Handle CEL context creation and error handling
+   var celContext *CELSchemaContext
+   celContext = PrinterColumnCELContext(schema)
+   ...
+
+   // CEL compilation during the validation stage
+   compilationResult := cel.CompileColumn(col.Expression, structuralSchema, model.SchemaDeclType(s, true), celconfig.PerCallLimit, environment.MustBaseEnvSet(environment.DefaultCompatibilityVersion(), true), cel.StoredExpressionsEnvLoader())
+   // Based on the CEL compilation result validate the additionalPrinterColumn
+   if compilationResult.Error != nil {
+     allErrs = append(allErrs, field.InternalError(fldPath, fmt.Errorf("CEL compilation failed for %s rules: %s", col.Expression, compilationResult.Error)))
+   }
+
+   ...
+ }

  return allErrs
}
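The mutual-exclusivity rule stated in the Goals can be sketched as a standalone check. This is a simplified illustration using only the standard library: the column type and validateColumnSource function are hypothetical and do not use the real field.ErrorList machinery.

```go
package main

import (
	"errors"
	"fmt"
)

// column is a hypothetical, trimmed-down version of
// CustomResourceColumnDefinition with only the fields needed here.
type column struct {
	Name       string
	JSONPath   string
	Expression string
}

// validateColumnSource enforces that exactly one of JSONPath or
// Expression is set on a column.
func validateColumnSource(c column) error {
	hasPath := c.JSONPath != ""
	hasExpr := c.Expression != ""
	switch {
	case hasPath && hasExpr:
		return errors.New("jsonPath and expression are mutually exclusive")
	case !hasPath && !hasExpr:
		return errors.New("either jsonPath or expression must be specified")
	}
	return nil
}

func main() {
	cols := []column{
		{Name: "Age", JSONPath: ".metadata.creationTimestamp"},
		{Name: "Status", Expression: `self.status.replicas == self.spec.replicas ? "READY" : "WAITING"`},
		{Name: "Both", JSONPath: ".spec.replicas", Expression: "self.spec.replicas"},
		{Name: "Neither"},
	}
	for _, c := range cols {
		fmt.Printf("%s: %v\n", c.Name, validateColumnSource(c))
	}
}
```

A CRD may still mix the two kinds of columns; the check applies per column, not per CRD.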

Implementation

Inside tableconvertor.go:

We have the TableConvertor.New() function, which creates the TableConvertor object for a CRD. This is done from the crdHandler when the CRD is created or updated.
Each column under additionalPrinterColumns is defined in the TableConvertor object with a columnPrinter interface. This interface has two methods, FindResults() and PrintResults(), which are used by the TableConvertor object to compute and print the additionalPrinterColumns’ values when we do a GET operation on the CRD.

Today, for JSONPath additionalPrinterColumns, we parse the JSONPath expression inside the TableConvertor.New() function like so:

func New(crdColumns []apiextensionsv1.CustomResourceColumnDefinition) (rest.TableConvertor, error) {
  ...
  path := jsonpath.New(col.Name)
  if err := path.Parse(fmt.Sprintf("{%s}", col.JSONPath)); err != nil {
    return c, fmt.Errorf("unrecognized column definition %q", col.JSONPath)
  }
  path.AllowMissingKeys(true)
  c.additionalColumns = append(c.additionalColumns, path)
}

We then call CompileColumn() in TableConvertor.New() to handle additionalPrinterColumns defined using CEL expressions:

+ func New(crdColumns []apiextensionsv1.CustomResourceColumnDefinition, s *schema.Structural) (rest.TableConvertor, error) {
  ...
+   if len(col.JSONPath) > 0 && len(col.Expression) == 0 {
      // existing jsonPath logic
+   } else if len(col.Expression) > 0 && len(col.JSONPath) == 0 {
+     compResult := CompileColumn(col.Expression, s, model.SchemaDeclType(s, true), celconfig.PerCallLimit, environment.MustBaseEnvSet(environment.DefaultCompatibilityVersion(), true), cel.StoredExpressionsEnvLoader())

+     if compResult.Error != nil {
+       return c, fmt.Errorf("CEL compilation error %q", compResult.Error)
+     }
+     c.additionalColumns = append(c.additionalColumns, compResult)
+   }
}

To make all this work, we also introduce the following:

A new struct ColumnCompilationResult:

type ColumnCompilationResult struct {
  Error          error
  MaxCost        uint64
  MaxCardinality uint64
  FieldPath      *field.Path
  Program        cel.Program
}

This struct implements the columnPrinter interface:

func (c ColumnCompilationResult) FindResults(data interface{}) ([][]reflect.Value, error) {
  ...
}

func (c ColumnCompilationResult) PrintResults(w io.Writer, results []reflect.Value) error {
  ...
}

cel.CompileColumn() returns a ColumnCompilationResult object for each additionalPrinterColumn.

With all of this we can pass the CEL program to the TableConvertor’s ConvertToTable() method, which will call FindResults and PrintResults for all additionalPrinterColumns, regardless of whether they’re defined with JSONPath or CEL expressions.

CEL vs JSONPath Performance Analysis

A big part of the discussion around our proposal concerned CEL cost limits, since this is the first time CEL is added to the read path. As part of this, we benchmarked the time it takes to parse and compile equivalent JSONPath and CEL expressions.

Note: The following benchmark analysis statistics are only indicative of the performance. The actual numbers may vary across different runs of the same test.

Refer:

Source code for the POC

Scenario 1: Benchmarking overall performance (compilation + evaluation + cost estimation, etc.)

Details

Run on Apple M3 Pro with 12 cores, 18 GB RAM, arm64

Find the raw output of the benchmark tests, as well as the source code: https://gist.github.com/sreeram-venkitesh/f4aff1ae7957a5a3b9c6c53e869b7403

The following breakdown provides an average performance analysis across CEL and JSONPath based additionalPrinterColumns:

Column definition
- CEL (BenchmarkNew_CEL): self.spec.servers.map(s, s.hosts.filter(h, h == "prod.example.com"))
- JSONPath (BenchmarkNew_JSONPath): .spec.servers[*].hosts[?(@ == "prod.example.com")]

Overall performance (compilation + evaluation)
- CEL: average iterations 3,111; average time per operation 382,914 ns/op (~383 µs per op); standard deviation ±42,087 ns (±11%)
- JSONPath: average iterations 70,542; average time per operation 17,654 ns/op (~17.7 µs per op); standard deviation ±2,846 ns (±16%)

Compilation performance
- CEL: cold start 2.340 ms; warmed 300–400 µs; 83% improvement (2.34 ms → ~400 µs). Most expensive / consistent phases: Env & Cost Estimator 160–220 µs avg; CEL Compilation 60–120 µs avg; Program Generation 50–80 µs avg
- JSONPath: cold start ~85 µs; warmed 5–8 µs; 90% improvement (85 µs → ~8 µs). Most expensive / consistent phase: JSONPath Parsing 4–85 µs (occasional spikes)

Evaluation performance
- CEL FindResults: cold 103.5 µs; warmed 13.5 µs; 81% improvement (103.5 → 13.5 µs)
- CEL PrintResults: cold 3.9 µs; warmed 1.5 µs; 70% improvement (3.9 → 1.5 µs)
- JSONPath FindResults: cold 1.4 µs; warmed 0.85 µs; 29% improvement (1.4 → 0.85 µs)
- JSONPath PrintResults: cold 0.29 µs; warmed 0.18 µs; 58% improvement (0.29 → 0.18 µs)

Scenario 2: Benchmarking evaluation (findResults()) performance.

Based on a review comment: benchmark an expensive JSONPath additionalPrinterColumns operation (just the part that finds a value using the JSONPath library).

Details

Run on a resource-constrained VM - 11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz, 4 CPU, 4GB RAM, x86_64

Find the raw output of the benchmark tests, as well as the source code: https://gist.github.com/Priyankasaggu11929/43cc9ece4d6215ee4cfe0d1523a919d6

The following breakdown provides an average performance analysis across CEL and JSONPath based additionalPrinterColumns (only for the findResults() execution durations across the benchmark test iterations, with the minimum, maximum, and average values):

Column definition
- CEL (BenchmarkNew_CEL_DeepComplex): self.spec.environments.map(e, e.clusters.map(c, c.nodes.filter(n, n.metrics.memory > 8000).map(n, n.id)))
- JSONPath (BenchmarkNew_JSONPath_DeepComplex): .spec.environments[].clusters[].nodes[?(@.metrics.memory > 8000)].id

Evaluation performance (FindResults)
- CEL: min 30.91 µs; max 1870.87 µs; average 58.38 µs
- JSONPath: min 2.19 µs; max 1147.24 µs; average 8.40 µs

Conclusion

Considering overall performance (compilation + evaluation + cost calculation, etc.), CEL is about 20x slower than JSONPath (Scenario 1: ~383 µs vs ~17.7 µs per operation).

But since the focus of our performance analysis was the evaluation cost (refer to Scenario 2):

On average, CEL is about 7x slower than JSONPath (58.38 µs vs 8.40 µs).
In the worst-case scenario (most expensive run), CEL is about 1.6x slower than JSONPath (1870.87 µs vs 1147.24 µs).
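For reference, the slowdown factors can be recomputed directly from the figures reported in the two scenarios. The short Go program below only does that division; the slowdown helper is a name introduced for this example.

```go
package main

import "fmt"

// slowdown returns how many times slower the CEL measurement is
// compared to the JSONPath measurement (both in the same unit).
func slowdown(cel, jsonPath float64) float64 {
	return cel / jsonPath
}

func main() {
	// Scenario 1: average time per operation in ns/op.
	fmt.Printf("overall:                %.2fx\n", slowdown(382914, 17654))
	// Scenario 2: FindResults average in µs.
	fmt.Printf("FindResults average:    %.2fx\n", slowdown(58.38, 8.40))
	// Scenario 2: FindResults most expensive run in µs.
	fmt.Printf("FindResults worst case: %.2fx\n", slowdown(1870.87, 1147.24))
}
```

The three ratios come out at roughly 21.7, 6.95, and 1.63 respectively.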

Test Plan

[x] I/we understand the owners of the involved components may require updates to existing tests to make this code solid enough prior to committing the changes necessary to implement this enhancement.

Prerequisite testing updates

Unit tests

Alpha:

staging/src/k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/validation/validation_test.go

Test that validation passes when we create an additionalPrinterColumn with an expression field with valid CEL expression
Test that validation fails when we create an additionalPrinterColumn with an expression field with an invalid CEL expression
Test that existing behaviour of jsonPath is not altered when creating CRDs with only jsonPath additionalPrinterColumns
Test that validation fails when we create an additionalPrinterColumn with both jsonPath and expression fields
Test that validation passes when we create multiple additionalPrinterColumns where some columns use jsonPath and others use expression
Test that validation fails when we try to create an additionalPrinterColumn with expression field when the feature gate is turned off

staging/src/k8s.io/apiextensions-apiserver/pkg/registry/customresource/tableconvertor/tableconvertor_test.go

Verify that CEL compilation errors are caught at the CRD validation phase
Verify that CEL compilation at the TableConvertor creation stage succeeds
Verify that TableConvertor is getting created for the CRD with both jsonPath and expression columns

Integration tests

test/integration/apiserver/crd_additional_printer_columns_test.go

Verify that CRDs are getting created with additionalPrinterColumns with both jsonPath and expression fields
Verify that CEL compilation errors are caught at the CRD validation stage
Verify that existing behaviour is not altered when creating CRDs with only jsonPath additionalPrinterColumns

e2e tests

We will test all cases in integration tests and unit tests. If needed, we can add e2e tests before beta graduation. We are planning to extend the existing e2e tests for CRDs.

Graduation Criteria

Alpha

Feature implemented behind a feature flag
Initial benchmarks to compare performance of JSONPath with CEL columns and set an appropriate CEL cost (equivalent or at most 2x to the JSONPath cost - as discussed in the June 11, 2025 SIG API Machinery meeting)
Unit tests and integration tests completed and enabled

Beta

Gather feedback from developers and surveys
Add e2e tests
Add appropriate metrics for additionalPrinterColumns usage and CEL cost usage
More benchmarking to compare JSONPath and CEL execution and modify CEL cost if needed

GA

N examples of real-world usage
Upgrade/downgrade e2e tests
Scalability tests
Allowing time for feedback

Upgrade / Downgrade Strategy

No changes are required for a cluster to make an upgrade and maintain existing behavior.

If the cluster is downgraded to a version which doesn’t support CEL for additionalPrinterColumns:

Existing additionalPrinterColumns with CEL expressions would be ignored and those columns will not be printed. Any create or update operation to CRDs would fail if we try to use CEL for additionalPrinterColumns.
Existing additionalPrinterColumns with JSONPath would still work as expected.

Once the cluster is upgraded back to a version supporting CEL for additionalPrinterColumns, users should be able to create CRDs with additionalPrinterColumns using CEL again.
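The downgrade behaviour described above amounts to skipping expression-based columns when the server does not support them, while leaving JSONPath columns untouched. A minimal sketch under that assumption follows; the column type, the printableColumns function, and the celSupported flag are hypothetical and do not reflect the actual apiserver code path.

```go
package main

import "fmt"

// column is a hypothetical, trimmed-down printer column definition.
type column struct {
	Name       string
	JSONPath   string
	Expression string
}

// printableColumns returns the columns that would be rendered.
// Expression-based columns are dropped when CEL columns are not supported.
func printableColumns(cols []column, celSupported bool) []column {
	out := make([]column, 0, len(cols))
	for _, c := range cols {
		if c.Expression != "" && !celSupported {
			continue // ignored: this server cannot evaluate CEL columns
		}
		out = append(out, c)
	}
	return out
}

func main() {
	cols := []column{
		{Name: "Age", JSONPath: ".metadata.creationTimestamp"},
		{Name: "Status", Expression: "self.status.phase"},
	}
	for _, supported := range []bool{true, false} {
		names := []string{}
		for _, c := range printableColumns(cols, supported) {
			names = append(names, c.Name)
		}
		fmt.Printf("celSupported=%v -> %v\n", supported, names)
	}
}
```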

Version Skew Strategy

This feature is implemented in the kube-apiserver component; skew with other Kubernetes components does not require coordinated behavior.

Clients should ensure the kube-apiserver is fully rolled out before using the feature.

Production Readiness Review Questionnaire

Feature Enablement and Rollback

How can this feature be enabled / disabled in a live cluster?

Feature gate (also fill in values in kep.yaml)
- Feature gate name: CRDAdditionalPrinterColumnCEL
- Components depending on the feature gate: apiextensions-apiserver, kube-apiserver
Other
- Describe the mechanism:
- Will enabling / disabling the feature require downtime of the control plane?
- Will enabling / disabling the feature require downtime or reprovisioning of a node?

Does enabling the feature change any default behavior?

No default behaviour will be changed since we still support additionalPrinterColumns with JSONPath.

Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?

Yes, if the feature is disabled after being used, the existing additionalPrinterColumns with JSONPath would work as expected. Existing resources with CEL expressions in their additionalPrinterColumn definition would be ignored and those columns will not be printed if the feature is disabled.

What happens if we reenable the feature if it was previously rolled back?

CRDs that previously failed validation might now succeed if the CEL expression is valid. Existing CRDs’ additionalPrinterColumns defined with CEL expressions would start working again after the feature has been reenabled.

Are there any tests for feature enablement/disablement?

We will have unit and integration tests to make sure that the feature enablement and disablement works as intended.

Rollout, Upgrade and Rollback Planning

How can a rollout or rollback fail? Can it impact already running workloads?

This feature will not impact rollouts or already-running workloads.

What specific metrics should inform a rollback?

If enabling this feature increases the latency of kubectl get <resources> (or similar) requests, and in turn the load on the apiserver, this will show up in apiserver metrics such as apiserver_request_duration_seconds. If there are significant spikes in these metrics during such GET operations, you can try disabling the feature or rolling back the cluster version to see if performance improves.

Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?

We’re planning to test the upgrade -> downgrade -> upgrade path before graduating to beta.

Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?

No.

Monitoring Requirements

How can an operator determine if the feature is in use by workloads?

The cluster admin can check if the CRDAdditionalPrinterColumnCEL feature gate is turned on. If yes, the admin can further check if any CRD has any columns defined under additionalPrinterColumns section which are using the new expression field.

How can someone using this feature know that it is working for their instance?

Events
- Event Reason:
API .status
- Condition name:
- Other field:
Other (treat as last resort)
- Details: Users will be able to define additionalPrinterColumns for their custom resources with expression instead of jsonPath.

What are the reasonable SLOs (Service Level Objectives) for the enhancement?

What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?

Metrics
- Metric name:
- [Optional] Aggregation method:
- Components exposing the metric:
Other (treat as last resort)
- Details:

Are there any missing metrics that would be useful to have to improve observability of this feature?

No.

Dependencies

Does this feature depend on any specific services running in the cluster?

No.

Scalability

Will enabling / using this feature result in any new API calls?

No.

Will enabling / using this feature result in introducing new API types?

No.

Will enabling / using this feature result in any new calls to the cloud provider?

No.

Will enabling / using this feature result in increasing size or count of the existing API objects?

No.

Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?

Performance of CRD reads might be impacted. Benchmarking needs to be done to know the exact difference between using JSONPath and CEL.

Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, …) in any components?

Since the CEL expressions are compiled and evaluated in the kube-apiserver, depending on the complexity of the CRDs and the expressions defined, we may see a non-negligible increase of CPU usage. We are planning to benchmark this before beta graduation.

Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?

No.

Troubleshooting

How does this feature react if the API server and/or etcd is unavailable?

The same way any write to the API server would.

What are other known failure modes?

None.

What steps should be taken if SLOs are not being met to determine the problem?

Disable the feature by turning off the CRDAdditionalPrinterColumnCEL feature gate.

Implementation History

Drawbacks

Alternatives

An alternative to the CEL approach proposed by this KEP would be to extend JSONPath to support arrays and other complex queries. There have been a couple of attempts to implement this previously.

These attempts were unsuccessful because they required breaking changes to JSONPath. Now that CEL is an option, we can stop trying to extend JSONPath and adopt CEL instead, since it covers far more ground than we could achieve by extending JSONPath.
