KEP-3515: Kubectl Explain OpenAPIv3
KEP-3515: OpenAPI v3 for kubectl explain
- Release Signoff Checklist
- Summary
- Motivation
- Proposal
- Design Details
- Production Readiness Review Questionnaire
- Implementation History
- Drawbacks
- Alternatives
- Future Work
Release Signoff Checklist
Items marked with (R) are required prior to targeting to a milestone / release.
- (R) Enhancement issue in release milestone, which links to KEP dir in kubernetes/enhancements (not the initial KEP PR)
- (R) KEP approvers have approved the KEP status as implementable
- (R) Design details are appropriately documented
- (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
- e2e Tests for all Beta API Operations (endpoints)
- (R) Ensure GA e2e tests meet requirements for Conformance Tests
- (R) Minimum Two Week Window for GA e2e tests to prove flake free
- (R) Graduation criteria is in place
- (R) all GA Endpoints must be hit by Conformance Tests
- (R) Production readiness review completed
- (R) Production readiness review approved
- “Implementation History” section is up-to-date for milestone
- User-facing documentation has been created in kubernetes/website, for publication to kubernetes.io
- Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
Summary
This KEP proposes an enhancement to kubectl explain:
- Switch data source from OpenAPI v2 to OpenAPI v3
- Replace the hand-written kubectl explain printer with a go/template implementation.
Motivation
OpenAPI v3 is a richer API description than OpenAPI v2
OpenAPI v3 support in Kubernetes has been beta since version 1.24. OpenAPI v3 is a richer representation of the Kubernetes API, and users have been asking for visibility into things like:
- nullable
- default
- validation fields like oneOf, anyOf, etc.
The ability to show any one of these additional data points would by itself be a strong reason to switch to OpenAPI v3.
CRD schemas expressed as OpenAPI v2 are lossy
Today CRDs specify their schemas in OpenAPI v3 format. To serve the /openapi/v2
document used today by kubectl, there is an expensive conversion from the v3 format
down to the v2 format.
This process is very lossy, so kubectl explain gives a poor experience when used
against CRDs that make use of v3 features: information is inaccurate, or fields are removed altogether.
This transformation causes bugs. For example, when attempting to explain a field
that is nullable, kubectl instead shows nothing, because the lossy conversion
wipes nullable fields.
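To make the loss concrete, here is a minimal Go sketch. It is not the apiserver's conversion code: `downconvertToV2` and the keyword list are hypothetical simplifications that only drop a few keywords OpenAPI v2 has no way to express, which is enough to show why a nullable field cannot be reported from v2 data.

```go
package main

import "fmt"

// v3OnlyKeywords lists schema keywords that exist in OpenAPI v3 but have no
// OpenAPI v2 equivalent. The list is illustrative, not exhaustive.
var v3OnlyKeywords = []string{"nullable", "oneOf", "anyOf"}

// downconvertToV2 is a hypothetical, simplified stand-in for a v3-to-v2
// conversion. It returns a copy of the schema with the v3-only keywords
// dropped, recursing into "properties".
func downconvertToV2(schema map[string]any) map[string]any {
	out := make(map[string]any, len(schema))
	for k, v := range schema {
		out[k] = v
	}
	for _, k := range v3OnlyKeywords {
		delete(out, k)
	}
	if props, ok := schema["properties"].(map[string]any); ok {
		converted := make(map[string]any, len(props))
		for name, p := range props {
			if sub, ok := p.(map[string]any); ok {
				converted[name] = downconvertToV2(sub)
			} else {
				converted[name] = p
			}
		}
		out["properties"] = converted
	}
	return out
}

func main() {
	v3 := map[string]any{
		"type": "object",
		"properties": map[string]any{
			"replicas": map[string]any{"type": "integer", "nullable": true},
		},
	}
	v2 := downconvertToV2(v3)
	field := v2["properties"].(map[string]any)["replicas"].(map[string]any)
	_, hasNullable := field["nullable"]
	fmt.Println("nullable survives conversion:", hasNullable) // prints false
}
```

Once the keyword is gone from the served document, no client-side printer can recover it, which is why the KEP reads v3 directly instead of improving the conversion.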
Goals
- Provide the new richer type information specified by OpenAPI v3 within kubectl explain
- Have a more maintainable text/template based approach to printing
- Fall back to the old explain implementation if the cluster does not expose OpenAPI v3 data.
- Provide multiple new output formats for kubectl explain:
- human-readable plaintext
- maybe others
- (Optional?) Allow users to specify their own templates for use with kubectl explain (there may be interesting use cases for this)
- Improve discoverability of API Resources and endpoints, and provide a platform for richer information to be included in the future.
Non-Goals
- “Fix” openapi v3 to openapi v2 conversion
This is a non-goal for two reasons:
- These formats are not compatible, and there WILL be data loss and inaccuracy
- This negates the benefits of using OpenAPI v3 for the richer type information
- Provide general-purpose OpenAPI visualization.
Proposal
Basic Usage
The following user experience should be possible with kubectl explain:
kubectl explain pods.spec
Output should be familiar to users of today’s kubectl explain, except new
information from the OpenAPI v3 spec is now populated.
Note: During development the feature will be gated by an experimental flag. The commands shown here elide the experimental flag for clarity.
Built-in Template Options
Plaintext
kubectl explain pods
or
kubectl explain pods --output plaintext
The plaintext output format is the default and should be crafted to be as close
as possible to the existing explain output in use before this KEP.
OpenAPIV3 (raw json)
kubectl explain pods --output openapiv3
To get raw OpenAPI v3 data for a certain resource today involves:
1.) setting up kubectl proxy
2.) fetching the correct path at /openapi/v3/<group>/<version>
3.) filtering out unwanted results
This command is useful not only for its convenience: other visualizations may also be built upon the raw output if we opt not to support a first-class custom template solution in the future.
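As a small illustration of step 2, the sketch below builds the per-group-version path. The helper name `openAPIV3Path` is hypothetical, and the layout it encodes (the core group under `api/<version>`, named groups under `apis/<group>/<version>`) is an assumption about how the apiserver publishes these documents; the index served at /openapi/v3 remains the authoritative source for the paths.

```go
package main

import (
	"fmt"
	"path"
)

// openAPIV3Path returns the path of the OpenAPI v3 document for one
// group-version. An empty group denotes the core ("legacy") API group.
// The layout is an assumption; consult the /openapi/v3 index in practice.
func openAPIV3Path(group, version string) string {
	if group == "" {
		return path.Join("/openapi/v3/api", version)
	}
	return path.Join("/openapi/v3/apis", group, version)
}

func main() {
	fmt.Println(openAPIV3Path("", "v1"))     // /openapi/v3/api/v1
	fmt.Println(openAPIV3Path("apps", "v1")) // /openapi/v3/apis/apps/v1
}
```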
Risks and Mitigations
OpenAPI V3 Not Available
Risk
OpenAPI v3 data is not available in the current cluster.
Mitigation
If the user does not provide an --output argument
In alpha in particular, if --output is not specified, the old explain behavior
using openapi v2 data will be used.
In beta, kubectl will test whether the server publishes /openapi/v3. If it does, it will
proceed with the new renderer. If there is no endpoint published, kubectl will fall
back to the old v2 implementation.
After GA, --output plaintext will be assumed and behave as below.
If the user does provide an --output argument
If a user specifies an --output argument and the server returns a 404 when kubectl attempts to
fetch the correct openapi version for the template, a new error message should
be returned to the effect of: server missing openapi data for version: %v.%v.%v.
Internal templates should strive to support the latest OpenAPI version enabled by default by versions of kubernetes within their skew. With that policy, templates will always render with the latest spec-version of the data, if it is available.
Other network errors should be handled using normal kubectl error handling.
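The mitigation rules above amount to a small decision table. The following Go sketch encodes them under stated assumptions: the names `stage`, `renderer`, and `chooseRenderer` are hypothetical, the error text is shortened (the KEP's message also carries the version), and the explicit `--output plaintext-openapiv2` opt-out is not modeled.

```go
package main

import (
	"errors"
	"fmt"
)

type stage int

const (
	alpha stage = iota
	beta
	ga
)

type renderer string

const (
	legacyV2   renderer = "legacy-openapiv2"
	templateV3 renderer = "template-openapiv3"
)

// errMissingOpenAPI stands in for the "server missing openapi data" error.
var errMissingOpenAPI = errors.New("server missing openapi data")

// chooseRenderer picks an implementation from the graduation stage, whether
// the user passed --output, and whether the server publishes /openapi/v3.
func chooseRenderer(s stage, outputSet, v3Available bool) (renderer, error) {
	switch {
	case !outputSet && s == alpha:
		// Alpha: without --output the old v2 behavior is always used.
		return legacyV2, nil
	case !outputSet && s == beta && !v3Available:
		// Beta: silent fallback when the endpoint is not published.
		return legacyV2, nil
	case !v3Available:
		// Explicit --output, or GA where plaintext is assumed: report it.
		return "", errMissingOpenAPI
	default:
		return templateV3, nil
	}
}

func main() {
	cases := []struct {
		s             stage
		outputSet, v3 bool
	}{
		{alpha, false, true},
		{beta, false, false},
		{beta, true, false},
		{ga, false, true},
	}
	for _, tc := range cases {
		r, err := chooseRenderer(tc.s, tc.outputSet, tc.v3)
		fmt.Printf("stage=%d outputSet=%t v3=%t -> %q err=%v\n", tc.s, tc.outputSet, tc.v3, r, err)
	}
}
```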
Design Details
Current High-level Approach
1.) User types kubectl explain pods
2.) kubectl resolves ‘pods’ to GVR core v1 pods using cluster discovery information
3.) kubectl resolves GVR to its GVK using restmapper
4.) kubectl fetches /openapi/v2 as protobuf
5.) kubectl parses the protobuf into gnostic_v2.Document
6.) kubectl converts gnostic_v2.Document into proto.Models
7.) kubectl searches the document’s Definitions for a schema with the extension x-kubernetes-group-version-kind matching the interested GVK
8.) If a field path was used, kubectl traverses the definition’s fields to the subschema specified by the user’s path.
9.) kubectl renders the definition using its hardcoded printer
10.) If --recursive was used, repeat step 9 for the transitive closure of object-typed fields of the top-level object. Concat the results together.
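The definition lookup by GVK extension can be sketched as follows. This is an illustration only: the real implementation works on proto.Models, whereas this sketch decodes a trimmed, hand-written JSON document into plain maps, and `findDefinition` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type gvk struct{ Group, Version, Kind string }

const gvkExtension = "x-kubernetes-group-version-kind"

// findDefinition scans the definitions for a schema whose GVK extension
// lists the wanted group, version, and kind.
func findDefinition(definitions map[string]any, want gvk) (string, map[string]any, bool) {
	for name, raw := range definitions {
		schema, ok := raw.(map[string]any)
		if !ok {
			continue
		}
		entries, _ := schema[gvkExtension].([]any)
		for _, e := range entries {
			m, ok := e.(map[string]any)
			if !ok {
				continue
			}
			if m["group"] == want.Group && m["version"] == want.Version && m["kind"] == want.Kind {
				return name, schema, true
			}
		}
	}
	return "", nil, false
}

// doc is a tiny hand-written excerpt in the shape of an OpenAPI v2 document.
const doc = `{"definitions": {
  "io.k8s.api.core.v1.Pod": {
    "description": "Pod is a collection of containers that can run on a host.",
    "x-kubernetes-group-version-kind": [{"group": "", "kind": "Pod", "version": "v1"}]
  },
  "io.k8s.api.apps.v1.Deployment": {
    "x-kubernetes-group-version-kind": [{"group": "apps", "kind": "Deployment", "version": "v1"}]
  }
}}`

func main() {
	var parsed struct {
		Definitions map[string]any `json:"definitions"`
	}
	if err := json.Unmarshal([]byte(doc), &parsed); err != nil {
		panic(err)
	}
	name, _, ok := findDefinition(parsed.Definitions, gvk{Group: "", Version: "v1", Kind: "Pod"})
	fmt.Println(name, ok) // io.k8s.api.core.v1.Pod true
}
```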
Proposed High-level Approach
1.) User types kubectl explain pods
2.) kubectl resolves ‘pods’ to GVR core v1 pods using cluster discovery information
3.) kubectl attempts to fetch /openapi/v3, which indexes where to find specs for each GV
4.) If failure and fallback to v2 is allowed, falls back to Step #3 of the “Current High-level Approach”.
5.) Otherwise, kubectl fetches OpenAPIV3 path for GV: /openapi/v3/<group>/<version>
6.) kubectl parses the result as map[string]any
7.) kubectl locates the schema of the return type for the Path /apis/<group>/<version>/<resource>
8.) If a field path was used, kubectl traverses the definition’s fields to the subschema specified by the user’s path.
9.) kubectl renders the type using its built-in template for human-readable plaintext
10.) If --recursive was used, repeat step 9 for the transitive closure of object-typed fields of the top-level object. Concat the results together.
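The field-path traversal over a document parsed as map[string]any might look like the sketch below. It is a simplification under stated assumptions: `lookupField` is a hypothetical helper, the schema is hand-written, and `$ref` and `allOf` indirections, which real documents use heavily, are not resolved.

```go
package main

import (
	"fmt"
	"strings"
)

// lookupField walks a decoded OpenAPI v3 schema along fieldPath and returns
// the subschema it names. Array types are stepped through to their element
// schema before each lookup. References ($ref) are NOT resolved here.
func lookupField(schema map[string]any, fieldPath []string) (map[string]any, error) {
	cur := schema
	for _, name := range fieldPath {
		for {
			items, ok := cur["items"].(map[string]any)
			if !ok {
				break
			}
			cur = items
		}
		props, _ := cur["properties"].(map[string]any)
		next, ok := props[name].(map[string]any)
		if !ok {
			return nil, fmt.Errorf("field %q does not exist", name)
		}
		cur = next
	}
	return cur, nil
}

func main() {
	pod := map[string]any{
		"type": "object",
		"properties": map[string]any{
			"spec": map[string]any{
				"type": "object",
				"properties": map[string]any{
					"containers": map[string]any{
						"type": "array",
						"items": map[string]any{
							"type": "object",
							"properties": map[string]any{
								"image": map[string]any{
									"type":        "string",
									"description": "Container image name.",
								},
							},
						},
					},
				},
			},
		},
	}
	// Equivalent to the field part of "pods.spec.containers.image".
	field, err := lookupField(pod, strings.Split("spec.containers.image", "."))
	if err != nil {
		panic(err)
	}
	fmt.Println(field["type"], "-", field["description"])
}
```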
Template rendering
Go’s text/template will be used due to its familiarity, stability, and virtue of being in stdlib.
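A minimal sketch of what a text/template based printer could look like follows. The template text, the `fieldDoc` type, and the output layout are all assumptions made for illustration; they are not the template shipped with kubectl.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// fieldDoc is a hypothetical, flattened view of one schema field.
type fieldDoc struct {
	Name, Type, Description string
	Nullable                bool
	Default                 any // nil when the schema declares no default
}

// HasDefault lets the template distinguish "no default" from a zero default.
func (f fieldDoc) HasDefault() bool { return f.Default != nil }

const plaintextTemplate = `FIELD: {{.Name}} <{{.Type}}>{{if .Nullable}} (nullable){{end}}
{{if .HasDefault}}DEFAULT: {{.Default}}
{{end}}
DESCRIPTION:
    {{.Description}}
`

var plaintext = template.Must(template.New("plaintext").Parse(plaintextTemplate))

// render executes the template for one field and returns the text.
func render(f fieldDoc) (string, error) {
	var sb strings.Builder
	if err := plaintext.Execute(&sb, f); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := render(fieldDoc{
		Name:        "replicas",
		Type:        "integer",
		Description: "Number of desired pods.",
		Nullable:    true,
		Default:     1,
	})
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

Surfacing a new piece of v3 information such as nullable then becomes a one-line template change rather than a change to a hand-written printer.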
Test Plan
[x] I/we understand the owners of the involved components may require updates to existing tests to make this code solid enough prior to committing the changes necessary to implement this enhancement.
Prerequisite testing updates
Unit tests
- k8s.io/kubectl/pkg/explain: 09/29/2022 - 75.6
Integration tests
Tests should include
- Expected Output tests
- Show correct OpenAPI v3 endpoints are hit
- Tests that show default/nullability information is being included in plaintext output
- Tests that update the backing openapi in between calls to explain
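One way to check that the correct OpenAPI v3 endpoints are hit is to point the client at a fake server that records request paths. The sketch below shows the idea with the standard library only; `hitRecorder`, `fetchSpec`, and `recordedPaths` are hypothetical names, and `fetchSpec` merely stands in for the explain code under test.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
)

// hitRecorder is a fake apiserver handler that remembers every requested path.
type hitRecorder struct {
	mu    sync.Mutex
	paths []string
}

func (h *hitRecorder) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	h.mu.Lock()
	h.paths = append(h.paths, r.URL.Path)
	h.mu.Unlock()
	if !strings.HasPrefix(r.URL.Path, "/openapi/v3") {
		http.NotFound(w, r)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	fmt.Fprint(w, "{}")
}

// Paths returns a copy of the recorded request paths in arrival order.
func (h *hitRecorder) Paths() []string {
	h.mu.Lock()
	defer h.mu.Unlock()
	return append([]string(nil), h.paths...)
}

// fetchSpec is a stand-in for the code under test: it requests the index
// and then the document for one group-version.
func fetchSpec(client *http.Client, baseURL, gvPath string) error {
	for _, p := range []string{"/openapi/v3", "/openapi/v3/" + gvPath} {
		resp, err := client.Get(baseURL + p)
		if err != nil {
			return err
		}
		resp.Body.Close()
		if resp.StatusCode != http.StatusOK {
			return fmt.Errorf("GET %s: unexpected status %d", p, resp.StatusCode)
		}
	}
	return nil
}

// recordedPaths runs fetchSpec against a throwaway server and reports
// which paths were requested.
func recordedPaths(gvPath string) ([]string, error) {
	rec := &hitRecorder{}
	srv := httptest.NewServer(rec)
	defer srv.Close()
	err := fetchSpec(srv.Client(), srv.URL, gvPath)
	return rec.Paths(), err
}

func main() {
	paths, err := recordedPaths("apis/apps/v1")
	if err != nil {
		panic(err)
	}
	fmt.Println(paths) // [/openapi/v3 /openapi/v3/apis/apps/v1]
}
```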
e2e tests
Existing e2e tests should be adapted for the new system. A new e2e test should show that every definition in the OpenAPI document can be retrieved via explain.
Graduation Criteria
Defined using feature gate
Alpha 1
- Feature implemented behind a command line flag --output and environment variable
- Existing explain tests are working or adapted for new implementation
- Plaintext output roughly matches explain output
- OpenAPIV3 (raw json) output implemented
Beta
OpenAPI V3 is enabled by default on at least one version within kubectl’s support window. As of Kubernetes 1.24, OpenAPIV3 entered beta and became enabled by default, therefore meeting this requirement. In kubectl for release 1.25, all k8s versions within the support window will be able to have OpenAPIV3 enabled. However, the fallback is kept around since it may not always be enabled.
- --output plaintext is on-by-default and environment variable is removed/on by default
- --output plaintext-openapiv2 added as a name for the old explain implementation, so the feature may be positively disabled.
GA
- OpenAPIV3 is GA and has been since at least the minimum supported apiserver version by kubectl.
- OpenAPIV3 should be stable for all k8s versions within skew.
- Old kubectl explain implementation is removed, as is support for OpenAPIV2-backed kubectl explain
Upgrade / Downgrade Strategy
N/A
Version Skew Strategy
This feature only requires that the target cluster has enabled the OpenAPIV3 feature.
OpenAPIV3 is Beta as of Kubernetes 1.24. Thus every version of Kubernetes within skew should be reasonably expected to have the feature on, unless it has been explicitly disabled.
This feature should not be on-by-default without an automatic fallback until OpenAPIV3 is GA.
Users of the --output plaintext flag who attempt to use it against a cluster for which
OpenAPI v3 is not enabled will be shown an error informing them of missing openapi
version upon 404.
In Beta, if no output is specified, OpenAPIV3 will be tried first, with a fallback to V2 if it is not available. In GA, the fallback will be removed (since all clusters in skew should publish the V3 endpoint by then).
Built-in templates supported by kubectl should aim to support at least one OpenAPI
version which is GA for an apiserver version within the support window.
kubectl will support trying to fetch each of these versions, so one is guaranteed
to be able to render.
Production Readiness Review Questionnaire
Feature Enablement and Rollback
How can this feature be enabled / disabled in a live cluster?
- Feature gate (also fill in values in kep.yaml)
- Feature gate name:
- Components depending on the feature gate:
- Other
- Describe the mechanism: disablement via --output plaintext-openapiv2 CLI argument for explain subcommand. Beta may also be disabled with KUBECTL_EXPLAIN_OPENAPIV3=false environment variable.
- Will enabling / disabling the feature require downtime of the control plane? No
- Will enabling / disabling the feature require downtime or reprovisioning of a node? No
Does enabling the feature change any default behavior?
Enabling the feature changes the data source of kubectl explain to use openapiv3.
Ideally the output should be familiar to users, who may be delighted to see new
information populated.
Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?
Yes, providing --output plaintext-openapiv2 will disable the feature.
Alternatively, during the Alpha and Beta phases, the
environment variable KUBECTL_EXPLAIN_OPENAPIV3=false may be used to disable the
feature without the backwards-incompatible argument.
This feature has no persistent effect on data that is viewed. It is just a viewer of cluster data.
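The environment-variable switch could be read along these lines. This is a sketch under assumptions: `useOpenAPIV3` is a hypothetical helper, and treating an unset or unparsable value as "use the default for the current stage" is a choice made here, not behavior stated by the KEP.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

const explainV3EnvVar = "KUBECTL_EXPLAIN_OPENAPIV3"

// useOpenAPIV3 reports whether the OpenAPI v3 implementation should be used.
// lookup has the signature of os.LookupEnv so that tests can stub it.
// defaultOn is the stage default (off in alpha, on in beta).
func useOpenAPIV3(lookup func(string) (string, bool), defaultOn bool) bool {
	raw, ok := lookup(explainV3EnvVar)
	if !ok {
		return defaultOn
	}
	enabled, err := strconv.ParseBool(raw)
	if err != nil {
		// Assumption: unparsable values fall back to the stage default.
		return defaultOn
	}
	return enabled
}

func main() {
	fmt.Println("use OpenAPI v3:", useOpenAPIV3(os.LookupEnv, true))
}
```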
What happens if we reenable the feature if it was previously rolled back?
There is no persistence to using the feature. It is only used for viewing data. So it behaves as normal.
Are there any tests for feature enablement/disablement?
Plan to add more tests for enablement/disablement for beta. PR started here with tests that toggle feature on and off and show feature works in both cases.
Rollout, Upgrade and Rollback Planning
How can a rollout or rollback fail? Can it impact already running workloads?
No, this is a user-interactive CLI feature. If users don’t like it they can
use the old functionality by providing the argument --output plaintext-openapiv2.
What specific metrics should inform a rollback?
Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
This feature has no state in the cluster. Using explain on a cluster cannot
affect other users.
Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
No.
Monitoring Requirements
How can an operator determine if the feature is in use by workloads?
There are no direct metrics for explain users, but operators can indirectly
gauge usage by watching openapi v3 metrics.
How can someone using this feature know that it is working for their instance?
kubectl explain pods --output plaintext
User should see OpenAPI v3 JSON Schema for pods type printed to console.
What are the reasonable SLOs (Service Level Objectives) for the enhancement?
N/A
What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
- Metrics
- Metric name:
- [Optional] Aggregation method:
- Components exposing the metric:
- Other (treat as last resort)
- Details: N/A for client-side user-interactive CLI feature
Are there any missing metrics that would be useful to have to improve observability of this feature?
N/A
Dependencies
None
Does this feature depend on any specific services running in the cluster?
To reap the benefits of this feature, OpenAPI v3 is required; however, OpenAPI v2 data can be used as a fallback. OpenAPI V3 is GA as of Kubernetes 1.27.
Scalability
Will enabling / using this feature result in any new API calls?
Yes, the feature replaces a single GET of /openapi/v2, which returns a large (megabytes)
openapi document for all types, with a more targeted call to /openapi/v3/<group>/<version>.
The /openapi/v3/<group>/<version> endpoint implements E-Tag caching so that if the document has
not changed the server incurs a cheap, almost negligible cost to serving the request.
The document returned by calls to /openapi/v3/... is expected to be far smaller
than the megabytes-scale openapi v2 document, since it only includes information
for a single group-version. Additionally, this new mechanism is far more cache-friendly
so the expectation is that far less data will need to be transferred.
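The cache-friendliness described above follows the ordinary HTTP conditional-request pattern. The sketch below demonstrates that pattern against a local test server; it is a generic illustration, not the apiserver or client-go implementation, and the handler, ETag value, and function names are hypothetical.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

const (
	specETag = `"v1-abc123"` // hypothetical entity tag for the current document
	specBody = `{"openapi":"3.0.0"}`
)

// serveSpec answers 304 Not Modified with no body when the client already
// holds the current version, and the full document otherwise.
func serveSpec(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("ETag", specETag)
	if r.Header.Get("If-None-Match") == specETag {
		w.WriteHeader(http.StatusNotModified)
		return
	}
	fmt.Fprint(w, specBody)
}

// conditionalGet issues a GET, sending If-None-Match when cachedETag is set.
// It returns the status code, the response ETag, and the body.
func conditionalGet(url, cachedETag string) (int, string, []byte, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return 0, "", nil, err
	}
	if cachedETag != "" {
		req.Header.Set("If-None-Match", cachedETag)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return 0, "", nil, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return resp.StatusCode, resp.Header.Get("ETag"), body, err
}

// roundTrip fetches the document twice: once cold, once with the cached ETag.
func roundTrip() (int, int, int, error) {
	srv := httptest.NewServer(http.HandlerFunc(serveSpec))
	defer srv.Close()
	first, etag, _, err := conditionalGet(srv.URL, "")
	if err != nil {
		return 0, 0, 0, err
	}
	second, _, body, err := conditionalGet(srv.URL, etag)
	if err != nil {
		return 0, 0, 0, err
	}
	return first, second, len(body), nil
}

func main() {
	first, second, n, err := roundTrip()
	if err != nil {
		panic(err)
	}
	fmt.Println(first, second, n) // 200 304 0
}
```

The second request transfers headers only, which is the cheap path the answer above relies on when the document has not changed.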
Will enabling / using this feature result in introducing new API types?
No.
Will enabling / using this feature result in any new calls to the cloud provider?
No.
Will enabling / using this feature result in increasing size or count of the existing API objects?
No.
Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?
No.
Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, …) in any components?
No, kubectl is expected to use roughly the same amount of resources.
Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?
No, it is client-side only and only uses a single standard HTTP connection.
Troubleshooting
How does this feature react if the API server and/or etcd is unavailable?
It uses kubectl’s normal error handling. There is no lasting effect on data or the user.
What are other known failure modes?
What steps should be taken if SLOs are not being met to determine the problem?
N/A
Implementation History
Drawbacks
Alternatives
Implement proto.Models for OpenAPI V3 data
The current hard-coded printer is capable of printing any objects in proto.Models form.
We already have a way to express OpenAPI v3 data as proto.Models, so this can be
seen as the path of least resistance for plugging OpenAPI v3 into kubectl explain.
This approach is undesirable for a few different reasons:
1.) We would like to update the explain printer to include new OpenAPI v3 information; the current design makes that time consuming and hard to maintain.
2.) API-Machinery wants to deprecate proto.Models. We see proto.Models
conversion as unnecessary and costly bureaucracy that contributes to high
OpenAPI overhead. We are seeking to deprecate the type in favor of the
kube-openapi types for future usage.
Custom User Templates
Users might also like to be able to specify a path to a custom template file used to render the resource information, for example in a human-readable plaintext form:
kubectl explain pods --template /path/to/template.tmpl
Since the API surface for this sort of feature remains very unclear and will likely be very unstable, this sort of feature should be delayed until the internal templates have proven the API surface to be used. To do otherwise would risk breaking users’ templates.
Future Work
This work was specced out as part of this KEP but not implemented. SIG-CLI is open to improvements in these areas.
Other template outputs
This KEP makes it easy to extend the explain output. Requirements for
built-in md and html outputs might be:
- md output implemented (or dropped from design due to continued debate)
- Table of contents all GVKs grouped by Group then Version.
- Section for each individual GVK
- All types hyperlink to specific section
- basic html output (or dropped from design due to continued debate)
- Table of contents all GVKs grouped by Group then Version.
- Page for each individual GVK.
- All types hyperlink to their specific page
- Searchable by name, description, field name.
This was removed from scope for the KEP to focus only on the feature users rely on which is the plaintext explain. These templates may be added in the future.
HTML
kubectl explain pods --output html
Similar to godoc, we suggest providing a searchable, navigable, generated webpage for the kubernetes types of whatever cluster kubectl is talking to.
Only the fields selected in the command line (and their subfields’ types, etc) will be included in the resultant page.
Possible idea: If user types kubectl explain --output html with no specific target,
then all types in the cluster are included.
Markdown
kubectl explain pods --output md
When using the md template, a markdown document is printed to stdout, so it
might be saved and used for a documentation website, for example.
Similarly to html output, only the fields selected in the command line
(and their subfields’ types, etc) will be included in the resultant page.
Possible idea: If user types kubectl explain --output md with no specific target,
then all types in the cluster are included.