KEP-3130: KMS Observability
- Release Signoff Checklist
- Summary
- Motivation
- Proposal
- Design Details
- Production Readiness Review Questionnaire
- Implementation History
- Alternatives
Release Signoff Checklist
Items marked with (R) are required prior to targeting to a milestone / release.
- (R) Enhancement issue in release milestone, which links to KEP dir in kubernetes/enhancements (not the initial KEP PR)
- (R) KEP approvers have approved the KEP status as implementable
- (R) Design details are appropriately documented
- (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
- e2e Tests for all Beta API Operations (endpoints)
- (R) Ensure GA e2e tests meet requirements for Conformance Tests
- (R) Minimum Two Week Window for GA e2e tests to prove flake free
- (R) Graduation criteria is in place
- (R) all GA Endpoints must be hit by Conformance Tests
- (R) Production readiness review completed
- (R) Production readiness review approved
- “Implementation History” section is up-to-date for milestone
- User-facing documentation has been created in kubernetes/website , for publication to kubernetes.io
- Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
Summary
Currently, it is not possible to correlate (in logs) the sequence of calls that are involved in the enveloping operation: kube-apiserver->kms-plugin->KMS. This KEP proposes extending the signature of the kms-plugin interface to include the transaction ID (to be generated by the kube-apiserver), which kms-plugin could pass to KMS.
Motivation
The only way to correlate a successful or failed envelope operation today is to use the approximate timestamp of the operation to check events in kube-apiserver, kms-plugin and KMS. There is no guarantee that the timestamp of the operation is the same as the timestamp of the corresponding event in KMS. This KEP proposes extending the signature of the kms-plugin interface to include the transaction ID (to be generated by the kube-apiserver), which kms-plugin could pass to KMS. The kube-apiserver will log this transaction ID together with additional metadata, such as the secret name and namespace, for the envelope operation. Similarly, the transaction ID will be logged in the kms-plugin and optionally passed to KMS.
Goals
- Add transaction ID to kms-plugin interface
- Update the logging in kube-apiserver to include transaction ID and non-sensitive metadata such as secret name, namespace for envelope operations
Non-Goals
- Using this transaction ID for audit logging
Proposal
- Generate a new UID for each envelope operation in kube-apiserver.
- Add a new UID field to the envelope operation in kms-plugin interface.
Design Details
This design is centered around generating a new UID for each envelope operation, similar to the UID generation for admission review requests (see https://github.com/kubernetes/kubernetes/blob/e9e669aa6037c380469b45200e59cff9b52d6d68/staging/src/k8s.io/apiserver/pkg/admission/plugin/webhook/request/admissionreview.go#L137).
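As a rough illustration of per-operation UID generation, the sketch below produces a random (version 4) UUID using only the Go standard library. The helper name `newOperationUID` is hypothetical and not part of this proposal; an actual implementation would more likely reuse an existing UUID helper, as the admission review code linked above does, than hand-roll one.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// newOperationUID returns a random RFC 4122 version 4 UUID string.
// Hypothetical helper for illustration; not part of the kms-plugin API.
func newOperationUID() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set version 4
	b[8] = (b[8] & 0x3f) | 0x80 // set RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x",
		b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

func main() {
	// Each envelope operation gets its own freshly generated UID.
	for _, op := range []string{"encrypt", "decrypt"} {
		uid, err := newOperationUID()
		if err != nil {
			panic(err)
		}
		fmt.Printf("op=%s uid=%s\n", op, uid)
	}
}
```

Generating the UID inside the kube-apiserver, rather than accepting one from the caller, keeps the identifier unique per envelope operation.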
A new UID field will be added to the EncryptRequest and DecryptRequest structs in the kms-plugin interface. The field is a pointer to a string. If the feature gate is disabled, the UID field will be nil and this results in byte equivalent data on the wire when compared to a 1.23 API server.
type EncryptRequest struct {
    // UID is a unique identifier for the request.
    UID *string `protobuf:"bytes,3,opt,name=uid,proto3" json:"uid,omitempty"`
    // Version of the KMS plugin API.
    Version string `protobuf:"bytes,1,opt,name=version,proto3" json:"version,omitempty"`
    // The data to be encrypted.
    Plain []byte `protobuf:"bytes,2,opt,name=plain,proto3" json:"plain,omitempty"`
    XXX_NoUnkeyedLiteral struct{} `json:"-"`
    XXX_unrecognized []byte `json:"-"`
    XXX_sizecache int32 `json:"-"`
}

type DecryptRequest struct {
    // UID is a unique identifier for the request.
    UID *string `protobuf:"bytes,3,opt,name=uid,proto3" json:"uid,omitempty"`
    // Version of the KMS plugin API.
    Version string `protobuf:"bytes,1,opt,name=version,proto3" json:"version,omitempty"`
    // The data to be decrypted.
    Cipher []byte `protobuf:"bytes,2,opt,name=cipher,proto3" json:"cipher,omitempty"`
    XXX_NoUnkeyedLiteral struct{} `json:"-"`
    XXX_unrecognized []byte `json:"-"`
    XXX_sizecache int32 `json:"-"`
}
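To make the byte-equivalence claim concrete, the following sketch hand-encodes the three request fields in the protobuf wire format (length-delimited fields with tags 1, 2 and 3). It is a simplified stand-in for the generated marshalling code, written under the assumption that a nil optional field emits no bytes; the type and function names are hypothetical.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encryptRequest mirrors the fields of EncryptRequest for illustration only.
type encryptRequest struct {
	UID     *string
	Version string
	Plain   []byte
}

// appendLenDelimited appends one length-delimited protobuf field (wire type 2).
func appendLenDelimited(dst []byte, fieldNum uint64, val []byte) []byte {
	var buf [binary.MaxVarintLen64]byte
	n := binary.PutUvarint(buf[:], fieldNum<<3|2) // field tag
	dst = append(dst, buf[:n]...)
	n = binary.PutUvarint(buf[:], uint64(len(val))) // payload length
	dst = append(dst, buf[:n]...)
	return append(dst, val...)
}

// marshal hand-encodes the request. A nil UID contributes no bytes at all.
func marshal(r encryptRequest) []byte {
	var out []byte
	if r.Version != "" {
		out = appendLenDelimited(out, 1, []byte(r.Version))
	}
	if len(r.Plain) > 0 {
		out = appendLenDelimited(out, 2, r.Plain)
	}
	if r.UID != nil {
		out = appendLenDelimited(out, 3, []byte(*r.UID))
	}
	return out
}

func main() {
	uid := "example-uid" // placeholder value
	without := marshal(encryptRequest{Version: "v1beta1", Plain: []byte("secret")})
	with := marshal(encryptRequest{UID: &uid, Version: "v1beta1", Plain: []byte("secret")})
	fmt.Printf("without UID: %x\n", without)
	fmt.Printf("with UID:    %x\n", with)
}
```

With the UID unset, the output consists only of fields 1 and 2, which is what a server that predates the UID field would send. With the UID set, the same bytes are followed by one additional field.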
The UID generated in the kube-apiserver will be used:
- For logging in the kube-apiserver. All envelope operations to the kms-plugin will be logged with the corresponding UID.
  - The UID will be logged using a wrapper in the kube-apiserver to ensure that the UID is logged in the same format and is always logged.
  - In addition to the UID, the kube-apiserver will also log non-sensitive metadata such as name, namespace and GroupVersionResource of the object that triggered the envelope operation.
- Sent to the kms-plugin as part of the EncryptRequest and DecryptRequest structs.
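One possible shape for the logging wrapper is sketched below. It uses the standard library `log` package instead of the kube-apiserver's own logging library so that the example stays self-contained. The type, the function names and the key=value layout are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"log"
)

// envelopeOp carries the UID and the non-sensitive metadata for one
// envelope operation. All names here are hypothetical.
type envelopeOp struct {
	UID       string
	Operation string // "encrypt" or "decrypt"
	Group     string // empty for the core API group
	Version   string
	Resource  string
	Namespace string
	Name      string
}

// formatEnvelopeLog renders every operation in one fixed layout so the UID
// always appears, and always in the same position and format.
func formatEnvelopeLog(op envelopeOp) string {
	gvr := op.Group + "/" + op.Version + "/" + op.Resource
	return fmt.Sprintf("envelope operation uid=%q op=%q gvr=%q namespace=%q name=%q",
		op.UID, op.Operation, gvr, op.Namespace, op.Name)
}

// logEnvelopeOp is the single entry point for logging envelope operations.
func logEnvelopeOp(op envelopeOp) {
	log.Print(formatEnvelopeLog(op))
}

func main() {
	logEnvelopeOp(envelopeOp{
		UID:       "example-uid", // placeholder value
		Operation: "encrypt",
		Version:   "v1",
		Resource:  "secrets",
		Namespace: "default",
		Name:      "db-password",
	})
}
```

Routing every call site through one function is what guarantees a consistent format. Only the object's identity is logged, never its contents.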
Test Plan
Unit tests covering:
- Generation of UID for each envelope operation
Integration test covering:
- Logging of UID in kube-apiserver
- UID in the EncryptRequest and DecryptRequest
- UID set to nil in the EncryptRequest and DecryptRequest when the feature gate is disabled
  - Confirm this results in byte equivalent data on the wire when compared to a 1.23 API server.
Graduation Criteria
Alpha
- Feature implemented behind a feature flag
- Initial unit and integration tests completed and enabled
Beta
- Gather feedback from providers using the feature
- Any known bugs fixed
GA
- This is part of the KMS reference implementation
Production Readiness Review Questionnaire
Feature Enablement and Rollback
How can this feature be enabled / disabled in a live cluster?
- Feature gate
  - Feature gate name: KMSUID
  - Components depending on the feature gate:
    - kube-apiserver
FeatureSpec{
    Default: false,
    LockToDefault: false,
    PreRelease: featuregate.Alpha,
}
Does enabling the feature change any default behavior?
Yes. When the feature gate is enabled, a UID is sent as part of each envelope operation, which is a change in the default behavior. This change is backwards compatible.
Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?
Yes, via the KMSUID feature gate. Disabling this gate will cause the API server to not send the UID as part of Encrypt or Decrypt envelope operation.
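The enable/disable behavior can be pictured with the minimal sketch below, in which the UID pointer is populated only when the gate is on. The `decryptRequest` type, the `newDecryptRequest` helper and the boolean standing in for the feature gate check are hypothetical.

```go
package main

import "fmt"

// decryptRequest mirrors the fields of DecryptRequest for illustration only.
type decryptRequest struct {
	UID     *string
	Version string
	Cipher  []byte
}

// newDecryptRequest sets the UID only when the gate is enabled; otherwise the
// field stays nil and nothing extra is sent to the kms-plugin.
func newDecryptRequest(gateEnabled bool, uid string, cipher []byte) decryptRequest {
	req := decryptRequest{Version: "v1beta1", Cipher: cipher}
	if gateEnabled {
		req.UID = &uid
	}
	return req
}

func main() {
	for _, enabled := range []bool{false, true} {
		req := newDecryptRequest(enabled, "example-uid", []byte("ciphertext"))
		fmt.Printf("gate=%v uidSet=%v\n", enabled, req.UID != nil)
	}
}
```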
Monitoring Requirements
How can someone using this feature know that it is working for their instance?
- Other (treat as last resort)
- Details: Log entries in kube-apiserver, kms-plugin and KMS will include the corresponding UID.
What are the reasonable SLOs (Service Level Objectives) for the enhancement?
There should be no impact on the SLO with this change.
What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
- Other (treat as last resort)
- Details: Log entries in kube-apiserver, kms-plugin and KMS will include the corresponding UID.
Dependencies
Does this feature depend on any specific services running in the cluster?
No.
Scalability
Will enabling / using this feature result in any new API calls?
No.
Will enabling / using this feature result in introducing new API types?
No.
Will enabling / using this feature result in any new calls to the cloud provider?
No.
Will enabling / using this feature result in increasing size or count of the existing API objects?
This proposal adds a new field UID to the gRPC API for envelope operations.
Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?
No.
Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, …) in any components?
No.
Troubleshooting
How does this feature react if the API server and/or etcd is unavailable?
- ETCD data encryption with external kms-plugin is unavailable
Implementation History
Alternatives
We considered using the AuditID from the kube-apiserver request that generated the envelope operation. This approach has the following drawbacks:
- AuditID can be configured by the user with the Audit-ID header in the API server request. Multiple requests can be sent to the kube-apiserver with the same Audit-ID.
- Not all API server requests will generate an envelope operation. The API server caches DEKs and for the DEK that’s available in the cache, the kube-apiserver will not generate an envelope operation.
- Since not all calls to the KMS correspond to an audit log, using audit ID is not complete for correlating calls from kube-apiserver->kms-plugin->KMS.
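The mismatch between API requests and envelope operations can be seen in a deliberately simplified model of the DEK cache. In the sketch below, a cache hit returns without calling the KMS, so only cache misses produce a UID. The types, the in-memory map and the counter-based UIDs are hypothetical and do not reflect the actual kube-apiserver implementation.

```go
package main

import "fmt"

// fakeKMS records the UID of every Decrypt call it receives.
// It stands in for the kms-plugin client in this sketch.
type fakeKMS struct {
	calls int
	uids  []string
}

func (k *fakeKMS) Decrypt(uid string, encDEK []byte) []byte {
	k.calls++
	k.uids = append(k.uids, uid)
	return append([]byte("plain:"), encDEK...) // placeholder "decryption"
}

// envelopeService caches decrypted DEKs keyed by their encrypted form.
type envelopeService struct {
	kms     *fakeKMS
	cache   map[string][]byte
	counter int
}

// dekFor returns the DEK, calling the KMS (with a new UID) only on a cache miss.
func (e *envelopeService) dekFor(encDEK []byte) []byte {
	if dek, ok := e.cache[string(encDEK)]; ok {
		return dek // cache hit: no envelope operation, hence no UID
	}
	e.counter++
	uid := fmt.Sprintf("uid-%d", e.counter) // placeholder UID scheme
	dek := e.kms.Decrypt(uid, encDEK)
	e.cache[string(encDEK)] = dek
	return dek
}

func main() {
	k := &fakeKMS{}
	e := &envelopeService{kms: k, cache: map[string][]byte{}}
	// Three reads stand in for three API requests using the same DEK.
	for i := 0; i < 3; i++ {
		e.dekFor([]byte("enc-dek-1"))
	}
	fmt.Printf("reads=3 kmsCalls=%d uids=%v\n", k.calls, k.uids)
}
```

In this model, three reads result in a single KMS call, so an identifier tied to the API request would not map one-to-one onto KMS calls.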