KEP-3288: Split Stdout and Stderr Log Stream of Container
- Release Signoff Checklist
- Summary
- Motivation
- Proposal
- Design Details
- Production Readiness Review Questionnaire
- Implementation History
- Drawbacks
- Alternatives
- Infrastructure Needed (Optional)
Release Signoff Checklist
Items marked with (R) are required prior to targeting to a milestone / release.
- (R) Enhancement issue in release milestone, which links to KEP dir in kubernetes/enhancements (not the initial KEP PR)
- (R) KEP approvers have approved the KEP status as implementable
- (R) Design details are appropriately documented
- (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
- e2e Tests for all Beta API Operations (endpoints)
- (R) Ensure GA e2e tests meet requirements for Conformance Tests
- (R) Minimum Two Week Window for GA e2e tests to prove flake free
- (R) Graduation criteria is in place
- (R) all GA Endpoints must be hit by Conformance Tests
- (R) Production readiness review completed
- (R) Production readiness review approved
- “Implementation History” section is up-to-date for milestone
- User-facing documentation has been created in kubernetes/website , for publication to kubernetes.io
- Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
Summary
Currently, kubelet is able to return a specific log stream of a container, but this ability is not exposed to users. This proposal extends kubelet and api-server so that users can retrieve a specific log stream (stdout or stderr) of a container.
Motivation
Users can fetch the logs of a container, but kubelet always returns the combined stdout and stderr, which is inconvenient for users who only want to view one of the streams. Many users have asked for the ability to retrieve a specific log stream of a container, so this long-requested feature is worth implementing.
Goals
- Enable api-server to return a specific log stream of a container
- Enable users to fetch a specific log stream of a container
Non-Goals
- Supporting the combination of a specific Stream (stdout or stderr) and TailLines in the first iteration. However, if Stream is set to all, both Stream and TailLines can be specified together.
- Implementing a new log format
Proposal
Add a new field Stream to PodLogOptions to allow users to indicate which log stream
they want to retrieve. To maintain backward compatibility, if this field is not set,
the combined stdout and stderr from the container will be returned to users.
Users can then fetch the stderr log stream of a given container using kubectl as follows:
kubectl logs --stream=stderr -c container pod
User Stories
Bob runs a service “foo” that continuously prints informational messages to stdout, and prints warnings/errors to stderr. When the service “foo” is misbehaving, Bob wants to know what is going on as soon as possible, so he checks the logs of the service “foo”, but the error message is not easy to notice because the apiserver always returns the combined stdout and stderr from the container.
This problem could be resolved if Bob is able to specify that he only wants stderr.
Notes/Constraints/Caveats (Optional)
It can be tricky when Stream and TailLines of PodLogOptions are both specified.
For instance, users might want to fetch the last 10 lines of the stderr log stream for a container, but this is prohibitively expensive to implement from kubelet’s perspective:
At present, container’s logs are stored in an encoded format, either “json-file” or “cri” format:
# json-file
{"log":"out1\n","stream":"stdout","time":"2024-08-20T09:31:37.985370552Z"}
{"log":"err1\n","stream":"stderr","time":"2024-08-20T09:31:37.985370552Z"}
# cri
2016-10-06T00:17:09.669794202Z stdout F out1
2016-10-06T00:17:09.669794202Z stderr F err1
Please note that the log stream is encoded in each log line. Let’s see what will happen if kubelet needs to return the last 10 lines of the stderr log stream:
- Kubelet has to decode each line of the log file, starting from the bottom, to determine whether it came from the stderr stream, which is CPU-intensive in practice.
- Worse, kubelet must keep scanning and tracking matched lines until the requested number of lines is found. This can be time-consuming, especially when there are only a few stderr lines amid a large number of stdout lines.
In conclusion, it is not practical for kubelet to return the last N lines of a specific log stream.
Alternatively, kubelet could return logs that match the given stream within the last N lines. For example, consider the following logs:
{"log":"out1\n","stream":"stdout","time":"2024-08-20T09:31:37.985370552Z"}
{"log":"err1\n","stream":"stderr","time":"2024-08-20T09:31:37.985370552Z"}
{"log":"out2\n","stream":"stdout","time":"2024-08-20T09:31:37.985370552Z"}
{"log":"err2\n","stream":"stderr","time":"2024-08-20T09:31:37.985370552Z"}
If users run kubectl logs --stream=stderr --tail=2 pod, kubelet would only return the following:
err2
This approach is more efficient, as kubelet only needs to parse a deterministic number of log lines once, rather than potentially all of them. However, this may go against users’ expectations and could lead to confusion.
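To make the difference between the two strategies concrete, here is a minimal Go sketch that runs both over the four json-file lines above. The names entry, exactTail, and tailThenFilter are hypothetical and are not part of kubelet, and the sketch works on an in-memory slice rather than a log file.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// entry mirrors the fields of one "json-file" log line that matter here.
type entry struct {
	Log    string `json:"log"`
	Stream string `json:"stream"`
}

// decode parses a single json-file log line.
func decode(line string) (entry, error) {
	var e entry
	err := json.Unmarshal([]byte(line), &e)
	return e, err
}

// exactTail returns the last n messages of the given stream. It walks the
// lines backwards and may have to decode every line before it finds n
// matches. The second return value is the number of lines decoded.
func exactTail(lines []string, stream string, n int) ([]string, int) {
	var out []string
	decoded := 0
	for i := len(lines) - 1; i >= 0 && len(out) < n; i-- {
		e, err := decode(lines[i])
		decoded++
		if err != nil || e.Stream != stream {
			continue
		}
		// Prepend so the result stays in chronological order.
		out = append([]string{strings.TrimSuffix(e.Log, "\n")}, out...)
	}
	return out, decoded
}

// tailThenFilter decodes only the last n lines and keeps those that belong
// to the given stream, so the amount of work is bounded by n.
func tailThenFilter(lines []string, stream string, n int) []string {
	start := len(lines) - n
	if start < 0 {
		start = 0
	}
	if start > len(lines) {
		start = len(lines)
	}
	var out []string
	for _, l := range lines[start:] {
		if e, err := decode(l); err == nil && e.Stream == stream {
			out = append(out, strings.TrimSuffix(e.Log, "\n"))
		}
	}
	return out
}

func main() {
	lines := []string{
		`{"log":"out1\n","stream":"stdout","time":"2024-08-20T09:31:37.985370552Z"}`,
		`{"log":"err1\n","stream":"stderr","time":"2024-08-20T09:31:37.985370552Z"}`,
		`{"log":"out2\n","stream":"stdout","time":"2024-08-20T09:31:37.985370552Z"}`,
		`{"log":"err2\n","stream":"stderr","time":"2024-08-20T09:31:37.985370552Z"}`,
	}
	exact, decoded := exactTail(lines, "stderr", 2)
	fmt.Println(exact, decoded)                     // [err1 err2] 3
	fmt.Println(tailThenFilter(lines, "stderr", 2)) // [err2]
}
```

With a tail of 2, the exact variant has to decode three of the four lines to find both stderr entries, while the bounded variant decodes only two lines and returns a single entry.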
Taking all these considerations into account, I propose that we do not support the combination of a specific Stream
and TailLines in the first iteration.
Additionally, the apiserver should validate the PodLogOptions to ensure that a specific Stream and TailLines
are mutually exclusive.
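The rule could be checked along the following lines. This is only a sketch under stated assumptions: podLogOptions is a trimmed-down, hypothetical stand-in for PodLogOptions, validateStream is a hypothetical helper, and the error messages are illustrative rather than the ones the apiserver would emit.

```go
package main

import (
	"errors"
	"fmt"
)

// LogStreamType represents the desired log stream type.
type LogStreamType string

const (
	LogStreamTypeStdout LogStreamType = "stdout"
	LogStreamTypeStderr LogStreamType = "stderr"
	LogStreamTypeAll    LogStreamType = "all"
)

// podLogOptions is a hypothetical stand-in that keeps only the two fields
// relevant to this rule.
type podLogOptions struct {
	Stream    LogStreamType
	TailLines *int64
}

// validateStream rejects unknown streams and the combination of a specific
// stream (stdout or stderr) with TailLines. An unset stream or "all" may be
// combined with TailLines.
func validateStream(opts podLogOptions) error {
	switch opts.Stream {
	case "", LogStreamTypeAll:
		return nil
	case LogStreamTypeStdout, LogStreamTypeStderr:
		if opts.TailLines != nil {
			return errors.New("tailLines cannot be combined with a specific stream")
		}
		return nil
	default:
		return fmt.Errorf("invalid container log stream %q", opts.Stream)
	}
}

func main() {
	n := int64(10)
	fmt.Println(validateStream(podLogOptions{Stream: LogStreamTypeStderr, TailLines: &n}))
	fmt.Println(validateStream(podLogOptions{Stream: LogStreamTypeAll, TailLines: &n}))
}
```

The first call returns an error because stderr is combined with TailLines; the second returns nil because "all" keeps today's behavior.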
Risks and Mitigations
Design Details
Changes of kube-apiserver
Add a new field Stream to k8s.io/kubernetes/pkg/apis/core.PodLogOptions:
// LogStreamType represents the desired log stream type.
type LogStreamType string

const (
	// LogStreamTypeStdout is the stream type for stdout.
	LogStreamTypeStdout LogStreamType = "stdout"
	// LogStreamTypeStderr is the stream type for stderr.
	LogStreamTypeStderr LogStreamType = "stderr"
	// LogStreamTypeAll represents the combined stdout and stderr.
	LogStreamTypeAll LogStreamType = "all"
)

// PodLogOptions is the query options for a Pod's logs REST call
type PodLogOptions struct {
	...
	// If set to "stdout" or "stderr", return the given log stream of the container.
	// If set to "all" or not set, the combined stdout and stderr from the container is returned.
	// Available values are: "stdout", "stderr", "all", "" (empty string).
	// +optional
	Stream LogStreamType
}
To query a specific stream of a container, users add a new query parameter named stream
to the URL, e.g. /api/v1/namespaces/default/pods/foo/log?stream=stderr&container=nginx.
The kube-apiserver then knows the desired stream and passes it on to the kubelet.
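As an illustration of the request shape, the following sketch builds such a URL with Go's net/url package. The helper name logPath is hypothetical; real clients would normally go through client-go or kubectl rather than assemble the path by hand.

```go
package main

import (
	"fmt"
	"net/url"
)

// logPath returns the pod log path with the container and, if given, the
// stream query parameters. An empty stream is omitted so that the server
// falls back to the combined output.
func logPath(namespace, pod, container, stream string) string {
	q := url.Values{}
	q.Set("container", container)
	if stream != "" {
		q.Set("stream", stream)
	}
	u := url.URL{
		Path:     "/api/v1/namespaces/" + namespace + "/pods/" + pod + "/log",
		RawQuery: q.Encode(), // Encode sorts parameters by key
	}
	return u.String()
}

func main() {
	fmt.Println(logPath("default", "foo", "nginx", "stderr"))
	// /api/v1/namespaces/default/pods/foo/log?container=nginx&stream=stderr
}
```

The parameter order differs from the example above only because url.Values.Encode sorts keys; the order of query parameters carries no meaning.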
To tell kubelet which stream to return, we need to update the LogLocation
function to make it aware of
the new parameter:
// LogLocation returns the log URL for a pod container. If opts.Container is blank
// and only one container is present in the pod, that container is used.
func LogLocation(
	ctx context.Context, getter ResourceGetter,
	connInfo client.ConnectionInfoGetter,
	name string,
	opts *api.PodLogOptions,
) (*url.URL, http.RoundTripper, error) {
	...
	params := url.Values{}
	// validate stream; the constants are defined next to PodLogOptions in the api package
	switch opts.Stream {
	case api.LogStreamTypeStdout, api.LogStreamTypeStderr, api.LogStreamTypeAll, "":
	default:
		return nil, nil, errors.NewBadRequest(fmt.Sprintf("invalid container log stream %s", opts.Stream))
	}
	// keep backwards compatibility
	if opts.Stream == "" {
		opts.Stream = api.LogStreamTypeAll
	}
	params.Add("stream", string(opts.Stream))
	...
}
Changes of kubelet
Add a new field Stream to k8s.io/api/core/v1.PodLogOptions:
// LogStreamType represents the desired log stream type.
// +enum
type LogStreamType string

const (
	// LogStreamTypeStdout is the stream type for stdout.
	LogStreamTypeStdout LogStreamType = "stdout"
	// LogStreamTypeStderr is the stream type for stderr.
	LogStreamTypeStderr LogStreamType = "stderr"
	// LogStreamTypeAll represents the combined stdout and stderr.
	LogStreamTypeAll LogStreamType = "all"
)

// PodLogOptions is the query options for a Pod's logs REST call.
type PodLogOptions struct {
	...
	// If set, return the given log stream of the container.
	// Otherwise, the combined stdout and stderr from the container is returned.
	// +optional
	Stream LogStreamType
}
In the getContainerLogs
method of k8s.io/kubernetes/pkg/kubelet/server.Server, we examine the Stream field
of PodLogOptions to decide which stream to return:
// getContainerLogs handles containerLogs request against the Kubelet
func (s *Server) getContainerLogs(request *restful.Request, response *restful.Response) {
	...
	fw := flushwriter.Wrap(response.ResponseWriter)
	var stdout, stderr io.Writer
	switch logOptions.Stream {
	case corev1.LogStreamTypeStdout:
		stdout, stderr = fw, io.Discard
	case corev1.LogStreamTypeStderr:
		stdout, stderr = io.Discard, fw
	// An unset stream keeps the previous behavior (combined stdout and stderr),
	// so that neither writer is left nil.
	case corev1.LogStreamTypeAll, "":
		stdout, stderr = fw, fw
	}
	response.Header().Set("Transfer-Encoding", "chunked")
	if err := s.host.GetKubeletContainerLogs(ctx, kubecontainer.GetPodFullName(pod), containerName, logOptions, stdout, stderr); err != nil {
		response.WriteError(http.StatusBadRequest, err)
		return
	}
}
We also need to modify the ReadLogs function so that it can filter out the unwanted stream.
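The writer selection in getContainerLogs and the per-line routing can be pictured with the following self-contained sketch. It is not the kubelet code: logEntry, writersFor, writeEntries, and render are hypothetical names, and the entries are assumed to be already parsed.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// logEntry is a hypothetical, already-parsed log line.
type logEntry struct {
	Stream string // "stdout" or "stderr"
	Msg    string
}

// writersFor mirrors the switch in getContainerLogs: the unwanted stream is
// pointed at io.Discard, and "all" or an unset value selects both streams.
func writersFor(stream string, w io.Writer) (stdout, stderr io.Writer) {
	switch stream {
	case "stdout":
		return w, io.Discard
	case "stderr":
		return io.Discard, w
	default:
		return w, w
	}
}

// writeEntries sends each entry to the writer that matches its stream.
func writeEntries(entries []logEntry, stdout, stderr io.Writer) error {
	for _, e := range entries {
		w := stdout
		if e.Stream == "stderr" {
			w = stderr
		}
		if _, err := io.WriteString(w, e.Msg); err != nil {
			return err
		}
	}
	return nil
}

// render returns what a client would receive for the requested stream.
func render(entries []logEntry, stream string) string {
	var buf bytes.Buffer
	stdout, stderr := writersFor(stream, &buf)
	if err := writeEntries(entries, stdout, stderr); err != nil {
		return ""
	}
	return buf.String()
}

func main() {
	entries := []logEntry{
		{Stream: "stdout", Msg: "out1\n"},
		{Stream: "stderr", Msg: "err1\n"},
	}
	fmt.Printf("%q\n", render(entries, "stderr")) // "err1\n"
	fmt.Printf("%q\n", render(entries, "all"))    // "out1\nerr1\n"
}
```

Routing through io.Discard still decodes every line; skipping the write for an unwanted stream inside the read loop would be one way to avoid that extra work.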
Changes of kubectl
Add a new flag --stream to kubectl logs, defaulting to “all”, so that users can specify the
log stream to return.
Test Plan
[x] I/we understand the owners of the involved components may require updates to existing tests to make this code solid enough prior to committing the changes necessary to implement this enhancement.
Prerequisite testing updates
Unit tests
- k8s.io/kubernetes/pkg/registry/core/pod/strategy.go: 2022-06-07 - 59.5%
- k8s.io/kubernetes/pkg/kubelet/server/server.go: 2022-06-07 - 67.8%
- k8s.io/kubernetes/pkg/kubelet/kuberuntime/logs/logs.go: 2022-06-07 - 71%
Integration tests
Add unit tests to pkg/kubelet/... and pkg/registry/core/pod/... to make sure kubelet
and kube-apiserver behave as expected.
Specifically, when TailLines and Stream are both specified in PodLogOptions, we need to ensure that
TailLines is counted properly and that only logs from the specified stream are counted.
e2e tests
Add test cases and conformance tests to test/e2e/common/node/ and test/e2e/kubectl/.
Graduation Criteria
Alpha
- Feature implemented behind a feature flag
- Add unit and e2e tests for the feature.
Beta
- Solicit feedback from the Alpha.
- Ensure tests are stable and passing.
Upgrade / Downgrade Strategy
No extra work is required for users to maintain the previous behavior; the changes introduced by this enhancement are backwards compatible.
To make use of the enhancement, users will need to update the kube-apiserver and kubelet
to at least v1.32 and turn on feature gate SplitStdoutAndStderr in both components.
Version Skew Strategy
TBD
Production Readiness Review Questionnaire
Feature Enablement and Rollback
How can this feature be enabled / disabled in a live cluster?
- Feature gate (also fill in values in kep.yaml)
- Feature gate name: SplitStdoutAndStderr
- Components depending on the feature gate: kubelet, kube-apiserver
Does enabling the feature change any default behavior?
No. If the stream query parameter in the URL for fetching logs from kube-apiserver is empty or not set,
the combined stdout and stderr is returned, which is the default behavior.
Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?
Yes.
What happens if we reenable the feature if it was previously rolled back?
No harm. The feature becomes enabled again after the kubelet and kube-apiserver restart.
The log files do not change when the feature is on compared to when it is off.
Are there any tests for feature enablement/disablement?
Yes, unit tests for the feature when enabled and disabled will be implemented in both kubelet and api server.
Rollout, Upgrade and Rollback Planning
How can a rollout or rollback fail? Can it impact already running workloads?
What specific metrics should inform a rollback?
Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
Monitoring Requirements
How can an operator determine if the feature is in use by workloads?
How can someone using this feature know that it is working for their instance?
- Events
- Event Reason:
- API .status
- Condition name:
- Other field:
- Other (treat as last resort)
- Details:
What are the reasonable SLOs (Service Level Objectives) for the enhancement?
N/A
What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
N/A
Are there any missing metrics that would be useful to have to improve observability of this feature?
N/A
Dependencies
Does this feature depend on any specific services running in the cluster?
No.
Scalability
Will enabling / using this feature result in any new API calls?
No.
Will enabling / using this feature result in introducing new API types?
No.
Will enabling / using this feature result in any new calls to the cloud provider?
No.
Will enabling / using this feature result in increasing size or count of the existing API objects?
No.
Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?
No.
Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, …) in any components?
No.
Troubleshooting
How does this feature react if the API server and/or etcd is unavailable?
What are other known failure modes?
What steps should be taken if SLOs are not being met to determine the problem?
Implementation History
2022-05-01: KEP opened
2022-06-08: KEP marked implementable
Drawbacks
Alternatives
Instead of filtering log stream on the server side, we could return a stream of CRI-format logs, something like the following:
2016-10-06T00:17:09.669794202Z stdout P log content 1
2016-10-06T00:17:09.669794203Z stderr F log content 2
so that we could demultiplex the log stream on the client side.
The main drawbacks of this approach are that it changes the format of the log stream, which breaks backward compatibility, and that it adds extra overhead to demultiplex the stream on the client side.
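To give a sense of what every client would have to do under this alternative, here is a minimal sketch of client-side demultiplexing. The function name demux is hypothetical, and the sketch assumes each line has the four space-separated fields shown above, with the tag P marking a partial line and F a full one.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// demux splits CRI-format log text into its stdout and stderr parts.
// A newline is appended only after lines tagged "F" (full); lines tagged
// "P" (partial) are joined with whatever follows on the same stream.
func demux(logs string) (stdout, stderr string, err error) {
	var out, errOut strings.Builder
	sc := bufio.NewScanner(strings.NewReader(logs))
	for sc.Scan() {
		// <timestamp> <stream> <tag> <content>
		parts := strings.SplitN(sc.Text(), " ", 4)
		if len(parts) != 4 {
			return "", "", fmt.Errorf("malformed log line %q", sc.Text())
		}
		msg := parts[3]
		if parts[2] == "F" {
			msg += "\n"
		}
		switch parts[1] {
		case "stdout":
			out.WriteString(msg)
		case "stderr":
			errOut.WriteString(msg)
		default:
			return "", "", fmt.Errorf("unknown stream %q", parts[1])
		}
	}
	return out.String(), errOut.String(), sc.Err()
}

func main() {
	logs := "2016-10-06T00:17:09.669794202Z stdout P log content 1\n" +
		"2016-10-06T00:17:09.669794203Z stderr F log content 2\n"
	stdout, stderr, err := demux(logs)
	fmt.Printf("%q %q %v\n", stdout, stderr, err)
	// "log content 1" "log content 2\n" <nil>
}
```

Every existing consumer of the log endpoint would need equivalent parsing logic, which is the compatibility cost described above.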