KEP-4940: Add Pod Security Admission (PSA) to block setting .host field from ProbeHandler and LifecycleHandler
- Release Signoff Checklist
- Summary
- Proposal
- Design Details
- Production Readiness Review Questionnaire
- Implementation History
- Drawbacks
- Alternatives
Release Signoff Checklist
Items marked with (R) are required prior to targeting to a milestone / release.
- (R) Enhancement issue in release milestone, which links to KEP dir in kubernetes/enhancements (not the initial KEP PR)
- (R) KEP approvers have approved the KEP status as implementable
- (R) Design details are appropriately documented
- (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
- e2e Tests for all Beta API Operations (endpoints)
- (R) Ensure GA e2e tests meet requirements for Conformance Tests
- (R) Minimum Two Week Window for GA e2e tests to prove flake free
- (R) Graduation criteria is in place
- (R) all GA Endpoints must be hit by Conformance Tests
- (R) Production readiness review completed
- (R) Production readiness review approved
- “Implementation History” section is up-to-date for milestone
- User-facing documentation has been created in kubernetes/website, for publication to kubernetes.io
- Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
Summary
The TCPSocketAction and HTTPGetAction types each have a Host field.
Both are part of the ProbeHandler and LifecycleHandler structs, which
are used in the InitContainers and Containers of a PodSpec.
The Host field lets users specify a target other than the podIP
(the default value) for the Kubelet to probe.
However, this opens the door to attacks, since the Host field can be
set to almost any value, including security-sensitive external hosts
or localhost on the node. The Kubelet will probe whatever Host value
is set, which can lead to blind server-side request forgery (SSRF) attacks.
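To make the concern concrete, the following is a minimal sketch of a pod whose liveness probe sets the Host field. The pod name, image, path, and port are hypothetical placeholders chosen for illustration and do not come from this KEP.

```yaml
# Illustrative only: names, image, path, and port are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: probe-host-example
spec:
  containers:
  - name: app
    image: registry.example/app:latest   # placeholder image
    livenessProbe:
      httpGet:
        # Without "host", the Kubelet probes the pod IP.
        # With it, the Kubelet sends the request to this address instead;
        # 127.0.0.1 here resolves to localhost on the node, not the pod.
        host: 127.0.0.1
        path: /healthz
        port: 8080
```

The pod author never sees the response body, only whether the probe passed or failed, which is why the Summary describes the attack as blind.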
Goals
- Add Pod Security Admission (PSA) to enable admins to restrict users from creating probes with the Host field set.
- The Baseline Pod Security Standard (PSS) will be updated to enforce blocking this field so that it helps with easier adoption for workload operators given this is a known issue we want to prevent.
Non-Goals
- Removing the .Host field from the API and dropping support (it is an unwritten rule that nothing can get removed from the core Kubernetes API)
Proposal
There is a long-term plan to deprecate the existing TCP and HTTP probe types in the API and replace them with ones that have slightly different semantics. See KEP-4559 for more details. Given the unsolvable security problems with the Host field, we do not plan to offer it in the new types.
Meanwhile, the older API is never going to go away. So we also want to add a PSA check that lets admins restrict users from creating probes with the Host field set when using the (about to be deprecated) API. This is implemented by kubernetes PR 125271.
Risks and Mitigations
Some users may depend on the Host field in their existing probes;
those probes will continue to work. If newly created probes also
need the Host field to point to an external destination, the admin
can choose not to enforce the PSA level that blocks it.
Design Details
Add a Baseline APILevel Pod Security Admission policy to allow admins of the
cluster to block users from setting the .host field in:
- spec.containers[*].LivenessProbe.ProbeHandler.HTTPGet.Host
- spec.containers[*].ReadinessProbe.ProbeHandler.HTTPGet.Host
- spec.containers[*].StartupProbe.ProbeHandler.HTTPGet.Host
- spec.containers[*].LivenessProbe.ProbeHandler.TCPSocket.Host
- spec.containers[*].ReadinessProbe.ProbeHandler.TCPSocket.Host
- spec.containers[*].StartupProbe.ProbeHandler.TCPSocket.Host
- spec.containers[*].Lifecycle.PostStart.TCPSocket.Host // Deprecated. TCPSocket is NOT supported as a LifecycleHandler and kept for backward compatibility.
- spec.containers[*].Lifecycle.PreStop.TCPSocket.Host // Deprecated. TCPSocket is NOT supported as a LifecycleHandler and kept for backward compatibility.
- spec.containers[*].Lifecycle.PostStart.HTTPGet.Host
- spec.containers[*].Lifecycle.PreStop.HTTPGet.Host
- spec.initContainers[*].LivenessProbe.ProbeHandler.HTTPGet.Host
- spec.initContainers[*].ReadinessProbe.ProbeHandler.HTTPGet.Host
- spec.initContainers[*].StartupProbe.ProbeHandler.HTTPGet.Host
- spec.initContainers[*].LivenessProbe.ProbeHandler.TCPSocket.Host
- spec.initContainers[*].ReadinessProbe.ProbeHandler.TCPSocket.Host
- spec.initContainers[*].StartupProbe.ProbeHandler.TCPSocket.Host
- spec.initContainers[*].Lifecycle.PostStart.TCPSocket.Host // Deprecated. TCPSocket is NOT supported as a LifecycleHandler and kept for backward compatibility.
- spec.initContainers[*].Lifecycle.PreStop.TCPSocket.Host // Deprecated. TCPSocket is NOT supported as a LifecycleHandler and kept for backward compatibility.
- spec.initContainers[*].Lifecycle.PostStart.HTTPGet.Host
- spec.initContainers[*].Lifecycle.PreStop.HTTPGet.Host
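The paths above use the Go struct field names. As a rough sketch of how a few of them look in a pod manifest, where the keys are the lowerCamelCase serialized names, consider the example below. The pod name, image, address, paths, and ports are hypothetical and only serve to show where each `host` key sits.

```yaml
# Illustrative only: every value here is an assumption, not taken from the KEP.
apiVersion: v1
kind: Pod
metadata:
  name: host-field-locations
spec:
  containers:
  - name: app
    image: registry.example/app:latest   # placeholder image
    readinessProbe:
      tcpSocket:
        host: 192.0.2.10       # ReadinessProbe.ProbeHandler.TCPSocket.Host
        port: 5432
    startupProbe:
      httpGet:
        host: 192.0.2.10       # StartupProbe.ProbeHandler.HTTPGet.Host
        path: /ready
        port: 8080
    lifecycle:
      preStop:
        httpGet:
          host: 192.0.2.10     # Lifecycle.PreStop.HTTPGet.Host
          path: /shutdown
          port: 8080
```

The same keys apply under `spec.initContainers[*]`. Note that ProbeHandler is embedded in the probe type, so it does not appear as a key of its own in the manifest.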
Test Plan
- Unit and E2E tests will be added to ensure the PSA works as expected
Prerequisite testing updates
None
Unit tests
Necessary unit tests will be added to the PSA package for testing the new code. Current test coverage status for the package is:
- k8s.io/pod-security-admission/policy: 2025-05-06 - 89.9%
- k8s.io/pod-security-admission/test: TBD - TBD
Integration tests
The following integration tests will be added to verify the PSA validation logic:
- Test that pods with the .host field set in probes are rejected when PSA is enabled with baseline level
- Test that pods without the .host field set in probes are allowed when PSA is enabled with baseline level
- Test that existing pods with the .host field set continue to work when PSA is enabled
- Test that pods with the .host field set are allowed when PSA is disabled or using an older version
These tests will be added to:
- test/integration/auth/podsecurity_test.go: https://storage.googleapis.com/k8s-triage/index.html?test=TestPodSecurity
The integration tests will verify the PSA policy validation logic by:
- Creating test cases for each probe type (HTTPGet, TCPSocket) in a pod
- Testing each probe location (LivenessProbe, ReadinessProbe, StartupProbe, LifecycleHandler)
- Verifying the PSA policy enforcement at the baseline level
- Testing the behavior with different PSA configurations
e2e tests
There are no Pod Security specific E2E tests (we rely on integration test coverage instead), but the Pod Security admission controller is enabled in E2E clusters, and all E2E test namespaces are labeled with the enforcement label for Pod Security.
Graduation Criteria
The PSA check will be added within a single release. Because there will be no feature gates for it, there is no need for multi-release graduation criteria. All related code will land within the same release.
Upgrade / Downgrade Strategy
Existing pods with this field set should not be affected by the above solution. Only newly created pods with the field set are subject to the check.
Users who rely on this field can switch to exec probes going forward, which should unblock them, since exec probes can provide the same functionality.
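As a hedged sketch of that migration, the pod below replaces an httpGet probe that set `host` with an exec probe that issues the request itself. It assumes the container image ships a `wget` binary; the pod name, image, and target URL are hypothetical.

```yaml
# Illustrative only: assumes wget exists in the image; names and URL are made up.
apiVersion: v1
kind: Pod
metadata:
  name: exec-probe-example
spec:
  containers:
  - name: app
    image: registry.example/app:latest   # placeholder image
    livenessProbe:
      exec:
        # A zero exit code means the probe passed; wget exits non-zero
        # when the request fails.
        command:
        - wget
        - -q
        - -O
        - /dev/null
        - http://192.0.2.10:8080/healthz
      periodSeconds: 10
```

One difference worth noting: an exec probe runs inside the container, so the request originates from the pod rather than from the Kubelet on the node.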
Version Skew Strategy
N/A since it's only within a single component, pod-security-admission, and doesn’t cross multiple components.
Production Readiness Review Questionnaire
Feature Enablement and Rollback
How can this feature be enabled / disabled in a live cluster?
We decided not to use feature gates and to rely on PSA versioning instead.
So if the admin sets pod-security.kubernetes.io/enforce-version: v1.34
along with pod-security.kubernetes.io/enforce:
Does enabling the feature change any default behavior?
There is no effect on clusters where PSA is not enabled or an older PSA version is used.
There is no effect on clusters where .Host probes are not used.
If users create a new pod with the .Host probe field set and the admin has set the baseline PSA level to enforce mode, then the request will be actively blocked and rejected. Existing pods with .Host probes that are upgrading will not be impacted unless the PSA level is set to enforce mode.
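For illustration, a namespace that opts in to the new check might be labeled as in the sketch below. The namespace name is hypothetical; the label keys are the standard Pod Security Admission namespace labels, and the version is the one this KEP uses as its example.

```yaml
# Illustrative only: "team-a" is a made-up namespace name.
apiVersion: v1
kind: Namespace
metadata:
  name: team-a
  labels:
    # Enforce mode rejects pods that violate the selected level.
    pod-security.kubernetes.io/enforce: baseline
    # Selects which version of the baseline policy is evaluated.
    pod-security.kubernetes.io/enforce-version: v1.34
```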
Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?
No
What happens if we reenable the feature if it was previously rolled back?
N/A
Are there any tests for feature enablement/disablement?
N/A since there is no feature gate
Rollout, Upgrade and Rollback Planning
How can a rollout or rollback fail? Can it impact already running workloads?
- For running workloads/deployments that have .Host probes set: when they are rolled out after upgrading to the latest version, if the PSA enforce label is placed on the namespace of the workload, then the workload will fail to get created.
- If the pod security label is not set on the namespace, then there is no impact on running workloads.
What specific metrics should inform a rollback?
If your workloads are not rolling out due to the policy rejecting the request,
then cluster admins can use the PSA denial metrics.
For example, pod_security_evaluations_total
can indicate how many “deny” decisions were made out of the policy evaluations that
occurred.
Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
N/A
Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
N/A
Monitoring Requirements
N/A
How can an operator determine if the feature is in use by workloads?
If pods have probes with .Host field set and PSA label is set on that pod’s namespace
to a version where the new admission has been added, then it means the feature is enabled.
How can someone using this feature know that it is working for their instance?
Trying to create a pod with .Host field set in the probes will fail
like this:
Error from server (Forbidden): error when creating "psa/fail-case-pod.yaml": pods "liveness-http-pass" is forbidden: violates PodSecurity "restricted:latest": probeHost (container "liveness" uses probeHost 135.45.63.4)
Trying to rollout a deployment with .Host field set in probes will fail with the following status:
- lastTransitionTime: "2025-06-17T06:17:36Z"
  lastUpdateTime: "2025-06-17T06:17:36Z"
  message: 'pods "hello-world-577c86d6dd-bs7nt" is forbidden: violates PodSecurity
    "restricted:latest": probeHost (container "hello-world" uses probeHost 135.45.63.4)'
  reason: FailedCreate
  status: "True"
  type: ReplicaFailure
observedGeneration: 1
unavailableReplicas: 1
What are the reasonable SLOs (Service Level Objectives) for the enhancement?
N/A
What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
Cluster admins can use the PSA denial metrics to determine whether workloads are failing to roll out, and services are therefore not serving properly, because of policy enforcement.
Are there any missing metrics that would be useful to have to improve observability of this feature?
N/A
Dependencies
None
Does this feature depend on any specific services running in the cluster?
No
Scalability
N/A
Will enabling / using this feature result in any new API calls?
No
Will enabling / using this feature result in introducing new API types?
No
Will enabling / using this feature result in any new calls to the cloud provider?
No
Will enabling / using this feature result in increasing size or count of the existing API objects?
No
Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?
No
Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, …) in any components?
No
Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?
No
Troubleshooting
How does this feature react if the API server and/or etcd is unavailable?
N/A
What are other known failure modes?
N/A
What steps should be taken if SLOs are not being met to determine the problem?
If the admin sets pod-security.kubernetes.io/enforce-version: v1.34
on a namespace, this check is enabled and workloads rolling out
with .Host probes set will be impacted. One remediation procedure to
get workloads back into a healthy state is:
- Pin the PSA namespace label to a version prior to the one where this check is introduced (for example, v1.33).
- Restart your workloads.
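The pinning step could look like the sketch below. The namespace name is hypothetical and the enforce level is assumed to be baseline, matching the level this check is added to. The warn labels are an optional addition, not part of the procedure above, that keeps violations visible to users without rejecting their pods.

```yaml
# Illustrative only: "team-a" is a made-up namespace name.
apiVersion: v1
kind: Namespace
metadata:
  name: team-a
  labels:
    pod-security.kubernetes.io/enforce: baseline
    # Pinned to a version that predates the Host field check.
    pod-security.kubernetes.io/enforce-version: v1.33
    # Optional: surface a warning for the newer policy without blocking pods.
    pod-security.kubernetes.io/warn: baseline
    pod-security.kubernetes.io/warn-version: v1.34
```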
Implementation History
Drawbacks
N/A
Alternatives
The alternative is to remove this field from the API after it is deprecated, but that’s not a supported API action.