KEP-5100: [RETROACTIVE] DSR and Overlay support in Windows kube-proxy
- Release Signoff Checklist
- Summary
- Motivation
- Proposal
- Design Details
- Production Readiness Review Questionnaire
- Implementation History
- Drawbacks
- Alternatives
- Infrastructure Needed (Optional)
Release Signoff Checklist
Items marked with (R) are required prior to targeting to a milestone / release.
- (R) Enhancement issue in release milestone, which links to KEP dir in kubernetes/enhancements (not the initial KEP PR)
- (R) KEP approvers have approved the KEP status as implementable
- (R) Design details are appropriately documented
- (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
- e2e Tests for all Beta API Operations (endpoints)
- (R) Ensure GA e2e tests meet requirements for Conformance Tests
- (R) Minimum Two Week Window for GA e2e tests to prove flake free
- (R) Graduation criteria is in place
- (R) all GA Endpoints must be hit by Conformance Tests
- (R) Production readiness review completed
- (R) Production readiness review approved
- “Implementation History” section is up-to-date for milestone
- User-facing documentation has been created in kubernetes/website , for publication to kubernetes.io
- Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
Summary
Add support for DSR (Direct Server Return) and overlay networking mode to Windows kube-proxy.
Support for both of these features was added in K8s v1.14 without a KEP. This KEP is to retroactively document the changes made to Windows kube-proxy to support these features and provide a path for promoting these features to GA.
Motivation
DSR support was added to Windows Server 2019 as part of the May 2020 update. DSR improves performance by allowing return traffic for load-balanced connections to bypass the load balancer and go directly to the client, which reduces load on the load balancer and lowers overall latency. More information on DSR on Windows can be found here.
Overlay networking mode is a common networking mode used in Kubernetes clusters and is required for some important scenarios, such as network policy support with the Calico CNI. Adding support for overlay networking mode in Windows kube-proxy will allow users to use more CNI solutions with Windows nodes.
Goals
Enable DSR and overlay networking on Windows nodes running kube-proxy in Kubernetes clusters.
Non-Goals
Proposal
DSR and Overlay networking mode support is already implemented in Windows kube-proxy and has been extensively tested in the Windows CI pipeline. This proposal is to promote the existing implementations to GA.
Additionally, DSR support on Windows is supported on both EKS and AKS. Both DSR and overlay networking support have been used in the Windows CI pipelines running release-informing jobs since K8s v1.20.
User Stories (Optional)
Story 1
As a cluster administrator, I want to enable DSR functionality on Windows nodes in order to reduce load in the Host Networking Service and reduce latency for client requests.
Story 2
As a cluster administrator, I want to be able to enable network policy on Windows nodes, which for some CNI solutions requires overlay networking mode support in kube-proxy.
Notes/Constraints/Caveats (Optional)
Overlay networking mode is not compatible with dualstack networking on Windows.
If kube-proxy is started with both overlay networking mode and dualstack networking enabled, a warning message will be logged and the IP address space will be downgraded to IPv4 only. This is existing behavior and has not caused any reported issues.
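The fallback described above can be sketched as follows. This is a minimal illustration in Go, not the kube-proxy code: the IPFamily type, the effectiveFamilies function, and the warning text are all hypothetical names invented for the example.

```go
package main

import "fmt"

// IPFamily is a hypothetical stand-in for the address families kube-proxy serves.
type IPFamily string

const (
	IPv4 IPFamily = "IPv4"
	IPv6 IPFamily = "IPv6"
)

// effectiveFamilies returns the IP families that will actually be proxied.
// When overlay networking is combined with more than one requested family
// (dual-stack), it emits a warning and falls back to IPv4 only.
func effectiveFamilies(requested []IPFamily, overlay bool, warn func(string)) []IPFamily {
	if overlay && len(requested) > 1 {
		warn("dual-stack is not supported with overlay networking; using IPv4 only")
		return []IPFamily{IPv4}
	}
	return requested
}

func main() {
	warn := func(msg string) { fmt.Println("warning:", msg) }
	dualStack := []IPFamily{IPv4, IPv6}

	// Overlay plus dual-stack: warns and keeps IPv4 only.
	fmt.Println(effectiveFamilies(dualStack, true, warn))
	// Non-overlay plus dual-stack: both families are kept.
	fmt.Println(effectiveFamilies(dualStack, false, warn))
}
```

The warning is passed in as a callback so the sketch stays independent of any particular logging library.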
Risks and Mitigations
Enabling DSR and enabling overlay networking mode support in Windows kube-proxy both carry very little risk.
For DSR, the Windows Host Networking Service handles all of the logic for managing network traffic; kube-proxy only needs to specify whether DSR should be used when creating/syncing load balancer rules. Additionally, DSR must be enabled with a kube-proxy command line switch (--enable-dsr=true), and it can be disabled by redeploying kube-proxy on Windows nodes.
Overlay networking support in Windows has been used in the Windows CI pipelines running release-informing capz-windows-master jobs since K8s v1.20.
Design Details
Since the functionality is already implemented, the design details section will cover the current implementation.
DSR Enablement
DSR is enabled by passing --enable-dsr=true as a command line switch to the Windows kube-proxy.
Prior to GA, kube-proxy will ensure that WinDSR=true is specified in the feature-gates and will fail to start if DSR is enabled without that.
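The start-up check described above could look roughly like the sketch below. The validateDSR function and the map used to represent feature gates are assumptions made for illustration; kube-proxy uses its own feature gate machinery, and the error text here is invented.

```go
package main

import (
	"errors"
	"fmt"
)

// validateDSR mirrors the pre-GA rule from the text: --enable-dsr=true is only
// accepted when the WinDSR feature gate is also enabled. The map is a
// hypothetical stand-in for kube-proxy's feature gate lookup.
func validateDSR(enableDSR bool, featureGates map[string]bool) error {
	if enableDSR && !featureGates["WinDSR"] {
		return errors.New("--enable-dsr=true requires --feature-gates=WinDSR=true")
	}
	return nil
}

func main() {
	// DSR requested with the gate set: accepted.
	fmt.Println(validateDSR(true, map[string]bool{"WinDSR": true}))
	// DSR requested without the gate: rejected, so start-up would fail.
	fmt.Println(validateDSR(true, map[string]bool{}))
	// DSR not requested: the gate is irrelevant.
	fmt.Println(validateDSR(false, map[string]bool{}))
}
```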
Checks for terminating and service endpoints handle DSR traffic differently than non-DSR traffic to adhere to behavior defined in KEP-1669: Proxy Terminating Endpoints
- Local endpoints will be skipped when determining if all endpoints for a service are terminated if DSR is enabled and service type is load balancer.
- Non-local endpoints will be skipped when considering if all endpoints for a service are non-serving if DSR is enabled and service type is load balancer.
The flags passed to HNS (Host Networking Service) include a flag indicating whether DSR is enabled on all get, create, and update load balancer calls.
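As an illustration of how a DSR indicator can be carried on a load balancer call, the sketch below uses a bit flag. The lbFlags type, its constants, and the lbRequest struct are hypothetical and defined locally; they are not the HNS API, whose actual types and values should be taken from the HNS bindings that kube-proxy uses.

```go
package main

import "fmt"

// lbFlags is a hypothetical bitmask attached to load balancer requests.
type lbFlags uint32

const (
	lbFlagsNone lbFlags = 0
	lbFlagsDSR  lbFlags = 1 << 0 // assumed bit position, for illustration only
)

// lbRequest is a hypothetical, simplified load balancer request.
type lbRequest struct {
	VIP   string
	Port  uint16
	Flags lbFlags
}

// newLBRequest sets the DSR flag only when DSR has been enabled for kube-proxy.
func newLBRequest(vip string, port uint16, enableDSR bool) lbRequest {
	flags := lbFlagsNone
	if enableDSR {
		flags |= lbFlagsDSR
	}
	return lbRequest{VIP: vip, Port: port, Flags: flags}
}

// isDSR reports whether the request asks for Direct Server Return.
func (r lbRequest) isDSR() bool { return r.Flags&lbFlagsDSR != 0 }

func main() {
	fmt.Println(newLBRequest("10.0.0.10", 80, true).isDSR())
	fmt.Println(newLBRequest("10.0.0.10", 80, false).isDSR())
}
```

Using the same constructor for get, create, and update requests keeps the DSR setting consistent, which matters when an existing rule is looked up by its flags.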
Overlay support
To enable overlay networking on Windows nodes, the HNS network created on the node prior to starting kube-proxy and specified by $KUBE_NETWORK should be of type Overlay.
Prior to GA, WinOverlay=true must be specified in the kube-proxy feature gates.
If the specified network is of type Overlay and the feature gate is not set, kube-proxy will log an error and fail to start.
Additionally, in overlay networking mode, kube-proxy needs to know the source IP address of the traffic it is proxying. This is provided by setting --source-vip=$sourceVIP on the kube-proxy command line.
Creating the endpoint varies by CNI implementation. Here are two examples:
- For Flannel, the endpoint is created prior to starting kube-proxy, as in this example
- For Calico, the endpoint is created by the node agent and queried by name prior to starting kube-proxy, as in this example
Once kube-proxy is running in overlay networking mode, the source VIP used in load balancer policy rules is chosen based on the backend endpoints, using the following logic:
a) Backend endpoints are any IPs outside the cluster ==> Choose the node's IP as the source VIP
b) Backend endpoints are IP addresses of a remote node ==> Choose the node's IP as the source VIP
c) Everything else (local pods, remote pods, node IP of the current node) ==> Choose the specified source VIP
Everything else is handled by the Windows HNS.
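The source VIP selection rules (a) to (c) can be sketched as a small decision function. This is an illustrative reading of the rules, not the kube-proxy implementation: the BackendKind classification and the chooseSourceVIP function are hypothetical, and the sketch assumes that a single backend matching rule (a) or (b) is enough to select the node's IP.

```go
package main

import "fmt"

// BackendKind is a hypothetical classification of a load balancer backend.
type BackendKind int

const (
	OutsideCluster BackendKind = iota // rule (a): IP outside the cluster
	RemoteNodeIP                      // rule (b): IP address of a remote node
	LocalPod                          // rule (c)
	RemotePod                         // rule (c)
	LocalNodeIP                       // rule (c): node IP of the current node
)

// chooseSourceVIP returns nodeIP when any backend falls under rule (a) or (b),
// and the configured source VIP otherwise (rule (c)).
func chooseSourceVIP(backends []BackendKind, nodeIP, sourceVIP string) string {
	for _, kind := range backends {
		if kind == OutsideCluster || kind == RemoteNodeIP {
			return nodeIP
		}
	}
	return sourceVIP
}

func main() {
	nodeIP, sourceVIP := "10.0.0.4", "192.168.1.2" // example addresses only

	fmt.Println(chooseSourceVIP([]BackendKind{LocalPod, RemotePod}, nodeIP, sourceVIP))
	fmt.Println(chooseSourceVIP([]BackendKind{RemoteNodeIP}, nodeIP, sourceVIP))
	fmt.Println(chooseSourceVIP([]BackendKind{OutsideCluster}, nodeIP, sourceVIP))
}
```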
Test Plan
[x] I/we understand the owners of the involved components may require updates to existing tests to make this code solid enough prior to committing the changes necessary to implement this enhancement.
Prerequisite testing updates
Unit tests
Kube-proxy for Windows must run on Windows machines, so coverage is not reported in ci-kubernetes-coverage-unit. This coverage data was collected manually on a Windows Server 2022 machine:
- k8s.io/kubernetes/pkg/proxy/winkernel: 2025-02-11 - 58.8% of statements
Integration tests
The functionality described in this KEP requires Windows nodes and is primarily validated with unit and e2e tests. The Kubernetes project does not currently have support for running integration tests for Windows-specific code paths.
e2e tests
All Windows nodes running kube-proxy in https://testgrid.k8s.io/sig-windows-master-release#capz-windows-master have DSR and overlay networking configured.
Graduation Criteria
Alpha
N/A - This feature is already implemented.
Beta
- Test passes on testgrid with WinDSR and WinOverlay enabled on Windows nodes are running regularly.
- Unit tests validating expected behavior for both DSR and overlay networking mode are added.
- For DSR, unit tests validating feature gate is set correctly and that the correct flags are passed to HNS calls will also be added.
GA
2 or more CNI solutions support overlay networking mode for Windows nodes.
- Calico networking on Windows enables WinOverlay feature if the underlying HNS network is of type overlay.
- Flannel networking does the same.
Upgrade / Downgrade Strategy
For DSR, --enable-dsr=true must be passed as a kube-proxy command line switch to enable the functionality.
This means that the upgrade/downgrade strategy is to redeploy kube-proxy with the appropriate configuration.
For overlay networking mode, the entire cluster must be configured for overlay networking, so it is not possible to upgrade / downgrade this functionality on a per-node basis.
Version Skew Strategy
N/A - As long as all nodes are configured for overlay networking mode, there is no version skew strategy required since networking APIs are not changing.
Production Readiness Review Questionnaire
Feature Enablement and Rollback
How can this feature be enabled / disabled in a live cluster?
For DSR support:
- Feature gate (also fill in values in kep.yaml)
- Feature gate name: WinDSR
- Components depending on the feature gate: kube-proxy
- Other
- Describe the mechanism: DSR is enabled by passing --enable-dsr=true as a command line switch to the Windows kube-proxy.
- Will enabling / disabling the feature require downtime of the control plane? No.
- Will enabling / disabling the feature require downtime or reprovisioning of a node? Yes, there will be a brief period where network traffic might not be routed correctly while kube-proxy is restarted. Kube-proxy rules will be re-synced with or without DSR support when kube-proxy starts up. Nodes that handle network traffic should be drained before toggling DSR support.
For overlay networking mode:
- Feature gate (also fill in values in kep.yaml)
- Feature gate name: WinOverlay
- Components depending on the feature gate: kube-proxy
- Other
- Describe the mechanism:
- Will enabling / disabling the feature require downtime of the control plane? Yes and no - The HNS network used by kube-proxy must be re-created with the correct type before starting kube-proxy which can disrupt network traffic but also all nodes in a cluster must use the same network type so it is not possible to switch between overlay and bridge networking on a per-node basis.
- Will enabling / disabling the feature require downtime or reprovisioning of a node? See above.
Does enabling the feature change any default behavior?
No.
For DSR, --enable-dsr=true must be passed as a kube-proxy command line switch to enable the functionality.
For overlay networking support, behavior changes only occur if the HNS network used by kube-proxy is of type Overlay, which would only be done intentionally as part of joining nodes to a cluster.
Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?
For DSR, yes, DSR can be disabled by passing --enable-dsr=false as a kube-proxy command line switch and restarting kube-proxy.
For Overlay, no, overlay networking mode cannot be disabled on a per-node basis. All nodes in a cluster must use the same network type so it is not possible to switch between overlay and bridge networking on a per-node basis.
What happens if we reenable the feature if it was previously rolled back?
For DSR, kube-proxy should resync HNS rules and start using DSR again.
Are there any tests for feature enablement/disablement?
We have periodic test passes running in Prow that use both of these configurations:
- capz-windows-master-containerd2: all of the Windows CAPZ tests use Calico by default.
- ltsc2025-containerd-flannel-sdnoverlay-master: Flannel with overlay networking mode.
For overlay, no, because the feature requires the cluster to be configured for overlay networking mode and cannot be enabled on a per-node basis.
For DSR, unit tests will be added to validate that DSR is enabled and disabled correctly and that the correct flags are passed to HNS calls for each case. These will be required for the feature to move to beta.
Rollout, Upgrade and Rollback Planning
How can a rollout or rollback fail? Can it impact already running workloads?
For DSR a rollout or rollback should not fail. Nodes can operate with DSR enabled or disabled per node in a cluster.
For overlay networking mode support, a rollout can fail if the CNI configuration for the node and kube-proxy configuration are not in sync. This would cause nodes to never go into the Ready state.
What specific metrics should inform a rollback?
Node ready state should be monitored to ensure nodes join the cluster and are properly configured to start running pods.
Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
For DSR support yes, manual verification was done to ensure that DSR can be enabled and disabled on a node.
The steps for the manual validation were as follows:
- Create a cluster with 1 Linux control plane node and 2 Windows worker nodes.
- Deploy a kube-proxy DaemonSet with --feature-gates=WinDSR=true and --enable-dsr=true to Windows worker nodes.
- Deploy IIS (Internet Information Services) on both Windows worker nodes and expose it with a LoadBalancer service.
- Once the service IP became available, test that the service is reachable from each Windows node and from outside of the cluster.
- Redeploy the kube-proxy DaemonSet with --enable-dsr=false to Windows worker nodes.
- Wait for kube-proxy to start and test that the service is still reachable from each Windows node and from outside of the cluster.
- Redeploy the kube-proxy DaemonSet with --enable-dsr=true to Windows worker nodes.
- Wait for kube-proxy to start and test that the service is still reachable from each Windows node and from outside of the cluster.
Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
No
Monitoring Requirements
How can an operator determine if the feature is in use by workloads?
If configured for use, both DSR and overlay networking will be used by any workloads that communicate with other pods/services in the cluster.
How can someone using this feature know that it is working for their instance?
- Events
- Event Reason:
- API .status
- Condition name:
- Other field:
- Other (treat as last resort)
- Details: Pod-to-Pod and Pod-to-Service traffic will not route correctly if the feature is not working.
What are the reasonable SLOs (Service Level Objectives) for the enhancement?
N/A
What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
- Metrics
- Metric name:
- [Optional] Aggregation method:
- Components exposing the metric:
- Other (treat as last resort)
- Details: Monitoring of workload-specific network traffic to ensure that traffic is being routed correctly.
Are there any missing metrics that would be useful to have to improve observability of this feature?
No
Dependencies
Does this feature depend on any specific services running in the cluster?
DNS and CNI solutions must be deployed in the cluster.
Both DSR and overlay networking modes are supported for all patch versions of Windows Server 2022 and Windows Server 2025. DSR requires Windows Server 2019 with May 2020 updates (or later).
Scalability
Will enabling / using this feature result in any new API calls?
No
Will enabling / using this feature result in introducing new API types?
No
Will enabling / using this feature result in any new calls to the cloud provider?
No
Will enabling / using this feature result in increasing size or count of the existing API objects?
No
Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?
No
Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, …) in any components?
Enabling DSR will increase the number of IP addresses in use on each node by 1 for the VIP used to route return traffic.
Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?
No
Troubleshooting
A troubleshooting guide for general Windows networking issues can be found at https://learn.microsoft.com/en-us/troubleshoot/windows-server/software-defined-networking/troubleshoot-windows-server-software-defined-networking-stack
https://github.com/microsoft/SDN/ contains additional scripts that collect detailed information and can help in troubleshooting:
- https://github.com/microsoft/SDN/blob/master/Kubernetes/windows/hns.v2.psm1 is a PowerShell module with cmdlets for inspecting HNS policies and endpoints
- https://github.com/microsoft/SDN/blob/master/Kubernetes/windows/helper.psm1 contains useful helper functions for troubleshooting
- https://github.com/microsoft/SDN/tree/master/Kubernetes/windows/debug contains various PowerShell scripts for enabling tracing, collecting stats and perf counters, starting packet captures, etc.
Troubleshooting issues with Direct Server Return (DSR) on Windows:
- Ensure that the kube-proxy command line switch --enable-dsr=true is set and --feature-gates=WinDSR=true is set.
- Inspect kube-proxy logs for any warnings or errors.
- If everything looks correct, log onto the node and inspect the HNS rules to ensure DSR is enabled for the load balancer rules.
- Log onto the node and use hnsdiag.exe list loadbalancers -d to list all the load balancers and details about their rules. You should see IsDSR:true for load balancer policies proxied by kube-proxy.
- You can use hnsdiag.exe to get detailed information about local networks and endpoints in addition to load balancers.
- If you are still having issues, create an issue at https://github.com/microsoft/windows-containers
Troubleshooting issues with overlay networking mode on Windows:
- Ensure that the CNI solution has either created an HNS network of type Overlay or that instructions provided by the CNI solution have been followed to create the network.
- Ensure that the name of the network created above is passed to kube-proxy with the $Env:KUBE_NETWORK environment variable.
- Check kube-proxy logs for any warnings or errors.
- If everything looks correct, log onto the node and inspect the HNS rules to ensure that the source VIP is being used correctly.
- Log onto the node and use hnsdiag.exe list loadbalancers -d to list all the load balancers and details about their rules. You should see the source VIP being used for load balancer policies proxied by kube-proxy.
- You can use hnsdiag.exe to get detailed information about local networks and endpoints in addition to load balancers.
- If you are still having issues, create an issue at https://github.com/microsoft/windows-containers
How does this feature react if the API server and/or etcd is unavailable?
This feature does not change the functionality of kube-proxy or other Kubernetes components if the API server or etcd is unavailable. Kube-proxy would retain the existing behavior if the API server or etcd is unavailable, which would result in new Pod and Service endpoints not routing correctly on the nodes.
What are other known failure modes?
We have not observed any additional failure modes with DSR or overlay networking mode support on Windows nodes.
What steps should be taken if SLOs are not being met to determine the problem?
See Troubleshooting
Implementation History
- 2019-02-20 - DSR and overlay networking mode support added to Windows kube-proxy (k/k PR #70896)
- 2025-01-28 - KEP #5100 created to document the changes made to Windows kube-proxy to support DSR and overlay networking mode support and provide a path for promoting these features to GA.
Drawbacks
The functionality described in this KEP is already implemented and used by various cloud providers, so there are no drawbacks to implementing it. The drawback of not progressing the features to GA is that this functionality may be removed from kube-proxy in the future, which would leave Windows unable to support some CNI solutions (Calico networking with network policy support) and unable to take advantage of DSR performance optimizations.
Alternatives
This functionality has already merged into k/k so other alternatives have not been considered.