KEP-4958: CSI Sidecars All in one

Implementation History
ALPHA Implementable
Created 2024-11-10
Latest v1.34
Milestones
Alpha v1.34
Beta TBD
Stable TBD

Release Signoff Checklist

Items marked with (R) are required prior to targeting to a milestone / release.

  • (R) Enhancement issue in release milestone, which links to KEP dir in kubernetes/enhancements (not the initial KEP PR)
  • [] (R) KEP approvers have approved the KEP status as implementable
  • (R) Design details are appropriately documented
  • [] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
    • e2e Tests for all Beta API Operations (endpoints)
    • (R) Ensure GA e2e tests meet requirements for Conformance Tests
    • (R) Minimum Two Week Window for GA e2e tests to prove flake free
  • (R) Graduation criteria is in place
  • (R) Production readiness review completed
  • (R) Production readiness review approved
  • “Implementation History” section is up-to-date for milestone
  • User-facing documentation has been created in kubernetes/website , for publication to kubernetes.io
  • Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes

Summary

We propose to combine the source code of the CSI Sidecars in a monorepo. Rather than only placing the code repositories together, we expect to consolidate the program entry points of all sidecars. As a result, we can:

  • Improve the CSI Sidecar release process by reducing the number of components released
  • Decrease the maintenance tasks the SIG Storage community maintainers do to maintain the Sidecars
  • Propagate changes in common libraries used by CSI Sidecars immediately instead of through additional PRs
  • Reduce the number of components CSI Driver authors and cluster administrators need to keep up to date in k8s clusters

As a side effect of combining the CSI Sidecars into a single component, we also:

  • Reduce the memory usage/API Server calls done by the CSI Sidecars through the usage of a shared informer.
  • Reduce the cluster resource requirements needed to run the CSI Sidecars

Motivation

Increased maintenance tasks on components

The SIG Storage community maintains CSI sidecars in separate repositories: external-attacher, external-provisioner, external-resizer, etc. The community must periodically update each of them with Go version bumps, library updates, CVE fixes, CHANGELOG updates, and separate releases, which takes a lot of maintenance effort.

CSI Sidecars releases

The CSI Drivers/CSI Sidecars have an indirect dependency on the k8s version. This could happen because of:

  • A new CSI feature that touches both the CSI Sidecars and k8s components. For example, the ReadWriteOncePod feature needs changes in k8s components (the kube-apiserver, the kube-scheduler, the kubelet) and in the CSI Sidecars

Because of this indirect dependency, the SIG Storage community creates a minor release of each CSI Sidecar for every k8s minor release. We use csi-hostpath (a CSI Driver used for testing purposes) to test the compatibility of the new releases with the latest k8s version.

We follow the instructions in SIDECAR_RELEASE_PROCESS.md in every CSI Sidecar repository to create a minor release.

Maintenance tasks by CSI Driver authors and cluster administrators

CSI driver authors face a continuous maintenance burden. They must constantly track and update their drivers to align with the ever-evolving CSI sidecars, ensuring compatibility, security, and access to new features.

csi driver basic structure

Resource utilization by the CSI Sidecar components

In some CSI Driver control plane deployment setups, each sidecar is configured with a minimum memory request. Some examples of resource allocations in OSS CSI Driver deployments:

  • Memory request
    • EBS CSI Driver
      • In a CP node, sets a 40Mi memory request for each CSI Sidecar (5 sidecars), a total of 200Mi per node
      • In a worker node, sets a 40Mi memory request for each CSI Sidecar (2 sidecars), a total of 80Mi per node
    • Azuredisk
      • In a CP node, sets a 20Mi memory request for each CSI Sidecar (5 sidecars), a total of 100Mi per node
      • In a worker node, sets a 20Mi memory request for each CSI Sidecar (2 sidecars), a total of 40Mi per node
    • AlibabaCloud Disk
      • In a CP node, sets a 16Mi memory request for each CSI Sidecar (average 4 sidecars), a total of 64Mi per node
      • In a worker node, sets a 16Mi memory request for each CSI Sidecar (1 sidecar), a total of 16Mi per node

Running each sidecar as a separate container multiplies the memory request by about 5 on control plane nodes and by about 2 on worker nodes.
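
The per-node totals in the list above are the per-sidecar request multiplied by the number of sidecar containers. A minimal Go sketch of that arithmetic, using only the EBS and Azuredisk figures quoted above (the type and function names are illustrative and not part of any CSI API):

```go
package main

import "fmt"

// sidecarRequest describes one row of the list above.
// The type and its field names are illustrative only.
type sidecarRequest struct {
	driver     string
	node       string
	perSidecar int // memory request per sidecar container, in Mi
	sidecars   int // number of sidecar containers in the pod
}

// totalMi returns the summed memory request of all sidecars in the pod.
func totalMi(r sidecarRequest) int {
	return r.perSidecar * r.sidecars
}

func main() {
	rows := []sidecarRequest{
		{"EBS CSI Driver", "control plane", 40, 5},
		{"EBS CSI Driver", "worker", 40, 2},
		{"Azuredisk", "control plane", 20, 5},
		{"Azuredisk", "worker", 20, 2},
	}
	for _, r := range rows {
		fmt.Printf("%s (%s): %d sidecars x %dMi = %dMi\n",
			r.driver, r.node, r.sidecars, r.perSidecar, totalMi(r))
	}
}
```

With a single combined container, the request would no longer scale with the number of sidecars, although the actual request for the combined binary still has to be measured.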

Goals

  • To combine the source code of common Container Storage Interface (CSI) sidecars, controllers, and webhooks into a single monorepo.

  • From the single repository, produce a single artifact (binary and container image), similar to how kube-controller-manager operates.

    • If we only merge the source code, we won't be able to reuse resources and realize the above advantages
    • To minimize the impact on CSI driver authors and cluster administrators, the migration process is unified into a single step, avoiding separate code and binary migrations. CSI driver authors and cluster administrators can keep building binaries and images with individual sidecars from the single repository. Each CSI driver author can start using the merged sidecar binary.
  • The sidecars in scope are those listed in the Overview of the Proposal section.

  • Retain git history logs of sidecars in new monorepo.

Non-Goals

Proposal

Overview

The proposal consists of creating a monorepo which creates a single artifact with common sidecars combined in one binary:

  • Combine the source code of all common CSI sidecars (external-attacher, external-provisioner, external-resizer, external-snapshotter, livenessprobe, node-driver-registrar), controllers (snapshot controller, volume-health-monitor controller), and webhooks (csi-snapshot-validation-webhook) in a single repository. A total of 7 repositories containing 6 sidecars, 2 controllers, and 1 webhook.
  • Include the source code of the helper utilities (csi-release-tools, csi-lib-utils) in the same repository; sidecars/apps use the local modules through Go workspaces. A total of 1 release helper and 1 Go module.
  • Create a new cmd/ entrypoint that enables sidecars selectively, similar to kube-controller-manager and its --controllers flag.
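
As a rough illustration of the selective entrypoint, the sketch below parses a comma-separated --controllers value and starts only the requested components. It is a sketch under assumptions, not the planned implementation: the registry, the start functions, and selectControllers are hypothetical, and only the flag name and the controller names come from the manifests in this proposal.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// registry maps controller names to start functions. The names follow the
// example manifests; the start functions are placeholders for illustration.
var registry = map[string]func() error{
	"attacher":              func() error { return nil },
	"provisioner":           func() error { return nil },
	"resizer":               func() error { return nil },
	"snapshotter":           func() error { return nil },
	"node-driver-registrar": func() error { return nil },
}

// selectControllers splits a comma-separated --controllers value and rejects
// names that are not present in the registry.
func selectControllers(value string) ([]string, error) {
	var selected []string
	for _, name := range strings.Split(value, ",") {
		name = strings.TrimSpace(name)
		if name == "" {
			continue
		}
		if _, ok := registry[name]; !ok {
			return nil, fmt.Errorf("unknown controller %q", name)
		}
		selected = append(selected, name)
	}
	if len(selected) == 0 {
		return nil, fmt.Errorf("no controllers selected")
	}
	return selected, nil
}

func main() {
	fs := flag.NewFlagSet("csi-sidecars", flag.ContinueOnError)
	controllers := fs.String("controllers", "", "comma-separated list of controllers to enable")
	// Sample arguments; a real binary would parse os.Args[1:].
	if err := fs.Parse([]string{"--controllers=attacher,provisioner"}); err != nil {
		fmt.Println(err)
		return
	}
	names, err := selectControllers(*controllers)
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, name := range names {
		if err := registry[name](); err != nil {
			fmt.Printf("%s failed: %v\n", name, err)
			continue
		}
		fmt.Printf("started %s\n", name)
	}
}
```

Rejecting unknown names early keeps a typo in the manifest from silently disabling a component.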

csi aio structure state

CSI Driver authors would include a single sidecar in their deployments (in both the control plane and the node pools). While the artifact version is the same, the command and arguments will differ.

pictures: desired aio component structure

The CSI Driver deployment manifest would look like this in the control plane:

kind: Deployment
apiVersion: apps/v1
metadata:
  name: csi-driver-deployment
spec:
  replicas: 1
  template:
    spec:
      containers:
        - name: csi-driver
          args:
            - "--v=5"
            - "--endpoint=unix:/csi/csi.sock"
        - name: csi-sidecars
          command:
            - csi-sidecars
            - "--csi-address=unix:/csi/csi.sock"
            # similar style as kube-controller-manager
            - "--controllers=attacher,provisioner,resizer,snapshotter"
            - "--feature-gates=Topology=true"
            # leader election flags for all the components as one
            - "--leader-election"
            - "--leader-election-namespace=kube-system"
            # global timeouts
            - "--timeout=30s"
            # per controller specific flags are prefixed with the component name
            - "--attacher-timeout=30s"
            - "--attacher-worker-thread=100"
            - "--provisioner-timeout=30s"
          volumeMounts:
            - mountPath: /csi
              name: socket-dir

The CSI Driver deployment manifest would look like this in the worker node:

kind: DaemonSet
apiVersion: apps/v1
metadata:
  name: csi-driver-deployment
spec:
  template:
    spec:
      containers:
        - name: csi-driver
          args:
            - "--v=5"
            - "--endpoint=unix:/csi/csi.sock"
        - name: csi-sidecars
          command:
            - csi-sidecars
            - "--csi-address=unix:/csi/csi.sock"
            # similar style as kube-controller-manager
            - "--controllers=node-driver-registrar"
            - "--kubelet-registration-path=/var/lib/kubelet/plugins/<csi-driver>/csi.sock"
          volumeMounts:
            - name: registration-dir
              mountPath: /registration
            - name: plugin-dir
              mountPath: /csi
      volumes:
        - name: registration-dir
          hostPath:
            path: /var/lib/kubelet/plugins_registry/
            type: Directory
        - name: plugin-dir
          hostPath:
            path: /var/lib/kubelet/plugins/<csi-driver>/
            type: DirectoryOrCreate

Quantifiable characteristics of the current state and of the proposed state

| Characteristics/State | Current state of CSI Sidecars (let nr. of csi-sidecars = 6) | CSI Sidecars in a single component |
| --- | --- | --- |
| Human effort of propagating csi-release-tools | nr. of csi-release-tools changes * nr. of csi-sidecars | 0 (because csi-release-tools is part of the repo) |
| Human effort of propagating csi-lib-utils | nr. of csi-lib-utils changes * nr. of csi-sidecars | 0 (because csi-lib-utils is part of the repo) |
| go mod dependency bumps | (nr. of dependency changes * nr. of csi-sidecars) * CSI releases supported (unknown) | nr. of dependency changes * releases supported (follow k8s release) |
| Runtime update | nr. of csi-release-tools changes related to go runtime updates * nr. of csi-sidecars | nr. of go runtime updates |
| Members of CSI releases per k8s minor release | nr. of csi-sidecars | 1 |
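
The formulas in the comparison above can be evaluated for concrete numbers. The Go sketch below does so for the helper propagation rows and the releases row; the number of helper changes is an assumed value chosen purely for illustration, and the function names are hypothetical.

```go
package main

import "fmt"

// propagationPRs counts the follow-up PRs needed to roll `changes` updates of
// a shared helper (csi-release-tools or csi-lib-utils) out to its consumers.
func propagationPRs(changes, sidecarRepos int, monorepo bool) int {
	if monorepo {
		return 0 // the helper is part of the same repository
	}
	return changes * sidecarRepos
}

// releasesPerMinor counts the sidecar releases cut per k8s minor release.
func releasesPerMinor(sidecarRepos int, monorepo bool) int {
	if monorepo {
		return 1
	}
	return sidecarRepos
}

func main() {
	const sidecars = 6 // as assumed in the comparison above
	const changes = 4  // assumed number of helper changes, illustration only

	fmt.Println("propagation PRs, current state: ", propagationPRs(changes, sidecars, false))
	fmt.Println("propagation PRs, single component:", propagationPRs(changes, sidecars, true))
	fmt.Println("releases per minor, current state: ", releasesPerMinor(sidecars, false))
	fmt.Println("releases per minor, single component:", releasesPerMinor(sidecars, true))
}
```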

Additional properties of a single CSI Sidecar component without a quantifiable benefit:

Releases
  • Pros
    • Easier releases
    • Better definition of which sidecar releases are supported for CVE fixes, i.e. if our support model is similar to that of k8s (last 3 releases), then the same applies to the CSI sidecar releases
    • Release notes in csi-release-tools become part of the release once csi-release-tools is part of the repo. Currently, commits in csi-release-tools with release notes get lost because the git subtree commands replay the commits but lose the PR release notes
  • Cons
    • No longer able to do single releases per component.
    • More frequent major version bumps. Currently, we increase the major version of a sidecar when we remove a command line parameter or require new RBAC rules. We ended up with provisioner v5, attacher v4, and snapshotter v8. With a common repo, we would end up with 5+4+8=v17 in the worst case.

Testability
  • Pros
    • Easier testing
    • Test features that span multiple components, e.g. the RWOP feature can be tested as a whole. @pohly

Performance & Reliability
  • Pros
    • Can use a shared informer, decreasing the load on the API server. @msau42
  • Cons
    • A container getting OOMKilled kills the entire CSI machinery, not just a single component.
      • In HA, another replica would take over in a few seconds.

Simplicity
  • Pros
    • Consolidation of common parameters like leader election and structured logging
    • Combination of metrics/health ports @msau42
    • Enables using additional sidecars that aren't used today because of the additional build pipelines that might be needed to support them.
  • Cons
    • Logs would be interleaved, making it harder to trace what happened for a request
    • CSI utility libraries are used not only by the CSI Sidecars but also by other projects.
      • Make an external repo which is automatically synchronized from the internal csi-release-tools, e.g. a similar analogy to k/k/staging/lib -> k/lib

Integration with CSI Drivers
  • Pros
    • Less config in the controller/node yaml manifest
    • Less confusion for CSI Driver authors on which CSI Sidecar versions to use @msau42
  • Cons
    • Complex configuration for the single CSI Sidecar component
    • Difficulty expressing per CSI Sidecar configuration, e.g. kube-api-qps, kube-api-burst
      • Global flag, overridden through a CSI sidecar flag, e.g. kube-api-qps -> attacher-kube-api-qps

    User Stories (Optional)

    Notes/Constraints/Caveats (Optional)

    Design Details

    Glossary

    • Individual repository: An existing repository in the kubernetes-csi/ org on GitHub, e.g. the external-attacher repository.
    • AIO monorepo (monorepo): The monolithic repository where most of the code of the CSI Sidecars will be migrated.
    • Monorepo component: The source code of an individual repository that is currently being migrated or already migrated to the monorepo.
    • AIO Sidecar Image: The All-In-One sidecar container image built from the monorepo.
    • repository root path: The portion of a module path that corresponds to a version control repository’s root directory.

    AIO Monorepo

    Release Management

    Use Semantic Versioning in monorepo.

    Alternatives

    RBAC policy

    We designed the AIO monorepo's RBAC policy to mirror that of the individual repos, where each controller maintains its own policy. Driver maintainers should apply the appropriate RBAC rules when enabling specific controllers in the AIO sidecar (see the linked discussion for more information).

    We plan to combine the informer caches of different controllers in the future.

    Command Line

    Command-line flags are divided into two types: generic flags, whose configuration is common to all controllers and is set only once, and controller-specific flags, whose configuration differs per controller. Each controller-specific flag gets a new unique name, prefixed with the controller name.

            - name: csi-sidecars
              command:
                - csi-sidecars
                - "--csi-address=unix:/csi/csi.sock"
                # similar style as kube-controller-manager
                - "--controllers=attacher,provisioner,resizer,snapshotter"
                - "--feature-gates=Topology=true"
                # leader election flags for all the components as one
                - "--leader-election"
                - "--leader-election-namespace=kube-system"
                # global timeouts
                - "--timeout=30s"
                # per controller specific flags are prefixed with the component name
                - "--attacher-timeout=30s"
                - "--attacher-worker-thread=100"
                - "--provisioner-timeout=30s"
    

    example PR: https://github.com/kubernetes-csi/external-attacher/pull/620
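
One possible way to resolve a global flag against a controller-specific override is sketched below with the standard flag package. This is an assumption about how the lookup could work, not the design in the example PR: parseTimeouts and the 15s default are hypothetical, while the --timeout and --attacher-timeout flag names come from the manifest above.

```go
package main

import (
	"flag"
	"fmt"
	"time"
)

// parseTimeouts registers a global --timeout flag plus one prefixed
// --<controller>-timeout flag per controller, and returns the effective
// timeout for each controller.
func parseTimeouts(args, controllers []string) (map[string]time.Duration, error) {
	fs := flag.NewFlagSet("csi-sidecars", flag.ContinueOnError)
	// The 15s default is an assumption made for this sketch.
	global := fs.Duration("timeout", 15*time.Second, "global timeout")
	specific := map[string]*time.Duration{}
	for _, c := range controllers {
		specific[c] = fs.Duration(c+"-timeout", 0, "timeout override for "+c)
	}
	if err := fs.Parse(args); err != nil {
		return nil, err
	}

	// Visit only reports flags that were set explicitly on the command line.
	explicit := map[string]bool{}
	fs.Visit(func(f *flag.Flag) { explicit[f.Name] = true })

	effective := map[string]time.Duration{}
	for _, c := range controllers {
		if explicit[c+"-timeout"] {
			effective[c] = *specific[c] // controller-specific override wins
		} else {
			effective[c] = *global // fall back to the global value
		}
	}
	return effective, nil
}

func main() {
	controllers := []string{"attacher", "provisioner"}
	args := []string{"--timeout=30s", "--attacher-timeout=45s"}
	effective, err := parseTimeouts(args, controllers)
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, c := range controllers {
		fmt.Printf("%s timeout: %s\n", c, effective[c])
	}
}
```

Checking whether a flag was set explicitly, rather than comparing against a sentinel value, lets a controller-specific flag legitimately carry any value, including its own default.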

    Code synchronization

    During the transition phase (before the individual repositories are fully deprecated), code changes (especially bug fixes and CVE patches) need to be synchronized from the individual repositories into the AIO MonoRepo.

    This process will be automated using shell scripts. The sync script may also perform necessary adjustments (such as import path updates, if the dependency strategy requires them).
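
The proposal plans shell scripts for the synchronization itself. Purely to illustrate what an import path adjustment could involve, the Go sketch below rewrites imports under one module prefix to another using the standard go/parser and go/format packages. Both prefixes and the function names are hypothetical; the actual paths depend on the dependency strategy.

```go
package main

import (
	"bytes"
	"fmt"
	"go/format"
	"go/parser"
	"go/token"
	"strconv"
	"strings"
)

// rewritePath maps an import path under oldPrefix to the same path under
// newPrefix. It reports false when the path is outside oldPrefix.
func rewritePath(path, oldPrefix, newPrefix string) (string, bool) {
	if path == oldPrefix {
		return newPrefix, true
	}
	if strings.HasPrefix(path, oldPrefix+"/") {
		return newPrefix + strings.TrimPrefix(path, oldPrefix), true
	}
	return path, false
}

// rewriteImports parses one Go source file and rewrites matching imports.
func rewriteImports(src, oldPrefix, newPrefix string) (string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "", src, parser.ParseComments)
	if err != nil {
		return "", err
	}
	for _, imp := range file.Imports {
		path, err := strconv.Unquote(imp.Path.Value)
		if err != nil {
			return "", err
		}
		if updated, ok := rewritePath(path, oldPrefix, newPrefix); ok {
			imp.Path.Value = strconv.Quote(updated)
		}
	}
	var out bytes.Buffer
	if err := format.Node(&out, fset, file); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	src := `package demo

import _ "github.com/kubernetes-csi/external-attacher/v4/pkg/attacher"
`
	// Both prefixes are assumptions chosen for illustration.
	out, err := rewriteImports(src,
		"github.com/kubernetes-csi/external-attacher/v4",
		"github.com/kubernetes-csi/csi-sidecars/external-attacher")
	if err != nil {
		fmt.Println("rewrite failed:", err)
		return
	}
	fmt.Print(out)
}
```

Matching on the prefix followed by a slash avoids rewriting unrelated modules whose names merely start with the same string.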

    Individual repo history

    The Git history from each individual repository must be preserved during the consolidation into the AIO MonoRepo.

    This is critical for traceability. It allows developers investigating bugs or changes in the MonoRepo to easily track the origin of the code back to its specific commit in the individual repository’s history using standard Git tooling (git blame, git log).

    This will likely be achieved using Git strategies designed for repository merging, such as careful merge commits or git replace (including git replace --graft) during the initial import phase, ensuring commit hashes remain discoverable. Tooling will be developed to aid this process.

    Reproducible builds & Dependencies Management

    To keep builds of the MonoRepo reproducible when syncing code from the individual repositories, it is critical to enforce consistent dependency versions across all MonoRepo components, avoiding discrepancies that could break builds or introduce compatibility issues.

    To simplify dependency management, including ensuring reproducible builds, we will adopt a single go.mod and go.work file at the root directory. Nested, imported repositories will not have their own go.mod files.

    Alternatives

    Future Sidecar Integration

    • Eligibility: Any sidecar that achieves General Availability (GA) is eligible for integration into the AIO monorepo.
    • Process: We will provide clear documentation outlining the steps for integration. While only GA projects are eligible for final integration, we encourage owners of pre-GA sidecars to follow these instructions as well. This will ensure your project is properly prepared for a smooth integration once it reaches GA status.
    • Ownership: The original developers will retain full ownership and control of their sidecar project.

    Risks And Mitigations

    Development workflow

    MileStone

    overview

    workflow1:

    Milestone-modify-entrypoints-of-existing-sidecars-to-integrate-it-seamlessly-with-the-AIO-sidecar

    Objective: Refactor the CSI Sidecar entrypoint (e.g. cmd/external-attacher/main.go) so that it also exposes a public function that can be reused from both the existing cmd/external-attacher/main.go and from the AIO Sidecar main.

    Tasks:

    1. For {external-attacher, external-provisioner, …} split the main function
    2. For {external-attacher, external-provisioner, …} add per sidecar specific flags
    3. Introduce the concept of global flags in the AIO sidecar
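
A minimal sketch of what the entrypoint split could look like, assuming a hypothetical AttacherOptions struct and RunAttacher function (neither name is taken from the external-attacher code base, and the 15s default is assumed). The same exported function is called from a standalone flag set and from a prefixed, AIO-style flag set:

```go
package main

import (
	"context"
	"flag"
	"fmt"
	"time"
)

// AttacherOptions is a hypothetical options struct for one sidecar.
type AttacherOptions struct {
	Timeout time.Duration
}

// AddFlags registers the sidecar's flags on fs. The standalone binary passes
// an empty prefix; the AIO binary passes "attacher-".
func (o *AttacherOptions) AddFlags(fs *flag.FlagSet, prefix string) {
	fs.DurationVar(&o.Timeout, prefix+"timeout", 15*time.Second, "timeout for CSI calls")
}

// RunAttacher stands in for the logic that used to live in main().
// A real implementation would start the controller and block until ctx ends.
func RunAttacher(ctx context.Context, o *AttacherOptions) error {
	if o.Timeout <= 0 {
		return fmt.Errorf("timeout must be positive, got %s", o.Timeout)
	}
	fmt.Printf("attacher running with timeout=%s\n", o.Timeout)
	return nil
}

// runWith parses args using the given flag prefix and calls RunAttacher.
func runWith(name, prefix string, args []string) error {
	opts := &AttacherOptions{}
	fs := flag.NewFlagSet(name, flag.ContinueOnError)
	opts.AddFlags(fs, prefix)
	if err := fs.Parse(args); err != nil {
		return err
	}
	return RunAttacher(context.Background(), opts)
}

func main() {
	// What cmd/external-attacher/main.go could reduce to.
	if err := runWith("external-attacher", "", []string{"--timeout=30s"}); err != nil {
		fmt.Println(err)
	}
	// What the AIO entrypoint could do for the same component.
	if err := runWith("csi-sidecars", "attacher-", []string{"--attacher-timeout=30s"}); err != nil {
		fmt.Println(err)
	}
}
```

Keeping flag registration separate from the run function is what allows the existing binary to keep its current flag names while the AIO binary adds the component prefix.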

    workflow2:

    Milestone-setting-up-a-Kubernetes-CSI-Storage-Repository-with-nested-directory-synchronization

    Objective: Create a new repository and mirror the nested directories of the existing sidecars to the new repository.

    Tasks:

    1. Create kubernetes-csi/csi-sidecars repository
    2. Mirror the nested directories of all seven sidecar repositories to the new repository.
    3. Add a README.md to the new repository.

    Milestone-Build-the-project-using-a-modified-copy-of-release-tools

    Objective: Use the release tools to build the project into AIO Sidecar images

    Tasks:

    1. Add new release logic to the release tools to support the AIO monorepo and the individual repos at the same time
    2. Build the project into AIO Sidecar image with the release tools

    Milestone-set-up-new-test-infra-jobs-to-test-the-project-through-the-hostpath-CSI-Driver

    Objective: Ensure the AIO MonoRepo is testable using existing e2e tests and new CI infrastructure.

    Tasks:

    1. Modify the test infra jobs to support the new repository
    2. Validate prow jobs against the new repo
    3. Set up GitHub Actions to trigger tests for every new PR, including all the e2e tests of the individual repos

    Milestone-ready-to-accept-PR-from-community

    Objective: Once individual repositories enter the FeatureFreeze state, the monorepo will be open to accept PRs from contributors of those repositories.

    Tasks:

    1. Set up GitHub Actions (unit, golangci, etc.) in the new monorepo
    2. Create CONTRIBUTING.md guidelines specific to the MonoRepo.

    workflow3

    Milestone-define-the-path-for-2-CSI-Drivers-to-be-migrated.

    Objective: Develop detailed migration steps/examples for at least two representative CSI drivers.

    Milestone: Have instructions for CSI Driver authors

    Objective: Inform and guide CSI driver maintainers on how to adopt the new AIO sidecar model.

    Tasks:

    1. Socialize the KEP, document the migration process clearly.

    Milestone-three-cloud-vendors-start-using-the-monorepo-component-for-multi-k8s-minor-releases

    Objective: 3 CSI Drivers using the AIO sidecar for 3 consecutive k8s minor releases.

    Tasks:

    1. Use the provided migration instructions.
    2. Identify and support 3 cloud vendors using the AIO sidecar image in production across 3 consecutive Kubernetes minor releases

    Milestone-accept-PR-from-community

    Objective: Transition development fully to the MonoRepo as individual repositories freeze.

    Tasks:

    1. Mark external-provisioner as being in the FeatureFreeze state
    2. Accept PRs for the external-provisioner Monorepo component
    3. Mark external-attacher as being in the FeatureFreeze state
    4. Accept PRs for the external-attacher Monorepo component
    5. ….

    workflow4

    milestone-all-individual-repo-has-been-into-featurefreeze-state

    Objective: Systematically stop new feature development in individual repositories.

    Tasks:

    1. Announce FeatureFreeze dates per individual repo
    2. Coordinate with maintainers to stop merging feature PRs in the individual repo
    3. Merge pending PRs in the specific individual repo
    4. Formally mark the individual repo as feature-frozen.
    5. Repeat sequentially for all individual repos.

    Milestone-all-individual-repo-has-been-into-deprecated-state

    Objective: To gracefully deprecate each individual repository while maintaining clear communication with its users and contributors, ensuring a smooth transition to the monorepo.

    Tasks:

    1. Write a deprecation notice in the specific individual repo
    2. Create a release in the individual repo and mark it as deprecated
    3. Notify key contributors and users of the planned deprecation through mailing lists
    4. Assist users in transitioning to the monorepo through issues or Slack.
    5. Repeat sequentially for all repos.

    Milestone-merge-sidecar-informer-caches

    Objective: To merge the sidecar informer caches, which will allow us to use cache more efficiently.

    This is a nice improvement that shouldn't be part of the MVP yet. It will happen after all of the CSI sidecars have been deprecated or migrated to the monorepo, and we will start it in another KEP.

    Questions and Answers

    Project Goal and Motivation

    Q: Will the snapshot-controller be part of the main AIO sidecar binary?

    A: No. While the code for the snapshot-controller will be in the monorepo, it will be built as a separate binary and image. This is because it’s not a true sidecar and is not deployed with the CSI driver. Including it in the same binary was considered an anti-pattern.

    Q: How will shared resources like leader election and Kubernetes API informers be handled?

    A: The single binary will use a shared leader election mechanism and a shared informer for Kubernetes resources. This is a key benefit of the proposal, as it improves performance and reduces resource consumption.

    Development and Release Process

    Q: What is the strategy for developing new features?

    A: After a “hard cut” date, all new feature development will take place exclusively in the monorepo. This approach avoids the complexity of trying to maintain features in parallel across both the new monorepo and the old individual repositories.

    Test Plan

    Prerequisite testing updates
    Unit tests
    Integration tests
    e2e tests

    Graduation Criteria

    AIO MonoRepo state definition

    • Design: The initial planning and definition phase (current state described in documentation).
    • Alpha : Technical feasibility established. All seven sidecar repositories’ code has been integrated into the MonoRepo structure, and all original end-to-end (e2e) tests from the individual repositories pass successfully within the MonoRepo’s test infrastructure.
    • Beta (production-verified): The MonoRepo is considered stable enough for early adoption by cloud vendors in production environments. Clear migration paths for CSI drivers are defined and documented.
    • GA (released): The MonoRepo is actively maintained and accepts contributions (PRs) from the SIG Storage developer community. Development focus shifts from individual repositories to the MonoRepo components. Requires adoption and use in production by at least three cloud vendors.
    • Standalone : Final state. The AIO MonoRepo is the source of truth. Code synchronization from individual repositories is no longer necessary as they are all deprecated.

    Beta graduation would be at least 2 CSI Drivers using the AIO sidecar for at least 2 consecutive k8s minor releases. GA graduation would be at least 2 CSI Drivers using the AIO sidecar for 3 consecutive k8s minor releases.

    Individual repository state definition

    • Released: current state of individual repos
    • FeatureFreeze:
      • New feature PRs may not be filed against the master branch or the release-X branches (enforced by the individual repo maintainers, who categorize each PR and reject it if it is a feature)
      • SIG Storage developers file feature PRs against the AIO MonoRepo
      • The exception is serious bugfix or CVE fix PRs (only from individual repo maintainers), which can be merged into master and backported to the release-X branches
    • Deprecated:
      • Active maintenance stops.
      • Eventually, building new images from this repository ceases (dependent on the full migration of all sidecars).
      • (Future) Archive the repository, but not at the same time as the deprecation; archiving is a terminal state, so we can't undo it
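
The state definitions above can be read as a small one-way state machine. The Go sketch below models it under that reading; the type names, the Archived state as a separate value, and the acceptsPR rule are an interpretation of the list above rather than tooling the proposal defines.

```go
package main

import "fmt"

// RepoState is the lifecycle state of an individual repository.
type RepoState int

const (
	Released RepoState = iota
	FeatureFreeze
	Deprecated
	Archived // future terminal state
)

func (s RepoState) String() string {
	return [...]string{"Released", "FeatureFreeze", "Deprecated", "Archived"}[s]
}

// PRKind distinguishes feature PRs from serious bugfix or CVE fix PRs.
type PRKind int

const (
	Feature PRKind = iota
	SeriousFix
)

// next returns the state that follows s. Transitions only move forward.
func next(s RepoState) (RepoState, error) {
	if s == Archived {
		return s, fmt.Errorf("%s is a terminal state", s)
	}
	return s + 1, nil
}

// acceptsPR reports whether an individual repo in state s accepts the PR kind.
func acceptsPR(s RepoState, k PRKind) bool {
	switch s {
	case Released:
		return true
	case FeatureFreeze:
		return k == SeriousFix // features go to the AIO MonoRepo instead
	default:
		return false // active maintenance has stopped
	}
}

func main() {
	s := Released
	for {
		fmt.Printf("%-13s feature PRs: %-5t serious fixes: %t\n",
			s, acceptsPR(s, Feature), acceptsPR(s, SeriousFix))
		n, err := next(s)
		if err != nil {
			break
		}
		s = n
	}
}
```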

    state change

    Migration Process

    The migration follows a phased approach:

    • Foundation & Setup: Create the new AIO MonoRepo, mirror the code (preserving history), adapt build/release tooling, and establish comprehensive test infrastructure (unit, integration, e2e, CI/GitHub Actions).
    • Integration: Refactor the entry points (main.go) of each individual repository into callable functions, enabling them to run both standalone (for backward compatibility tests) and as part of the unified AIO binary, and introduce global and component-specific flags.
    • Adoption & Community Transition: Provide clear documentation and migration guidance for CSI driver authors. Engage with cloud vendors to test and adopt the AIO sidecar image in production (Beta -> GA trigger). Open the MonoRepo for community contributions as individual repositories enter FeatureFreeze.
    • Individual Repository Phase-Out: Sequentially transition each individual repository into FeatureFreeze, followed by Deprecated status, communicating clearly with users and maintainers.
    • Finalization: Once all individual repositories are deprecated, the AIO MonoRepo reaches the Standalone state.

    migration process

    Upgrade / Downgrade Strategy

    The switchover is relatively simple because it does not involve a gradual upgrade of the Kubernetes control plane or data plane components. Only the YAML manifests and images of the CSI components need to be upgraded, and a rollback is done directly through kubectl rollout.

    Version Skew Strategy

    Nothing in particular.

    Production Readiness Review Questionnaire

    Feature Enablement and Rollback

    How can this feature be enabled / disabled in a live cluster?

    It's actually not a feature, but we can enable it by deploying the new version of the CSI driver, and disable it by deleting the new version and redeploying the old version.

    Does enabling the feature change any default behavior?

    This won’t make any changes to the default behavior of Kubernetes.

    Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?

    It's not really a feature but an architectural change, so users can deploy the old version of the CSI driver to disable it.

    What happens if we reenable the feature if it was previously rolled back?

    Nothing special happens; it behaves as usual.

    Are there any tests for feature enablement/disablement?

    Yes. We will add unit tests with and without the feature gate enabled.

    Rollout, Upgrade and Rollback Planning

    How can a rollout or rollback fail? Can it impact already running workloads?

    No, it will not impact already running workloads.

    What specific metrics should inform a rollback?

    Operators should watch for PVC/PV events and pod events that report persistent external storage failures.

    Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
    Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?

    No, it does not.

    Monitoring Requirements

    How can an operator determine if the feature is in use by workloads?

    Determine whether the csi-provisioner deployment includes an AIO Sidecar image by inspecting its container configuration.

    How can someone using this feature know that it is working for their instance?

    By confirming that their CSI plugin is working correctly.

    What are the reasonable SLOs (Service Level Objectives) for the enhancement?
    What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?

    The SLI for the CSI AIO project is based on individual sidecars’ SLIs, with metrics exposed by csi-lib-utils

    • Metrics
      • Metric name: operationsLatencyMetricName
      • Components exposing the metric: Monorepo component
    Are there any missing metrics that would be useful to have to improve observability of this feature?

    Nothing in particular.

    Dependencies

    Does this feature depend on any specific services running in the cluster?

    No.

    Scalability

    Will enabling / using this feature result in any new API calls?

    No, it doesn't increase the number of API calls. In fact, it will decrease it.

    Will enabling / using this feature result in introducing new API types?

    Nope

    Will enabling / using this feature result in any new calls to the cloud provider?

    Nope

    Will enabling / using this feature result in increasing size or count of the existing API objects?

    Nope

    Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?

    Nope

    Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, …) in any components?

    It will reduce disk and memory usage, because the sidecar images and the informer caches of the CSI driver are merged.

    Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?

    Nope

    Troubleshooting

    How does this feature react if the API server and/or etcd is unavailable?
    What are other known failure modes?
    What steps should be taken if SLOs are not being met to determine the problem?

    Implementation History

    Drawbacks

    Alternatives

    ReleaseManagement

    We considered switching from semantic versioning to k8s versioning. There are some pros and cons.

    pros:

    • We don’t need to reinvent the wheel about what our dev process is going to look like, we follow the same docs as k8s https://kubernetes.io/releases/release/ . This is tried and tested for many releases
    • Cluster administrators would know which version to use to match their CSI Driver deployment e.g. for a k8s 1.27 cluster they’d use the 1.27 release of the CSI Sidecar.

    cons:

    • Breaking changes might happen in a minor release. Cluster administrators MUST read the sidecar release notes for breaking changes before working on a big release.
    • Version skew scenario becomes confusing for the cluster administrator e.g. they deploy the CSI Sidecars v1.x, cluster is upgraded to v1.{x+3} (CP upgrade first, NP later), nodepools would have CSI sidecar at v1.{x+3} with kubelet at v1.x
    • k/k at 1.27.5 - CSI 1.27.0 or (different mapping still)

    After investigation, we found no clear advantage in switching to k8s versioning, so we are going to stick with semantic versioning.

    ReproducibleBuilds

    1. Using Go Workspaces (introduced in Go 1.18)

    Using go work init and go work sync to manage multiple go.mod files within the MonoRepo.

    2. Single Root

    Removing monorepo component level go.mod/go.sum files and managing all dependencies via a single go.mod/go.sum at the repository root path.

    Conclusion: To simplify dependency management, including ensuring reproducible builds, we will adopt a single go.mod and go.work file at the root directory. Nested, imported repositories will not have their own go.mod files.

    Infrastructure Needed (Optional)