KEP-5339: ClusterProfile credentials plugin

Implementation History
ALPHA Provisional
Created 2025-05-23
Latest v1.34
Milestones
Alpha v1.34

KEP-5339: Plugin for Credentials in ClusterProfile

Release Signoff Checklist

Items marked with (R) are required prior to targeting to a milestone / release.

  • (R) Enhancement issue in release milestone, which links to KEP dir in kubernetes/enhancements (not the initial KEP PR)
  • (R) KEP approvers have approved the KEP status as implementable
  • (R) Design details are appropriately documented
  • (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
    • e2e Tests for all Beta API Operations (endpoints)
    • (R) Ensure GA e2e tests meet requirements for Conformance Tests
    • (R) Minimum Two Week Window for GA e2e tests to prove flake free
  • (R) Graduation criteria is in place
  • (R) Production readiness review completed
  • (R) Production readiness review approved
  • “Implementation History” section is up-to-date for milestone
  • User-facing documentation has been created in kubernetes/website , for publication to kubernetes.io
  • Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes

Summary

To manage an Inventory of Clusters, a platform admin can rely on having the cluster manager output ClusterProfile CRs that point to the clusters. Those CRs are key for multicluster controllers that want to operate on the clusters. However, there isn’t a single way to obtain credentials to reach those clusters. This KEP provides a standardized way to obtain credentials for Clusters when using ClusterProfile and makes it pluggable to allow the diverse ecosystem to support the multitude of ways to obtain credentials. It also reuses part of the Kubeconfig external provider semantics to make implementation easier.

Motivation

ClusterInventory is incomplete without a way to use the clusters it lists, and controller writers have been very explicit that credentials are needed. Previous attempts at addressing credentials have failed, and we believe that a plugin model that also reuses known flows will help solve the “credentials” need for ClusterProfiles.

See introduction slides

Goals

  • Provide a library for controllers to obtain credentials for a cluster represented by a ClusterProfile
  • Allow cluster managers to provide a method to obtain credentials that doesn’t need to be embedded into the controller code and doesn’t require recompiling it.
  • Be a secure mechanism for obtaining and storing credentials.

Non-Goals

  • Define the mechanism for shipping plugins to be used by the controllers and their delivery in the controller image/pod.
  • Design a plugin or a library for plugins
  • Mandate Federated workload identity / OIDC frameworks (though they are recommended)

Proposal

The proposed approach to obtaining credentials is to leverage plugins that retrieve the credentials from an issuer recognized by the target cluster. The controller using ClusterProfile would use a library to run a local executable, which would retrieve the credentials for the current controller and a given ClusterProfile. Plugins are expected to leverage elements local to the controller (environment variables, config files, KSA, the local IP, etc.) to help assert the identity of the controller and retrieve credentials that are valid on the target cluster. Plugins would be exec’ed by the controller so that they don’t need to be built into the binary, allowing flexibility in writing custom credential plugins while still leveraging multicluster controllers written by the community. In addition, we propose to reuse the exec approach and protocol used for external credentials in Kubeconfig (but not the configuration part of kubeconfig). Finally, in order to retrieve the endpoint for the cluster, we standardize the property names that are used in ClusterProfile.

Risks and Mitigations

Because of its interaction with authentication and credentials, particular attention in this design must be paid to security:

  • Credentials leak: ClusterProfile, Controller configuration and Plugin configuration should never contain sensitive information.
  • Plugin poisoning: supporting credentials provider plugins in a controller relies on trusting the plugin itself and its path in the filesystem. The user deploying a controller must pay particular attention to make sure that the plugins they install come from a trusted source, as the plugins will have access to the controller’s identity. In addition, the path of the plugin may be edited or hijacked by an attacker, whose binary would then run in lieu of the normal plugin, allowing process execution by the controller’s process. This risk is mitigated by the assumption that the pod’s filesystem is private to it and that no lower-privileged (or separate) processes are able to access it.

Another risk is around AuthZ. This design covers neither the distribution of RBAC to multiple clusters nor the question of which principal a controller is identified as. This setup is currently left to the platform admin who sets up the different clusters and controllers.

Design Details

The proposal’s implementation would be done via a Library in https://github.com/kubernetes-sigs/cluster-inventory-api . The library would be in golang. The library is provided as the community shared implementation for golang and it is possible that other implementations would be created, and would work with the same plugin mechanism defined here, allowing for reuse of the external providers that cluster managers write.

The prototype exposed to a controller is expected to be the following:

func (c *ClusterProfileExternalProviders) GetConfig(cp *ClusterProfile) (*rest.Config, error)

The library implementation flow is expected to be as follows:

  1. Build the endpoint details of the cluster by reading properties of the ClusterProfile
  2. Call the CredentialsExternalProviders, following the same flow defined in KEP 541 (giving the ability to reuse the code in client-go’s exec package )
  3. If the Cluster includes an extensions entry named client.authentication.k8s.io/exec, pass its extension object through to ExecCredential.Spec.Cluster.Config as plugin configuration.
  4. Build the rest.Config and return it to the caller
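
The flow above can be sketched with the standard library only. This is a minimal illustration under stated assumptions, not the library's implementation: `Config` stands in for `rest.Config`, the other types are trimmed stand-ins for the structures described later in this KEP, and the rule "use the first access provider for which a plugin is configured" is an assumption made for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// Cluster is a trimmed, hypothetical stand-in for the exec Cluster structure.
type Cluster struct {
	Server                   string
	CertificateAuthorityData []byte
}

// AccessConfig mirrors one entry of status.accessProviders.
type AccessConfig struct {
	Name    string
	Cluster *Cluster
}

// ClusterProfile carries only the fields this sketch reads.
type ClusterProfile struct {
	Name            string
	AccessProviders []AccessConfig
}

// Provider maps an access provider type to a local plugin binary.
type Provider struct {
	AccessType     string
	ExecutablePath string
	Args           []string
}

// Config is a stand-in for rest.Config: endpoint details plus the plugin to exec.
type Config struct {
	Host    string
	CAData  []byte
	Command string
	Args    []string
}

// BuildConfig picks the first access provider of the profile for which the
// controller has a plugin configured and assembles the connection settings.
func BuildConfig(cp *ClusterProfile, providers []Provider) (*Config, error) {
	if cp == nil {
		return nil, errors.New("nil ClusterProfile")
	}
	byType := make(map[string]Provider, len(providers))
	for _, p := range providers {
		byType[p.AccessType] = p
	}
	for _, ac := range cp.AccessProviders {
		p, ok := byType[ac.Name]
		if !ok || ac.Cluster == nil {
			continue
		}
		return &Config{
			Host:    ac.Cluster.Server,
			CAData:  ac.Cluster.CertificateAuthorityData,
			Command: p.ExecutablePath,
			Args:    append([]string(nil), p.Args...),
		}, nil
	}
	return nil, fmt.Errorf("no configured plugin matches the access providers of %q", cp.Name)
}

func main() {
	cp := &ClusterProfile{
		Name: "my-cluster-1",
		AccessProviders: []AccessConfig{
			{Name: "google", Cluster: &Cluster{Server: "https://example.invalid"}},
		},
	}
	providers := []Provider{{AccessType: "google", ExecutablePath: "/usr/bin/gke-gcloud-auth-plugin"}}
	cfg, err := BuildConfig(cp, providers)
	if err != nil {
		panic(err)
	}
	fmt.Println(cfg.Host, cfg.Command)
}
```

A real implementation would return a `rest.Config` whose exec provider settings point at the selected plugin, so that credentials are fetched when a client is built from it.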

External credentials Provider plugin mechanism

In order to call the plugin, the library execs the plugin defined in the configuration. It passes the Cluster information that was obtained from the ClusterProfile. The library then calls the plugin following the protocol defined in KEP 541 . The library provided in https://github.com/kubernetes-sigs/cluster-inventory-api can leverage the original code that is kept in client-go .

Standardizing the Provider definition

In order to populate the Cluster object that the exec provider requires, we standardize a new field in ClusterProfile called accessProviders that is stored in the Status of the ClusterProfile. All the data in this structure is specific to the ClusterProfile and does not contain any controller-specific information. It must be usable by different controllers, applications or consumers without requiring changes. It also cannot contain any data considered a secret; we consider that reachability information is not sensitive.

The definition is as follows:

type AccessProviders struct {
  // AccessProviders maps access provider types to their config. In some cases the cluster may
  // recognize different identity types, and they may have different endpoints or TLS config.
  // The field is exported so that it is serialized under the accessProviders key.
  // +listType=map
  // +listMapKey=name
  AccessProviders []AccessConfig `json:"accessProviders,omitempty"`
}

// AccessType defines the type of access provider that is used to reach the cluster. For example, GCP access (using tokens that are understood by GCP's IAM) is designated by the string `google`.
type AccessType string

// AccessConfig gives the details that are necessary to reach the cluster with this kind of access provider.
type AccessConfig struct {
  // Name is the name of the provider type.
  Name string `json:"name"`
  // Cluster is the configuration to reach the cluster (endpoints, proxy, etc.).
  // See the sections below for details.
  Cluster *Cluster `json:"cluster,omitempty"`
}

Cluster Data

The Cluster structure used by the exec flow defined in KEP 541, as implemented in k/client-go, is the following:

type Cluster struct {
	// LocationOfOrigin indicates where this object came from.  It is used for round tripping config post-merge, but never serialized.
	// +k8s:conversion-gen=false
	LocationOfOrigin string `json:"-"`
	// Server is the address of the kubernetes cluster (https://hostname:port).
	Server string `json:"server"`
	// TLSServerName is used to check server certificate. If TLSServerName is empty, the hostname used to contact the server is used.
	// +optional
	TLSServerName string `json:"tls-server-name,omitempty"`
	// InsecureSkipTLSVerify skips the validity check for the server's certificate. This will make your HTTPS connections insecure.
	// +optional
	InsecureSkipTLSVerify bool `json:"insecure-skip-tls-verify,omitempty"`
	// CertificateAuthority is the path to a cert file for the certificate authority.
	// +optional
	CertificateAuthority string `json:"certificate-authority,omitempty"`
	// CertificateAuthorityData contains PEM-encoded certificate authority certificates. Overrides CertificateAuthority
	// +optional
	CertificateAuthorityData []byte `json:"certificate-authority-data,omitempty"`
	// ProxyURL is the URL to the proxy to be used for all requests made by this
	// client. URLs with "http", "https", and "socks5" schemes are supported.  If
	// this configuration is not provided or the empty string, the client
	// attempts to construct a proxy configuration from http_proxy and
	// https_proxy environment variables. If these environment variables are not
	// set, the client does not attempt to proxy requests.
	//
	// socks5 proxying does not currently support spdy streaming endpoints (exec,
	// attach, port forward).
	// +optional
	ProxyURL string `json:"proxy-url,omitempty"`
	// DisableCompression allows client to opt-out of response compression for all requests to the server. This is useful
	// to speed up requests (specifically lists) when client-server network bandwidth is ample, by saving time on
	// compression (server-side) and decompression (client-side): https://github.com/kubernetes/kubernetes/issues/112296.
	// +optional
	DisableCompression bool `json:"disable-compression,omitempty"`
	// Extensions holds additional information. This is useful for extenders so that reads and writes don't clobber unknown fields
	// +optional
	Extensions map[string]runtime.Object `json:"extensions,omitempty"`
}

In this structure, not all fields would apply, such as:

  • CertificateAuthority, which points to a file (and a ClusterProfile doesn’t have a filesystem)

And there are fields that require special attention:

  • Extensions, which holds additional, usually cluster-specific information, that might help authenticate with the cluster. (For more information about this field and how it is handled, see the section on Passing plugin configuration via extensions ).

Passing plugin configuration via extensions

Some credential providers require cluster-specific, non-secret parameters (for example, a clusterName) in order to obtain credentials. To standardize how this information is conveyed from a ClusterProfile to a plugin, the library follows the existing convention defined by the client authentication API:

Optional: when a plugin needs per-cluster, non-secret config, set an extension entry with name: client.authentication.k8s.io/exec under Cluster.extensions. The library reads only the extension field of that entry and passes it through verbatim to ExecCredential.Spec.Cluster.Config. The content must be non-secret and cluster-specific. Controller- or environment-specific data must not be placed here. Plugins may read values (e.g. clusterName) from ExecCredential.Spec.Cluster.Config.

Reference: client.authentication.k8s.io/v1 Cluster: config sourced from extensions[client.authentication.k8s.io/exec]

Example (embedded in ClusterProfile.status.accessProviders[].cluster):

extensions:
- name: client.authentication.k8s.io/exec
  extension:
    clusterName: spoke-1

In practice, however, there are scenarios where setting the reserved client.authentication.k8s.io/exec extension might not be an appropriate way to pass cluster-specific data. Libraries such as client-go eventually save the extension data (along with other information, including the CA bundles for a cluster) to an environment variable, KUBERNETES_EXEC_INFO, which exec plugins can read; however:

  • some exec plugins expect inputs from CLI arguments or plugin-specific environment variables directly; they might not read the KUBERNETES_EXEC_INFO environment variable at all, or might make only limited use of it.
  • it might not be proper to set the KUBERNETES_EXEC_INFO environment variable in the target environment: for example, KUBERNETES_EXEC_INFO includes CA bundles for a cluster and its size might exceed length limitations in the environment.
  • the client.authentication.k8s.io/exec extension keeps the data in free form, as a runtime.RawExtension; before it is saved to the KUBERNETES_EXEC_INFO environment variable, client-go first passes it to the ExecConfig.Config field, which accepts only runtime.Object data. This brings about possible marshalling/unmarshalling complications, which could be difficult to handle gracefully in the community-provided library proposed in this KEP.

To address the deficiencies above, we further propose that:

  • this KEP reserves a name in the extensions, clusterprofiles.multicluster.x-k8s.io/exec/additional-args, which holds additional CLI arguments that would be supplied to the exec plugin when the ClusterProfile API and community-provided library are used for authentication.

    If an extension under this name is present, the community-provided library will extract the data, and append the additional arguments to the ExecConfig struct (specifically the ExecConfig.Args field) that will be used to prepare the rest.Config output. The arguments will then be used to invoke the exec plugin.

    The additional arguments shall be saved as a string array in the YAML format.

    For simplicity, the community-provided library will not perform any de-duplication on the CLI arguments after the additional arguments are appended.

  • this KEP reserves another name in the extensions, clusterprofiles.multicluster.x-k8s.io/exec/additional-envs, which holds additional environment variables that would be supplied upon calling the exec plugin when the ClusterProfile API and community-provided library are used for authentication.

    If an extension under this name is present, the community-provided library will extract the data, and add the additional variables to the ExecConfig struct (specifically the ExecConfig.Env field) that will be used to prepare the rest.Config output. The variables will then be set when invoking the exec plugin.

    The additional environment variables shall be represented as a string map in the YAML format.

    The community-provided library will de-duplicate the list of environment variables when adding the additional variables; if two entries are present under the same name, the one from the extension will prevail.
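
The two merging rules above (append without de-duplication for arguments, extension wins on name clashes for environment variables) can be illustrated with the sketch below. The `EnvVar` type and both function names are hypothetical. Appending new variables in sorted order is a choice made here for deterministic output, not something this KEP specifies.

```go
package main

import (
	"fmt"
	"sort"
)

// EnvVar is a name/value pair, standing in for an exec environment entry.
type EnvVar struct {
	Name  string
	Value string
}

// AppendArgs places profile-sourced arguments after the configured ones,
// without any de-duplication.
func AppendArgs(base, extra []string) []string {
	out := make([]string, 0, len(base)+len(extra))
	out = append(out, base...)
	return append(out, extra...)
}

// MergeEnv overlays profile-sourced variables on the configured ones.
// On a name clash, the value from the extension prevails.
func MergeEnv(base []EnvVar, extra map[string]string) []EnvVar {
	out := make([]EnvVar, 0, len(base)+len(extra))
	used := make(map[string]bool, len(extra))
	for _, e := range base {
		if v, ok := extra[e.Name]; ok {
			e.Value = v
			used[e.Name] = true
		}
		out = append(out, e)
	}
	names := make([]string, 0, len(extra))
	for name := range extra {
		if !used[name] {
			names = append(names, name)
		}
	}
	sort.Strings(names) // deterministic order for variables not already configured
	for _, name := range names {
		out = append(out, EnvVar{Name: name, Value: extra[name]})
	}
	return out
}

func main() {
	args := AppendArgs([]string{"--flag1", "value1"}, []string{"-audience", "https://my-on-prem-k8s.example.dev"})
	env := MergeEnv(
		[]EnvVar{{Name: "CLIENT_ID", Value: "configured-id"}},
		map[string]string{"CLIENT_ID": "my-client-id", "TENANT_ID": "my-tenant-id"},
	)
	fmt.Println(args)
	fmt.Println(env)
}
```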

Security concerns

With the newly reserved extensions, some users might want to block additional CLI arguments or environment variables from being set, for security reasons. To support this, the KEP proposes that the community-provided library implementation must allow users to specify whether additional CLI arguments or environment variables can be set by a ClusterProfile object. By default the reserved extensions should be ignored.

See the Configuring plugins in the controller section for more information.

ClusterProfile Example

Below is an example of a GKE ClusterProfile, which would map to a plugin providing credentials of type google:

apiVersion: multicluster.x-k8s.io/v1alpha1
kind: ClusterProfile
metadata:
  name: my-cluster-1
spec:
  displayName: my-cluster-1
  clusterManager:
    name: GKE-Fleet
status:
  version:
    kubernetes: 1.28.0
  properties:
  - name: clusterset.k8s.io
    value: some-clusterset
  - name: location
    value: us-central1
  accessProviders:
  - name: google
    cluster:
      server: https://connectgateway.googleapis.com/v1/projects/123456789/locations/us-central1/gkeMemberships/my-cluster-1

Below are some examples that feature the use of extensions in ClusterProfiles:

  • This example uses the reserved client.authentication.k8s.io/exec extension to pass cluster names to a plugin of the secret reader type:

    apiVersion: multicluster.x-k8s.io/v1alpha1
    kind: ClusterProfile
    metadata:
      name: my-cluster-1
    spec:
      displayName: my-cluster-1
      clusterManager:
        name: inhouse-manager
    status:
      accessProviders:
      - name: secretreader
        cluster:
          server: https://<spoke-server>
          certificate-authority-data: <BASE64_CA>
          extensions:
          - name: client.authentication.k8s.io/exec
            extension:
              clusterName: spoke-1
    
  • This example uses the clusterprofiles.multicluster.x-k8s.io/exec/additional-args extension to pass additional CLI arguments (-audience https://my-on-prem-k8s.example.dev) to the exec plugin when the spire-agent credential provider is used, as the cluster’s authentication solution is expecting tokens with this specific audience for security reasons.

    apiVersion: multicluster.x-k8s.io/v1alpha1
    kind: ClusterProfile
    metadata:
      name: my-on-prem-cluster
    spec: ...
    status:
      ...
      accessProviders:
      - name: spire-agent
        cluster:
          server: https://my-on-prem-k8s.example.dev
          ...
          extensions:
          - name: "clusterprofiles.multicluster.x-k8s.io/exec/additional-args"
            extension:
            - "-audience"
            - "https://my-on-prem-k8s.example.dev"
    
  • This example uses the clusterprofiles.multicluster.x-k8s.io/exec/additional-envs extension to pass additional environment variables CLIENT_ID and TENANT_ID to the exec plugin when the kubelogin credential provider is used; these entries can help the exec plugin exchange for cluster-specific access tokens.

    apiVersion: multicluster.x-k8s.io/v1alpha1
    kind: ClusterProfile
    metadata:
      name: my-aks-cluster
    spec: ...
    status:
      ...
      accessProviders:
      - name: kubelogin
        cluster:
          server: https://braveion-abcxyz.hcp.eastus2.azmk8s.io
          ...
          extensions:
          - name: "clusterprofiles.multicluster.x-k8s.io/exec/additional-envs"
            extension:
              "CLIENT_ID": "my-client-id"
              "TENANT_ID": "my-tenant-id"

Configuring plugins in the controller

Plugins are selected by a string which represents the type of access provider that is used to reach the cluster, for example, “google” for GKE Clusters. This allows the controller to attach a different binary name or path for the binary.

It is expected that the library will have a mapping from its supported type of access provider to the expected binary to call. The library would be fed via a repeated flag clusterprofile-access-provider for ease of use. The flag maps an access provider type to the associated binary and potential flags that should be passed. It cannot contain cluster-specific information (which is not known at that time).

./controller ... --clusterprofile-access-provider "google='/usr/bin/gke-gcloud-auth-plugin --flag1 value1 --flag2 value2'"
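
One possible way to parse this repeated flag with Go's standard flag package is sketched below. The `providerFlag` type and `ParseProvider` function are hypothetical. The sketch assumes that the command is split on whitespace, so arguments that themselves contain spaces are not supported.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// Provider maps an access provider type to a plugin binary and its arguments.
type Provider struct {
	AccessType     string
	ExecutablePath string
	Args           []string
}

// ParseProvider parses one "<type>=<command line>" flag value.
func ParseProvider(v string) (Provider, error) {
	accessType, command, ok := strings.Cut(v, "=")
	if !ok || accessType == "" {
		return Provider{}, fmt.Errorf("expected <type>=<command>, got %q", v)
	}
	// Drop optional surrounding quotes, then split on whitespace (assumption).
	fields := strings.Fields(strings.Trim(command, `'"`))
	if len(fields) == 0 {
		return Provider{}, fmt.Errorf("no executable given for access provider %q", accessType)
	}
	return Provider{AccessType: accessType, ExecutablePath: fields[0], Args: fields[1:]}, nil
}

// providerFlag implements flag.Value so that the flag can be repeated.
type providerFlag []Provider

func (f *providerFlag) String() string {
	if f == nil {
		return ""
	}
	return fmt.Sprint(*f)
}

func (f *providerFlag) Set(v string) error {
	p, err := ParseProvider(v)
	if err != nil {
		return err
	}
	*f = append(*f, p)
	return nil
}

func main() {
	var providers providerFlag
	fs := flag.NewFlagSet("controller", flag.ContinueOnError)
	fs.Var(&providers, "clusterprofile-access-provider", "access provider type to plugin command mapping (repeatable)")
	err := fs.Parse([]string{
		"--clusterprofile-access-provider",
		"google='/usr/bin/gke-gcloud-auth-plugin --flag1 value1 --flag2 value2'",
	})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", providers)
}
```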

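Although the configuration is supplied as a flag, the equivalent structure for each Plugin can be expressed as follows:

type Provider struct {
  // AccessType is the access provider type that this plugin serves, for example "google".
  AccessType string
  // ExecutablePath is the path of the plugin binary to exec.
  ExecutablePath string
  // Args are the flags passed to the plugin; they are not cluster-specific.
  Args []string
  // The two policies below control whether data sourced from a ClusterProfile is honored.
  ClusterProfileSourcedCLIArgsPolicy ProfileSourcedDataPolicy
  ClusterProfileSourcedEnvVarsPolicy ProfileSourcedDataPolicy
}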

Given the plugin is executed directly by the controller, it may expect to have access to the same environment as the controller itself, inclusive of envvars, filesystem and network. It is expected that the identity of the plugin is the same as the controller itself.

The ClusterProfileSourcedCLIArgsPolicy and ClusterProfileSourcedEnvVarsPolicy fields control whether the library will process the clusterprofiles.multicluster.x-k8s.io/exec/additional-args and clusterprofiles.multicluster.x-k8s.io/exec/additional-envs extensions, respectively, as described earlier. If a policy is set to Ignore, the corresponding additional CLI arguments or environment variables cannot be set from the ClusterProfile side.
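
A minimal sketch of how such a policy could gate the profile-sourced arguments is shown below. The KEP only names the Ignore value; the `Allow` constant and the `EffectiveArgs` helper are hypothetical. The sketch treats the empty zero value like Ignore, so that a provider configured without an explicit policy stays locked down.

```go
package main

import "fmt"

// ProfileSourcedDataPolicy states whether data from a ClusterProfile may be used.
type ProfileSourcedDataPolicy string

const (
	// PolicyIgnore is the value named by the KEP.
	PolicyIgnore ProfileSourcedDataPolicy = "Ignore"
	// PolicyAllow is a hypothetical name for the opt-in value.
	PolicyAllow ProfileSourcedDataPolicy = "Allow"
)

type Provider struct {
	AccessType                         string
	ExecutablePath                     string
	Args                               []string
	ClusterProfileSourcedCLIArgsPolicy ProfileSourcedDataPolicy
	ClusterProfileSourcedEnvVarsPolicy ProfileSourcedDataPolicy
}

// EffectiveArgs returns the configured arguments, followed by the
// profile-sourced ones only when the provider explicitly allows them.
func EffectiveArgs(p Provider, profileArgs []string) []string {
	args := append([]string(nil), p.Args...)
	if p.ClusterProfileSourcedCLIArgsPolicy == PolicyAllow {
		args = append(args, profileArgs...)
	}
	return args
}

func main() {
	extra := []string{"-audience", "https://my-on-prem-k8s.example.dev"}
	locked := Provider{AccessType: "spire-agent", Args: []string{"--flag1", "value1"}}
	opened := locked
	opened.ClusterProfileSourcedCLIArgsPolicy = PolicyAllow
	fmt.Println(EffectiveArgs(locked, extra)) // profile-sourced arguments dropped
	fmt.Println(EffectiveArgs(opened, extra)) // profile-sourced arguments appended
}
```

The same check would apply to ClusterProfileSourcedEnvVarsPolicy before environment variables are merged.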

Plugin Examples

As an example, we provide pseudocode for plugins that could easily be implemented with the protocol. They are highly simplified versions of the code and structures, meant to convey the idea rather than to serve as implementation examples.

Secret Reader plugin

This plugin assumes the controller is aware of the list of clusters ahead of time and has created secrets for them in its namespace. It simply reads the token from the secret mapped to the cluster specifically for this controller. Note that namespace comes from the controller config while clusterName is read by the plugin from ExecCredential.Spec.Cluster.Config, which the library populates from the Cluster.extensions entry named client.authentication.k8s.io/exec.

func GetToken(namespace, clusterName string) string {
  // query secrets local to this controller (same cluster, same namespace)
  secret := secrets.Namespace(namespace).Get(clusterName)
  return secret.Data.token
}

GKE with Workload Identity Federation

This plugin uses Workload Identity Federation to call other clusters that are GKE clusters and therefore understand Google-issued credentials.

func GetToken() string {
  // This library call looks at the standard envvar GOOGLE_APPLICATION_CREDENTIALS and, if it is not set, calls the Metadata Server IP (169.254.169.254)
  creds := google.GetDefaultCredentials()
  return creds.Token()
}

Test Plan

[ ] I/we understand the owners of the involved components may require updates to existing tests to make this code solid enough prior to committing the changes necessary to implement this enhancement.

Prerequisite testing updates
Unit tests
  • <package>: <date> - <test coverage>
Integration tests
e2e tests

Graduation Criteria

Upgrade / Downgrade Strategy

Version Skew Strategy

Production Readiness Review Questionnaire

Feature Enablement and Rollback

N/A - Out of tree library

Rollout, Upgrade and Rollback Planning

N/A - Out of tree library

Monitoring Requirements

Are there any missing metrics that would be useful to have to improve observability of this feature?

The following metrics would be added into the library using plugins to help observability:

  • Number of credential retrievals, categorized by plugin type and reply state
  • Latency to obtain credentials, categorized by plugin type

Dependencies

Does this feature depend on any specific services running in the cluster?

No risk.

It depends on ClusterProfile resources being available in the cluster for it to be useful. The dependency is indirect and without ClusterProfile this library is simply not needed.

Scalability

N/A - just a library/protocol

Troubleshooting

How does this feature react if the API server and/or etcd is unavailable?

N/A; no use of etcd

What are other known failure modes?
What steps should be taken if SLOs are not being met to determine the problem?

Implementation History

Drawbacks

Alternatives

There are a couple alternatives to this plugin-based approach.

Infrastructure Needed (Optional)

N/A