KEP-5304: DRA Device Attributes Downward API

Release Signoff Checklist
Summary
Motivation
- Goals
- Non-Goals
Proposal
Design Details
Production Readiness Review Questionnaire
Implementation History
Drawbacks
Alternatives
Infrastructure Needed (Optional)

Release Signoff Checklist

Items marked with (R) are required prior to targeting to a milestone / release.

(R) Enhancement issue in release milestone, which links to KEP dir in kubernetes/enhancements (not the initial KEP PR)
(R) KEP approvers have approved the KEP status as implementable
(R) Design details are appropriately documented
(R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
- e2e Tests for all Beta API Operations (endpoints)
- (R) Ensure GA e2e tests meet requirements for Conformance Tests
- (R) Minimum Two Week Window for GA e2e tests to prove flake free
(R) Graduation criteria is in place
- (R) all GA Endpoints must be hit by Conformance Tests within one minor version of promotion to GA
(R) Production readiness review completed
(R) Production readiness review approved
“Implementation History” section is up-to-date for milestone
User-facing documentation has been created in kubernetes/website, for publication to kubernetes.io
Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes

Summary

This KEP proposes exposing Dynamic Resource Allocation (DRA) device attributes to workloads via CDI (Container Device Interface) mounts. Drivers provide device metadata by populating a Metadata field in PrepareResult when returning from PrepareResourceClaims. The framework writes this metadata to per-request JSON files, which are mounted into containers via CDI. Containers only see metadata for the requests they reference (via resources.claims[].request), providing proper isolation when a claim has multiple requests used by different containers. Workloads such as KubeVirt can read device metadata, such as a PCIe bus address or a mediated device UUID, from standardized file paths without requiring custom controllers or Downward API changes.

Motivation

Workloads that need to interact with DRA-allocated devices (like KubeVirt virtual machines) require access to device-specific metadata such as PCIe bus addresses or mediated device UUIDs. Currently, to fetch attributes from allocated devices, users must:

Go to ResourceClaimStatus to find the request and device name
Look up the ResourceSlice with the device name to get attribute values

This complexity forces ecosystem projects like KubeVirt to build custom controllers that watch these objects and inject attributes via annotations/labels, leading to fragile, error-prone, and racy designs.

Goals

Provide an easy way for DRA driver authors to make attributes and other metadata discoverable inside pods (specifically, in the containers requesting devices).
Minimize complexity and avoid modifications to core components like the scheduler and kubelet, to maintain system reliability and scalability.
Maintain full backward compatibility with existing DRA drivers and workloads.
Define a versioned JSON schema to ensure compatibility within versions and clear migration paths across versions.
Support metadata updates after NodePrepareResources so that drivers (e.g. network DRA) can update metadata from NRI hooks such as RunPodSandbox.

Non-Goals

Expose the entirety of ResourceClaim/ResourceSlice objects
Provide an additional hook in the DRA path, after the NodePrepareResources call, for metadata updates

Proposal

This proposal introduces framework-assisted attribute exposure via CDI mounting in the DRA kubelet plugin framework (k8s.io/dynamic-resource-allocation/kubeletplugin). The framework provides a command-line flag (e.g. --enable-device-metadata) that drivers integrate into their CLI. Once integrated, operators enable or disable the metadata feature via the flag when starting the plugin process (no driver image change required). When the flag is set, the framework enables the metadata code path; when unset, the feature is fully disabled — no CDI specs or metadata files are generated. Metadata files are always written under the kubelet device plugin directory for that driver ({kubeletDir}/plugins/{driverName}/dra-device-metadata/...).

When enabled, the framework always generates a CDI spec (bind-mount) for every claim the driver prepares, ensuring the container mount point exists regardless of when the metadata content arrives. The metadata file itself is provided in one of two ways:

Immediate metadata (e.g. GPU drivers): PrepareResourceClaims returns with Device.Metadata populated. The framework writes the metadata file immediately alongside the CDI spec.

Deferred metadata (e.g. network drivers like DRANet): PrepareResourceClaims returns without Device.Metadata (or with an empty result) because device details are not yet available — network configuration (IP addresses, interface names, MAC addresses) only becomes known after CNI runs during RunPodSandbox. The CDI mount is still done at prepare time with empty metadata. The driver writes the file via MetadataUpdater.UpdateRequestMetadata() from its NRI RunPodSandbox hook, before the pod starts. See Metadata Lifecycle for the full sequence.

In both cases the framework:

Generates CDI specs that bind-mount the driver’s metadata file into containers at a well-known container path
Cleans up files and CDI specs during NodeUnprepareResources

On the host, metadata files are written under the driver’s plugin directory: {kubeletDir}/plugins/{driverName}/dra-device-metadata/{claimNs}_{claimName}/{requestName}/metadata.json

Within the container, CDI bind-mounts expose the metadata at a standardized path: /var/run/dra-device-attributes/{claimName}/{requestName}/{driverName}-metadata.json

Metadata is organized per-request, ensuring containers only see metadata for the specific requests they use (via resources.claims[].request).
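The two path conventions above can be sketched as a pair of helpers. This is a minimal illustration, not the framework's code: the function and constant names are hypothetical, and only the path layout is taken from this proposal.

```go
package main

import (
	"path"
	"path/filepath"
)

// Directory names taken from the layout described in this proposal.
const (
	hostMetadataDir      = "dra-device-metadata"
	containerMetadataDir = "/var/run/dra-device-attributes"
)

// HostMetadataPath (hypothetical helper) returns
// {kubeletDir}/plugins/{driverName}/dra-device-metadata/{claimNs}_{claimName}/{requestName}/metadata.json.
func HostMetadataPath(kubeletDir, driverName, claimNamespace, claimName, requestName string) string {
	return filepath.Join(kubeletDir, "plugins", driverName, hostMetadataDir,
		claimNamespace+"_"+claimName, requestName, "metadata.json")
}

// ContainerMetadataPath (hypothetical helper) returns
// /var/run/dra-device-attributes/{claimName}/{requestName}/{driverName}-metadata.json.
// It uses package path because the container path is always slash-separated.
func ContainerMetadataPath(claimName, requestName, driverName string) string {
	return path.Join(containerMetadataDir, claimName, requestName, driverName+"-metadata.json")
}
```

Note that the host path includes the claim namespace while the container path does not; within a single Pod, all referenced claims share one namespace.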

User Stories (Optional)

Story 1

As a workload developer (e.g., KubeVirt), I want to automatically discover device attributes (like PCIe addresses) by reading a JSON file at a known path, so my application can configure devices without parsing ResourceClaim/ResourceSlice objects, calling the Kubernetes API, or requiring custom controllers.

Story 2

As a DRA driver author, I want to populate a Metadata field in PrepareResult to expose device attributes in a standardized format, with the framework handling file writing, directory structure, CDI mounting, and cleanup, so I only need to provide the metadata content.

Story 3

As a telco CNF developer, I want network device metadata (PCI address, interface name, IPs, MTU) to be available inside my container, so my DPDK application can discover and bind to the correct devices without custom controllers.

Notes/Constraints/Caveats (Optional)

File-based, not env vars: Attributes are exposed as JSON files mounted via CDI, not environment variables. This allows for complex structured data and dynamic attribute sets.
Enable/disable: Drivers wire up the framework’s flag config (e.g. --enable-device-metadata) in their CLI; operators then control enable/disable via the flag without driver image changes. See Feature Gate.
No API changes: Zero modifications to Kubernetes API types; framework/driver-side only.
File lifecycle: Created during NodePrepareResources, deleted during NodeUnprepareResources; no new shared host directory.

Risks and Mitigations

Risk: Exposing device attributes might leak sensitive information.

Mitigation: Drivers control which attributes are published via the Device.Metadata field. Files are created with 0644 permissions (readable but not writable by container). Drivers should only expose non-sensitive metadata.

Risk: File system clutter from orphaned attribute files.

Mitigation: Framework implements cleanup in NodeUnprepareResources. On driver restart, framework can perform best-effort cleanup by globbing and removing stale files.

Risk: JSON schema changes could break workloads.

Mitigation: The schema is versioned following well-known Kubernetes API conventions (as used for CRDs). Applications in the pod should read the JSON according to the apiVersion they discover in the file.

Design Details

Framework Implementation

Attributes JSON Generation

Drivers provide device metadata by populating the Metadata field in PrepareResult.Devices[] when returning from PrepareResourceClaims. This ensures drivers explicitly provide accurate, current device information at the time of preparation; nothing is auto-generated from ResourceSlice data.

Per-Request Metadata Design: Metadata is organized per-request, not per-claim. This ensures containers only see metadata for the specific requests they use (via resources.claims[].request), providing proper isolation when a claim has multiple requests used by different containers.

Multi-Driver Safety: Each driver writes metadata under its own plugin directory on the host. When a single request with count > 1 allocates devices from multiple DRA drivers (e.g., a DeviceClass whose selectors match devices from different drivers), each driver independently writes only its own file — no coordination or shared directory is needed. CDI bind-mounts expose each driver’s file in the container as {driverName}-metadata.json. Applications enumerate *-metadata.json files in the container’s request directory to discover all devices.

Metadata file schema (Go types)

The following Go structs define the schema of the per-request metadata.json file written by the framework. The framework builds these from PrepareResult (claim metadata plus Devices[].Metadata and device name/pool/driver) and marshals them to JSON.

Staging location: These types will be introduced in the Kubernetes staging tree under the DRA component. The canonical definitions will live in kubernetes/kubernetes at staging/src/k8s.io/dynamic-resource-allocation/api/metadata (and a versioned subpackage such as api/metadata/v1alpha1 for the serialized schema). The framework in staging/src/k8s.io/dynamic-resource-allocation/kubeletplugin will reference these types for encoding the metadata JSON and for the driver-facing Device.Metadata contract. This keeps the metadata schema versioned and co-located with the DRA framework without adding new Kubernetes API types.

// DeviceMetadata contains metadata about devices allocated to a ResourceClaim.
// It is serialized to versioned JSON files that can be mounted into containers.
type DeviceMetadata struct {
    metav1.TypeMeta   `json:",inline"`
    metav1.ObjectMeta `json:"metadata,omitempty"`

    // Requests contains the device allocation information for each request
    // in the ResourceClaim.
    // +optional
    Requests []DeviceMetadataRequest `json:"requests,omitempty"`
}

The framework populates only metadata.name, metadata.namespace, metadata.uid, and metadata.generation.

// DeviceMetadataRequest contains metadata for a single request within a ResourceClaim.
type DeviceMetadataRequest struct {
    // Name is the name of the request (from the ResourceClaim spec).
    Name string `json:"name"`

    // Devices contains metadata for each device allocated to this request.
    // +optional
    Devices []Device `json:"devices,omitempty"`
}

// Device contains metadata about a single allocated device.
type Device struct {
    // Name is the name of the device within the pool.
    Name string `json:"name"`

    // Driver is the name of the DRA driver that manages this device.
    Driver string `json:"driver"`

    // Pool is the name of the resource pool this device belongs to.
    Pool string `json:"pool"`

    // Attributes contains the device attributes from the ResourceSlice.
    // Keys are qualified attribute names (e.g., "model", "resource.kubernetes.io/pciBusID").
    // Values use the Kubernetes DeviceAttribute type for consistency with the
    // resource.k8s.io API.
    // +optional
    Attributes map[resourcev1.QualifiedName]resourcev1.DeviceAttribute `json:"attributes,omitempty"`

    // NetworkData contains network-specific device data (e.g., interface name,
    // addresses, hardware address). This is populated for network devices,
    // typically during the CRI RPC RunPodSandbox, before the containers are
    // started and after the network namespace is created.
    // +optional
    NetworkData *resourcev1.NetworkDeviceData `json:"networkData,omitempty"`
}

Attributes and NetworkData use the same types as in the resource.k8s.io API: resourcev1.DeviceAttribute for attribute values (see ResourceSlice device attributes) and resourcev1.NetworkDeviceData for network device data (e.g. interfaceName, addresses, hwAddress).

The driver API uses the same attribute and network types: kubeletplugin.DeviceMetadata and the Device slice in UpdateRequestMetadata use resourcev1.DeviceAttribute and resourcev1.NetworkDeviceData, so the framework can write the metadata file without type conversion.

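Example serialized JSON for a GPU device allocated to the request gpu-request by the driver example.com (written on the host to {claimNs}_{claimName}/gpu-request/metadata.json under that driver’s plugin directory, and mounted in the container as {claimName}/gpu-request/example.com-metadata.json). Attribute values use the resource.k8s.io DeviceAttribute serialization (typed value wrappers such as string, int, version).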

{
  "apiVersion": "metadata.resource.k8s.io/v1alpha1",
  "kind": "DeviceMetadata",
  "metadata": {
    "name": "my-claim",
    "namespace": "default",
    "uid": "abc-123-def-456",
    "generation": 1
  },
  "requests": [
    {
      "name": "gpu-request",
      "devices": [
        {
          "name": "gpu-0",
          "driver": "example.com",
          "pool": "node-1-gpus",
          "attributes": {
            "driverVersion": {
              "version": "1.0.0"
            },
            "index": {
              "int": 1
            },
            "model": {
              "string": "LATEST-GPU-MODEL"
            },
            "uuid": {
              "string": "gpu-93d37703-997c-c46f-a531-755e3e0dc2ac"
            },
            "resource.kubernetes.io/pciBusID": {
              "string": "0000:00:01.0"
            }
          }
        }
      ]
    }
  ]
}

Directory structure on host (per-driver plugin directory):

Each driver writes metadata under its own plugin directory. No shared host directory is introduced.

{kubeletDir}/plugins/{driverName}/dra-device-metadata/
└── {claimNamespace}_{claimName}/
    ├── {requestName1}/
    │   └── metadata.json
    └── {requestName2}/
        └── metadata.json

When a single request has devices from multiple drivers (e.g., count: 2 with a cross-driver DeviceClass), each driver writes its own file in its own plugin directory:

{kubeletDir}/plugins/example.com/dra-device-metadata/
└── default_my-claim/
    └── accel/
        └── metadata.json

{kubeletDir}/plugins/bar.com/dra-device-metadata/
└── default_my-claim/
    └── accel/
        └── metadata.json

Why per-driver plugin directory (not a shared host directory):

An earlier design used a shared /var/run/dra-device-attributes/ directory on the host. Using each driver’s existing plugin directory instead has several advantages:

Portability: The path inherits kubelet’s --root-dir configuration, which already handles differences across Linux distributions and potentially Windows. A hardcoded /var/run/... path is not portable.
No duplicate configuration: Drivers already know their plugin directory. A shared directory would require every driver (and the framework) to duplicate a separate configuration option.
Scoped cleanup: When a driver is uninstalled, its entire plugin directory can be removed. With a shared directory, cleanup must identify and remove the individual files belonging to a specific driver, which is the same problem CDI spec files already have; avoiding it here keeps cleanup simpler.
No cross-driver coordination: Each driver writes only in its own directory. No risk of races, permission conflicts, or naming collisions between drivers.

Generate CDI spec (one per driver per request): Each driver creates a CDI spec that bind-mounts its metadata file from the driver’s plugin directory into the container at the well-known container path. This avoids overlapping directory mounts when multiple drivers serve the same request.

{
  "cdiVersion": "0.3.0",
  "kind": "{driverName}/metadata",
  "devices": [
    {
      "name": "{claimUID}_{requestName}",
      "containerEdits": {
        "mounts": [
          {
            "hostPath": "{kubeletDir}/plugins/{driverName}/dra-device-metadata/{claimNamespace}_{claimName}/{requestName}/metadata.json",
            "containerPath": "/var/run/dra-device-attributes/{claimName}/{requestName}/{driverName}-metadata.json",
            "options": ["ro", "bind"]
          }
        ]
      }
    }
  ]
}

Note: Each driver mounts only its own metadata file. The host path is under the driver’s own plugin directory; the container path uses a standardized convention. When multiple drivers serve the same request, each creates a separate CDI spec targeting a different container file path — no mount conflicts. Containers only see metadata files for requests they reference via resources.claims[].request.

CDI device ID: {driverName}/metadata={claimUID}_{requestName}

Container visibility: A container with resources.claims[].request: gpu-request sees /var/run/dra-device-attributes/my-claim/gpu-request/{driverName}-metadata.json for each driver that allocated devices for that request. In the common single-driver case, the container sees one file (e.g., example.com-metadata.json). In the multi-driver case, it sees one file per driver.

Cleanup (NodeUnprepareResources)

The framework removes the metadata files and CDI specs of each unprepared claim during cleanup.

API Changes

Command-line flag: A boolean flag (e.g. --enable-device-metadata) is the only way to enable the feature. When the flag is set, the framework creates the CDI mounts; if the driver does not provide metadata at prepare time, the mounted file is empty and the driver can use MetadataUpdater to write it later, before the pod starts. When the flag is unset, the feature is off. The host path is always under the driver’s kubelet plugin directory (see Directory structure on host); there is no Start() option for it.
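For illustration, a driver could wire a flag of this shape into its CLI roughly as follows. This sketch uses only the standard flag package; the Config type, ParseFlags function, and flag-set name are hypothetical, and the flag name is the example name used in this proposal rather than a finalized one.

```go
package main

import "flag"

// Config holds the (hypothetical) plugin options relevant to this feature.
type Config struct {
	EnableDeviceMetadata bool
}

// ParseFlags parses the driver's command-line arguments. The metadata feature
// is off unless the flag is passed explicitly.
func ParseFlags(args []string) (Config, error) {
	var cfg Config
	fs := flag.NewFlagSet("dra-plugin", flag.ContinueOnError)
	fs.BoolVar(&cfg.EnableDeviceMetadata, "enable-device-metadata", false,
		"generate CDI specs and metadata files for prepared claims")
	if err := fs.Parse(args); err != nil {
		return Config{}, err
	}
	return cfg, nil
}
```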

Device.Metadata field in PrepareResult: Drivers that have metadata available at prepare time populate this field. The framework writes the metadata file immediately. Drivers that do not have metadata yet (e.g. network drivers) leave this field nil and write later via MetadataUpdater.

// Device provides the CDI device IDs for one request in a ResourceClaim.
// Existing struct in k8s.io/dynamic-resource-allocation/kubeletplugin
type Device struct {
    Requests     []string
    PoolName     string
    DeviceName   string
    CDIDeviceIDs []string
    ShareID      *types.UID
    
    // Metadata contains device attributes to expose to workloads.
    // When set, the framework writes this to a JSON file mounted into containers
    // immediately after PrepareResourceClaims returns.
    // When nil, the CDI mount is still created (when the feature is enabled) but the
    // metadata file stays empty until the driver writes it via
    // MetadataUpdater.UpdateRequestMetadata() before the pod starts.
    Metadata *DeviceMetadata
}

// DeviceMetadata contains device attributes to expose to workloads.
// Uses the same types as resource.k8s.io (ResourceSlice device attributes, etc.).
type DeviceMetadata struct {
    // Attributes contains device attributes. Keys should follow Kubernetes naming conventions
    // (e.g., "resource.kubernetes.io/pciBusID"). Values use resourcev1.DeviceAttribute.
    Attributes map[string]resourcev1.DeviceAttribute `json:"attributes,omitempty"`
    
    // NetworkData contains network-specific device information.
    // Populated by network DRA plugins (e.g., SR-IOV, DPDK).
    NetworkData *resourcev1.NetworkDeviceData `json:"networkData,omitempty"`
}

Driver Integration

When the flag is set, drivers provide metadata in one of two ways:

At prepare time: Driver populates Device.Metadata in PrepareResult. Framework writes the metadata file immediately. Suitable for drivers that have all device information available during PrepareResourceClaims (e.g. GPU drivers).
After prepare time: Driver returns without Device.Metadata. The CDI mount is done but the metadata field is empty. Driver writes it via MetadataUpdater.UpdateRequestMetadata() before the pod starts (e.g. from an NRI hook after CNI). Suitable for network drivers like DRANet where device details only become available after CNI runs.

Key benefits: the CDI spec is always generated when the flag is set (so the mount point exists); the driver can provide metadata immediately or via MetadataUpdater; and the framework handles file writing, CDI spec generation, and cleanup.

For network DRA drivers that write metadata in two phases (initial attributes during PrepareResourceClaims, then network info via NRI after CNI), see the Metadata Lifecycle section.

Workload Discovery

The presence of {driverName}-metadata.json indicates metadata from that driver is available for this request. Absence may mean deferred metadata not yet written (e.g. network driver writing from NRI hook). Applications enumerate *-metadata.json in the request directory; if required metadata is missing, error or wait as appropriate.

Schema version handling

The metadata JSON includes apiVersion and kind (e.g. metadata.resource.k8s.io/v1alpha1, DeviceMetadata). Schema versions may evolve over time (e.g. new optional fields, or a future v1beta1). Consumers need a consistent strategy for reading files whose version may differ from what they were compiled against.

Options considered:

Option A: Workload conditional reading (raw JSON). The workload reads the metadata file, inspects apiVersion (and optionally kind), and only parses if it supports that version; otherwise it errors or skips.

Pros: No library dependency. Works with any language. Single source of truth (the file itself).
Cons: Every workload must implement version checks and parsing by hand. No automatic conversion between schema versions — if the framework starts writing v1beta1, existing workloads that only understand v1alpha1 break unless they are updated.

Option B: Pod declares desired version. The Pod specifies the metadata schema version it supports, for example via an annotation like resources.kubernetes.io/dra-metadata-version: v1alpha1 or a field in the resource claim template. The framework or driver could refuse to run, or write to a different path, if the requested version is unsupported.

Pros: Explicit contract between Pod and framework. Scheduler/kubelet could theoretically validate.
Cons: Requires new API or annotation design, cross-component coordination, and a migration path for existing Pods. Adds complexity for a file that is already self-describing via apiVersion.

Option C: Go consumers use the metadata package with internal types. The metadata package provides internal (unversioned) types and a runtime.Scheme with registered conversions for each supported version (v1alpha1, future v1beta1, etc.). Go consumers decode the JSON through the scheme and always work with the stable internal types. The scheme handles version detection and conversion automatically.

Pros: No manual version checks. Automatic forward/backward conversion. Follows the standard Kubernetes API machinery pattern (k8s.io/api / k8s.io/apimachinery).
Cons: Requires a Go dependency on the metadata package. Non-Go consumers must handle versioning themselves (though they can still benefit from the self-describing apiVersion field in the JSON).

Decision: Option C (internal types with scheme-based decoding). The metadata package (k8s.io/dynamic-resource-allocation/pkg/metadata) provides:

Internal types (pkg/metadata): Unversioned, canonical in-memory representation (DeviceMetadata, DeviceMetadataRequest, Device). These are the types Go consumers program against.
Versioned types (pkg/metadata/v1alpha1, future pkg/metadata/v1beta1, etc.): External types with JSON tags, used for serialization. Each version package registers itself with the scheme and provides auto-generated conversion functions to/from the internal types.
Scheme and codec: A runtime.Scheme with all versioned and internal types registered, plus conversion and defaulting functions. Consumers call runtime.Decode(codec, data) to get the internal *metadata.DeviceMetadata regardless of which version was serialized to disk.

This matches the standard Kubernetes API machinery pattern (the same approach used by k8s.io/api / k8s.io/apimachinery). When a new version is introduced:

A new versioned subpackage is added (e.g. pkg/metadata/v1beta1).
Conversion functions between v1beta1 and internal types are generated.
Both versions are registered in the scheme.
Existing Go consumers continue to work unchanged — the scheme decodes either version into the same internal types.

Non-Go consumers (e.g. shell scripts, Python) can still read apiVersion from the JSON and branch accordingly, but the primary versioning mechanism is the Go package with scheme-based conversion.

Good practice for Go consumers:

import (
    "fmt"
    "os"

    "k8s.io/apimachinery/pkg/runtime"
    "k8s.io/apimachinery/pkg/runtime/serializer"
    "k8s.io/dynamic-resource-allocation/pkg/metadata"
    _ "k8s.io/dynamic-resource-allocation/pkg/metadata/v1alpha1" // register version
)

func ReadMetadata(path string) (*metadata.DeviceMetadata, error) {
    data, err := os.ReadFile(path)
    if err != nil {
        return nil, err
    }
    scheme := metadata.NewScheme() // scheme with all registered versions
    // The universal decoder detects the serialized version and converts it to
    // the internal type; strict mode rejects unknown or duplicate fields.
    decoder := serializer.NewCodecFactory(scheme, serializer.EnableStrict).UniversalDecoder()
    obj, err := runtime.Decode(decoder, data)
    if err != nil {
        return nil, err // unknown version or malformed JSON
    }
    dm, ok := obj.(*metadata.DeviceMetadata)
    if !ok {
        return nil, fmt.Errorf("unexpected type %T", obj)
    }
    return dm, nil
}

The consumer always works with *metadata.DeviceMetadata (internal type). If the file on disk is v1alpha1 today or v1beta1 tomorrow, the scheme converts automatically.

Metadata Lifecycle

Immediate metadata (e.g. GPU drivers): The metadata file is written when PrepareResourceClaims returns with Device.Metadata populated. The file remains unchanged for the lifetime of the prepared claim. metadata.generation is set to 1.

Deferred metadata (e.g. network drivers): Network DRA drivers like DRANet return an empty PrepareResult from PrepareResourceClaims — no devices, no metadata — because the actual device configuration (IP addresses, interface names, MAC addresses) is only available after CNI runs during pod sandbox creation. In this case:

During PrepareResourceClaims, the framework generates the CDI spec (to set up the bind-mount target), but does not write a metadata file (there is nothing to write yet).
During the NRI RunPodSandbox hook (after CNI), the driver calls MetadataUpdater.UpdateRequestMetadata() to write the metadata file for the first time (generation=1).

┌─────────┐    ┌──────────────────┐    ┌────────────┐    ┌─────┐
│ Kubelet │    │ Network DRA      │    │ Containerd │    │ CNI │
│         │    │ Driver           │    │            │    │     │
└────┬────┘    └────────┬─────────┘    └──────┬─────┘    └──┬──┘
     │                  │                     │             │
     │ PrepareResources │                     │             │
     │─────────────────>│                     │             │
     │                  │ (return empty       │             │
     │                  │  result; framework  │             │
     │                  │  creates CDI spec   │             │
     │                  │  but no metadata    │             │
     │                  │  file yet)          │             │
     │<─────────────────│                     │             │
     │                  │                     │             │
     │ RunPodSandbox()  │                     │             │
     │─────────────────────────────────────-->│             │
     │                  │                     │ CNI ADD     │
     │                  │                     │────────────>│
     │                  │                     │   PodIPs    │
     │                  │                     │<────────────│
     │                  │ NRI RunPodSandbox() │             │
     │                  │<────────────────────│             │
     │                  │ (write metadata     │             │
     │                  │  with IPs, iface,   │             │
     │                  │  generation=1)      │             │
     │                  │────────────────────>│             │
     │<───────────────────────────────────────│             │
     │                  │                     │             │
     │ CreateContainers │                     │             │
     │─────────────────────────────────────-->│             │

The metadata.generation field is incremented each time the metadata is updated. For the deferred case, the first write during the NRI hook sets generation=1. Subsequent updates (if any) increment it further.

Framework support for updates: The framework provides a MetadataUpdater that drivers can use during NRI hooks to update metadata for already-prepared claims:

// MetadataUpdater allows drivers to update metadata after initial preparation.
// Used by network DRA drivers during NRI hooks when network info becomes available.
type MetadataUpdater interface {
    // UpdateRequestMetadata updates the metadata for a specific request.
    // The framework validates that this request's devices belong to the calling driver
    // (based on what was returned in PrepareResourceClaims).
    // The generation number is automatically incremented.
    UpdateRequestMetadata(
        ctx context.Context,
        claimNamespace, claimName string,
        requestName string,
        devices []Device,
    ) error
}

The kubeletplugin.Helper implements MetadataUpdater since it already owns the metadata writer, CDI cache, and device state. The expected deployment model is same-process: the driver binary hosts both the kubelet plugin and the NRI plugin, and passes the Helper directly to the NRI plugin at construction time:

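func NewDriver(ctx context.Context, config *Config) (*driver, error) {
    // plugin and opts are built from config (construction omitted here).
    helper, err := kubeletplugin.Start(ctx, plugin, opts...)
    if err != nil {
        return nil, fmt.Errorf("start kubelet plugin: %w", err)
    }
    // helper implements MetadataUpdater

    // NRI plugin receives helper as a direct Go reference (same process)
    nriPlugin, err := nri.StartPlugin(ctx, &nriHandler{
        metadataUpdater: helper,
    })
    if err != nil {
        return nil, fmt.Errorf("start NRI plugin: %w", err)
    }
    return &driver{helper: helper, nriPlugin: nriPlugin}, nil
}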

This design ensures:

Drivers can only update metadata for requests they prepared (framework validates ownership)
Per-driver files ({driverName}-metadata.json) mean no cross-driver conflicts
Each driver’s metadata file is independent
No IPC or service discovery is needed — the NRI plugin receives the MetadataUpdater at creation time

Since the NRI RunPodSandbox hook runs after CNI but before containers start, the updated {driverName}-metadata.json (with network info) is guaranteed to be present by the time the application reads it.

The files are removed during UnprepareResourceClaims.

Availability guarantee: Metadata files are available to all containers in the Pod (init containers, regular containers, and sidecar containers) for the entire duration of the container lifecycle. This is because:

NodePrepareResources runs before RunPodSandbox, and RunPodSandbox completes (including NRI hooks) before any init container starts. Therefore, metadata files are fully written and up-to-date before the first init container runs.
NodeUnprepareResources is called only after all containers have terminated, including sidecar containers. Therefore, metadata files remain on disk and readable throughout the entire Pod lifetime — no container will observe missing or partially cleaned-up files.

Any user code at any point in a container’s lifecycle can read the metadata files.

Usage Examples

Example 1: Physical GPU Passthrough (KubeVirt)

Note: The shell/jq approach shown below is for illustration only and is not the preferred way to consume metadata. It bypasses schema version handling and will break if the serialized version changes. The preferred approach is to use the Go metadata package with internal types and scheme-based decoding (see Schema version handling). A full Go example will be provided in the dra-example-driver repository.

apiVersion: v1
kind: Pod
metadata:
  name: vm-with-gpu
spec:
  resourceClaims:
  - name: pgpu
    resourceClaimName: physical-gpu-claim
  containers:
  - name: compute
    image: kubevirt/virt-launcher:latest
    resources:
      claims:
      - name: pgpu
        request: gpu-request
    command:
      - /bin/sh
      - -c
      - |
        METADATA=$(cat /var/run/dra-device-attributes/physical-gpu-claim/gpu-request/gpu.example.com-metadata.json)
        PCI_BUS_ID=$(echo "$METADATA" | jq -r '.requests[0].devices[0].attributes["resource.kubernetes.io/pciBusID"].string')
        echo "Binding GPU at PCI $PCI_BUS_ID"

Example 2: Network Device (SR-IOV / DPDK)

Note: Same caveat as above — shell/jq is shown for illustration only. Use the Go metadata package for production workloads.

apiVersion: v1
kind: Pod
metadata:
  name: dpdk-app
spec:
  resourceClaims:
  - name: sriov-nic
    resourceClaimName: sriov-vf-claim
  containers:
  - name: dpdk
    image: dpdk-app:latest
    resources:
      claims:
      - name: sriov-nic
        request: network-request
    command:
      - /bin/sh
      - -c
      - |
        METADATA=$(cat /var/run/dra-device-attributes/sriov-vf-claim/network-request/sriov.example.com-metadata.json)
        PCI_BUS_ID=$(echo "$METADATA" | jq -r '.requests[0].devices[0].attributes["resource.kubernetes.io/pciBusID"].string')
        IFACE=$(echo "$METADATA" | jq -r '.requests[0].devices[0].networkData.interfaceName')
        MAC=$(echo "$METADATA" | jq -r '.requests[0].devices[0].networkData.hwAddress')
        IP=$(echo "$METADATA" | jq -r '.requests[0].devices[0].networkData.addresses[0]')
        echo "Binding DPDK to PCI $PCI_BUS_ID, interface $IFACE, MAC $MAC, IP $IP"
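
For comparison, the same fields can be pulled out in Go with plain `encoding/json`. This is only a sketch: the struct types mirror just the fields the jq expressions above touch and are a hypothetical subset, not the versioned types of the metadata package, so the caveat about bypassing schema version handling applies here too. The MAC and IP values in `sample` are made up.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// The types below mirror only the fields used by the jq expressions.
// They are a hypothetical subset, not the versioned metadata API types.
type attribute struct {
	String *string `json:"string,omitempty"`
}

type networkData struct {
	InterfaceName string   `json:"interfaceName"`
	HWAddress     string   `json:"hwAddress"`
	Addresses     []string `json:"addresses"`
}

type device struct {
	Attributes  map[string]attribute `json:"attributes"`
	NetworkData *networkData         `json:"networkData,omitempty"`
}

type metadataFile struct {
	Requests []struct {
		Devices []device `json:"devices"`
	} `json:"requests"`
}

// firstDevice decodes data and returns the first device of the first request,
// or an error when the file lists none (jq would print "null" instead).
func firstDevice(data []byte) (device, error) {
	var f metadataFile
	if err := json.Unmarshal(data, &f); err != nil {
		return device{}, fmt.Errorf("decode metadata: %w", err)
	}
	if len(f.Requests) == 0 || len(f.Requests[0].Devices) == 0 {
		return device{}, errors.New("metadata lists no devices")
	}
	return f.Requests[0].Devices[0], nil
}

// sample is made-up content for illustration; real files carry more fields.
const sample = `{"requests":[{"devices":[{
  "attributes":{"resource.kubernetes.io/pciBusID":{"string":"0000:00:1e.0"}},
  "networkData":{"interfaceName":"net1","hwAddress":"02:00:00:00:00:01","addresses":["192.0.2.10/24"]}
}]}]}`

func main() {
	dev, err := firstDevice([]byte(sample))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if pci := dev.Attributes["resource.kubernetes.io/pciBusID"].String; pci != nil {
		fmt.Println("PCI bus ID:", *pci)
	}
	if nd := dev.NetworkData; nd != nil {
		fmt.Println("interface:", nd.InterfaceName, "MAC:", nd.HWAddress, "IPs:", nd.Addresses)
	}
}
```

Unlike the shell version, an empty request or device list surfaces as an explicit error rather than the string "null".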

Feature Gate

No Kubernetes feature gate. The framework provides a boolean flag (e.g. --enable-device-metadata) that drivers integrate into their CLI. Once integrated, operators enable/disable via the flag in deployment configuration (e.g. DaemonSet args) without a new driver image.
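
A driver could wire such a flag into its CLI roughly as follows. This sketch uses only the standard `flag` package; the `options` type and `parseOptions` helper are hypothetical, the flag name is the example name used above, and a default of `false` is an assumption. How the parsed value is handed to the framework is not shown.

```go
package main

import (
	"flag"
	"fmt"
)

// options holds the driver settings relevant to this sketch.
type options struct {
	enableDeviceMetadata bool
}

// parseOptions registers the flag on its own FlagSet so it can be exercised
// without touching the process-wide flag.CommandLine.
func parseOptions(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("dra-driver", flag.ContinueOnError)
	fs.BoolVar(&o.enableDeviceMetadata, "enable-device-metadata", false,
		"write per-request device metadata files and mount them via CDI")
	if err := fs.Parse(args); err != nil {
		return options{}, err
	}
	return o, nil
}

func main() {
	// In a real driver the arguments would be os.Args[1:], e.g. from DaemonSet args.
	o, err := parseOptions([]string{"--enable-device-metadata"})
	if err != nil {
		fmt.Println("flag error:", err)
		return
	}
	fmt.Println("device metadata enabled:", o.enableDeviceMetadata) // device metadata enabled: true
}
```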

Feature Maturity and Rollout

Alpha (v1.36)

Framework implementation in k8s.io/dynamic-resource-allocation/kubeletplugin: Metadata field on Device; command line flags for driver integration
Drivers integrate the framework’s flag into their CLI flags; operators control enable/disable via the flag
Unit tests (JSON generation, CDI spec, file lifecycle); integration tests with test driver; E2E for file mount and content
Documentation for driver authors on flag integration and metadata usage

Beta

At least one real DRA driver and one consumer (e.g., KubeVirt) have validated the feature
MetadataUpdater API validated with network DRA drivers (NRI hook flow)
Production-ready error handling and edge cases
Performance benchmarks show < 10ms added latency to PrepareResourceClaims
Metrics for observability (errors, latency, writes, active files)
Documentation for workload developers
Upgrade/downgrade testing completed

GA

At least one stable consumer (e.g., KubeVirt) using attributes in production
Comprehensive e2e coverage including failure scenarios

Test Plan

[x] I/we understand the owners of the involved components may require updates to existing tests to make this code solid enough prior to committing the changes necessary to implement this enhancement.

Prerequisite testing updates

No additional prerequisite testing updates are required. Existing DRA test infrastructure will be leveraged.

Unit tests

<package>: <date> - <test coverage>

Integration tests

Integration tests will cover:

End-to-end attribute exposure: Create Pod with resourceClaims, verify attributes JSON is generated and mounted
Multiple requests: Pod with claim containing multiple requests, verify separate directories per request
Container isolation: Verify container only sees metadata for requests it references via resources.claims[].request
Empty metadata: Driver writes no attributes, verify file is created with request metadata only
Attribute types: Verify that string, bool, int, and version attributes are written correctly
Generation number: Verify generation increments on metadata updates
Cleanup: Verify files are removed after unprepare
Enable/disable: Verify feature off when flag unset; no files when Device.Metadata is nil

Tests will be added to test/integration/dra/.

e2e tests

E2E tests will validate real-world scenarios:

Metadata file mounted: Pod can read /var/run/dra-device-attributes/{claimName}/{requestName}/{driverName}-metadata.json
Per-request isolation: Container only sees directory for requests it references
Correct content: Verify JSON contains expected apiVersion, kind, metadata, and device list
Multi-device request: Verify attributes from all allocated devices in the request are included
CDI integration: Verify CRI runtime correctly processes CDI device ID and mounts directory
Cleanup on delete: Delete Pod, verify attribute files are removed from host

Tests will be added to test/e2e/dra/dra.go.
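
A test or consumer can build the expected in-container path from the layout in the first item of the list above. The helper below is a minimal sketch; `metadataRoot` and `metadataPath` are hypothetical names, and the values in `main` are taken from Example 1.

```go
package main

import (
	"fmt"
	"path"
)

// metadataRoot is the in-container directory used throughout this document.
const metadataRoot = "/var/run/dra-device-attributes"

// metadataPath joins the claim, request, and driver names into the file path.
// The slash-only "path" package is used because container paths are POSIX paths.
func metadataPath(claimName, requestName, driverName string) string {
	return path.Join(metadataRoot, claimName, requestName, driverName+"-metadata.json")
}

func main() {
	fmt.Println(metadataPath("physical-gpu-claim", "gpu-request", "gpu.example.com"))
	// /var/run/dra-device-attributes/physical-gpu-claim/gpu-request/gpu.example.com-metadata.json
}
```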

Graduation Criteria

Alpha (v1.36)

Framework implementation complete with Device.Metadata field in PrepareResult
Framework writes metadata file when Device.Metadata is populated
Unit tests for core logic (JSON generation, CDI spec creation, file lifecycle)
Integration tests with test driver
E2E test validating file mounting and content
Documentation for driver authors published
Known limitations documented (no schema standardization yet)

Beta

At least one real DRA driver (e.g., dra-example-driver, GPU driver) has adopted the feature
At least one consumer (e.g., KubeVirt) has validated the feature in a test environment
MetadataUpdater API validated with network DRA drivers (NRI hook flow)
Performance benchmarks show < 10ms added latency to PrepareResourceClaims
Metrics exposed for observability:
- dra_metadata_feature_enabled
- dra_metadata_write_errors_total
- dra_metadata_update_errors_total
- dra_metadata_write_duration_seconds
- dra_metadata_writes_total
- dra_metadata_files_active
Declarative validation for metadata types (#137761)
Roundtrip test for device metadata API (#137839)
E2E tests for device metadata (#137699)
Add v1beta1 API for device metadata format (#138994)
Documentation for workload developers published
All Beta PRR questions answered
Upgrade/downgrade testing completed

GA

TBD

Upgrade / Downgrade Strategy

Upgrade: No API changes; the control plane is unchanged. The framework is backward compatible: drivers that don’t populate Device.Metadata behave as before. When the flag is set, drivers that provide metadata expose it; workloads that do not use DRA or do not read the files are unaffected.

Downgrade: Implemented in the driver plugin framework, not kubelet. Disabling the flag or downgrading the driver: new pods won’t get metadata files; existing pods keep theirs until termination.

Rolling upgrade: Toggle the flag per node/deployment; no cluster-wide coordination. Workloads should handle missing metadata files gracefully.

Version Skew Strategy

No control-plane/node coordination (driver-side only). Newer driver with flag set: pods get metadata files. Older driver or flag unset: no metadata files. Test in non-production first.

Production Readiness Review Questionnaire

Feature Enablement and Rollback

How can this feature be enabled / disabled in a live cluster?

Set or unset the framework flag (e.g. --enable-device-metadata) in the DRA plugin process args (e.g. DaemonSet). No driver image change required.

Does enabling the feature change any default behavior?

No existing behavior changes. When enabled, containers that reference DRA claims gain additional read-only CDI mounts that expose device attributes as JSON files.

Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?

Yes. Unset the flag (or omit it) in the plugin deployment. New pods won’t get metadata files; existing pods keep theirs until they terminate. Ensure attribute consumers have a fallback if needed.

What happens if we reenable the feature if it was previously rolled back?

Restores full functionality for new pods; no data migration or special handling.

Are there any tests for feature enablement/disablement?

Yes

unit tests (no files when feature off or Device.Metadata nil)
integration tests (flag toggle)
E2E (files present when on, absent when off)

Rollout, Upgrade and Rollback Planning

How can a rollout or rollback fail? Can it impact already running workloads?

This feature is implemented entirely in the DRA driver plugin framework, not in kubelet or the control plane. Rollout/rollback is controlled by the driver’s deployment (e.g., DaemonSet args).

Rollout failure scenarios:

Driver restart mid-rollout: Running pods retain their metadata files (already mounted). New pods scheduled to that node wait for the driver to become ready. No data loss.
Mixed node states: During rollout, some nodes may have the updated driver while others don’t. Cluster administrators should complete the driver rollout across all relevant nodes before communicating to workload authors that metadata support is available. Workloads expecting metadata should only be deployed after the administrator confirms the rollout is complete. Additionally, workload documentation and library functions guide consumers to handle missing metadata files gracefully as a defensive measure.
File write failure: If the framework cannot write metadata files due to I/O errors, PrepareResourceClaims fails and the pod fails to start. This is the correct behavior. Note: The DRA driver DaemonSet is expected to be configured with proper permissions to write to /var/run/dra-device-attributes/, so permission errors should not occur in normal operation.

Impact on running workloads: None. Metadata files are mounted read-only at pod start. Disabling the feature or restarting the driver does not affect already-running pods.

What specific metrics should inform a rollback?

Operators should monitor the following metrics:

dra_metadata_write_errors_total: Increasing error count indicates file write failures
dra_metadata_update_errors_total: Increasing error count indicates NRI hook update failures
dra_metadata_write_duration_seconds: Latency spikes (p99 > 100ms) may indicate I/O issues

If error metrics increase after enabling the feature, rollback by disabling the metadata feature in the driver configuration.

Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?

Yes, automated e2e testing is included in kubernetes/kubernetes#137699:

Upgrade: Enable the feature, deploy pods, verify metadata files exist
Downgrade: Disable the feature, deploy new pods, verify no metadata files are created
Upgrade again: Re-enable the feature, verify new pods receive metadata files as expected

The feature is additive and stateless (files are created per-pod, cleaned up on unprepare), so the upgrade/downgrade cycle has no persistent side effects.

Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?

No. This feature is purely additive. No existing APIs, flags, or behaviors are deprecated or removed.

Monitoring Requirements

How can an operator determine if the feature is in use by workloads?

Operators can check:

Metric: Query dra_metadata_feature_enabled == 1 to find nodes with the feature enabled
File presence on node: Check for metadata files under /var/run/dra-device-attributes/ on nodes
Pod inspection: Exec into a pod using DRA claims and check if the metadata file exists at /var/run/dra-device-attributes/{claimName}/{requestName}/{driverName}-metadata.json

How can someone using this feature know that it is working for their instance?

Metrics
- dra_metadata_feature_enabled == 1: Confirms the feature is enabled on the node
- dra_metadata_writes_total increasing when pods with DRA claims are created: Confirms metadata files are being written
- dra_metadata_write_errors_total == 0: Confirms no write failures
Other
- Details: Users can also verify by checking for the metadata file inside their container:
```
# Inside the container
ls /var/run/dra-device-attributes/
cat /var/run/dra-device-attributes/{claimName}/{requestName}/{driverName}-metadata.json
```
  If the file exists and contains valid JSON with the expected device attributes, the feature is working.

What are the reasonable SLOs (Service Level Objectives) for the enhancement?

This feature adds minimal overhead to pod startup. The metrics below cover only the overhead within the DRA driver framework (writing metadata files). Additional overhead exists from the container runtime processing CDI specs with mount directives and the kernel performing bind mounts. This runtime/kernel overhead has not been measured.

SLOs expressed as metrics queries (covering the measurable DRA framework portion):

Latency: histogram_quantile(0.99, sum by (le) (rate(dra_metadata_write_duration_seconds_bucket[5m]))) < 0.01
- 99th percentile of metadata file write latency should be < 10ms
Reliability: rate(dra_metadata_write_errors_total[5m]) == 0
- No write errors when pods with DRA claims are being prepared
Availability: dra_metadata_feature_enabled == 1 on all nodes where the feature is expected

What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?

Metrics
The following metrics are defined in the kubeletplugin framework, which may or may not be exposed by a DRA driver using the framework. Admins need to check the documentation of a DRA driver to determine how to collect these metrics.
- Metric name: dra_metadata_feature_enabled
  - Description: Gauge (0/1) indicating if the feature is enabled on this driver instance
  - Aggregation method: current value per node/driver
  - Components exposing the metric: DRA driver plugin framework
- Metric name: dra_metadata_write_errors_total
  - Description: Counter of failed metadata file writes
  - Aggregation method: sum by driver, error_type
  - Components exposing the metric: DRA driver plugin framework
- Metric name: dra_metadata_update_errors_total
  - Description: Counter of failed metadata updates during NRI hooks (network drivers)
  - Aggregation method: sum by driver, error_type
  - Components exposing the metric: DRA driver plugin framework
- Metric name: dra_metadata_write_duration_seconds
  - Description: Histogram of metadata file write latency
  - Aggregation method: histogram quantiles (p50, p99)
  - Components exposing the metric: DRA driver plugin framework
- Metric name: dra_metadata_writes_total
  - Description: Counter of successful metadata writes
  - Aggregation method: sum by driver
  - Components exposing the metric: DRA driver plugin framework
- Metric name: dra_metadata_files_active
  - Description: Gauge of current metadata files on node
  - Aggregation method: current value per node
  - Components exposing the metric: DRA driver plugin framework

Are there any missing metrics that would be useful to have to improve observability of this feature?

No. The implemented metrics cover both operational health (errors, latency) and usage visibility (writes, active files).

Dependencies

Does this feature depend on any specific services running in the cluster?

DRA driver with framework support
- Usage description: The DRA driver must use the k8s.io/dynamic-resource-allocation/kubeletplugin framework and populate Device.Metadata in PrepareResult.
- Impact of its outage on the feature: If the driver is unavailable, DRA claims cannot be prepared (this affects all DRA functionality, not just this feature).
- Impact of its degraded performance or high-error rates on the feature: Metadata file writing adds minimal overhead; driver performance issues would manifest as slow pod startup.

Scalability

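Will enabling / using this feature result in any new API calls?

No. The feature operates entirely on the node side within the DRA driver plugin framework and makes no API server calls when creating or mounting metadata files.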

Will enabling / using this feature result in introducing new API types?

No.

Will enabling / using this feature result in any new calls to the cloud provider?

No.

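Will enabling / using this feature result in increasing size or count of the existing API objects?

No. The feature introduces no API changes; metadata is written to files on the node rather than to API objects.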

Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?

Yes, but the impact should be minimal:

Pod startup latency: Drivers write metadata files during NodePrepareResources, adding a small I/O overhead. The framework’s schema validation and file writes are lightweight operations.
The feature does not affect existing SLIs/SLOs for clusters that do not use DRA or for drivers that do not opt in to this feature.

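Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, …) in any components?

No non-negligible increase is expected. Metadata file writing adds minimal overhead in the driver plugin; the container runtime and kernel overhead for the additional CDI bind mounts has not been measured.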

Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?

No significant risk of resource exhaustion.

Troubleshooting

How does this feature react if the API server and/or etcd is unavailable?

This feature operates entirely on the node side within the DRA driver plugin framework. It does not make API server calls during metadata file creation or mounting.

API server unavailable: No impact on metadata file writing. The feature continues to work for pods that are already scheduled. New pods cannot be scheduled (general Kubernetes behavior).
etcd unavailable: Same as above. The feature doesn’t interact with etcd directly.

What are other known failure modes?

Metadata file write failure (permissions, I/O error)
- Detection: Pod fails to start with error from DRA driver. Check kubectl describe pod for events mentioning NodePrepareResources failure.
- Mitigations: Ensure the DRA driver process has write permissions to /var/run/dra-device-attributes/. This directory is created and written to by the driver plugin framework during PrepareResourceClaims. Running pods are unaffected.
- Diagnostics: DRA driver logs will show file write errors from the kubeletplugin framework.
- Testing: Unit tests cover file write error handling. Integration tests verify cleanup.
CDI runtime doesn’t support the mount
- Detection: Pod fails to start with CRI errors about invalid CDI spec.
- Mitigations: Upgrade CRI runtime to a version with CDI support (containerd 1.7+, CRI-O 1.23+), or disable the feature by unsetting the driver flag.
- Diagnostics: CRI runtime logs will show CDI parsing errors.
- Testing: E2E tests run on clusters with supported CRI runtimes.
Driver populates invalid metadata (malformed JSON)
- Detection: Workload fails to parse the metadata file.
- Mitigations: Fix the driver implementation. The framework validates basic structure but cannot validate all driver-provided content.
- Diagnostics: Workload logs showing JSON parse errors. Inspect the file directly on the node.
- Testing: Unit tests validate JSON generation. E2E tests verify content structure.
MetadataUpdater called for wrong claim (network drivers)
- Detection: Framework returns error; metadata not updated.
- Mitigations: Fix the driver’s NRI hook implementation to pass correct claim/request identifiers.
- Diagnostics: Driver logs showing "UpdateRequestMetadata failed: claim not found" or similar.
- Testing: Integration tests cover ownership validation.

What steps should be taken if SLOs are not being met to determine the problem?

If latency SLO is not met (dra_metadata_write_duration_seconds p99 > 10ms):

Check node I/O performance; /var/run/ is typically tmpfs, so high latency suggests memory pressure
Check if the node is under heavy load during pod creation
Review driver logs for slow metadata serialization

If error rate SLO is not met (dra_metadata_write_errors_total increasing):

Check driver logs for specific error messages during PrepareResourceClaims
Verify directory permissions for /var/run/dra-device-attributes/
Check for filesystem issues on the node

If availability SLO is not met (dra_metadata_feature_enabled == 0 on expected nodes):

Verify driver configuration enables the metadata feature
Check if the driver pod is running on the affected node
Review driver startup logs for configuration errors

Implementation History

2025-10-02: KEP created and initial proposal drafted
2025-10-03: KEP updated with complete PRR questionnaire responses
2026-01-12: Simplified directory structure based on wg-device-management feedback: removed driverNames file and .supported/ markers to avoid race conditions between drivers and simplify workload discovery
2026-01-25: Redesigned driver integration based on review feedback:
- Removed MetadataWriter helper approach (relied on timing assumptions)
- Added Metadata field to PrepareResult.Device struct
- Drivers now explicitly provide metadata when returning from PrepareResourceClaims
- Added Alternative 3 documenting the interface method approach
- Added MetadataUpdater for network DRA drivers to update metadata via NRI hooks (two-phase metadata with generation number for late-arriving network info)
2026-01-25: Changed from per-claim to per-request metadata organization:
- Each request gets its own metadata.json file in {claimNs}-{claimName}/{requestName}/
- CDI mounts are per-request, so containers only see metadata for requests they use
- Enables proper isolation when a claim has multiple requests used by different containers
2026-06-05: Promoted to Beta (v1.37):
- Completed all Beta PRR questionnaire sections
- Added rollout/rollback planning, monitoring requirements, dependencies, troubleshooting
- Added metrics for observability
- Defined SLOs with concrete metrics queries

Drawbacks

Filesystem dependency: Unlike Downward API environment variables (which are managed by kubelet), this approach requires reliable filesystem access to /var/run/. Failures in file writes block Pod startup.
CDI runtime requirement: Not all CRI runtimes support CDI (or support different CDI versions). This limits compatibility to newer runtimes and requires clear documentation.
Opaque file paths: Workloads must discover filenames via globbing or parse JSON to match claim names. The Downward API approach with env vars would have been more ergonomic.
No schema standardization in Alpha: The JSON structure is subject to change. Early adopters may need to update their parsers between versions.
Driver implementation required: Drivers must populate Device.Metadata in PrepareResult to provide metadata. The Downward API approach would have been transparent to drivers.
Limited discoverability: Workloads can’t easily enumerate all claims or requests; they must know the claim name or glob for files. Env vars would provide named variables.

Alternatives

Alternative 1: Downward API with ResourceSliceAttributeSelector (Original Design)

Description: Add resourceSliceAttributeRef selector to core/v1.EnvVarSource allowing environment variables to reference DRA device attributes. Kubelet would run a local controller watching ResourceClaims and ResourceSlices to resolve attributes at container start.

Example:

env:
- name: PGPU_PCI_BUS_ID
  valueFrom:
    resourceSliceAttributeRef:
      claimName: pgpu-claim
      requestName: pgpu-request
      attribute: resource.kubernetes.io/pciBusID

Pros:

Native Kubernetes API integration
Familiar pattern for users (consistent with Downward API)
Transparent to drivers (no driver changes required)
Type-safe API validation
Named environment variables (no globbing required)

Cons:

Requires core API changes (longer review/approval cycle)
Adds complexity to kubelet (new controller, watches, caching)
Performance impact on API server (kubelet watches ResourceClaims/ResourceSlices cluster-wide or per-node)
Limited to environment variables (harder to expose complex structured data)
Single attribute per reference (multiple env vars needed for multiple attributes)

Why not chosen:

Too invasive for Alpha; requires API review and PRR approval
Kubelet performance concerns with additional watches
Ecosystem requested CDI-based approach for flexibility and faster iteration

Alternative 2: DRA Driver Extends CDI with Attributes (Driver-Specific)

Description: Each driver generates CDI specs with custom environment variables containing attributes. No framework involvement.

Example (driver-generated CDI):

{
  "devices": [{
    "name": "gpu-0",
    "containerEdits": {
      "env": [
        "PGPU_PCI_BUS_ID=0000:00:1e.0",
        "PGPU_DEVICE_ID=device-00"
      ]
    }
  }]
}

Pros:

No framework changes
Maximum driver flexibility
Works today with existing DRA

Cons:

Every driver must implement attribute exposure independently (duplication)
No standardization across drivers (KubeVirt must support N different drivers)
Error-prone (drivers may forget to expose attributes or use inconsistent formats)
Hard to discover (workloads must know each driver’s conventions)

Why not chosen:

Poor user experience (no standard path or format)
High maintenance burden for ecosystem (KubeVirt, etc.)
Missed opportunity for framework to provide common functionality

Alternative 3: Add Method to DRAPlugin Interface

Description: Add a new method GetDeviceMetadata to the DRAPlugin interface that drivers must implement. The framework would call this method after PrepareResourceClaims succeeds.

Example:

type DRAPlugin interface {
    PrepareResourceClaims(ctx context.Context, claims []*resourceapi.ResourceClaim) (map[types.UID]PrepareResult, error)
    UnprepareResourceClaims(ctx context.Context, claims []NamespacedObject) (map[types.UID]error, error)
    HandleError(ctx context.Context, err error, msg string)
    
    // GetDeviceMetadata is called by framework after PrepareResourceClaims succeeds
    GetDeviceMetadata(ctx context.Context, claimUID types.UID) (*DeviceMetadata, error)
}

Pros:

Causes a compile error for existing drivers, which forces awareness of the new feature
Explicit opt-out by returning nil, nil
Clear separation of concerns

Cons:

Requires drivers to maintain state across two method calls (PrepareResourceClaims returns, then GetDeviceMetadata is called separately)
Redundant method: the driver already has all the information during PrepareResourceClaims
Less elegant than returning metadata in PrepareResult

Why not chosen:

Adding a field to PrepareResult is more natural since the driver has accurate device information at the time of preparation
No need for drivers to maintain state across methods
While this approach guarantees compile errors, the benefit doesn’t outweigh the complexity

Infrastructure Needed (Optional)

None. This feature will be developed within existing Kubernetes repositories:

Metadata types (DeviceMetadata, DeviceMetadataRequest, Device) in kubernetes/kubernetes (staging/src/k8s.io/dynamic-resource-allocation/api/metadata and api/metadata/v1alpha1)
Framework implementation in kubernetes/kubernetes (staging/src/k8s.io/dynamic-resource-allocation/kubeletplugin)
Tests in kubernetes/kubernetes (test/integration/dra, test/e2e/dra, test/e2e_node)
Documentation in kubernetes/website (concepts/scheduling-eviction/dynamic-resource-allocation)

Ecosystem integration (future):

KubeVirt will consume attributes from JSON files (separate KEP in kubevirt/kubevirt)
DRA driver examples will be updated to demonstrate Device.Metadata usage