KEP-3762: PersistentVolume last phase transition time

Implementation History
STABLE Implemented
Created 2023-01-20
Updated 2024-09-20
Latest v1.31
Milestones
Alpha v1.28
Beta v1.29
Stable v1.31
Ownership
Owning SIG
SIG Storage
Primary Authors


Release Signoff Checklist

Items marked with (R) are required prior to targeting to a milestone / release.

  • (R) Enhancement issue in release milestone, which links to KEP dir in kubernetes/enhancements (not the initial KEP PR)
  • (R) KEP approvers have approved the KEP status as implementable
  • (R) Design details are appropriately documented
  • (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
    • e2e Tests for all Beta API Operations (endpoints)
    • (R) Ensure GA e2e tests meet requirements for Conformance Tests
    • (R) Minimum Two Week Window for GA e2e tests to prove flake free
  • (R) Graduation criteria is in place
  • (R) Production readiness review completed
  • (R) Production readiness review approved
  • “Implementation History” section is up-to-date for milestone
  • User-facing documentation has been created in kubernetes/website , for publication to kubernetes.io
  • Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes

Summary

We want to add a new PersistentVolumeStatus field, which would hold a timestamp of when a PersistentVolume last transitioned to a different phase.

Motivation

Some users have experienced data loss when using the Delete reclaim policy and reverted to the safer Retain policy. With the Retain policy, all volumes that are retained and left unclaimed have their phase set to Released. As released volumes pile up over time, admins want to perform manual cleanup based on the time when the volume was last used, which is when the volume transitioned to the Released phase.

We can approach the solution in a more generic way and record a timestamp of when the volume transitioned to any phase, not just to the Released phase. This allows anyone, including our perf tests, to measure time, e.g. between a PV being Pending and Bound. This can also be useful for providing metrics and SLOs.

Goals

  1. Introduce a new status field in PersistentVolumes.
  2. Update the new field with a timestamp every time a volume transitions to a different phase (pv.Status.Phase).

Non-Goals

  1. Implement any form of volume health monitoring.
  2. Have Kubernetes take any new actions based on the added timestamps in PersistentVolume.

Proposal

We need to update API server to support the newly proposed field and set a value of the new timestamp field when a volume transitions to a different phase. The timestamp field must be set to current time also for newly created volumes.

The value of the field is not intended for use by any other Kubernetes components at this point and should be used only as a convenience feature for cluster admins. Cluster admins should be able to list and sort PersistentVolumes based on a timestamp which indicates when the volume transitioned to a different state.
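
The listing and sorting described above can be sketched in a few lines of Go. This is a minimal illustration using only the standard library: the `pvInfo` type and the `sortByLastTransition` and `ts` helpers are hypothetical names introduced here, not part of the Kubernetes API, and placing volumes without a timestamp last is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// pvInfo is a hypothetical, trimmed-down view of a PersistentVolume.
// It is not a type from the Kubernetes API.
type pvInfo struct {
	Name                    string
	Phase                   string
	LastPhaseTransitionTime *time.Time // nil when the field was never set
}

// sortByLastTransition orders volumes from the oldest phase transition to
// the newest. Volumes without a timestamp are placed last (an assumption).
func sortByLastTransition(pvs []pvInfo) {
	sort.SliceStable(pvs, func(i, j int) bool {
		a, b := pvs[i].LastPhaseTransitionTime, pvs[j].LastPhaseTransitionTime
		switch {
		case a == nil:
			return false
		case b == nil:
			return true
		default:
			return a.Before(*b)
		}
	})
}

// ts parses an RFC 3339 timestamp and panics on malformed input.
func ts(s string) *time.Time {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		panic(err)
	}
	return &t
}

func main() {
	pvs := []pvInfo{
		{Name: "pv-b", Phase: "Released", LastPhaseTransitionTime: ts("2023-09-12T08:58:01Z")},
		{Name: "pv-c", Phase: "Available"},
		{Name: "pv-a", Phase: "Bound", LastPhaseTransitionTime: ts("2023-09-12T08:53:09Z")},
	}
	sortByLastTransition(pvs)
	for _, pv := range pvs {
		fmt.Println(pv.Name, pv.Phase)
	}
}
```

A pointer is used for the timestamp so that "never set" can be told apart from a real value.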

User Stories (Optional)

Story 1

As a cluster admin I want to use Retain policy for released volumes, which is safer than Delete, and implement a reliable policy to delete volumes that are Released for more than X days.

Story 2

As a cluster admin I want to be able to reason about volume deletion, or produce alerts, based on a volume being in Pending phase for more than X hours.

Notes/Constraints/Caveats (Optional)

The caveat of this proposal is that admins might not see the effect immediately after enabling/disabling the feature gate. This is due to how and when the new LastPhaseTransitionTime field needs to be added/removed.

Adding the field to a PV is reasonable only when the PV actually transitions its phase, because only at that point can we capture a meaningful timestamp. Setting it at any step other than a phase transition would capture a timestamp that is semantically incorrect and misleading.

Risks and Mitigations

The new field is purely informative and should not introduce any risk.

Design Details

Changes required for this KEP:

  • kube-apiserver
      type PersistentVolumeStatus struct {
      ...
      // lastPhaseTransitionTime represents a point in time as a timestamp of when a volume last transitioned its phase.
      // +optional
      LastPhaseTransitionTime *metav1.Time `json:"lastPhaseTransitionTime,omitempty" protobuf:"bytes,4,opt,name=lastPhaseTransitionTime"`
      ...
      }
    
    • update the timestamp whenever PV transitions to a different phase (pv.Status.Phase)
    • allow LastPhaseTransitionTime to be updated by users if needed
    • reset the timestamp in LastPhaseTransitionTime to nil only when feature gate is disabled and LastPhaseTransitionTime is not initialized (time is zero)

Test Plan

[x] I/we understand the owners of the involved components may require updates to existing tests to make this code solid enough prior to committing the changes necessary to implement this enhancement.

Prerequisite testing updates

Current e2e test coverage is sufficient: test/e2e/storage/persistent_volumes.go

New e2e tests will be added for the new timestamp feature.

Unit tests

Changes will be implemented in packages with sufficient unit test coverage.

For any new or changed code we will add new unit tests.

  • pkg/apis/core/validation/: 2023-01-25 - 82%

Integration tests

This feature could be covered with integration tests only, however e2e testing provides more value and might help catch more bugs. Because these two kinds of tests would be almost identical, integration testing is not needed.

e2e tests

We plan to add new e2e tests which should not interfere with any other tests, and so they could run in parallel.

Graduation Criteria

Alpha

  • Feature implemented behind a feature gate
  • Unit tests completed and enabled
  • Add unit tests covering feature enablement/disablement.
  • Initial e2e tests completed and enabled

Beta

  • Allowing time for feedback (at least 2 releases between beta and GA).
  • Manually test upgrade->downgrade->upgrade path.

GA

  • No users complaining about the new behavior.

Upgrade / Downgrade Strategy

No change in cluster upgrade / downgrade process.

When upgrading, the new LastPhaseTransitionTime field and its value will be added to PVs when transitioning phase - this means that enabling and disabling feature gate might not have an immediate effect.

See “Notes/Constraints/Caveats” section for more details.

Version Skew Strategy

Version skew is not applicable because KCM was not changed in the scope of this enhancement.

API server   Behavior
off          Existing Kubernetes behavior.
on           New behavior.

Production Readiness Review Questionnaire

Feature Enablement and Rollback

How can this feature be enabled / disabled in a live cluster?
  • Feature gate (also fill in values in kep.yaml)
    • Feature gate name: PersistentVolumeLastPhaseTransitionTime
    • Components depending on the feature gate: kube-apiserver
  • Other
    • Describe the mechanism:
    • Will enabling / disabling the feature require downtime of the control plane?
    • Will enabling / disabling the feature require downtime or reprovisioning of a node?

Does enabling the feature change any default behavior?

Yes. All PVs will start to contain the new LastPhaseTransitionTime field.

Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?

Yes, for PVs not updated while feature was enabled. However once the LastPhaseTransitionTime value is set, disabling feature gate will not remove the value.

More details in “Upgrade / Downgrade Strategy” section.

What happens if we reenable the feature if it was previously rolled back?

No issues expected. There are two cases that can occur for a PV:

  1. PV did not transition its phase when feature gate was enabled - the LastPhaseTransitionTime field was not added to the PV object so this is the same case as enabling the feature gate for the first time.

  2. PV did transition its phase when feature gate was enabled - the LastPhaseTransitionTime field is already set, and its timestamp value will be updated on next phase change.

See “Upgrade / Downgrade Strategy” and “Notes/Constraints/Caveats” sections for more details.

Are there any tests for feature enablement/disablement?

Unit tests for enabling and disabling feature gate are required for alpha - see “Graduation criteria” section.

The tests should focus on verifying correct handling of the new PV field in relation to feature gate state. Correct handling means the values of the newly added field are added or updated when PV transitions its phase while feature gate is enabled, and persisted if already set and feature gate is disabled.

Feature enablement tests: https://github.com/kubernetes/kubernetes/blob/4eb6b3907a68514e1b2679b31d95d61f4559c181/pkg/registry/core/persistentvolume/strategy_test.go#L45

Rollout, Upgrade and Rollback Planning

How can a rollout or rollback fail? Can it impact already running workloads?

Rollout is unlikely to fail unless the API server itself fails, and there should be no need for rollback because this enhancement only adds a new field.

Rollback in the sense of removing the new field is not possible: once a PV is updated with the new field, disabling this feature will not remove it.

However, users can set any arbitrary timestamp value by patching PV status subresource:

$ kc patch --subresource=status pv/task-pv-volume -p '{"status":{"lastPhaseTransitionTime":"2023-01-01T00:00:00Z"}}'

Or remove it by setting zero timestamp:

$ kc patch --subresource=status pv/pv-1 -p '{"status":{"lastPhaseTransitionTime":"0001-01-01T00:00:00Z"}}'
$ kc get pv/pv-1 -o json | jq '.status.lastPhaseTransitionTime'
null

What specific metrics should inform a rollback?

No metrics are required. Not having the new field set after enabling this feature is a sufficient signal to indicate that there is a problem.

Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?

Manual upgrade->downgrade->upgrade test was performed to verify correct behavior of the new field. If a downgrade is performed there are two scenarios that can occur for each PV:

  1. Phase did not transition while feature was enabled - in this case the feature gets disabled and any updates to LastPhaseTransitionTime field must not be allowed.
  2. Phase transitioned while feature was enabled - in this case the timestamp value must be persisted in LastPhaseTransitionTime field.

After upgrading again the behavior has to match behavior as if the feature was turned on for the first time.

The difference between feature enablement/disablement and downgrade/upgrade is that after downgrading to a version that does not support the LastPhaseTransitionTime field, the data cannot be accessed, whereas only disabling the feature will still show the last value that was set, if present.

Upgrade->downgrade->upgrade test results:

Perform pre-upgrade tests (1.27.5)

  1. Create a PVC to provision a volume:

$ cat /tmp/pvc.yaml
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-1
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 1Gi
  storageClassName: csi-hostpath-sc
$ kc create -f /tmp/pvc.yaml
  1. Verify the PV does not have lastPhaseTransitionTime set:
$ kc get pv/$(kc get pvc/pvc-1 -o json | jq '.spec.volumeName' | tr -d "\"")  -o json | jq  '.status.lastPhaseTransitionTime'
null

Upgrade cluster (1.27.5 -> 1.28.1)

  1. Check available versions:
$ dnf search kubeadm --showduplicates --quiet | grep 1.28
kubeadm-1.28.0-0.x86_64 : Command-line utility for administering a Kubernetes cluster.
kubeadm-1.28.1-0.x86_64 : Command-line utility for administering a Kubernetes cluster.
  1. Upgrade kubeadm:
$ sudo dnf install -y kubeadm-1.28.1-0
  1. Prepare config file that enables FeatureGate:
$ cat /tmp/config.yaml
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
apiServer:
  extraArgs:
    feature-gates: PersistentVolumeLastPhaseTransitionTime=true
controllerManager:
  extraArgs:
    cluster-cidr: 10.244.0.0/16
    feature-gates: PersistentVolumeLastPhaseTransitionTime=true
  1. Perform kubeadm upgrade:
$ sudo kubeadm upgrade plan --config /tmp/config.yaml
$ sudo kubeadm upgrade apply --config /tmp/config.yaml v1.28.1
  1. Perform kubelet upgrade:
$ sudo dnf install -y kubelet-1.28.1-0
$ sudo systemctl daemon-reload 
$ sudo systemctl restart kubelet

Perform post-upgrade tests

  1. Create a second PVC to provision a volume:
$ cat /tmp/pvc2.yaml
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-2
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 1Gi
  storageClassName: csi-hostpath-sc
$ kc create -f /tmp/pvc2.yaml
  1. Verify it has lastPhaseTransitionTime set:
$ kc get pv/$(kc get pvc/pvc-2 -o json | jq '.spec.volumeName' | tr -d "\"")  -o json | jq  '.status.lastPhaseTransitionTime'
"2023-09-12T08:53:09Z"
  1. Change reclaim policy on the first PV to Retain:
$ kc get pv/pvc-0c9ea251-b156-4786-ac82-8713b76bb312
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM           STORAGECLASS      REASON   AGE
pvc-0c9ea251-b156-4786-ac82-8713b76bb312   1Gi        RWO            Retain           Bound    default/pvc-1   csi-hostpath-sc            52m
  1. Delete PVC for the first volume to release the PV:
$ kc delete pvc/pvc-1
  1. Verify the first (pre-upgrade) PV transitioned phase and transition timestamp is now set:
$ kc get pv/pvc-f2eee26c-bca3-448b-9198-d4948f54dce3 -o json | jq '.status.phase'
"Released"

$ kc get pv/pvc-f2eee26c-bca3-448b-9198-d4948f54dce3 -o json | jq '.status.lastPhaseTransitionTime'
"2023-09-12T08:58:01Z"

Downgrade cluster (1.28.1 -> 1.27.5)

$ kc version -o json | jq '.serverVersion.gitVersion'
"v1.27.5"

Perform post-rollback tests

  1. Create another PVC and volume:
$ cat /tmp/pvc3.yaml
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-3
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 1Gi
  storageClassName: csi-hostpath-sc
$ kc create -f /tmp/pvc3.yaml
  1. Verify new PV does not have lastPhaseTransitionTime set:
$ kc get pv/$(kc get pvc/pvc-3 -o json | jq '.spec.volumeName' | tr -d "\"")  -o json | jq  '.status.lastPhaseTransitionTime'
null
  1. Verify lastPhaseTransitionTime of previous PVs can not be accessed anymore:
$ kc get pv/$(kc get pvc/pvc-2 -o json | jq '.spec.volumeName' | tr -d "\"")  -o json | jq  '.status.lastPhaseTransitionTime'
null
  1. Verify lastPhaseTransitionTime can not be set manually:
$ kc patch pvc/pvc-3 -p '{"status":{"lastPhaseTransitionTime":"2023-09-11T13:07:09Z"}}'
Warning: unknown field "status.lastPhaseTransitionTime"
persistentvolumeclaim/pvc-3 patched (no change)

Upgrade cluster again (1.27.5 -> 1.28.1)

  1. Install/update kubeadm:
$ sudo dnf install -y kubeadm-1.28.1-0
  1. Perform kubeadm upgrade:
$ sudo kubeadm upgrade plan --config /tmp/config.yaml
$ sudo kubeadm upgrade apply --config /tmp/config.yaml v1.28.1

Perform post-upgrade tests again

  1. Verify timestamp is available again and unchanged on old PVs:
$ kc get pv/$(kc get pvc/pvc-2 -o json | jq '.spec.volumeName' | tr -d "\"")  -o json | jq  '.status.lastPhaseTransitionTime'
"2023-09-12T08:53:09Z"
$ kc get pv/pvc-f2eee26c-bca3-448b-9198-d4948f54dce3 -o json | jq '.status.lastPhaseTransitionTime'
"2023-09-12T08:58:01Z"
  1. Change reclaim policy on an existing PV, release it, and check that lastPhaseTransitionTime is set correctly:
$ kc get pv/pvc-2e55f2fd-b0dc-4c95-b8d5-085d16ee6d27 -o json | jq '.spec.persistentVolumeReclaimPolicy'
"Delete"

$ kc patch pv/pvc-2e55f2fd-b0dc-4c95-b8d5-085d16ee6d27 -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
persistentvolume/pvc-2e55f2fd-b0dc-4c95-b8d5-085d16ee6d27 patched

$ kc get pv/pvc-2e55f2fd-b0dc-4c95-b8d5-085d16ee6d27 -o json | jq '.spec.persistentVolumeReclaimPolicy'
"Retain"

$ kc get pv/pvc-2e55f2fd-b0dc-4c95-b8d5-085d16ee6d27 -o json | jq '.status.phase'
"Bound"

$ kc delete pvc/pvc-2
persistentvolumeclaim "pvc-2" deleted

$ kc get pv/pvc-2e55f2fd-b0dc-4c95-b8d5-085d16ee6d27 -o json | jq '.status.phase'
"Released"

$ kc get pv/pvc-2e55f2fd-b0dc-4c95-b8d5-085d16ee6d27 -o json | jq '.status.lastPhaseTransitionTime'
"2023-09-12T12:05:07Z"

$ date
Tue Sep 12 12:05:24 PM UTC 2023

Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?

No.

Monitoring Requirements

How can an operator determine if the feature is in use by workloads?

PV objects can be inspected for LastPhaseTransitionTime field.

How can someone using this feature know that it is working for their instance?
  • API .status
    • Other field: pv.Status.LastPhaseTransitionTime

What are the reasonable SLOs (Service Level Objectives) for the enhancement?

N/A - no SLI defined

What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
  • Other (treat as last resort)
    • Details: To check correct functionality, inspect LastPhaseTransitionTime of a PV after binding it to a PVC. Or simply create a PVC and check dynamically provisioned PV if it has a LastPhaseTransitionTime set to current time.

Are there any missing metrics that would be useful to have to improve observability of this feature?

Due to the simple nature of this feature there’s no need to add any metric.

Dependencies

Does this feature depend on any specific services running in the cluster?

No.

Scalability

Will enabling / using this feature result in any new API calls?

No, the feature is implemented directly in API strategy for updating PVs.

Will enabling / using this feature result in introducing new API types?

No.

Will enabling / using this feature result in any new calls to the cloud provider?

No.

Will enabling / using this feature result in increasing size or count of the existing API objects?

Yes, all PV objects will have an entirely new status field to hold a timestamp called LastPhaseTransitionTime.

Estimated increase in size: < 50B

Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?

No.

Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, …) in any components?

No.

Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?

No.

Troubleshooting

How does this feature react if the API server and/or etcd is unavailable?

If the API server or etcd is unavailable, objects cannot be updated. Since this feature relies on PV updates to set the LastPhaseTransitionTime field, it is effectively disabled in this case.

What are other known failure modes?

None, the feature is dependent only on API server and should not be affected by other failures.

What steps should be taken if SLOs are not being met to determine the problem?

Users should inspect API server logs for errors in case PV objects are not updated properly.

Implementation History

  • 1.28: alpha
  • 1.29: beta
  • 1.31: GA

Drawbacks

No drawbacks discovered, enhancement only adds a new informative field.

Alternatives

An alternative solution is to update the phase transition timestamp in the PV controller/KCM. This would increase the chance of a time skew between API audit logs and the timestamp, so updating the timestamp in the API strategy code is the better solution.

Infrastructure Needed (Optional)

None.