Eliminating Kubernetes Image Signature Replication

By Sascha Grunert (Red Hat) | Friday, June 05, 2026

The image promoter rewrite laid the groundwork for simplifying how Kubernetes delivers container image signatures. One of the rewrite phases (Phase 6) split image signing and signature replication into distinct pipeline stages. This follow-up covers the next step: eliminating signature replication entirely.

The problem

After promoting container images to registry.k8s.io, the promoter signs them using cosign with keyless (OIDC) signatures. These signatures are stored as OCI artifacts alongside the images, tagged with the convention sha256-<digest>.sig and sha256-<digest>.att.
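The tag convention above can be illustrated with a minimal Go sketch. The function name signatureTag and the shortened digest are hypothetical and chosen only for illustration; cosign and the promoter have their own implementations.

```go
package main

import (
	"fmt"
	"strings"
)

// signatureTag is a hypothetical helper that maps an image digest such as
// "sha256:<hex>" to the tag convention described above: the colon becomes a
// dash and a suffix is appended (".sig" for signatures, ".att" for
// attestations).
func signatureTag(digest, suffix string) (string, error) {
	algo, hex, ok := strings.Cut(digest, ":")
	if !ok || algo == "" || hex == "" {
		return "", fmt.Errorf("invalid digest %q", digest)
	}
	return algo + "-" + hex + suffix, nil
}

func main() {
	// The digest is shortened for readability; real digests are 64 hex characters.
	for _, suffix := range []string{".sig", ".att"} {
		tag, err := signatureTag("sha256:0123abcd", suffix)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Println(tag)
	}
}
```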

The registry.k8s.io domain is backed by archeio, a thin redirector that routes container image requests to the nearest regional Google Artifact Registry backend. When a user in Europe pulls an image, archeio redirects them to europe-west2-docker.pkg.dev; a user in Asia gets redirected to asia-east1-docker.pkg.dev, and so on across 22 regional backends.

This geo-routing is great for image layers, where download locality matters for performance. But it created a problem for signatures: if the promoter only wrote a signature to one region, cosign verify would fail for users redirected to any other region. The solution was a dedicated replication pipeline that copied every .sig and .att tag to all 22 regional backends. This pipeline ran as a periodic Prow job every 2 hours on weekdays, performing thousands of API calls per run: listing tags across all repositories, diffing what existed where, and copying the missing signatures.
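To make the "diff and copy" step concrete, here is a rough Go sketch of the kind of comparison such a job has to perform for each repository and region. It is not the promoter's actual code: the function names and the sample tags are hypothetical, and the real pipeline also had to list tags through registry API calls.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// isSignatureTag reports whether a tag follows the sha256-<digest>.sig or
// sha256-<digest>.att convention.
func isSignatureTag(tag string) bool {
	return strings.HasPrefix(tag, "sha256-") &&
		(strings.HasSuffix(tag, ".sig") || strings.HasSuffix(tag, ".att"))
}

// missingSignatureTags returns, in sorted order, the signature and
// attestation tags that exist in source but not in dest. These are the tags
// a replication run would still need to copy.
func missingSignatureTags(source, dest []string) []string {
	have := make(map[string]struct{}, len(dest))
	for _, t := range dest {
		have[t] = struct{}{}
	}
	var missing []string
	for _, t := range source {
		if !isSignatureTag(t) {
			continue // ordinary image tags are not part of signature replication
		}
		if _, ok := have[t]; !ok {
			missing = append(missing, t)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	// Hypothetical tag listings for one repository in two regions.
	source := []string{"v1.36.0", "sha256-aaa.sig", "sha256-aaa.att", "sha256-bbb.sig"}
	regional := []string{"v1.36.0", "sha256-aaa.sig"}
	fmt.Println(missingSignatureTags(source, regional))
}
```

Repeating this comparison for every repository against each of the 22 backends is what drove the API call volume described above.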

The insight

Signatures and attestations are small metadata artifacts, typically a few kilobytes each. Unlike image layers, where geo-locality meaningfully improves download performance, fetching a signature from a non-local region adds negligible latency. The entire replication pipeline existed to optimize for a latency difference that users would never notice.

The solution

Instead of replicating signatures everywhere, archeio was taught to route signature requests to a single canonical upstream. The change is straightforward: when archeio receives a manifest request for a tag matching sha256-*.sig or sha256-*.att, it redirects to us-central1-docker.pkg.dev (the canonical region) instead of the caller’s nearest regional backend. All other requests continue to use geo-routing as before.

Normal image pull:
  registry.k8s.io ⟶ (geo-routing) ⟶ europe-west2-docker.pkg.dev

Signature verification:
  registry.k8s.io ⟶ (canonical)   ⟶ us-central1-docker.pkg.dev

This is configured through a new SIGNATURE_UPSTREAM_ENDPOINT environment variable on each Cloud Run instance that runs archeio.
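The routing decision can be sketched in Go roughly as follows. This is an illustration under stated assumptions rather than archeio's actual code: the function names are hypothetical, the request path is assumed to follow the standard /v2/<name>/manifests/<reference> layout, and falling back to the regional backend when no canonical upstream is set is a choice made for this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// isSignatureTag reports whether ref matches sha256-*.sig or sha256-*.att.
func isSignatureTag(ref string) bool {
	return strings.HasPrefix(ref, "sha256-") &&
		(strings.HasSuffix(ref, ".sig") || strings.HasSuffix(ref, ".att"))
}

// pickUpstream returns the host a request should be redirected to.
// Manifest requests for signature or attestation tags go to the canonical
// upstream; everything else keeps using the caller's regional backend.
// An empty canonical value disables the special case (an assumption of
// this sketch).
func pickUpstream(path, regional, canonical string) string {
	const marker = "/manifests/"
	idx := strings.LastIndex(path, marker)
	if idx < 0 {
		return regional // not a manifest request, for example a blob download
	}
	ref := path[idx+len(marker):]
	if canonical != "" && isSignatureTag(ref) {
		return canonical
	}
	return regional
}

func main() {
	regional := "europe-west2-docker.pkg.dev"
	canonical := "us-central1-docker.pkg.dev"

	fmt.Println(pickUpstream("/v2/kube-apiserver/manifests/v1.36.0", regional, canonical))
	fmt.Println(pickUpstream("/v2/kube-apiserver/manifests/sha256-0123abcd.sig", regional, canonical))
}
```

In a real deployment the canonical value would come from SIGNATURE_UPSTREAM_ENDPOINT; the exact format of that value is not shown here.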

On the promoter side, the signing target was updated to explicitly use us-central1-docker.pkg.dev as the canonical registry, instead of relying on alphabetical sorting of registry names (which would have picked asia-east1-docker.pkg.dev). The replicate-signatures subcommand was then removed along with all supporting code.

What changed

The rollout was sequenced to ensure signature verification kept working at every step:

kubernetes/registry.k8s.io#321: Added SIGNATURE_UPSTREAM_ENDPOINT support to archeio
kubernetes/k8s.io#9413: Deployed the new environment variable to all Cloud Run instances and updated the archeio image digest
Verified that cosign verify works against registry.k8s.io and that .sig/.att requests redirect to us-central1-docker.pkg.dev
kubernetes-sigs/promo-tools#1829: Removed the replication pipeline and updated the signing target, released as kpromo v4.5.0
kubernetes/test-infra#36909: Removed the periodic Prow replication job

Impact

Removing signature replication:

Eliminates thousands of API calls that were spent listing tags and copying signatures across 22 regions every 2 hours
Removes a source of transient failures, since the replication job was susceptible to Artifact Registry rate limits
Simplifies the promoter codebase by deleting the two-phase tag listing, multi-registry grouping logic, and concurrent copy orchestration (over 1,200 lines removed)
Removes a periodic Prow job that ran on weekdays

End users see no change. cosign verify against registry.k8s.io continues to work exactly as before:

cosign verify registry.k8s.io/kube-apiserver:v1.36.0 \
  --certificate-identity krel-trust@k8s-releng-prod.iam.gserviceaccount.com \
  --certificate-oidc-issuer https://accounts.google.com

Trade-offs

Routing all signature requests to a single region means that if us-central1 is unavailable, cosign verify for images served through registry.k8s.io would fail until the region recovers. This is the main trade-off of the approach.

A few mitigating factors make this acceptable in practice:

Artifact Registry is a managed Google Cloud service with high regional availability. An outage of us-central1 would likely affect far more than just signature serving.
Signatures are small metadata (a few KB), so fetching them from a single region adds negligible latency. cosign verification also already depends on registry availability during normal operation, whether the manifest comes from a regional or central backend.
Image pulls themselves are unaffected. Geo-routing for image layers continues to work independently of signature availability.

What’s next

The broader Kubernetes ecosystem is moving toward OCI 1.1 referrers for signature discovery, replacing the tag-based convention that cosign has used historically. cosign v3 defaults to storing signatures as OCI referrers. As this migration progresses, the tag-matching logic in archeio can eventually be replaced with referrer-aware routing.

Getting involved

This work is tracked in kubernetes-sigs/promo-tools#1762. If you are interested in contributing to SIG Release, join our weekly meeting or reach out on the #sig-release Slack channel.