Docker supply chain hardening — from Scout D to OpenSSF 7.8 on a 700K-pull image

Earlier this month my public Docker image heyvaldemar/aws-kubectl held a Docker Scout grade of D. It had crossed 700,000 pulls on Docker Hub. The same image now holds a Scout grade of B and an OpenSSF Scorecard of 7.8 out of 10, well above the typical score for public utility images. I reviewed the configuration line by line, shipped three major phases, and hit one production incident that cost me a full afternoon.

The pattern repeats. Banking. Telecom. Cloud-native. A utility image gets adopted because it works, pulls climb into six figures, and nobody touches the Dockerfile for years. Defaults that shipped in 2023 are still the defaults in production in 2026. If you maintain a public image with meaningful adoption and you have not audited it against current supply chain expectations, what follows is the checklist I wish I had kept from the start. For the performance side of the same site rebuild that preceded this work, see the Cloudflare Web Analytics migration that unlocked Lighthouse 100.

Docker Hub Scout health score showing grade B with seven supply chain checks passing

What grade D actually looked like#

The starting Dockerfile was the version most maintainers recognize. One stage. A FROM ubuntu:24.04 without a digest. apt-get install for curl and unzip. A bash loop to download the AWS CLI zip and kubectl binary. No USER directive, which means implicit root. No OCI labels. No hadolint. No lockfile for the base image.

# Before: single stage, implicit root, no labels, no digest pin
FROM ubuntu:24.04
RUN apt-get update && apt-get install -y curl unzip \
    && curl -o awscliv2.zip https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip \
    && unzip awscliv2.zip && ./aws/install \
    && curl -LO https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl \
    && install kubectl /usr/local/bin/kubectl
ENTRYPOINT ["/bin/bash"]

The secure version after Phase 3 looks like a different image. Multi-stage so the build tools never ship in the runtime. Base digest pinned. Explicit USER 10001 so the container runs as a non-root account with GID 0 for OpenShift SCC compatibility. Full set of OCI labels so Scout and GitHub actually know what they are scanning.

# After: multi-stage, digest-pinned, non-root, OCI-labeled
FROM ubuntu:24.04@sha256:c4a8d5503dfb2a3eb8ab5f807da5bc69a85730fb49b5cfca2330194ebcc41c7b AS builder
ARG KUBECTL_VERSION=1.35.4
RUN apt-get update && apt-get install -y --no-install-recommends \
        ca-certificates curl unzip \
    && curl -fsSL -o /tmp/awscliv2.zip \
        https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip \
    && unzip -q /tmp/awscliv2.zip -d /tmp \
    && /tmp/aws/install -i /opt/aws-cli -b /usr/local/bin \
    && curl -fsSL -o /usr/local/bin/kubectl \
        https://dl.k8s.io/release/v${KUBECTL_VERSION}/bin/linux/amd64/kubectl \
    # Two spaces between hash and path: sha256sum -c rejects single-space lines
    && echo "$(curl -fsSL https://dl.k8s.io/release/v${KUBECTL_VERSION}/bin/linux/amd64/kubectl.sha256)  /usr/local/bin/kubectl" | sha256sum -c - \
    && chmod +x /usr/local/bin/kubectl

FROM ubuntu:24.04@sha256:c4a8d5503dfb2a3eb8ab5f807da5bc69a85730fb49b5cfca2330194ebcc41c7b
LABEL org.opencontainers.image.source="https://github.com/heyvaldemar/aws-kubectl-docker"
LABEL org.opencontainers.image.licenses="MIT"
LABEL org.opencontainers.image.description="AWS CLI v2 + kubectl on Ubuntu 24.04"
RUN apt-get update && apt-get install -y --no-install-recommends \
        ca-certificates \
    && rm -rf /var/lib/apt/lists/* \
    && groupadd -g 0 -o root 2>/dev/null || true \
    && useradd -u 10001 -g 0 -m -d /home/app -s /bin/bash app \
    && chmod -R g=u /home/app
COPY --from=builder /opt/aws-cli /opt/aws-cli
COPY --from=builder /usr/local/bin/kubectl /usr/local/bin/kubectl
RUN ln -s /opt/aws-cli/v2/current/bin/aws /usr/local/bin/aws
USER 10001
WORKDIR /home/app
ENV HOME=/home/app
ENTRYPOINT ["/bin/bash"]
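The checksum gate in the builder stage is the part worth reusing for any downloaded artifact. Isolated as a shell sketch, with the well-known SHA-256 of `hello\n` standing in for the value a vendor would publish in a `.sha256` file (the file name is mine):

```shell
# Verify a downloaded artifact against a published digest before trusting it.
# EXPECTED stands in for the contents of the vendor's .sha256 file.
printf 'hello\n' > artifact
EXPECTED=5891b5b522d5df086d0ff0b110fbd9d21bb4fc7163af34d08286a2e846f6be03
# Note the two spaces between hash and file name; sha256sum -c
# rejects lines with a single space as improperly formatted.
echo "${EXPECTED}  artifact" | sha256sum -c -
# prints: artifact: OK
```

A mismatch makes `sha256sum -c` exit non-zero, which fails the `RUN` step and the whole build, exactly the behavior you want when a download is tampered with.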

I have watched the missing USER directive fail the same way for twenty years. In banking it was a compliance audit finding three weeks before go-live. In telecom it was a lateral movement incident that reached the image registry. In cloud-native it is the reason OpenShift restricted-v2 SCC refuses your pod. One line. The cost of skipping it compounds every year the image gets pulled.

Full source including CI workflows, SECURITY.md disclosure policy, and the v1-maintenance migration guide lives at github.com/heyvaldemar/aws-kubectl-docker.

Why this keeps happening#

Sonatype’s 2024 State of the Software Supply Chain report logged 512,847 malicious open-source packages over the preceding year. That is a 156% rise year-over-year.

Containers are a smaller slice of that total than npm or PyPI. The propagation model is identical. A base image ships, downstream teams pull it, and the maintainer’s attack surface becomes the attack surface of every production cluster that pulled it. One compromised personal access token reaches a cluster in Frankfurt and a cluster in Singapore by the next morning, because the image digest has already rolled out through the nightly base rebuilds in both places.

The second driver is silence between the build and the consumer. Most public images have no signature, no SBOM, and no build provenance. A pull resolves a tag to a digest. That is the entire audit trail. Sigstore fixed the signing cost by making keyless cosign verification free through GitHub OIDC. BuildKit fixed the SBOM and provenance cost by making both a single build flag. The reason most images still ship without them is that nobody ran the migration, not that the tooling is hard.

The third driver is recency. OpenSSF Scorecard is new enough that most maintainers have not seen their own score. Running it once is a ten-minute workflow. Most public images would return a number between 3 and 5 if they ran it today.

Risk and blast radius#

Direct exposure scales with pulls. At over 700,000 pulls the image is past the threshold where a single supply chain compromise reaches hundreds of downstream CI pipelines inside days. The pulls counter is an adoption metric. It is also a blast radius metric.

Systemic exposure is the harder calculation. A single maintainer account. A single personal access token with write:packages. A single compromised laptop. The OWASP Top 10 for CI/CD Security captures these as CICD-SEC-2 (Inadequate Identity and Access Management) and CICD-SEC-6 (Insufficient Credential Hygiene), and the attack pattern that dominated 2024 was exactly this chain. Hardening the build pipeline matters more than hardening the runtime of the image itself, because the runtime is downstream of whoever signed the image.

Regulatory exposure depends on who pulled the image. If any consumer runs it in a workload subject to EU NIS2, US Executive Order 14028, or financial regulatory frameworks that require SBOM attestations, the maintainer is inside the trust chain whether the maintainer asked to be there or not.

Options compared#

The trade-off space for a solo maintainer is narrower than for a funded team. The table below is the honest comparison for public utility images under 1 GB.

| Approach | Setup cost | Ongoing cost | Scorecard ceiling | Fit |
| --- | --- | --- | --- | --- |
| Do nothing | 0 hours | 0 hours/month | ~3/10 | High risk above 100K pulls |
| Multi-stage + lint only | 4 hours | 30 min/month | ~5/10 | Minimum viable for a public image |
| Add cosign + SBOM + SLSA | 8 hours | 30 min/month | ~7/10 | Recommended above 250K pulls |
| Full hardening + non-root | 16 hours | 1 hour/month | ~7.8/10 | Required for enterprise downstream use |
| Full + CII Best Practices badge | 40 hours | 2 hours/month | ~8.5/10 | Worth it only for funded projects |

The jump from 5 to 7 costs four hours once. The jump from 7 to 7.8 costs another eight hours plus a breaking release. The jump from 7.8 to 8.5 costs forty hours of process overhead and caps at a ceiling you do not control. I stopped at 7.8 because the marginal return on the next tier goes negative for a solo maintainer.

OpenSSF Scorecard report showing 7.8 score with all 18 checks broken down by severity

Framework: supply chain hardening for solo maintainers#

Three layers. Each one is a deploy cycle. The names match the migration topic family because this is a migration from insecure defaults to attested defaults, not a new build from scratch.

Layer 1: inventory#

The first goal is the audit, not the fix. You cannot harden what you have not measured.

# Layer 1: run once before any changes
# 1. Current Scorecard baseline
docker run -e GITHUB_AUTH_TOKEN=$GITHUB_TOKEN gcr.io/openssf/scorecard:stable \
    --repo=github.com/OWNER/REPO --show-details
# 2. Current Scout grade
docker scout quickview OWNER/IMAGE:latest
# 3. Trivy CVE baseline
trivy image --severity HIGH,CRITICAL OWNER/IMAGE:latest
# 4. Preserve the current Dockerfile as the audit baseline before changing anything
git mv Dockerfile Dockerfile.baseline

Run all four. Save the output. The Scorecard baseline is what you will compare against after hardening. In my run the starting numbers were Scout grade D, OpenSSF score 2.9, and multiple HIGH CVEs from apt package lag.
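To keep the baseline comparable week over week, it helps to save the machine-readable report as well. A minimal sketch for pulling the aggregate score back out, assuming the Scorecard run used `--format json` and the output was saved to `scorecard.json` (the file name is mine):

```shell
# Extract the aggregate score from a saved Scorecard JSON report.
# scorecard.json is assumed to come from a run with --format json,
# whose top-level object carries the aggregate "score" field.
jq -r '.score' scorecard.json
```

Diffing this single number before and after each phase is the cheapest regression check the program has.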

Owner: the single maintainer.

Layer 2: parallel run#

Phase 1. Multi-stage build. OCI labels. Hadolint in CI.

Phase 2. Cosign keyless signing. SBOM generation. SLSA build provenance. Trivy SARIF upload.

Phase 3. Non-root USER 10001 breaking release plus a v1-maintenance floating tag for users who cannot migrate inside the window. Semver and digest-pinned tags are marked immutable on Docker Hub so a given tag can never silently point at a different image.
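Phase 1's hadolint gate is the cheapest of the three to stand up. A sketch of the workflow, assuming the stock hadolint/hadolint-action with its default inputs; the version tag shown is illustrative and should be pinned to a commit SHA like every other action in this post:

```yaml
# Phase 1 sketch: fail the PR when the Dockerfile drifts from best practice.
name: lint
on: [pull_request]
permissions:
  contents: read
jobs:
  hadolint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hadolint/hadolint-action@v3.1.0
        with:
          dockerfile: Dockerfile
          failure-threshold: warning
```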

The signing job is the piece that earns the Cosign Verified badge and the SLSA provenance:

# Layer 2: .github/workflows/publish.yml (signing + attestation)
permissions:
  contents: read
  id-token: write
  packages: write
  attestations: write
jobs:
  build:
    runs-on: ubuntu-latest
    outputs:
      digest: ${{ steps.build.outputs.digest }}
    steps:
      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
      - uses: docker/setup-buildx-action@4d04d5d9486b7bd6fa91e7baf45bbb4f8b9deedd # v4
      - uses: docker/login-action@4907a6ddec9925e35a0a9e82d7399ccc52663121 # v4
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}
      - id: build
        uses: docker/build-push-action@bcafcacb16a39f128d818304e6c9c0c18556b85f # v7
        with:
          platforms: linux/amd64,linux/arm64
          push: true
          provenance: mode=max
          sbom: true
          tags: ${{ env.IMAGE }}:${{ env.VERSION }}
      - uses: actions/attest-build-provenance@96278af6caaf10aea03fd8d33a09a777ca52d62f # v3.2.0
        with:
          subject-name: ${{ env.IMAGE }}
          subject-digest: ${{ steps.build.outputs.digest }}
          push-to-registry: false
      - uses: sigstore/cosign-installer@cad07c2e89fa2edd6e2d7bab4c1aa38e53f76003 # v4.1.1
        with:
          cosign-release: "v2.6.1"
      - run: |
          cosign sign --yes \
            "${IMAGE}@${DIGEST}"
        env:
          IMAGE: ${{ env.IMAGE }}
          DIGEST: ${{ steps.build.outputs.digest }}

Incident postmortem. During Phase 2 I shipped the workflow above with push-to-registry: true. The signing step succeeded. The attestation push failed. Docker Hub’s OCI referrers API silently rejected the credential handoff from the workflow. I lost the signatures, re-ran the entire publish job, and the version jumped by one patch. The fix is to keep attestations in GitHub Attestations (where they are retrievable by anyone with the digest) and skip the registry push until Docker Hub stabilizes its referrers behavior. Root cause: the referrers API expects a specific OCI 1.1 header format that the actions/attest-build-provenance v3 action does not negotiate cleanly against Docker Hub’s current implementation. GHCR works. Docker Hub does not. I lost four hours to this.

GitHub Attestations page showing thirteen SLSA provenance records from publish.yml

After Phase 2 the image carries keyless cosign signatures, an SPDX SBOM, and SLSA build provenance at mode=max, all traceable to the exact GitHub Actions run that produced them. After Phase 3 the default user is UID 10001 with GID 0, which means the image drops straight into OpenShift restricted-v2 without a securityContext override.
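In practice that means a bare pod spec schedules cleanly under restricted-v2. A minimal sketch; the pod and container names are mine, the image tag matches the signed release verified at the end of this post:

```yaml
# No securityContext override needed: the image's default UID 10001 / GID 0
# already satisfies OpenShift's restricted-v2 SCC.
apiVersion: v1
kind: Pod
metadata:
  name: aws-kubectl-check
spec:
  containers:
    - name: cli
      image: heyvaldemar/aws-kubectl:2.0.0
      command: ["kubectl", "version", "--client"]
  restartPolicy: Never
```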

Owner: the single maintainer.

Layer 3: cutover and continuous verification#

Weekly automation is what keeps the score at 7.8 instead of drifting back to 5 over time.

.github/workflows/scorecard.yml
on:
  schedule:
    - cron: '0 6 * * 2' # Tue 06:00 UTC
  push:
    branches: [main]
permissions: read-all
jobs:
  analysis:
    runs-on: ubuntu-latest
    permissions:
      security-events: write
      id-token: write
    steps:
      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
        with:
          persist-credentials: false
      # Pinned to a commit SHA, not a tag object SHA (annotated tags dereference differently)
      - uses: ossf/scorecard-action@4eaacf0543bb3f2c246792bd56e8cdeffafb205a # v2.4.3
        with:
          results_file: results.sarif
          results_format: sarif
          publish_results: true
      - uses: github/codeql-action/upload-sarif@95e58e9a2cdfd71adc6e0353d5c52f41a045d225 # v4.35.2
        with:
          sarif_file: results.sarif

The comment on the Scorecard action pin is not paranoia. I hit an imposter commit error on the first publication attempt. Git has had two tag types since 2005: lightweight tags point directly at a commit SHA, annotated tags point at a tag object that wraps the commit SHA. GitHub’s git/refs/tags/:tag API returns whichever SHA the tag ref resolves to. For an annotated tag, that is the tag object SHA, not the commit SHA. Scorecard rejects tag object SHAs during its own verification step because they are not commit SHAs.

The fix is a second API call:

# Annotated tag: first call returns tag object SHA
gh api repos/ossf/scorecard-action/git/refs/tags/v2.4.3 --jq '.object'
# {"sha":"99c09fe...","type":"tag"}
# Dereference: second call returns the commit SHA
gh api repos/ossf/scorecard-action/git/tags/99c09fe... --jq '.object'
# {"sha":"4eaacf0543bb3f2c246792bd56e8cdeffafb205a","type":"commit"}
# Pin the commit SHA, not the tag object SHA
# uses: ossf/scorecard-action@4eaacf0543bb3f2c246792bd56e8cdeffafb205a

It is a twenty-year-old distinction in Git. The kind that only breaks your publication the day you need it to work.
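The distinction is easy to reproduce in a throwaway repository; a sketch, with nothing here specific to Scorecard:

```shell
# Lightweight vs annotated: only the annotated tag introduces a tag object.
set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q
git -c user.email=x@example.com -c user.name=x commit -q --allow-empty -m init
git tag light                                                # ref -> commit
git -c user.email=x@example.com -c user.name=x tag -a heavy -m "release"  # ref -> tag object
git cat-file -t "$(git rev-parse light)"   # prints: commit
git cat-file -t "$(git rev-parse heavy)"   # prints: tag
```

`rev-parse` on the annotated tag returns the tag object SHA; only dereferencing with `heavy^{commit}` gets you back to the commit, which is the same second hop the `gh api` call above performs.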

Weekly base rebuild, weekly Scorecard re-run, and weekly Docker Hub tag cleanup keep the maintenance budget sustainable at under two hours per month. A tag retention policy that deletes sha-* tags older than 90 days keeps the tag list readable by humans and scanners.

GitHub code scanning view showing seven Scorecard findings on the main branch with All tools are working as expected banner

Owner: the single maintainer, with Dependabot as the junior engineer that ships 80% of the ongoing patches.
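Dependabot earns that description with a small config. A sketch of the `.github/dependabot.yml` this layout implies; the weekly intervals are my judgment call, not a requirement:

```yaml
# Sketch: weekly bumps for Actions pins and the Dockerfile base digest.
version: 2
updates:
  - package-ecosystem: "github-actions"
    directory: "/"
    schedule:
      interval: "weekly"
  - package-ecosystem: "docker"
    directory: "/"
    schedule:
      interval: "weekly"
```

The docker ecosystem entry is what keeps the pinned base digest current; without it, digest pinning quietly becomes digest freezing.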

Tradeoffs#

The full program costs sixteen hours of initial work and about one hour per month of ongoing review. At Canadian senior DevOps contractor rates of CAD 120 to 180 per hour, the initial hardening costs between CAD 1,920 and CAD 2,880 in labor. Ongoing maintenance runs CAD 120 to 180 per month. Annualized, the whole program sits between CAD 3,360 and CAD 5,040 in the first year.

A single supply chain incident triage with forensic retention starts at USD 25,000. An SBOM attestation audit finding in an enterprise downstream consumer, delivered to a sales engineer three days before contract signing, kills deals in the high six figures. The asymmetry is the argument. The Codecov bash uploader compromise in 2021 exposed credentials from customer CI environments across a two-month window before detection, through a single injected line of shell. Shell script, not container. Propagation graph identical to a signed-nothing Docker image today.

The breaking release in Phase 3 costs a migration window, a v1-maintenance floating tag, and a migration guide in the README. Downstream teams can pin to v1-maintenance while they budget for a container base image bump.

The Scorecard findings that stay open at 7.8 are mostly honest. Code-Review scores 0/10 because I am the only reviewer. Branch-Protection scores 4/10 because I kept the admin bypass for emergency fixes. Fuzzing scores 0/10 because the image is a utility, not an application. These are trade-offs, not oversights. The score reflects them accurately. That is the point of the program.

The closing argument#

Public Docker images inherit the supply chain obligations of the projects that pull them, whether the maintainer accepts the responsibility or not. The tooling to sign, attest, and score an image is free and the workflows are under 300 lines of YAML. The cost of skipping the migration compounds every month the image stays insecure by default. The cost of running it is one sustained weekend of work and about one hour a month afterward.

Sixteen hours once. One hour a month after. Seven hundred thousand pulls protected from whatever compromise ships against the maintainer next.

Pull the image and verify it yourself:

cosign verify heyvaldemar/aws-kubectl:2.0.0 \
    --certificate-identity-regexp "https://github.com/heyvaldemar/aws-kubectl-docker/.*" \
    --certificate-oidc-issuer "https://token.actions.githubusercontent.com"

Every signature, SBOM, and build provenance record is public in GitHub Attestations.

Discussion#

If you have hardened a public image past 7.0 on Scorecard, hit a Docker Hub referrers issue in production, or kept the alternative and want to argue the cost-benefit, drop a comment below. Counterarguments welcome and the comment thread is where I respond first. For longer back-and-forth with senior practitioners, join the discussion on Discord.


Vladimir Mikhalev

Docker Captain  ·  IBM Champion  ·  AWS Community Builder

The Verdict — production-tested analysis on YouTube.

https://heyvaldemar.com/docker-supply-chain-hardening-solo-maintainer/
Author
Vladimir Mikhalev
Published
2026-04-22
License
CC BY-NC-SA 4.0
