
fix(collector): fold Docker Hub registry-host aliases to docker.io #219

Open
thomasboni wants to merge 1 commit into main from fix/docker-hub-registry-alias-normalization


@thomasboni
Contributor

Problem

The image_authorized_sources policy does a literal go-wildcard match and performs no host canonicalisation. The collectors kept explicit registry hosts verbatim, so an image referenced via a Docker Hub alias hostname was not matched by a docker.io/* trustedUrls pattern and got flagged ISSUE-101 — even though it is the same image.

Example: registry.hub.docker.com/library/node:alpine was flagged despite docker.io/library/* being trusted.
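The mismatch can be reproduced in isolation. The sketch below uses Go's standard `path.Match` as a stand-in for the policy's wildcard matcher. That is an assumption: the matcher used by image_authorized_sources may treat `*` and `/` differently. The point it illustrates holds either way: a purely textual pattern cannot tell that two hostnames name the same registry.

```go
package main

import (
	"fmt"
	"path"
)

// matchesTrusted reports whether ref matches the trusted pattern.
// path.Match is only a stand-in here, not the policy's real matcher.
func matchesTrusted(pattern, ref string) bool {
	ok, err := path.Match(pattern, ref)
	return err == nil && ok
}

func main() {
	pattern := "docker.io/library/*"

	// Canonical host: the literal prefix lines up, so the pattern matches.
	fmt.Println(matchesTrusted(pattern, "docker.io/library/node:alpine")) // true

	// Alias host: same image on Docker Hub, but the text differs.
	fmt.Println(matchesTrusted(pattern, "registry.hub.docker.com/library/node:alpine")) // false
}
```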

Affected aliases (all resolve to Docker Hub):

  • registry.hub.docker.com
  • index.docker.io
  • registry-1.docker.io

Fix

Add a shared canonicalisation helper and apply it at parse time for both providers:

  • GitLab (parseImageLink): now wraps the existing parser (renamed parseImageReference) and folds i.Registry once — covering all ~19 registry-assignment sites uniformly instead of editing each.
  • GitHub (splitImageRef): folds the leading host segment of the image name (GitHub refs keep the registry inside Name).

Both providers share the same image_authorized_sources.rego, so this makes alias handling consistent across them.
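A rough sketch of the shape of this change, for readers who have not opened the diff. The function names come from this PR, but the signatures, the `Image` type, and the body of `parseImageReference` are assumptions made for illustration; the real parser handles many more cases.

```go
package main

import (
	"fmt"
	"strings"
)

// dockerHubAliases holds the three alias hostnames listed in this PR.
var dockerHubAliases = map[string]bool{
	"registry.hub.docker.com": true,
	"index.docker.io":         true,
	"registry-1.docker.io":    true,
}

// canonicalizeDockerHubRegistry folds a known alias to "docker.io" and
// returns any other host unchanged. Signature assumed.
func canonicalizeDockerHubRegistry(registry string) string {
	if dockerHubAliases[registry] {
		return "docker.io"
	}
	return registry
}

// foldDockerHubAliasInName rewrites only the leading host segment of an
// image name, as in the GitHub path where the registry stays inside Name.
func foldDockerHubAliasInName(name string) string {
	host, rest, found := strings.Cut(name, "/")
	if !found {
		return name // no host segment to fold
	}
	return canonicalizeDockerHubRegistry(host) + "/" + rest
}

// Image is a hypothetical stand-in for the collector's parsed-image type.
type Image struct {
	Registry string
	Name     string
}

// parseImageReference is a naive placeholder for the existing GitLab parser:
// it just splits on the first slash.
func parseImageReference(ref string) Image {
	host, rest, _ := strings.Cut(ref, "/")
	return Image{Registry: host, Name: rest}
}

// parseImageLink wraps the parser and folds Registry in one place, so the
// individual registry-assignment sites need no changes.
func parseImageLink(ref string) Image {
	i := parseImageReference(ref)
	i.Registry = canonicalizeDockerHubRegistry(i.Registry)
	return i
}

func main() {
	fmt.Println(parseImageLink("index.docker.io/library/node").Registry)          // docker.io
	fmt.Println(foldDockerHubAliasInName("registry.hub.docker.com/library/node")) // docker.io/library/node
	fmt.Println(foldDockerHubAliasInName("ghcr.io/acme/app"))                     // ghcr.io/acme/app
}
```

Folding in the wrapper rather than at each assignment site means a future alias only has to be added to one table.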

Impact

Users can drop registry.hub.docker.com/*-style alias workarounds from their trusted-registry config. Digest pins and bare/library/ forms are unaffected — they were already covered by the trustDockerHubOfficial fast-path and /* patterns respectively.

Tests

New collector/registry_alias_normalization_test.go:

  • Unit tests for canonicalizeDockerHubRegistry and foldDockerHubAliasInName
  • parseImageLink (GitLab) and splitImageRef (GitHub) fold aliases; non-Hub hosts untouched
  • E2E through the OPA engine: with only docker.io/library/* trusted (no alias pattern), a Hub-alias ref is authorized for both providers; a genuinely untrusted registry stays flagged

go test ./collector/ ./policies/ passes; go vet and gofmt are clean.

🤖 Generated with Claude Code

The image_authorized_sources policy does a literal go-wildcard match and
performs no host canonicalisation. The collectors kept explicit registry
hosts verbatim, so an image referenced via a Docker Hub alias hostname
(registry.hub.docker.com, index.docker.io, registry-1.docker.io) was NOT
matched by a docker.io/* trustedUrls pattern and got flagged ISSUE-101,
even though it is the same image.

Add a shared canonicalisation helper and apply it at parse time for both
providers:
- GitLab: parseImageLink now wraps the existing parser (renamed to
  parseImageReference) and folds i.Registry once, covering all ~19
  registry-assignment sites uniformly.
- GitHub: splitImageRef folds the leading host segment of the image name
  (GitHub refs keep the registry inside Name).

This lets users drop the registry.hub.docker.com/* alias workaround from
their trusted-registry config. Digest pins and bare/library forms are
unaffected (already covered by the official fast-path and /* patterns).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@cursor

cursor Bot commented May 29, 2026

PR Summary

Low Risk
Localized collector normalization before policy evaluation; non-Hub registries unchanged and tests include a negative control for untrusted images.

Overview
Docker Hub hostname aliases (registry.hub.docker.com, index.docker.io, registry-1.docker.io) are now folded to docker.io at image parse time, so image_authorized_sources trustedUrls globs written for docker.io/* match the same image regardless of which Hub hostname pipelines use—without duplicate alias patterns in config.

GitLab: parseImageLink delegates to the existing logic (renamed parseImageReference) and then runs canonicalizeDockerHubRegistry on Registry once for all paths. GitHub: splitImageRef rewrites the leading host in the image name via foldDockerHubAliasInName (registry stays in Name for Actions).

New tests cover helpers, both parsers, and an OPA E2E check that alias refs authorize with only docker.io/library/* trusted while non-Hub images still get ISSUE-101.

Reviewed by Cursor Bugbot for commit 48370e5. Bugbot is set up for automated code reviews on this repo.
