feat: wire image field through API for custom image pinning per database or node#401

Open
tsivaprasad wants to merge 2 commits into PLAT-596-add-image-version-fields-to-database-spec from PLAT-599-allow-custom-image-override-per-database-or-node

Conversation

@tsivaprasad

@tsivaprasad tsivaprasad commented Jun 5, 2026

Contributor

Summary

This PR wires the image field through the API, allowing users to pin any container image at either the database or node level via orchestrator_opts.swarm.image. The Control Plane deploys the specified image without requiring manifest validation, while returning a warning in the API response when the image is not present in the manifest.

Changes

  • Added the image field to the SwarmOpts Goa type and regenerated the API layer.
  • Added Warnings []string to ValidationResult and ValidateSpecOutput, allowing non-fatal validation warnings to be propagated through the validation workflow and returned to API callers.
  • Updated ValidateInstanceSpecs to emit a warning when SwarmOpts.Image is explicitly set and differs from the manifest-resolved image for the selected version. The specification remains valid and is still accepted.
  • Node-level image overrides continue to take precedence over database-level settings through the existing overridableValue() merge logic in NodeInstances(), requiring no additional changes.
  • When Image is cleared, the Control Plane automatically falls back to ResolvedImage during the next reconcile cycle using the existing resolveInstanceImages() precedence logic.
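
The precedence described in the last two bullets can be pictured with a minimal sketch. This is not the Control Plane's code: `pickImage` and the trimmed-down `SwarmOpts` are hypothetical stand-ins for the `overridableValue()` and `resolveInstanceImages()` logic, and the sketch assumes an empty string means "not set".

```go
package main

import "fmt"

// SwarmOpts is a simplified, hypothetical stand-in for the real type;
// only the Image field discussed in this PR is modeled.
type SwarmOpts struct {
	Image string
}

// pickImage returns the first non-empty value in precedence order:
// node-level override, then database-level override, then the image
// resolved from the manifest. An empty Image is treated as "not set"
// (an assumption of this sketch).
func pickImage(nodeOpts, dbOpts *SwarmOpts, resolvedImage string) string {
	if nodeOpts != nil && nodeOpts.Image != "" {
		return nodeOpts.Image
	}
	if dbOpts != nil && dbOpts.Image != "" {
		return dbOpts.Image
	}
	return resolvedImage
}

func main() {
	manifest := "ghcr.io/pgedge/pgedge-postgres:17.9-spock5.0.6-standard-2"
	db := &SwarmOpts{Image: "ghcr.io/pgedge/pgedge-postgres:db-level-image"}
	node := &SwarmOpts{Image: "ghcr.io/pgedge/pgedge-postgres:node-level-image"}

	fmt.Println(pickImage(node, db, manifest))          // node-level override wins
	fmt.Println(pickImage(nil, db, manifest))           // database-level override
	fmt.Println(pickImage(nil, &SwarmOpts{}, manifest)) // cleared: manifest image
}
```

Clearing the field at both levels leaves only the manifest-resolved image, which matches the fallback behavior exercised in the final verification step below.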

Testing

Verification:

  1. Created cluster
  2. Create DB with no image specified; CP auto-resolves the image from the manifest
cp1-req create-database < ../demo/Images/create_db_with_no_image.json
HTTP/1.1 200 OK
Content-Length: 590
Content-Type: application/json
Date: Fri, 05 Jun 2026 13:04:44 GMT

{
  database: {
    created_at: "2026-06-05T13:04:44Z"
    id: "storefront-no-image"
    spec: {
      database_name: "storefront"
      database_users: [
        {
          attributes: ["SUPERUSER", "LOGIN"]
          db_owner: true
          username: "admin"
        }
      ]
      nodes: [
        {
          host_ids: ["host-1", "host-2", "host-3"]
          name: "n1"
        }
      ]
      postgres_version: "17.9"
      spock_version: "5"
    }
    state: "creating"
    updated_at: "2026-06-05T13:04:44Z"
  }
  task: {
    created_at: "2026-06-05T13:04:44Z"
    database_id: "storefront-no-image"
    entity_id: "storefront-no-image"
    scope: "database"
    status: "pending"
    task_id: "019e97e2-cf1a-7b49-9bf2-568a6c6a85e3"
    type: "create"
  }
}
 docker service ls | grep storefront-no-image
yizq2dw5kqqq   postgres-storefront-no-image-n1-9ptayhma         replicated   1/1        ghcr.io/pgedge/pgedge-postgres:17.9-spock5.0.6-standard-2   
qgt248y975zw   postgres-storefront-no-image-n1-689qacsi         replicated   1/1        ghcr.io/pgedge/pgedge-postgres:17.9-spock5.0.6-standard-2   
h3uni81wmir0   postgres-storefront-no-image-n1-ant97dj4         replicated   1/1        ghcr.io/pgedge/pgedge-postgres:17.9-spock5.0.6-standard-2 
  3. Create DB with a non-existent image
    Custom image at database level → warning returned, image deployed, Docker surfaces pull failure for non-existent image
cp1-req create-database < ../demo/Images/create_db.json 
HTTP/1.1 200 OK
Content-Length: 1412
Content-Type: application/json
Date: Fri, 05 Jun 2026 04:28:15 GMT

{
  database: {
    created_at: "2026-06-05T04:28:15Z"
    id: "storefront"
    spec: {
      database_name: "storefront"
      database_users: [
        {
          attributes: ["SUPERUSER", "LOGIN"]
          db_owner: true
          username: "admin"
        }
      ]
      nodes: [
        {
          host_ids: ["host-1", "host-2", "host-3"]
          name: "n1"
        }
      ]
      orchestrator_opts: {
        swarm: {
          image: "ghcr.io/pgedge/pgedge-postgres:my-custom-image"
        }
      }
      postgres_version: "17.9"
      spock_version: "5"
    }
    state: "creating"
    updated_at: "2026-06-05T04:28:15Z"
  }
  task: {
    created_at: "2026-06-05T04:28:15Z"
    database_id: "storefront"
    entity_id: "storefront"
    scope: "database"
    status: "pending"
    task_id: "019e9609-f4bc-7d81-9770-5bed8ac8e503"
    type: "create"
  }
  warnings: [
    "warning for node n1, host host-3: image \"ghcr.io/pgedge/pgedge-postgres:my-custom-image\" is not the manifest image for version 17.9_5 (manifest image: ghcr.io/pgedge/pgedge-postgres:17.9-spock5.0.6-standard-2); ensure it is a valid pgEdge image"
    "warning for node n1, host host-1: image \"ghcr.io/pgedge/pgedge-postgres:my-custom-image\" is not the manifest image for version 17.9_5 (manifest image: ghcr.io/pgedge/pgedge-postgres:17.9-spock5.0.6-standard-2); ensure it is a valid pgEdge image"
    "warning for node n1, host host-2: image \"ghcr.io/pgedge/pgedge-postgres:my-custom-image\" is not the manifest image for version 17.9_5 (manifest image: ghcr.io/pgedge/pgedge-postgres:17.9-spock5.0.6-standard-2); ensure it is a valid pgEdge image"
  ]
}
docker service ls | grep storefront
z4qu0xkqcnt0   postgres-storefront-n1-689qacsi   replicated   0/1        ghcr.io/pgedge/pgedge-postgres:my-custom-image 
  4. Create DB where the node-level image overrides the database-level image → node-level-image used, not db-level-image
cp1-req create-database < ../demo/Images/create_db_node_override.json 
HTTP/1.1 200 OK
Content-Length: 1546
Content-Type: application/json
Date: Fri, 05 Jun 2026 04:35:47 GMT

{
  database: {
    created_at: "2026-06-05T04:35:47Z"
    id: "storefront-node-override"
    spec: {
      database_name: "storefront"
      database_users: [
        {
          attributes: ["SUPERUSER", "LOGIN"]
          db_owner: true
          username: "admin"
        }
      ]
      nodes: [
        {
          host_ids: ["host-1", "host-2", "host-3"]
          name: "n1"
          orchestrator_opts: {
            swarm: {
              image: "ghcr.io/pgedge/pgedge-postgres:node-level-image"
            }
          }
        }
      ]
      orchestrator_opts: {
        swarm: {
          image: "ghcr.io/pgedge/pgedge-postgres:db-level-image"
        }
      }
      postgres_version: "17.9"
      spock_version: "5"
    }
    state: "creating"
    updated_at: "2026-06-05T04:35:47Z"
  }
  task: {
    created_at: "2026-06-05T04:35:47Z"
    database_id: "storefront-node-override"
    entity_id: "storefront-node-override"
    scope: "database"
    status: "pending"
    task_id: "019e9610-dc8e-7592-a475-e9fe3f61a5ed"
    type: "create"
  }
  warnings: [
    "warning for node n1, host host-3: image \"ghcr.io/pgedge/pgedge-postgres:node-level-image\" is not the manifest image for version 17.9_5 (manifest image: ghcr.io/pgedge/pgedge-postgres:17.9-spock5.0.6-standard-2); ensure it is a valid pgEdge image"
    "warning for node n1, host host-1: image \"ghcr.io/pgedge/pgedge-postgres:node-level-image\" is not the manifest image for version 17.9_5 (manifest image: ghcr.io/pgedge/pgedge-postgres:17.9-spock5.0.6-standard-2); ensure it is a valid pgEdge image"
    "warning for node n1, host host-2: image \"ghcr.io/pgedge/pgedge-postgres:node-level-image\" is not the manifest image for version 17.9_5 (manifest image: ghcr.io/pgedge/pgedge-postgres:17.9-spock5.0.6-standard-2); ensure it is a valid pgEdge image"
  ]
}
docker service ls | grep storefront-node-override
jo0rccekfw3f   postgres-storefront-node-override-n1-689qacsi   replicated   0/1        ghcr.io/pgedge/pgedge-postgres:node-level-image 
  5. Create DB with the manifest image set explicitly → no warning returned
cp1-req create-database < ../demo/Images/create_db_manifest_image.json 
HTTP/1.1 200 OK
Content-Length: 708
Content-Type: application/json
Date: Fri, 05 Jun 2026 04:38:27 GMT

{
  database: {
    created_at: "2026-06-05T04:38:27Z"
    id: "storefront-manifest-image"
    spec: {
      database_name: "storefront"
      database_users: [
        {
          attributes: ["SUPERUSER", "LOGIN"]
          db_owner: true
          username: "admin"
        }
      ]
      nodes: [
        {
          host_ids: ["host-1", "host-2", "host-3"]
          name: "n1"
        }
      ]
      orchestrator_opts: {
        swarm: {
          image: "ghcr.io/pgedge/pgedge-postgres:17.9-spock5.0.6-standard-2"
        }
      }
      postgres_version: "17.9"
      spock_version: "5"
    }
    state: "creating"
    updated_at: "2026-06-05T04:38:27Z"
  }
  task: {
    created_at: "2026-06-05T04:38:27Z"
    database_id: "storefront-manifest-image"
    entity_id: "storefront-manifest-image"
    scope: "database"
    status: "pending"
    task_id: "019e9613-4c02-78f4-867f-0552ec69a0cd"
    type: "create"
  }
}

docker service ls | grep storefront
2lyuimxnkbc2 postgres-storefront-manifest-image-n1-9ptayhma replicated 1/1 ghcr.io/pgedge/pgedge-postgres:17.9-spock5.0.6-standard-2
9792vnbo0jwn postgres-storefront-manifest-image-n1-689qacsi replicated 1/1 ghcr.io/pgedge/pgedge-postgres:17.9-spock5.0.6-standard-2
8a3zmtz2hbzq postgres-storefront-manifest-image-n1-ant97dj4 replicated 1/1 ghcr.io/pgedge/pgedge-postgres:17.9-spock5.0.6-standard-2

  6. Update database to clear the image → falls back to the manifest-resolved image on the next update
cp1-req update-database storefront < ../demo/Images/update_db.json
HTTP/1.1 200 OK
Content-Length: 735
Content-Type: application/json
Date: Fri, 05 Jun 2026 05:32:29 GMT

{
  database: {
    created_at: "2026-06-05T04:28:15Z"
    id: "storefront"
    instances: [
      {
        created_at: "2026-06-05T04:28:18Z"
        host_id: "host-1"
        id: "storefront-n1-689qacsi"
        node_name: "n1"
        state: "failed"
        updated_at: "2026-06-05T04:28:42Z"
      }
    ]
    spec: {
      database_name: "storefront"
      database_users: [
        {
          attributes: ["SUPERUSER", "LOGIN"]
          db_owner: true
          username: "admin"
        }
      ]
      nodes: [
        {
          host_ids: ["host-1", "host-2", "host-3"]
          name: "n1"
        }
      ]
      postgres_version: "17.9"
      spock_version: "5"
    }
    state: "modifying"
    updated_at: "2026-06-05T05:32:28Z"
  }
  task: {
    created_at: "2026-06-05T05:32:28Z"
    database_id: "storefront"
    entity_id: "storefront"
    scope: "database"
    status: "pending"
    task_id: "019e9644-c1c7-71e6-9ee0-a8cfb9db241b"
    type: "update"
  }
}
docker service ls
ID             NAME                                             MODE         REPLICAS   IMAGE                                                       PORTS
y24h2aafs0n9   postgres-storefront-n1-9ptayhma                  replicated   1/1        ghcr.io/pgedge/pgedge-postgres:17.9-spock5.0.6-standard-2   
z4qu0xkqcnt0   postgres-storefront-n1-689qacsi                  replicated   1/1        ghcr.io/pgedge/pgedge-postgres:17.9-spock5.0.6-standard-2   
e7cv5z40jqmx   postgres-storefront-n1-ant97dj4                  replicated   1/1        ghcr.io/pgedge/pgedge-postgres:17.9-spock5.0.6-standard-2   

Checklist

  • Tests added

PLAT-599

@coderabbitai

coderabbitai Bot commented Jun 5, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: fecad26c-d04b-4ef0-b3ef-dfdbffa13cbf

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

This PR introduces image override capability for Docker Swarm orchestrator options, allowing users to specify custom container images. Image overrides are validated against version manifests; warnings are collected throughout the validation pipeline and returned in API responses alongside successful database creation/update results.

Changes

Image Override with Validation Warnings

| Layer / File(s) | Summary |
|---|---|
| API Contracts: Response Warnings and Image Override Fields (api/apiv1/design/database.go) | CreateDatabaseResponse and UpdateDatabaseResponse gain optional warnings array; SwarmOpts gains optional image field for override specification. |
| Validation Result Domain Model (server/internal/database/orchestrator.go) | ValidationResult struct extends with Warnings field to hold non-fatal validation feedback. |
| API and Database Type Conversion (server/internal/api/apiv1/convert.go, convert_test.go) | Bidirectional conversion functions map SwarmOpts.Image field using pointer/value normalization; conversion tests verify nil and empty-string edge cases. |
| Image Override Manifest Validation (server/internal/orchestrator/swarm/orchestrator.go, validate_instance_specs_test.go) | ValidateInstanceSpecs checks custom image overrides against version manifest; produces non-blocking warnings when image is missing or differs from manifest, with test coverage for known/unknown versions. |
| Warning Collection and Pipeline Aggregation (server/internal/workflows/validate_spec.go, validate_spec_test.go) | ValidateSpecOutput.merge method now formats and collects warnings from validation results; tests verify warnings propagate independently of error/validity status. |
| API Handler Response Wiring (server/internal/api/apiv1/post_init_handlers.go) | CreateDatabase and UpdateDatabase populate Warnings field on response from validation output. |
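
The warning collection step summarized above might look roughly like the sketch below. The struct fields other than `Warnings`, and the body of `merge`, are assumptions made for illustration; only the type names and the `warning for node ..., host ...:` prefix come from this PR's description and sample output.

```go
package main

import "fmt"

// ValidationResult is a simplified, hypothetical shape of a per-instance
// validation result. Only the Warnings field is confirmed by the PR.
type ValidationResult struct {
	NodeName string
	HostID   string
	Valid    bool
	Errors   []string
	Warnings []string
}

// ValidateSpecOutput aggregates results across all instances.
type ValidateSpecOutput struct {
	Valid    bool
	Errors   []string
	Warnings []string
}

// merge folds one result into the aggregate. Warnings are collected
// regardless of whether the result is valid, so a spec can be accepted
// and still carry warnings.
func (o *ValidateSpecOutput) merge(r ValidationResult) {
	if !r.Valid {
		o.Valid = false
	}
	for _, e := range r.Errors {
		o.Errors = append(o.Errors,
			fmt.Sprintf("validation error for node %s, host %s: %s", r.NodeName, r.HostID, e))
	}
	for _, w := range r.Warnings {
		o.Warnings = append(o.Warnings,
			fmt.Sprintf("warning for node %s, host %s: %s", r.NodeName, r.HostID, w))
	}
}

func main() {
	out := ValidateSpecOutput{Valid: true}
	out.merge(ValidationResult{
		NodeName: "n1", HostID: "host-1", Valid: true,
		Warnings: []string{"image is not the manifest image"},
	})
	fmt.Println(out.Valid, out.Warnings)
}
```

Keeping warnings in their own slice, rather than folding them into `Errors`, is what lets the handlers return them alongside a successful response.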

Poem

🐰 A humble hop through manifests so grand,
Where images override, yet warnings at hand,
Through pipelines they flow, both valid and clear,
From Docker Swarm's heart to the API's frontier,
No blockages found, just whispers so dear! 🌟

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately and concisely summarizes the main change: adding an image field to the API to enable custom image pinning at database or node level. |
| Description check | ✅ Passed | The description covers all required sections: Summary, Changes, Testing with verification steps, and a completed Checklist with issue link. Documentation status is not applicable here as this is a feature for internal control plane functionality. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@codacy-production

codacy-production Bot commented Jun 5, 2026

Copy link
Copy Markdown

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

View in Codacy

🟢 Metrics 9 complexity · 2 duplication

Metric Results
Complexity 9
Duplication 2

View in Codacy


@tsivaprasad

Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 5, 2026

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@jason-lynch jason-lynch left a comment

Member

Adding the image field to the API looks good! However, I'm not sure that the validation this PR adds is very helpful, especially since it doesn't prevent you from using an invalid image. Instead of a non-fatal validation, maybe we could check that the image exists by trying to pull it with the Docker client?

Something else to think about for a separate PR: another way that users can get into trouble is if the Postgres and Spock versions in the image don't match the ones in the spec. We should spend a little time thinking about how we can make that less dangerous, such as by enforcing a tag format that contains the Postgres and Spock versions like we do in our official images. I think @mmols had mentioned at one point that it would be reasonable for us to enforce a tag format for custom images.
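
To make the tag-format idea concrete, here is one possible shape for such a check. No convention has been agreed in this thread, so everything here is an assumption: `tagMatchesSpec` is a hypothetical helper, and the expected prefix `<postgres_version>-spock<spock_version>` is inferred only from the official tag `17.9-spock5.0.6-standard-2` shown in the test output above.

```go
package main

import (
	"fmt"
	"strings"
)

// tagMatchesSpec reports whether the image tag starts with
// "<pgVersion>-spock<spockVersion>" followed by ".", "-", or nothing.
// Hypothetical convention: it mirrors the official tag layout seen in
// this PR and is not an agreed format.
func tagMatchesSpec(image, pgVersion, spockVersion string) bool {
	i := strings.LastIndex(image, ":")
	if i < 0 || strings.Contains(image[i+1:], "/") {
		// No tag present; a colon followed by "/" is a registry port.
		return false
	}
	tag := image[i+1:]
	prefix := pgVersion + "-spock" + spockVersion
	if !strings.HasPrefix(tag, prefix) {
		return false
	}
	rest := tag[len(prefix):]
	// Require a separator so that spock "5" does not match "spock50".
	return rest == "" || rest[0] == '.' || rest[0] == '-'
}

func main() {
	for _, img := range []string{
		"ghcr.io/pgedge/pgedge-postgres:17.9-spock5.0.6-standard-2",
		"ghcr.io/pgedge/pgedge-postgres:my-custom-image",
	} {
		fmt.Println(img, tagMatchesSpec(img, "17.9", "5"))
	}
}
```

A prefix check like this only guards against obviously mismatched tags; it cannot confirm what is actually inside the image.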

@tsivaprasad

Contributor Author

Adding the image field to the API looks good! However, I'm not sure that the validation this PR adds is very helpful, especially since it doesn't prevent you from using an invalid image. Instead of a non-fatal validation, maybe we could check that the image exists by trying to pull it with the Docker client?

Something else to think about for a separate PR: another way that users can get into trouble is if the Postgres and Spock versions in the image don't match the ones in the spec. We should spend a little time thinking about how we can make that less dangerous, such as by enforcing a tag format that contains the Postgres and Spock versions like we do in our official images. I think @mmols had mentioned at one point that it would be reasonable for us to enforce a tag format for custom images.

Thanks @jason-lynch
Good point: a non-fatal warning by itself isn't very useful if it still allows a broken deployment. We've updated the implementation accordingly.

Instead of performing a full docker pull (which downloads image layers), we now use DistributionInspect, a lightweight registry manifest lookup that verifies the image exists without transferring image data.

The validation behavior is now:

  • Hard fail if the image cannot be found or the registry cannot be reached.
  • Warning (non-fatal) if the image exists but differs from the manifest image associated with the selected version.
  • No validation warning when the specified image matches the manifest image exactly.
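
The three outcomes above can be sketched as a small decision function. This is illustrative only: `ImageLookup`, `mapLookup`, and `checkImageOverride` are hypothetical names, the registry call is abstracted behind a function value (a real one might wrap the `DistributionInspect` lookup mentioned above), and checking existence before comparing against the manifest image is an assumed ordering.

```go
package main

import (
	"errors"
	"fmt"
)

// ImageLookup returns nil if the image reference resolves in its registry.
// Hypothetical seam so the decision logic can run without a registry.
type ImageLookup func(image string) error

// mapLookup is an in-memory fake used only for this sketch.
func mapLookup(known map[string]bool) ImageLookup {
	return func(image string) error {
		if !known[image] {
			return errors.New("not found in registry")
		}
		return nil
	}
}

// checkImageOverride returns an error when the image cannot be verified
// (hard fail), a warning string when it exists but differs from the
// manifest image, and neither when it matches or no override is set.
func checkImageOverride(lookup ImageLookup, image, manifestImage string) (string, error) {
	if image == "" {
		return "", nil // no override: nothing to validate
	}
	if err := lookup(image); err != nil {
		return "", fmt.Errorf("image %q could not be verified: %w", image, err)
	}
	if image != manifestImage {
		return fmt.Sprintf("image %q is not the manifest image (manifest image: %s)",
			image, manifestImage), nil
	}
	return "", nil
}

func main() {
	lookup := mapLookup(map[string]bool{
		"example.test/pg:custom":   true,
		"example.test/pg:official": true,
	})
	for _, img := range []string{
		"example.test/pg:missing",
		"example.test/pg:custom",
		"example.test/pg:official",
	} {
		warning, err := checkImageOverride(lookup, img, "example.test/pg:official")
		fmt.Printf("image=%s warning=%q err=%v\n", img, warning, err)
	}
}
```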

Created a placeholder ticket to enforce a version-aware tag format for custom image overrides: https://pgedge.atlassian.net/browse/PLAT-641

@jason-lynch

jason-lynch commented Jun 11, 2026

Member

@tsivaprasad That sounds good! Although I still have a question about this type of validation:

Warning (non-fatal) if the image exists but differs from the manifest image associated with the selected version.

What's the use case for this warning? To me, this looks like any valid usage of the override field will result in a warning. The only time it wouldn't return a warning is when you specify the same image that's in the manifest, in which case the override isn't doing anything. Am I understanding that right or is there a case I'm missing?

I'm wondering if we need warnings at all.

2 participants