
Switch all API validation to Zod schemas, change tests and validation #384

Open
rcantin-w wants to merge 9 commits into main from zod-openapi

@rcantin-w rcantin-w commented Jun 23, 2026

Contributor

Overview

This migration automates OpenAPI spec generation so reference/content.yaml in wellcomecollection/developers.wellcomecollection.org stays in sync with the code. Three main changes:

  1. Zod-based validation — Replace custom validators with runtime-safe Zod schemas
  2. Schema-driven OpenAPI generator — Spec is generated from controller schemas, not hand-written
  3. GitHub Action — Automatically sync spec when api/** changes on main

What Changed

1. Validation Layer (api/src/controllers/validation.ts)

Before: Custom QueryValidatorConfig + imperative validators

const queryValidator = ({ name, allowed, ... }) => { /* manual validation */ }
const prismicIdValidator = (filterValues, filterName) => { /* manual validation */ }

After: Zod schemas replacing the custom validators

export const commaSeparatedEnum = (name, values, opts) =>
  z.string().optional().transform((val, ctx) => { /* validation logic */ })

export const commaSeparatedPrismicIds = (filterName) =>
  z.string().optional().superRefine((val, ctx) => { /* validation logic */ })

export const workIdsSchema = z.union([z.string(), z.array(z.string())])
  .optional().superRefine((val, ctx) => { /* validation logic */ })

export const dateStringSchema = z.string().optional()
  .superRefine((val, ctx) => { /* validation logic */ })

export const PaginationQuerySchema = z.object({
  page: z.coerce.number().int().min(1).optional(),
  pageSize: z.coerce.number().int().min(1).max(100).optional(),
})
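
The `/* validation logic */` bodies are elided above. As a rough illustration of what a comma-separated enum check has to do, here is a dependency-free sketch. It deliberately does not use Zod, and `parseCommaSeparatedEnum`, its result type, the example values, and the error wording are all hypothetical rather than the code in this PR.

```typescript
// Hypothetical result type: either the parsed values or a list of error messages.
type ParseResult<T> =
  | { ok: true; values: T[] | undefined }
  | { ok: false; errors: string[] };

// Splits a raw query value such as "a,b" and checks every item against
// the allowed list. An absent parameter is valid and yields `undefined`.
function parseCommaSeparatedEnum<T extends string>(
  name: string,
  allowed: readonly T[],
  raw: string | undefined
): ParseResult<T> {
  if (raw === undefined) return { ok: true, values: undefined };

  const items = raw.split(",").map((item) => item.trim());
  const isAllowed = (item: string): item is T =>
    (allowed as readonly string[]).includes(item);

  const invalid = items.filter((item) => !isAllowed(item));
  if (invalid.length > 0) {
    return {
      ok: false,
      errors: invalid.map(
        (item) =>
          `${name}: '${item}' is not a valid value. Allowed: ${allowed.join(", ")}`
      ),
    };
  }
  return { ok: true, values: items.filter(isAllowed) };
}

// Example values only; the real allowed lists live in the controllers.
const sortValues = ["relevance", "publicationDate"] as const;
console.log(parseCommaSeparatedEnum("sort", sortValues, "relevance"));
console.log(parseCommaSeparatedEnum("sort", sortValues, "relevance,bogus"));
```

In the Zod version the same checks would sit inside the `transform` or `superRefine` callback and report problems through `ctx` instead of returning an error list.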

2. Events Controller (api/src/controllers/events.ts)

Before: Two-pass validation

const sortParsed = EventsSortSchema.parse(rawParams)  // sort/pagination
const validParams = paramsValidator(rawParams)         // format/location/timespan/...

After: Single-pass + focused helper

const params = EventsQuerySchema.parse(rawParams)  // all params at once
const { format, excludeFormat } = transformFormat(params.format)

Key point: transformFormat() handles the complex logic (alias mapping + negation):

  • Maps format=workshop → Prismic ID
  • Handles format=!exhibitions → excludeFormat clause
  • Validates the Prismic ID format

This consolidation removes paramsValidator, EventsSortSchema, three dedicated validators, and the unused QueryParams type.
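
For reviewers who want a mental model of the alias mapping and negation described above, here is a minimal sketch. It is not the `transformFormat()` in this PR: the alias table, the placeholder IDs, the ID pattern, the array return shape, and the name `transformFormatSketch` are assumptions made for illustration.

```typescript
// Hypothetical alias table: the real aliases and Prismic IDs live in the codebase.
const FORMAT_ALIASES: Record<string, string> = {
  workshop: "PLACEHOLDER_WORKSHOP_ID",
  exhibitions: "PLACEHOLDER_EXHIBITIONS_ID",
};

// Assumed ID shape for this sketch only (letters, digits, "_" and "-").
const PRISMIC_ID_PATTERN = /^[A-Za-z0-9_-]+$/;

type TransformedFormat = { format: string[]; excludeFormat: string[] };

// Splits a comma-separated `format` value, resolves aliases, and routes
// values prefixed with "!" to the exclusion list.
function transformFormatSketch(raw: string | undefined): TransformedFormat {
  const result: TransformedFormat = { format: [], excludeFormat: [] };
  if (raw === undefined) return result;

  for (const item of raw.split(",").map((value) => value.trim())) {
    const negated = item.startsWith("!");
    const key = negated ? item.slice(1) : item;
    const id = FORMAT_ALIASES[key] ?? key; // unknown keys are treated as raw IDs
    if (!PRISMIC_ID_PATTERN.test(id)) {
      throw new Error(`format: '${key}' is not a valid format or Prismic ID`);
    }
    (negated ? result.excludeFormat : result.format).push(id);
  }
  return result;
}

console.log(transformFormatSketch("workshop,!exhibitions"));
```

Keeping this step outside the schema means the schema only has to accept a string, while the alias and negation rules stay in one small, separately testable function.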


3. OpenAPI Generator (api/scripts/generate-openapi.ts)

Before: Hand-written parameter schemas duplicating the controller logic

registry.registerPath({
  method: 'get',
  path: '/articles',
  request: {
    query: z.object({
      aggregations: z.string().optional().openapi({...}),
      format: z.string().optional().openapi({...}),
      // ... 20 more fields repeated from ArticlesQuerySchema ...
    }),
  },
  // ...
})

After: Schema-driven — controller schemas imported directly

import { ArticlesQuerySchema } from '@weco/content-api/src/controllers/articles'

registry.registerPath({
  method: 'get',
  path: '/articles',
  request: {
    query: ArticlesQuerySchema,  // descriptions come from .meta() on the schema fields
  },
  // ...
})

Impact: A new filter now appears in the spec automatically, with no change to the generator. Descriptions live in .meta() on the schema field, co-located with validation.
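
To make the "one definition, two consumers" idea concrete, here is a simplified model that does not use Zod or zod-to-openapi. `FieldSpec`, `toQueryParameters`, and the example fields and descriptions are hypothetical; the real generator derives the same kind of parameter list from the imported controller schemas.

```typescript
// Simplified stand-in for a schema field: just enough metadata to document it.
type FieldSpec = {
  type: "string" | "integer";
  required: boolean;
  description?: string;
};

// Subset of an OpenAPI query parameter object.
type QueryParameter = {
  name: string;
  in: "query";
  required: boolean;
  description?: string;
  schema: { type: "string" | "integer" };
};

// Derives the documented parameters from the same object used for validation.
function toQueryParameters(fields: Record<string, FieldSpec>): QueryParameter[] {
  return Object.entries(fields).map(([name, field]) => ({
    name,
    in: "query" as const,
    required: field.required,
    ...(field.description !== undefined ? { description: field.description } : {}),
    schema: { type: field.type },
  }));
}

// Hypothetical field set: adding `series` here is the only change needed
// for it to show up in the generated parameter list.
const articlesFields: Record<string, FieldSpec> = {
  page: { type: "integer", required: false, description: "Page number." },
  series: { type: "string", required: false, description: "Filter articles by series." },
};

console.log(JSON.stringify(toQueryParameters(articlesFields), null, 2));
```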


4. GitHub Action (.github/workflows/sync-openapi-spec.yml)

Before: Nothing. Manual process to sync the spec.

After: Automated on every push to main with api/** changes

- Checks out both repos (content-api + developers.wellcomecollection.org)
- Runs `npx tsx api/scripts/generate-openapi.ts > /tmp/content.yaml`
- Compares diff
- If changed: creates PR with branch `auto-sync/openapi-spec-{timestamp}`
- Requests review from `@wellcomecollection/digital-experience`

Uses a GitHub App token (configured via vars.SYNC_APP_ID + secrets.SYNC_APP_PRIVATE_KEY) for cross-repo access.
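
The compare-and-branch step can be pictured as the small decision below. This is only a sketch of the logic, written in TypeScript rather than workflow YAML; `planSync`, `syncBranchName`, and the timestamp format (UTC, digits only) are assumptions, and the workflow file may do this differently.

```typescript
// Builds the branch name from the pattern in the description. The timestamp
// format (UTC, YYYYMMDDHHMMSS) is an assumption made for this sketch.
function syncBranchName(now: Date): string {
  const timestamp = now.toISOString().replace(/\D/g, "").slice(0, 14);
  return `auto-sync/openapi-spec-${timestamp}`;
}

// Returns the branch to open a PR from, or null when the spec is unchanged.
function planSync(currentSpec: string, generatedSpec: string, now: Date): string | null {
  return currentSpec === generatedSpec ? null : syncBranchName(now);
}

const exampleNow = new Date(Date.UTC(2026, 5, 23, 17, 15, 0));
// Unchanged spec: nothing to do.
console.log(planSync("openapi: 3.1.0\n", "openapi: 3.1.0\n", exampleNow));
// Changed spec: a timestamped branch name is returned.
console.log(planSync("openapi: 3.1.0\n", "openapi: 3.1.0\npaths: {}\n", exampleNow));
```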


5. Repository Guidelines (AGENTS.md)

Created documentation for AI assistants (and human contributors):

  • Working Style — ask before assuming; complete work properly
  • Repo Structure — monorepo layout, each package's purpose
  • Build & Test — commands to compile, test, run locally
  • Import Rules: @weco/ aliases required for cross-directory imports, including common/
  • Validation & Zod Schemas — where validation happens, how to add new filters
  • OpenAPI Generation — spec is auto-generated, how to update it
  • Error Handling — all 4xx errors use HttpError, ZodError returns 400
  • Pull Requests — what to include in PR descriptions

New Developer Workflow: Adding a Filter

Example: Add ?series= to /articles

1. Zod Schema (api/src/controllers/articles.ts)

export const ArticlesQuerySchema = z.object({
  // ... existing fields ...
  series: commaSeparatedPrismicIds('series'),
})

2. ES Query (api/src/queries/articles.ts)
Wire it into the Elasticsearch filter.

3. Test (api/test/articles.test.ts)

4. Add a description — add .meta({ description: '...' }) to the field in the schema. The generator picks it up automatically:

series: commaSeparatedPrismicIds('series').meta({
  description: 'Filter articles by series.',
}),

5. GitHub Action handles syncing — Push to main, the Action auto-generates the spec and opens a PR to the docs repo.

I tested this by adding this branch to the Action, and it created: wellcomecollection/developers.wellcomecollection.org#75

The change looks big now, but future changes will be smaller and more specific. This first one only reflects the move away from a manual, differently structured way of documenting. If you run it locally, it looks almost the same as prod does right now.


Improvements

For Users

  • Better validation errors — Pagination bounds now enforced (pageSize must be 1–100, not 999)
  • Consistent 400 responses — All query validation returns 400 Bad Request with clear error messages
  • Reliable spec — OpenAPI spec no longer drifts from code
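
As an illustration of the pagination bounds and the 400 behaviour listed above, here is a dependency-free sketch of the rule that PaginationQuerySchema expresses. The function names, the defaults of 1 and 10, and the message wording are assumptions; in the PR the check is done by Zod and the ZodError is mapped to a 400 response.

```typescript
type PaginationResult =
  | { status: 200; page: number; pageSize: number }
  | { status: 400; errors: string[] };

// Coerces a raw query value to an integer within [min, max].
// Returns the number on success or an error message on failure.
function parseBoundedInt(
  name: string,
  raw: string | undefined,
  fallback: number,
  min: number,
  max: number = Infinity
): number | string {
  if (raw === undefined) return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value < min || value > max) {
    const range = max === Infinity ? `at least ${min}` : `between ${min} and ${max}`;
    return `${name}: must be an integer ${range}`;
  }
  return value;
}

function parsePagination(query: { page?: string; pageSize?: string }): PaginationResult {
  const page = parseBoundedInt("page", query.page, 1, 1);
  const pageSize = parseBoundedInt("pageSize", query.pageSize, 10, 1, 100);
  if (typeof page === "string" || typeof pageSize === "string") {
    const errors = [page, pageSize].filter((v): v is string => typeof v === "string");
    return { status: 400, errors };
  }
  return { status: 200, page, pageSize };
}

console.log(parsePagination({ page: "2", pageSize: "50" }));
console.log(parsePagination({ pageSize: "999" }));
```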

For Developers

  • Simpler to add filters — One schema change propagates everywhere
  • Fewer places to edit — No more duplicating parameter definitions
  • Type-safe — Zod schemas provide full TypeScript inference
  • Testing is easier — Controller schemas are directly testable

For Teams

  • Spec always up-to-date — No manual sync process
  • Clear change history — Spec updates are PR-reviewed in docs repo
  • Single source of truth — Controller code is the spec source, not vice versa

Architecture: OpenAPI Automation

Solid arrows = automated. Dashed arrows = manual.

Before

flowchart TD
    subgraph code["This repo (content-api)"]
        A["Custom validators\nqueryValidator\nprismicIdValidator\nworkIdValidator"]
    end

    subgraph runtime["API runtime"]
        D["Express request"]
        E["Two-pass validation\nEventsSortSchema.parse()\n+ paramsValidator()"]
        F["Elasticsearch query"]
        G["JSON response"]
    end

    subgraph docs["developers.wellcomecollection.org"]
        H["reference/content.yaml\n(hand-maintained)"]
        I["Rendered API docs"]
    end

    A -->|"used by"| E
    A ~~~ H
    H -.->|"manually kept in sync"| I

    D --> E
    E --> F
    F --> G

After

flowchart TD
    subgraph code["This repo (content-api)"]
        A["Zod schemas\nArticlesQuerySchema\nEventsQuerySchema\nAddressablesQuerySchema"]
        B["generate-openapi.ts\n(run by GitHub Action on push to main;\nimports schemas directly,\nschemas carry their own\ndescriptions via .meta())"]
        C["GitHub Action\nsync-openapi-spec.yml\n(triggers on push to main\nwith api/** changes)"]
    end

    subgraph runtime["API runtime"]
        D["Express request"]
        E["Schema.parse(req.query)\nvalidates + transforms"]
        F["Elasticsearch query"]
        G["JSON response"]
    end

    subgraph docs["developers.wellcomecollection.org"]
        H["reference/content.yaml"]
        I["Rendered API docs"]
    end

    A -->|"used by"| B
    A -->|"used by"| E
    B -->|"generates YAML"| C
    C -->|"opens PR with updated"| H
    H -->|"rendered as"| I

    D --> E
    E --> F
    F --> G

Copilot AI review requested due to automatic review settings June 23, 2026 17:15
@rcantin-w rcantin-w requested a review from a team as a code owner June 23, 2026 17:15
@rcantin-w rcantin-w marked this pull request as draft June 23, 2026 17:15

Copilot AI left a comment

Contributor


Pull request overview

This PR migrates Content API query validation to Zod schemas and introduces an OpenAPI generator that imports those schemas directly, aligning runtime validation with automated spec generation.

Changes:

  • Replaces bespoke query-param validators with shared Zod schemas/helpers and per-controller *QuerySchema exports.
  • Updates list controllers (articles/events/all) and validation unit tests to use Zod parsing and errors.
  • Adds an OpenAPI 3.1 generator script using @asteasolutions/zod-to-openapi, plus dependency updates.

Reviewed changes

Copilot reviewed 10 out of 11 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| yarn.lock | Locks new dependencies (Zod, zod-to-openapi, yaml) and transitive updates. |
| common/services/init-apm.ts | Updates Zod import style for v4 ({ z }). |
| api/test/validation.test.ts | Reworks validation tests around Zod parsing and ZodError assertions. |
| api/src/controllers/validation.ts | Introduces shared Zod validation helpers/schemas (enums, IDs, dates, pagination, work IDs). |
| api/src/controllers/events.ts | Adds EventsQuerySchema and migrates query validation to schema-based parsing. |
| api/src/controllers/error.ts | Adds centralized ZodError → 400 ErrorResponse handling. |
| api/src/controllers/articles.ts | Adds ArticlesQuerySchema and migrates validation to schema-based parsing. |
| api/src/controllers/addressables.ts | Adds AddressablesQuerySchema and migrates validation to schema-based parsing. |
| api/scripts/generate-openapi.ts | New script to generate OpenAPI 3.1 YAML from Zod schemas. |
| api/package.json | Adds zod dependency and dev deps for OpenAPI generation (zod-to-openapi, yaml). |
| AGENTS.md | Documents the new Zod-based validation + OpenAPI generation workflow and repo conventions. |


Comment thread api/src/controllers/validation.ts
Comment thread api/src/controllers/validation.ts
Comment thread api/src/controllers/validation.ts
Comment thread api/src/controllers/validation.ts
Comment thread api/src/controllers/events.ts Outdated
Comment thread api/src/controllers/articles.ts Outdated
Comment thread api/src/controllers/addressables.ts Outdated
Comment thread .github/workflows/sync-openapi-spec.yml Outdated
@rcantin-w rcantin-w marked this pull request as ready for review June 24, 2026 09:51
@rcantin-w rcantin-w moved this to Ready for review in Digital experience Jun 24, 2026

Labels

None yet

Projects

Status: Ready for review


3 participants