diff --git a/.changeset/provenance-enforcement-storyboard-known-failing.md b/.changeset/provenance-enforcement-storyboard-known-failing.md deleted file mode 100644 index 04aab8624a..0000000000 --- a/.changeset/provenance-enforcement-storyboard-known-failing.md +++ /dev/null @@ -1,10 +0,0 @@ ---- ---- - -chore(testing): mark provenance_enforcement storyboard as known-failing pending training-agent implementation - -Adds `creative_sales_agent/provenance_enforcement` to `KNOWN_FAILING_STORYBOARDS` in `server/tests/manual/run-storyboards.ts`. The storyboard exercises the `PROVENANCE_*` rejection paths and the round-trip of `creative_policy.{provenance_required, provenance_requirements, accepted_verifiers}` through `get_products` / `sync_creatives`, but the training agent has no provenance enforcement yet — the spec is landing in this PR ahead of the reference implementation. - -Tracked in #3777; the entry is removed once the training agent enforces provenance per the spec (`get_products` surfacing the seeded `creative_policy` fields and `sync_creatives` emitting the six `PROVENANCE_*` codes for structural rejections). - -Non-protocol change; no schema or task definition is affected. diff --git a/.changeset/training-agent-provenance-enforcement.md b/.changeset/training-agent-provenance-enforcement.md new file mode 100644 index 0000000000..167dace8f7 --- /dev/null +++ b/.changeset/training-agent-provenance-enforcement.md @@ -0,0 +1,32 @@ +--- +--- + +feat(training-agent): implement provenance enforcement (closes #3777) + +Brings the reference training agent up to the spec landed in #3468, with the cleanup work surfaced in expert review. + +**`handleGetProducts`** now overlays `comply_test_controller`-seeded products onto the response so storyboard-seeded `creative_policy.{provenance_required, provenance_requirements, accepted_verifiers}` fields round-trip through `get_products`. Previously only `handleCreateMediaBuy` saw seeded fixtures. 
Both code paths now go through `overlaySeededProducts`, so the backfill is applied symmetrically; it is restricted to seeded-product IDs so the cached catalog singleton stays untouched. + +**`backfillTrainingProductDefaults`** fills in spec-required Product fields (`name`, `description`, `publisher_properties`, `format_ids`, `pricing_options`, `reporting_capabilities` and its required sub-fields) for fixture-seeded products that historically only carried fields `create_media_buy` validation needed. + +**`handleSyncCreatives`** enforces `creative_policy` from session-seeded products with the structural-rejection family on `error-code.json`: + +- `PROVENANCE_REQUIRED` +- `PROVENANCE_DIGITAL_SOURCE_TYPE_MISSING` +- `PROVENANCE_DISCLOSURE_MISSING` +- `PROVENANCE_EMBEDDED_MISSING` +- `PROVENANCE_VERIFIER_NOT_ACCEPTED` — buyer-supplied `verify_agent.agent_url` cross-checked (canonicalized) against the seller's `accepted_verifiers` allowlist before any outbound call. Off-list URLs reject without the seller contacting them, closing the buyer-controlled-URL trust gap from #3468. + +Per-creative failures emit `action: 'failed'` + per-creative `errors[]` with `field` and `recovery: 'correctable'`. The `SyncCreativeResult` interface gains the `failed` action variant and an optional `errors[]` field. Buyer-controlled strings (`verify_agent.agent_url`, `creative_id`) are sanitized (C0/C1 strip + length cap) before interpolation into `TaskError.message` and `error.field` to defend against log/transcript poisoning. + +The cascade is stable and documented on `enforceProvenancePolicy`: `PROVENANCE_REQUIRED` → `PROVENANCE_DIGITAL_SOURCE_TYPE_MISSING` → `PROVENANCE_DISCLOSURE_MISSING` → `PROVENANCE_EMBEDDED_MISSING` → `PROVENANCE_VERIFIER_NOT_ACCEPTED`. `aggregateCreativePolicy` documents the deliberate asymmetry: requirement booleans are ORed (most-restrictive wins, gates compose), `accepted_verifiers` are unioned (allowlist semantics).
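The aggregation asymmetry and the allowlist cross-check described above can be illustrated with a minimal sketch. It is not the training agent's code: `PolicySketch`, `canonicalizeSketch`, `aggregateSketch`, `isAcceptedVerifier`, and the sample URLs are hypothetical names chosen for illustration, and the canonicalization shown (WHATWG `URL` parsing plus a trailing-slash strip) is an assumption about how such a comparison could be done.

```typescript
// Illustrative sketch only: names, shapes, and URLs are hypothetical and
// do not reproduce the training agent's actual implementation.
interface PolicySketch {
  provenance_required?: boolean;
  accepted_verifiers?: string[];
}

// Assumed canonicalization: WHATWG URL parsing lowercases the scheme and
// host and drops default ports; a single trailing slash is stripped by hand.
function canonicalizeSketch(url: string): string {
  try {
    const u = new URL(url);
    return `${u.protocol}//${u.host}${u.pathname.replace(/\/$/, '')}`;
  } catch {
    return url; // unparseable input is compared as-is
  }
}

// Requirement booleans are ORed (any product demanding a gate turns it on);
// verifier allowlists are concatenated, i.e. unioned.
function aggregateSketch(policies: PolicySketch[]): PolicySketch {
  const acc: PolicySketch = {};
  for (const p of policies) {
    if (p.provenance_required) acc.provenance_required = true;
    if (p.accepted_verifiers && p.accepted_verifiers.length > 0) {
      acc.accepted_verifiers = (acc.accepted_verifiers ?? []).concat(p.accepted_verifiers);
    }
  }
  return acc;
}

// A buyer-supplied URL passes only if its canonical form matches an
// allowlist entry; no network call is involved in the check.
function isAcceptedVerifier(url: string, policy: PolicySketch): boolean {
  const allowed = (policy.accepted_verifiers ?? []).map(canonicalizeSketch);
  return allowed.indexOf(canonicalizeSketch(url)) !== -1;
}

const merged = aggregateSketch([
  { provenance_required: true, accepted_verifiers: ['https://verifier-a.example/'] },
  { accepted_verifiers: ['https://verifier-b.example'] },
]);
```

With these two hypothetical products, `merged` requires provenance because one product does, and a verifier accepted by either product passes the check regardless of scheme case, host case, an explicit default port, or a trailing slash.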
+ +Truth-of-claim (`PROVENANCE_CLAIM_CONTRADICTED`, requires calling `get_creative_features` against an on-list verifier) is out of scope for this initial implementation — the structural codes are sufficient to exercise the wire contract end to end. Tracked at #3802 with a skeleton storyboard at `media_buy_seller/provenance_truth_of_claim` registered in `KNOWN_FAILING_STORYBOARDS`. + +**Conformance:** new compliance scenario at `static/compliance/source/protocols/media-buy/scenarios/provenance_enforcement.yaml` walks the structural-rejection contract end to end across six phases: discover requirement → reject no-provenance → reject missing digital_source_type → reject off-list verifier → reject missing disclosure → accept corrected resubmission. Storyboard ID `media_buy_seller/provenance_enforcement` (the `creative_sales_agent` storyboard category was retired as part of this PR — see commit history; `creative-reception.yaml` moved to `media_buy_seller/creative_reception` scenario, also non-breaking since `creative_sales_agent` was never in the protocol's specialism enum). + +Removes `media_buy_seller/provenance_enforcement` from `KNOWN_FAILING_STORYBOARDS` (the entry was added when the spec landed in #3468 ahead of the reference implementation; this PR closes that gap). Bumps `min_clean_storyboards` 53→65 and `min_passing_steps` 388→446 (legacy) / 401→464 (framework) in `.github/workflows/training-agent-storyboards.yml`. The pre-existing 53/388/401 floors had drifted below the actual `origin/main` baseline of 64/439/457; the real lift from this PR is +1 storyboard / +7 steps from the new six-phase scenario. 
+ +**Follow-ups:** #3802 tracks `PROVENANCE_CLAIM_CONTRADICTED` truth-of-claim; #3803 tracks storyboard-conformance test-infra (required-clean allowlist, `errors[*]` predicate, pre-push hook); #3823 tracks the broader specialism-taxonomy consolidation (deprecate `sales-proposal-mode` into `sales-guaranteed`, drop phantom storyboard-schema enum entries, per-spec-version source trees before 3.1 GA). + +Refs: #3468, #3777, #3792. diff --git a/.github/workflows/training-agent-storyboards.yml b/.github/workflows/training-agent-storyboards.yml index 1b01ee481a..9790f7bec5 100644 --- a/.github/workflows/training-agent-storyboards.yml +++ b/.github/workflows/training-agent-storyboards.yml @@ -45,12 +45,20 @@ jobs: # (#3081) now passes 4 steps instead of grading not_applicable. # 384→388 after bumping @adcp/client to 5.18.0 (#3191) and # implementing force_task_completion (#3194). + # The 53/388 floors had drifted below the actual baseline + # (true main was 64 storyboards / 439 steps); #3777 brings + # the count to 65 / 446 by adding the + # media_buy_seller/provenance_enforcement scenario (six + # phases exercising PROVENANCE_REQUIRED, + # PROVENANCE_DIGITAL_SOURCE_TYPE_MISSING, + # PROVENANCE_VERIFIER_NOT_ACCEPTED, + # PROVENANCE_DISCLOSURE_MISSING, and an accept path). # Passing-step asymmetry with framework mode is intentional: # the two modes declare different capabilities, so the # storyboard runner skips a different subset of steps in each. # Both modes pass every step they run. - min_clean_storyboards: 53 - min_passing_steps: 388 + min_clean_storyboards: 65 + min_passing_steps: 446 - mode: framework flag_value: '1' # 390→393 after bumping @adcp/client to 5.15 + seller catalog @@ -67,8 +75,13 @@ jobs: # injection — the upstream fix for #940. Framework parity with # legacy fully restored; passing-step asymmetry remains because # framework declares more capabilities (more steps run, all pass). 
- min_clean_storyboards: 53 - min_passing_steps: 401 + # The 53/401 floors had drifted below the actual baseline + # (true main was 64 storyboards / 457 steps); #3777 brings the + # count to 65 / 464 — same six-phase provenance_enforcement + # scenario as legacy; framework picks up the extra steps from + # its broader capability declaration. + min_clean_storyboards: 65 + min_passing_steps: 464 steps: - uses: actions/checkout@v6 diff --git a/server/src/shared/formats.ts b/server/src/shared/formats.ts index 227d2fde92..b5ce550f54 100644 --- a/server/src/shared/formats.ts +++ b/server/src/shared/formats.ts @@ -540,9 +540,10 @@ export function buildFormats(agentUrl: string): TrainingFormat[] { requirements: { mime_types: ['image/png', 'image/svg+xml'], max_file_size_bytes: 5_000_000 } }, ], }, - // Storyboard-hardcoded format ids (creative_lifecycle, creative_sales_agent). - // Aliased to close-enough shapes; storyboards pass a format_id string - // and don't care about subtle shape differences. + // Storyboard-hardcoded format ids (creative_lifecycle, + // media_buy_seller/creative_reception). Aliased to close-enough shapes; + // storyboards pass a format_id string and don't care about subtle + // shape differences. { format_id: { agent_url: agentUrl, id: 'video_30s' }, name: '30-second video', diff --git a/server/src/training-agent/task-handlers.ts b/server/src/training-agent/task-handlers.ts index 6c5f7450b3..ba11c9287c 100644 --- a/server/src/training-agent/task-handlers.ts +++ b/server/src/training-agent/task-handlers.ts @@ -421,7 +421,8 @@ interface CreativeDeliveryEntry { /** Sync creative result entry. */ interface SyncCreativeResult { creative_id: string; - action: 'created' | 'updated'; + action: 'created' | 'updated' | 'failed'; + errors?: TaskError[]; } /** Creative assignment result. 
*/ @@ -550,8 +551,12 @@ export function invalidateCache(): void { /** * Canonicalize an agent URL for equality comparison: lowercase scheme + host, - * strip a single trailing slash, preserve path case. Used to decide whether - * a caller-supplied `format_id.agent_url` points at this agent. + * strip default port, strip a single trailing slash, preserve path case. + * Used both for `format_id.agent_url` (does this point at this agent?) and + * for cross-checking buyer-supplied `verify_agent.agent_url` against the + * seller's `creative_policy.accepted_verifiers` allowlist (per the rule + * inlined into provenance.json: lowercase scheme and host, strip default + * port, normalize path dot-segments). */ function canonicalizeAgentUrl(url: string): string { try { @@ -565,6 +570,318 @@ function canonicalizeAgentUrl(url: string): string { } } +/** + * Backfill required Product fields for *fixture-seeded* products only. + * Catalog products are guaranteed complete by `buildCatalog`, so we never + * mutate them — that would alias the cached singleton across every + * subsequent request (`getCatalog().map(cp => ({...cp.product}))` is a + * shallow copy; `format_ids[]` and `reporting_capabilities` are shared + * references). Restrict to seeded IDs to keep the cache pristine. + * + * Defaults are sentinel values that pass spec validation; storyboards + * that need specific values still seed them explicitly via the fixture. + * `publisher_domain: 'training.example.com'` uses the IETF reserved + * `example.*` namespace (RFC 6761) — it cannot collide with a real + * publisher claim, but is still a sentinel: if any consumer of + * `publisher_properties` ever resolves the domain (DNS, brand.json + * fetch), the resolution will fail loudly rather than silently match an + * arbitrary domain. 
+ */ +function backfillTrainingProductDefaults(product: Product, ownAgentUrl: string): void { + const p = product as unknown as { + product_id?: string; + name?: string; + description?: string; + publisher_properties?: Array<{ publisher_domain: string; selection_type: string }>; + format_ids?: Array<{ agent_url?: string; id?: string }>; + pricing_options?: unknown[]; + reporting_capabilities?: Record<string, unknown>; + }; + if (!Array.isArray(p.format_ids) || p.format_ids.length === 0) { + p.format_ids = [{ agent_url: ownAgentUrl, id: 'display_300x250' }]; + } else { + for (const fid of p.format_ids) { + if (typeof fid === 'object' && fid !== null && !fid.agent_url) { + fid.agent_url = ownAgentUrl; + } + } + } + if (typeof p.name !== 'string' || p.name.length === 0) { + p.name = p.product_id ?? 'Test Product'; + } + if (typeof p.description !== 'string' || p.description.length === 0) { + p.description = `Fixture-seeded product ${p.product_id ?? ''}`.trim(); + } + if (!Array.isArray(p.publisher_properties) || p.publisher_properties.length === 0) { + p.publisher_properties = [{ publisher_domain: 'training.example.com', selection_type: 'all' }]; + } + if (!Array.isArray(p.pricing_options) || p.pricing_options.length === 0) { + p.pricing_options = [{ pricing_option_id: 'fixture_default_cpm', pricing_model: 'cpm', currency: 'USD', rate: 5 }]; + } + // reporting_capabilities is required and has six required sub-fields. Fill + // each missing sub-field individually so fixtures that seed *some* (e.g., + // available_metrics, vendor_metrics) don't fail validation on the rest. + const rc = (p.reporting_capabilities ??
{}) as Record<string, unknown>; + if (!Array.isArray(rc.available_reporting_frequencies) || (rc.available_reporting_frequencies as unknown[]).length === 0) { + rc.available_reporting_frequencies = ['daily']; + } + if (typeof rc.expected_delay_minutes !== 'number') rc.expected_delay_minutes = 60; + if (typeof rc.timezone !== 'string') rc.timezone = 'UTC'; + if (typeof rc.supports_webhooks !== 'boolean') rc.supports_webhooks = false; + if (!Array.isArray(rc.available_metrics)) rc.available_metrics = ['impressions', 'spend']; + if (typeof rc.date_range_support !== 'string') rc.date_range_support = 'date_range'; + p.reporting_capabilities = rc; +} + +// ── Provenance enforcement (creative_policy) ────────────────────── + +interface AcceptedVerifierEntry { + agent_url: string; + feature_id?: string; + providers?: string[]; +} + +interface ProvenanceRequirements { + require_digital_source_type?: boolean; + require_disclosure_metadata?: boolean; + require_embedded_provenance?: boolean; +} + +interface CreativePolicyView { + provenance_required?: boolean; + provenance_requirements?: ProvenanceRequirements; + accepted_verifiers?: AcceptedVerifierEntry[]; +} + +/** + * Aggregate `creative_policy` across the session's seeded products. The + * training agent applies the most-restrictive aggregation: if any product + * in the session demands a field, every `sync_creatives` submission is + * checked against that field. Mirrors how a real seller would treat a + * buyer's creative library — if the buyer might assign the creative to + * any product whose policy requires `disclosure`, the disclosure must be + * present on submission. + * + * Field-aggregation directions are deliberately asymmetric: + * - Requirement booleans (`provenance_required`, `require_*`) are + * ORed across products — most-restrictive wins because they're + * gates the buyer must clear. + * - `accepted_verifiers[]` is UNIONed across products — least- + * restrictive wins because it's an allowlist.
A buyer pointing at a + * verifier accepted by *any* of the seller's products in this + * session passes the cross-check. + * In short: allowlists are unioned and gate booleans are ORed. + * + * Returns `null` when no seeded product carries provenance policy. Pre- + * existing storyboards that don't seed provenance fields keep their + * "accept everything" behavior; only storyboards seeding policy fields + * trigger enforcement. + */ +function aggregateCreativePolicy(session: import('./types.js').SessionState): CreativePolicyView | null { + const { seededProducts } = session.complyExtensions; + if (seededProducts.size === 0) return null; + const acc: CreativePolicyView = {}; + let anyPolicy = false; + for (const fixture of seededProducts.values()) { + const policy = (fixture as { creative_policy?: CreativePolicyView } | undefined)?.creative_policy; + if (!policy) continue; + anyPolicy = true; + if (policy.provenance_required) acc.provenance_required = true; + if (policy.provenance_requirements) { + acc.provenance_requirements = acc.provenance_requirements ?? {}; + const req = policy.provenance_requirements; + if (req.require_digital_source_type) acc.provenance_requirements.require_digital_source_type = true; + if (req.require_disclosure_metadata) acc.provenance_requirements.require_disclosure_metadata = true; + if (req.require_embedded_provenance) acc.provenance_requirements.require_embedded_provenance = true; + } + if (policy.accepted_verifiers?.length) { + acc.accepted_verifiers = acc.accepted_verifiers ?? []; + acc.accepted_verifiers.push(...policy.accepted_verifiers); + } + } + return anyPolicy ? acc : null; +} + +interface CreativeManifestView { + provenance?: Record<string, unknown>; + assets?: Record<string, { provenance?: Record<string, unknown> }>; +} + +interface CreativeForEnforcement { + creative_id: string; + provenance?: Record<string, unknown>; + manifest?: CreativeManifestView; + // sync_creatives also carries provenance directly on the creative-asset + // and on a separate `creative_manifest` field per the spec.
+ creative_manifest?: CreativeManifestView; +} + +/** + * Resolve the manifest-level provenance for enforcement. Walks the spec's + * inheritance chain: most-specific wins, replace-not-merge. Asset-level + * overrides exist on the spec but the storyboard exercises manifest-level + * provenance; this implementation checks the manifest level first and + * falls back to creative-asset-level. Asset-level overrides aren't yet + * exercised by conformance, so they're not aggregated here. + */ +function resolveManifestProvenance(creative: CreativeForEnforcement): Record<string, unknown> | undefined { + const manifest = creative.creative_manifest ?? creative.manifest; + return manifest?.provenance ?? creative.provenance; +} + +/** + * Clamp a buyer-supplied string before interpolating it into an error + * message or field path. Strips C0/C1 controls (newlines, tabs, NUL, + * escape sequences) and caps length so log/transcript consumers that + * render the message in a terminal or HTML pane don't get poisoned by + * an attacker-shaped value. Length cap is generous enough to fit any + * legitimate URL or creative_id but small enough to bound an abusive + * payload's blast radius. + */ +function sanitizeForError(value: string, maxLen = 256): string { + // eslint-disable-next-line no-control-regex + return value.replace(/[\x00-\x1f\x80-\x9f]/g, '').slice(0, maxLen); +} + +/** + * Build a `TaskError` for a structural-rejection PROVENANCE_* code. + */ +function provenanceError( + code: + | 'PROVENANCE_REQUIRED' + | 'PROVENANCE_DIGITAL_SOURCE_TYPE_MISSING' + | 'PROVENANCE_DISCLOSURE_MISSING' + | 'PROVENANCE_EMBEDDED_MISSING' + | 'PROVENANCE_VERIFIER_NOT_ACCEPTED', + message: string, + field: string, +): TaskError { + return { code, message, field, recovery: 'correctable' } as TaskError; +} + +/** + * Apply seller-side `creative_policy` enforcement to a single creative + * submission. Returns the first PROVENANCE_* error if any structural + * check fails, or `null` when the creative passes.
+ * + * Cascade order (stable; storyboard assertions on `errors[0]` rely on it): + * 1. PROVENANCE_REQUIRED — provenance object absent + * 2. PROVENANCE_DIGITAL_SOURCE_TYPE_MISSING — required field absent + * 3. PROVENANCE_DISCLOSURE_MISSING — required field absent + * 4. PROVENANCE_EMBEDDED_MISSING — required field absent + * 5. PROVENANCE_VERIFIER_NOT_ACCEPTED — verify_agent off-list + * + * Reordering this cascade would change the first-error a buyer sees on a + * creative that fails multiple checks — keep it stable. If a future + * implementation accumulates errors instead of returning the first, the + * order above is the canonical priority for sorting. + * + * The truth-of-claim surface (PROVENANCE_CLAIM_CONTRADICTED, requires + * calling `get_creative_features` against an on-list verifier) is out + * of scope for this initial implementation — the structural codes are + * sufficient to exercise the wire contract end to end. + */ +function enforceProvenancePolicy( + creative: CreativeForEnforcement, + policy: CreativePolicyView | null, +): TaskError | null { + if (!policy) return null; + const provenance = resolveManifestProvenance(creative); + // creative_id is buyer-controlled — sanitize before interpolating into + // the field path so a payload with newlines or oversized strings can't + // poison the path a downstream consumer renders. + const safeId = sanitizeForError(creative.creative_id, 128); + const fieldRoot = `creatives[${safeId}].creative_manifest.provenance`; + + // 1. provenance_required — any provenance object must exist + if (policy.provenance_required && !provenance) { + return provenanceError( + 'PROVENANCE_REQUIRED', + `Seller's creative_policy.provenance_required is true; the submitted creative has no provenance object on the manifest.`, + `creatives[${safeId}].creative_manifest`, + ); + } + + // 2. 
require_digital_source_type + if (policy.provenance_requirements?.require_digital_source_type) { + const dst = provenance?.digital_source_type; + if (dst === undefined || dst === null) { + return provenanceError( + 'PROVENANCE_DIGITAL_SOURCE_TYPE_MISSING', + `Seller requires digital_source_type but the resolved provenance has none.`, + `${fieldRoot}.digital_source_type`, + ); + } + } + + // 3. require_disclosure_metadata: disclosure.required must be a boolean, + // and when true at least one disclosure.jurisdictions entry expected. + if (policy.provenance_requirements?.require_disclosure_metadata) { + const disclosure = provenance?.disclosure as { required?: unknown; jurisdictions?: unknown[] } | undefined; + if (!disclosure || typeof disclosure.required !== 'boolean') { + return provenanceError( + 'PROVENANCE_DISCLOSURE_MISSING', + `Seller requires disclosure metadata but the resolved provenance has no disclosure.required boolean.`, + `${fieldRoot}.disclosure`, + ); + } + if (disclosure.required === true && (!Array.isArray(disclosure.jurisdictions) || disclosure.jurisdictions.length === 0)) { + return provenanceError( + 'PROVENANCE_DISCLOSURE_MISSING', + `Seller requires disclosure metadata; disclosure.required is true but disclosure.jurisdictions is empty.`, + `${fieldRoot}.disclosure.jurisdictions`, + ); + } + } + + // 4. require_embedded_provenance — at least one entry + if (policy.provenance_requirements?.require_embedded_provenance) { + const embedded = provenance?.embedded_provenance; + if (!Array.isArray(embedded) || embedded.length === 0) { + return provenanceError( + 'PROVENANCE_EMBEDDED_MISSING', + `Seller requires embedded_provenance but the resolved provenance has none.`, + `${fieldRoot}.embedded_provenance`, + ); + } + } + + // 5. accepted_verifiers cross-check on every embedded_provenance[].verify_agent + // and watermarks[].verify_agent reference. 
Buyer-supplied agent_urls MUST + // canonicalize-match an entry in the seller's allowlist before the seller + // would call them. Off-list URLs are rejected without any outbound call. + if (policy.accepted_verifiers?.length) { + const allowed = new Set(policy.accepted_verifiers.map(v => canonicalizeAgentUrl(v.agent_url))); + type LayerWithVerifyAgent = { verify_agent?: { agent_url?: unknown } }; + const layers: Array<{ kind: 'embedded_provenance' | 'watermarks'; index: number; entry: LayerWithVerifyAgent }> = []; + const embedded = provenance?.embedded_provenance; + if (Array.isArray(embedded)) { + embedded.forEach((entry, index) => layers.push({ kind: 'embedded_provenance', index, entry: entry as LayerWithVerifyAgent })); + } + const watermarks = provenance?.watermarks; + if (Array.isArray(watermarks)) { + watermarks.forEach((entry, index) => layers.push({ kind: 'watermarks', index, entry: entry as LayerWithVerifyAgent })); + } + for (const layer of layers) { + const url = layer.entry.verify_agent?.agent_url; + if (typeof url !== 'string' || url.length === 0) continue; + if (!allowed.has(canonicalizeAgentUrl(url))) { + // `url` is buyer-controlled — sanitize before interpolation so a + // payload with newlines / ANSI escapes / oversized strings can't + // poison the message a downstream consumer renders. The field path + // is built from constants plus the already-sanitized creative_id, so it's safe.
+ return provenanceError( + 'PROVENANCE_VERIFIER_NOT_ACCEPTED', + `Buyer's verify_agent.agent_url "${sanitizeForError(url)}" is not in the seller's accepted_verifiers list.`, + `${fieldRoot}.${layer.kind}[${layer.index}].verify_agent.agent_url`, + ); + } + } + } + + return null; +} + /** * Merge products and pricing options seeded via comply_test_controller * (`seed_product`, `seed_pricing_option`) into the in-memory product map @@ -573,6 +890,15 @@ function canonicalizeAgentUrl(url: string): string { * the handlers consult (pricing_options with pricing_model/floor_price/ * fixed_price/etc) so fixture-driven storyboards can reference products * that don't live in the static catalog. + * + * The overlay also runs `backfillTrainingProductDefaults` on each seeded + * entry so missing required Product fields don't fail response-schema + * validation when get_products serializes them. Catalog products in the + * map are left untouched — `getCatalog().map(cp => ({...cp.product}))` + * is a shallow copy whose nested arrays/objects (`format_ids[]`, + * `reporting_capabilities`, ...) alias the cached catalog singleton, so + * mutating them would leak across requests. Restricting backfill to + * seeded IDs keeps the cache pristine. */ function overlaySeededProducts( session: import('./types.js').SessionState, @@ -594,6 +920,7 @@ function overlaySeededProducts( ...seededProducts.keys(), ...pricingByProduct.keys(), ]); + const ownAgentUrl = getAgentUrl(); for (const productId of productIds) { const existing = productMap.get(productId) ?? 
{} as Partial<Product>; const fixture = seededProducts.get(productId) as Partial<Product> | undefined; @@ -605,6 +932,7 @@ function overlaySeededProducts( pricing_options: seededPricing as unknown as Product['pricing_options'], }); } + backfillTrainingProductDefaults(merged as Product, ownAgentUrl); productMap.set(productId, merged as Product); } } @@ -1025,6 +1353,16 @@ export async function handleGetProducts(args: ToolArgs, ctx: TrainingContext) { let products: Product[] = getCatalog().map(cp => ({ ...cp.product })); + // Overlay seeded products from comply_test_controller fixtures so + // storyboard-seeded fields (e.g. creative_policy.provenance_requirements, + // accepted_verifiers) round-trip through get_products. The overlay + // also backfills required Product fields on seeded products so they + // serialize as schema-valid responses without forcing every fixture + // to repeat boilerplate. Catalog products are not touched. + const productMap = new Map(products.map(p => [p.product_id, p])); + overlaySeededProducts(session, productMap); + products = Array.from(productMap.values()); + // Apply filters if (req.filters) { const channelFilter = req.filters.channels; @@ -2094,6 +2432,10 @@ export async function handleSyncCreatives(args: ToolArgs, ctx: TrainingContext) // Build a set of valid format IDs for validation const validFormatIds = new Set(getFormats().map(f => f.format_id.id)); const ownAgentUrlCanonical = canonicalizeAgentUrl(getAgentUrl()); + // Compute the session's effective creative_policy from seeded products. + // Returns null when no fixture seeds policy fields — pre-existing + // storyboards that don't exercise provenance enforcement keep working.
+ const effectivePolicy = aggregateCreativePolicy(session); const results: SyncCreativeResult[] = []; for (const creative of req.creatives) { @@ -2108,6 +2450,20 @@ export async function handleSyncCreatives(args: ToolArgs, ctx: TrainingContext) const creativeId = creative.creative_id; const formatId = creative.format_id as FormatID; + // Enforce creative_policy.provenance_required / provenance_requirements / + // accepted_verifiers BEFORE persisting the creative. Per-creative failure + // is surfaced as action: 'failed' + errors[]; the surrounding session and + // any other creatives in the batch are unaffected (best-effort processing). + const policyError = enforceProvenancePolicy(creative as unknown as CreativeForEnforcement, effectivePolicy); + if (policyError) { + results.push({ + creative_id: creativeId, + action: 'failed', + errors: [policyError], + }); + continue; + } + // Reject clearly-malformed agent_urls before we persist them. Prevents // javascript:/data: or overlong URLs landing in JSONB via the pointer. if (formatId?.agent_url !== undefined) { diff --git a/server/tests/manual/run-storyboards.ts b/server/tests/manual/run-storyboards.ts index 20043652d4..0e38194a41 100644 --- a/server/tests/manual/run-storyboards.ts +++ b/server/tests/manual/run-storyboards.ts @@ -113,15 +113,14 @@ const KNOWN_FAILING_STORYBOARDS: ReadonlyMap<string, string> = new Map([ // Tracked upstream as adcp#3429; remove once the storyboard is migrated to // `envelope_field_present` AND the framework wraps capabilities responses. ['v3_envelope_integrity', 'adcp-client#1045 / adcp#3429 — storyboard asserts envelope status, framework capabilities tool returns unenveloped payload'], - // The storyboard exercises creative_policy.{provenance_required, - // provenance_requirements, accepted_verifiers} round-tripping through - // get_products and the PROVENANCE_*_MISSING / PROVENANCE_VERIFIER_NOT_ACCEPTED - // rejection paths on sync_creatives.
The training agent has no provenance - // enforcement yet — get_products doesn't surface the seeded creative_policy - // fields and sync_creatives accepts every submission. Three step validations - // fail against the reference implementation. Tracked as adcp#3777; remove - // this entry once the training agent enforces provenance per the spec. - ['creative_sales_agent/provenance_enforcement', 'adcp#3777 — training agent does not yet enforce creative_policy.provenance_requirements / accepted_verifiers (no PROVENANCE_* rejection paths in sync_creatives)'], + // Skeleton scenario for the truth-of-claim half of provenance enforcement + // (PROVENANCE_CLAIM_CONTRADICTED). The training agent does not yet + // implement get_creative_features against accepted_verifiers, so the + // verifier-driven contradiction path can't run end to end. Tracked in + // adcp#3802; remove once the training agent ships truth-of-claim + // verification and the storyboard's placeholder phase is fleshed out + // with the full negative + positive paths. 
+ ['media_buy_seller/provenance_truth_of_claim', 'adcp#3802 — training agent does not yet invoke get_creative_features against accepted_verifiers (truth-of-claim path); storyboard is a registered skeleton'], ]); /** diff --git a/static/compliance/source/protocols/media-buy/creative-reception.yaml b/static/compliance/source/protocols/media-buy/scenarios/creative_reception.yaml similarity index 93% rename from static/compliance/source/protocols/media-buy/creative-reception.yaml rename to static/compliance/source/protocols/media-buy/scenarios/creative_reception.yaml index 56b4b5fb59..565958bfa2 100644 --- a/static/compliance/source/protocols/media-buy/creative-reception.yaml +++ b/static/compliance/source/protocols/media-buy/scenarios/creative_reception.yaml @@ -1,7 +1,7 @@ -id: creative_sales_agent +id: media_buy_seller/creative_reception version: "1.0.0" title: "Sales agent with creative capabilities" -category: creative_sales_agent +category: media_buy_seller summary: "Stateful sales agent that accepts pushed creative assets and renders them in its environment." track: creative required_tools: @@ -60,7 +60,7 @@ phases: Return capabilities declaring creative in supported_protocols, confirming the agent handles creative operations. 
sample_request: context: - correlation_id: "creative_sales_agent--get_capabilities" + correlation_id: "media_buy_seller_creative_reception--get_capabilities" validations: - check: response_schema description: "Response matches get-adcp-capabilities-response.json schema" @@ -73,7 +73,7 @@ phases: description: "Response echoes back the context object" - check: field_value path: "context.correlation_id" - value: "creative_sales_agent--get_capabilities" + value: "media_buy_seller_creative_reception--get_capabilities" description: "Context correlation_id returned unchanged" - id: discover_accepted_formats title: "Discover accepted formats" @@ -157,9 +157,9 @@ phases: asset_type: "url" url: "https://acmeoutdoor.example/summer-sale" - idempotency_key: "$generate:uuid_v4#creative_sales_agent_push_creatives_sync_creatives" + idempotency_key: "$generate:uuid_v4#media_buy_seller_creative_reception_push_creatives_sync_creatives" context: - correlation_id: "creative_sales_agent--sync_creatives" + correlation_id: "media_buy_seller_creative_reception--sync_creatives" validations: - check: response_schema description: "Response matches sync-creatives-response.json schema" @@ -172,7 +172,7 @@ phases: description: "Response echoes back the context object" - check: field_value path: "context.correlation_id" - value: "creative_sales_agent--sync_creatives" + value: "media_buy_seller_creative_reception--sync_creatives" description: "Context correlation_id returned unchanged" - id: preview title: "Preview pushed creatives" @@ -230,7 +230,7 @@ phases: quality: "draft" context: - correlation_id: "creative_sales_agent--preview_synced" + correlation_id: "media_buy_seller_creative_reception--preview_synced" validations: - check: response_schema description: "Response matches preview-creative-response.json schema" @@ -243,5 +243,5 @@ phases: description: "Response echoes back the context object" - check: field_value path: "context.correlation_id" - value: "creative_sales_agent--preview_synced" 
+ value: "media_buy_seller_creative_reception--preview_synced" description: "Context correlation_id returned unchanged" diff --git a/static/compliance/source/protocols/media-buy/scenarios/provenance_enforcement.yaml b/static/compliance/source/protocols/media-buy/scenarios/provenance_enforcement.yaml index c3903873b7..9f77cbd57c 100644 --- a/static/compliance/source/protocols/media-buy/scenarios/provenance_enforcement.yaml +++ b/static/compliance/source/protocols/media-buy/scenarios/provenance_enforcement.yaml @@ -1,8 +1,8 @@ -id: creative_sales_agent/provenance_enforcement +id: media_buy_seller/provenance_enforcement version: "1.0.0" title: "Seller enforces provenance_requirements on sync_creatives" -category: creative_sales_agent -summary: "Seller publishes provenance_requirements + accepted_verifiers on a product. Two structural rejections (off-list verify_agent → PROVENANCE_VERIFIER_NOT_ACCEPTED, missing disclosure → PROVENANCE_DISCLOSURE_MISSING), then accepts a corrected resubmission with on-list verifier." +category: media_buy_seller +summary: "Seller publishes provenance_requirements + accepted_verifiers on a product. Four structural rejections (no provenance, missing digital_source_type, off-list verifier, missing disclosure), then a corrected resubmission with on-list verifier is accepted." track: creative required_tools: - get_products @@ -17,15 +17,16 @@ narrative: | provenance + an on-list verify_agent, and the seller confirms by structurally cross-checking the submission and (when applicable) calling an on-list agent. - This scenario walks the structural-rejection contract end to end. Phase 1: the - buyer discovers a product whose creative_policy declares require_disclosure_metadata - and accepted_verifiers. Phase 2: the buyer submits with provenance pointing at an - off-list verify_agent.agent_url — the seller MUST cross-check before any outbound - call and reject with PROVENANCE_VERIFIER_NOT_ACCEPTED, closing the buyer-controlled-URL - trust gap. 
Phase 3: the buyer's submission has an on-list verifier but omits the - required disclosure block — the seller MUST reject with PROVENANCE_DISCLOSURE_MISSING. - Phase 4: the buyer resubmits with disclosure populated and an on-list verify_agent; - the seller accepts. + This scenario walks the structural-rejection contract end to end across the + PROVENANCE_* family. Phase 1: discover a product whose creative_policy declares + the requirements + accepted_verifiers. Phase 2: submit a creative with no + provenance object at all → PROVENANCE_REQUIRED. Phase 3: submit with provenance + but no digital_source_type → PROVENANCE_DIGITAL_SOURCE_TYPE_MISSING. Phase 4: + point verify_agent at a URL outside accepted_verifiers → PROVENANCE_VERIFIER_NOT_ACCEPTED + (the seller MUST cross-check before any outbound call, closing the buyer- + controlled-URL trust gap). Phase 5: provenance present with on-list verifier + but no disclosure block → PROVENANCE_DISCLOSURE_MISSING. Phase 6: corrected + submission with disclosure and on-list verify_agent — accepted. The truth-of-claim contract (PROVENANCE_CLAIM_CONTRADICTED, surfaced via get_creative_features when the on-list verifier returns a result that contradicts @@ -56,13 +57,17 @@ prerequisites: fixtures: products: - product_id: "test-product-disclosure-required" + name: "Provenance Enforcement Test Product" + description: "Sandbox display inventory exercising AI disclosure provenance enforcement and accepted verifier allowlist" delivery_type: "non_guaranteed" + channels: ["display"] creative_policy: co_branding: "optional" landing_page: "any" templates_available: false provenance_required: true provenance_requirements: + require_digital_source_type: true require_disclosure_metadata: true accepted_verifiers: - agent_url: "https://governance.encypher.seller.example" @@ -98,7 +103,15 @@ phases: the seller's verifier allowlist before submission. 
sample_request: buying_mode: "brief" - brief: "Display inventory accepting creatives with AI disclosure metadata." + # Brief intentionally surfaces "Provenance Enforcement" so brief-mode + # keyword scoring places the seeded fixture product (which carries + # creative_policy.{provenance_required, provenance_requirements, + # accepted_verifiers}) at products[0]. This is a known coupling + # between the storyboard and brief-mode scoring; if the runner ever + # supports JSONPath predicates (`products[?(@.product_id=='...')]`), + # switch to wholesale mode + a predicate path so the assertion + # doesn't depend on scoring weights. + brief: "Provenance Enforcement display inventory — AI disclosure metadata required, accepted verifier allowlist published" brand: domain: "acmeoutdoor.example" account: @@ -126,6 +139,141 @@ phases: value: "provenance_enforcement--get_products" description: "Context correlation_id returned unchanged" + - id: reject_no_provenance + title: "Sync with no provenance object — rejected" + narrative: | + The buyer submits a creative with no provenance object anywhere on + the manifest. Because the seller's creative_policy.provenance_required + is true, the seller MUST reject the per-creative entry with + PROVENANCE_REQUIRED. This is the cheapest, most likely buyer mistake: + the orchestrator forgot to attach provenance at all. + + steps: + - id: sync_creatives_no_provenance + title: "Submit creative with no provenance object" + task: sync_creatives + schema_ref: "creative/sync-creatives-request.json" + response_schema_ref: "creative/sync-creatives-response.json" + doc_ref: "/creative/task-reference/sync_creatives" + stateful: true + expected: | + The seller accepts the request envelope but rejects the per-creative + entry with action: failed and an error code PROVENANCE_REQUIRED. 
+ sample_request: + account: + brand: + domain: "acmeoutdoor.example" + operator: "pinnacle-agency.example" + creatives: + - creative_id: "acme_no_provenance_probe_001" + name: "Acme no-provenance probe" + format_id: + agent_url: "https://your-platform.example.com" + id: "display_300x250" + assets: + headline: + asset_type: "text" + content: "Outdoor gear, photographed live" + image: + asset_type: "image" + url: "https://test-assets.adcontextprotocol.org/acme-outdoor/hero.jpg" + width: 300 + height: 250 + click_url: + asset_type: "url" + url: "https://acmeoutdoor.example/spring" + idempotency_key: "$generate:uuid_v4#provenance_enforcement_reject_no_provenance_sync" + context: + correlation_id: "provenance_enforcement--reject_no_provenance" + validations: + - check: response_schema + description: "Response matches sync-creatives-response.json schema" + - check: field_value + path: "creatives[0].action" + value: "failed" + description: "Per-creative action is failed for the no-provenance submission" + # Per-creative error assertions read errors[0].code positionally. + # The handler emits errors in the cascade order documented on + # enforceProvenancePolicy (PROVENANCE_REQUIRED first), so [0] is + # stable. If a future implementation accumulates errors, the same + # cascade priority should drive the array order. + - check: field_value + path: "creatives[0].errors[0].code" + value: "PROVENANCE_REQUIRED" + description: "Per-creative error code is PROVENANCE_REQUIRED — provenance object absent on a creative under a policy that requires it" + - check: field_value + path: "context.correlation_id" + value: "provenance_enforcement--reject_no_provenance" + description: "Context correlation_id returned unchanged on rejection" + + - id: reject_missing_digital_source_type + title: "Sync with provenance but no digital_source_type — rejected" + narrative: | + The buyer's provenance object is present but omits digital_source_type. 
+ The seller's creative_policy.provenance_requirements.require_digital_source_type + is true, so the seller MUST reject the per-creative entry with + PROVENANCE_DIGITAL_SOURCE_TYPE_MISSING. Distinct from PROVENANCE_REQUIRED + (no provenance at all) — provenance is present, just missing this field. + + steps: + - id: sync_creatives_no_digital_source_type + title: "Submit creative with provenance but no digital_source_type" + task: sync_creatives + schema_ref: "creative/sync-creatives-request.json" + response_schema_ref: "creative/sync-creatives-response.json" + doc_ref: "/creative/task-reference/sync_creatives" + stateful: true + expected: | + The seller accepts the request envelope but rejects the per-creative + entry with action: failed and an error code + PROVENANCE_DIGITAL_SOURCE_TYPE_MISSING. + sample_request: + account: + brand: + domain: "acmeoutdoor.example" + operator: "pinnacle-agency.example" + creatives: + - creative_id: "acme_no_dst_probe_001" + name: "Acme no-digital_source_type probe" + format_id: + agent_url: "https://your-platform.example.com" + id: "display_300x250" + assets: + headline: + asset_type: "text" + content: "Outdoor gear, photographed live" + image: + asset_type: "image" + url: "https://test-assets.adcontextprotocol.org/acme-outdoor/hero.jpg" + width: 300 + height: 250 + click_url: + asset_type: "url" + url: "https://acmeoutdoor.example/spring" + provenance: + # Provenance present, but digital_source_type intentionally absent + # so the require_digital_source_type policy gate fires. 
+ declared_by: + role: "agency" + idempotency_key: "$generate:uuid_v4#provenance_enforcement_reject_no_dst_sync" + context: + correlation_id: "provenance_enforcement--reject_no_digital_source_type" + validations: + - check: response_schema + description: "Response matches sync-creatives-response.json schema" + - check: field_value + path: "creatives[0].action" + value: "failed" + description: "Per-creative action is failed for the no-digital_source_type submission" + - check: field_value + path: "creatives[0].errors[0].code" + value: "PROVENANCE_DIGITAL_SOURCE_TYPE_MISSING" + description: "Per-creative error code identifies the missing digital_source_type field" + - check: field_value + path: "context.correlation_id" + value: "provenance_enforcement--reject_no_digital_source_type" + description: "Context correlation_id returned unchanged on rejection" + - id: reject_off_list_verifier title: "Sync with off-list verify_agent — rejected" narrative: | @@ -144,8 +292,6 @@ phases: schema_ref: "creative/sync-creatives-request.json" response_schema_ref: "creative/sync-creatives-response.json" doc_ref: "/creative/task-reference/sync_creatives" - expect_error: true - negative_path: payload_well_formed stateful: true expected: | The seller accepts the request envelope but rejects the per-creative entry @@ -207,7 +353,8 @@ phases: path: "creatives[0].action" value: "failed" description: "Per-creative action is failed for the off-list verifier submission" - - check: error_code + - check: field_value + path: "creatives[0].errors[0].code" value: "PROVENANCE_VERIFIER_NOT_ACCEPTED" description: "Per-creative error code is PROVENANCE_VERIFIER_NOT_ACCEPTED — buyer-asserted URL was outside the seller's accepted_verifiers allowlist" - check: field_value @@ -232,8 +379,6 @@ phases: schema_ref: "creative/sync-creatives-request.json" response_schema_ref: "creative/sync-creatives-response.json" doc_ref: "/creative/task-reference/sync_creatives" - expect_error: true - negative_path: 
payload_well_formed stateful: true expected: | The seller accepts the request envelope but rejects the per-creative entry @@ -281,7 +426,8 @@ phases: path: "creatives[0].action" value: "failed" description: "Per-creative action is failed for the missing-disclosure submission" - - check: error_code + - check: field_value + path: "creatives[0].errors[0].code" value: "PROVENANCE_DISCLOSURE_MISSING" description: "Per-creative error code identifies the missing disclosure requirement; buyers can self-correct without negotiating" - check: field_value @@ -361,9 +507,10 @@ phases: validations: - check: response_schema description: "Response matches sync-creatives-response.json schema" - - check: field_present + - check: field_value path: "creatives[0].action" - description: "Per-creative action is present (created or updated, not failed)" + allowed_values: ["created", "updated"] + description: "Per-creative action is created or updated — not failed. Tighter than field_present, which would silently pass on action: failed" - check: field_value path: "context.correlation_id" value: "provenance_enforcement--accept_with_disclosure" diff --git a/static/compliance/source/protocols/media-buy/scenarios/provenance_truth_of_claim.yaml b/static/compliance/source/protocols/media-buy/scenarios/provenance_truth_of_claim.yaml new file mode 100644 index 0000000000..e68b719e3e --- /dev/null +++ b/static/compliance/source/protocols/media-buy/scenarios/provenance_truth_of_claim.yaml @@ -0,0 +1,114 @@ +id: media_buy_seller/provenance_truth_of_claim +version: "0.1.0" +title: "Seller refutes a buyer's provenance claim via on-list verifier" +category: media_buy_seller +summary: "Buyer attaches a digital_source_type claim. Seller calls an on-list governance agent via get_creative_features; the verifier returns a result that contradicts the claim, and the seller rejects with PROVENANCE_CLAIM_CONTRADICTED. 
Skeleton scenario — not yet runnable end-to-end (the training agent does not implement get_creative_features against verifiers)." +track: creative +required_tools: + - get_products + - sync_creatives + - get_creative_features + +narrative: | + This is the truth-of-claim half of the provenance enforcement contract. + The structural-rejection half (media_buy_seller/provenance_enforcement) + exercises the PROVENANCE_*_MISSING / PROVENANCE_VERIFIER_NOT_ACCEPTED codes + — failures the seller can detect by inspecting the submission against + creative_policy. This scenario exercises PROVENANCE_CLAIM_CONTRADICTED: + the seller calls a governance agent from creative_policy.accepted_verifiers + via get_creative_features, the verifier returns a result that contradicts + the buyer's provenance claim (e.g., buyer claims digital_source_type: + digital_capture but the AI-detection feature returns ai_generated: true + above the seller's confidence threshold), and the seller rejects. + + This is a SKELETON. The training agent does not yet: + 1. Stand up a governance-agent endpoint on its own URL + 2. Invoke get_creative_features against accepted_verifiers entries + 3. Reconcile feature results against the buyer's provenance claim + 4. Emit PROVENANCE_CLAIM_CONTRADICTED with the audit-safe error.details + allowlist (agent_url, feature_id, claimed_value, observed_value, + confidence, substituted_for) + + Tracked via the issue referenced from KNOWN_FAILING_STORYBOARDS in + server/tests/manual/run-storyboards.ts. When the training agent ships + the truth-of-claim path, remove the KNOWN_FAILING entry and flesh out + this storyboard's phases (currently a single placeholder step). 
+ +agent: + interaction_model: stateful_push + capabilities: + - has_creative_library + examples: + - "Publishers and SSPs that run independent AI detection against buyer-claimed provenance" + +caller: + role: buyer_agent + example: "Pinnacle Agency (buyer)" + +prerequisites: + description: | + Seller publishes a product with creative_policy.accepted_verifiers + pointing at a governance agent that implements get_creative_features. + The training agent's get_creative_features path is not yet implemented; + this storyboard is a skeleton tracked as known-failing. + test_kit: "test-kits/acme-outdoor.yaml" + controller_seeding: true + +fixtures: + products: + - product_id: "test-product-truth-of-claim" + name: "Provenance Truth-of-Claim Test Product" + description: "Sandbox display inventory exercising verifier-driven contradiction of buyer provenance claims" + delivery_type: "non_guaranteed" + channels: ["display"] + creative_policy: + co_branding: "optional" + landing_page: "any" + templates_available: false + provenance_required: true + accepted_verifiers: + - agent_url: "https://governance.encypher.seller.example" + feature_id: "ai_generated" + providers: ["Encypher"] + pricing_options: + - pricing_option_id: "test-pricing-cpm" + pricing_model: "cpm" + rate: 5.00 + currency: "USD" + +phases: + - id: placeholder + title: "Skeleton phase — not yet runnable" + narrative: | + This phase is a placeholder so the storyboard parses and registers in + the conformance inventory. The full flow will be: get_products to read + accepted_verifiers, sync_creatives with a contradicted claim, expect + action: failed with creatives[0].errors[0].code == + PROVENANCE_CLAIM_CONTRADICTED and error.details limited to the safe + allowlist. Track in run-storyboards.ts KNOWN_FAILING_STORYBOARDS. 
+ + steps: + - id: get_products_truth_of_claim + title: "Discover the truth-of-claim product" + task: get_products + schema_ref: "media-buy/get-products-request.json" + response_schema_ref: "media-buy/get-products-response.json" + doc_ref: "/media-buy/task-reference/get_products" + stateful: false + expected: | + Skeleton — the storyboard is on KNOWN_FAILING until the training + agent implements truth-of-claim verification. + sample_request: + buying_mode: "brief" + brief: "Provenance Truth-of-Claim display inventory — verifier-driven AI detection" + brand: + domain: "acmeoutdoor.example" + account: + brand: + domain: "acmeoutdoor.example" + operator: "pinnacle-agency.example" + context: + correlation_id: "provenance_truth_of_claim--get_products" + validations: + - check: response_schema + description: "Response matches get-products-response.json schema"
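Reviewer note: the structural-rejection cascade that the provenance_enforcement storyboard asserts positionally (`creatives[0].errors[0].code`) can be sketched roughly as below. This is a minimal illustration, not the training agent's actual `enforceProvenancePolicy`. The `CreativePolicy` and `Provenance` shapes, the `disclosure` and `verify_agent` field locations, and the helper names `canonicalize` and `firstProvenanceError` are assumptions made for the example. `PROVENANCE_EMBEDDED_MISSING` is left out because the storyboard does not exercise it.

```typescript
// Hypothetical shapes: field names mirror the storyboard fixture, not the real source.
type ProvenanceCode =
  | "PROVENANCE_REQUIRED"
  | "PROVENANCE_DIGITAL_SOURCE_TYPE_MISSING"
  | "PROVENANCE_DISCLOSURE_MISSING"
  | "PROVENANCE_VERIFIER_NOT_ACCEPTED";

interface CreativePolicy {
  provenance_required?: boolean;
  provenance_requirements?: {
    require_digital_source_type?: boolean;
    require_disclosure_metadata?: boolean;
  };
  accepted_verifiers?: { agent_url: string }[];
}

interface Provenance {
  digital_source_type?: string;
  disclosure?: unknown; // assumed name for the "disclosure block"
  verify_agent?: { agent_url: string }; // assumed location of the verifier URL
}

// Simplistic canonicalization (assumption): the URL parser lower-cases the host,
// trailing slashes are stripped, query and fragment are dropped.
function canonicalize(raw: string): string | null {
  try {
    const u = new URL(raw);
    const path = u.pathname.replace(/\/+$/, "");
    return `${u.protocol}//${u.host}${path}`;
  } catch {
    return null; // unparseable URLs can never match the allowlist
  }
}

// Returns the first failing code in cascade order, or null when the creative passes.
// Purely structural: no outbound call is made to any verifier.
function firstProvenanceError(
  policy: CreativePolicy,
  provenance?: Provenance
): ProvenanceCode | null {
  const req = policy.provenance_requirements ?? {};
  if (!provenance) {
    return policy.provenance_required ? "PROVENANCE_REQUIRED" : null;
  }
  if (req.require_digital_source_type && !provenance.digital_source_type) {
    return "PROVENANCE_DIGITAL_SOURCE_TYPE_MISSING";
  }
  if (req.require_disclosure_metadata && provenance.disclosure === undefined) {
    return "PROVENANCE_DISCLOSURE_MISSING";
  }
  const claimed = provenance.verify_agent?.agent_url;
  if (claimed !== undefined) {
    const allowed = new Set(
      (policy.accepted_verifiers ?? []).map((v) => canonicalize(v.agent_url))
    );
    const canonical = canonicalize(claimed);
    if (canonical === null || !allowed.has(canonical)) {
      return "PROVENANCE_VERIFIER_NOT_ACCEPTED";
    }
  }
  return null;
}
```

Returning only the first failure keeps `errors[0]` stable, which is what the positional assertions rely on. An implementation that accumulates errors would need to sort them by the same priority.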