Add JS Asset Auditor plugin with Playwright CLI by ChristianPavilonis · Pull Request #633 · IABTechLab/trusted-server

ChristianPavilonis · 2026-04-13T18:38:57Z

Summary

Add JS Asset Auditor as a Claude Code plugin at packages/js-asset-auditor/ with a standalone Playwright CLI that sweeps publisher pages for third-party JS assets
Auto-detect integrations (GPT, GTM, Didomi, DataDome, Lockr, Permutive, Prebid, APS) from swept URLs and generate trusted-server.toml config with --config flag
Expand heuristic filters for Google ad rendering, ad verification, ad fraud detection, and reCAPTCHA to reduce noise
Auto-include the target URL host as first-party; make trusted-server.toml optional via the --domain flag for portability
Launch the browser headed by default to avoid bot detection (captchas, DataDome, etc.)

Try it out

1. Check out the branch

git fetch origin
git checkout feature/js-asset-auditor

2. Install dependencies

cd packages/js-asset-auditor && npm install && npx playwright install chromium
cd ../..

3. Run the CLI directly

# Basic audit — opens a browser, sweeps the page, writes js-assets.toml
node packages/js-asset-auditor/lib/audit.mjs https://www.publisher.com

# Also generate trusted-server.toml with detected integrations
node packages/js-asset-auditor/lib/audit.mjs https://www.publisher.com --config

# Custom output paths
node packages/js-asset-auditor/lib/audit.mjs https://www.publisher.com \
  --output /tmp/assets.toml --config /tmp/config.toml

# Headless mode (for CI, may be blocked by bot protection)
node packages/js-asset-auditor/lib/audit.mjs https://www.publisher.com --headless

4. Use as a Claude Code plugin

claude --plugin-dir packages/js-asset-auditor

Then in Claude Code:

/js-asset-auditor:audit-js-assets https://www.publisher.com
/js-asset-auditor:audit-js-assets https://www.publisher.com --config

CLI flags

Flag	Description
`--diff`	Compare sweep against existing `js-assets.toml`
`--settle <ms>`	Settle window after page load (default: 6000)
`--first-party <hosts>`	Additional first-party hosts (comma-separated)
`--domain <host>`	Publisher domain (falls back to `trusted-server.toml` or URL)
`--no-filter`	Bypass heuristic filtering
`--headless`	Run browser without UI (default is headed)
`--output <path>`	Output file for js-assets.toml (default: `js-assets.toml`)
`--config [path]`	Generate `trusted-server.toml` with detected integrations
`--force`	Overwrite existing config file

Test plan

Run against a publisher page and verify js-assets.toml output
Run with --config and verify detected integrations in generated trusted-server.toml
Run with --diff against an existing js-assets.toml and verify confirmed/new/missing
Run without trusted-server.toml (e.g., from /tmp) and verify domain inference
Load as Claude Code plugin and test the skill
Verify --config without --force errors when file already exists
Verify bot-protected sites work in headed mode (e.g., autoblog.com)

Closes #631

Engineering spec for the /audit-js-assets command. Covers sweep protocol, Chrome DevTools MCP tooling, heuristic filtering, slug generation, init and diff modes. Closes #606

Fix incorrect MCP tool name prefix, replace misused wait_for with evaluate_script setTimeout, correct list_network_requests filtering to use resourceTypes, resolve path derivation contradiction with consistent /js-assets/{prefix}/{stem}.js formula, pin slug separator and base62 charset, add URL Processing section with normalization rules and first-party boundary definition, tighten wildcard regex to require mixed character classes, and move skill location to .claude/commands/.

Implement the /audit-js-assets command that sweeps a publisher page via Chrome DevTools MCP, detects third-party JS assets, and generates js-assets.toml entries. Includes a shared slug generation script (SHA-256 + base62) and adds MCP permission grants for navigate_page, list_network_requests, and close_page.

Move URL normalization, filtering, wildcard detection, slug generation, and TOML formatting into scripts/audit-js-assets.mjs. The skill now collects raw browser data and delegates processing to the script, replacing fragile LLM-side URL manipulation. Expand heuristic filter with Google ad rendering, ad fraud detection, ad verification, and reCAPTCHA categories. Auto-include target URL host as first-party. Add --no-filter flag. Fix semver regex to match alpha suffixes like 1.19.8-hcskhn.

Replace MCP-driven browser automation with a standalone Playwright CLI at tools/js-asset-auditor/audit.mjs. One command sweeps a publisher page, collects script URLs, processes them through the shared pipeline, and writes js-assets.toml. Refactor scripts/audit-js-assets.mjs to export processAssets() so both the stdin-based pipeline and the Playwright CLI share the same processing logic. Simplify the Claude skill from 115 to 59 lines — it now calls the CLI and formats the JSON summary.

Rewrite sweep protocol, implementation, and verification sections to describe the three-component architecture: Playwright CLI, processing library, and Claude Code skill wrapper. Add direct CLI invocation examples, --headed flag, first-party auto-detection verification, and ad-rendering filter verification steps.

Restructure into packages/js-asset-auditor/ as a self-contained Claude Code plugin with .claude-plugin/plugin.json manifest, skills/ directory, bin/ executable, and lib/ processing modules. The plugin provides the audit-js-assets skill and CLI automatically when enabled. Remove tools/js-asset-auditor/, scripts/audit-js-assets.mjs, and .claude/commands/audit-js-assets.md — all replaced by the plugin.

Enables installing the JS Asset Auditor plugin from this repo via /plugin marketplace add <org>/trusted-server followed by /plugin install js-asset-auditor.

Add --domain flag and fall back to inferring from the target URL when trusted-server.toml is not present. Enables using the plugin in any project without project-specific config.

Reflect the plugin layout at packages/js-asset-auditor/, update all file paths, document the domain resolution fallback chain (--domain flag > trusted-server.toml > infer from URL), and update skill invocation to use the namespaced /js-asset-auditor:audit-js-assets format.

New --config [path] flag auto-detects integrations (GPT, GTM, Didomi, DataDome, Lockr, Permutive, Prebid, APS) from swept script URLs and generates a trusted-server.toml with appropriate [integrations.*] sections. Auto-extracts fields like GTM container_id from query params and Permutive org/workspace IDs from URL paths. Fields needing manual input are marked with TODO comments.

Switch from headless-by-default to headed-by-default. Sites with bot protection (DataDome, Cloudflare, etc.) block headless browsers. The --headed flag becomes --headless for CI/automation use cases.

aram356

Summary

Additive Claude Code plugin (packages/js-asset-auditor/) with a Playwright-based CLI for sweeping publisher pages, a processing library, and integration auto-detection. No Rust changes. Review uncovered three blocking issues: the format-docs CI gate is red, the generated trusted-server.toml emits an invalid bidders = "" for Prebid, and the CLI arg parser silently consumes the next flag when a value is omitted. A handful of non-blocking refactor/hardening suggestions follow.

Blocking

🔧 wrench

format-docs CI failing: prettier wants table column realignment in docs/superpowers/specs/2026-04-01-js-asset-auditor-design.md (heuristic filter table and detection patterns table). Fix: cd docs && npx prettier --write superpowers/specs/2026-04-01-js-asset-auditor-design.md. (inline at line 100)
Generated bidders = "" is invalid for Rust Prebid config: PrebidConfig::bidders is Vec<String> (crates/trusted-server-core/src/integrations/prebid.rs:69). Users who flip enabled = true on a generated [integrations.prebid] block get a deserialization failure. (inline at detect.mjs:266)
--first-party / --settle / --output / --domain consume the next flag: passing a value-taking flag without a value silently swallows the following arg. Reproduced. (inline at audit.mjs:84)
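
The flag-swallowing case in the last item can be illustrated with a minimal sketch. This is not the plugin's parser: `parseArgs`, `VALUE_FLAGS`, and the error text are hypothetical, and the optional-value `--config [path]` flag is deliberately left out.

```javascript
// Hypothetical helper: names and error text are illustrative, not the plugin's API.
const VALUE_FLAGS = new Set(["--first-party", "--settle", "--output", "--domain"]);

function parseArgs(argv) {
  const options = {};
  const positional = [];
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (VALUE_FLAGS.has(arg)) {
      const value = argv[i + 1];
      // Refuse to treat a missing value or another flag as this flag's value.
      if (value === undefined || value.startsWith("--")) {
        throw new Error(`${arg} requires a value`);
      }
      options[arg.slice(2)] = value;
      i++; // skip the value that was just consumed
    } else if (arg.startsWith("--")) {
      options[arg.slice(2)] = true; // boolean flag such as --headless
    } else {
      positional.push(arg);
    }
  }
  return { options, positional };
}
```

With this shape, `parseArgs(["--output", "--headless"])` throws instead of writing to a file named `--headless`.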

Non-blocking

♻️ refactor

Slug algorithm duplicated in scripts/js-asset-slug.mjs — make it import from packages/js-asset-auditor/lib/process.mjs. Silent drift here breaks KV lookups against the Rust proxy. (inline at scripts/js-asset-slug.mjs:78)
Stale help block in audit.mjs — references --headed (not implemented; actual flag is --headless) and omits --config, --force, --domain. (inline at audit.mjs:18)

🤔 thinking

readPublisherDomain conflates parse errors with missing file — a malformed but present trusted-server.toml silently falls through to URL-inferred domain, producing wrong slugs (publisher prefix depends on domain). Distinguish ENOENT from parse failure. (inline at audit.mjs:142)
--config existence check uses readFileSync — use fs.existsSync. (inline at audit.mjs:238)
No unit tests for process.mjs / detect.mjs — both have nontrivial logic (wildcard regex, first-party boundary, heuristic filter with two match shapes, integration field extraction). Vitest already runs in crates/js/lib; adding fixtures there (or alongside the plugin) would catch drift, especially for the slug hash that must match the Rust proxy.
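
A rough sketch of the ENOENT-versus-parse-failure split suggested for readPublisherDomain. Everything here is an assumption for illustration: `resolveDomainFromConfig` is a hypothetical name, and the file reader and domain parser are injected so the control flow is visible without touching the filesystem.

```javascript
// Hypothetical sketch: `readFile` and `parseDomain` are injected stand-ins;
// the real readPublisherDomain may be structured differently.
function resolveDomainFromConfig(readFile, parseDomain, configPath) {
  let text;
  try {
    text = readFile(configPath);
  } catch (err) {
    // A missing file is the only case where falling back to the URL is safe.
    if (err && err.code === "ENOENT") return null;
    throw err;
  }
  // The file exists, so a parse problem must surface instead of falling through.
  const domain = parseDomain(text);
  if (!domain) {
    throw new Error(`Could not find [publisher].domain in ${configPath}`);
  }
  return domain;
}
```

In Node, `readFile` could be `(p) => readFileSync(p, "utf-8")`. A `null` return tells the caller to infer the domain from the target URL; any other failure stops the run.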

⛏ nitpick

GTM match uses pathname.includes("/gtm.js") — .endsWith("/gtm.js") removes theoretical ambiguity. (inline at detect.mjs:37)
formatTomlValue doesn't escape " / \ — fine today since inputs are static, but JSON.stringify on the string branch is free escape handling for future patterns. (inline at detect.mjs:226)
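
For the escaping nitpick, a minimal sketch of what a JSON.stringify-based string branch could look like. This is a hypothetical stand-in for formatTomlValue, not the code in detect.mjs; it assumes values are strings, booleans, finite numbers, or arrays of those.

```javascript
// Hypothetical stand-in for formatTomlValue; the real function may differ.
function formatTomlValue(value) {
  if (typeof value === "string") {
    // JSON.stringify escapes `"` and `\` and wraps the result in double quotes.
    return JSON.stringify(value);
  }
  if (typeof value === "boolean") {
    return String(value);
  }
  if (typeof value === "number" && Number.isFinite(value)) {
    return String(value);
  }
  if (Array.isArray(value)) {
    return `[${value.map(formatTomlValue).join(", ")}]`;
  }
  throw new TypeError(`Unsupported TOML value: ${typeof value}`);
}
```

An empty array renders as `[]` rather than `""`, which is the shape a `Vec<String>` field such as Prebid's bidders expects. JSON escapes overlap with TOML basic-string escapes for common input, but unusual characters (for example DEL or lone surrogates) are not covered by this sketch.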

🌱 seedling

Cross-language slug fixture test — the spec claims the JS slug must produce identical output to the Rust proxy's KV key derivation, but the Rust proxy lives on a separate branch. Once it lands, a shared-fixture test (same {domain, url} → same slug in both JS and Rust) is the only reliable guard against silent drift. Worth a follow-up issue.

CI Status

cargo fmt: PASS
cargo clippy: PASS
rust tests: PASS
vitest (crates/js/lib): PASS
format-typescript: PASS
format-docs: FAIL (see 🔧 above)

prk-Jr

Summary

This plugin is close, but there are three blocking behavior issues in the CLI/config pipeline that can mislead operators or produce unusable output. CI is green, but these cases need to be fixed before the tool is safe to rely on for onboarding and diff workflows.

Blocking

🔧 wrench

Generated configs enable integrations before required fields are present (packages/js-asset-auditor/lib/detect.mjs:260)
Diff mode keeps re-appending the same NEW assets on repeated runs (packages/js-asset-auditor/lib/process.mjs:214)
--config conflict only logs an error and still exits successfully (packages/js-asset-auditor/lib/audit.mjs:266)

aram356

Summary

Re-review against main after 66951b9. That commit cleanly addresses every finding from my prior review (format-docs, bidders = "", arg-parser flag-swallowing, slug duplication, stale help block, existsSync, GTM endsWith, TOML escaping, ENOENT-vs-parse distinction), with focused regression tests for each.

However, the 2026-04-22 review from @prk-Jr is not addressed by the latest push — I confirmed all three of their blockers in the worktree — and I found two additional blocking-class issues while re-reading the 18 changed files.

Blocking

🔧 wrench

enabled = true for partial / detect_only integrations — generated config boot-fails on missing required fields (server_url, pub_id, app_id). packages/js-asset-auditor/lib/detect.mjs:260 (inline).
Diff mode re-appends the same NEW assets on repeated runs — reproduced: same asset appears 6× after 3 runs because parseExistingToml ignores commented suggestions. packages/js-asset-auditor/lib/process.mjs:214 (inline).
--config collision exits 0 with success summary — violates spec, breaks automation. packages/js-asset-auditor/lib/audit.mjs:269 (inline).
--config default path collides with live trusted-server.toml — with --force, silently overwrites the real publisher config with a TODO skeleton. packages/js-asset-auditor/lib/audit.mjs:136 (inline).
Plugin unit tests are not wired into CI — .github/workflows/test.yml only runs vitest in crates/js/lib. The 8 regression tests added in 66951b9 (the sole guard against drift on slug algorithm, integration detection, arg parsing) run locally only, so regressions will land on main unnoticed. Add a step that runs cd packages/js-asset-auditor && npm ci && npm test on PRs touching this directory, or unconditionally.

Non-blocking

🤔 thinking

No diff-idempotence test — processAssets(..., { diff: true }) run twice against its own output should yield summary.new.length === 0. A fixture test in packages/js-asset-auditor/test/process.test.mjs would have caught the re-append regression and will keep catching it.
SKILL.md still says "headless browser" — packages/js-asset-auditor/skills/audit-js-assets/SKILL.md:24. The CLI has defaulted to headed since e0c7e0c. Misleading for sandbox users.
data: / blob: script URLs produce "null" origin in slugs — new URL("data:…").origin === "null" so applyWildcards yields null/<pathname>. Rare for <script src>, but one protocol guard is cheap.
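
The idempotence property from the first item can be sketched as follows. The file layout is an assumption: this supposes new suggestions are appended as commented `# origin_url = "…"` lines, which may not match what process.mjs writes. `collectKnownOrigins` and `findNewOrigins` are hypothetical names.

```javascript
// Assumption: new suggestions are appended as commented lines such as
//   # origin_url = "https://cdn.example.com/a.js"
// The real js-assets.toml layout may differ.
function collectKnownOrigins(tomlText) {
  const known = new Set();
  for (const line of tomlText.split("\n")) {
    // The optional leading "#" makes commented suggestions count as already seen.
    const match = line.match(/^\s*#?\s*origin_url\s*=\s*"([^"]+)"/);
    if (match) known.add(match[1]);
  }
  return known;
}

function findNewOrigins(tomlText, sweptOrigins) {
  const known = collectKnownOrigins(tomlText);
  return sweptOrigins.filter((origin) => !known.has(origin));
}
```

Once a suggestion has been appended, a second call with the same sweep returns an empty list, which is the `summary.new.length === 0` condition described above. A line scan is used here only because a TOML parser discards comments.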

♻️ refactor

readPublisherDomain hand-rolls a TOML scan (lib/audit.mjs:37-61) — rejects single-quoted strings and any non-^domain = "…" shape. A small TOML lib would be more robust. Non-urgent.

⛏ nitpick

APS pattern uses pathname.includes("/apstag") (lib/detect.mjs:152) — same class of ambiguity as the GTM case tightened in 66951b9. Consider anchoring.

CI Status

cargo fmt: PASS
cargo clippy / Analyze (rust): PASS
cargo test: PASS
vitest (crates/js/lib): PASS
format-typescript: PASS
format-docs: PASS
browser integration tests: PASS
CodeQL: PASS
plugin tests (packages/js-asset-auditor/test): NOT WIRED INTO CI — see blocking #5

aram356

🔧 Prints test.publisher.com instead of the actual domain because the value is read from the TOML file

🔧 Please add guide documentation

ChristianPavilonis · 2026-04-24T16:15:07Z

Addressed the remaining review feedback in 00ce787 and replied/resolved the inline threads:

incomplete integrations are no longer auto-enabled
diff mode is now idempotent across repeated runs
--config collisions now fail non-zero before the success summary
no-arg --config now writes to trusted-server.generated.toml
plugin tests are wired into CI
added docs/guide/js-asset-auditor.md
clarified domain precedence and source selection in the CLI/skill/docs
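
As a rough illustration of the `--config` behavior described in this list (default path plus a hard failure on collision), under stated assumptions: `resolveConfigTarget` is a hypothetical helper, the error text is invented, and the existence check is injected rather than taken from the plugin.

```javascript
// Hypothetical helper; `exists` is injected (for example fs.existsSync)
// so the sketch stays self-contained.
const DEFAULT_GENERATED_CONFIG = "trusted-server.generated.toml";

function resolveConfigTarget({ configArg, force = false, exists }) {
  // Assumption: `--config` with no path arrives as `true`,
  // and an explicit path arrives as a string.
  const target =
    typeof configArg === "string" ? configArg : DEFAULT_GENERATED_CONFIG;
  if (exists(target) && !force) {
    throw new Error(`${target} already exists; pass --force to overwrite`);
  }
  return target;
}
```

A caller would catch the error before printing any success summary and set `process.exitCode = 1`, so automation sees the collision as a failure.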

Validation run after the changes:

cd packages/js-asset-auditor && npm test
cd docs && npm run format
cargo test --workspace
cargo fmt --all -- --check
cargo clippy --workspace --all-targets --all-features -- -D warnings

Re-requesting review from @aram356 and @prk-Jr.

aram356

Summary

Re-review against main after 5b3481a. Every prior blocker from the 2026-04-22 (prk-Jr) and 2026-04-23 (aram356) reviews is cleanly addressed in 00ce787 + 5b3481a, with focused regression tests for each fix and the plugin test job now wired into PR CI. CI is green (14 checks). I verified all 16 plugin tests pass locally and prettier is clean on the new docs files.

Two issues warrant changes before merge:

data: / blob: script URLs produce garbage TOML entries that look like real [[js_assets]] blocks but reference unfetchable origins. Reproduced both cases through processAssets(). Inline below.
Spec example for Lockr now contradicts the implementation — the doc shows enabled = true for an integration that the (corrected) generator now emits as enabled = false. Inline below.

Plus two non-blocking suggestions: replace the hand-rolled TOML scan in readPublisherDomain with a real parser (carried over; the new test now locks in the strict-double-quote-only limitation), and tighten the loose pathname.includes("/apstag") APS pattern the same way GTM was tightened in 66951b9.

Blocking

🔧 wrench

data: / blob: URLs produce garbage TOML (packages/js-asset-auditor/lib/process.mjs:321, inline)
Spec example contradicts implementation (docs/superpowers/specs/2026-04-01-js-asset-auditor-design.md:284, inline)

Non-blocking

♻️ refactor

readPublisherDomain hand-rolled TOML scan (packages/js-asset-auditor/lib/audit.mjs:39-63, inline)

🤔 thinking

Integration detection runs against unfiltered URLs — lib/audit.mjs:305 calls detectIntegrations(scriptUrls, …) on the raw network capture, not the post-first-party set. If a publisher self-hosts their prebid bundle at cdn.publisher.com/prebid.js, a [integrations.prebid] block lands in the generated config. Not strictly wrong (publisher does run prebid), but it bypasses the first-party boundary the asset pipeline carefully enforces. Consider passing the third-party-filtered URL set instead.

⛏ nitpick

APS pattern is loose (packages/js-asset-auditor/lib/detect.mjs:170, inline)
--first-party accepts unvalidated values (packages/js-asset-auditor/lib/audit.mjs:163) — splits on , with no normalization, so --first-party https://www.example.com silently no-ops because stripWww doesn't strip the protocol. Either reject non-hostname values or normalize via new URL(value).hostname.
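
A possible shape for the `--first-party` normalization, as a sketch only. `normalizeFirstPartyHosts` is a hypothetical name, and the `www.` stripping is an assumption about what stripWww does.

```javascript
// Hypothetical normalizer for --first-party values; not the plugin's actual code.
function normalizeFirstPartyHosts(raw) {
  return raw
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0)
    .map((entry) => {
      // Accept bare hostnames by giving them a scheme before parsing.
      const candidate = entry.includes("://") ? entry : `https://${entry}`;
      let hostname;
      try {
        hostname = new URL(candidate).hostname;
      } catch {
        throw new Error(`--first-party: "${entry}" is not a valid host`);
      }
      // Assumed to mirror stripWww: drop a leading "www." label.
      return hostname.replace(/^www\./, "");
    });
}
```

With this, `--first-party https://www.example.com` and `--first-party example.com` resolve to the same host instead of one of them silently doing nothing.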

CI Status

cargo fmt: PASS
cargo clippy / Analyze (rust): PASS
cargo test: PASS
vitest (crates/js/lib): PASS
format-typescript: PASS
format-docs: PASS
browser integration tests: PASS
CodeQL: PASS
js-asset-auditor tests: PASS (newly wired into CI in this PR)

aram356 · 2026-04-29T23:16:17Z

+  const assets = [];
+  const seenOrigins = new Set();
+
+  for (const url of survivingUrls) {


🔧 wrench — data: and blob: script URLs produce garbage TOML entries.

Reproduced locally by feeding the URLs through processAssets():

data:text/javascript,console.log(1) → origin_url = "nulltext/javascript,console.log(1)"
blob:https://example.com/abc-123 → origin_url = "https://example.comhttps://example.com/abc-123"

new URL("data:…").origin === "null" and applyWildcards concatenates that with pathname, producing entries the proxy can never fetch. The summary still reports them as surfaced assets, so an operator could commit them without noticing.

Fix: skip non-http(s) URLs early in the loop:

for (const url of survivingUrls) {
  let parsed;
  try {
    parsed = new URL(url);
  } catch {
    continue;
  }
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    continue;
  }
  // …existing applyWildcards / slug logic
}

A fixture test in test/process.test.mjs covering data: and blob: inputs would lock the contract.
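
One way such a fixture could look, sketched as a standalone predicate rather than the real processAssets() path. `isFetchableScriptUrl` is a hypothetical name that mirrors the protocol guard suggested above.

```javascript
// Hypothetical predicate mirroring the suggested protocol guard.
function isFetchableScriptUrl(url) {
  let parsed;
  try {
    parsed = new URL(url);
  } catch {
    return false; // unparseable input is skipped, like the `continue` above
  }
  return parsed.protocol === "http:" || parsed.protocol === "https:";
}

// Fixture inputs taken from the reproduction, plus one unparseable string.
const sweptUrls = [
  "https://cdn.example.com/lib.js",
  "data:text/javascript,console.log(1)",
  "blob:https://example.com/abc-123",
  "not a url",
];
const surviving = sweptUrls.filter(isFetchableScriptUrl);
```

Only the `https:` entry should remain in `surviving`; a real test would assert the same thing on the assets returned by processAssets().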

aram356 · 2026-04-29T23:16:17Z

+container_id = "GTM-TRCJMD6"  # auto-detected
+
+[integrations.lockr]
+enabled = true


🔧 wrench — Spec example contradicts the implementation after the partial/detect_only fix.

The spec shows:

[integrations.lockr]
enabled = true
sdk_url = "https://aim.loc.kr/identity-lockr-trust-server.js"  # auto-detected
# app_id = ""  # TODO: set your Lockr Identity app_id

But Lockr's todos always include app_id (lib/detect.mjs:104), so isIntegrationConfigComplete returns false and generateConfig correctly emits enabled = false. I confirmed by running the generator against https://aim.loc.kr/identity-lockr-trust-server.js:

[integrations.lockr]
enabled = false
sdk_url = "https://aim.loc.kr/identity-lockr-trust-server.js"  # auto-detected
# app_id = ""  # TODO: set your Lockr Identity app_id

Flip the spec example to enabled = false so the documented contract matches what the tool emits — otherwise future readers will treat the spec as the source of truth and re-introduce the prior bug.
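
The contract in question (an integration with outstanding TODO fields is emitted disabled) can be sketched like this. The data shapes are assumptions: `fields` and `todos` are hypothetical property names, `renderIntegration` is invented for illustration, and this `isIntegrationConfigComplete` is a guess at the behavior, not the implementation in detect.mjs.

```javascript
// Hypothetical shapes: `fields` holds auto-detected values and `todos` lists
// field names that still need manual input. The real detect.mjs may differ.
function isIntegrationConfigComplete(integration) {
  return integration.todos.length === 0;
}

function renderIntegration(name, integration) {
  const lines = [`[integrations.${name}]`];
  // Only enable an integration when nothing is left for the operator to fill in.
  lines.push(`enabled = ${isIntegrationConfigComplete(integration)}`);
  for (const [key, value] of Object.entries(integration.fields)) {
    lines.push(`${key} = ${JSON.stringify(value)}  # auto-detected`);
  }
  for (const todo of integration.todos) {
    lines.push(`# ${todo} = ""  # TODO: set ${todo}`);
  }
  return lines.join("\n");
}
```

Under these assumptions, a Lockr entry with `todos: ["app_id"]` renders with `enabled = false`, while a GTM entry with a detected container_id and no todos renders with `enabled = true`.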

aram356 · 2026-04-29T23:16:17Z

+    }
+    if (inPublisher) {
+      const match = line.match(/^domain\s*=\s*"([^"]+)"/);
+      if (match) return match[1];


♻️ refactor — readPublisherDomain hand-rolls a strict TOML scan that rejects valid TOML.

The regex ^domain\s*=\s*"([^"]+)" only matches double-quoted single-line strings. Real trusted-server.toml files using domain = 'site.com', multi-line literals, or values containing escaped " will fail with the misleading "Could not find [publisher].domain in trusted-server.toml" — and the new test at test/audit.test.mjs:84 actively locks this strict behavior in.

Fix: use a TOML parser. smol-toml is ~5 KB and dependency-free; @iarna/toml is the broader-compat option. Either lets you replace the whole 25-line scan with:

import { parse } from "smol-toml";

export function readPublisherDomain(repoRoot, configPath = "trusted-server.toml") {
  const resolvedPath = resolve(repoRoot, configPath);
  const parsed = parse(readFileSync(resolvedPath, "utf-8"));
  const domain = parsed.publisher?.domain;
  if (typeof domain !== "string" || domain.length === 0) {
    throw new Error(`Could not find [publisher].domain in ${configPath}`);
  }
  return domain;
}

Then update the malformed-content test to feed truly malformed TOML rather than valid single-quoted TOML. (Carried over from the 2026-04-20 review; still present.)

aram356 · 2026-04-29T23:16:17Z

+    label: "Amazon Publisher Services",
+    match: (url) =>
+      url.hostname === "c.amazon-adsystem.com" &&
+      url.pathname.includes("/apstag"),


⛏ nitpick — APS pattern uses pathname.includes("/apstag"), the same loose match shape that was tightened for GTM in 66951b9.

The spec at line 253 says c.amazon-adsystem.com/aax2/apstag*. Today the matcher would also fire on /foo/apstag/bar, /apstag.js.bak, etc. Anchor it:

match: (url) =>
  url.hostname === "c.amazon-adsystem.com" &&
  url.pathname.startsWith("/aax2/apstag"),

or pathname.endsWith("/apstag.js") if you want to allow other path prefixes.
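
A small comparison of the loose and anchored matchers on the paths mentioned above. The sample URLs are illustrative; `/aax2/apstag.js` is assumed from the spec pattern c.amazon-adsystem.com/aax2/apstag*.

```javascript
// Current (loose) and proposed (anchored) matchers, copied from the comment above.
const looseMatch = (url) =>
  url.hostname === "c.amazon-adsystem.com" &&
  url.pathname.includes("/apstag");
const anchoredMatch = (url) =>
  url.hostname === "c.amazon-adsystem.com" &&
  url.pathname.startsWith("/aax2/apstag");

// Illustrative sample URLs; only the first is an expected APS tag path.
const samples = [
  "https://c.amazon-adsystem.com/aax2/apstag.js",
  "https://c.amazon-adsystem.com/foo/apstag/bar",
  "https://c.amazon-adsystem.com/apstag.js.bak",
];

const results = samples.map((sample) => {
  const url = new URL(sample);
  return {
    path: url.pathname,
    loose: looseMatch(url),
    anchored: anchoredMatch(url),
  };
});
```

The loose matcher accepts all three paths, while the anchored one accepts only the first.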

jevansnyc and others added 12 commits April 1, 2026 13:17

Add JS Asset Auditor engineering spec

89fab0b

Add plugin marketplace index for js-asset-auditor

36535a5

Make publisher domain optional in JS Asset Auditor CLI

bbf3b1b

Default to headed browser to avoid bot detection

e0c7e0c

aram356 assigned ChristianPavilonis Apr 14, 2026

ChristianPavilonis requested review from aram356 and prk-Jr April 14, 2026 21:06

aram356 requested changes Apr 20, 2026

Comment thread packages/js-asset-auditor/lib/audit.mjs Outdated

Fix JS asset auditor review feedback

66951b9

ChristianPavilonis requested a review from aram356 April 21, 2026 13:40

prk-Jr requested changes Apr 22, 2026

Comment thread packages/js-asset-auditor/lib/detect.mjs Outdated

Comment thread packages/js-asset-auditor/lib/process.mjs

Comment thread packages/js-asset-auditor/lib/audit.mjs Outdated

aram356 requested changes Apr 23, 2026

Comment thread packages/js-asset-auditor/lib/detect.mjs Outdated

Comment thread packages/js-asset-auditor/lib/process.mjs

Comment thread packages/js-asset-auditor/lib/audit.mjs Outdated

Comment thread packages/js-asset-auditor/lib/audit.mjs Outdated

aram356 requested changes Apr 23, 2026

Fix JS Asset Auditor review blockers

00ce787

ChristianPavilonis requested review from aram356 and prk-Jr April 24, 2026 16:15

audit detects more prebid script patterns, and detects bidders

5b3481a

aram356 requested changes Apr 29, 2026

aram356 mentioned this pull request Apr 30, 2026

Add Trusted Server CLI technical spec #655

Open

14 tasks
