Add UDS/RPC bridge benchmark suite with standalone app #55
gmaclennan wants to merge 36 commits into main
Plans an opt-in, host-app-driven Sentry integration covering:
- error capture across backend (Node), JS/RN, and native layers
- RPC tracing via @comapeo/ipc onRequestHook (mirrors comapeo-mobile)
- forwarding @comapeo/core OpenTelemetry spans (PR digidem/comapeo-core#1051)
- app-specific gating so non-CoMapeo consumers ship no Sentry traffic
https://claude.ai/code/session_01EcVXzczA1TVkhEkgUg9DKX
Closes the FGS-cold-start gap where the prior draft required RN to be alive before backend Sentry could initialize:
- §4 reworked: Expo config plugin writes DSN/environment/release into Android manifest meta-data and iOS Info.plist at prebuild time. Native reads those at process start, no JS round-trip, before booting @sentry/node and @sentry/android.
- §7.4 added: native telemetry data design mapped onto Sentry primitives (breadcrumbs for state transitions, transaction + spans for boot/shutdown phases, captureMessage for timeouts, tags/contexts for cross-process attribution). Categorizes captures as essential vs opt-in and documents a hard never-capture list for PII.
- §9 added: persisted "capture application data" toggle with restart-to-activate semantics. Snapshot read at boot, embedded in the init frame; gates per-RPC spans, sync-session transactions, memory checkpoints, and storage-size sampling. Never unlocks the never-capture list.
- §10 phasing and §13 file-change list updated. New open questions added for release tagging, plugin no-op behavior, toggle UI, and boot sample rate.
https://claude.ai/code/session_01EcVXzczA1TVkhEkgUg9DKX
Adds a stripped bench backend (`backend/index.bench.js` + bench RPC server with echo / payload methods) and a sibling `apps/benchmark/` app that drives it through the same RN→native→Node UDS path as production, isolating the framing / IPC / RPC bridge from @comapeo/core init noise.
Consumer isolation is enforced three ways:
- the bench bundle lands at sibling paths (`android/src/bench/assets/`, `ios/nodejs-project-bench/`) the production flavor / podspec don't reference;
- a new Android `bench` productFlavor + iOS `ENV['COMAPEO_BENCH']` podspec toggle is opt-in only;
- `package.json` files array negates both bench paths so they cannot leak via `npm publish`.
`apps/benchmark/` does not check in `android/` or `ios/`; the new `with-comapeo-bench` Expo config plugin re-applies the variant / env-var / Xcode rename build phase wiring on every `expo prebuild`.
Standalone-runnable: NDJSON sink + on-screen p50/p95/p99 work without any host-side infrastructure. Optional HTTP toggle posts spans to the bundled `bench-receiver.ts` for orchestrated BrowserStack runs. Maestro flows (bench-rpc + per-payload-size variants) drive the bench end-to-end.
See `docs/uds-rpc-bridge-benchmark-plan.md` for the full design.
https://claude.ai/code/session_01SC1Sc9AvULHQkQSoQ2SMzJ
Three fixes surfaced when running the bench app end-to-end on a
Pixel 7a API 29 emulator:
- **Replace Android productFlavor with a project property.** The
`bench` / `production` flavor dimension on the lib triggered AGP /
Gradle 9 strict variant ambiguity in consuming Expo apps that don't
declare matching flavors of their own (apps/expo#18315 etc.):
`missingDimensionStrategy` + `matchingFallbacks` weren't enough to
disambiguate `benchDebugApiElements` vs. `productionDebugApiElements`.
The lib now reads `rootProject.findProperty('comapeoBench')` and
swaps `assets.srcDirs` with `=` (assignment, not `srcDirs '<...>'`
which AGP treats as additive). Also empties `src/debug/assets` when
bench is active so the production debug bundle doesn't overlay
bench in debug builds. The `with-comapeo-bench` config plugin
switches from `withAppBuildGradle` to `withGradleProperties` and
writes `comapeoBench=true` into the consuming app's
`android/gradle.properties`.
- **Pin Expo modules to SDK 55.** `expo-file-system@19.0.18` and
`expo-sharing@14.0.7` (the latest npm versions) are SDK-incompatible
with Expo 55 and crashed the JS app at launch with a
`NoClassDefFoundError: FilePermissionModuleInterface` autolinking
failure. `npx expo install` resolves them to `~55.0.17` /
`~55.0.18` which match the rest of the SDK.
- **Add `bench-rpc-ios.yaml` Maestro flow.** The Android flow's
`clearState: true` triggers a deep-link confirmation dialog on iOS
that blocks the rest of the run. The iOS flow drops `clearState`
and dismisses the dialog with a guarded `runFlow.when` block.
Validation results on Pixel 7a API 29 emulator (debug build, RN-thread
RTT in ms, 100 iterations after 10-iteration warmup):
| size | n | p50 | p95 | p99 |
|---|---|---|---|---|
| 64B | 100 | 1.65 | 2.56 | 7.34 |
| 1KB | 100 | 1.68 | 2.76 | 4.45 |
| 64KB | 100 | 2.48 | 4.70 | 6.29 |
iOS run blocked by a pre-existing lifecycle issue
(`AppLifecycleDelegate.applicationDidBecomeActive` doesn't fire under
scene-based app lifecycle, so `NodeJSService.start()` is never called)
— same code path the example app uses, so this is not a bench
regression. Tracked separately.
https://claude.ai/code/session_01SC1Sc9AvULHQkQSoQ2SMzJ
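The Gradle-property switch described in this commit can be illustrated with a small sketch. It is not the plugin's actual code: `setGradleProperty` is a hypothetical helper, and the `PropertiesItem` shape is an assumption modelled on the entries a `withGradleProperties` mod exposes for `android/gradle.properties`.

```typescript
// Assumed shape of one parsed line of android/gradle.properties
// (modelled on what a Gradle-properties config-plugin mod hands back).
type PropertiesItem =
  | { type: "comment"; value: string }
  | { type: "empty" }
  | { type: "property"; key: string; value: string };

// Hypothetical helper: write `key=value`, dropping any earlier entry for the
// same key so repeated `expo prebuild` runs do not accumulate duplicates.
function setGradleProperty(
  items: PropertiesItem[],
  key: string,
  value: string,
): PropertiesItem[] {
  const next = items.filter(
    (item) => !(item.type === "property" && item.key === key),
  );
  next.push({ type: "property", key, value });
  return next;
}

// Example: opt the consuming app into the bench bundle.
const exampleProperties: PropertiesItem[] = setGradleProperty(
  [{ type: "comment", value: "Project-wide Gradle settings." }],
  "comapeoBench",
  "true",
);
```

Inside a real config plugin the same helper would be applied to the mod results and the updated list returned, which is what keeps the wiring reproducible on every prebuild.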
Pull request overview
Adds an isolated UDS/RPC bridge benchmarking setup (standalone Expo app + minimal bench backend bundle) plus host-side span collection, with build-time gating to prevent benchmark artefacts from leaking into normal consumer apps/packages.
Changes:
- Exposes `benchMessagePort` from `@comapeo/core-react-native` and adds a bench-only backend entrypoint (`backend/index.bench.js`) with minimal RPC + span instrumentation helpers.
- Introduces a standalone Expo benchmark app (`apps/benchmark/`) and Maestro flows to automate benchmark runs across payload sizes.
- Adds build/packaging plumbing for a separate bench bundle output tree and a host-side HTTP receiver to collate NDJSON + CSV summaries.
Reviewed changes
Copilot reviewed 28 out of 34 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| src/index.ts | Re-exports benchMessagePort from the module surface. |
| src/ComapeoCoreModule.ts | Exposes the raw CoreMessagePort singleton as benchMessagePort. |
| scripts/lib/bench-receiver.ts | Adds a localhost HTTP receiver that persists spans and rewrites a CSV summary. |
| scripts/build-backend.ts | Adds --bench mode to build only the bench JS bundle into bench-specific output paths. |
| package.json | Updates files allowlist to exclude bench output paths from publishing. |
| ios/ComapeoCore.podspec | Adds ENV['COMAPEO_BENCH'] conditional resource selection for bench bundle. |
| e2e/.maestro/bench-rpc.yaml | Maestro flow for the default benchmark sweep on Android. |
| e2e/.maestro/bench-rpc-ios.yaml | iOS-specific Maestro flow variant (handles the “Open” dialog and avoids clearState). |
| e2e/.maestro/bench-payload-64KB.yaml | Maestro flow for a 64KB-only payload run. |
| e2e/.maestro/bench-payload-64B.yaml | Maestro flow for a 64B-only payload run. |
| e2e/.maestro/bench-payload-1MB.yaml | Maestro flow for a 1MB-only payload run. |
| e2e/.maestro/bench-payload-1KB.yaml | Maestro flow for a 1KB-only payload run. |
| docs/uds-rpc-bridge-benchmark-plan.md | Adds a design/verification plan for the benchmark suite and consumer isolation. |
| backend/rollup.config.ts | Adds BENCH=1 rollup mode and trims static assets for the bench bundle. |
| backend/lib/telemetry-sink.js | Adds pluggable telemetry sinks and span helpers (startSpan). |
| backend/lib/boot-spans.js | Adds startBootSpan helper with a fixed boot-phase taxonomy. |
| backend/lib/bench-rpc.js | Adds a minimal bench RPC server (echo/payload) with payload caching and span emission. |
| backend/index.bench.js | Adds bench-only node entrypoint reusing the lifecycle framing but skipping @comapeo/core. |
| apps/benchmark/tsconfig.json | Bench app TS config + local path mapping to the working tree module source. |
| apps/benchmark/plugins/with-comapeo-bench/index.js | Expo config plugin to opt an app into bench resources (Gradle property + Podfile env + Xcode rename script). |
| apps/benchmark/package.json | Benchmark app package manifest + dependencies and run scripts. |
| apps/benchmark/metro.config.js | Metro config (mirrors example) for monorepo-style dev and avoiding duplicate peers. |
| apps/benchmark/index.ts | Bench app entrypoint registering root component. |
| apps/benchmark/babel.config.js | Bench app Babel config. |
| apps/benchmark/assets/splash-icon.png | Bench app splash asset. |
| apps/benchmark/assets/icon.png | Bench app icon asset. |
| apps/benchmark/assets/favicon.png | Bench app favicon asset. |
| apps/benchmark/assets/adaptive-icon.png | Bench app adaptive icon asset. |
| apps/benchmark/app.json | Bench app Expo config + plugin wiring. |
| apps/benchmark/App.tsx | Bench UI + RPC client + NDJSON writing + optional POST-to-receiver flow. |
| apps/benchmark/.gitignore | Ignores generated native folders and local Expo/Metro artifacts for the bench app. |
| android/build.gradle | Adds comapeoBench property gate to swap module asset source dirs for bench bundle selection. |
| .gitignore | Ignores bench bundle output dirs alongside the existing production bundle outputs. |
| } from "@comapeo/core-react-native"; | ||
| import { Directory, File, Paths } from "expo-file-system"; | ||
| import * as Sharing from "expo-sharing"; | ||
| import React, { useCallback, useEffect, useMemo, useRef, useState } from "react"; |
useMemo is imported but never used in this file. This will trip lint/TS unused import checks; please remove it or use it.
| import React, { useCallback, useEffect, useMemo, useRef, useState } from "react"; | |
| import React, { useCallback, useEffect, useRef, useState } from "react"; |
| // Linear interpolation between closest ranks. For our sample sizes | ||
| // (~100), `Math.floor((n-1) * p)` is good enough and avoids the | ||
| // off-by-one trap of `Math.floor(n * p)` (which would index past the | ||
| // end at p=1). | ||
| const idx = Math.floor((sortedAsc.length - 1) * p); | ||
| return sortedAsc[idx]!; |
percentile() claims linear interpolation, but the implementation returns a nearest-rank element (Math.floor((n-1)*p)). Either implement the stated interpolation (so p95/p99 match the documented method) or adjust the comment/docs to match the actual behavior.
| // Linear interpolation between closest ranks. For our sample sizes | |
| // (~100), `Math.floor((n-1) * p)` is good enough and avoids the | |
| // off-by-one trap of `Math.floor(n * p)` (which would index past the | |
| // end at p=1). | |
| const idx = Math.floor((sortedAsc.length - 1) * p); | |
| return sortedAsc[idx]!; | |
| // Linear interpolation between closest ranks. | |
| const position = (sortedAsc.length - 1) * p; | |
| const lowerIdx = Math.floor(position); | |
| const upperIdx = Math.ceil(position); | |
| if (lowerIdx === upperIdx) return sortedAsc[lowerIdx]!; | |
| const lower = sortedAsc[lowerIdx]!; | |
| const upper = sortedAsc[upperIdx]!; | |
| const weight = position - lowerIdx; | |
| return lower + (upper - lower) * weight; |
| function percentile(sortedAsc: number[], p: number): number { | ||
| if (sortedAsc.length === 0) return Number.NaN; | ||
| return sortedAsc[Math.floor((sortedAsc.length - 1) * p)]!; | ||
| } |
percentile() currently uses a nearest-rank lookup (Math.floor((n-1)*p)). The PR description/plan mention linear interpolation for p50/p95/p99; if that's the intended definition, this summary CSV will not match it. Either implement the intended interpolation here or document that the receiver uses nearest-rank percentiles.
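To make the gap between the two definitions concrete, the sketch below puts a nearest-rank lookup next to a linear-interpolation variant. The function names and the sample data are illustrative only and are not taken from the PR.

```typescript
// Nearest-rank style lookup: always returns an element of the input.
function percentileNearestRank(sortedAsc: number[], p: number): number {
  if (sortedAsc.length === 0) return Number.NaN;
  return sortedAsc[Math.floor((sortedAsc.length - 1) * p)]!;
}

// Linear interpolation between the two closest ranks.
function percentileLinear(sortedAsc: number[], p: number): number {
  if (sortedAsc.length === 0) return Number.NaN;
  const position = (sortedAsc.length - 1) * p;
  const lowerIdx = Math.floor(position);
  const upperIdx = Math.ceil(position);
  const lower = sortedAsc[lowerIdx]!;
  const upper = sortedAsc[upperIdx]!;
  // When position is an integer, lowerIdx === upperIdx and the weight is 0.
  return lower + (upper - lower) * (position - lowerIdx);
}

// Illustrative data: five sorted round-trip times in ms.
const rttSamples = [10, 20, 30, 40, 50];
const p95Nearest = percentileNearestRank(rttSamples, 0.95); // picks index 3
const p95Linear = percentileLinear(rttSamples, 0.95); // lands between index 3 and 4
```

With only five samples the two methods disagree noticeably at p95 (an element of the input versus a value four fifths of the way to the maximum); at around 100 samples the difference shrinks, but the app and the receiver should still use the same definition so their numbers agree.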
| - Android: the plugin uses `withAppBuildGradle` to append | ||
| `flavorDimensions += "comapeo"` and | ||
| `missingDimensionStrategy 'comapeo', 'bench'` to the bench app's | ||
| `android/app/build.gradle` `defaultConfig`. The module's own | ||
| `android/build.gradle` declares the `bench` flavor + sourceSet; | ||
| consumers that don't activate it (`apps/example/`, third-party | ||
| apps) get the default flavor and never see `src/bench/`. |
This doc describes Android consumer isolation in terms of a bench productFlavor and missingDimensionStrategy, but the implementation in this PR uses a comapeoBench=true Gradle property (set via withGradleProperties) to swap assets.srcDirs instead. Please update this section so the plan matches what actually shipped (otherwise it's misleading for anyone following it).
| * stripped `backend/index.bench.js` (via the `bench` Android | ||
| * productFlavor / `ENV['COMAPEO_BENCH']` iOS opt-in) — so timings | ||
| * isolate the framing / IPC / JSON-RPC bridge from `@comapeo/core` init | ||
| * noise. See `docs/uds-rpc-bridge-benchmark-plan.md`. |
The header comment says the bench backend is selected via an Android bench productFlavor, but the actual wiring in this PR uses the comapeoBench=true Gradle property (see android/build.gradle). Please update the comment so it matches the implementation.
| * stripped `backend/index.bench.js` (via the `bench` Android | |
| * productFlavor / `ENV['COMAPEO_BENCH']` iOS opt-in) — so timings | |
| * isolate the framing / IPC / JSON-RPC bridge from `@comapeo/core` init | |
| * noise. See `docs/uds-rpc-bridge-benchmark-plan.md`. | |
| * stripped `backend/index.bench.js` (via the `comapeoBench=true` | |
| * Gradle property on Android / `ENV['COMAPEO_BENCH']` iOS opt-in) — | |
| * so timings isolate the framing / IPC / JSON-RPC bridge from | |
| * `@comapeo/core` init noise. See | |
| * `docs/uds-rpc-bridge-benchmark-plan.md`. |
| * noise. See `docs/uds-rpc-bridge-benchmark-plan.md`. | ||
| * | ||
| * UI surface: | ||
| * - boot status (state observer): waits for "READY" before enabling |
The UI comment says the app waits for "READY" before enabling the run button, but the code gates on serviceState === "STARTED" (and ComapeoState doesn't include "READY"). Please update the comment to avoid confusion about which state is required for RPC.
| * - boot status (state observer): waits for "READY" before enabling | |
| * - boot status (state observer): waits for "STARTED" before enabling |
| // Bench bundle output. Lives under `src/bench/assets/` so AGP's | ||
| // per-flavor sourceSet merging picks it up only when the consuming app | ||
| // has activated the `bench` productFlavor (see android/build.gradle — | ||
| // `apps/benchmark/` activates this; `apps/example/` does not). |
This comment says the bench Android assets are picked up via a bench productFlavor/sourceSet merge, but the module now switches assets via the comapeoBench Gradle property (see android/build.gradle). Please update the comment so it matches the current mechanism.
| * - `android/src/bench/assets/nodejs-project/` (overlaid by the | ||
| * `bench` Android productFlavor — see android/build.gradle) |
The bench-mode comment refers to the Android bench productFlavor for asset overlay, but Android selection is now controlled by the comapeoBench Gradle property (not flavors). Please update the comment to avoid sending readers to a mechanism that no longer exists.
| * - `android/src/bench/assets/nodejs-project/` (overlaid by the | |
| * `bench` Android productFlavor — see android/build.gradle) | |
| * - `android/src/bench/assets/nodejs-project/` (selected by the | |
| * Android build when the `comapeoBench` Gradle property is enabled; | |
| * this is no longer controlled by an Android productFlavor) |
BLOCKER (iOS rename ordering): the previous design added an Xcode Run Script build phase via the config plugin's `withXcodeProject`, but CocoaPods 1.x doesn't reliably position user script phases after `[CP] Copy Pods Resources`. The rename ran before the bench files were on disk and silently no-op'd, leaving bench builds with no `<App>.app/nodejs-project/` and a non-bootable runtime. Switch to pod-install-time staging in `ComapeoCore.podspec`: when COMAPEO_BENCH=1 the podspec stages a copy of `nodejs-project-bench/` to `.bench-staging/nodejs-project/` and adds it to `s.resources` ALONGSIDE the production `nodejs-project/`. CocoaPods rsyncs both into `<App>.app/nodejs-project/` in declaration order, with the bench overlay landing on top: no script phase, no ordering footgun.
MAJOR (iOS resource fallback): previous design REPLACED `nodejs-project` with `nodejs-project-bench`, so any rename failure left the app non-bootable. New shape ships both: bench overlays prod, but if the bench bundle is missing (forgot to run `--bench`) the prod bundle remains as fallback.
MAJOR (shutdown race): an in-flight `SocketMessagePort.postMessage` landing in streamx's deferred microtask after the AF_UNIX socket has been ended raises `ERR_STREAM_WRITE_AFTER_END` past every listener. The race is benign (the message was already destined for a torn-down peer). Add a state-check + underlying-socket error listener in `message-port.js`, and a targeted `uncaughtException` / `unhandledRejection` filter in `index.bench.js` that swallows the specific code while a graceful shutdown is in progress. Smoke test now exits 0 with all spans + responses recorded; previous run hit `fatal during runtime` and exit 1.
Copilot review feedback addressed:
- App.tsx: drop unused `useMemo`; replace nearest-rank percentile with linear-interpolation (matches PR description); add 30s per-request timeout + pending-map cleanup so a lost frame doesn't hang the run; update stale "READY" comment to "STARTED".
- bench-receiver.ts: same linear-interpolation fix so on-device and host-side numbers agree.
- Stale productFlavor / withXcodeProject references in App.tsx, scripts/build-backend.ts, backend/rollup.config.ts, and the plan doc updated to describe the actual `comapeoBench` Gradle property + podspec staging mechanism.
https://claude.ai/code/session_01SC1Sc9AvULHQkQSoQ2SMzJ
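The targeted shutdown filter from the "shutdown race" item could look roughly like the sketch below. It assumes the error carries a Node-style `code` string; `isBenignShutdownError` and the `isShuttingDown` flag are hypothetical names, not the ones used in `index.bench.js`.

```typescript
// Hypothetical predicate: true only for the one error code the commit calls
// benign, and only while a graceful shutdown is already in progress.
function isBenignShutdownError(err: unknown, isShuttingDown: boolean): boolean {
  if (!isShuttingDown) return false;
  if (typeof err !== "object" || err === null) return false;
  const code = (err as { code?: unknown }).code;
  return code === "ERR_STREAM_WRITE_AFTER_END";
}

// Illustrative inputs: the late write is ignored during shutdown, while any
// other error, or the same error outside shutdown, stays fatal.
const lateWrite = Object.assign(new Error("write after end"), {
  code: "ERR_STREAM_WRITE_AFTER_END",
});
const swallowedDuringShutdown = isBenignShutdownError(lateWrite, true);
const fatalWhileRunning = isBenignShutdownError(lateWrite, false);
```

A process-level `uncaughtException` or `unhandledRejection` handler would call the predicate first and fall through to the normal fatal path whenever it returns `false`, which keeps the filter from hiding unrelated failures.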
…rpc-bridge-1Zahz * origin/main: fix(android): fold waitForFile into connect retry loop (#52)
47024a2 to
5acd807
Adds a generic config knob for consumers that ship their own backend JS bundle: `comapeoBackendDir` Gradle property → BuildConfig field on Android, `ComapeoBackendDir` Info.plist key on iOS. Default is `nodejs-project` so behavior is unchanged for current consumers. This unblocks moving bench-specific wiring out of the module: the bench app can now ship its bundle in a sibling directory and just flip this override, instead of relying on an in-module `comapeoBench=true` toggle that swaps Android sourceSets and runs an iOS pod-install staging copy. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Moves all bench-only backend source (`index.bench.js`, `bench-rpc.js`, `boot-spans.js`, `telemetry-sink.js`) and its rollup config out of the production module and into `apps/benchmark/backend/`. The bench bundle is built from there with its own simplified rollup config: one ESM output, no per-platform split, no native-addon banner (the bench code imports no addons). Shared framing helpers (server-helper.js, simple-rpc.js, message-port.js) stay in the module's `backend/lib/` and are path-imported from the bench source so wire framing stays bit-identical to production.
Rewrites `with-comapeo-bench` plugin against the new `comapeoBackendDir` override hook: drops `comapeoBench=true` Gradle toggle, drops `ENV['COMAPEO_BENCH']` Podfile mutation, drops the iOS `.bench-staging` rsync trick. Now sets the override property/Info.plist key and copies the bench bundle into the consumer app's own native asset/resource trees (Android assets dir + iOS folder reference). Same shape as `expo-asset`'s plugin, minus its file-extension allowlist and flat-structure constraints which don't fit a JS bundle.
Strips `BENCH=1` mode from the module's rollup.config.ts and `--bench` mode from scripts/build-backend.ts. Dead bench wiring still in the module (`android/src/bench/`, `ios/nodejs-project-bench/`, podspec env branch) is removed in the next commit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
With the bench app moved to apps/benchmark/ and using the new comapeoBackendDir override hook, the production module no longer needs:
- comapeoBench Gradle property + conditional sourceSet swap in android/build.gradle (sourceSets revert to AGP defaults)
- ENV['COMAPEO_BENCH'] branch + .bench-staging rsync in ios/ComapeoCore.podspec (s.resources is just ['nodejs-project'])
- !android/src/bench/ and !ios/nodejs-project-bench/ exclusions in package.json files (those dirs no longer exist in the module)
- Bench-specific .gitignore entries
Also removes the (build-artifact, gitignored) android/src/bench/ and ios/nodejs-project-bench/ directories, and updates two stale comments in retained source files plus a header note in the planning doc pointing at the v2 implementation in apps/benchmark/.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The export was always misnamed: it isn't a benchmark-specific API, it's the raw `MessagePort`-shaped escape hatch one level below the `comapeo` client. Anything paired with a custom backend bundle (the bench app being the canonical example) goes through this port. `unstable_` matches React's `unstable_batchedUpdates` / `unstable_setExceptionDecorator` convention — signals "may change without notice" without burning the API on a name like `INTERNAL_messagePort` that implies stronger guarantees about internal-only access. Lowercase because it's an instance, not a class. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The earlier-in-branch edits parameterized `copyStaticAssetsPlugin` and renamed `sharedInput → prodInput` to support a `BENCH=1` mode that was since deleted. With the bench bundle owning its own rollup config in apps/benchmark/, none of those scaffolding changes are needed — restoring the file to main reduces the diff and keeps the production config minimal. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Sentry plan was committed here only because this branch was originally cut off the sentry-plan tip (fd33ffc) so the bench design could reference it during planning. Now that the bench refactor is self-contained, the doc shouldn't ship via this PR — it'll land on main from the dedicated Sentry branch instead. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`pbxProject.addResourceFile` unconditionally calls
`correctForResourcesPath`, which dereferences `pbxGroupByName('Resources')`
without a null check. Default Expo prebuild output for an Expo SDK 55
app has no top-level `Resources` group, so the call crashed with
`Cannot read properties of null (reading 'path')`.
Fix: call `IOSConfig.XcodeUtils.ensureGroupRecursively(project, 'Resources')`
before `addResourceFile`. The group itself has no `.path`, so the
prefix-strip in `correctForResourcesPath` is a no-op, and
`addToResourcesPbxGroup` correctly attaches the file ref under it.
Verified end-to-end on iPhone 16 sim (iOS 26.2) and Pixel 7a API 29
emulator: bench app reaches STARTED state, runs the bench RPC, and
renders 100-sample 64B p50/p95/p99 results.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The UDS / RPC bridge benchmark plan is now implemented as Phase 3 shipped, and the doc itself describes an earlier iteration (the `comapeoBench=true` toggle and `ENV['COMAPEO_BENCH']` Podfile mutation) that has since been refactored into the generic `comapeoBackendDir` override. Keeping it would only mislead. Refreshes App.tsx's header comment to reflect the current wiring and points at the new `apps/benchmark/README.md` (added in the next commit) instead of the deleted doc. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the deleted plan doc with a focused per-app README covering what the bench measures, how the override hook + plugin + bundle wiring works end-to-end, run instructions for sims/emulators + Maestro flows, and the sink/receiver model. Phase 4/5 status sections leave hooks for the upcoming BrowserStack work. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the Phase 4 plumbing for orchestrated BrowserStack runs:
- `scripts/bench-receiver.ts` — minimal localhost HTTP server (no
deps, pure node:http). POST /spans appends each span to
apps/benchmark/results/<runId>.ndjson; runId is path-traversal-
guarded against the regex App.tsx generates. GET /health for
tunnel verification.
- `scripts/run-on-browserstack.ts` — uploads APK / IPA via the
Maestro v2 App Automate REST API, zips the bench-*.yaml flows
under the `flows/` parent dir BrowserStack requires, uploads the
test suite, triggers a build per platform (default device per
platform configurable via flags), prints the dashboard URL.
Auth and the bench-flows zip are deduplicated via custom_id so
re-running with byte-identical artefacts is cheap. Lazy env
resolution so `--help` and arg-validation errors don't require
credentials.
- `e2e/.maestro/bench-rpc-receiver.yaml` — sibling of bench-rpc.yaml
that flips the "POST spans" toggle before tapping run, so spans
fire to localhost:8787 (reachable from BS devices via
BrowserStackLocal). bench-rpc.yaml's stale comment about the
removed `comapeoBench` flavor toggle is also refreshed here.
- `.env.example` + `.gitignore` updates: credentials live in `.env`
(gitignored), receiver output in apps/benchmark/results/
(gitignored), BrowserStackLocal's default log files
(browserstack.{err,log}, local.log) gitignored too.
- npm scripts: `bench:receiver` and `bench:browserstack` for the
per-run workflow documented in apps/benchmark/README.md.
Verified offline: receiver accepts valid spans, blocks path-traversal
runIds, rejects malformed JSON; runner --help and arg-validation
paths render without credentials. Online verification (real upload +
build trigger) blocked on BrowserStack account access.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
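The path-traversal guard on `runId` mentioned for the receiver can be sketched as an allowlist check performed before any file path is built. The regex below is an assumption for illustration; the actual pattern is whatever App.tsx generates, which this page does not show. `resultsPathFor` is likewise a hypothetical name.

```typescript
// Assumed allowlist: letters, digits, underscore and hyphen only, bounded length.
// Anything containing "/", "\" or "." is rejected, which rules out "../" escapes.
const RUN_ID_PATTERN = /^[A-Za-z0-9_-]{1,64}$/;

// Hypothetical helper: map a runId to its NDJSON file, or null if the id is
// not in the allowlisted shape.
function resultsPathFor(runId: string): string | null {
  if (!RUN_ID_PATTERN.test(runId)) return null;
  return `apps/benchmark/results/${runId}.ndjson`;
}

// Illustrative inputs.
const acceptedPath = resultsPathFor("run-2024_abc123");
const rejectedPath = resultsPathFor("../../etc/passwd");
```

Validating against an allowlist rather than stripping suspicious substrings keeps the check simple: the receiver only ever writes to names it could have been sent by the app.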
`scripts/run-on-browserstack.ts` reads the project name from
`BENCH_BROWSERSTACK_PROJECT` (in `.env`) when --project isn't
passed. Required for org accounts where the access key can't create
new projects — they need to attach builds to an existing project
(verified end-to-end via `GET /app-automate/projects.json`).
`apps/benchmark/RESULTS.md` is the curated summary destination
agreed for Phase 4 results. Includes a template run section that
new runs copy from, plus column documentation. Raw NDJSON spans
remain gitignored under `apps/benchmark/results/`; a future
summarizer script can read those and rewrite a generated section
of this file.
Online dispatch verified up to the build trigger: app + test-suite
uploads succeed and return bs:// URLs; build trigger blocked on
BrowserStack org-side permissions ("You do not have the necessary
permissions to create builds in this project. Please contact your
organization admin.") — needs admin to grant build-creation rights
in the existing CoMapeo project.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The default code path is to send no `project` field so BrowserStack auto-creates one from the uploaded app's bundle ID — which is what we want. The env var is only relevant when a key can't auto-create and a pre-existing project must be reused. Reword the example to not imply CoMapeo (the org's existing app project) is the right target for the bench. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two fixes to scripts/run-on-browserstack.ts found while smoke-testing the dispatch end-to-end against the real BrowserStack API:
1. `execute` path doubled `flows/`. BrowserStack auto-prepends the zip's parent dir at extraction time, so `execute: ["flows/<flow>"]` resolved to `<extract>/flows/flows/<flow>` and dry-run logs reported "Flow path does not exist". Drop the prefix; BS prepends it.
2. `deviceLogs` and `networkLogs` are off by default per BS docs, and the only way to triage a failing build's app logcat is via the device log endpoint. Default them on for the bench workflow: retention is 60 days each, the bench is debug-oriented, and the dispatch is human-driven, not bulk-CI.
Verified end-to-end: builds dispatched after these fixes correctly target a single bench-*.yaml flow, and `device_log` URLs surface in the per-test response so failed runs can be diagnosed without guessing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
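The first fix amounts to sending flow paths relative to the zip's parent directory. A minimal sketch, assuming the flows are zipped under `flows/` as described; `toExecutePaths` is a hypothetical helper, not a function from the script.

```typescript
// Parent directory the flows are zipped under, per the commit message.
const FLOWS_PARENT_DIR = "flows/";

// Hypothetical helper: strip a leading "flows/" so the path sent in `execute`
// is relative to the directory BrowserStack adds back at extraction time.
function toExecutePaths(flowPaths: string[]): string[] {
  return flowPaths.map((flowPath) =>
    flowPath.startsWith(FLOWS_PARENT_DIR)
      ? flowPath.slice(FLOWS_PARENT_DIR.length)
      : flowPath,
  );
}

// Illustrative input: one path with the prefix, one already relative.
const executePaths = toExecutePaths(["flows/bench-rpc.yaml", "bench-payload-1KB.yaml"]);
```

Paths that are already relative pass through unchanged, so the helper is safe to apply regardless of how the caller spelled the flow name.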
Adds a paired Android/iOS hook to skip the keystore-backed rootkey
load and ship a deterministic 16-zero-byte stub on the init frame
instead. Off by default; production consumers MUST leave it off so
real identity material stays encrypted at rest.
Why: `RootKeyStore.createOrLoadWrapperKey` (Android) sets
`setUnlockedDeviceRequired(true)` on the wrapper key, which since
Android 12 requires the device's user ECDH key to be initialised
(pinned to a real screen lock setup). BrowserStack's stock fleet
ships without a screen lock, so any real-device run of the bench app
on BS hit:
KeyStoreException: System error (code 4)
In handle_super_encryption_on_key_init: User ECDH key missing.
The bench backend doesn't construct a `MapeoManager` and never
reads the rootkey value — so swapping the keystore path for a
zeroed stub is safe by construction for the benchmark, while a real
production deploy stays on the keystore path and on a real device
that has the prerequisites.
Wiring matches the existing `comapeoBackendDir` shape:
- Android: gradle property `comapeoStubRootKey` →
`BuildConfig.COMAPEO_STUB_ROOTKEY` boolean → branched in
`NodeJSService.sendInitFrame`.
- iOS: `ComapeoStubRootKey` Info.plist boolean → branched in
`AppLifecycleDelegate`'s `rootKeyProvider` closure.
- Bench plugin (`apps/benchmark/plugins/with-comapeo-bench/`) sets
both. README documents it alongside `comapeoBackendDir`.
Verified end-to-end: bench app build with the stub flag passes the
full `bench-rpc.yaml` Maestro flow on BrowserStack Samsung Galaxy
S23 Ultra (Android 13) — same APK that previously failed at
`STARTING -> ERROR` from the keystore step now reaches STARTED.
Local emulator path also confirmed unaffected (bypasses keystore as
expected, no regression).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two fixes needed for end-to-end span aggregation on BrowserStack:
- `local: true` on the build trigger payload. Without it, the BS
device's `localhost` resolves to its own loopback (not our host's
via the BrowserStackLocal tunnel) and the receiver POSTs vanish.
- `usesCleartextTraffic="true"` on the bench app's release manifest
(via `withAndroidManifest` in the bench plugin). Expo prebuild
only sets this on debug variants; release variants on Android 14
(targetSdk=36) silently block cleartext-to-localhost fetches by
default. App.tsx's POST has `.catch(() => {})` so the failure was
invisible — the bench would complete and assert results visible
while every span POST quietly dropped.
Both confirmed by a clean run on Samsung Galaxy S23 Ultra
(Android 13): receiver collected 300 spans (3 sizes × 100 samples),
sub-ms p50 across small payloads, expected scaling at 64 KB.
RESULTS.md filled in with this first real run as the format
exemplar.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three threads of work in one go, plus the first 19-device sweep's data committed:
- `comapeoBackendArgs` Gradle property → `BuildConfig.COMAPEO_BACKEND_ARGS` → appended to nodejs-mobile argv. Native loader also derives a `--device=<MANUFACTURER MODEL (Android REL)>` arg so the bench backend tags spans without an extra round-trip. Telemetry sink takes per-process defaults that lift `runId` to top-level (matching the receiver's wire format) and tuck device into `attrs.device`. Bench plugin sets `comapeoBackendArgs=--telemetry=http://localhost:8787/spans`.
- App.tsx attaches `attrs.device` to every RN-side rpc span via `Platform.constants` so the summarizer can group across the cross-device-runId noise.
- `scripts/run-on-browserstack.ts` accepts CSV via `--devices-android` / `--devices-ios` and submits the array in a single build.
- `scripts/bench-summarize.ts` reads NDJSON files and rewrites a marker-delimited section in `apps/benchmark/RESULTS.md`. Curated commentary outside the markers is preserved across re-runs.
First 19-device sweep landed: 14 non-Samsung non-Pixel Android devices in the BS catalog + 3 Samsung + 2 Pixel, dispatched as two batches (5 parallel + 5 queued cap). All 19 sessions passed. RESULTS.md gains a variance-analysis section that walks through the p99/p50 spread (typical 2–6× ratio) and roots it in scheduler preemption, GC pauses, CPU frequency scaling, and tail interrupt events.
Boot-phase capture is a known gap: BS Local doesn't appear to tunnel nodejs-mobile libuv socket traffic the way it tunnels RN's fetch; documented inline for follow-up.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
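The marker-delimited rewrite that the summarizer performs on `RESULTS.md` can be sketched as a plain string splice. The marker strings and the function name below are assumptions chosen for illustration; the real markers used by `scripts/bench-summarize.ts` are not shown on this page.

```typescript
// Hypothetical marker comments delimiting the generated region.
const GENERATED_BEGIN = "<!-- bench:generated:begin -->";
const GENERATED_END = "<!-- bench:generated:end -->";

// Replace everything between the markers with `generated`, leaving the
// markers themselves and all text outside them untouched.
function rewriteGeneratedSection(doc: string, generated: string): string {
  const start = doc.indexOf(GENERATED_BEGIN);
  const end = doc.indexOf(GENERATED_END);
  if (start === -1 || end === -1 || end < start) {
    throw new Error("generated-section markers not found or out of order");
  }
  const head = doc.slice(0, start + GENERATED_BEGIN.length);
  const tail = doc.slice(end);
  return `${head}\n${generated.trim()}\n${tail}`;
}

// Illustrative document with curated notes before and after the region.
const resultsBefore = [
  "Curated notes above.",
  GENERATED_BEGIN,
  "old table",
  GENERATED_END,
  "Curated notes below.",
].join("\n");
const resultsAfter = rewriteGeneratedSection(resultsBefore, "new table");
```

Because only the span between the markers is replaced, running the summarizer twice with the same input yields the same file, and hand-written commentary survives each regeneration.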
bench-*.yaml flows belong with the benchmark app, not in the module's generic e2e/.maestro/ directory. Moves all five bench flows + the iOS-flavoured variant into apps/benchmark/.maestro/, adds a minimal config.yaml so Maestro CLI runs only the bench-* discoveries here, and points the BS dispatch script's FLOWS_SRC_DIR at the new location. Also drops bench-rpc-receiver.yaml — the receiver/tunnel transport is about to be replaced by logcat-based reporting (next commit) where the post-spans toggle is no longer relevant. Module-level e2e flows (app-launch, ipc-roundtrip, state-transitions, send-multiple-rounds, node-process-starts) stay in e2e/.maestro/. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The receiver+tunnel approach worked for RPC spans but had a long tail
of fragile bits: BrowserStackLocal had to be running before dispatch,
the consumer app needed `usesCleartextTraffic="true"` (Expo only sets
that on debug variants), and nodejs-mobile's libuv sockets bypassed
the BS Local intercept entirely so boot spans never landed.
Logs sidestep all of it. BS captures Android logcat verbatim when the
build sets `deviceLogs: true`, with 60-day retention and a REST
endpoint to pull post-build. Both span sources (RN bridge,
nodejs-mobile) just `console.log("BENCH_SPAN " + JSON)` and BS picks
them up under their respective log tags.
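A minimal sketch of the host-side half of this transport, assuming each span is a single JSON object following the `BENCH_SPAN ` prefix. The `BenchSpan` shape and the `durationMs` field name are hypothetical, not taken from the bench app's schema.

```typescript
// Hypothetical span shape; the real schema lives in the bench app.
interface BenchSpan {
  op: string;
  durationMs: number;
  attrs?: Record<string, unknown>;
}

const PREFIX = "BENCH_SPAN ";

// Extract spans from raw device-log text. Log lines carry their own
// timestamp/tag columns, so look for the prefix anywhere in the line.
function parseBenchSpans(logText: string): BenchSpan[] {
  const spans: BenchSpan[] = [];
  for (const line of logText.split(/\r?\n/)) {
    const at = line.indexOf(PREFIX);
    if (at === -1) continue;
    try {
      spans.push(JSON.parse(line.slice(at + PREFIX.length)) as BenchSpan);
    } catch {
      // Truncated or interleaved line: skip it rather than fail the run.
    }
  }
  return spans;
}
```

Skipping unparseable lines is a design choice for this sketch; a stricter parser could count and report them instead.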
Removed:
- `scripts/bench-receiver.ts` and its `bench:receiver` npm script
- POST-spans toggle + receiver URL TextInput + RECEIVER_DEFAULT_URL
in App.tsx (no longer relevant on the device)
- `withAndroidManifest`-driven `usesCleartextTraffic` in the bench
plugin (was only there for cleartext-localhost POSTs)
- `comapeoBackendArgs=--telemetry=http://localhost:8787/spans` from
the plugin (LogSink is now the default; comapeoBackendArgs stays
as an empty escape hatch for non-default sinks)
- BrowserStackLocal default log file gitignores (no longer needed in
the bench workflow)
Added:
- `LogSink` in apps/benchmark/backend/lib/telemetry-sink.js — writes
one stdout line per span with the `BENCH_SPAN ` prefix and the
same `mergeDefaults` field-lifting (runId / device) the other
sinks already had.
- `LogSink` is now the default returned by `createSinkFromArg` when
no `--telemetry=` arg is passed.
- The `--device=<MANUFACTURER MODEL (Android REL)>` arg is always
appended to the nodejs-mobile argv (was conditional on
`comapeoBackendArgs` being non-empty); the production backend
ignores the unknown flag.
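As an illustration of the `LogSink` behaviour described above, here is a rough sketch in TypeScript (the real file is plain JS). The `Span` and `Defaults` types, the `emit` method name, and the injectable `write` function are assumptions made for the example; only the `BENCH_SPAN ` prefix and the runId / device field-lifting come from the commit message.

```typescript
// Hypothetical types for illustration only.
type Span = { op: string; attrs?: Record<string, unknown>; [key: string]: unknown };
interface Defaults {
  runId?: string;
  device?: string;
}

// Lift runId to the top level and tuck device into attrs.device.
// Values already present on the span win over the defaults.
function mergeDefaults(span: Span, defaults: Defaults): Span {
  return {
    ...(defaults.runId !== undefined ? { runId: defaults.runId } : {}),
    ...span,
    attrs: {
      ...(defaults.device !== undefined ? { device: defaults.device } : {}),
      ...span.attrs,
    },
  };
}

class LogSink {
  private defaults: Defaults;
  private write: (line: string) => void;

  constructor(defaults: Defaults = {}, write: (line: string) => void = console.log) {
    this.defaults = defaults;
    this.write = write;
  }

  // One stdout line per span, prefixed so the host can grep for it.
  emit(span: Span): void {
    this.write("BENCH_SPAN " + JSON.stringify(mergeDefaults(span, this.defaults)));
  }
}
```

Injecting `write` keeps the sketch testable without capturing stdout; the sink in the repository may be structured differently.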
The dispatch script's log-pull + parsing change comes in the next
commit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…default
Five distinct improvements packed into one runner rewrite:
1. **Maestro version pin**. Adds `maestroVersion: "2.0.7"` to the build trigger. BS supports `latest` / `2.0.7` / `1.39.13` (the default 1.39.13 is older). 2.0.7 has the runner-side `http` client and perf fixes; pinning rather than `latest` avoids surprise version bumps.
2. **Auto-batch via plan capacity**. Reads `/app-automate/plan.json` for `parallel_sessions_max_allowed + queued_sessions_max_allowed` and chunks `--devices-android` into batches that fit. Was: human had to hand-split a list of 19 into two dispatches.
3. **Log-pull and parse**. After each batch reaches a terminal status, walks `/builds/<id>/sessions/<sid>` for per-test `device_log` URLs, fetches each, greps `BENCH_SPAN ` lines, and writes one NDJSON file per device under `apps/benchmark/results/<device-slug>-<session-prefix>.ndjson`. Replaces the receiver+tunnel transport entirely.
4. **Test R&A organization**. Switches `buildName` (heuristic-stripped) for `customBuildName` (static, default `comapeo-bench`) plus `buildIdentifier` (per-run, default ISO timestamp). Optional `--build-tag` for free-form filtering on the dashboard. Both flags are exposed via CLI.
5. **10-device curated default**. `CURATED_ANDROID_DEVICES` spans Android 9–16 across 6 brands and the variance spectrum (S26 Ultra → P30); fits one BS 5+5 plan dispatch. Was: single `Samsung Galaxy S23 Ultra-13.0` default. Override singly with `--device-android` or via CSV with `--devices-android`.
Also drops `local: true` (no tunnel needed for log-based transport) and tightens the polling loop's status reporting.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
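The auto-batching in item 2 can be sketched roughly as follows. The two `PlanLimits` field names are the ones quoted in the commit message; the `batchDevices` helper and its signature are hypothetical and may not match the runner's actual code.

```typescript
// Subset of the plan response used for batching (field names as quoted above).
interface PlanLimits {
  parallel_sessions_max_allowed: number;
  queued_sessions_max_allowed: number;
}

// Split the device list into batches no larger than parallel + queued capacity.
function batchDevices(devices: string[], plan: PlanLimits): string[][] {
  const capacity =
    plan.parallel_sessions_max_allowed + plan.queued_sessions_max_allowed;
  if (capacity < 1) throw new Error("plan allows no sessions");
  const batches: string[][] = [];
  for (let i = 0; i < devices.length; i += capacity) {
    batches.push(devices.slice(i, i + capacity));
  }
  return batches;
}
```

With a 5 parallel + 5 queued plan, a 19-device list comes out as one batch of 10 and one of 9, which matches the two-dispatch split that previously had to be done by hand.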
README: drops the receiver/tunnel workflow, shows the logcat path, points at the new per-build NDJSON output and curated 10-device default. Maestro-flow paths updated to apps/benchmark/.maestro/. Phase 4 marked complete. RESULTS: replaces the "boot-phase wiring caveat" with a "resolved" note pointing at the new logcat-based transport. Phase 4 boot spans now flow. Skill at ~/.claude/skills/browserstack-app-automate-maestro/SKILL.md was rewritten in tandem (not committed here — it lives outside the repo). The skill puts logs first as the recommended transport and keeps BS Local + Maestro runScript as documented fallbacks. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…N device tag
Two-part iOS parity work for the logcat-based span transport.
1. Pipe nodejs-mobile stdout/stderr to `os_log`. On Android the
libnode build pipes Node's stdio into logcat so `console.log`
from the rolled-up backend lands under the `Comapeo:NodeJS`
tag automatically. On iOS no equivalent piping exists — by
default the writes hit fds inherited from the parent process
(/dev/null on a release build, Xcode console on debug),
neither of which the iOS unified log subsystem captures.
`NodeMobileBridge.mm` now sets up a one-shot pipe via
`pthread_once`: dup2 stdout/stderr onto the write end of a
pipe, spawn a detached pthread that reads line-by-line and
forwards each line to `os_log` under `com.comapeo.nodejs:stdout`.
Process-wide redirect, so RN's console.log lands in the same
subsystem too — handy for the bench app, which has BENCH_SPAN
emitters on both sides.
2. Fix RN-side device tag derivation on iOS. App.tsx was reading
`Platform.constants.systemVersion` (doesn't exist) and
`.model` (doesn't exist), producing `"Apple device (iOS ?)"`.
The right keys on iOS are `osVersion` and `interfaceIdiom`
("phone" / "pad" / "tv"). RN-side now produces
`"Apple iPhone (iOS 26.2)"` to exactly match the backend tag
that NodeJSService.swift derives from `UIDevice.current.model`
+ `systemName` + `systemVersion`. The summarizer's group-by-
`attrs.device` is reliable as a result.
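For illustration, the corrected RN-side derivation might look like the sketch below. It takes the two iOS constants as a plain object so it runs outside React Native; the `IDIOM_TO_MODEL` mapping and the `iosDeviceTag` name are assumptions for the example, not the code in App.tsx.

```typescript
// The two iOS Platform.constants keys named in the commit message.
interface IosConstants {
  osVersion: string;
  interfaceIdiom: string;
}

// Hypothetical mapping from interfaceIdiom to a UIDevice-style model name.
const IDIOM_TO_MODEL: Record<string, string> = {
  phone: "iPhone",
  pad: "iPad",
};

// Build a tag in the "Apple <model> (iOS <version>)" form shown above.
function iosDeviceTag(constants: IosConstants): string {
  const model = IDIOM_TO_MODEL[constants.interfaceIdiom] ?? "device";
  return `Apple ${model} (iOS ${constants.osVersion})`;
}
```

In the app this would presumably be called with `Platform.constants` when `Platform.OS === "ios"`.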
Also: NodeJSService.swift now appends `--device=<tag>` and the
optional `ComapeoBackendArgs` Info.plist value to the nodejs-mobile
argv (mirrors the Android `comapeoBackendArgs` Gradle property).
The bench plugin sets the Info.plist key to empty by default;
override it per-build to pass e.g. `--telemetry=file:/tmp/x.ndjson`.
Verified end-to-end on iPhone 16 simulator (iOS 26.2) running
`bench-rpc-ios.yaml`: 3 boot spans + 330 RPC spans land in
`com.comapeo.nodejs:stdout` with consistent device tagging across
both emitters.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`apps/benchmark/scripts/build-ipa.sh` (exposed as `npm run ios:archive`) wraps `xcodebuild archive` + `-exportArchive` with a Development export method, the team id read from `APPLE_DEVELOPMENT_TEAM_ID` in `.env`, and a generated `ExportOptions.plist` so no per-developer plist needs to be checked in.
The path BrowserStack accepts:
- BS auto-resigns iOS apps on upload, replacing the consumer provisioning profile with theirs. Distribution / App Store signing isn't required; Development export works.
- The bundle id (`com.comapeo.core.benchmark`) only needs to exist as an Identifier under the developer team. No Capabilities are needed (the bench app sets `comapeoStubRootKey: true` so it doesn't touch Keychain). No App Store Connect record needed.
- `xcodebuild` runs with `-allowProvisioningUpdates` so Xcode can auto-create the Development cert + provisioning profile on the first archive without forcing manual portal setup.
Output lands at `apps/benchmark/ios-build/ipa/<scheme>.ipa`, gitignored. Consumed by `npm run bench:browserstack -- --app-ios`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Discovered during the first cross-platform BS dispatch that iOS
release builds suppress JS `console.log` (RCTLog level filter
defaults to WARN), so RN-side spans never reached the device log
even though the `pipe + dup2 → os_log` redirect was capturing
nodejs-mobile output cleanly. The backend boot phases came through;
the RTT samples didn't.
Three coupled changes:
1. `bench-rpc.js`: new `ingestSpans` RPC method that takes
`{spans: [...]}` and re-emits each via `console.log` (which on
the backend side IS captured — Android logcat directly, iOS via
the bridge's pipe redirect). Single batched call after the bench
loop completes, so the round-trip cost doesn't pollute the RTT
samples.
2. `App.tsx`: dropped the per-iteration `console.log("BENCH_SPAN ...")`
in favour of `client.request("ingestSpans", { spans: allSpans })`
after measurement is done. Span data still gets serialised to
the on-device NDJSON file as before.
3. `bench-summarize.ts`: filter on `attrs.rttSide === "rn"` for the
RPC throughput table. Without this, `op:"rpc"` spans from the
backend's per-handler tracing (sub-ms by design, mostly bench-
rpc.js internal) get aggregated together with the user-facing
RN-thread RTT samples, pulling p50 toward zero.
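To make the effect of the `rttSide` filter in item 3 concrete, here is a small sketch of a percentile summary over RN-side samples only. The `durationMs` field, the `summarizeRnRtt` helper, and the nearest-rank percentile method are assumptions for illustration; the actual summarizer may compute percentiles differently.

```typescript
// Hypothetical span shape; only op and attrs.rttSide come from the commit message.
interface RttSpan {
  op: string;
  durationMs: number;
  attrs?: { rttSide?: string };
}

// Nearest-rank percentile over an ascending-sorted array.
function percentile(sorted: number[], p: number): number {
  if (sorted.length === 0) return NaN;
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(0, rank - 1)] ?? NaN;
}

// Summarize only the RN-thread round-trip samples, ignoring backend handler spans.
function summarizeRnRtt(spans: RttSpan[]) {
  const samples = spans
    .filter((s) => s.op === "rpc" && s.attrs?.rttSide === "rn")
    .map((s) => s.durationMs)
    .sort((a, b) => a - b);
  return {
    count: samples.length,
    p50: percentile(samples, 50),
    p95: percentile(samples, 95),
    p99: percentile(samples, 99),
  };
}
```

Dropping the `rttSide` check in the filter would mix the sub-millisecond backend spans into `samples`, which is the skew toward zero described above.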
Verified end-to-end: dispatched the 10-device Android sweep + 1 iOS
device. All 11 sessions passed, 6542 spans collected, autosummary
table now shows realistic numbers across all devices including
iPhone (iOS 17.3 → 64B p50=0.19ms p99=0.75ms). RESULTS.md gains a
new run entry referencing the two BS build URLs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comments left over from the receiver+tunnel transport before the logcat pivot. Updates: bench-rpc.yaml drops the "sibling bench-rpc-receiver.yaml" paragraph; bench-summarize.ts header points at the real upstream (run-on-browserstack.ts logcat parser, not the deleted bench-receiver); android/build.gradle's comapeoBackendArgs doc no longer claims the bench wires HttpSink there; RESULTS.md template uses the real flow name. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three pieces orphaned by the receiver+tunnel → logcat pivot:
- HttpSink + the http(s):// branch in createSinkFromArg: nothing constructs it now that the bench plugin sets comapeoBackendArgs empty and LogSink is the default.
- scripts/lib/bench-receiver.ts: no consumer; spans flow via logcat pulled by run-on-browserstack.ts.
- apps/benchmark/.maestro/bench-rpc-ios.yaml: bench-rpc.yaml works for both platforms (the recent 11-device sweep included an iPhone that passed clean), so the iOS-specific variant is dead weight.
README repo-layout table and architecture paragraph updated to match.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Backend handler spans were being emitted, captured into NDJSON, then silently dropped by the summarizer's rttSide-rn filter — pure noise in the logcat budget. Now they carry attrs.rttSide:"backend" and the summarizer renders a second table beneath the RN-side one. The diff against the RN row is approximately the JSI + framing + UDS overhead, which is the most actionable diagnostic when a regression appears. Also drops the span emit for the `ingestSpans` housekeeping RPC, whose body is the bulk span flush itself (one big outlier per run is not useful percentile data), and narrows the BootPhase typedef to the three server-side phases the bench actually emits — the three native phases the prior typedef listed are out of scope for this process. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The pipe + dup2 → os_log redirect was running for every consumer of
this module, not just the bench app. Two production concerns: the
%{public}s formatter on the os_log call deliberately defeats the
unified log's PII redaction (any future identity-bearing log line
would land in the device's persistent log, retrievable via sysdiagnose),
and the always-on reader pthread is overhead production apps don't
otherwise pay.
Now opt-in via the Info.plist BOOL `ComapeoStdoutToOsLog`. Production
consumers leave it unset and inherit iOS's default stdout routing.
The bench app's `with-comapeo-bench` config plugin sets it true so
BrowserStack can pull `BENCH_SPAN <json>` lines out of the device
console as before.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Step-by-step plan for a workflow_dispatch-only GitHub Actions workflow that builds the bench APK + IPA, dispatches to BrowserStack, pulls device logs, and uploads NDJSON + RESULTS.md as workflow artefacts. Defers regression detection, automated triggers, and PR comments to later iterations — this slice is just the manual pipeline. Covers the iOS keychain bootstrap (the only meaningfully new piece beyond what already runs locally), the secrets surface, and an implementation order. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reduces the production-code touch points exposed for non-production consumers (the bench app being the only one) down to a single override on each platform plus the existing nodejs-mobile stdout-redirect gate.
- Drop `comapeoBackendArgs` (Gradle property + BuildConfig field + Kotlin parsing on Android; Info.plist key + Swift parsing on iOS). Was speculative surface for future telemetry-sink overrides; nothing in this PR populates it. The `--device=<tag>` argv injection the native loader does unconditionally is unaffected: production backend ignores unknown flags and Sentry tagging will read it.
- Rename `comapeoBackendDir` → `comapeoEntryFile`. Override is now a filename inside `nodejs-project/` rather than a sibling directory. Bench plugin drops the bench entry into the consumer's `nodejs-project/` and lets AGP's asset merge (Android) / a Run Script Phase (iOS) co-locate it with the production bundle's `index.mjs`. Bench bundle's rollup output is renamed to `index.bench.mjs` and no longer ships a `package.json` (the production bundle's already does, in the same directory).
- Drop `comapeoStubRootKey` end-to-end now that #57 (drop setUnlockedDeviceRequired from rootkey wrapper key) has landed on main. The stub existed only to work around BrowserStack stock no-screen-lock devices failing key generation; the real keystore path now succeeds for them, the bench backend's relaxed init handler ignores the rootkey bytes it receives, and the production branch in the FGS loader simplifies back to a single RootKeyStore.loadOrInitialize() call.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Trim verbose explanations down to non-obvious why-only. Cut:
- Restatements of what the code does (the code is right there).
- Multi-paragraph rationale better suited to PR description / README.
- ASCII state-machine diagrams and historical narration.
Keep load-bearing rationale: hidden constraints (libUV contiguous
argv, AGP asset merge, RCTLog level filter, undici WASM init,
streamx-microtask write-after-end race), security gates (`%{public}s`
defeats os_log redaction), and protocol invariants (`stopping`-before-
close so native distinguishes graceful from crash).
Net: -473 lines, no behavior change. Bench bundle still builds; plugin
still loads.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Adds a UDS/RPC bridge benchmark suite as a standalone Expo app under `apps/benchmark/`. Exercises the same RN → native → nodejs-mobile path real users hit, but with `@comapeo/core` stripped out so framing / IPC / JSON-RPC overhead measures cleanly. Runs locally via Maestro and on a curated BrowserStack device sweep; results land as NDJSON spans plus a refreshed `RESULTS.md`.
Production-code surface for the bench app is intentionally narrow; see "Module override surface" below.
What lands
Bench app (`apps/benchmark/`)
- `backend/`: minimal nodejs-mobile entry that reuses the production state machine (`pre-listening` → `started` → `ready`) and path-imports the framing helpers (`server-helper.js`, `simple-rpc.js`, `message-port.js`) from the production `backend/lib/` so the wire framing is bit-identical. Drops `@comapeo/core` entirely. `BenchRpcServer` registers `echo`, `payload(sizeBytes)`, and `ingestSpans` methods; rolled up to a single `dist/index.bench.mjs`.
- `App.tsx`: RN-side bench client. Talks to the bench backend via `unstable_messagePort` (see below). Runs warmup + steady-state sweep across 64 B / 1 KB / 64 KB / 1 MB payload classes (10 + 100 iterations per size), records per-RPC RTT, renders an on-screen p50/p95/p99 panel, exports NDJSON via the share sheet.
- `backend/lib/telemetry-sink.js`: pluggable sink interface with `LogSink` (default; one `BENCH_SPAN <json>` stdout line per span; surfaces in Android `logcat` and iOS device console), `JsonFileSink` (NDJSON to `<app sandbox>/Documents/comapeo-bench/<runId>.ndjson`), and `NoopSink`.
- `backend/lib/boot-spans.js`: wraps `listen-control` / `init` / `construct` boot phases with Sentry-shaped spans so the eventual Sentry adapter (§7.4.2 of the Sentry plan) can adopt the same call sites.
- `.maestro/`: local-run flows (one per payload class plus a sweep), plus workspace `config.yaml`.
- `plugins/with-comapeo-bench/`: Expo config plugin (see "Module override surface" for what it sets). Idempotent across `expo prebuild` reruns.
- `scripts/build-ipa.sh`: builds a Development-export IPA (`com.comapeo.core.benchmark` bundle id) for BrowserStack; auto-resigned on upload.
- `README.md` + `RESULTS.md`: architecture, run instructions, current measured baseline.
Module override surface (production code)
The bench app drops a sibling entry file (`index.bench.mjs`) into the consumer's `nodejs-project/` and tells the module's loader to run it instead of the production `index.mjs`. Single override:
- `comapeoEntryFile`: Gradle property → `BuildConfig.COMAPEO_ENTRY_FILE` on Android; `ComapeoEntryFile` Info.plist key on iOS. Defaults to `index.mjs`. AGP merges the bench file with the library bundle on Android; an Xcode Run Script build phase copies it into `<App>.app/nodejs-project/` after CocoaPods' resource-copy phase on iOS.
Two adjacent additions that aren't bench-only:
- `--device=<MANUFACTURER MODEL (Android REL)>` / `--device=Apple <model> (<systemName> <systemVersion>)` appended to the nodejs-mobile argv unconditionally. Production backend ignores unknown positional flags; the bench backend reads it for span attribution; Sentry tagging will read it once that lands. Pure no-op for current production consumers.
- `backend/lib/message-port.js` hardening: `SocketMessagePort.postMessage` now drops writes after `close()`, and a socket-level `error` listener swallows the `ERR_STREAM_WRITE_AFTER_END` race that otherwise surfaces as `uncaughtException` during graceful shutdown. Both are real production fixes (the race exists today; the bench shutdown sequence just makes it routine).
iOS-only opt-in (off by default for production):
- `ComapeoStdoutToOsLog` Info.plist BOOL: when true, `NodeMobileBridge.mm` dup2s nodejs-mobile's stdout/stderr onto a pipe and forwards each line to `os_log` under the `com.comapeo.nodejs` subsystem. Lets BrowserStack capture `BENCH_SPAN` lines from the device console. Production consumers leave it unset and inherit the legacy routing (so unredacted JS log lines stay out of the unified log).
`unstable_messagePort` export (`src/ComapeoCoreModule.ts`)
Raw `CoreMessagePort` singleton: escape hatch for consumers that need to bypass the `MapeoClient` request/response machinery and speak directly to whatever backend bundle they've wired in. The bench app uses it. The `unstable_` prefix follows the React/RN convention for surfaces whose shape may change without notice; production consumers should keep using `comapeo`.
Host-side runner
- `scripts/run-on-browserstack.ts`: `npm run bench:browserstack`. Queries BrowserStack `/plan.json` for parallel + queued cap, chunks a curated 10-device Android sweep into builds that fit, dispatches each, polls until terminal, pulls per-device logcat, greps `BENCH_SPAN` lines into one NDJSON per device under `apps/benchmark/results/`. No `BrowserStackLocal` tunnel, no host-side receiver process, no cleartext-traffic config.
- `scripts/bench-summarize.ts`: `npm run bench:summarize`. Refreshes `apps/benchmark/RESULTS.md` from the pulled NDJSONs (per-device p50/p95/p99 per payload class, plus `rttSide:"backend"` vs `rttSide:"rn"` columns for the bridge-overhead diff).
CI plan (`docs/bench-ci-plan.md`)
Scaffolding for a manual-trigger benchmark workflow that posts artifacts back to GitHub. Implementation lands in a follow-up PR.
Notable design choices
- `server-helper.js`, `simple-rpc.js`, and `message-port.js` from the production `backend/lib/` via path-relative imports so any divergence in framing would invalidate the benchmark's premise. Rollup inlines them at bundle time.
- `@comapeo/core` (no iOS maps-plugin stub needed) and never loads native addons (no per-platform `__loadAddon` banner needed).
- `console.log("BENCH_SPAN " + JSON.stringify(span))` from the backend. Android picks up via `logcat`; iOS picks up via the opt-in `os_log` redirect; the BS runner pulls device logs after each build and greps. RN-side spans round-trip through the bench RPC's `ingestSpans` method so they emit through the same path (iOS release builds suppress JS `console.log` via RCTLog's level filter, so RN-direct logging doesn't reach the device console).
- `payload` cache. Pre-allocates and caches synthesized payloads per size (capped at 4 MiB resident) so a mixed-size sweep doesn't spend its time in `String.repeat`.
Sentry alignment
Boot phases (`boot.listen-control`, `boot.init`, `boot.construct`) and per-call RPC spans (`rpc.echo`, `rpc.payload`) follow the Sentry-shaped span taxonomy in §7.4.2 of the Sentry plan. The eventual `SentryAdapterSink` (Phase 5 in the bench README) implements the same surface, so the call sites stay unchanged when the production loader adopts shared instrumentation.
Dependency
Depends on #57 (drop `setUnlockedDeviceRequired` from rootkey wrapper key, which landed on main). Without that, BrowserStack's stock no-screen-lock fleet would fail wrapper-key generation at FGS startup; an earlier iteration of this branch carried a `comapeoStubRootKey` opt-out hook to work around it, which is no longer needed and has been removed.
Test plan
- `cd apps/benchmark && npm install && npm run prebuild && npm run android`: bench app reaches STARTED, run-benchmark tap produces a results panel.
- `npm run ios`.
- `npm run --prefix apps/benchmark/backend build` produces `dist/index.bench.mjs` (and sourcemap).
- `apps/example/` (production consumer) builds + runs unchanged: `comapeoEntryFile` defaults to `index.mjs`, no Info.plist keys set, FGS reaches STARTED, RPC works.
- `npm run bench:browserstack -- --app-android <apk> --app-ios <ipa>` (with credentials in `.env`) dispatches against the curated sweep, pulls NDJSONs into `apps/benchmark/results/`.
- `npm run bench:summarize` refreshes `apps/benchmark/RESULTS.md`.
🤖 Generated with Claude Code