feat: CORS iframe parity + closed shadow DOM#32
Conversation
Mirrors the @percy/sdk-utils-to-inlined-helpers refactor from the canonical JS SDKs. Adds DEFAULT_MAX_FRAME_DEPTH, UNSUPPORTED_IFRAME_SRCS, the PercyContextLost error class, and the shared iframe helper surface (is_unsupported_iframe_src?, clamp_frame_depth, normalize_ignore_selectors, resolve_max_frame_depth, resolve_ignore_selectors, enumerate_iframes_script) that the upcoming nested CORS / data-percy-ignore / ignoreIframeSelectors / context-loss-recovery features will build on. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the single-level CORS iframe loop with process_frame_tree, a bounded recursion mirroring percy-nightwatch's snapshot.js. After serializing an iframe's document, the walker re-enumerates iframes from inside that frame and recurses into any nested cross-origin children that differ from the immediate parent's origin. Depth is clamped via DEFAULT_MAX_FRAME_DEPTH (or options.maxIframeDepth / config) so pathological pages can't blow the stack, and an ancestorUrls Set short-circuits cyclic frame trees that link back to themselves. Specs updated to drive the new tree API via execute_script-returned iframe metadata (enumerate_iframes_script) instead of stubbing frame_element.attribute, matching how the SDK now reads the DOM. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Surfaces dataPercyIgnore from the enumerate_iframes_script payload and treats it as an explicit skip signal in should_skip_iframe?. Page authors can drop a single iframe from CORS capture (e.g. a noisy widget, a third-party embed) without configuring a project-wide selector. Mirrors the same attribute check shipped in percy-nightwatch and percy-webdriverio. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Threads options[:ignoreIframeSelectors] (and the snake_case alias, plus config.snapshot.ignoreIframeSelectors) through resolve_ignore_selectors into enumerate_iframes_script, which evaluates frame.matches(selector) in the page context for each iframe. should_skip_iframe? honors the resulting matchesIgnoreSelector flag, so a single config entry can drop classes of iframes (ads, chat widgets) from CORS capture without touching every page or call site. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After switching into an iframe, read document.URL and re-run the unsupported-src filter. Cross-origin navigations can fail in subtle ways — DNS errors, X-Frame-Options blocks, or programmatic about:blank redirects — that leave the static src attribute pointing at the original https:// URL while the actual document is an error page or about:blank. Without this check we'd serialize and ship that error page as "captured" iframe content. Mirrors the same check from percy-webdriverio's processFrameTree. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When a nested frame restoration fails — switch_to.parent_frame raises and we can no longer trust the driver's frame context — the deep process_frame_tree call now packages the iframes it captured before the loss into a PercyContextLost error and propagates upward. Each intermediate level merges its own collected frames into err.partial_capture before re-raising. capture_cors_iframes catches that error at the top, appends the partial capture into corsIframes, and aborts further sibling iteration. The result: a parent-frame restore failure in a 4-deep tree still ships everything we managed to serialize, instead of dropping the whole snapshot's CORS data. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds expose_closed_shadow_roots(driver), which uses three CDP calls — DOM.getDocument (deep + pierce), DOM.resolveNode, and Runtime.callFunctionOn — to surface closed shadow roots that are otherwise inaccessible to JavaScript (element.shadowRoot returns null). Each closed root is keyed into a host-keyed WeakMap on the page (window.__percyClosedShadowRoots) that clone-dom.js consults during PercyDOM.serialize. Wired into the snapshot pipeline before the top- level serialize and re-primed after page reloads inside capture_responsive_dom (refresh clears the WeakMap). No-op when CDP is unavailable or the driver isn't Chromium. Mirrors percy-playwright's exposeClosedShadowRoots; contentDocument subtrees are skipped since cross-frame closed roots aren't supported yet. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace rescue modifiers with explicit begin/rescue blocks, swap em-dash comments for ASCII, rename rescue variable, fix alignment/indentation, split overlong spec line, and add scoped rubocop disables for the intentional predicate method name and one block-nesting hot spot. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three CE:review fixes on the CORS iframe path: - Index alignment race: enumerate_iframes (JS querySelectorAll) and driver.find_elements(:tag_name, 'iframe') (Ruby) were two separate DOM reads. Any mutation between them (PercyDOM.serialize, dynamic insertion, etc.) silently shuffled the meta-to-element mapping, so one iframe's content could ship under another's percyElementId. Replace positional alignment with `find_element(css: 'iframe[data-percy-element-id="…"]')` via a new find_iframe_by_percy_id helper. percyElementId is already stamped by PercyDOM and we already require it via should_skip_iframe. - "blank" sentinel in UNSUPPORTED_IFRAME_SRCS matched any src starting with the literal text "blank" (e.g. a route at /blank/foo). The about:blank case is already covered by the about:blank entry; drop the bare token. - PercyContextLost.set_backtrace(captured_error.backtrace) overwrote the new exception's own trace with the inner one, hiding where the parent-frame restoration actually failed. Drop the override and let the inner error remain accessible via Exception#cause if needed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Update existing get_serialized_dom CORS specs to stub driver.find_element(css: 'iframe[data-percy-element-id="..."]') instead of find_elements(:tag_name, 'iframe'), matching the new percyElementId lookup path. - Add unit tests for find_iframe_by_percy_id covering nil/empty input, the attribute selector wire format, and the NoSuchElementError swallow. - Regression test that "blank" no longer matches arbitrary src prefixes (e.g. blanket.com, blank-canvas://); about:blank stays covered. - Test that capture_cors_iframes does not call find_elements on the iframe collection and routes through find_element by percyElementId. - Confirm PercyContextLost keeps its own backtrace pointing at the restoration failure rather than the inner error. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The earlier removal of the Metrics/BlockNesting disable markers left the parent_frame call and the ensuring `end` at column 20 / 18 rather than inside the proper begin/rescue block. Functionally correct (rubocop's inherited percy-style config didn't flag it), but reads as broken. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CSS-escape backslashes and double-quotes in the percyElementId before interpolating into the iframe[data-percy-element-id="..."] selector. Anything weirder than that still falls through to the existing rescue and returns nil. Adds two specs covering the escape and the graceful-nil path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds two regression specs:
* process_frame_tree resolves nested child iframes via
find_element(css: 'iframe[data-percy-element-id="..."]'), not via
a positional index into find_elements(:tag_name, 'iframe'). This
pins the index-alignment fix at the child level.
* Percy::PercyContextLost raised then rescued exposes a backtrace
pointing at the raise site (non-nil, non-empty, top frame matches
the raise line) so deeper diagnostics aren't lost.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Chained gsubs ('\\' first, then '"') were semantically correct but
matched CodeQL's incomplete-string-escaping pattern (the second
gsub on its own doesn't escape backslashes). Collapse to one
gsub(/[\\"]/) { |c| "\\#{c}" } so the escape is provably complete
in a single pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…-shadow-dom # Conflicts: # lib/percy.rb # package.json
pranavz28
left a comment
There was a problem hiding this comment.
Cross-SDK review across the 6 ports of this feature (Nightwatch #869 canonical; WDIO #1441; Protractor #708; .NET #436; Ruby #32; Python #136 shadow-DOM-only). This Ruby port has the largest set of behaviour divergences from the canonical Nightwatch implementation — flagged inline.
Cross-SDK consistency matrix (selected)
| Invariant | Nightwatch | Protractor | WDIO | .NET | Ruby | Python |
|---|---|---|---|---|---|---|
| Case-insensitive src match | ✅ .toLowerCase() |
✅ | ✅ | ✅ OrdinalIgnoreCase |
❌ raw start_with? |
N/A |
Prefix list includes file: |
✅ | ✅ | ❌ | ❌ | ❌ | N/A |
Prefix list includes view-source: / devtools: / edge: / opera: |
✅ | ✅ | ✅ | ❌ | ❌ | N/A |
about: used as prefix (not exact match) |
✅ | ✅ | ✅ | ✅ | ❌ about:blank/about:srcdoc only |
N/A |
| DEFAULT_MAX_FRAME_DEPTH | 10 | 10 | 10 | 5 | 5 | N/A |
| HARD cap | 25 | 25 | 25 | 10 | 50 | N/A |
percyContextLost raised at depth=1 |
✅ | ✅ | ❌ | ❌ | ❌ | N/A |
Verified-pass items
find_iframe_by_percy_idescapes\and"inpercyElementId— this is the reference implementation other SDKs should match.- Cycle detection seeds both attribute
srcAND post-switchframe_url— stronger than WDIO #1441's port. - Helper inlining (no
_iframe_shim.rb) is correct for Ruby. - Test coverage is solid: cycle detection, partial-capture handoff, CDP failure modes, CSS escape regression, no
# coverage:ignorepatterns. expose_closed_shadow_rootsre-primed in the responsive snapshot loop (the Python port #136 missed this).
Additional finding (no precise line anchor)
Closed-shadow walker skips ALL iframes — _walk_nodes returns at contentDocument. Closed shadow roots inside same-origin iframes are NOT captured. The selenium-dotnet #436 port fixed exactly this case on CE re-review (recurse into same-origin frames via GetOrigin comparison). Same fix should land here.
Verdict
Request changes. Inline blockers (prefix list + case sensitivity, depth defaults, percyContextLost depth gate). These are the kind of "snapshots almost-right" divergences the coordinated rollout is trying to avoid.
| # Iframe src prefixes / sentinels we never attempt to switch into -- these | ||
| # represent either browser-internal documents, non-HTTP URI schemes, or | ||
| # placeholder values that have no meaningful CORS content to capture. | ||
| UNSUPPORTED_IFRAME_SRCS = %w[ |
There was a problem hiding this comment.
HIGH — UNSUPPORTED_IFRAME_SRCS is missing prefixes that every sibling SDK includes, and uses about:blank/about:srcdoc as exact strings instead of the about: prefix used by the others.
Canonical list from Nightwatch #869 + Protractor #708:
about:, chrome:, chrome-extension:, devtools:, edge:, opera:,
view-source:, data:, javascript:, blob:, vbscript:, file:
Ruby is missing file:, view-source:, devtools:, edge:, opera:, and uses about:blank/about:srcdoc (exact) instead of about: (prefix). User-visible result: Ruby will attempt to switch into file://, view-source:, about:newtab, Edge/Opera internal URLs that every other SDK rejects.
Fix (with the case-sensitivity fix called out separately on line 545):
UNSUPPORTED_IFRAME_SRCS = %w[
about: chrome: chrome-extension: devtools: edge: opera: view-source:
data: javascript: blob: vbscript: file:
].freezeAdd spec rows for file:///x, view-source:https://x, about:newtab so the divergence doesn't return.
| def self.is_unsupported_iframe_src?(src) | ||
| return true if src.nil? || src.to_s.empty? | ||
|
|
||
| UNSUPPORTED_IFRAME_SRCS.any? { |prefix| src == prefix || src.start_with?(prefix) } |
There was a problem hiding this comment.
HIGH — case-sensitive prefix check diverges from every sibling SDK.
UNSUPPORTED_IFRAME_SRCS.any? { |prefix| src == prefix || src.start_with?(prefix) }All sibling SDKs lowercase first:
- JS:
String(src).toLowerCase() - C#:
StringComparison.OrdinalIgnoreCase - Ruby (here): nothing
JavaScript:alert(1), ABOUT:BLANK, FILE:///etc/passwd (mixed-case) slip past Ruby's filter. Fix:
def self.is_unsupported_iframe_src?(src)
return true if src.nil? || src.empty?
s = src.to_s.downcase
UNSUPPORTED_IFRAME_SRCS.any? { |p| s.start_with?(p) }
endDrop the src == prefix exact-match branch once prefixes end in : (every entry already does).
| module Percy | ||
| # Maximum nesting depth for cross-origin iframe recursion. Bounds the cost | ||
| # of pathological pages and prevents runaway recursion on cyclic frame trees. | ||
| DEFAULT_MAX_FRAME_DEPTH = 5 |
There was a problem hiding this comment.
MEDIUM — DEFAULT_MAX_FRAME_DEPTH = 5 diverges from the canonical default of 10.
Nightwatch / WDIO / Protractor JS shims default to 10. .NET also uses 5 (separate finding on PR #436). Selenium-Ruby and .NET appear to be on different defaults from the JS ecosystem.
Customer-visible: same fixture run in Ruby will capture half as many nested-CORS-iframe levels as the same fixture run in Node. Pick one default in upstream sdk-utils (recommended: 10) and align.
Also the hard cap is 50 (see clamp_frame_depth later in the file) while siblings cap at 25 — second divergence on the same axis. Aligning both to 10 / 25 brings Ruby in line with 3 of 4 JS SDKs.
| rescue StandardError | ||
| nil | ||
| end | ||
| if depth > 1 |
There was a problem hiding this comment.
HIGH — PercyContextLost only raised when depth > 1 — diverges from canonical Nightwatch #869.
Nightwatch + Protractor raise regardless of depth. Their argument: when parent_frame fails and the SDK falls back to default_content, the OUTER capture_cors_iframes loop is still iterating sibling iframe references that were resolved against the previous parent context. Silently continuing at depth=1 means the second top-level CORS iframe's serialized content can be attached to the FIRST iframe's percyElementId — a silent cross-attachment bug that's very hard to root-cause from a user report.
WDIO #1441 and .NET #436 carry the same depth>1 gate; they should be aligned together. Suggested fix:
# Always raise on parent restore failure; outer loop catches and merges partial_capture.
# At depth=1, sibling enumeration was resolved against the now-stale parent — same
# correctness gap as deeper failures.
raise PercyContextLost.new(partial_capture: collected)Add a regression test: 3 top-level CORS iframes + force parent_frame to fail on the first — assert only the first iframe's content appears, no cross-attachment.
- expand UNSUPPORTED_IFRAME_SRCS to the canonical 12-entry prefix list (add file:, view-source:, devtools:, edge:, opera:; about: as prefix) - match iframe src case-insensitively - DEFAULT_MAX_FRAME_DEPTH 5->10, hard cap 50->25 (HARD_MAX_FRAME_DEPTH) - raise PercyContextLost regardless of depth (fixes depth=1 cross-attachment) - add specs for new prefixes, mixed-case, and depth=1 PercyContextLost Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The PER-7348 readiness gate (commit 9a71cdb) made get_serialized_dom run PercyDOM.waitForReady via driver.execute_async_script before serialize, but the pre-existing CORS-iframe specs in the .get_serialized_dom block were never updated to stub it -- so every one raised "received unexpected message :execute_async_script" on the shared driver double. Two readiness specs also called get_serialized_dom with `readiness:` as a bare keyword, which Ruby rejects against the (driver, options, percy_dom_script:) signature with "wrong number of arguments (given 1, expected 2)". - add a harmless default execute_async_script no-op stub in the block's before(:each); readiness-specific tests still override it with their own expectations - pass readiness config as a positional options hash, not a keyword arg No test assertions are weakened or skipped. This takes the unit suite in this block from 14 failures to 0; the only remaining failures are the two live-browser / live-server integration tests (real Firefox + Percy CLI on :5338), which are environmental and fail identically on origin/main. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
🤖 stack:pr-review — Automated review (BrowserStack AI harness)Summary: Adds cross-origin (CORS) nested-iframe capture (depth-capped, cycle-guarded),
Findings (fixed during review): the already-merged PER-7348 readiness gate ( Verification: Commits pushed: Overall: ✅ Pass — Approve. No human-decision blockers for the code; only the env-dependent live-CLI/browser tests remain red (pre-existing on
|
…-shadow-dom Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Summary
Brings percy-selenium-ruby to parity with the canonical Percy CORS iframe + closed shadow DOM feature set.
Implemented
data-percy-ignoreattribute opt-outignoreIframeSelectorsoptionis_unsupported_iframe_src?PercyContextLostrecovery mergespartial_captureexpose_closed_shadow_roots)Skipped
Reference
Mirrored from percy/percy-capybara branch PER-7292-add-cross-origin-iframe-support and percy/percy-nightwatch#869.
Test plan
🤖 Generated with Claude Code via /percy-sdk-sync