fix(stripComments): stop leaving empty lines when entire line is a comment#1500
Conversation
| const newValue = node.value | ||
| .replace(WHOLE_LINE_HTML_COMMENT_REGEX, '') | ||
| .replace(HTML_COMMENT_REGEX, '') |
There was a problem hiding this comment.
im thinking of ignoring this tbh, this seems very unlikely and very much unnecessary.
CodeQL's concern is that replace(HTML_COMMENT_REGEX, '') is a single pass. A crafted input can leave behind <!-- after one pass:
input: <<!---->!-->
↑ regex matches <!----> (positions 1-7)
output: <!--> ← starts with <!--
The regex matched (the inner complete comment), removed it, and the surrounding bits collapsed into <!-->. The leftover string still contains <!-- as a substring. A downstream HTML parser could read that as the start of a new comment.
I feel like this is fine since stripComments is gated by mdxSanitizeComments most of the time anyway in the monorepo, which exists to hide internal notes from end users on pages the author themselves wrote. The author already controls the source. If they wanted to leak content to end users they would probably just write that and dont need these extra steps
As far as Im seeing this, there's no untrusted-input boundary being crossed so I think this is fine and it saves us from these complexities.... wdyt @eaglethrost @kevinports
| * Collapses any run of blank lines (empty or whitespace-only) to a single | ||
| * newline so the CommonMark type-6 block isn't terminated mid-table. | ||
| */ | ||
| export const collapseBlankLines = (value: string): string => |
There was a problem hiding this comment.
Do any blank lines between html elements in a <table> cause component splitting? Or only in some cases?
There was a problem hiding this comment.
from my understanding yes, iirc the micromark html block states that an empty line causes it to terminate, which is the cause of the table split
🎯 What does this PR do?
There is actually 2 issues that arose from this ticket
"", leaving an empty line that may break the component into 2 parts: fixed in 026c279<table>, this tokenizer doesnt know how to handle a stray empty line in the middle of a<table>component: fixed in 421762cfor the original customer site in the ticket, the issue was a mix of both. The stripComments created an empty line in a fallback
<table>and it didnt know how to handle it, causing the original errorNote
i also reworked the demo app for the strip comments mode, now we can show multiple markdown outputs just like in the rendered mode. each output for each selected engine
🧪 QA tips
make sure that piece of code survives and is rendered correctly in the demo app and the monorepo (both admin and end user view)
📸 Screenshot or Loom
Issue 1
The new improvement to the stripComments still persist intentional empty lines:

Issue 2