Skip to content

fix(stripComments): stop leaving empty lines when entire line is a comment#1500

Open
maximilianfalco wants to merge 6 commits into
nextfrom
falco/cx-3116-enabling-sanitize-comments-breaks-html-table-rendering-when
Open

fix(stripComments): stop leaving empty lines when entire line is a comment#1500
maximilianfalco wants to merge 6 commits into
nextfrom
falco/cx-3116-enabling-sanitize-comments-breaks-html-table-rendering-when

Conversation

@maximilianfalco
Copy link
Copy Markdown
Contributor

@maximilianfalco maximilianfalco commented Jun 1, 2026

🎫 Resolve CX-3116

🎯 What does this PR do?

There is actually 2 issues that arose from this ticket

  1. whenever stripComments sees an entire line of html comments, it strips it by replacing it with a "", leaving an empty line that may break the component into 2 parts: fixed in 026c279
  2. the table tokenizer has a fallback for <table>, this tokenizer doesnt know how to handle a stray empty line in the middle of a <table> component: fixed in 421762c

for the original customer site in the ticket, the issue was a mix of both. The stripComments created an empty line in a fallback <table> and it didnt know how to handle it, causing the original error

Note

i also reworked the demo app for the strip comments mode, now we can show multiple markdown outputs just like in the rendered mode. each output for each selected engine

🧪 QA tips

<table>
    <thead>
        <tr>
            <th>Hello</th>
            <th>Import something</th>
        </tr>
    </thead>
    <tbody>
        <tr colspan="2"><td>Something</td></tr>
        <tr>
            <td>Why hello there</td>
            <td>Something something</td>
        </tr>
        <!-- <tr colspan="2"><td>Manager</td></tr> -->

        <tr>
            <td><a href="foo">Cat and Dog</a></td>
            <td>
                <code>{property activation resource name}</code>
            </td>
        </tr>
    </tbody>
</table>

make sure that piece of code survives and is rendered correctly in the demo app and the monorepo (both admin and end user view)

📸 Screenshot or Loom

Issue 1

Before After
Screenshot 2026-06-01 at 12 42 47 Screenshot 2026-06-01 at 14 29 20

The new improvement to the stripComments still persist intentional empty lines:
Screenshot 2026-06-01 at 14 30 38

Issue 2

Before After
Screenshot 2026-06-01 at 13 19 23 Screenshot 2026-06-01 at 13 25 38

Comment on lines +20 to +22
const newValue = node.value
.replace(WHOLE_LINE_HTML_COMMENT_REGEX, '')
.replace(HTML_COMMENT_REGEX, '')
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

im thinking of ignoring this tbh, this seems very unlikely and very much unnecessary.

CodeQL's concern is that replace(HTML_COMMENT_REGEX, '') is a single pass. A crafted input can leave behind <!-- after one pass:

  input:   <<!---->!-->
                   ↑ regex matches <!---->  (positions 1-7)
  output:  <!-->   ← starts with <!--

The regex matched (the inner complete comment), removed it, and the surrounding bits collapsed into <!-->. The leftover string still contains <!-- as a substring. A downstream HTML parser could read that as the start of a new comment.

I feel like this is fine since stripComments is gated by mdxSanitizeComments most of the time anyway in the monorepo, which exists to hide internal notes from end users on pages the author themselves wrote. The author already controls the source. If they wanted to leak content to end users they would probably just write that and dont need these extra steps

As far as Im seeing this, there's no untrusted-input boundary being crossed so I think this is fine and it saves us from these complexities.... wdyt @eaglethrost @kevinports

Comment thread __tests__/lib/stripComments.test.ts Outdated
Comment thread __tests__/parsers/tables.test.ts
Comment thread example/Doc.tsx
* Collapses any run of blank lines (empty or whitespace-only) to a single
* newline so the CommonMark type-6 block isn't terminated mid-table.
*/
export const collapseBlankLines = (value: string): string =>
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do any blank lines between html elements in a <table> cause component splitting? Or only in some cases?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from my understanding yes, iirc the micromark html block states that an empty line causes it to terminate, which is the cause of the table split

Comment thread processor/transform/stripComments.ts
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants