Change caching rules #3

Open: qrdl wants to merge 1 commit into main from fix-caching

qrdl commented Apr 29, 2026

Fix caching for HTML->Markdown template

Copilot AI left a comment

Pull request overview

This PR updates the html2md proxy-wasm filter’s caching and conversion signaling so HTML responses can be safely cached in both original and converted (HTML→Markdown) variants.

Changes:

  • Normalize conversion signaling via a FLAG_MARKDOWN constant and set the Convert request header based on the request's Accept header.
  • Apply Vary: Convert only for HTML responses (and merge with any existing Vary value).
  • Change conversion error handling to fall back to the origin response instead of sending a 500.
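
The Vary merge described in the second bullet can be sketched roughly as follows. This is a minimal illustration and not the code from the PR: `merge_vary` is a hypothetical helper name, and the handling of `Vary: *` and of duplicate entries is an assumption about what a careful merge would do.

```rust
/// Merge `token` into an existing `Vary` header value without duplicating it.
/// Hypothetical helper for illustration; not taken from the PR.
fn merge_vary(existing: Option<&str>, token: &str) -> String {
    let current = existing.map(str::trim).unwrap_or("");
    if current.is_empty() {
        // No usable Vary value yet, so the token becomes the whole value.
        return token.to_string();
    }
    // `Vary: *` already covers every request header, so leave it alone.
    // Header field names are case-insensitive, so compare accordingly.
    let already_listed = current
        .split(',')
        .map(str::trim)
        .any(|name| name == "*" || name.eq_ignore_ascii_case(token));
    if already_listed {
        current.to_string()
    } else {
        format!("{}, {}", current, token)
    }
}
```

With an origin value of `Accept-Encoding`, the helper yields `Accept-Encoding, Convert`; with no origin value it yields `Convert`.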

Comment thread html2md/src/lib.rs
Comment on lines +61 to +62
} else {
self.add_http_request_header(CONVERT_FLAG, "");

Copilot AI Apr 29, 2026

on_http_request_headers now adds a Convert request header with an empty value for non-Markdown requests. Since the response already emits Vary: Convert for HTML, caches will still distinguish the Markdown variant even if the header is absent, so forwarding an always-present custom header to the origin may be unnecessary. Consider omitting the header entirely when not converting (keep it removed) unless an upstream component explicitly depends on Convert: "".

Suggested change
} else {
self.add_http_request_header(CONVERT_FLAG, "");
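
One way to express the suggestion is to compute an optional header value and add the Convert header only when a value is present. The sketch below is a guess at the shape, not the PR's code: the `"markdown"` value for `FLAG_MARKDOWN`, the `text/markdown` media type, and both function names are assumptions, and q-values in Accept are deliberately ignored.

```rust
/// Hypothetical stand-in for the PR's FLAG_MARKDOWN constant; the real value is not shown.
const FLAG_MARKDOWN: &str = "markdown";

/// Returns true when the Accept header lists text/markdown (assumed media type).
/// Simplification: q-values, including q=0, are ignored.
fn accepts_markdown(accept: &str) -> bool {
    accept
        .split(',')
        // Drop parameters such as ";q=0.9" and surrounding whitespace.
        .map(|item| item.split(';').next().unwrap_or("").trim())
        .any(|media_type| media_type.eq_ignore_ascii_case("text/markdown"))
}

/// Value for the Convert request header, or None when the header should be left off.
fn convert_header_value(accept: Option<&str>) -> Option<&'static str> {
    match accept {
        Some(value) if accepts_markdown(value) => Some(FLAG_MARKDOWN),
        _ => None,
    }
}
```

In the filter, a `None` result would mean removing any client-supplied Convert header and adding nothing, so the origin sees no custom header on ordinary HTML requests.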

Comment thread html2md/src/lib.rs
Comment on lines 119 to 123
  Err(e) => {
-     println!(
-         "cannot convert body to string {} for {}",
-         e,
-         self.get_path()
-     );
-     self.send_http_response(
-         SERVER_ERROR,
-         vec![],
-         Some(b"Origin response is not valid UTF-8"),
-     );
-     return Action::Pause;
+     println!("cannot convert body to string {} for {}", e, self.get_path());
+     return Action::Continue; // if body is not valid UTF-8, skip conversion and return original body
  }
  };

Copilot AI Apr 29, 2026

When UTF-8 decoding fails you return Action::Continue, but at this point (when response.md is set) the response headers were already rewritten in on_http_response_headers (e.g., Content-Type: text/markdown, removed Content-Length, set Transfer-Encoding: chunked). This will forward the original (non-Markdown) body with Markdown headers, which is incorrect. Consider deferring header mutations until after a successful conversion, or explicitly restoring the original Content-Type/encoding-related headers and clearing response.md before continuing.
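
The "defer header mutations" option might look like the sketch below, which models the response as a plain struct instead of using the proxy-wasm API. Everything here is illustrative: `Response`, `finalize`, and the injected `convert` closure are hypothetical, and how the deferral maps onto the filter's header and body callbacks (for example, whether headers must be held back until the body has been seen) is left open.

```rust
/// Simplified response model; the real filter works through proxy-wasm callbacks.
#[derive(Debug, Clone, PartialEq)]
struct Response {
    headers: Vec<(String, String)>,
    body: Vec<u8>,
}

/// Convert the body first and rewrite headers only if conversion succeeded.
/// `convert` is a hypothetical stand-in for the HTML-to-Markdown step.
fn finalize<F>(origin: Response, convert: F) -> Response
where
    F: Fn(&str) -> Result<String, String>,
{
    // Run both fallible steps before touching any header.
    let converted = match std::str::from_utf8(&origin.body) {
        Ok(html) => convert(html),
        Err(e) => Err(e.to_string()),
    };
    let markdown = match converted {
        Ok(md) => md,
        // Fallback: headers and body both stay exactly as the origin sent them.
        Err(_) => return origin,
    };

    // Only a successful conversion reaches the header rewrite.
    let mut headers: Vec<(String, String)> = origin
        .headers
        .into_iter()
        .filter(|(name, _)| {
            !name.eq_ignore_ascii_case("content-type")
                && !name.eq_ignore_ascii_case("content-length")
        })
        .collect();
    headers.push(("Content-Type".to_string(), "text/markdown".to_string()));
    Response { headers, body: markdown.into_bytes() }
}
```

Because the fallback path returns the origin response unchanged, a cache would store the HTML variant with HTML headers instead of an HTML body labeled as Markdown.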

Comment thread html2md/src/lib.rs
Comment on lines 128 to 132
  Ok(md) => md,
  Err(e) => {
-     println!(
-         "cannot convert HTML to Markdown: {} for {}",
-         e,
-         self.get_path()
-     );
-     self.send_http_response(
-         SERVER_ERROR,
-         vec![],
-         Some(b"Failed to convert HTML to Markdown"),
-     );
-     return Action::Pause;
+     println!("cannot convert HTML to Markdown: {} for {}", e, self.get_path());
+     return Action::Continue; // if conversion fails, skip and return original body
  }

Copilot AI Apr 29, 2026

On conversion failure you now return Action::Continue, but the response headers may have already been rewritten to Markdown in on_http_response_headers. This can result in serving the original HTML body with Content-Type: text/markdown and chunked transfer encoding, and also caching the wrong variant. If the intent is to fall back to the origin body, the filter should keep the original headers (or revert them) when conversion fails.
