Fix multipart filename quoting #279
Signed-off-by: Jeet Dekivadia <jeet.university@gmail.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c8ecfd8681
#
# @return [String]
private def escape_multipart_filename(filename)
  filename.to_s.gsub(/["\\]/) { "\\#{_1}" }.delete("\r\n")
Preserve compatible encodings in multipart streams
When a filename contains non-ASCII characters and the uploaded content is binary, this emits a UTF-8 header chunk followed by an ASCII-8BIT file chunk; the Net requester wraps the enumerator in ReadIOAdapter, whose buffer concatenation then raises Encoding::CompatibilityError before the request is sent. This regresses uploads such as OpenAI::FilePart.new(binary_io, filename: "é.png"), which previously kept the filename ASCII via encode_path; please keep multipart chunks in a consistent binary/ascii-compatible encoding or otherwise avoid mixing UTF-8 header chunks with binary body chunks.
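The failure described above can be reproduced outside the SDK. The following is a minimal sketch, assuming only core Ruby string semantics: `try_concat` is a hypothetical helper that stands in for the buffer concatenation inside ReadIOAdapter, and the header text is illustrative rather than the exact bytes the library emits.

```ruby
# Hypothetical stand-in for a buffer that appends chunks as they arrive.
# Returns :ok when the append succeeds, :incompatible when Ruby refuses
# to mix the two encodings.
def try_concat(buffer, chunk)
  buffer << chunk
  :ok
rescue Encoding::CompatibilityError
  :incompatible
end

# Header chunk with a non-ASCII filename, tagged as UTF-8 (the default
# encoding of string literals in a UTF-8 source file).
header = "Content-Disposition: form-data; name=\"file\"; filename=\"é.png\"\r\n\r\n"

# Body chunk of raw bytes containing a non-ASCII byte, tagged as binary.
body = "\x89PNG\r\n".b

# Mixing a UTF-8 header that holds non-ASCII characters with a binary body
# that holds non-ASCII bytes is rejected.
puts try_concat(header.dup, body)

# Re-tagging the header as binary (same bytes, different encoding label)
# makes the append succeed.
puts try_concat(header.b, body)
```

The error only appears when both sides contain non-ASCII content, which is why ASCII-only filenames did not trigger it.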
Addressed in 3aa13fe. Escaped multipart filenames are now emitted as ASCII-8BIT, and the regression covers a non-ASCII filename followed by binary content through ReadIOAdapter.
Signed-off-by: Jeet Dekivadia <jeet.university@gmail.com>
Quick follow-up for review context. The second commit keeps multipart filename header chunks binary-compatible with file body writes, while still serializing filename parameters as quoted strings, escaping quote and backslash delimiters, and removing CR/LF before header emission. Validation from preparation: git diff --check; bundle install; ./scripts/lint; ./scripts/test; plus small Ruby probes for quote escaping, CR/LF removal, and ASCII-8BIT concatenation with binary content. I kept the scope limited to multipart header serialization and focused regressions.
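The behavior described in this follow-up can be approximated with a small probe. This is a sketch under stated assumptions, not the code in the commit: `escape_filename_binary` and `binary_part_header` are hypothetical names, and the header layout is illustrative.

```ruby
# Hypothetical helper, not the SDK's method: escape a filename for use inside
# a quoted-string parameter and return it tagged as binary (ASCII-8BIT).
def escape_filename_binary(filename)
  filename.to_s
          .gsub(/["\\]/) { |char| "\\#{char}" } # backslash-escape " and \
          .delete("\r\n")                        # drop every CR and LF
          .b                                     # binary copy, same bytes
end

# Hypothetical helper: build the part header as a single binary chunk.
def binary_part_header(filename)
  header = "Content-Disposition: form-data; name=\"file\"; filename=\"".b
  header << escape_filename_binary(filename) << "\"\r\n\r\n".b
end

# Header chunk followed by raw file bytes, appended to one buffer.
buffer = "".b
buffer << binary_part_header("é \"x\".png") << "\x89PNG\r\n".b
puts buffer.encoding
```

Because every chunk carries the same binary tag, the append order no longer matters, and the filename bytes reach the wire unchanged.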
Summary
- `filename=` values as quoted-strings instead of URI path segments

Why
Multipart `filename=` parameters are quoted-string values. URL-encoding them changes ordinary filenames such as `d o g` into `d%20o%20g`. Writing literal values instead also needs delimiter escaping and CR/LF removal so crafted filenames cannot inject multipart headers.

Validation
- `git diff --check`

Remote Linux validation
- `git diff --check`
- `bundle install`
- `./scripts/lint`
- `./scripts/test`
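The two problems named under Why can be seen in a short probe. This is only an illustration: `ERB::Util.url_encode` from the standard library stands in for the SDK's previous path encoding, and `naive_disposition` is a hypothetical helper that interpolates the filename with no escaping at all.

```ruby
require "erb"

# Percent-encoding rewrites an ordinary filename that contains spaces.
puts ERB::Util.url_encode("d o g")

# Hypothetical helper: writes the filename literally with no escaping,
# which is the unsafe middle step between URL-encoding and the fix.
def naive_disposition(filename)
  %(Content-Disposition: form-data; name="file"; filename="#{filename}"\r\n)
end

# An ordinary filename stays on one header line.
puts naive_disposition("d o g").lines.length

# A crafted filename closes the quoted-string early and uses CR/LF to start
# a new header line, so the single header becomes two lines.
crafted = "a.txt\"\r\nX-Injected: 1"
puts naive_disposition(crafted).lines.length
```

Escaping the quote and backslash delimiters and removing CR/LF, as the PR does, keeps the value inside one quoted-string on one header line.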