feat(tui): render inline images for ReadMediaFile tool results #492

Open: linxinhong wants to merge 1 commit into MoonshotAI:main from linxinhong:feat/tui-readmediafile-inline-image
@linxinhong

## Problem

ReadMediaFile tool results containing an `image_url` ContentPart are currently rendered as plain-text metadata, e.g. `(image · 45.2 KB)`, in the TUI transcript. Users cannot visually inspect images without switching to the vis web interface or running an external viewer.

 ## What changed

Enhanced the `readMediaSummary` renderer in `media.ts` to detect `image_url` data URLs and, on terminals that support the Kitty or iTerm2 inline graphics protocol, render the actual image using pi-tui's `Image` component.

 - Parse and preserve the base64 payload from `image_url` data URLs.
 - Extract original pixel dimensions from ReadMediaFile's `original size NxMpx` text when available.
 - Use `getCapabilities()` to gate image rendering: only active on `kitty` / `iterm2` terminals.
 - Fall back to the existing text summary on unsupported terminals.
 - Image display is capped at 12 rows × 60 columns to avoid monopolizing the viewport.
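
The gating described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `ImageProtocol`, `parseImageDataUrl`, and `chooseRenderMode` are hypothetical names, and the protocol value is assumed to have been derived from `getCapabilities()` by the caller.

```typescript
// Hypothetical shapes for illustration only; the real types live in the PR and in pi-tui.
type ImageProtocol = "kitty" | "iterm2" | null;

interface ParsedDataUrl {
  mimeType: string;
  base64: string;
}

// Split a data URL such as `data:image/png;base64,AAAA` into its MIME type and payload.
// Returns null for anything that is not a base64-encoded image data URL.
function parseImageDataUrl(url: string): ParsedDataUrl | null {
  const match = /^data:(image\/[a-z0-9.+-]+);base64,([A-Za-z0-9+/=]+)$/i.exec(url);
  if (match === null) return null;
  const mimeType = match[1];
  const base64 = match[2];
  if (mimeType === undefined || base64 === undefined) return null;
  return { mimeType, base64 };
}

// Render inline only when the terminal speaks a known graphics protocol
// and the URL carries a usable payload; otherwise keep the text summary.
function chooseRenderMode(url: string, protocol: ImageProtocol): "inline" | "text" {
  if (protocol !== "kitty" && protocol !== "iterm2") return "text";
  return parseImageDataUrl(url) === null ? "text" : "inline";
}
```

Keeping the decision in a pure function like this makes the fallback path easy to unit-test without a real terminal.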

 ## Checklist

 - [x] I have read the CONTRIBUTING document.
 - [x] Ran `pnpm typecheck` — passes.
 - [x] Changeset included.

@changeset-bot

changeset-bot Bot commented Jun 5, 2026

🦋 Changeset detected

Latest commit: 38142f3

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package:

| Name | Type |
| --- | --- |
| @moonshot-ai/kimi-code | Patch |



@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 38142f3346


};
const dims = parseOriginalSize(summary.originalSize);
const image = new Image(
summary.base64,

P2: Avoid sending huge images to the terminal

On a supported Kitty/iTerm2 terminal, expanding a large ReadMediaFile image passes the full data-URL payload directly into the inline image component, even though ReadMediaFile permits files up to 100 MB (`packages/agent-core/src/tools/builtin/file/read-media.ts`). The `maxHeightCells`/`maxWidthCells` options cap the displayed cell size but do not reduce the base64 payload being emitted, so expanding a large screenshot or photo can push tens or hundreds of MB of escape-sequence data through the TUI and make it appear hung. Gate inline rendering by `summary.bytes`, or generate a smaller thumbnail before constructing `Image`.
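
One way to apply the size gate suggested here is sketched below. It is an assumption-laden illustration: `MAX_INLINE_BYTES` is an arbitrary hypothetical threshold, `decodedByteLength` and `shouldRenderInline` are made-up helper names, and the size is estimated from the base64 string itself rather than read from `summary.bytes`.

```typescript
// Hypothetical threshold; the right value depends on the terminal and transport.
const MAX_INLINE_BYTES = 2 * 1024 * 1024;

// Estimate the decoded size of a standard, padded base64 string:
// every 4 characters encode 3 bytes, minus one byte per trailing "=".
function decodedByteLength(base64: string): number {
  const padding = base64.endsWith("==") ? 2 : base64.endsWith("=") ? 1 : 0;
  return Math.floor((base64.length * 3) / 4) - padding;
}

// Decide whether the payload is small enough to emit as inline-image escape data.
// Callers would fall back to the text summary when this returns false.
function shouldRenderInline(base64: string, maxBytes: number = MAX_INLINE_BYTES): boolean {
  return decodedByteLength(base64) <= maxBytes;
}
```

A gate like this only avoids the pathological case; generating a thumbnail, as the review also suggests, would additionally let large images be previewed.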


const theme: ImageTheme = {
fallbackColor: (s: string) => chalk.hex(ctx.colors.textDim)(s),
};
const dims = parseOriginalSize(summary.originalSize);

P2: Parse the dimensions format emitted by ReadMediaFile

For real ReadMediaFile results with known image dimensions, core emits the leading system text as `Original dimensions: WxH pixels.` (`packages/agent-core/src/tools/builtin/file/read-media.ts`), but this new inline-image path only looks for the older `WxHpx` summary format before passing dimensions to `Image`. As a result, the renderer almost always constructs the image without the original pixel size, so terminals cannot reliably reserve space for the image or scale it with the intended aspect ratio. Accept the current `Original dimensions` wording, or source the dimensions from the actual output format before rendering.
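
A parser that accepts both wordings might look like the sketch below. It is not the PR's `parseOriginalSize`: `parseDimensions` and `PixelSize` are hypothetical names, and the two patterns are inferred only from the formats quoted in this thread (`Original dimensions: WxH pixels.` and `original size NxMpx`).

```typescript
// Hypothetical result shape for illustration.
interface PixelSize {
  widthPx: number;
  heightPx: number;
}

// Try the current wording first, then the older summary format.
// Returns null when neither format is present or the numbers are not positive.
function parseDimensions(text: string): PixelSize | null {
  const patterns: RegExp[] = [
    /original dimensions:\s*(\d+)\s*x\s*(\d+)\s*pixels/i,
    /original size\s*(\d+)\s*x\s*(\d+)\s*px/i,
  ];
  for (const pattern of patterns) {
    const match = pattern.exec(text);
    if (match === null) continue;
    const widthPx = Number(match[1]);
    const heightPx = Number(match[2]);
    if (widthPx > 0 && heightPx > 0) return { widthPx, heightPx };
  }
  return null;
}
```

Sourcing the dimensions from structured output instead of free text, where available, would avoid this kind of format drift altogether.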
