Skip to content

Don't wait for file-based MCPs we never attempt to start#10969

Merged
seemeroland merged 8 commits into
masterfrom
roland/file-based-mcp-wait-fix
May 15, 2026
Merged

Don't wait for file-based MCPs we never attempt to start#10969
seemeroland merged 8 commits into
masterfrom
roland/file-based-mcp-wait-fix

Conversation

@seemeroland
Copy link
Copy Markdown
Contributor

@seemeroland seemeroland commented May 14, 2026

Description

In an oz run, we were timing out after waiting 60s for discovered file-based MCP servers to reach a terminal status.

We intentionally do not start project-scoped file-based MCP servers for security reasons in this PR

But the agent runner was still discovering these and waiting for them to reach a terminal status, even though we never attempt to start them

Now CloudEnvMcpScanComplete contains all detected file-based mcp servers, but also notes which we will actually attempt to auto-start. We only wait for those. Did some refactoring to make clear we don't auto-start every file-based mcp server we detect.

Also add some logging and reduce the timeout to 20s.

@cla-bot cla-bot Bot added the cla-signed label May 14, 2026
@seemeroland seemeroland marked this pull request as ready for review May 14, 2026 23:47
@oz-for-oss
Copy link
Copy Markdown
Contributor

oz-for-oss Bot commented May 14, 2026

@seemeroland

I'm starting a first review of this pull request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Copy link
Copy Markdown
Contributor

@oz-for-oss oz-for-oss Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overview

This PR changes cloud-environment file-based MCP discovery so the agent runner waits only for servers that were actually auto-start requested, while still emitting metadata for all detected servers. It also reduces the readiness timeout and adds diagnostic logging around pending MCP startup state.

Concerns

No blocking correctness, security, or error-handling concerns found in the reviewed diff.

Verdict

Found: 0 critical, 0 important, 0 suggestions

Approve

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

@seemeroland seemeroland requested a review from vkodithala May 15, 2026 01:01
Comment on lines +1681 to +1699
let mut new_installations = Vec::new();
for installation in installations {
let uuid = installation.uuid();
let server_name = &installation.templatable_mcp_server().name;
if self.active_servers.contains_key(&uuid) {
log::info!(
"Skipping file-based MCP server '{server_name}' ({uuid}); already running"
);
continue;
}
if self.spawned_servers.contains_key(&uuid) {
log::info!(
"Skipping file-based MCP server '{server_name}' ({uuid}); already starting"
);
continue;
}
log::info!("Spawning file-based MCP server '{server_name}' ({uuid})");
new_installations.push(installation.clone());
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a nice improvement! Thanks for making this more verbose.

Copy link
Copy Markdown
Contributor

@vkodithala vkodithala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Left a few comments, nothing blocking. Thanks for the fix!

Comment on lines +28 to +31
/// UUIDs that were actually auto-start requested while parsing each `(root, provider)`.
/// They are temporarily stored here and removed to emit FileBasedMCPManagerEvent::CloudEnvMcpScanComplete
pending_scan_auto_started_servers_by_root:
HashMap<PathBuf, HashMap<MCPProvider, HashSet<Uuid>>>,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This is spaced weirdly. Should the HashMap declaration be inline?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cargo fmt wants it separated like this :/

Comment thread app/src/ai/mcp/file_based_manager.rs Outdated
Comment on lines +325 to +327
log::info!(
"Auto-spawning file-based MCP server '{server_name}' ({installation_uuid})"
);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This might get noisy, especially for local logs. I think we already have sufficient signal from logs you included in driver.rs to enumerate which servers were auto-started, is there any functional reason to keep this? Or maybe just make this a log::debug?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we autospawning servers that often? I thought it would be useful to know locally whenever mcp servers are being spawned, since we don't seem to have that now

Comment thread app/src/ai/mcp/file_based_manager.rs
@seemeroland seemeroland merged commit 8ce7d14 into master May 15, 2026
25 checks passed
@seemeroland seemeroland deleted the roland/file-based-mcp-wait-fix branch May 15, 2026 19:14
lawsmd pushed a commit to lawsmd/cortex that referenced this pull request May 22, 2026
…10969)

## Description

In an oz run, we were timing out after waiting 60s for discovered
file-based MCP servers to reach a terminal status.

We intentionally do not start project-scoped file-based MCP servers for
security reasons in [this
PR](warpdotdev/warp-internal#24640)

But the agent runner was still discovering these and waiting for them to
reach a terminal status, even though we never attempt to start them

Now `CloudEnvMcpScanComplete` contains all detected file-based mcp
servers, but also notes which we will actually attempt to auto-start. We
only wait for those. Did some refactoring to make clear we don't
auto-start every file-based mcp server we detect.

Also add some logging and reduce the timeout to 20s.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants