fix(ai): recover from invalid tool-call input instead of aborting the agent stream by boomyao · Pull Request #2192 · vercel/workflow

boomyao · 2026-06-01T05:19:28Z

Problem

In DurableAgent, the two kinds of tool error are handled in opposite ways inside executeTool:

execute() throws → caught and converted to an error-text tool result fed back to the model, so the agent recovers and the stream continues. The code comment even says this "aligns with AI SDK's streamText behavior for individual tool failures."
Tool-call arguments fail inputSchema validation (and no experimental_repairToolCall fixes them) → throw, which propagates out of executeTool, aborts agent.stream(), and fails the entire durable workflow run.

A model occasionally emitting a slightly-malformed tool call (an empty array where .min(1) is required, a missing required field, a wrong type, truncated-then-JSON-repaired args) is a recoverable event — the model will usually fix it if told. But today it is fatal: one bad tool call kills a long-running task, with no chance for the agent to self-correct. The only hook on this path, experimental_repairToolCall, can't help here because its returned tool call must itself pass the schema — so it can fix malformed JSON syntax but cannot express "tell the model its arguments were invalid and let it regenerate."

This is inconsistent (the framework already recovers from the harder case — execute() throwing) and looks like an oversight rather than intent.

Reproduction

import { z } from 'zod';
import { DurableAgent } from '@workflow/ai/agent';
import { MockLanguageModelV3, convertArrayToReadableStream } from 'ai/test';

const toolCall = (toolName: string, input: string) =>
  convertArrayToReadableStream<any>([
    { type: 'stream-start', warnings: [] },
    { type: 'tool-call', toolCallId: 'c1', toolName, input },
    { type: 'finish', finishReason: 'tool-calls', usage: { inputTokens: 1, outputTokens: 1, totalTokens: 2 } },
  ]);
const stop = () =>
  convertArrayToReadableStream<any>([
    { type: 'stream-start', warnings: [] },
    { type: 'text-start', id: 't' }, { type: 'text-delta', id: 't', delta: 'ok' }, { type: 'text-end', id: 't' },
    { type: 'finish', finishReason: 'stop', usage: { inputTokens: 1, outputTokens: 1, totalTokens: 2 } },
  ]);

async function run(toolName: string, inputSchema: any, execute: any, input: string) {
  let n = 0;
  // The first model call emits the tool call; every later call ends the turn with plain text.
  const model = new MockLanguageModelV3({
    doStream: async () => (++n === 1 ? { stream: toolCall(toolName, input) } : { stream: stop() }),
  });
  const agent = new DurableAgent({
    model: () => model,
    instructions: 'x',
    tools: { [toolName]: { description: 'd', inputSchema, execute } },
  });
  try {
    await agent.stream({
      messages: [{ role: 'user', content: 'go' }],
      activeTools: [toolName],
      maxSteps: 5,
      writable: new WritableStream({ write() {} }), // discard streamed chunks
      preventClose: true,
      sendFinish: false,
    });
    return 'STREAM SURVIVED';
  } catch (e) {
    return `STREAM ABORTED: ${(e as Error).message}`;
  }
}

// A: strict schema, model sends invalid args (empty string violates .min(1))
console.log(await run('strict', z.object({ x: z.string().min(1) }), () => ({ ok: true }), '{"x":""}'));
// B: permissive schema, but execute() throws
console.log(await run('thrower', z.object({}), () => { throw new Error('boom'); }, '{}'));

Before this PR:

A (schema-invalid input) → STREAM ABORTED: Invalid input for tool "strict": [ ... too_small ... ]
B (execute() throws)     → STREAM SURVIVED

A should survive too.

Fix

executeTool already funnels both malformed-JSON and the re-thrown "Invalid input for tool ..." schema-validation error through a single throw parseError at the end of the parse/validate block. This PR changes that one escape point to return the error as an error-text tool result — identical to how execute() errors are handled a few lines below — so the agent receives the error as a tool result and can correct its arguments and retry within maxSteps. experimental_repairToolCall still runs first; only the final give-up changes from throw to recover.

After this PR, both A and B print STREAM SURVIVED.
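
The shape of the change described above can be sketched as follows. This is a minimal, dependency-free illustration rather than the actual DurableAgent source: `ToolResult`, `Validator`, `Tool`, and `executeToolSketch` are hypothetical names, and the hand-written validator stands in for a real inputSchema.

```typescript
// Hypothetical shapes for illustration only; not the real @workflow/ai types.
type ToolResult =
  | { type: 'json'; value: unknown }
  | { type: 'error-text'; value: string };

type Validator<T> = (value: unknown) => { ok: true; value: T } | { ok: false; message: string };

interface Tool<T> {
  validate: Validator<T>;
  execute: (input: T) => unknown;
}

function executeToolSketch<T>(toolName: string, tool: Tool<T>, rawInput: string): ToolResult {
  let input: T;
  try {
    // Malformed JSON throws here; schema failures are re-thrown just below.
    const parsed: unknown = JSON.parse(rawInput);
    const checked = tool.validate(parsed);
    if (!checked.ok) {
      throw new Error(`Invalid input for tool "${toolName}": ${checked.message}`);
    }
    input = checked.value;
  } catch (parseError) {
    // Before: `throw parseError`, which aborted the stream.
    // After: hand the message back to the model as a tool result.
    return { type: 'error-text', value: (parseError as Error).message };
  }
  try {
    return { type: 'json', value: tool.execute(input) };
  } catch (execError) {
    // Execution errors were already handled this way.
    return { type: 'error-text', value: (execError as Error).message };
  }
}

// Stand-in for z.object({ x: z.string().min(1) }).
const strict: Tool<{ x: string }> = {
  validate: (v) => {
    const x = (v as { x?: unknown } | null)?.x;
    return typeof x === 'string' && x.length >= 1
      ? { ok: true, value: { x } }
      : { ok: false, message: 'x must be a non-empty string' };
  },
  execute: () => ({ ok: true }),
};

const invalid = executeToolSketch('strict', strict, '{"x":""}');
const valid = executeToolSketch('strict', strict, '{"x":"hi"}');
const malformed = executeToolSketch('strict', strict, '{"x":');
```

In this sketch both the schema-invalid input and the malformed JSON come back as error-text results, while the valid input reaches execute().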

Notes

Behavior change: a tool call with invalid arguments that previously rejected the stream now feeds the validation error back to the model (bounded by maxSteps), consistent with execute() errors and AI SDK streamText. Happy to gate it behind an option (e.g. onInvalidToolInput: 'feedback' | 'throw', default 'feedback') if you'd prefer to preserve the throw for some callers — let me know.
Added a regression test mirroring the existing "tool execution error → error-text" test.
Added a changeset (@workflow/ai patch).
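
If the opt-out floated in the first note were added, the gate could be as small as the sketch below. The option name `onInvalidToolInput` comes from the note itself; `handleInvalidInput` and the `ToolResult` shape are hypothetical, and nothing here is part of the merged change.

```typescript
// Hypothetical option from the PR notes; not an existing DurableAgent setting.
type OnInvalidToolInput = 'feedback' | 'throw';

type ToolResult = { type: 'error-text'; value: string };

function handleInvalidInput(
  error: Error,
  mode: OnInvalidToolInput = 'feedback',
): ToolResult {
  if (mode === 'throw') {
    // Preserve the old behavior: reject the stream.
    throw error;
  }
  // Default: feed the validation error back to the model.
  return { type: 'error-text', value: error.message };
}

const fedBack = handleInvalidInput(new Error('Invalid input for tool "strict"'));

let threw = false;
try {
  handleInvalidInput(new Error('Invalid input for tool "strict"'), 'throw');
} catch {
  threw = true;
}
```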

Verified locally: packages/ai typecheck clean, vitest run (47 tests) green, Biome clean on the changed files.

… stream

DurableAgent.executeTool threw when a tool call's arguments failed inputSchema validation (and no experimental_repairToolCall fixed it), aborting the whole agent stream, which fails the entire durable workflow run. Tool *execution* errors are already recovered (returned to the model as an error-text tool result so the agent can self-correct); this makes input parse/validation failures consistent: return the error as an error-text tool result instead of throwing, so a single occasionally-malformed model tool-call can no longer kill a long-running task. Aligns with AI SDK streamText behavior.

Signed-off-by: yao <zhangyaoruo@outlook.com>

changeset-bot · 2026-06-01T05:19:33Z

🦋 Changeset detected

Latest commit: e2266dd

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package

Name	Type
@workflow/ai	Patch


vercel · 2026-06-01T05:19:41Z

@boomyao is attempting to deploy a commit to the Vercel Labs Team on Vercel.

A member of the Team first needs to authorize it.

VaguelySerious

AI review: no blocking issues

VaguelySerious · 2026-06-29T19:12:50Z

+    // of aborting the entire stream. This aligns with AI SDK's streamText behavior
+    // for tool failures. Reaches here both for malformed JSON and for the
+    // re-thrown "Invalid input for tool ..." schema-validation error above.
+    return {


AI Review: Note

Recovering the run is the right call. One side effect worth being intentional about: because nothing throws out of executeTool for invalid input anymore, this path no longer reaches the outer catch that invokes onError, and the return happens before the recordSpan block below — so an invalid tool call now produces no ai.toolCall span and no onError callback. That's consistent with how execute() errors and AI SDK's tool-error are handled (neither surfaces via onError), but it's a real change from today, and the execute() path at least still emits a span. A caller relying on onError to observe malformed model output will go silent. Suggest emitting an ai.toolCall span here with the error recorded, so the recovered failure stays visible in traces.

Related follow-up (out of scope here): a hallucinated/unknown tool name still throws and aborts the whole stream higher up in this function — the same class of recoverable model mistake — so the two paths aren't fully consistent yet.
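
The observability suggestion in the note above (emit an ai.toolCall span with the error recorded) might look roughly like this. It is a sketch under stated assumptions: `SpanRecord`, `recordedSpans`, and `recoverFromInvalidInput` are hypothetical, the attribute keys are assumed, and a real implementation would go through the tracer the agent already uses instead of an in-memory array.

```typescript
// Hypothetical minimal span shape; stands in for a real tracing API.
interface SpanRecord {
  name: string;
  attributes: Record<string, string>;
  status: 'OK' | 'ERROR';
  errorMessage?: string;
}

// In-memory sink so the sketch is self-contained.
const recordedSpans: SpanRecord[] = [];

function recoverFromInvalidInput(toolName: string, toolCallId: string, error: Error) {
  // Record the failure so it stays visible in traces even though nothing throws
  // and onError is not invoked on this path.
  recordedSpans.push({
    name: 'ai.toolCall',
    attributes: { 'ai.toolCall.name': toolName, 'ai.toolCall.id': toolCallId }, // assumed keys
    status: 'ERROR',
    errorMessage: error.message,
  });
  // Then recover: return the error to the model as a tool result.
  return { type: 'error-text' as const, value: error.message };
}

const recovered = recoverFromInvalidInput('strict', 'c1', new Error('Invalid input for tool "strict"'));
```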

VaguelySerious · 2026-06-29T19:12:50Z

      });
    });

+    it('should convert invalid tool input to error-text result instead of failing stream', async () => {


AI Review: Nit

This verifies the validation error is fed back as error-text, but the mocked second next() returns { done: true }, so it doesn't prove the agent productively recovers. Consider having the second turn return a corrected tool call and asserting the tool's execute actually runs with the fixed input — that exercises the full "self-correct and retry within maxSteps" claim, not just the error-feedback half.
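
The stronger assertion suggested here can be illustrated with a toy loop. This is not the project's test and does not use its mocks: `runScriptedAgent`, `Turn`, and the inline validation are hypothetical stand-ins that only show what "invalid, then corrected, then execute runs once" means as an assertion.

```typescript
type ToolResult = { type: 'json'; value: unknown } | { type: 'error-text'; value: string };

// Scripted "model": each turn is either a tool call with raw JSON args or a final stop.
type Turn = { kind: 'tool-call'; input: string } | { kind: 'stop' };

function runScriptedAgent(
  turns: Turn[],
  execute: (input: { x: string }) => unknown,
  maxSteps: number,
): ToolResult[] {
  const results: ToolResult[] = [];
  for (const turn of turns.slice(0, maxSteps)) {
    if (turn.kind === 'stop') break;
    const parsed = JSON.parse(turn.input) as { x?: unknown };
    if (typeof parsed.x !== 'string' || parsed.x.length < 1) {
      // Invalid input is fed back instead of aborting, so the loop continues.
      results.push({ type: 'error-text', value: 'Invalid input for tool "strict": x is too small' });
      continue;
    }
    results.push({ type: 'json', value: execute({ x: parsed.x }) });
  }
  return results;
}

// Track every input that actually reaches execute().
const calls: string[] = [];
const results = runScriptedAgent(
  [
    { kind: 'tool-call', input: '{"x":""}' },      // first attempt: violates min length
    { kind: 'tool-call', input: '{"x":"fixed"}' }, // second attempt: corrected
    { kind: 'stop' },
  ],
  (input) => {
    calls.push(input.x);
    return { ok: true };
  },
  5,
);
```

Asserting on `calls` rather than only on the first result is what distinguishes productive recovery from error feedback alone.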

… productive recovery

Address review feedback on recovering from invalid tool-call input:

- The invalid-input recovery path no longer threw, so it produced no ai.toolCall span (the execute()-error path still does). Emit a span here that records the validation error and ERROR status, so the recovered failure stays observable in traces even though it is intentionally not surfaced via onError (matching tool-execution errors and AI SDK).
- Add a test that drives invalid -> corrected tool call and asserts the tool actually executes once with the fixed input, proving the agent productively self-corrects rather than only feeding the error back.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

VaguelySerious · 2026-06-29T19:15:01Z

(AI) Pushed a follow-up commit addressing two of the review points directly:

Observability (Note): the invalid-input recovery path now emits an ai.toolCall span recording the validation error + ERROR status, so the recovered failure stays visible in traces (it remains intentionally off the onError path, matching tool-execution errors / AI SDK).
Test (Nit): added a test that drives invalid → corrected tool call and asserts the tool actually executes once with the fixed input, proving productive self-correction rather than just error feedback.

The no-such-tool nit is left as a separate follow-up as noted inline. @workflow/ai build + tests green locally.

…over-invalid-tool-input

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

VaguelySerious · 2026-06-29T22:08:42Z

(AI) Opened #2699 as an in-repo copy of this PR so the full CI suite (including the deploy-backed E2E lanes that fork PRs can't run) executes against the change. Credit for the fix remains with @boomyao — the copy is co-authored accordingly. #2699 is set to close this PR when it merges.

VaguelySerious · 2026-06-29T22:31:29Z

@boomyao I can't merge your PR because of the signed-commit requirements. I created a copy in #2699, will merge that instead, and will make sure you end up in the git log.

@boomyao

… agent stream (#2699) In-repo copy of #2192 by @boomyao, opened to run the full CI suite. Co-authored-by: yao <zhangyaoruo@outlook.com>

@boomyao

… agent stream (#2699) In-repo copy of #2192 by @boomyao, opened to run the full CI suite. Co-authored-by: yao <zhangyaoruo@outlook.com> Signed-off-by: Peter Wielander <mittgfu@gmail.com>

@boomyao

… agent stream (#2699) (#2703) #2192 by @boomyao Co-authored-by: yao <zhangyaoruo@outlook.com> Co-authored-by: Peter Wielander <mittgfu@gmail.com>

boomyao requested a review from a team as a code owner June 1, 2026 05:19

wasimxyz mentioned this pull request Jun 22, 2026

DurableAgent rejects MCP dynamic tools that have no validate function on inputSchema #1576

Open

VaguelySerious reviewed Jun 29, 2026


VaguelySerious requested a review from ijjk as a code owner June 29, 2026 19:14

VaguelySerious and others added 2 commits June 29, 2026 12:30

Merge remote-tracking branch 'origin/main' into fix/durable-agent-rec…

2462eca

…over-invalid-tool-input

chore(changeset): condense to a single sentence

e2266dd

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

VaguelySerious approved these changes Jun 29, 2026


VaguelySerious mentioned this pull request Jun 29, 2026

fix(ai): recover from invalid tool-call input instead of aborting the agent stream #2699

Merged

VaguelySerious closed this in #2699 Jun 30, 2026

boomyao wants to merge 4 commits into vercel:main from boomyao:fix/durable-agent-recover-invalid-tool-input
