fix(ai): recover from invalid tool-call input instead of aborting the agent stream #2192
boomyao wants to merge 4 commits into
Conversation
… stream

DurableAgent.executeTool threw when a tool call's arguments failed inputSchema validation (and no experimental_repairToolCall fixed it), aborting the whole agent stream, which fails the entire durable workflow run.

Tool *execution* errors are already recovered (returned to the model as an error-text tool result so the agent can self-correct); this makes input parse/validation failures consistent: return the error as an error-text tool result instead of throwing, so a single occasionally-malformed model tool-call can no longer kill a long-running task. Aligns with AI SDK streamText behavior.

Signed-off-by: yao <zhangyaoruo@outlook.com>
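The change described in this commit message can be sketched as follows. This is a minimal, self-contained illustration and not the actual DurableAgent code: the `Tool` interface, `parseInput`, `executeToolSketch`, and the `ToolResult` shape are hypothetical names chosen for the example.

```typescript
// Hypothetical shapes for illustration; not the actual @workflow/ai types.
type ToolResult =
  | { type: 'json'; value: unknown }
  | { type: 'error-text'; value: string };

interface Tool<I> {
  // Returns the validated input or throws an Error describing the mismatch.
  parseInput(raw: unknown): I;
  execute(input: I): unknown;
}

function executeToolSketch<I>(name: string, tool: Tool<I>, rawArgs: string): ToolResult {
  let input: I;
  try {
    // Covers both malformed JSON and schema-validation failures.
    input = tool.parseInput(JSON.parse(rawArgs));
  } catch (err) {
    // Before the fix, the equivalent of `throw err` here aborted the stream.
    const message = err instanceof Error ? err.message : String(err);
    return { type: 'error-text', value: `Invalid input for tool ${name}: ${message}` };
  }
  try {
    return { type: 'json', value: tool.execute(input) };
  } catch (err) {
    // Execution errors were already fed back to the model this way.
    const message = err instanceof Error ? err.message : String(err);
    return { type: 'error-text', value: message };
  }
}

// Hypothetical tool that requires at least one query, like a `.min(1)` array.
const searchTool: Tool<{ queries: string[] }> = {
  parseInput(raw) {
    const queries = (raw as { queries?: unknown } | null)?.queries;
    if (!Array.isArray(queries) || queries.length < 1) {
      throw new Error('queries must contain at least one item');
    }
    return { queries: queries.map(String) };
  },
  execute: (input) => ({ hits: input.queries.length }),
};

const bad = executeToolSketch('search', searchTool, '{"queries":[]}');
const good = executeToolSketch('search', searchTool, '{"queries":["a"]}');
```

In this sketch both failure kinds end up as an `error-text` result, so the caller's loop keeps running and the model can see what went wrong.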
🦋 Changeset detected
Latest commit: e2266dd
The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package
@boomyao is attempting to deploy a commit to the Vercel Labs Team on Vercel. A member of the Team first needs to authorize it.
VaguelySerious
left a comment
AI review: no blocking issues
// of aborting the entire stream. This aligns with AI SDK's streamText behavior
// for tool failures. Reaches here both for malformed JSON and for the
// re-thrown "Invalid input for tool ..." schema-validation error above.
return {
AI Review: Note
Recovering the run is the right call. One side effect worth being intentional about: because nothing throws out of executeTool for invalid input anymore, this path no longer reaches the outer catch that invokes onError, and the return happens before the recordSpan block below — so an invalid tool call now produces no ai.toolCall span and no onError callback. That's consistent with how execute() errors and AI SDK's tool-error are handled (neither surfaces via onError), but it's a real change from today, and the execute() path at least still emits a span. A caller relying on onError to observe malformed model output will go silent. Suggest emitting an ai.toolCall span here with the error recorded, so the recovered failure stays visible in traces.
Related follow-up (out of scope here): a hallucinated/unknown tool name still throws and aborts the whole stream higher up in this function — the same class of recoverable model mistake — so the two paths aren't fully consistent yet.
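The span suggestion in this comment could look roughly like the sketch below. It uses a hypothetical in-memory recorder in place of the real tracing layer; `recordedSpans`, `recordInvalidInputSpan`, `recoverFromInvalidInput`, and the attribute keys are assumptions for illustration. Only the span name `ai.toolCall` comes from the review.

```typescript
// Hypothetical in-memory span recorder standing in for the real tracing layer.
interface SpanRecord {
  name: string;
  status: 'OK' | 'ERROR';
  attributes: Record<string, string>;
}

const recordedSpans: SpanRecord[] = [];

function recordInvalidInputSpan(toolName: string, toolCallId: string, error: Error): void {
  recordedSpans.push({
    name: 'ai.toolCall',
    status: 'ERROR',
    // Attribute keys are illustrative assumptions, not the library's schema.
    attributes: {
      'ai.toolCall.name': toolName,
      'ai.toolCall.id': toolCallId,
      'error.message': error.message,
    },
  });
}

function recoverFromInvalidInput(toolName: string, toolCallId: string, error: Error) {
  // Record first, then return, so the early return does not bypass tracing.
  recordInvalidInputSpan(toolName, toolCallId, error);
  return { type: 'error-text' as const, value: error.message };
}

const recovered = recoverFromInvalidInput(
  'search',
  'call_1',
  new Error('queries must contain at least one item'),
);
```

The point of the ordering is that the recovered failure stays visible in traces even though nothing is thrown and onError is not invoked.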
});
});

it('should convert invalid tool input to error-text result instead of failing stream', async () => {
AI Review: Nit
This verifies the validation error is fed back as error-text, but the mocked second next() returns { done: true }, so it doesn't prove the agent productively recovers. Consider having the second turn return a corrected tool call and asserting the tool's execute actually runs with the fixed input — that exercises the full "self-correct and retry within maxSteps" claim, not just the error-feedback half.
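A two-turn test of the kind suggested here might be shaped like the sketch below. It is a simplified, synchronous stand-in and does not use the real DurableAgent or the test suite's mocks: `runScriptedAgent`, `Turn`, and the scripted turns are hypothetical.

```typescript
// Hypothetical, synchronous stand-ins for the model and the agent loop.
type ToolCall = { toolName: string; input: unknown };
type Turn = { toolCall: ToolCall } | { done: true };

function runScriptedAgent(
  turns: Turn[],
  execute: (input: { queries: string[] }) => string,
  maxSteps: number,
): { feedback: string[]; steps: number } {
  const feedback: string[] = [];
  let steps = 0;
  for (const turn of turns) {
    if (steps >= maxSteps || 'done' in turn) break;
    steps += 1;
    const queries = (turn.toolCall.input as { queries?: unknown } | null)?.queries;
    if (!Array.isArray(queries) || queries.length === 0) {
      // Invalid input is fed back as text instead of ending the loop.
      feedback.push('error-text: queries must contain at least one item');
      continue;
    }
    feedback.push(execute({ queries: queries.map(String) }));
  }
  return { feedback, steps };
}

// Turn 1 is invalid, turn 2 is the corrected call, turn 3 ends the run.
const executedInputs: string[][] = [];
const outcome = runScriptedAgent(
  [
    { toolCall: { toolName: 'search', input: { queries: [] } } },
    { toolCall: { toolName: 'search', input: { queries: ['durable agents'] } } },
    { done: true },
  ],
  (input) => {
    executedInputs.push(input.queries);
    return `ok: ${input.queries.length} result set(s)`;
  },
  5,
);
```

Asserting on `executedInputs` checks the second half of the claim: the tool actually runs once, with the corrected input, after the error was fed back.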
… productive recovery

Address review feedback on recovering from invalid tool-call input:

- The invalid-input recovery path no longer threw, so it produced no ai.toolCall span (the execute()-error path still does). Emit a span here that records the validation error and ERROR status, so the recovered failure stays observable in traces even though it is intentionally not surfaced via onError (matching tool-execution errors and AI SDK).
- Add a test that drives invalid -> corrected tool call and asserts the tool actually executes once with the fixed input, proving the agent productively self-corrects rather than only feeding the error back.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
(AI) Pushed a follow-up commit addressing two of the review points directly:
The no-such-tool nit is left as a separate follow-up as noted inline.
…over-invalid-tool-input
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Problem

In DurableAgent, the two kinds of tool error are handled in opposite ways inside executeTool:

- execute() throws → caught and converted to an error-text tool result fed back to the model, so the agent recovers and the stream continues. The code comment even says this "aligns with AI SDK's streamText behavior for individual tool failures."
- Tool-call arguments fail inputSchema validation (and no experimental_repairToolCall fixes them) → throw, which propagates out of executeTool, aborts agent.stream(), and fails the entire durable workflow run.

A model occasionally emitting a slightly malformed tool call (an empty array where .min(1) is required, a missing required field, a wrong type, truncated-then-JSON-repaired args) is a recoverable event: the model will usually fix it if told. But today it is fatal: one bad tool call kills a long-running task, with no chance for the agent to self-correct. The only hook on this path, experimental_repairToolCall, can't help here because its returned tool call must itself pass the schema, so it can fix malformed JSON syntax but cannot express "tell the model its arguments were invalid and let it regenerate."

This is inconsistent (the framework already recovers from the harder case, execute() throwing) and looks like an oversight rather than intent.

Reproduction
Before this PR:

A should survive too.

Fix
executeTool already funnels both malformed JSON and the re-thrown "Invalid input for tool ..." schema-validation error through a single throw parseError at the end of the parse/validate block. This PR changes that one escape point to return the error as an error-text tool result, identical to how execute() errors are handled a few lines below, so the agent receives the error as a tool result and can correct its arguments and retry within maxSteps. experimental_repairToolCall still runs first; only the final give-up changes from throw to recover.

After this PR, both A and B print STREAM SURVIVED.

Notes

- … maxSteps), consistent with execute() errors and AI SDK streamText. Happy to gate it behind an option (e.g. onInvalidToolInput: 'feedback' | 'throw', default 'feedback') if you'd prefer to preserve the throw for some callers; let me know.
- … @workflow/ai patch).

Verified locally: packages/ai typecheck clean, vitest run (47 tests) green, Biome clean on the changed files.
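If the opt-in gate floated in the Notes were added, it might look something like this sketch. The option is only a proposal in this PR, not a shipped API; `OnInvalidToolInput` and `handleInvalidInput` are hypothetical names, with `'feedback'` as the default as suggested above.

```typescript
// Hypothetical option from the Notes; a proposal, not a shipped API.
type OnInvalidToolInput = 'feedback' | 'throw';

function handleInvalidInput(
  parseError: Error,
  mode: OnInvalidToolInput = 'feedback',
): { type: 'error-text'; value: string } {
  if (mode === 'throw') {
    // Preserves the pre-PR behavior for callers that want a hard failure.
    throw parseError;
  }
  // Default: feed the error back so the model can correct its arguments.
  return { type: 'error-text', value: parseError.message };
}

const parseError = new Error('Invalid input for tool search: queries must not be empty');
const defaultOutcome = handleInvalidInput(parseError);

let thrown: unknown;
try {
  handleInvalidInput(parseError, 'throw');
} catch (err) {
  thrown = err;
}
```

Keeping the decision in one small function would mirror the single escape point described in the Fix section, so the gate would not need to touch the rest of the parse/validate block.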