Bug: AssertionError: Request already responded to — cancellation race in v1.27.0 #2416

@bbarwik

AssertionError: Request already responded to when CancelledNotification arrives after handler completes but before respond()

Description

When a client sends a notifications/cancelled for a request whose handler has already finished executing but hasn't yet called message.respond(), the server crashes with AssertionError: Request already responded to.

PR #2334 (v1.27.0) fixed the ClosedResourceError crash path by catching CancelledError in _handle_request and guarding respond() against BrokenResourceError/ClosedResourceError. However, it left a race window between handler completion and respond() where a cancellation notification can set _completed = True first, causing the assert on line 129 of session.py to fire.

Reproduction scenario

  1. Client sends a tools/call request with a long-running handler (e.g. polling with 600s timeout)
  2. Handler completes and returns a result
  3. Between the handler's return and the await message.respond(response) call in _handle_request, the client sends notifications/cancelled for that same request ID
  4. The cancellation notification handler (session.py:403-406) calls responder.cancel(), which:
    • Calls _cancel_scope.cancel()
    • Sets _completed = True
    • Sends an error response "Request cancelled"
  5. Back in _handle_request, execution reaches await message.respond(response) at server.py:800
  6. respond() hits assert not self._completed at session.py:129 and the server crashes

cancel_scope.cancel() only results in a CancelledError when the task is suspended in, or later reaches, an await. Since the handler has already returned, the code path between the handler return and respond() is synchronous, with no checkpoint where CancelledError can be delivered. As a result, the except anyio.get_cancelled_exc_class() guard at server.py:773 never fires.
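The ordering in steps 2-6 can be sketched with a toy model. This is only an illustration under stated assumptions: `ToyResponder` and `reproduce` are hypothetical names, not the SDK's `RequestResponder`, and the model only mirrors the `_completed` flag and the assert described above.

```python
import asyncio


class ToyResponder:
    """Hypothetical stand-in for the SDK responder; NOT the real class."""

    def __init__(self) -> None:
        self._completed = False
        self.sent: list[str] = []

    async def cancel(self) -> None:
        # Mirrors the described cancel(): mark completed, send an error.
        self._completed = True
        self.sent.append("error: Request cancelled")

    async def respond(self, response: str) -> None:
        # Mirrors the assert reported at session.py:129.
        assert not self._completed, "Request already responded to"
        self._completed = True
        self.sent.append(response)


async def reproduce() -> tuple[str, list[str]]:
    responder = ToyResponder()
    response = "tool result"      # step 2: handler has returned
    await responder.cancel()      # steps 3-4: cancellation wins the race
    try:
        await responder.respond(response)  # step 5
    except AssertionError as exc:          # step 6
        return str(exc), responder.sent
    return "no error", responder.sent


if __name__ == "__main__":
    print(asyncio.run(reproduce()))
```

The toy runs the two calls sequentially, so it shows the state transition that triggers the assert rather than the actual task scheduling inside the server.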

Stack trace

ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
  +-+---------------- 1 ----------------
    | ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
    +-+---------------- 1 ----------------
      | ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
      +-+---------------- 1 ----------------
        | Traceback (most recent call last):
        |   File "mcp/server/lowlevel/server.py", line 703, in _handle_message
        |     await self._handle_request(message, req, session, lifespan_context, raise_exceptions)
        |   File "mcp/server/lowlevel/server.py", line 800, in _handle_request
        |     await message.respond(response)
        |   File "mcp/shared/session.py", line 129, in respond
        |     assert not self._completed, "Request already responded to"
        | AssertionError: Request already responded to
        +------------------------------------

Impact

This crashes the entire MCP server process, killing all in-flight requests. In our case, the server manages multiple long-running background agents, so a crash loses all active work. The crash is triggered by normal client behavior (user cancels an operation), making it a reliability issue rather than an edge case.

Example Code

The race window in `_handle_request` (`server.py:719`):


```python
# Line 770: handler completes, returns response
response = await handler(req)

# ... exception handling ...

# Line 799-800: GAP — between handler return and respond(),
# a CancelledNotification can arrive on another task and call
# responder.cancel(), setting _completed = True and sending
# an error response. No await in this gap means no CancelledError
# can be delivered.
try:
    await message.respond(response)  # <-- assert fires here
except (anyio.BrokenResourceError, anyio.ClosedResourceError):
    ...
```


The `except anyio.get_cancelled_exc_class()` at line 773 correctly handles the case where the cancellation arrives *during* handler execution. But it cannot handle cancellation that arrives *after* the handler returns, because there's no async checkpoint between the handler return and `respond()`.

## Suggested fix

In `session.py`, change `respond()` to handle the already-completed case gracefully instead of asserting:


```python
async def respond(self, response: SendResultT | ErrorData) -> None:
    if not self._entered:
        raise RuntimeError("RequestResponder must be used as a context manager")

    # If already completed (e.g. by a concurrent cancellation), skip silently.
    if self._completed:
        return

    if not self.cancelled:
        self._completed = True
        await self._session._send_response(
            request_id=self.request_id, response=response
        )
```


Alternatively, the guard could be added in `_handle_request` before calling `respond()`:


```python
if not message._completed:
    try:
        await message.respond(response)
    except (anyio.BrokenResourceError, anyio.ClosedResourceError):
        logger.debug("Response for %s dropped - transport closed", message.request_id)
```


The first approach (in `respond()` itself) is more robust since it closes the race for all callers.
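A regression check for the first approach might look roughly like the sketch below. It uses a hypothetical toy class (`TolerantResponder`) rather than the real `RequestResponder`, so it only demonstrates the intended behavior: a late `respond()` after `cancel()` becomes a no-op and the client sees exactly one response.

```python
import asyncio


class TolerantResponder:
    """Hypothetical toy model of the proposed respond(); NOT the SDK class."""

    def __init__(self) -> None:
        self._completed = False
        self.sent: list[str] = []

    async def cancel(self) -> None:
        self._completed = True
        self.sent.append("error: Request cancelled")

    async def respond(self, response: str) -> None:
        # Proposed behavior: skip silently if something already completed
        # the request (e.g. a concurrent cancellation).
        if self._completed:
            return
        self._completed = True
        self.sent.append(response)


async def late_respond_is_noop() -> list[str]:
    responder = TolerantResponder()
    await responder.cancel()                # cancellation arrives first
    await responder.respond("tool result")  # late respond must not raise
    return responder.sent


if __name__ == "__main__":
    print(asyncio.run(late_respond_is_noop()))
```

One trade-off of skipping silently is that a genuine double `respond()` from a buggy handler would no longer be caught by the assert. Logging at debug level in the early-return branch would be one way to keep that visible.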

Python & MCP Python SDK

- `mcp` 1.27.0
- Python 3.14
- anyio (asyncio backend)
- FastMCP stdio transport
- Client: Claude Code 2.1.92
