Make client-side cancellation work over the 2026 transports #3046
The modern session stamp suppressed the courtesy notifications/cancelled
for every request, so abandoning a request over 2026 streamable HTTP left
the POST open and the server running, and over 2026 stream-pair transports
never sent the frame the spec requires. The frame is now the dispatcher's
uniform abandon signal: stream transports write it (the 2026 stdio
cancellation spelling), while the streamable-HTTP transport translates it
into aborting the named request's own in-flight POST - closing the
response stream is that wire's cancellation signal, and no client-to-server
notification ever POSTs at 2026. Each POST records the era it was sent
under so a late cancel is interpreted per the named request, not whatever
was negotiated since; pre-2026 wires still POST the frame (a disconnect is
explicitly not a cancel there). The negotiation methods keep their
cancellation opt-out on every path.
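The routing described above can be summarised as a small decision function. This is only an illustrative sketch, not the SDK's code: `abandon_action`, the transport labels `"http"` and `"stream"`, and the returned action names are all hypothetical. Only the rules themselves and the two negotiation method names come from this PR.

```python
from typing import Literal

# Negotiation methods named in this PR; they keep their cancellation opt-out.
NEGOTIATION_METHODS = frozenset({"initialize", "server/discover"})

# Hypothetical action labels used only by this sketch.
AbandonAction = Literal["none", "write_frame", "post_frame", "abort_post"]


def abandon_action(method: str, transport: str, modern: bool) -> AbandonAction:
    """Pick what happens when the caller abandons a request.

    transport: "http" for streamable HTTP, "stream" for stream-pair
               transports such as stdio (labels are assumptions).
    modern:    True if the request was sent under the 2026 era.
    """
    if method in NEGOTIATION_METHODS:
        return "none"  # opt-out applies on every path
    if transport == "stream":
        return "write_frame"  # the frame is written at 2025 and 2026 alike
    if modern:
        return "abort_post"  # closing the response stream is the 2026 HTTP signal
    return "post_frame"  # pre-2026 HTTP: a disconnect is not a cancel
```

For example, `abandon_action("tools/call", "http", modern=True)` yields `"abort_post"`, while the same call with `modern=False` yields `"post_frame"`.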
Callers can also supply their own request id via CallOptions["request_id"]
on both dispatchers - groundwork for demultiplexing subscriptions/listen
streams, whose id must be known before the result arrives. Ids reach the
peer verbatim ("7" stays a string), collide loudly only for the caller who
chose them (minting skips occupied keys), and share one coerced collision
domain so the in-memory dispatcher raises exactly where the wire one would.
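A minimal sketch of the id rules in this paragraph follows. It assumes that "coerced" means a digit-only string and the equal integer occupy the same key; the PR text does not spell out the coercion rule, so `coerce_key` is a stand-in rather than the real `coerce_request_id`, and `PendingTable` is a hypothetical name.

```python
from typing import Dict, Optional, Union

RequestId = Union[int, str]


def coerce_key(request_id: RequestId) -> RequestId:
    """Assumed coercion: "7" and 7 map to the same collision key."""
    if isinstance(request_id, str) and request_id.isascii() and request_id.isdigit():
        return int(request_id)
    return request_id


class PendingTable:
    """Tracks in-flight request ids (hypothetical helper for this sketch)."""

    def __init__(self) -> None:
        self._pending: Dict[RequestId, RequestId] = {}
        self._next = 0

    def register(self, supplied: Optional[RequestId] = None) -> RequestId:
        if supplied is not None:
            key = coerce_key(supplied)
            if key in self._pending:
                # Only the caller who chose the id sees the collision.
                raise ValueError(f"request id {supplied!r} is already in flight")
            self._pending[key] = supplied
            return supplied  # sent verbatim: "7" stays a string
        # Minting skips keys occupied by caller-supplied ids.
        while self._next in self._pending:
            self._next += 1
        minted = self._next
        self._next += 1
        self._pending[minted] = minted
        return minted

    def release(self, request_id: RequestId) -> None:
        self._pending.pop(coerce_key(request_id), None)
```

Under these assumptions, registering `"7"` and then `7` raises, while a minted id never collides because the counter steps over any key a caller has claimed.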
I didn't find any bugs in this change, but it's a substantial PR touching core cancellation and request-correlation semantics across both dispatchers and the streamable-HTTP transport, so it warrants a human reviewer's judgment on the design.
Overview
This PR reworks client-side cancellation for the 2026 transports and adds a caller-supplied request-id seam. It touches five source files: the modern session stamp in src/mcp/client/session.py (no longer forcing cancel_on_abandon=False except for negotiation methods), src/mcp/client/streamable_http.py (new _InFlightPost registry, per-request abort scopes, era-aware translation of outbound notifications/cancelled into POST aborts, and moving the protocol-version cache into the serialized write loop), src/mcp/shared/dispatcher.py (new CallOptions["request_id"] key and public coerce_request_id helper), and both JSONRPCDispatcher and DirectDispatcher (caller-supplied ids, collision detection in a shared coerced domain, mint-past-occupied-key logic). Roughly 400 lines of new tests accompany the change.
Security risks
No direct security-sensitive surfaces (auth, crypto, permissions) are touched. The main risk class is protocol-correctness rather than security: an incorrectly swallowed or leaked notifications/cancelled frame, a wrong-era interpretation, or an id-collision bug could leak server work or mis-correlate responses, but these are reliability concerns, not exposure or injection vectors.
Level of scrutiny
High. This is core client transport and dispatcher logic with subtle concurrency semantics: abort-scope registration ordering relative to task spawning, identity-guarded teardown for reused request ids, per-request era capture versus the mutable negotiated-version cache, and a shared coerced collision domain between the wire and in-memory dispatchers. It also adds a new public knob (CallOptions["request_id"]) and promotes coerce_request_id into the shared dispatcher module's public surface — API decisions a maintainer should weigh in on, especially since the upcoming subscriptions/listen client driver builds on these seams.
Other factors
The bug-hunting pass found no issues, the PR description is thorough about spec rationale (2026 HTTP disconnect-as-cancel vs. stdio cancelled-frame), and test coverage is extensive, including an end-to-end ASGI test and race-oriented cases (mid-session version flip, reused-id successor registration). Those are strong positives, but the size, the concurrency subtleties, and the new API surface put this well outside the simple/mechanical category the auto-approval bar requires, so I'm deferring to a human reviewer rather than approving.
Client-side request cancellation was a no-op on the 2026 wire: the modern session stamp forced cancel_on_abandon=False for every request, so abandoning a request (caller cancellation or timeout) over streamable HTTP left the POST/SSE stream open and the server running, and over 2026 stream-pair transports never sent the notifications/cancelled the spec requires. This PR makes cancellation real on both modern transports, and adds the caller-supplied request-id seam the upcoming client subscriptions/listen driver needs.
Motivation and Context
Per the 2026-07-28 spec, the two transports spell client cancellation differently:
notifications/cancelled message is required or expected" - and the wire defines no client-to-server notifications at all.
notifications/cancelled notification referencing the request ID."
The courtesy frame the dispatcher already emits on abandon is now the uniform internal signal: stream transports write it (spec-correct at 2026 stdio, unchanged at 2025), while the streamable-HTTP transport translates it into aborting the named request's own in-flight POST instead of writing it. Every request POST is registered with an abort scope and the era it was sent under, so a late cancel is interpreted per the named request's era rather than whatever was negotiated since; on pre-2026 wires the frame still POSTs (a 2025 disconnect is explicitly not a cancel). Registration happens synchronously in the write loop, before the POST task is spawned, so a cancel dequeued immediately after its request can never miss it, and teardown is identity-guarded so a finished task unwinding late cannot evict a reused id's successor registration. The protocol-version cache moves into the same serialized loop for the same reason: it now reflects wire order instead of POST-task scheduling order.
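The registration and teardown rules above can be sketched roughly as follows. This is a synchronous toy model, not the transport's implementation: `InFlightPost`, `PostRegistry`, `translate_cancel`, and the plain `abort` callable standing in for an abort scope are all hypothetical, and the fallback to the currently negotiated era for an unknown id is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Union

RequestId = Union[int, str]


@dataclass(eq=False)  # eq=False keeps comparison by identity
class InFlightPost:
    request_id: RequestId
    modern: bool                # era captured when the POST was sent
    abort: Callable[[], None]   # stand-in for the request's abort scope


class PostRegistry:
    def __init__(self) -> None:
        self._posts: Dict[RequestId, InFlightPost] = {}

    def register(self, request_id: RequestId, modern: bool,
                 abort: Callable[[], None]) -> InFlightPost:
        # Called from the write loop before the POST task is spawned,
        # so a cancel dequeued right after the request always finds it.
        entry = InFlightPost(request_id, modern, abort)
        self._posts[request_id] = entry
        return entry

    def unregister(self, entry: InFlightPost) -> None:
        # Identity guard: a late-unwinding task must not evict the
        # successor that reused its request id.
        if self._posts.get(entry.request_id) is entry:
            del self._posts[entry.request_id]

    def translate_cancel(self, request_id: RequestId, current_modern: bool) -> bool:
        """Return True if the cancelled frame should still be POSTed."""
        entry = self._posts.get(request_id)
        # Assumption: with no registration, fall back to the current era.
        modern = entry.modern if entry is not None else current_modern
        if not modern:
            return True  # pre-2026: the frame itself is the cancel
        if entry is not None:
            entry.abort()  # 2026 HTTP: abort that request's own POST
        return False
```

The key design point is that `translate_cancel` consults the era stored on the entry, so a version flip after the request was sent does not change how its cancel is interpreted.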
Separately, CallOptions["request_id"] lets a caller supply the outbound request id on both dispatchers. A subscriptions/listen client driver must know its subscription id (= the request's JSON-RPC id) before the result arrives, which the mint-internally-only design made impossible. Supplied ids reach the peer verbatim ("7" stays a string), collide loudly only for the caller who chose them (minting skips occupied keys), and both dispatchers share one coerced collision domain, so the in-memory DirectDispatcher raises exactly where JSONRPCDispatcher would in production. The negotiation methods (initialize, server/discover) keep their cancellation opt-out on every path.
How Has This Been Tested?
subscriptions/listen closes the POST, the server releases the subscription (observed through the bus seam), and no notifications/cancelled crosses the wire.
Client: at 2026, a parked tools/call cancelled from the caller's scope is observed as a genuine cancellation server-side with zero cancelled frames on the wire; at 2025 (mode="legacy"), the same flow POSTs exactly one notifications/cancelled frame and the stream stays open. A raw subscriptions/listen sent with request_id="verify-listen-1" came back ack-stamped with that id.
Breaking Changes
None against released surfaces. Within the v2 line: abandoning a modern-era request now produces the spec's cancellation behavior (POST abort at 2026 HTTP, a cancelled frame at 2026 stdio) instead of silently leaking the request.
Additional context
One deliberate boundary: the transport translates exactly notifications/cancelled because it is the only client-to-server notification the 2026 core protocol defines. A table-driven guard at the POST boundary (rejecting any future client notification a modern wire doesn't define) would make illegal frames unconstructible for extensions too - left as a follow-up seam rather than smuggled into this change.
The client subscriptions/listen driver (context-managed client.listen(...) yielding typed events) builds directly on these seams and follows as its own PR.