Why STDIO is a dead end for teams

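STDIO transport means the client launches your server as a child process and speaks JSON-RPC over stdin and stdout. For local development it is perfect: no network, no auth, no latency. As a distribution mechanism it fails structurally, not incidentally: every user runs whichever version they last installed, secrets sit on every user's machine, only clients that can spawn a local process can reach the server, and you get no telemetry on how it is used.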

A remote server inverts all four: one deployed version, secrets on the server, every client surface, real telemetry.

The target: one endpoint speaking streamable HTTP

The current remote transport is streamable HTTP: a single endpoint (by convention /mcp) that accepts POSTed JSON-RPC messages. The server answers each POST either with a plain application/json body or with a text/event-stream response that carries one or more messages as server-sent events. It replaced the older HTTP-plus-SSE transport from 2024, which needed a separate long-lived GET stream and a second endpoint; do not build against that pattern in 2026.
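To make the two response framings concrete, here is a minimal sketch of how a server might serialize the same JSON-RPC reply either way. The helper names frameAsJson and frameAsSse and the FramedBody type are hypothetical, invented for this illustration; the SDK transports do this for you, so treat it as a picture of the wire format rather than code you need to write.

```typescript
// Minimal JSON-RPC response shape, for illustration only.
interface JsonRpcResponse {
  jsonrpc: "2.0";
  id: number | string;
  result?: unknown;
  error?: { code: number; message: string };
}

// Hypothetical container for "what goes on the wire".
interface FramedBody {
  contentType: "application/json" | "text/event-stream";
  body: string;
}

// Framing 1: a single response as a plain JSON body.
function frameAsJson(message: JsonRpcResponse): FramedBody {
  return { contentType: "application/json", body: JSON.stringify(message) };
}

// Framing 2: one or more messages as server-sent events,
// one "message" event per JSON-RPC message, each ended by a blank line.
function frameAsSse(messages: JsonRpcResponse[]): FramedBody {
  const body = messages
    .map((m) => `event: message\ndata: ${JSON.stringify(m)}\n\n`)
    .join("");
  return { contentType: "text/event-stream", body };
}

const reply: JsonRpcResponse = { jsonrpc: "2.0", id: 1, result: { tools: [] } };
const asJson = frameAsJson(reply);
const asSse = frameAsSse([reply]);
```

Both framings carry the identical JSON-RPC payload; only the envelope differs, which is why clients have to be ready for either.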

The good news is that your tool logic does not change. In the TypeScript SDK, the migration is swapping StdioServerTransport for StreamableHTTPServerTransport behind an HTTP route, and the Python SDK has the equivalent. The shape of the work is a transport swap plus the two design changes below, not a rewrite.

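// Before: the client spawns you
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
const transport = new StdioServerTransport();
await server.connect(transport);

// After: you are an HTTP endpoint, stateless mode
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";
const transport = new StreamableHTTPServerTransport({
  sessionIdGenerator: undefined, // no sessions, see next section
});
await server.connect(transport);
// wire transport.handleRequest(req, res) into your POST /mcp route
// (pass the parsed body as a third argument if a body parser already consumed it)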

What carries over, and what quietly changes

Your tool definitions, input schemas, and handler logic move unchanged; that is most of the value of the server you already built. Three things change underneath them, and each one is easy to miss because nothing in the type system flags it:

Design stateless, because the spec is about to force you

The classic STDIO habit is in-memory state: a module-level variable holding the open document, the parsed file, the cursor position. Locally that works because one process serves one user for one session. Remotely it breaks under the first load balancer, and the protocol itself is removing the crutch: the 2026-07-28 spec release deletes protocol sessions entirely. SEP-2567 removes the Mcp-Session-Id header and SEP-2575 removes the initialize handshake, as we documented change by change in our 2026-07-28 migration guide. A remote server built today on session state is a server you migrate twice.

The pattern that survives: every handler is self-contained, and anything that must persist across calls is expressed as an explicit handle. A tool returns { "document_id": "doc_8f3a" }; later calls take document_id as an ordinary argument and load what they need from a database or object store. If you deploy to a serverless platform, this is not even a choice; it is the execution model.
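A minimal sketch of the handle pattern, under stated assumptions: DocumentStore, MapDocumentStore, openDocument, and appendToDocument are hypothetical names, and the Map-backed store is only a stand-in so the example runs on its own. In a real deployment the store would be a database or object store shared by every instance.

```typescript
// Hypothetical persistence interface; back it with a real database
// or object store in production.
interface DocumentStore {
  put(id: string, text: string): void;
  get(id: string): string | undefined;
}

// In-memory stand-in so the sketch is runnable. It has the same flaw as
// module-level state: it would not survive a load balancer.
class MapDocumentStore implements DocumentStore {
  private docs = new Map<string, string>();
  put(id: string, text: string): void {
    this.docs.set(id, text);
  }
  get(id: string): string | undefined {
    return this.docs.get(id);
  }
}

// Tool 1: persist the document and hand back an explicit handle.
function openDocument(
  store: DocumentStore,
  newId: () => string, // injected so IDs are testable; use a UUID in practice
  text: string,
): { document_id: string } {
  const document_id = `doc_${newId()}`;
  store.put(document_id, text);
  return { document_id };
}

// Tool 2: takes the handle as an ordinary argument and loads what it needs.
// Nothing is remembered between calls inside the handler itself.
function appendToDocument(
  store: DocumentStore,
  args: { document_id: string; text: string },
): { length: number } {
  const current = store.get(args.document_id);
  if (current === undefined) {
    throw new Error(`unknown document_id: ${args.document_id}`);
  }
  const updated = current + args.text;
  store.put(args.document_id, updated);
  return { length: updated.length };
}

const store = new MapDocumentStore();
const handle = openDocument(store, () => "8f3a", "hello");
const appended = appendToDocument(store, {
  document_id: handle.document_id,
  text: " world",
});
```

Because the handle travels in the arguments, any instance that can reach the store can serve the second call. With a real store, the read-then-write in appendToDocument would also need to be made atomic.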

Auth: choose by client surface, not by taste

Remote means strangers can POST to your endpoint, so auth becomes a real decision. There are three workable models, and the right one depends on which clients need to reach you:

| Model | Reaches | Cost to build |
|---|---|---|
| Authless (rate-limited by IP) | Every MCP client, including claude.ai | Trivial |
| API key in a header | Claude Code, Cursor, config-file clients | Low |
| OAuth 2.1 + Dynamic Client Registration | Everything above, plus authenticated claude.ai | High |

The asymmetry that surprises people: API keys do not reach claude.ai. The claude.ai custom-connector UI takes a server URL plus optional OAuth client credentials under Advanced settings; there is no field for an arbitrary authorization header. Claude Code, by contrast, accepts one directly:

claude mcp add --transport http my-server https://api.example.com/mcp \
  --header "Authorization: Bearer sk_live_..."

So the pragmatic ladder is: ship an authless tier with a small per-IP quota so anyone can evaluate you from any client, accept API keys in the Authorization header for developer surfaces, and add OAuth 2.1 with DCR and RFC 9728 protected-resource metadata only when you need authenticated claude.ai users. OAuth is also where the 07-28 release tightens requirements most, so if you build it, build it against the new spec from day one.
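The first two rungs of that ladder can be sketched as a single decision function. Everything here is an assumption made for illustration: the names decideAuth, IpQuota, and AuthDecision, the example key, and the choice to reject an invalid key rather than fall back to the authless tier. The quota counter also never resets, where a real one would expire its window.

```typescript
type AuthDecision =
  | { tier: "api_key"; keyId: string }
  | { tier: "authless"; remaining: number }
  | { tier: "rejected"; status: 401 | 429 };

// Hypothetical per-IP counter. Window expiry is omitted for brevity.
class IpQuota {
  private counts = new Map<string, number>();
  private limit: number;
  constructor(limit: number) {
    this.limit = limit;
  }
  // Consumes one call; returns calls remaining, or -1 if the quota is spent.
  consume(ip: string): number {
    const used = this.counts.get(ip) ?? 0;
    if (used >= this.limit) return -1;
    this.counts.set(ip, used + 1);
    return this.limit - (used + 1);
  }
}

function decideAuth(
  authorization: string | undefined,
  ip: string,
  knownKeys: Map<string, string>, // secret -> key id (hypothetical lookup)
  quota: IpQuota,
): AuthDecision {
  if (authorization !== undefined) {
    const token = /^Bearer\s+(\S+)$/i.exec(authorization.trim())?.[1];
    const keyId = token !== undefined ? knownKeys.get(token) : undefined;
    // A presented-but-invalid key is rejected, not silently downgraded.
    return keyId !== undefined
      ? { tier: "api_key", keyId }
      : { tier: "rejected", status: 401 };
  }
  // No Authorization header: fall through to the small authless tier.
  const remaining = quota.consume(ip);
  return remaining >= 0
    ? { tier: "authless", remaining }
    : { tier: "rejected", status: 429 };
}

const knownKeys = new Map([["sk_test_example", "key_1"]]);
const quota = new IpQuota(2);
const withKey = decideAuth("Bearer sk_test_example", "203.0.113.7", knownKeys, quota);
const badKey = decideAuth("Bearer nope", "203.0.113.7", knownKeys, quota);
const anon1 = decideAuth(undefined, "203.0.113.7", knownKeys, quota);
const anon2 = decideAuth(undefined, "203.0.113.7", knownKeys, quota);
const anon3 = decideAuth(undefined, "203.0.113.7", knownKeys, quota);
```

With a limit of two, the third anonymous call from the same IP is rejected with 429, while keyed calls never touch the quota.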

The Accept-header gotcha

This one costs almost everyone an afternoon. The streamable HTTP transport requires clients to send Accept: application/json, text/event-stream on every POST, advertising that they can handle both response framings. SDK-based servers enforce it: leave the header off and you get an HTTP 406 before your handler ever runs. Your server is fine; your curl command is not.

curl -X POST https://api.example.com/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}'
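For intuition about why the header matters, here is a rough sketch of the kind of check that runs before your handler. It is not the SDK's actual code; acceptsBothFramings and preflightStatus are hypothetical names, and the decision to reject a bare */* wildcard is an assumption of this sketch.

```typescript
// True only if the Accept header explicitly lists both response framings.
function acceptsBothFramings(accept: string | undefined): boolean {
  if (!accept) return false;
  const types = accept
    .split(",")
    // Drop parameters such as ";q=0.9" and normalize case and whitespace.
    .map((part) => (part.split(";")[0] ?? "").trim().toLowerCase());
  return (
    types.includes("application/json") && types.includes("text/event-stream")
  );
}

// The status a request would get before any tool code runs.
function preflightStatus(accept: string | undefined): 200 | 406 {
  return acceptsBothFramings(accept) ? 200 : 406;
}

const okStatus = preflightStatus("application/json, text/event-stream");
const jsonOnlyStatus = preflightStatus("application/json");
const curlDefaultStatus = preflightStatus("*/*"); // curl's default Accept
```

Under these assumptions, a curl call without an explicit Accept header lands in the 406 branch, which matches the symptom described above.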

The second half of the gotcha: even a single JSON-RPC response may come back SSE-framed, with Content-Type: text/event-stream and the actual message wrapped in an event:

event: message
data: {"jsonrpc":"2.0","id":1,"result":{"tools":[...]}}

Any script that pipes the body straight into JSON.parse or jq will choke on the event: and data: lines. Check the response content type and strip the SSE framing when present, or better, do your testing with a real MCP client like the Inspector below instead of raw curl.
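If you do script against the endpoint directly, a small helper can handle both framings. This is a minimal sketch, not a full SSE parser: parseMcpBody is a hypothetical name, it assumes the whole response body has already been read into a string, and it ignores SSE fields other than data.

```typescript
// Returns the JSON-RPC messages contained in a response body,
// whether it arrived as plain JSON or SSE-framed.
function parseMcpBody(contentType: string, body: string): unknown[] {
  if (!contentType.toLowerCase().startsWith("text/event-stream")) {
    return [JSON.parse(body)];
  }
  const messages: unknown[] = [];
  // SSE events are separated by a blank line; normalize CRLF first.
  for (const event of body.replace(/\r\n/g, "\n").split("\n\n")) {
    const data = event
      .split("\n")
      .filter((line) => line.startsWith("data:"))
      // Strip the field name and the single optional space after the colon.
      .map((line) => line.slice(5).replace(/^ /, ""))
      .join("\n");
    if (data.length > 0) messages.push(JSON.parse(data));
  }
  return messages;
}

const sseBody =
  'event: message\ndata: {"jsonrpc":"2.0","id":1,"result":{"tools":[]}}\n\n';
const fromSse = parseMcpBody("text/event-stream", sseBody);
const fromJson = parseMcpBody(
  "application/json; charset=utf-8",
  '{"jsonrpc":"2.0","id":2,"result":{}}',
);
```

Either way the caller gets back an array of parsed messages, so downstream code does not need to care which framing the server chose.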

Where to deploy

Three targets cover nearly every case. Vercel with the mcp-handler package gives you a fresh, stateless handler per request, which matches the post-07-28 model exactly. Cloudflare Workers has first-class MCP support in its agents SDK and puts you close to users globally. A plain long-lived Node or Python process on Fly.io, Render, or your own infrastructure works too and is the natural fit if your tools hold large models or warm caches. Two checks regardless of target: your platform's request timeout must exceed your slowest tool, and every proxy, CDN, and WAF in front of you must pass Mcp-* headers through untouched, which the 07-28 spec starts requiring via SEP-2243.
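Both checks can be expressed as small pure functions that you feed with measurements from your own deployment. The sketch below assumes you have some way to see which headers actually arrived at the origin (for example a temporary echo route, which is hypothetical here) and some record of worst-case tool durations. The function names and the sample values are illustrative only.

```typescript
type HeaderMap = Record<string, string>;

// Lower-case header names so the comparison is case-insensitive.
function normalizeHeaders(headers: HeaderMap): Map<string, string> {
  return new Map(
    Object.entries(headers).map(
      ([name, value]): [string, string] => [name.toLowerCase(), value],
    ),
  );
}

// Names of Mcp-* headers that were sent but did not arrive unchanged.
function findDamagedMcpHeaders(sent: HeaderMap, received: HeaderMap): string[] {
  const arrived = normalizeHeaders(received);
  const damaged: string[] = [];
  for (const [name, value] of normalizeHeaders(sent)) {
    if (name.startsWith("mcp-") && arrived.get(name) !== value) {
      damaged.push(name);
    }
  }
  return damaged;
}

// Tools whose worst observed duration meets or exceeds the platform timeout.
function toolsOverBudget(
  worstCaseMs: Record<string, number>,
  platformTimeoutMs: number,
): string[] {
  return Object.entries(worstCaseMs)
    .filter(([, ms]) => ms >= platformTimeoutMs)
    .map(([tool]) => tool);
}

const damaged = findDamagedMcpHeaders(
  { "MCP-Protocol-Version": "2025-06-18", "Mcp-Session-Id": "abc", Accept: "application/json" },
  { "mcp-protocol-version": "2025-06-18", accept: "*/*" },
);
const tooSlow = toolsOverBudget({ search_docs: 1200, export_report: 45000 }, 30000);
```

In this made-up run, the session header was stripped somewhere on the path and export_report would be cut off by a 30-second platform limit; the Accept header is ignored because it is not an Mcp-* header.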

The migration, in order

Done as a sequence, this is days of work, not weeks. The order matters because each step is verifiable before the next one starts:

  1. Swap the transport to streamable HTTP behind a POST /mcp route and run it locally against the Inspector.
  2. Remove in-memory and session state; convert anything persistent to explicit handles backed by a store.
  3. Audit tools for concurrent execution.
  4. Pick the auth model for your client surfaces and implement the cheapest one that covers them.
  5. Deploy, then walk the request path for header stripping and timeout limits.
  6. Verify from the outside, which is the next section.

Verify it with the MCP Inspector CLI

Do not declare victory from a browser tab. The MCP Inspector has a CLI mode that exercises your server exactly the way a client does. List your tools:

npx @modelcontextprotocol/inspector --cli \
  https://api.example.com/mcp --transport http --method tools/list

Call one with arguments, end to end:

npx @modelcontextprotocol/inspector --cli \
  https://api.example.com/mcp --transport http \
  --method tools/call --tool-name search_docs --tool-arg query=billing

And confirm the authenticated path works with a header:

npx @modelcontextprotocol/inspector --cli \
  https://api.example.com/mcp --transport http \
  --method tools/list --header "Authorization: Bearer sk_live_..."

If all three pass from a machine that is not yours, against the deployed URL, with no local process involved, the migration is real. Then add the server to Claude Code and claude.ai and run one tool call from each, because client quirks are exactly the thing this checklist exists to catch. We run this same sequence against our own production server at slipstack.dev/mcp after every deploy.