From local STDIO to remote MCP server: the complete migration path

There is a specific moment every MCP server hits: it works beautifully in Claude Desktop on your machine, a teammate asks to use it, and you realize the honest answer is "clone the repo, install Node, edit a JSON file in your Library folder, and restart the app." That answer does not scale to a second engineer, let alone a team or a customer. This is the complete path from a local STDIO server to a remote one: transport, state, auth, the one header gotcha that wastes an afternoon, where to deploy, and how to prove the result actually works.

Why STDIO is a dead end for teams

STDIO transport means the client launches your server as a child process and speaks JSON-RPC over stdin and stdout. For local development it is perfect: no network, no auth, no latency. As a distribution mechanism it fails structurally, not incidentally:

Every user is an operator. Each person installs your runtime and dependencies, keeps them updated, and edits client config by hand. Every laptop runs a different version of your server, and you cannot fix a bug for anyone but yourself.
Secrets sprawl. API keys for the systems your tools touch end up in plaintext config files on every machine.
The biggest client surface cannot reach you. claude.ai in the browser and on mobile cannot spawn a local process. STDIO-only means desktop-app users only.
Zero observability. No logs, no metrics, no error reporting. When it breaks for a user, your debugging tool is a screenshot.

A remote server inverts all four: one deployed version, secrets on the server, every client surface, real telemetry.

The target: one endpoint speaking streamable HTTP

The current remote transport is streamable HTTP: a single endpoint (by convention /mcp) that accepts POSTed JSON-RPC messages. The server answers each POST either with a plain application/json body or with a text/event-stream response that carries one or more messages as server-sent events. It replaced the older HTTP-plus-SSE transport from 2024, which needed a separate long-lived GET stream and a second endpoint; do not build against that pattern in 2026.

The good news is that your tool logic does not change. In the TypeScript SDK, the migration is swapping StdioServerTransport for StreamableHTTPServerTransport behind an HTTP route, and the Python SDK has the equivalent. The shape of the work is a transport swap plus the two design changes below, not a rewrite.

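// Before: the client spawns you
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";

const transport = new StdioServerTransport();
await server.connect(transport);

// After: you are an HTTP endpoint, stateless mode
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";

const transport = new StreamableHTTPServerTransport({
  sessionIdGenerator: undefined, // no sessions, see the stateless section below
});
await server.connect(transport);
// wire transport.handleRequest(req, res) into your POST /mcp route
// (pass the parsed body as a third argument if a body parser already consumed it);
// the SDK's stateless examples create a fresh server and transport per request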

What carries over, and what quietly changes

Your tool definitions, input schemas, and handler logic move unchanged; that is most of the value of the server you already built. Three things change underneath them, and each one is easy to miss because nothing in the type system flags it:

Concurrency arrives. A STDIO process serves exactly one user. A remote endpoint serves everyone at once, so any handler that mutates shared state, writes a fixed temp-file path, or assumes calls arrive one at a time is now a race condition. Audit every tool for re-entrancy before you deploy, not after the first overlapping request corrupts something.
The stdout discipline disappears. Under STDIO, stdout belongs to the protocol and a stray console.log corrupts the stream, which is why your code carefully routes diagnostics to stderr. Over HTTP that constraint is gone: stdout is just a log line. Take the win and wire structured logging while you are in there, because remote users will exercise paths you never hit locally.
Secrets move from env blocks to a secrets manager. Locally, credentials live in the env stanza of each user's client config. Remotely they belong to the deployment, set once, rotated centrally, never pasted into a teammate's JSON file again.
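The fixed temp-file hazard from the concurrency point can be illustrated with a minimal sketch. The handler name renderReport and the file name report.txt are hypothetical; the only real APIs used are Node's fs/promises, os, and path modules. The idea is simply that each call asks mkdtemp for its own scratch directory, so overlapping requests never share a path.

```typescript
import { mkdtemp, readFile, rm, writeFile } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical tool handler: writes its input to a scratch file, reads it back
// (standing in for real processing), and cleans up afterwards.
async function renderReport(input: string): Promise<{ scratchPath: string; text: string }> {
  // mkdtemp appends random characters, so every call gets a distinct directory
  const dir = await mkdtemp(join(tmpdir(), "mcp-tool-"));
  const scratchPath = join(dir, "report.txt");
  try {
    await writeFile(scratchPath, input, "utf8");
    const text = await readFile(scratchPath, "utf8");
    // scratchPath is returned only to show that concurrent calls differ
    return { scratchPath, text };
  } finally {
    // remove the scratch directory whether the call succeeded or failed
    await rm(dir, { recursive: true, force: true });
  }
}
```

Two overlapping calls, for example Promise.all([renderReport("alpha"), renderReport("beta")]), each see only their own file. The same reasoning applies to module-level variables: anything a handler writes should be scoped to the call or keyed by an explicit identifier.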

Design stateless, because the spec is about to force you

The classic STDIO habit is in-memory state: a module-level variable holding the open document, the parsed file, the cursor position. Locally that works because one process serves one user for one session. Remotely it breaks under the first load balancer, and the protocol itself is removing the crutch: the 2026-07-28 spec release deletes protocol sessions entirely. SEP-2567 removes the Mcp-Session-Id header and SEP-2575 removes the initialize handshake, as we documented change by change in our 2026-07-28 migration guide. A remote server built today on session state is a server you migrate twice.

The pattern that survives: every handler is self-contained, and anything that must persist across calls is expressed as an explicit handle. A tool returns { "document_id": "doc_8f3a" }; later calls take document_id as an ordinary argument and load what they need from a database or object store. If you deploy to a serverless platform, this is not even a choice; it is the execution model.
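A minimal sketch of the explicit-handle pattern, under stated assumptions: the DocumentStore interface, the MemoryStore class, and the two tool functions are hypothetical names invented for illustration, and the in-memory map only stands in for a real database or object store.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical store interface; in production this would wrap a database or object store.
interface DocumentStore {
  get(id: string): Promise<string | undefined>;
  put(id: string, body: string): Promise<void>;
}

// In-memory stand-in for illustration only. It does not survive restarts
// and is not shared between instances, so it is not a deployment option.
class MemoryStore implements DocumentStore {
  private docs = new Map<string, string>();
  async get(id: string): Promise<string | undefined> {
    return this.docs.get(id);
  }
  async put(id: string, body: string): Promise<void> {
    this.docs.set(id, body);
  }
}

// Tool 1: create a document and hand the caller an explicit handle.
async function createDocument(store: DocumentStore, body: string) {
  const document_id = `doc_${randomUUID()}`;
  await store.put(document_id, body);
  return { document_id };
}

// Tool 2: takes the handle as an ordinary argument and loads state on demand.
async function appendToDocument(store: DocumentStore, document_id: string, text: string) {
  const current = await store.get(document_id);
  if (current === undefined) throw new Error(`unknown document_id: ${document_id}`);
  const updated = current + text;
  await store.put(document_id, updated);
  return { document_id, length: updated.length };
}
```

Nothing here depends on which process or instance handles the second call. Note that the read-then-write in appendToDocument is itself a race under concurrent calls on the same handle; a real store would use an atomic update or a version check.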

Auth: choose by client surface, not by taste

Remote means strangers can POST to your endpoint, so auth becomes a real decision. There are three workable models, and the right one depends on which clients need to reach you:

Model	Reaches	Cost to build
Authless (rate-limited by IP)	Every MCP client, including claude.ai	Trivial
API key in a header	Claude Code, Cursor, config-file clients	Low
OAuth 2.1 + Dynamic Client Registration	Everything above, plus authenticated claude.ai	High

The asymmetry that surprises people: API keys do not reach claude.ai. The claude.ai custom-connector UI takes a server URL plus optional OAuth client credentials under Advanced settings; there is no field for an arbitrary authorization header. Claude Code, by contrast, accepts one directly:

claude mcp add --transport http my-server https://api.example.com/mcp \
  --header "Authorization: Bearer sk_live_..."

So the pragmatic ladder is: ship an authless tier with a small per-IP quota so anyone can evaluate you from any client, accept API keys in the Authorization header for developer surfaces, and add OAuth 2.1 with DCR and RFC 9728 protected-resource metadata only when you need authenticated claude.ai users. OAuth is also where the 07-28 release tightens requirements most, so if you build it, build it against the new spec from day one.
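The first two rungs of that ladder can be sketched in a few lines. Everything here is an illustrative assumption, not an SDK API: classify and IpQuota are hypothetical names, the quota is a simple fixed-window counter, and key validation against your own key store is deliberately left out.

```typescript
type Tier = { kind: "api_key"; key: string } | { kind: "anonymous"; ip: string };

// Decide which tier a request belongs to. Expects lower-cased header names,
// as Node's http module provides them. Validating the key is out of scope here.
function classify(headers: Record<string, string | undefined>, ip: string): Tier {
  const auth = (headers["authorization"] ?? "").trim();
  const match = /^Bearer\s+(\S+)$/i.exec(auth);
  return match ? { kind: "api_key", key: match[1] } : { kind: "anonymous", ip };
}

// Fixed-window counter per IP. In-memory for illustration only; with several
// instances or a serverless platform the counts would need a shared store.
class IpQuota {
  private windows = new Map<string, { start: number; count: number }>();
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request fits in the current window for this IP.
  allow(ip: string, now: number = Date.now()): boolean {
    const w = this.windows.get(ip);
    if (!w || now - w.start >= this.windowMs) {
      this.windows.set(ip, { start: now, count: 1 });
      return true;
    }
    if (w.count >= this.limit) return false;
    w.count += 1;
    return true;
  }
}
```

In a route handler, an anonymous tier would go through quota.allow(tier.ip) and receive an HTTP 429 when it returns false, while an api_key tier would be checked against your key store instead.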

The Accept-header gotcha

This one costs almost everyone an afternoon. The streamable HTTP transport requires clients to send Accept: application/json, text/event-stream on every POST, advertising that they can handle both response framings. SDK-based servers enforce it: leave the header off and you get an HTTP 406 before your handler ever runs. Your server is fine; your curl command is not.

curl -X POST https://api.example.com/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}'

The second half of the gotcha: even a single JSON-RPC response may come back SSE-framed, with Content-Type: text/event-stream and the actual message wrapped in an event:

event: message
data: {"jsonrpc":"2.0","id":1,"result":{"tools":[...]}}

Any script that pipes the body straight into JSON.parse or jq will choke on the event: and data: lines. Check the response content type and strip the SSE framing when present, or better, do your testing with a real MCP client like the Inspector below instead of raw curl.
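If you do script against the endpoint directly, the content-type check and SSE unwrapping can look roughly like this. It is a minimal sketch rather than a full SSE parser: parseMcpBody is a hypothetical helper name, it assumes the whole response body has already been read into a string, and it ignores SSE fields other than data.

```typescript
// Extract JSON-RPC messages from a streamable HTTP response body.
// Handles both framings: plain application/json and text/event-stream.
function parseMcpBody(contentType: string, body: string): unknown[] {
  const mime = contentType.split(";")[0].trim().toLowerCase();
  if (mime === "application/json") {
    const parsed: unknown = JSON.parse(body);
    return Array.isArray(parsed) ? parsed : [parsed];
  }
  if (mime !== "text/event-stream") {
    throw new Error(`unexpected content type: ${contentType}`);
  }
  const messages: unknown[] = [];
  // SSE events are separated by a blank line; normalize CR and CRLF first
  for (const event of body.replace(/\r\n?/g, "\n").split(/\n\n+/)) {
    const data = event
      .split("\n")
      .filter((line) => line.startsWith("data:"))
      .map((line) => line.slice(5).replace(/^ /, "")) // drop one optional leading space
      .join("\n"); // multi-line data fields are joined with newlines
    // events without a data payload (comments, keep-alives) are skipped
    if (data) messages.push(JSON.parse(data));
  }
  return messages;
}
```

With fetch, that would be used as parseMcpBody(res.headers.get("content-type") ?? "", await res.text()), and the result is always an array of messages regardless of which framing the server chose.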

Where to deploy

Three targets cover nearly every case. Vercel with the mcp-handler package gives you a fresh, stateless handler per request, which matches the post-07-28 model exactly. Cloudflare Workers has first-class MCP support in its agents SDK and puts you close to users globally. A plain long-lived Node or Python process on Fly.io, Render, or your own infrastructure works too and is the natural fit if your tools hold large models or warm caches. Two checks regardless of target: your platform's request timeout must exceed your slowest tool, and every proxy, CDN, and WAF in front of you must pass Mcp-* headers through untouched, which the 07-28 spec starts requiring via SEP-2243.
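One way to make the timeout check concrete is to give each tool its own deadline, set below the platform limit, so a slow call fails with a clear error instead of an opaque gateway timeout. This is an optional sketch, not something the spec or any platform requires; withDeadline is a hypothetical helper and the millisecond budget is yours to choose.

```typescript
// Reject if `work` has not settled within `ms` milliseconds.
// Note: this does not cancel the underlying work, it only stops waiting for it.
function withDeadline<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} exceeded ${ms} ms`)), ms);
  });
  // whichever settles first wins; always clear the timer so it cannot fire later
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}
```

A handler would wrap its slow step, for example withDeadline(runSearch(query), 25_000, "search_docs") on a platform with a 30-second limit, where runSearch is a placeholder for your own code.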

The migration, in order

Done as a sequence, this is days of work, not weeks. The order matters because each step is verifiable before the next one starts:

1. Swap the transport to streamable HTTP behind a POST /mcp route and run it locally against the Inspector.
2. Remove in-memory and session state; convert anything persistent to explicit handles backed by a store.
3. Audit tools for concurrent execution.
4. Pick the auth model for your client surfaces and implement the cheapest one that covers them.
5. Deploy, then walk the request path for header stripping and timeout limits.
6. Verify from the outside, which is the next section.

Verify it with the MCP Inspector CLI

Do not declare victory from a browser tab. The MCP Inspector has a CLI mode that exercises your server exactly the way a client does. List your tools:

npx @modelcontextprotocol/inspector --cli \
  https://api.example.com/mcp --transport http --method tools/list

Call one with arguments, end to end:

npx @modelcontextprotocol/inspector --cli \
  https://api.example.com/mcp --transport http \
  --method tools/call --tool-name search_docs --tool-arg query=billing

And confirm the authenticated path works with a header:

npx @modelcontextprotocol/inspector --cli \
  https://api.example.com/mcp --transport http \
  --method tools/list --header "Authorization: Bearer sk_live_..."

If all three pass from a machine that is not yours, against the deployed URL, with no local process involved, the migration is real. Then add the server to Claude Code and claude.ai and run one tool call from each, because client quirks are exactly the thing this checklist exists to catch. We run this same sequence against our own production server at slipstack.dev/mcp after every deploy.