Remote MCP Server Guide

Build an MCP server, run it locally, and deploy it as a remote service any MCP client (agents, IDEs) can call over the network. A complete, runnable reference implementation ships alongside this guide: example-mcp-server/.

🤖 LLM agents: if you're an agent helping a user wrap their own code as a remote MCP server, read the agent playbook → for-llm-agents.md first — it has the build steps, the conventions to follow, and the decisions to confirm with your user before writing code. Machine-readable site index: /llms.txt.

Who this is for

You're comfortable with Python and know the basics of MCP (an LLM/agent calls "tools" your server exposes), and you want to put an MCP server online so an agent can reach it.

Prerequisites

  • Python 3.10+ and uv (or plain pip).
  • For the remote part only: a Cloudflare account with a domain on it (Option A), or your own host, domain, and reverse proxy (Option B).

What you'll have at the end: a working MCP server reachable at https://<your-host>/mcp, protected by an auth token, that an agent can connect to and call.

Reading path: Concepts → Architecture → Run locally → Deploy remotely → Connect a client → Troubleshooting. If you just want it running, skip to Run it locally.


Concepts (1 minute)

MCP (Model Context Protocol) is a standard way for an LLM/agent to discover and call your code. Your server exposes tools (functions the agent can call), and optionally resources (readable data) and prompts (templates). Full spec: https://modelcontextprotocol.io.

An MCP server can run two ways:

| | Local (stdio) | Remote (HTTP) ← this guide |
|---|---|---|
| How the client reaches it | spawns your process locally | over the network at a URL |
| Good for | quick local tools | heavy/long jobs, GPU, shared across teams, wrapping a service |
| Transport | stdio | Streamable HTTP (endpoint at /mcp) |

This guide uses the official mcp Python SDK (FastMCP) with Streamable HTTP.


Architecture (mental model)

flowchart LR
  A["Agent / IDE"] --> B["MCP Client"]
  B -->|"HTTPS /mcp + token"| C["Cloudflare Access<br/>(checks service token)"]
  C --> D["Cloudflare Tunnel<br/>(cloudflared)"]
  D --> E["Local FastMCP server<br/>127.0.0.1:8900"]
  E --> F["Tool / Job / File"]

What each hop does:

  1. Agent / IDE wants to call a tool.
  2. MCP Client opens a Streamable-HTTP session to https://<your-host>/mcp, sending the auth headers.
  3. Cloudflare Access checks the service token at the edge; no token → blocked (401/403).
  4. Cloudflare Tunnel (cloudflared) forwards allowed traffic to your machine — no open ports, no public IP, TLS handled for you.
  5. FastMCP server (bound to 127.0.0.1) handles the MCP request.
  6. It runs a tool, kicks off a job, or serves a file.

Locally (the next section) you talk straight to step 5 — no Cloudflare needed.


Run it locally

Get example-mcp-server/ (it ships with this guide; on the docs site use the Download button).

cd example-mcp-server
python --version            # need 3.10+
uv sync                     # create venv + install deps   (or: pip install mcp)
python server.py            # start the server

You should see uvicorn start:

INFO:     Started server process [12345]
INFO:     Uvicorn running on http://127.0.0.1:8900 (Press CTRL+C to quit)

In a second shell, verify it:

curl -s http://127.0.0.1:8900/healthz
# -> {"ok": true}

python smoke_test.py http://127.0.0.1:8900/mcp

Expected success output:

tools: ['add', 'echo', 'start_render', 'get_render_status', 'get_render_result']
add(2,3) -> {'sum': 5.0}
start_render -> {'job_id': '2669ffa4...', 'state': 'queued'}
status -> running 1/3 [1/3] preparing assets
status -> running 2/3 [2/3] rendering frames
status -> running 3/3 [3/3] encoding output
status -> succeeded 3/3 done
result -> {'ready': True, 'url': 'http://127.0.0.1:8900/files/2669ffa4...'}
OK ✅

No .env is needed for local runs. When that works, you have a real MCP server — now make it remote.


Server reference

Defining tools

A tool is a decorated function. Its docstring is what the agent reads to decide when to call it, so make it precise. Return JSON-serializable data; on failure return {"error": "..."} instead of raising. Validate inputs.

from typing import Annotated

from mcp.server.fastmcp import FastMCP
from pydantic import Field

mcp = FastMCP("example-mcp", host="127.0.0.1", port=8900)

@mcp.tool()
def add(
    a: Annotated[float, Field(description="First number to add.")],
    b: Annotated[float, Field(description="Second number to add.")],
) -> dict:
    """Add two numbers and return {"sum": a+b}."""
    return {"sum": a + b}

def main():
    mcp.run(transport="streamable-http")   # endpoint at /mcp

if __name__ == "__main__":
    main()

Per-parameter descriptions. FastMCP derives inputSchema from the type hints, but the docstring only becomes the tool-level description; FastMCP does not parse a docstring Args: block into per-parameter docs. To describe each argument (which the agent sees in the schema), annotate it with Annotated[T, Field(description="...")], as above. Keep the Python default outside the Annotated (x: Annotated[int, Field(description="...")] = 8) so the parameter stays optional. Field can also carry validation (ge/le/min_length/pattern) and examples, which flow into the schema too. Note that validation constraints reject out-of-range values, so don't put them on a parameter you intend to silently clamp.
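To make the clamp-versus-reject distinction concrete, here is a minimal sketch in plain Python. The tool name make_thumbnail, its width parameter, and the bounds are hypothetical and not part of the example server. In a real server the function would carry the @mcp.tool() decorator and an Annotated description without ge/le constraints.

```python
MIN_WIDTH = 16      # hypothetical bounds, chosen for illustration
MAX_WIDTH = 4096


def make_thumbnail(width: int = 256) -> dict:
    """Return the width that would be used, clamped into [MIN_WIDTH, MAX_WIDTH]."""
    # Input that cannot be repaired is reported as an error dict, not raised.
    if isinstance(width, bool) or not isinstance(width, int):
        return {"error": f"width must be an integer, got {type(width).__name__}"}
    # Repairable input is clamped silently. A Field(ge=..., le=...) constraint
    # would have rejected these values before this line ever ran.
    clamped = max(MIN_WIDTH, min(MAX_WIDTH, width))
    return {"width": clamped, "clamped": clamped != width}
```

Reporting clamped in the payload is a design choice of this sketch: the agent can see that its request was adjusted without the call failing.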

Tool requirements: each name matches ^[A-Za-z0-9_-]+$ and is unique, each description is non-empty, and each inputSchema is valid JSON Schema (FastMCP derives it from the type hints, with per-parameter descriptions coming from Annotated[..., Field(...)]).
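The naming rules above can be checked mechanically before you ship. The following is a rough sketch, assuming each tool is available as a dict with name, description, and inputSchema keys (for example, taken from a tools/list response). The helper check_tools is hypothetical, and it only checks that inputSchema is a dict, not that it is valid JSON Schema.

```python
import re

TOOL_NAME_RE = re.compile(r"^[A-Za-z0-9_-]+$")


def check_tools(tools: list[dict]) -> list[str]:
    """Return human-readable problems; an empty list means every check passed."""
    problems = []
    seen = set()
    for tool in tools:
        name = tool.get("name", "")
        if not TOOL_NAME_RE.fullmatch(name):
            problems.append(f"invalid name: {name!r}")
        if name in seen:
            problems.append(f"duplicate name: {name!r}")
        seen.add(name)
        if not str(tool.get("description") or "").strip():
            problems.append(f"empty description: {name!r}")
        # Shallow check only: a full JSON Schema validation is out of scope here.
        if not isinstance(tool.get("inputSchema"), dict):
            problems.append(f"missing inputSchema: {name!r}")
    return problems
```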

HTTP endpoints

| Method & Path | Purpose | Auth |
|---|---|---|
| POST /mcp | MCP Streamable HTTP (initialize, tools/list, tools/call, …) | required |
| GET /healthz | Liveness probe → {"ok": true} | required* |
| GET /files/{id} | Download an artifact produced by a job | required* |

* Behind Cloudflare Access, every path on the host requires the service token. With the app-level Bearer option (3B), /mcp is gated by the built-in verifier, /files by an in-handler check, and /healthz is intentionally public.

Add non-MCP routes with @mcp.custom_route:

from starlette.requests import Request
from starlette.responses import JSONResponse, FileResponse

@mcp.custom_route("/healthz", methods=["GET"])
async def healthz(request: Request):
    return JSONResponse({"ok": True})

Async job contract (long tasks)

Never block a tool for minutes. Split long work into three tools:

| Tool | Input | Output |
|---|---|---|
| start_<x> | job spec | {job_id, state} |
| get_<x>_status | job_id | {job_id, state, stage, total_stages, message, error?} |
| get_<x>_result | job_id | {ready, url, …} |

state ∈ {queued, running, succeeded, failed}: exactly these four; clients treat succeeded / failed as terminal. The client calls start_*, polls get_*_status (typically every 10–60 s) until terminal, then calls get_*_result. See start_render / get_render_status / get_render_result in example-mcp-server/server.py (a background thread does the work; the demo job store is in-memory, so in production use a DB, Redis, or a file so that jobs survive restarts).
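The three-tool contract can be sketched without any MCP dependency. This is an illustrative stand-in, not the code in example-mcp-server/server.py: the names start_job, get_job_status, get_job_result, JOBS, and BASE_URL are assumptions, and time.sleep stands in for real work. In a server, each of the three public functions would be registered with @mcp.tool().

```python
import threading
import time
import uuid

STEPS = ["preparing assets", "rendering frames", "encoding output"]
BASE_URL = "http://127.0.0.1:8900"      # hypothetical origin for result URLs
JOBS: dict[str, dict] = {}              # in-memory demo store; lost on restart
LOCK = threading.Lock()


def _run(job_id: str, step_seconds: float) -> None:
    """Background worker: walk the steps, updating the shared job record."""
    try:
        for i, step in enumerate(STEPS, start=1):
            with LOCK:
                JOBS[job_id].update(state="running", stage=i,
                                    message=f"[{i}/{len(STEPS)}] {step}")
            time.sleep(step_seconds)    # stand-in for real work
        with LOCK:
            JOBS[job_id].update(state="succeeded", stage=len(STEPS), message="done")
    except Exception as exc:            # surface failure through the contract
        with LOCK:
            JOBS[job_id].update(state="failed", error=str(exc))


def start_job(step_seconds: float = 0.01) -> dict:
    """Register the job, start the worker thread, and return immediately."""
    job_id = uuid.uuid4().hex
    with LOCK:
        JOBS[job_id] = {"job_id": job_id, "state": "queued", "stage": 0,
                        "total_stages": len(STEPS), "message": "queued"}
    threading.Thread(target=_run, args=(job_id, step_seconds), daemon=True).start()
    return {"job_id": job_id, "state": "queued"}


def get_job_status(job_id: str) -> dict:
    """Return a snapshot of the job record; never blocks on the worker."""
    with LOCK:
        job = JOBS.get(job_id)
        return dict(job) if job else {"error": f"unknown job_id: {job_id}"}


def get_job_result(job_id: str) -> dict:
    """ready is True only once the job has succeeded."""
    status = get_job_status(job_id)
    if status.get("state") != "succeeded":
        return {"ready": False, "state": status.get("state"), "error": status.get("error")}
    return {"ready": True, "url": f"{BASE_URL}/files/{job_id}"}
```

The lock is held only for dictionary updates, so status calls return at once even while a step is running.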

Progress reporting (required for long jobs)

Status responses drive the caller's UI — consuming agents render a live progress bar automatically from these fields, so report them on every poll:

| Field | Type | Meaning |
|---|---|---|
| stage | int | current step, 0..total_stages |
| total_stages | int | total number of steps |
| message | str | one-line human-readable current step, e.g. "[2/3] rendering frames" |
| job_id | str | echo it in every payload so clients can correlate polls |

Rules:

  • get_<x>_status must return immediately — clients poll it; never block inside it.
  • Derive total_stages from the real pipeline (e.g. len(STEPS), or thread the total through your progress callback). A hardcoded total silently desyncs when you add a step — the caller's progress bar then shows 6/5.
  • On success, set stage = total_stages so the bar completes at 100%.
  • get_<x>_result returns ready: true only after state == "succeeded".
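The rules above can be folded into one small payload builder. This is a sketch under assumptions: status_payload is a hypothetical helper, and STEPS mirrors the three stages shown in the smoke-test output. The point it illustrates is that total_stages is always computed from the step list, so adding a step cannot desync the bar.

```python
STEPS = ["preparing assets", "rendering frames", "encoding output"]


def status_payload(job_id: str, state: str, stage: int, steps: list[str] = STEPS) -> dict:
    """Build one status response; total_stages always tracks len(steps)."""
    total = len(steps)
    stage = max(0, min(stage, total))           # never report 6/5
    if state == "succeeded":
        stage, message = total, "done"          # bar completes at 100%
    elif stage == 0:
        message = "queued"
    else:
        message = f"[{stage}/{total}] {steps[stage - 1]}"
    return {"job_id": job_id, "state": state, "stage": stage,
            "total_stages": total, "message": message}
```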

File delivery

Return a downloadable URL, never a local path (remote clients can't read your disk):

import os
PUBLIC_BASE_URL = os.getenv("PUBLIC_BASE_URL", "https://example-mcp.<your-zone>")

@mcp.custom_route("/files/{job_id}", methods=["GET"])
async def serve_file(request: Request):
    job_id = request.path_params["job_id"]
    ...  # 404 if not ready
    return FileResponse(path, filename=f"{job_id}.bin")

# get_render_result returns: {"ready": True, "url": f"{PUBLIC_BASE_URL}/files/{job_id}"}
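A small helper can keep URL construction in one place. The sketch below is illustrative: render_result is a hypothetical name, and the job_id format check assumes IDs are 32 lowercase hex characters (as produced by uuid4().hex), which the guide does not specify.

```python
import re

JOB_ID_RE = re.compile(r"[0-9a-f]{32}")     # assumed format: uuid4().hex


def render_result(public_base_url: str, job_id: str, ready: bool) -> dict:
    """Build the result payload with a URL a remote client can fetch."""
    # Refuse IDs that could smuggle path segments into the URL.
    if not JOB_ID_RE.fullmatch(job_id):
        return {"error": f"malformed job_id: {job_id!r}"}
    if not ready:
        return {"ready": False}
    # Strip a trailing slash so the join never produces "//files".
    return {"ready": True, "url": f"{public_base_url.rstrip('/')}/files/{job_id}"}
```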

Deploy remotely

1. Configure & run the server

Bind to 127.0.0.1 (only the tunnel reaches it). Set env (e.g. in .env):

| Var | Default | Purpose |
|---|---|---|
| MCP_HOST | 127.0.0.1 | bind address |
| MCP_PORT | 8900 | port |
| PUBLIC_BASE_URL | | public origin, used to build file URLs |
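One way to read these variables is a small settings loader that fails fast on bad values. This is a sketch, not the example server's code: Settings and load_settings are hypothetical names, and falling back to the local origin when PUBLIC_BASE_URL is unset is an assumption made here for local runs.

```python
import os
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class Settings:
    host: str
    port: int
    public_base_url: str


def load_settings(env=os.environ) -> Settings:
    """Read MCP_HOST, MCP_PORT and PUBLIC_BASE_URL, validating the URL early."""
    host = env.get("MCP_HOST", "127.0.0.1")
    port = int(env.get("MCP_PORT", "8900"))
    # Assumption: with no PUBLIC_BASE_URL set, fall back to the local origin.
    base = env.get("PUBLIC_BASE_URL", f"http://{host}:{port}").rstrip("/")
    parsed = urlparse(base)
    # Catch an unfilled "<your-zone>" placeholder before it reaches a client.
    if "<" in base or parsed.scheme not in ("http", "https") or not parsed.hostname:
        raise ValueError(f"PUBLIC_BASE_URL is not a usable URL: {base!r}")
    return Settings(host, port, base)
```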

2. Expose over HTTPS

The server binds 127.0.0.1, so something in front must terminate TLS and forward to it. Two paths — pick the one matching your auth choice in step 3:

  • Option A — Cloudflare Tunnel (recommended): Cloudflare gives you TLS, a hostname, and edge auth (the service token in 3A); your app needs no auth code.
  • Option B — your own reverse proxy: you terminate TLS and the app verifies a Bearer token with FastMCP's built-in token verification (3B). No Cloudflare involved.

Option A — Cloudflare Tunnel

No open ports, no public IP, no self-managed TLS.

cloudflared tunnel login
cloudflared tunnel create example-mcp
# edit ~/.cloudflared/config.yml   (template: deploy/cloudflared.config.example.yml)
cloudflared tunnel route dns example-mcp example-mcp.<your-zone>
cloudflared tunnel run example-mcp

# ~/.cloudflared/config.yml
tunnel: <TUNNEL_ID>
credentials-file: /home/<user>/.cloudflared/<TUNNEL_ID>.json
ingress:
  - hostname: example-mcp.<your-zone>
    service: http://127.0.0.1:8900     # /mcp, /healthz, /files all go here
  - service: http_status:404

Use a first-level subdomain (example-mcp.<your-zone>), not a deeper one (example-mcp.api.<your-zone>). Free Cloudflare Universal SSL only covers the apex and *.<your-zone>; a 2-level subdomain has no cert and its custom domain hangs at "Verifying".
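The subdomain rule is easy to check before you create DNS records. A minimal sketch, with is_first_level_subdomain as a hypothetical helper that only compares labels and does not query Cloudflare:

```python
def is_first_level_subdomain(hostname: str, zone: str) -> bool:
    """True if hostname sits exactly one label below zone (matches *.zone)."""
    hostname = hostname.lower().rstrip(".")
    zone = zone.lower().rstrip(".")
    suffix = "." + zone
    if not hostname.endswith(suffix):
        return False
    label = hostname[: -len(suffix)]
    # One label only: a dot here means a deeper subdomain.
    return bool(label) and "." not in label
```

For example, example-mcp.example.com passes for the zone example.com, while example-mcp.api.example.com does not.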

Option B — Your own reverse proxy (no Cloudflare)

Terminate HTTPS yourself in front of the 127.0.0.1:8900 origin. Caddy is the least effort (automatic Let's Encrypt certificates):

# Caddyfile  —  serves https://example-mcp.example.com
example-mcp.example.com {
    reverse_proxy 127.0.0.1:8900     # forwards /mcp, /healthz, /files
}

(nginx + certbot, or a cloud load balancer, work the same way.) Point the host's DNS A/AAAA record at your machine and open :443. There is now no edge auth — the proxy forwards everything — so you MUST add the app-level Bearer check (Option B of step 3) or your tools are open to anyone who finds the URL.

3. Authenticate every request

A public endpoint must be gated, or anyone can call your tools. Use the option matching how you exposed it.

Option A — Cloudflare Access service token (pairs with 2A)

  1. Zero Trust → Access → Applications → Add → Self-hosted; domain = example-mcp.<your-zone>.
  2. Access → Service Auth → Create Service Token → copy Client ID and Client Secret.
  3. Add a policy: Action = Service Auth, Include = that token.

Only requests with CF-Access-Client-Id / CF-Access-Client-Secret reach your origin. Give each server its own token, named per server (e.g. EXAMPLE_CF_CLIENT_ID / EXAMPLE_CF_CLIENT_SECRET). The app itself needs no auth code — the edge enforces it.
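On the consumer side, the per-server naming convention can be wrapped in a helper that builds both headers and fails loudly when a variable is missing. The function access_headers and its prefix argument are hypothetical; the header names and the EXAMPLE_CF_CLIENT_ID / EXAMPLE_CF_CLIENT_SECRET variable names come from this guide.

```python
import os


def access_headers(prefix: str, env=os.environ) -> dict[str, str]:
    """Build the two Cloudflare Access headers from <PREFIX>_CF_CLIENT_ID/_SECRET."""
    names = {
        "CF-Access-Client-Id": f"{prefix}_CF_CLIENT_ID",
        "CF-Access-Client-Secret": f"{prefix}_CF_CLIENT_SECRET",
    }
    # Sending only one of the two headers is rejected at the edge,
    # so refuse to build a partial set.
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return {header: env[var] for header, var in names.items()}
```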

Option B — App-level Bearer token (pairs with 2B)

With no Cloudflare Access there is no edge auth, so verify a shared secret inside the app. FastMCP has token verification built in: pass a TokenVerifier to the constructor and the SDK itself gates /mcp, answering missing/bad tokens with a spec-correct 401 + WWW-Authenticate: Bearer header:

import hmac
import os

from mcp.server.auth.provider import AccessToken, TokenVerifier
from mcp.server.auth.settings import AuthSettings

EXPECTED = os.environ["MCP_BEARER_TOKEN"]   # a long random secret, supplied via env

class StaticVerifier(TokenVerifier):
    """Accept exactly one shared token (constant-time compare)."""
    async def verify_token(self, token: str) -> AccessToken | None:
        # Compare as bytes: compare_digest on str raises TypeError for non-ASCII input.
        if hmac.compare_digest(token.encode(), EXPECTED.encode()):
            return AccessToken(token=token, client_id="shared-secret", scopes=[])
        return None

mcp = FastMCP(
    "example-mcp", host=MCP_HOST, port=MCP_PORT,
    token_verifier=StaticVerifier(),
    auth=AuthSettings(                  # OAuth-shaped plumbing the SDK insists on:
        issuer_url=PUBLIC_BASE_URL,     #   nominal "issuer" — never contacted
        resource_server_url=None,       #   skip the RFC 9728 metadata endpoint
    ),
)

Built-in auth covers only /mcp; @mcp.custom_route paths stay public by design (the SDK intends them for health checks). That's exactly right for /healthz (the container HEALTHCHECK, which uses plain urllib with no token, keeps working) but wrong for /files, so gate that one inside its handler:

@mcp.custom_route("/files/{job_id}", methods=["GET"])
async def serve_file(request: Request):
    token = request.headers.get("authorization", "").removeprefix("Bearer ").strip()
    if not hmac.compare_digest(token.encode(), EXPECTED.encode()):   # bytes: str compare rejects non-ASCII
        return JSONResponse({"error": "unauthorized"}, status_code=401)
    ...                                 # 404 if not ready, then FileResponse (unchanged)

  • Nothing else changes: mcp.run(transport="streamable-http") stays (no uvicorn wrapper, no custom middleware), and the client just sends Authorization: Bearer <token> (see Connect a client).
  • The two AuthSettings fields are required because the SDK models auth as an OAuth resource server; for a static shared secret they are inert — issuer_url is informational and resource_server_url=None disables the /.well-known/oauth-protected-resource metadata route. Inert but still validated: issuer_url must parse as a URL, so a literal https://example-mcp.<your-zone> placeholder left in PUBLIC_BASE_URL crashes at startup (ValidationError: … invalid international domain name) — export the real value first.
  • TokenVerifier is the real seam: swap StaticVerifier for a JWT or token-introspection verifier later without touching anything else. (The standalone FastMCP package — gofastmcp.com — ships ready-made verifiers behind the same idea, e.g. JWTVerifier; note its StaticTokenVerifier is dev/test-only per its own docs.)
  • The shipped example server is unauthenticated by design (it expects to sit behind Access); this constructor swap is for when you front it with your own TLS instead. Hand the consumer the token the same way you would a service token — over a secure channel, never in the repo.
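For the shared secret itself, the standard library covers generation, header parsing, and comparison. The helper names below (new_token, bearer_from_header, token_matches) are hypothetical. Comparing bytes rather than str is a deliberate choice in this sketch, because hmac.compare_digest raises TypeError when given a str containing non-ASCII characters.

```python
import hmac
import secrets


def new_token() -> str:
    """Generate a long random secret suitable for MCP_BEARER_TOKEN."""
    return secrets.token_urlsafe(32)    # 32 random bytes, URL-safe text


def bearer_from_header(header_value: str) -> str:
    """Extract the token from an Authorization header; '' if not Bearer."""
    scheme, _, token = header_value.partition(" ")
    return token.strip() if scheme.lower() == "bearer" else ""


def token_matches(presented: str, expected: str) -> bool:
    """Constant-time compare on bytes, so non-ASCII input cannot raise."""
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))
```

Run new_token() once, store the value in the server's environment as MCP_BEARER_TOKEN, and pass the same value to the consumer over a secure channel.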

4. Keep it running (systemd)

Run the server and the tunnel as user services so they survive crashes/reboots (templates: deploy/example-mcp.service, deploy/cloudflared-example-mcp.service):

systemctl --user daemon-reload
systemctl --user enable --now example-mcp cloudflared-example-mcp
loginctl enable-linger "$USER"      # survive logout / reboot

4b. Or run it in a container (Docker)

Containerizing is an alternative to the systemd path above — pick one. A container pins the runtime and deps and restarts on its own; the worked example ships a Dockerfile, a standalone docker-compose.yml, and a .dockerignore so you can build and run it directly:

cd example-mcp-server
docker build -t example-mcp .
docker run --rm -p 8900:8900 example-mcp     # MCP at http://localhost:8900/mcp

# verify (second shell):
curl -s http://localhost:8900/healthz        # -> {"ok": true}
python smoke_test.py http://localhost:8900/mcp

Or with the bundled standalone compose file (host-published on 8900, auto-restart, healthcheck):

docker compose up --build        # build + run; MCP at http://localhost:8900/mcp

Container specifics:

  • MCP_HOST=0.0.0.0 inside the container (the image sets this by default), not 127.0.0.1. The localhost bind that's correct on a host keeps the port unreachable from outside the container, so the published port / compose network can't see it. Binding 0.0.0.0 here is safe because the container boundary (and, when remote, Cloudflare Access) is the real perimeter — there is no app-level auth.
  • The image declares a HEALTHCHECK that hits GET /healthz with stdlib urllib (no curl in the slim image), so docker ps / compose report the container healthy only once the server answers. Use it to health-order dependents.
  • Set PUBLIC_BASE_URL at run time (e.g. -e PUBLIC_BASE_URL=https://example-mcp.<your-zone>) so the URLs from get_render_result point at your public origin, not 127.0.0.1.
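A urllib-only health probe of the kind described above might look like the following. It is a sketch, not the script shipped in the image: the function name healthy and its defaults are assumptions, and the URL targets the port used throughout this guide.

```python
import http.client
import json
import urllib.request


def healthy(url: str = "http://127.0.0.1:8900/healthz", timeout: float = 3.0) -> bool:
    """True only if the endpoint answers 200 with {"ok": true}."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200 and json.load(resp).get("ok") is True
    except (OSError, ValueError, AttributeError, http.client.HTTPException):
        # OSError covers refused connections, timeouts and HTTP error statuses
        # (URLError and HTTPError subclass it); ValueError covers invalid JSON;
        # AttributeError covers a JSON body that is not an object.
        return False
```

A container health command would typically turn this into an exit status, for example by calling sys.exit(0 if healthy() else 1) from a small script.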

Remote + auth are unchanged. A plain docker run is unauthenticated — Docker only replaces the "keep it running" step. To go remote you still front the container with one of the step 2–3 paths: either Cloudflare Tunnel + Access (point the tunnel ingress at the published port, service: http://127.0.0.1:8900; the service token gates every path at the edge), or your own reverse proxy + the built-in Bearer verifier (apply the 3B constructor swap, rebuild the image, and set MCP_BEARER_TOKEN in the container env). Either way the auth lives outside docker run itself.


Connect a client / agent

Point any MCP client at https://example-mcp.<your-zone>/mcp with the service-token headers:

import asyncio, os
from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

HEADERS = {
    "CF-Access-Client-Id": os.environ["EXAMPLE_CF_CLIENT_ID"],
    "CF-Access-Client-Secret": os.environ["EXAMPLE_CF_CLIENT_SECRET"],
}

async def main():
    async with streamablehttp_client("https://example-mcp.<your-zone>/mcp", headers=HEADERS) as (r, w, _):
        async with ClientSession(r, w) as s:
            await s.initialize()
            print([t.name for t in (await s.list_tools()).tools])   # ['add', 'echo', 'start_render', ...]
            print(await s.call_tool("add", {"a": 2, "b": 3}))       # a CallToolResult whose content carries {"sum": 5.0}

asyncio.run(main())

For a Bearer-token deployment (step 3 Option B), send one Authorization header instead of the two CF headers — everything else is identical (per-server naming again: the consumer's EXAMPLE_MCP_TOKEN holds the same secret the server reads as MCP_BEARER_TOKEN):

HEADERS = {"Authorization": f"Bearer {os.environ['EXAMPLE_MCP_TOKEN']}"}

Agent frameworks register it the same way: a streamable-HTTP MCP server at the /mcp URL with whichever auth headers your deployment uses.
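For servers that follow the async job contract, a client also needs a polling loop. The sketch below is transport-agnostic: poll_until_terminal is a hypothetical helper that takes any async callable returning the status dict. In a real client that callable would wrap a call_tool request to the status tool and decode the result, which depends on your SDK version and is not shown here.

```python
import asyncio
import time
from typing import Awaitable, Callable

TERMINAL = {"succeeded", "failed"}


async def poll_until_terminal(
    get_status: Callable[[], Awaitable[dict]],
    interval: float = 10.0,
    timeout: float = 3600.0,
) -> dict:
    """Call get_status() every `interval` seconds until a terminal state."""
    deadline = time.monotonic() + timeout
    while True:
        status = await get_status()
        if status.get("state") in TERMINAL:
            return status
        # Give up if the next poll would land past the deadline.
        if time.monotonic() + interval > deadline:
            raise TimeoutError(f"job still {status.get('state')!r} after {timeout}s")
        await asyncio.sleep(interval)
```

Once the returned state is succeeded, the client calls the result tool; on failed, it reads the error field from the same payload.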

Hand-off: the registration JSON

When your server is deployed, the deliverable to the consuming agent team is one JSON block plus the secret values (service token or Bearer token) sent over a secure channel. Many agent frameworks (nanobot, Claude Desktop, …) register MCP servers in an mcpServers block of this general shape, but field names such as type and enabledTools vary between frameworks, so confirm them with the consumer:

{
  "mcpServers": {
    "example-mcp": {
      "type": "streamableHttp",
      "url": "https://example-mcp.<your-zone>/mcp",
      "headers": {
        "CF-Access-Client-Id": "${EXAMPLE_CF_CLIENT_ID}",
        "CF-Access-Client-Secret": "${EXAMPLE_CF_CLIENT_SECRET}"
      },
      "enabledTools": ["add", "echo", "start_render", "get_render_status", "get_render_result"]
    }
  }
}

For a Bearer-token deployment, swap the headers block for a single Authorization entry (the rest is unchanged):

"headers": { "Authorization": "Bearer ${EXAMPLE_MCP_TOKEN}" }

  • Keep ${VAR} placeholders in the JSON: the consumer stores the real token values in their own .env and the framework substitutes them at startup. Never put the secret itself in the JSON.
  • Name the env vars after your server (EXAMPLE_CF_*) so a consumer can hold tokens for several MCP servers side by side without collisions.
  • enabledTools is the consumer-side whitelist — list exactly the tools you intend them to call.
  • Alongside the JSON, hand over: the tool list with one-line descriptions, the async-job contract fields if you have long tasks (see Server reference), and the Client ID/Secret or Bearer token (transmit it securely; an Access secret is shown only once at creation).
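Generating the hand-off block from code keeps the placeholders and env-var names consistent. This is a sketch with hypothetical helpers: registration renders the Access-token variant shown above, and expand imitates the ${VAR} substitution a consuming framework might perform at startup (real frameworks may differ, and this naive version assumes the secrets contain no characters that need JSON escaping).

```python
import json
import re


def registration(name: str, url: str, env_prefix: str, tools: list[str]) -> str:
    """Render the hand-off JSON with ${VAR} placeholders, never real secrets."""
    entry = {
        "type": "streamableHttp",
        "url": url,
        "headers": {
            "CF-Access-Client-Id": "${%s_CF_CLIENT_ID}" % env_prefix,
            "CF-Access-Client-Secret": "${%s_CF_CLIENT_SECRET}" % env_prefix,
        },
        "enabledTools": tools,
    }
    return json.dumps({"mcpServers": {name: entry}}, indent=2)


def expand(text: str, env: dict[str, str]) -> str:
    """Fill ${VAR} placeholders from env; raises KeyError if one is missing."""
    return re.sub(r"\$\{(\w+)\}", lambda m: env[m.group(1)], text)
```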

Troubleshooting

| Symptom | Likely cause | Fix |
|---|---|---|
| ModuleNotFoundError: No module named 'mcp' | wrong/old Python (e.g. system 3.8) | use Python 3.10+; uv sync then .venv/bin/python server.py |
| client hangs / ConnectError to localhost | server not running on that port | confirm python server.py is up; curl 127.0.0.1:<port>/healthz |
| curl: (35) … handshake failure / code 000 on the public URL | no TLS cert yet, or nothing serving | check the tunnel + origin (next row); for a new custom domain wait for the cert to issue |
| public URL returns 530 | tunnel has no active connection (origin unreachable) | cloudflared tunnel info <name>; restart cloudflared tunnel run <name>; verify curl 127.0.0.1:<port>/healthz locally |
| 401 / 403 | missing/wrong auth (Access headers, or the Authorization header in Bearer mode) | Access: send both CF-Access-Client-Id and CF-Access-Client-Secret; check the Service Auth policy on the Access app. Bearer (3B): send Authorization: Bearer <token> matching the server's MCP_BEARER_TOKEN |
| custom domain stuck at "Verifying" | 2-level subdomain not covered by free Universal SSL | use a first-level subdomain (or enable Advanced Certificate Manager / Total TLS) |
| (CI) wrangler … Project not found [code: 8000007] | the Pages/target project doesn't exist | create it first (e.g. wrangler pages project create <name> --production-branch=main) |

A healthy smoke test prints the tools: [...] line, add(2,3) -> {'sum': 5.0}, the start_render → status → result lines, and OK ✅ (see Run it locally). If you get that locally but not remotely, the problem is in the tunnel/Access (or proxy/Bearer) layer, not your server.


Checklist

  • [ ] Server binds 127.0.0.1; not exposed directly (tunnel or reverse proxy only).
  • [ ] Auth in place — Option A: Cloudflare Access service token (edge gates all paths incl. /files); or Option B: built-in Bearer verifier on /mcp + in-handler check on /files (/healthz public by design).
  • [ ] First-level subdomain (free SSL coverage).
  • [ ] Secrets in env / .env, never committed.
  • [ ] Inputs validated; tools return JSON, {"error": ...} on failure.
  • [ ] Long tasks use the async job contract and report stage/total_stages/message in status; tools never block.
  • [ ] Hand-off JSON delivered (mcpServers entry with ${VAR} placeholders + per-server env-var names); token secret sent securely, never committed.
  • [ ] Artifacts returned as URLs, not local paths.
  • [ ] GET /healthz present.
  • [ ] Server kept running — either systemd (Restart=always, linger enabled) or a container (restart: unless-stopped, MCP_HOST=0.0.0.0, /healthz HEALTHCHECK); tunnel + Access (or reverse proxy + Bearer) still front it.