May 5, 2026

Building an MCP server from a REST API

A hands-on guide to implementing an MCP server in Python: tools, resources, prompts, transports, and authentication, with a full worked example.

This is a practical guide to building an MCP server that wraps an existing REST API. By the end, you'll have a working server in Python that exposes a real product surface to an LLM agent through tools, resources, and prompts, with proper error handling, OAuth-based authentication, and a path from local development to a remote production deployment.

Note that most of the interesting work in building an MCP server happens before the code: deciding which endpoints become tools, which become resources, which workflows deserve prompts, and which parts of your API to leave out entirely. We covered that in a separate post, Designing an MCP server from a REST API, and the example surface used here came out of that exercise. You don't need to read the design post to follow this one. But if at any point a choice below feels arbitrary ("why is this a tool and not a resource?", "why combine these endpoints into one tool?"), the design post is where the reasoning lives.

The example throughout is Pantry, a fictional cookbook and meal-planning service. It has 24 REST endpoints covering recipes, ingredients, a user pantry, meal plans, and shopping lists. We're going to expose a curated subset of that to an agent: eight tools, three resources, two prompts. Concrete enough to actually build, generic enough that the patterns transfer to any REST API you have lying around.

What we're building

The MCP surface we're building looks like this.

Eight tools:

  • search_recipes
  • save_recipe
  • whats_for_dinner
  • plan_week_meals
  • update_pantry
  • add_recipe_to_plan
  • get_shopping_list
  • suggest_substitutes

Three resources:

  • recipe://{id}
  • pantry://current
  • meal-plan://{id}

Two prompts:

  • weekly_meal_planning
  • recipe_from_text

Nine of Pantry's 24 endpoints are deliberately not exposed; the rest are merged, split, or wrapped according to a few recurring patterns. We'll show four tools in full and gesture at the rest. The pattern repeats; once you've seen it, you've seen it.

Project setup

We're going to use the official Python SDK, which ships with FastMCP, the high-level framework that turns decorated Python functions into MCP-compliant tools, resources, and prompts. FastMCP handles JSON-RPC framing, schema generation from type hints, and transport plumbing, so you get to write Python and it handles the protocol.

There's also a separate, more featureful fastmcp package on PyPI. It's good, but for a tutorial that needs to age well, the SDK that ships with the spec is the safer pick. The patterns translate.

Install uv if you don't have it (it's the package manager the official MCP docs use), then set up the project:

  
uv init pantry-mcp
cd pantry-mcp
uv venv
source .venv/bin/activate
uv add "mcp[cli]" httpx pydantic python-jose
  

Folder structure:

  
pantry-mcp/
├── pantry_mcp/
│   ├── __init__.py
│   ├── server.py          # FastMCP server, tools, resources, prompts
│   ├── client.py          # Pantry REST API client
│   ├── models.py          # Pydantic models for inputs and outputs
│   └── auth.py            # AuthKit token verification
├── tests/
└── pyproject.toml
  

Three concerns, three files. The server file holds the MCP surface, the client file wraps the REST API, the auth file verifies bearer tokens. Mixing these together gets ugly fast in any server with more than two or three tools.

A minimal working server

Before we touch Pantry, here's the smallest possible server that runs:

  
# pantry_mcp/server.py
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("pantry")

@mcp.tool()
def hello(name: str) -> str:
    """Greet the user by name."""
    return f"Hello, {name}!"

if __name__ == "__main__":
    mcp.run(transport="streamable-http")
  

Run it:

  
uv run python -m pantry_mcp.server
  

The server is now listening at http://localhost:8000/mcp. The decorator alone is enough; FastMCP reads the type hints and the docstring, builds a JSON Schema, and registers hello as a tool. The docstring becomes the description the model sees when deciding whether to call this tool. Hold onto that fact, because it's load-bearing.
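To make "the function signature is the schema" concrete, here's a hand-rolled sketch of the derivation. FastMCP itself does this via Pydantic and handles far more (optionals, nested models, docstring parsing); this stdlib-only version just shows the idea.

```python
# Simplified illustration of deriving a JSON Schema from type hints.
# FastMCP's real implementation uses Pydantic; this is not its actual code.
import inspect
from typing import get_type_hints

def sketch_schema(fn) -> dict:
    """Build a minimal JSON Schema from a function's signature."""
    type_map = {str: "string", int: "integer", float: "number", bool: "boolean"}
    hints = get_type_hints(fn)
    sig = inspect.signature(fn)
    properties, required = {}, []
    for name, param in sig.parameters.items():
        properties[name] = {"type": type_map.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default means the model must supply it
    return {"type": "object", "properties": properties, "required": required}

def hello(name: str) -> str:
    """Greet the user by name."""
    return f"Hello, {name}!"

print(sketch_schema(hello))
# {'type': 'object', 'properties': {'name': {'type': 'string'}}, 'required': ['name']}
```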

Why streamable HTTP, and when to use stdio

FastMCP supports three transports:

  • stdio runs your server as a subprocess of the client and pipes JSON-RPC over stdin and stdout. Use stdio when the server runs locally, alongside the client, with no network in between. Claude Desktop's local config uses stdio. It's simple and has no auth story to worry about because the client launches the process directly.
  • sse (Server-Sent Events) is deprecated; ignore it.
  • streamable-http is the current standard, runs your server as a normal HTTP service, and is what you want for any remote deployment. Use streamable-http when the server is, or might become, remote. It works with any HTTP infrastructure (load balancers, gateways, TLS termination), it's the only transport with a real auth model, and it's the one the rest of this post uses.

For development, you can also point Claude Desktop at a local http://localhost:8000/mcp URL, so streamable-http doesn't lock you out of local testing.

Wrapping the Pantry REST API

We'll keep the API client separate from the MCP server. Single responsibility, easier testing, easier swapping out for mocks.

  
# pantry_mcp/client.py
import os
import httpx
from typing import Any

class PantryClient:
    def __init__(self, base_url: str | None = None, token: str | None = None):
        self.base_url = base_url or os.environ["PANTRY_API_URL"]
        self.token = token or os.environ.get("PANTRY_API_TOKEN")
        self._client = httpx.AsyncClient(
            base_url=self.base_url,
            headers={"Authorization": f"Bearer {self.token}"} if self.token else {},
            timeout=10.0,
        )

    async def get(self, path: str, **params) -> Any:
        r = await self._client.get(path, params=params)
        r.raise_for_status()
        return r.json()

    async def post(self, path: str, json: dict) -> Any:
        r = await self._client.post(path, json=json)
        r.raise_for_status()
        return r.json()

    async def patch(self, path: str, json: dict) -> Any:
        r = await self._client.patch(path, json=json)
        r.raise_for_status()
        return r.json()

    async def delete(self, path: str) -> None:
        r = await self._client.delete(path)
        r.raise_for_status()

    async def aclose(self):
        await self._client.aclose()
  

Then in server.py:

  
from pantry_mcp.client import PantryClient

pantry = PantryClient()
  

Production code would handle auth and retries more carefully, but this is enough to make the rest of the article concrete.
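As one sketch of what "more carefully" might look like, here's a small retry helper for transient failures. The `retrying` name and its parameters are illustrative, not part of the MCP SDK or httpx.

```python
# A hedged sketch: retry an async call with exponential backoff.
# Real code would retry only on transient errors (timeouts, 5xx), not everything.
import asyncio

async def retrying(coro_fn, *, attempts: int = 3, base_delay: float = 0.5):
    """Call an async factory, retrying with exponential backoff on exceptions."""
    for attempt in range(attempts):
        try:
            return await coro_fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries; surface the original error
            await asyncio.sleep(base_delay * 2 ** attempt)

# Usage: results = await retrying(lambda: pantry.get("/recipes", q="curry"))
```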

Tool 1: search_recipes

Now the first real tool. This is the simplest case in the design: one MCP tool roughly maps to one REST endpoint.

  
@mcp.tool()
async def search_recipes(
    query: str,
    max_minutes: int | None = None,
    dietary: list[str] | None = None,
    cuisine: str | None = None,
) -> list[dict]:
    """Search Pantry's recipe catalog by free-text query and optional filters.

    Use this for general recipe discovery when the user has not specified
    that they want something based on what's in their pantry. For pantry-aware
    suggestions, use `whats_for_dinner` instead.

    Args:
        query: Free-text search, e.g. "chicken curry" or "quick weeknight pasta".
        max_minutes: Optional cap on total time. Set to 30 for quick meals.
        dietary: Optional list like ["vegetarian", "gluten-free"]. Combined with AND.
        cuisine: Optional cuisine name like "thai" or "italian".

    Returns:
        Up to 10 recipes, each with id, title, total_minutes, and a short summary.
        Use the recipe id with the `recipe://{id}` resource to read full details.
    """
    params: dict = {"q": query}
    if max_minutes is not None:
        params["max_minutes"] = max_minutes
    if dietary:
        params["dietary"] = ",".join(dietary)
    if cuisine:
        params["cuisine"] = cuisine

    results = await pantry.get("/recipes", **params)
    return [
        {
            "id": r["id"],
            "title": r["title"],
            "total_minutes": r["total_minutes"],
            "summary": r["summary"],
        }
        for r in results[:10]
    ]
  

A few things are happening here that are worth pausing on:

  • The function signature is the schema. FastMCP reads query: str, max_minutes: int | None = None, and the rest, and produces a JSON Schema that the model sees. The model knows query is required, max_minutes is optional, dietary is a list of strings. No separate schema definition needed.
  • The docstring is the description. The first line ("Search Pantry's recipe catalog...") is the short description. The body explains when to use this tool, including a pointer to a different tool when the situation calls for it. The Args: section gives per-parameter guidance. This whole block becomes part of the prompt the model sees, every turn. Treat it like a function spec for an unfamiliar collaborator who reads it once.
  • The return value is shaped, not raw. The Pantry API returns whatever it returns, but we trim to the fields the model actually needs and cap to 10 results. Returning the full API response works, but it costs context and risks the model latching onto irrelevant fields. The "Use the recipe id with the recipe://{id} resource" line is a hint that nudges the model toward the resource for full detail.

Writing tool descriptions the model can use

Since we're going to write seven more of these, let's be explicit about the shape.

A good tool docstring has four parts.

  • The short description is one sentence. It explains what the tool does, in a way that distinguishes it from neighbors. "Search recipes" is bad; "Search Pantry's recipe catalog by free-text query and optional filters" is better.
  • The when-to-use guidance explains positively when the tool is the right pick, and negatively when a different tool would be better. The model sees all your tools at once and is choosing among them. Helping it disambiguate is the single highest-leverage thing you can do for tool quality. "Use this for X. For Y, use other_tool instead." is the pattern.
  • The per-parameter notes matter most for parameters that aren't obvious. query: str doesn't need a note. max_minutes: int | None does, because the model needs to know what scale it's in (seconds? minutes? hours?) and what setting it to None means.
  • The return shape mentions the structure of the response. If the response references another primitive (like our recipe://{id} resource), say so explicitly.

If your docstrings are getting too long, consider whether you have too few or too many tools. A docstring that needs four paragraphs to explain when to use the tool usually means the tool does too much; split it. A docstring that's two lines and still ambiguous usually means the tool is too thin; merge it with a neighbor.

Tool 2: save_recipe

The second tool is a clean 1-to-1 with POST /recipes, but it surfaces a wrinkle that didn't show up before. REST endpoints scatter their inputs across path, query, headers, and body. MCP tools have a single flat input schema. You have to flatten.

  
from pydantic import BaseModel, Field

class Ingredient(BaseModel):
    name: str = Field(description="Ingredient name, e.g. 'olive oil'")
    quantity: float = Field(description="Numeric amount, e.g. 2 or 0.5")
    unit: str = Field(description="Unit, e.g. 'tbsp', 'cups', 'g'")

@mcp.tool()
async def save_recipe(
    title: str,
    ingredients: list[Ingredient],
    steps: list[str],
    total_minutes: int,
    cuisine: str | None = None,
    dietary: list[str] | None = None,
) -> dict:
    """Save a new recipe to Pantry.

    Use this when the user has dictated, pasted, or otherwise authored a recipe
    they want to keep. The `recipe_from_text` prompt invokes this tool after
    parsing free-form text into structured fields.

    Args:
        title: Short, distinctive name. Avoid leading articles ("Pasta" not "The Pasta").
        ingredients: One entry per ingredient, each with name, quantity, and unit.
        steps: Ordered list of instructions, one per step.
        total_minutes: Total time including prep and cook.
        cuisine: Optional cuisine label like "thai" or "italian".
        dietary: Optional tags like ["vegetarian", "gluten-free"].

    Returns:
        The saved recipe's id and a short confirmation summary.
    """
    body = {
        "title": title,
        "ingredients": [i.model_dump() for i in ingredients],
        "steps": steps,
        "total_minutes": total_minutes,
    }
    if cuisine:
        body["cuisine"] = cuisine
    if dietary:
        body["dietary"] = dietary

    result = await pantry.post("/recipes", json=body)
    return {
        "recipe_id": result["id"],
        "summary": f"Saved '{title}' with {len(ingredients)} ingredients and {len(steps)} steps. Recipe ID {result['id']}.",
    }
  

Three things to notice:

  • The Ingredient Pydantic model gives us a nested structure with proper field descriptions, while keeping the top-level tool signature flat. The model sees five top-level parameters, and ingredients is an array of objects with three named fields. Pydantic's Field(description=...) propagates into the JSON Schema, so each ingredient field gets explained.
  • Field descriptions matter. The model uses them to decide what to put in each field. Generic names get generic values; descriptive names get the right values. "Ingredient name, e.g. 'olive oil'" is much better than just "name".
  • The return is enriched. We send back the new recipe's ID, but we also send back a short, English-y summary the model can quote to the user. Without the summary, the model gets {"recipe_id": 472} and has to remember on its own that the user just saved a chicken curry. With the summary, the next turn writes itself.

A note on naming collisions

If your REST endpoint takes both a path id and a query id, you can't put both at the top level. Pick one to rename in the tool. We don't have that case for save_recipe, but the pattern is worth knowing: rename the query parameter first, since path parameters tend to be more semantically central. Keeping id for the path parameter and renaming the query one to query_id reads better than path_id sitting next to id.

If your OpenAPI spec uses $ref heavily, you'll need to inline those references when building tool schemas. FastMCP's Pydantic-based approach sidesteps this if you write the models yourself; if you're auto-generating from OpenAPI, your generator should handle inlining. Tool schemas are self-contained and $ref is not safe to assume across the wire.
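If you do need to inline references yourself, the core of it is a small recursive walk. This sketch assumes only internal `#/...` pointers and no cycles; a real generator has to handle both.

```python
# A minimal sketch of inlining local $ref pointers from an OpenAPI-style spec.
# Assumes internal "#/..." references only, with no circular definitions.
def inline_refs(schema, root):
    """Recursively replace {"$ref": "#/path/to/def"} with the referenced schema."""
    if isinstance(schema, dict):
        if "$ref" in schema:
            target = root
            for part in schema["$ref"].lstrip("#/").split("/"):
                target = target[part]  # walk the pointer segment by segment
            return inline_refs(target, root)
        return {k: inline_refs(v, root) for k, v in schema.items()}
    if isinstance(schema, list):
        return [inline_refs(item, root) for item in schema]
    return schema

spec = {
    "components": {"schemas": {"Ingredient": {
        "type": "object",
        "properties": {"name": {"type": "string"}},
    }}},
}
ref_schema = {"type": "array", "items": {"$ref": "#/components/schemas/Ingredient"}}
flat = inline_refs(ref_schema, spec)
# flat["items"] is now the full Ingredient schema, self-contained for a tool definition
```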

Tool 3: plan_week_meals

The flagship N-to-1 tool. This is the one that turns five REST calls into one tool call, and the one that pays the most rent in the agent surface.

  
from datetime import date, timedelta

@mcp.tool()
async def plan_week_meals(
    start_date: str,
    days: int = 7,
    dietary: list[str] | None = None,
    prefer_pantry: bool = True,
) -> dict:
    """Plan a week of dinners, optionally favoring recipes that use the user's pantry.

    Use this when the user asks to plan meals for an upcoming period. This tool
    handles the full workflow: it reads the user's pantry, reads dietary
    preferences, queries the recipe catalog, creates a meal plan, and adds the
    chosen recipes to it. Returns a friendly summary listing what was planned.

    For one-off "what should I cook tonight" questions, use `whats_for_dinner`
    instead. This tool is for multi-day planning.

    Args:
        start_date: ISO date string, e.g. "2026-05-04". The first day of the plan.
        days: How many days to plan. Default 7. Cap at 14.
        dietary: Override the user's saved dietary preferences. Usually leave as None.
        prefer_pantry: If True (default), weight selection toward recipes whose
            ingredients are mostly already in the user's pantry.

    Returns:
        A dict with `plan_id`, `summary` (English text the agent can quote), and
        `pantry_coverage` (how many pantry ingredients the plan uses).
    """
    if days < 1 or days > 14:
        return {
            "error": "days must be between 1 and 14",
            "hint": "Try days=7 for a standard week.",
        }

    # 1. Read pantry (for both dietary inference and pantry-weighted ranking)
    pantry_items = await pantry.get("/pantry") if prefer_pantry else []
    pantry_names = {p["ingredient"].lower() for p in pantry_items}

    # 2. Read user preferences if dietary not overridden
    if dietary is None:
        prefs = await pantry.get("/users/me/preferences")
        dietary = prefs.get("dietary", [])

    # 3. Query candidate recipes
    candidates = await pantry.get(
        "/recipes",
        dietary=",".join(dietary) if dietary else None,
        max_minutes=60,
        limit=50,
    )

    # 4. Rank by pantry overlap
    def overlap(recipe: dict) -> int:
        return sum(
            1 for ing in recipe.get("ingredient_names", [])
            if ing.lower() in pantry_names
        )
    if prefer_pantry:
        candidates.sort(key=overlap, reverse=True)
    chosen = candidates[:days]

    if len(chosen) < days:
        return {
            "error": f"Only found {len(chosen)} recipes matching the criteria.",
            "hint": "Try relaxing dietary filters or setting prefer_pantry=False.",
        }

    # 5. Create the plan
    plan = await pantry.post("/meal-plans", json={
        "start_date": start_date,
        "days": days,
    })
    plan_id = plan["id"]

    # 6. Add each chosen recipe to its day
    start = date.fromisoformat(start_date)
    for i, recipe in enumerate(chosen):
        day = (start + timedelta(days=i)).isoformat()
        await pantry.post(
            f"/meal-plans/{plan_id}/recipes",
            json={"recipe_id": recipe["id"], "day": day, "meal_slot": "dinner"},
        )

    # 7. Build a friendly summary
    titles = [
        f"{(start + timedelta(days=i)).strftime('%A')} {r['title'].lower()}"
        for i, r in enumerate(chosen)
    ]
    coverage = sum(overlap(r) for r in chosen)

    return {
        "plan_id": plan_id,
        "summary": f"Planned {days} dinners starting {start_date}: " + ", ".join(titles) + f". Plan ID {plan_id}.",
        "pantry_coverage": f"{coverage} pantry ingredients used across the plan.",
    }
  

This tool is doing real work. It reads two endpoints to gather context, queries a third for candidates, applies its own ranking logic, then writes to two more endpoints to persist the plan. From the agent's perspective, all of that is one call.

The error returns are not exceptions. If days is out of range, or there aren't enough matching recipes, we return a structured response with error and hint fields. The model reads that and either adjusts and retries, or relays to the user. No traceback, no 500, no failed tool call. We'll come back to this pattern in the error-handling section.

The summary is English. "Planned 7 dinners starting 2026-05-04: Monday miso salmon, Tuesday tofu stir-fry..." is something the agent can quote almost verbatim. The structured fields are there too, in case downstream tools need them, but the human-readable summary carries the conversation forward.

Tool 4: update_pantry

Worth showing one more, because it illustrates the bulk-operation pattern that replaces three thin CRUD tools with one useful one.

  
from typing import Literal

class PantryChange(BaseModel):
    action: Literal["add", "update", "remove"] = Field(
        description="What to do with this item: add new, update quantity, or remove entirely."
    )
    ingredient: str = Field(description="Ingredient name, e.g. 'eggs'.")
    quantity: float | None = Field(
        default=None,
        description="Quantity for add/update. Omit for remove.",
    )
    unit: str | None = Field(
        default=None,
        description="Unit like 'oz', 'g', 'count'. Required for add.",
    )

@mcp.tool()
async def update_pantry(changes: list[PantryChange]) -> dict:
    """Apply a batch of pantry changes in one call.

    Use this whenever the user describes pantry changes, however many. The user
    can say "I bought eggs and milk and used up the flour", and you should send
    one call with three changes, not three calls.

    Args:
        changes: A list of PantryChange items. Each has an action (add/update/remove),
            an ingredient name, and (for add/update) a quantity and unit.

    Returns:
        A dict with counts and a short English summary.
    """
    added, updated, removed, errors = 0, 0, 0, []

    for change in changes:
        try:
            if change.action == "add":
                await pantry.post("/pantry", json={
                    "ingredient": change.ingredient,
                    "quantity": change.quantity,
                    "unit": change.unit,
                })
                added += 1
            elif change.action == "update":
                # Pantry's PATCH expects an item id; resolve by name first.
                items = await pantry.get("/pantry", ingredient=change.ingredient)
                if not items:
                    errors.append(f"{change.ingredient}: not in pantry, can't update")
                    continue
                await pantry.patch(f"/pantry/{items[0]['id']}", json={"quantity": change.quantity})
                updated += 1
            elif change.action == "remove":
                items = await pantry.get("/pantry", ingredient=change.ingredient)
                if not items:
                    errors.append(f"{change.ingredient}: not in pantry, can't remove")
                    continue
                await pantry.delete(f"/pantry/{items[0]['id']}")
                removed += 1
        except Exception as e:
            errors.append(f"{change.ingredient}: {e}")

    parts = []
    if added: parts.append(f"added {added}")
    if updated: parts.append(f"updated {updated}")
    if removed: parts.append(f"removed {removed}")
    summary = ("Pantry updated: " + ", ".join(parts)) if parts else "No changes applied."
    if errors:
        summary += f" Issues: {'; '.join(errors)}"

    return {
        "added": added,
        "updated": updated,
        "removed": removed,
        "errors": errors,
        "summary": summary,
    }
  

Three notes:

  • The Literal["add", "update", "remove"] type gives the model a closed set of valid actions. Compared to a free-form string that says "the action to take", this dramatically reduces the chance of the model inventing an action like "replace" that doesn't exist.
  • We resolve names to IDs inside the tool. Pantry's PATCH and DELETE want item IDs, but the user (and the model) speaks in ingredient names. Doing the lookup inside the tool means the model doesn't have to chain calls to find IDs first. Fewer round trips, less for the model to track.
  • Partial failures are reported, not raised. If three changes succeed and one fails, we return what worked along with what didn't. The agent can confirm what happened and the user gets a useful answer instead of an opaque tool error.
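Concretely, the utterance from the docstring maps to one arguments payload with three changes. The quantities and units below are invented for illustration; in practice the model fills them in from conversation context or asks.

```python
# What the model would send for "I bought eggs and milk and used up the flour":
# one update_pantry call, three PantryChange items. Quantities are invented.
changes = [
    {"action": "add", "ingredient": "eggs", "quantity": 12, "unit": "count"},
    {"action": "add", "ingredient": "milk", "quantity": 1, "unit": "gallon"},
    {"action": "remove", "ingredient": "flour"},  # remove needs no quantity or unit
]
```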

The other tools, briefly

The remaining four tools follow the same patterns:

  • whats_for_dinner reads the pantry and queries the recipe catalog with a pantry-weighted score. Same shape as plan_week_meals, but for a single meal.
  • add_recipe_to_plan is a clean 1-to-1 wrap around POST /meal-plans/{id}/recipes, taking a recipe id, plan id, day, and meal slot.
  • get_shopping_list calls GET /shopping-lists/{plan_id}, then optionally filters out items already in the user's pantry. Maps two endpoints' worth of work to one tool with an exclude_pantry flag.
  • suggest_substitutes calls GET /ingredients/{id} for the substitution list, with optional recipe context to weight the suggestions. Useful when the user is mid-recipe and out of an ingredient.

Each of these is twenty to forty lines, and the patterns above (flat parameters, structured returns, English summaries, error responses with hints) repeat. Beyond the third or fourth tool, the writing is mostly mechanical.

Resources

Resources are the read side of the server. They are addressable by URI and the model fetches them as needed, the same way a human might pull up a reference page.

  
@mcp.resource("recipe://{recipe_id}")
async def get_recipe(recipe_id: str) -> str:
    """Full detail for a single recipe: ingredients, steps, time, dietary tags."""
    recipe = await pantry.get(f"/recipes/{recipe_id}")
    lines = [
        f"# {recipe['title']}",
        f"Total time: {recipe['total_minutes']} minutes",
        f"Cuisine: {recipe.get('cuisine', 'unspecified')}",
        f"Dietary: {', '.join(recipe.get('dietary', [])) or 'none'}",
        "",
        "## Ingredients",
    ]
    for ing in recipe["ingredients"]:
        lines.append(f"- {ing['quantity']} {ing['unit']} {ing['name']}")
    lines.append("")
    lines.append("## Steps")
    for i, step in enumerate(recipe["steps"], 1):
        lines.append(f"{i}. {step}")
    return "\n".join(lines)
  

A couple of notes:

  • The decorator takes a URI template. recipe://{recipe_id} is parameterized; the model can fetch recipe://472 and the SDK extracts recipe_id="472" and passes it in.
  • The return is plain text, formatted for reading. We could return JSON, but the model's job is to read this and use it, and Markdown is closer to how the model wants to consume it. Plain text resources travel well across context windows.

The other two resources follow the same shape:

  
@mcp.resource("pantry://current")
async def get_current_pantry() -> str:
    """Current contents of the user's pantry, grouped by category."""
    items = await pantry.get("/pantry")
    # Group by category and format
    ...

@mcp.resource("meal-plan://{plan_id}")
async def get_meal_plan(plan_id: str) -> str:
    """Full meal plan with recipes by day."""
    plan = await pantry.get(f"/meal-plans/{plan_id}")
    ...
  

Resources are cheap. They don't burn a tool call slot, and the model can pull them in selectively. If you find yourself debating whether a piece of read-only data should be a tool or a resource, the answer is almost always resource.

Prompts

Prompts are the most unfamiliar of the three primitives. They aren't called by the model autonomously; they're picked by the user from a menu, like macros. When invoked, they expand into a multi-step set of instructions that primes the agent for a particular workflow.

  
@mcp.prompt()
def weekly_meal_planning(start_date: str | None = None) -> str:
    """Plan a week of meals using the user's pantry and preferences."""
    target = start_date or "next Monday"
    return f"""I'd like to plan dinners for the week starting {target}.

Walk me through this in three steps:

1. Check my current pantry contents using the `pantry://current` resource and
   note what I have a lot of, what's running low, and what's about to expire.

2. Ask me one or two clarifying questions about preferences for the week:
   things I'm in the mood for, anything to avoid, busy nights when I want
   something quick.

3. Use the `plan_week_meals` tool with `prefer_pantry=True` to generate the
   plan, then summarize it day by day, calling out which recipes use my
   pantry items.

Don't generate the plan until after step 2. I want the chance to weigh in.
"""

@mcp.prompt()
def recipe_from_text(text: str) -> str:
    """Parse free-form recipe text and save it to Pantry."""
    return f"""I'm going to give you the text of a recipe. It might be from
a website, a photo I transcribed, or something a friend sent me. The format
will be inconsistent.

Here's the text:

---
{text}
---

Please:

1. Extract the title, ingredients (name, quantity, unit), steps, and total
   time. If the text doesn't include a total time, estimate one and tell me
   so I can correct.

2. Show me what you parsed before saving, in a short readable form.

3. Once I confirm, call the `save_recipe` tool to save it. If I want changes
   first, apply them and re-confirm before saving.
"""
  

Two things to notice:

  • Prompts are templates, not workflows the model follows blindly. The text inside is a structured set of instructions the agent will then act on, with full freedom to use tools, ask questions, and adapt. Think of the prompt as the agent's mission brief for this conversation.
  • The prompt knows about the tools and resources by name. weekly_meal_planning references pantry://current and plan_week_meals. Prompts are the right place to encode "the way we do meal planning around here" without baking that workflow into a single tool with a thousand parameters.

If you find yourself wanting an MCP "workflow" that takes ten arguments and runs five tools in order, that's a prompt, not a tool. Prompts are how you compose primitives without losing the model's ability to react to what comes back.

Error handling that helps the agent recover

We've already used this pattern a few times. It deserves explicit treatment.

In a REST API, errors are exceptional. In an MCP server, they're a normal part of the conversation, because the model will call your tools imperfectly. Names get misspelled, IDs get hallucinated, timestamps get the wrong format. The question is whether your error message helps the model fix it or just stops the conversation.

Three patterns work well.

  • Return errors, don't raise them, when recovery is possible. A 404 from the underlying API is a fact about the world, not an exception. If the user asked for recipe_id 472 and that doesn't exist, return:
  
{
    "error": "recipe 472 not found",
    "hint": "Use search_recipes to find available recipes.",
}
  

The model reads that and calls search_recipes. Conversation continues.

  • Raise (or return errors) with clear hints when no recovery is possible. If the user's auth token is bad, there's no graceful recovery; we want a clear failure with a message the agent can relay. FastMCP will catch raised exceptions and surface them as tool errors with the message attached. Make the message useful:
  
raise ValueError(
    "Pantry API returned 401 Unauthorized. The server's API token may be expired."
)
  
  • Validate at the edge, with informative messages. If a parameter is out of range, validate before calling the API and return a useful error. The days < 1 or days > 14 check in plan_week_meals is an example. The model sees the bound, doesn't have to discover it from a downstream 400.

What you don't want is the agent receiving Error: invalid input or a Python traceback. Those are dead ends. Every error is a chance to either continue or fail clearly; never the third option of failing opaquely.
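The "return errors with hints" pattern is worth factoring into a helper once you're writing it in every tool. A sketch, with hint strings that are examples rather than fixed behavior:

```python
# A sketch of a reusable error-shaping helper for the recoverable-error pattern.
# The hint wording is illustrative; tune it to your own tool surface.
def error_response(status: int, what: str) -> dict:
    """Map an upstream HTTP status to a structured error the model can act on."""
    hints = {
        404: "The id may be wrong. Use search_recipes to find valid recipe ids.",
        429: "Rate limited. Wait a moment and retry.",
        400: "Check the parameter formats against the tool's docstring.",
    }
    return {
        "error": f"{what} failed with HTTP {status}",
        "hint": hints.get(status, "If this persists, report the error to the user."),
    }

# Usage inside a tool (assuming httpx in the client):
#   except httpx.HTTPStatusError as e:
#       return error_response(e.response.status_code, "recipe lookup")
```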

Schema mechanics: Where REST and MCP collide

A few practical notes from building real servers.

  • Authentication headers are not parameters. Your REST API takes a bearer token in Authorization: Bearer .... Your MCP tool absolutely does not take that as a parameter. Auth lives outside the tool surface, in the API client and (for MCP-level auth) in the bearer token verification middleware we'll build in a moment.
  • Don't expose pagination cursors. REST APIs love opaque next_cursor tokens. The model has no reason to see those. Tools either return the most relevant N results capped at a sensible number, or expose pagination behind a clean parameter like page: int = 1. Cursor tokens leak implementation details.
  • Stable IDs are good. Internal IDs are bad. If your REST API has two ID systems (slugs for users to read, UUIDs for foreign keys), expose the human-readable one. The model writes the same things back, so making IDs round-trip cleanly is worth a small cost.
  • Idempotency matters. The model will retry. If your add_recipe_to_plan tool can be called twice without producing two copies of Wednesday's chicken, you'll spend less time debugging tool loops. Either rely on the underlying API's idempotency, or check for duplicates in the tool.
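The duplicate check for add_recipe_to_plan can be as simple as reading the plan before writing. This sketch assumes Pantry's plan-entries response exposes recipe_id, day, and meal_slot fields; adjust to your API's actual shape.

```python
# A hedged sketch of an idempotency guard: skip the write if an identical
# entry already exists. Field names are assumptions about Pantry's response.
def already_planned(entries: list[dict], recipe_id: int, day: str, slot: str) -> bool:
    """True if the plan already contains this exact recipe/day/slot entry."""
    return any(
        e["recipe_id"] == recipe_id and e["day"] == day and e["meal_slot"] == slot
        for e in entries
    )

# Inside the tool, before POSTing:
#   entries = await pantry.get(f"/meal-plans/{plan_id}/recipes")
#   if already_planned(entries, recipe_id, day, "dinner"):
#       return {"summary": "That recipe is already on the plan for that day."}
```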

Running it locally with the MCP Inspector

Once your server runs, the next step is the MCP Inspector. It's a visual testing tool that connects to any MCP server and lets you exercise tools, browse resources, and view the protocol traffic.

Run your server in one terminal:

  
uv run python -m pantry_mcp.server
  

In another:

  
npx @modelcontextprotocol/inspector
  

The Inspector opens in your browser. Connect to http://localhost:8000/mcp (transport: streamable HTTP). You'll see a list of every tool, resource, and prompt FastMCP has registered, and you can call any of them with arbitrary arguments and see the raw response.

Use the Inspector to verify three things before connecting to a real client:

  • Every tool description reads sensibly out of context (no broken cross-references, no jargon the model wouldn't know).
  • Every tool returns something useful for both happy-path and error inputs.
  • Every resource template you've defined actually fetches.

The Inspector is also where you'll catch dumb stuff: tools that return raw JSON when you meant to summarize, resources that throw on missing IDs, prompts that reference primitives you haven't actually registered.

Connecting to Claude Desktop

For local development against a desktop client, Claude's config file is the simplest path. On macOS:

  
// ~/Library/Application Support/Claude/claude_desktop_config.json
{
  "mcpServers": {
    "pantry": {
      "command": "uv",
      "args": ["--directory", "/path/to/pantry-mcp", "run", "python", "-m", "pantry_mcp.server"],
      "env": {
        "PANTRY_API_URL": "https://api.pantry.example.com",
        "PANTRY_API_TOKEN": "your-pantry-api-token"
      }
    }
  }
}
  

Restart the desktop app. Your tools, resources, and prompts now appear in the conversation.

This is fine for local testing, but not how you'd ship to users. For that, you want a remote server with proper authentication, which is the next section.

Authentication with AuthKit

Once your MCP server is exposed over the network, you need authentication. The MCP spec is clear about this: remote servers are OAuth 2.1 protected resources, responsible for validating access tokens and advertising their authorization server.

You can implement all of this yourself, but we don't recommend it. OAuth 2.1, dynamic client registration, JWKS rotation, CIMD endpoints, the full discovery dance: it's all spec-correct, all evolving, and all unrelated to your actual product. AuthKit is the WorkOS product that handles this for you. Your server stays focused on tools and resources and just verifies the tokens AuthKit issues.

If you want to skip most of the manual setup that follows, the WorkOS CLI's AI Installer (npx workos@latest install) handles dashboard configuration, API key setup, and SDK wiring in a few minutes.

Here's what your server needs to add.

How the flow works

Before any code, here's what an authenticated MCP request actually looks like:

  1. The MCP client (Claude Desktop, an agent framework, etc.) makes a request to your server with no token.
  2. Your server returns 401 with a WWW-Authenticate header pointing at your /.well-known/oauth-protected-resource endpoint.
  3. The client fetches that metadata, sees AuthKit listed as the authorization server, and kicks off an OAuth flow against AuthKit.
  4. AuthKit handles the login UI, the consent screen, and the token issuance.
  5. The client retries the original request, this time with the issued bearer token in the Authorization header.
  6. Your server verifies the token against AuthKit's JWKS endpoint and proceeds.

Your server only owns steps 2 and 6. Everything else is AuthKit's responsibility. Concretely, that means three pieces of code: a metadata endpoint, a token verifier, and a middleware that ties them together.

Dashboard setup

Before you can issue tokens, configure AuthKit. Three steps in the WorkOS dashboard:

  1. Create an OAuth application. Dashboard → Applications → Create application. Name it after your MCP server. Save the resulting client ID; you'll need it shortly.
  2. Enable Client ID Metadata Document (CIMD). Dashboard → Applications → Configuration → Manage under Client ID Metadata Document. CIMD is how MCP clients identify themselves to your authorization server: rather than pre-registering with you, the client publishes a metadata document at a URL and identifies itself by that URL. CIMD is the spec-recommended approach and what you should default to.
  3. Note your AuthKit domain. It looks like your-app.authkit.app and is the value you'll set as AUTHKIT_DOMAIN in your environment.

A heads-up about Dynamic Client Registration. Before CIMD landed in the MCP spec in November 2025, the standard way for clients to identify themselves was Dynamic Client Registration (DCR), where each client registers with the authorization server and receives credentials. Some older MCP clients still expect DCR and don't yet support CIMD. If you need to support those clients, you can enable DCR alongside CIMD on the same Configuration page; AuthKit supports both simultaneously. For the deeper tradeoffs, see MCP client registration: CIMD vs DCR.

Now the code.

Token verification

Verify bearer tokens on every request. AuthKit signs JWTs with keys exposed at a JWKS endpoint, and verification is a matter of checking the signature, the issuer, and the audience.

We'll use pyjwt's PyJWKClient, which handles fetching the JWKS, caching it, and refreshing when AuthKit rotates its signing keys. Doing this by hand with lru_cache and a manual httpx fetch is a recipe for tokens silently breaking the day after a key rotation, so let the library handle it.

  
# pantry_mcp/auth.py
import os
import jwt
from jwt import PyJWKClient
from jwt.exceptions import InvalidTokenError

AUTHKIT_DOMAIN = os.environ["AUTHKIT_DOMAIN"]  # e.g. "your-app.authkit.app"
ISSUER = f"https://{AUTHKIT_DOMAIN}"
JWKS_URL = f"{ISSUER}/oauth2/jwks"

# PyJWKClient caches keys and refreshes them automatically on rotation.
_jwks_client = PyJWKClient(JWKS_URL, cache_keys=True, lifespan=300)

def verify_token(token: str) -> dict:
    """Verify an AuthKit-issued JWT. Returns the claims on success.

    Raises ValueError if the token is invalid, expired, or wrong issuer.
    """
    try:
        signing_key = _jwks_client.get_signing_key_from_jwt(token).key
        return jwt.decode(
            token,
            signing_key,
            algorithms=["RS256"],
            issuer=ISSUER,
            options={"verify_aud": False},  # set audience explicitly if you use it
        )
    except InvalidTokenError as e:
        raise ValueError(f"Invalid bearer token: {e}") from e
  

The audience claim is your AuthKit client ID. The snippet above disables audience verification for simplicity; in production, drop the verify_aud override and pass audience=<your client ID> to jwt.decode so tokens issued for some other application can't be replayed against your MCP server. In local development it's still good practice.
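To see that replay protection concretely, here's a toy demonstration. It uses a symmetric HS256 key purely for brevity (AuthKit tokens are RS256 and verified against the JWKS as above); the secret and issuer are made-up demo values.

```python
import jwt  # pyjwt, the same library used above

SECRET = "demo-secret"       # toy symmetric key, illustration only
DEMO_ISSUER = "https://demo"

def audience_ok(token: str, expected_aud: str) -> bool:
    """Return True only if the token was minted for the expected audience."""
    try:
        jwt.decode(token, SECRET, algorithms=["HS256"],
                   issuer=DEMO_ISSUER, audience=expected_aud)
        return True
    except jwt.InvalidAudienceError:
        return False

# A token minted for a *different* application:
token = jwt.encode({"iss": DEMO_ISSUER, "aud": "other-app"},
                   SECRET, algorithm="HS256")
```

With the audience check on, `audience_ok(token, "my-client-id")` is False even though the signature and issuer are valid; without it, the token would sail through.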

Wiring auth into FastMCP

FastMCP supports custom HTTP middleware on its streamable HTTP transport. Add a middleware that pulls the token, verifies it, and on failure returns the 401 response with the WWW-Authenticate header that lets clients discover the authorization server:

  
# pantry_mcp/server.py
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.responses import JSONResponse
from pantry_mcp.auth import verify_token

WWW_AUTHENTICATE = (
    'Bearer error="invalid_token", '
    'error_description="Missing or invalid bearer token", '
    'resource_metadata="https://mcp.example.com/.well-known/oauth-protected-resource"'
)

class AuthMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request, call_next):
        # Skip auth for the metadata endpoint itself
        if request.url.path.startswith("/.well-known/"):
            return await call_next(request)

        auth = request.headers.get("authorization", "")
        if not auth.startswith("Bearer "):
            return JSONResponse(
                {"error": "Missing bearer token"},
                status_code=401,
                headers={"WWW-Authenticate": WWW_AUTHENTICATE},
            )
        try:
            claims = verify_token(auth[7:])
            request.state.user_id = claims["sub"]
        except ValueError as e:
            return JSONResponse(
                {"error": str(e)},
                status_code=401,
                headers={"WWW-Authenticate": WWW_AUTHENTICATE},
            )

        return await call_next(request)

# Attach the middleware to the FastMCP app
mcp_app = mcp.streamable_http_app()
mcp_app.add_middleware(AuthMiddleware)
  

The WWW-Authenticate header is the discovery hook. When an MCP client gets a 401 with that header, it follows the resource_metadata URL, finds AuthKit listed as the authorization server, and starts the OAuth flow. The user signs in once, the client gets a token, and subsequent requests carry it.

The metadata endpoint

Add the /.well-known/oauth-protected-resource endpoint that the WWW-Authenticate header points to:

  
from starlette.routing import Route
from starlette.responses import JSONResponse

async def protected_resource_metadata(request):
    return JSONResponse({
        "resource": "https://mcp.example.com",
        "authorization_servers": [f"https://{AUTHKIT_DOMAIN}"],
        "bearer_methods_supported": ["header"],
    })

mcp_app.routes.append(
    Route("/.well-known/oauth-protected-resource", protected_resource_metadata)
)
  

That's the whole auth surface. Verify tokens, advertise the authorization server, let AuthKit handle everything else.

Compatibility with older clients

Some MCP clients don't yet support Protected Resource Metadata and instead expect to find OAuth Authorization Server Metadata directly on your server at /.well-known/oauth-authorization-server. To stay compatible, your server can proxy that endpoint to AuthKit's:

  
import httpx

async def authorization_server_metadata(request):
    async with httpx.AsyncClient() as client:
        r = await client.get(f"https://{AUTHKIT_DOMAIN}/.well-known/oauth-authorization-server")
    return JSONResponse(r.json())

mcp_app.routes.append(
    Route("/.well-known/oauth-authorization-server", authorization_server_metadata)
)
  

Newer clients use the protected-resource path; older ones hit the authorization-server path; both work. Cache the upstream response in production rather than fetching it on every probe.
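One lightweight way to add that cache is a small TTL wrapper around the fetch. This is a sketch, not a library recommendation; any TTL cache works, and the class name and interface here are invented for illustration.

```python
import time

class TTLCache:
    """Cache a single value, re-fetching after ttl seconds (sketch only)."""

    def __init__(self, ttl: float):
        self.ttl = ttl
        self._value = None
        self._fetched_at = 0.0

    def get(self, fetch):
        """Return the cached value, calling fetch() only when stale."""
        now = time.monotonic()
        if self._value is None or now - self._fetched_at > self.ttl:
            self._value = fetch()
            self._fetched_at = now
        return self._value
```

In the proxy handler above, you'd wrap the httpx fetch in `cache.get(...)` so cold starts still work but repeated client probes don't each hit AuthKit.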

Standalone: Bring your own users

What we've shown is the standard AuthKit integration: AuthKit hosts the sign-in flow, users authenticate against AuthKit, your server verifies the resulting tokens. If you don't already have user accounts elsewhere, this is the fastest path.

If your app already has its own authentication (existing user accounts, custom login UI, anything you don't want to migrate), use Standalone Connect. AuthKit becomes the OAuth authorization server but redirects users to your application's Login URI for the actual login. Your app authenticates the user however it normally does, then calls AuthKit's completion API to finish the OAuth flow.

Configuration changes:

  • In Dashboard → Applications → Configuration, set your Login URI to a route on your app that accepts an external_auth_id query parameter (https://your-app.example.com/auth/login). This is where AuthKit will send users to log in.
  • Your Login URI handler authenticates the user (using whatever you already have), then calls AuthKit's completion API:
  
import httpx

async def complete_authkit_login(external_auth_id: str, user: dict) -> str:
    """Tell AuthKit that authentication succeeded; return the redirect URI."""
    async with httpx.AsyncClient() as client:
        r = await client.post(
            "https://api.workos.com/authkit/oauth2/complete",
            headers={"Authorization": f"Bearer {os.environ['WORKOS_API_KEY']}"},
            json={
                "external_auth_id": external_auth_id,
                "user": {
                    "id": user["id"],
                    "email": user["email"],
                    "first_name": user.get("first_name"),
                    "last_name": user.get("last_name"),
                },
            },
        )
        r.raise_for_status()
    return r.json()["redirect_uri"]
  

After your handler authenticates the user, call this with the external_auth_id from the query parameter and your user's data, then redirect the browser to the returned redirect_uri. AuthKit takes over from there: consent screen, token issuance, redirect back to the MCP client.

Token verification on the MCP server side is identical to the standard integration. The only difference is who runs the login screen.

The full integration guide is at workos.com/docs/authkit/mcp. Sign up at workos.com to get started.

Testing strategies

Three layers, in order of effort.

  • Unit tests for tool functions. Mock the PantryClient and call your tool functions like normal Python. This catches the easy bugs (wrong field names, missing optional parameters, off-by-one in date math) without spinning up the protocol stack.
  • Integration tests with the MCP Inspector. Connect the Inspector to a server running against a test API, walk through every tool and resource, and confirm the responses are useful. The Inspector is your fastest feedback loop for "does this read well" questions.
  • Conversation tests with a real model. This is where the design assumptions get tested. Connect Claude Desktop or another client and try the user requests from the design doc: "Plan me a week of dinners," "What can I make tonight," "I bought milk and eggs." If the model picks the right tool every time, your descriptions are good. If it tries the same wrong tool twice, your descriptions need work.

The third layer is the one most teams skip. Don't.
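The first layer needs no protocol machinery at all. Here's a hedged sketch; the stand-in tool function is defined inline so the example is self-contained, where your real test would import it from the server module and mock the shared PantryClient.

```python
import asyncio
from unittest.mock import AsyncMock

# Stand-in for a real tool function; yours would live in pantry_mcp
# and take the shared API client. The method name is assumed.
async def search_recipes(client, query: str) -> dict:
    results = await client.search_recipes(query=query)
    return {"results": results,
            "summary": f"{len(results)} recipes matched '{query}'."}

def test_empty_search_returns_structured_response():
    # AsyncMock stands in for PantryClient, so no HTTP stack is involved.
    client = AsyncMock()
    client.search_recipes.return_value = []
    out = asyncio.run(search_recipes(client, "durian ice cream"))
    assert out["results"] == []
    assert "0 recipes" in out["summary"]

test_empty_search_returns_structured_response()
```

Tests at this layer run in milliseconds, which is what makes them worth writing for every tool: wrong field names and date-math bugs surface before you ever open the Inspector.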

Troubleshooting

Symptoms and what they usually mean:

  • The model never calls a particular tool. The description probably overlaps with another tool's. Check that each tool's "when to use" guidance distinguishes it clearly.
  • The model passes the wrong parameter type. The schema isn't tight enough. Replace str with Literal[...] for closed sets, narrow int to bounded ranges in the description, or use a Pydantic model with Field(description=...).
  • The model loops on the same tool. It's probably retrying after an error it doesn't understand. Check that your error returns include actionable hints, and that retries don't immediately fail the same way.
  • A tool returns nothing or an empty list and the model gets confused. Empty results are a common case; explicitly return a structured empty response with a summary field that says so. {"results": [], "summary": "No recipes matched."} is much easier to handle than [].
  • Streamable HTTP hangs intermittently behind a proxy. Some proxies buffer, which breaks streaming. Disable buffering for the MCP path. For nginx, that's proxy_buffering off; in the location block. For AWS ALB, set the idle timeout to at least 120 seconds.
  • The Inspector connects but tools don't appear. The server probably crashed during initialization without surfacing the error. Check the server logs for a stack trace; FastMCP doesn't always propagate startup errors over the wire.

Wrapping up

You now have an MCP server that exposes the Pantry API to LLM agents through a well-shaped surface: eight tools that map to user intents (not just endpoints), three resources that read cheaply, two prompts that encode standard workflows. Authentication delegates to AuthKit so you don't have to build OAuth from scratch. Errors are recoverable, descriptions are written for the model, and the Inspector is wired up for fast iteration.

The patterns generalize. The Pantry-specific code is maybe two hundred lines; the rest is structure that ports to any REST API. Take an inventory of your endpoints, decide what each one is really for, and write tools the same way: flat parameters, structured returns, English summaries, hints in errors, and one tool per user intent rather than one tool per endpoint.

For the design rationale behind those choices, including how to decide which endpoints to combine, split, or skip, Designing an MCP server from a REST API is the companion piece.
