Context and motivation
Interfaces have been steadily moving from structured actions to natural conversations. Command lines gave us precision but kept most users out. GUIs made things visual, yet still forced users to learn paths and forms. REST brought modularity and clarity for machines. Conversation brings intention to the center: the user describes what they want, not how to do it.
The friction points are familiar: users guess the right words or the right clicks; too many options, not enough guidance; and forms that don’t adapt to the way people ask. A conversational layer changes the game by letting the system do the translation from intent to action.
Here’s a quick story. A product manager wants “the cheapest 95-octane station nearby, open now.” In a dashboard, they would set filters, pick a radius, sort by price, and still double-check schedules. In a conversation, they just ask. Your system—through MCP—translates that intent into a plan: find stations near coordinates, fetch prices, filter by opening hours, rank by price, return a clear answer.
From REST to reasoning
REST taught us to design services as resources with predictable operations. That mental model—input → logic → output—made integration reliable. Reasoning builds on that: instead of asking clients to orchestrate every step, we allow an LLM to plan and call the right capabilities, given the user’s intent and context.
Think of it as elevating from “which endpoint do I call?” to “what outcome do I need?”, without abandoning the rigor of your contracts.
MCP in practice
The Model Context Protocol (MCP) is the thin conversational bridge. You expose “tools” (functions with typed inputs/outputs). The LLM decides when and how to call them based on the ongoing conversation. There are SDKs in multiple languages so you can adopt MCP in your current stack, and community momentum is pushing toward standard transports (including HTTP) and broader vendor support.
Why a conversational layer on top of your API?
Traditional interfaces assume users know the right path—what to click, which form to fill, which field to filter. Conversations flip that: users express intent, and your system does the legwork. MCP is the bridge that lets LLMs call your capabilities safely and with structure.
What we’re building
We’ll expose a handful of tools over the public precioil.es API to answer fuel price questions in Spain. The full, working code is in the demo repo: fuel-mcp.
Prerequisites
- Python 3.10+
- Either uv or pip
- Basic understanding of REST and Python
Before we dive in, make sure your environment is predictable. Reproducibility is your friend when debugging conversational behavior.
- Recommended: uv for fast, hermetic runs; otherwise, a clean virtualenv + pip works well.
- OS support: macOS/Linux out of the box; on Windows, prefer WSL2 for parity with production Linux environments.
- Networking: if you expose HTTP, ensure your firewall and port (e.g., 8001) are open locally.
- Optional but handy: a tool to capture logs to a file (e.g., redirect stdout/stderr) for postmortems on failed sessions.
Step 1 — Map your API to “tools” (contracts first)
Start with intent, not endpoints. Imagine the questions users want to ask and crystallize them into tools with clear inputs and predictable outputs. The more explicit your contracts, the more reliable the LLM’s behavior.
- get stations around coordinates and radius
- get details for a station
- get national average prices over time
- get price history for a station
Design notes:
- Prefer small, composable tools over monoliths. It helps the LLM chain steps.
- Validate input ranges and formats up front. Fail fast, fail clear.
- Name parameters semantically (e.g., latitud, longitud, radio) to reduce ambiguity.
Think of a tool as an “endpoint with intent.” You’re not removing structure—you’re making the structure legible to language.
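If it helps to pin the contracts down before writing any server code, you can sketch the four intents above as plain typed signatures first. This is a contract sketch only; the names and parameters mirror the tools we build in the next steps:

# Contract sketch only: names, inputs, and output types, no implementation yet.
def get_stations_by_radius(latitud: float, longitud: float, radio: int,
                           pagina: int = 1, limite: int = 15) -> dict: ...

def get_station_details(station_id: str) -> dict: ...

def get_daily_average_price(fuel_type: str, start_date: str, end_date: str) -> dict: ...

def get_station_prices_history(station_id: str, start_date: str, end_date: str) -> dict: ...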
Step 2 — Scaffold an MCP server
We’ll use fastmcp to define tools and serve them over MCP. Keep a thin, dependable core: timeouts, error surfacing, and lightweight logging. Observability matters because conversational flows are dynamic—you want to see what was called and why.
Design choices and trade-offs:
- Keep HTTP clients short-lived with sensible timeouts; avoid hidden global state.
- Log request paths and status codes, not raw payloads containing PII.
- Prefer simple data shapes; the LLM doesn’t need every field—only what’s useful to answer.
from fastmcp import FastMCP, Context
import httpx

mcp = FastMCP("PrecioilAPIServer")
BASE_URL = "https://api.precioil.es"

def _make_request(ctx: Context, endpoint: str, params: dict | None = None):
    """GET an endpoint of the precioil.es API and return the parsed JSON body."""
    # Short-lived client with an explicit timeout; no hidden global state.
    with httpx.Client(base_url=BASE_URL, timeout=20.0) as client:
        response = client.get(endpoint, params=params)
        response.raise_for_status()
        data = response.json()
        # Log only the path and status code, never raw payloads.
        ctx.log(f"GET {endpoint} -> {response.status_code}")
        return data
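If you want to act on the last design choice above (return only what’s useful), a small trimming step before returning data is enough. A minimal sketch; the field names here are hypothetical, not the real precioil.es schema:

def _slim_station(raw: dict) -> dict:
    # Keep only the fields an answer typically needs.
    # "id", "nombre", "direccion" and "precios" are illustrative keys,
    # not guaranteed to match the real precioil.es response.
    keep = ("id", "nombre", "direccion", "precios")
    return {key: raw[key] for key in keep if key in raw}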
Step 3 — Add your first tool: stations by radius
Why this matters: most user journeys start broad (“near me”, “in this city”). This tool lets the LLM discover candidates before it narrows down the best option.
@mcp.tool()
def get_stations_by_radius(ctx: Context, latitud: float, longitud: float, radio: int, pagina: int = 1, limite: int = 15):
    """Find gas stations near coordinates within a radius (km)."""
    params = httpx.QueryParams({
        "latitud": latitud,
        "longitud": longitud,
        "radio": radio,
        "pagina": pagina,
        "limite": limite,
    })
    return _make_request(ctx, "/estaciones/radio", params)
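To apply the fail-fast advice from the design notes, a minimal sketch is a validation helper called at the top of the tool, before building the query. The bounds here are illustrative, not mandated by the API:

def _validate_radius_query(latitud: float, longitud: float, radio: int, limite: int) -> None:
    # Fail fast, fail clear. The bounds below are illustrative defaults.
    if not -90 <= latitud <= 90:
        raise ValueError(f"latitud out of range: {latitud}")
    if not -180 <= longitud <= 180:
        raise ValueError(f"longitud out of range: {longitud}")
    if not 1 <= radio <= 50:
        raise ValueError(f"radio must be between 1 and 50 km, got {radio}")
    if not 1 <= limite <= 100:
        raise ValueError(f"limite must be between 1 and 100, got {limite}")

Call it as the first line of get_stations_by_radius so the LLM gets a clear error instead of an empty or oversized result.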
Step 4 — Add more tools (details and analytics)
Now we enable depth. After discovery, the assistant needs to compare and explain. Details, averages, and history unlock richer answers: cheapest now, stable over time, or best value within constraints.
Practical tips:
- Normalize units, currencies, and date formats at the tool boundary.
- Keep historical ranges bounded (e.g., 30–90 days) to avoid heavy payloads.
- Return both raw values and a short human-friendly summary when it helps.
@mcp.tool()
def get_station_details(ctx: Context, station_id: str):
    """Get full details for a single station."""
    return _make_request(ctx, f"/estaciones/{station_id}")

@mcp.tool()
def get_daily_average_price(ctx: Context, fuel_type: str, start_date: str, end_date: str):
    """Get the national daily average price for a fuel type between two dates."""
    params = httpx.QueryParams({
        "tipo": fuel_type,
        "inicio": start_date,
        "fin": end_date,
    })
    return _make_request(ctx, "/precios/media-diaria", params)

@mcp.tool()
def get_station_prices_history(ctx: Context, station_id: str, start_date: str, end_date: str):
    """Get the price history for a station between two dates."""
    params = httpx.QueryParams({
        "inicio": start_date,
        "fin": end_date,
    })
    return _make_request(ctx, f"/estaciones/{station_id}/historial", params)
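To keep historical ranges bounded, as suggested in the practical tips above, a small guard at the tool boundary works; the 90-day cap is illustrative:

from datetime import date, timedelta

MAX_HISTORY_DAYS = 90  # illustrative cap; tune to your payload budget

def _validate_date_range(start_date: str, end_date: str) -> None:
    # Expect ISO dates (YYYY-MM-DD); reject reversed or oversized ranges.
    start = date.fromisoformat(start_date)
    end = date.fromisoformat(end_date)
    if end < start:
        raise ValueError("end_date must not be before start_date")
    if end - start > timedelta(days=MAX_HISTORY_DAYS):
        raise ValueError(f"date range limited to {MAX_HISTORY_DAYS} days")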
Finish the module with a simple runner:
if __name__ == "__main__":
    mcp.run()
Step 5 — Run the server
You can run over STDIO for local clients or expose HTTP for networked use. Start simple, then harden.
Using uv (auto-manages the environment):
uv run precioil_mcp_server.py
Or with pip:
pip install -r requirements.txt
python precioil_mcp_server.py
Serve over HTTP if you want to expose it on a network:
uv run precioil_mcp_server.py -- --transport http --host 0.0.0.0 --port 8001
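The command above assumes the script parses a few flags. One option, replacing the simple runner from Step 4, is a small argparse block. This sketch assumes your FastMCP version’s run() accepts transport, host, and port keyword arguments, so check the docs for your release:

import argparse

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--transport", default="stdio", choices=["stdio", "http"])
    parser.add_argument("--host", default="127.0.0.1")
    parser.add_argument("--port", type=int, default=8001)
    args = parser.parse_args()

    if args.transport == "http":
        # Assumes FastMCP's run() accepts these keyword arguments.
        mcp.run(transport="http", host=args.host, port=args.port)
    else:
        mcp.run()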
Troubleshooting:
- Address already in use: change --port or stop the conflicting process.
- SSL or proxies: start with plain HTTP on localhost; add TLS/ingress later.
- Timeouts: increase client timeouts slightly if you see spurious failures, but investigate upstream latency first.
- Slow cold starts: warm the process (one health call) before load or demos.
Step 6 — Connect from an MCP client
Install and register the server with a compatible client. From there, prompt as usual—the client will decide when to invoke your tools.
If your client supports fastmcp installation:
fastmcp install precioil_mcp_server.py
Then ask your LLM:
- “Find stations within 5 km of Plaça de Catalunya for 95-octane.”
- “Show the last 30 days of diesel prices at station 1234.”
Behind the scenes, the LLM chains tool calls: discover nearby stations, fetch details, rank by price, and return the best option with context. You keep control through contracts; the assistant handles orchestration.
Verify the integration:
- Your client should list tools with names like: get_stations_by_radius, get_station_details, get_daily_average_price, get_station_prices_history.
- Try a constrained query first (small radius, single fuel type) to validate inputs.
- Inspect logs for each call (endpoint, status code, latency). Tight loops or repeated calls may indicate ambiguous prompts—clarify the prompt or add guardrails.
Step 7 — Tips for robust reasoning
- Validate inputs rigorously (ranges, formats, enums).
- Prefer smaller composable tools over one mega-tool.
- Add retries and fallbacks for network calls (see the sketch after this list).
- Log every tool call and surface failures to the user clearly.
- Keep outputs stable and typed; LLMs thrive on consistent schemas.
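For the retry point above, a minimal sketch is a wrapper around the request helper with a couple of attempts and a short backoff; the counts and delays are illustrative:

import time

def _make_request_with_retry(ctx, endpoint, params=None, attempts=3, backoff=0.5):
    # Retry transient failures (network errors, 5xx); surface 4xx immediately.
    for attempt in range(1, attempts + 1):
        try:
            return _make_request(ctx, endpoint, params)
        except httpx.HTTPStatusError as exc:
            if exc.response.status_code < 500 or attempt == attempts:
                raise
        except httpx.TransportError:
            if attempt == attempts:
                raise
        time.sleep(backoff * attempt)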
Also consider:
- Add a small in-memory cache for frequently requested lookups (e.g., station details for 5 minutes); see the sketch after this list.
- Version your tool schemas; include a schema_version in responses when you evolve fields.
- Provide explicit enums for fuel types; reject unknown values with helpful error messages.
- Add guardrails for large radii or date ranges to prevent expensive queries.
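For the caching bullet at the top of this list, a minimal in-memory TTL cache is usually enough for a single-process server; the five-minute TTL is illustrative:

import time

_cache: dict[str, tuple[float, object]] = {}
CACHE_TTL_SECONDS = 300  # five minutes, illustrative

def _cached(key: str, fetch):
    """Return a fresh cached value for key, or call fetch() and store the result."""
    now = time.monotonic()
    hit = _cache.get(key)
    if hit and now - hit[0] < CACHE_TTL_SECONDS:
        return hit[1]
    value = fetch()
    _cache[key] = (now, value)
    return value

# Usage inside get_station_details, for example:
#   return _cached(f"station:{station_id}",
#                  lambda: _make_request(ctx, f"/estaciones/{station_id}"))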
Antipatterns to avoid:
- One mega-tool that “does everything.” It becomes brittle and opaque.
- Silent failures. Users (and LLMs) need clear reasons to adapt the next step.
- Overfetching. Big payloads slow everything down and rarely improve answers.
FAQ
Q: Can I expose authenticated endpoints via MCP?
A: Yes. Keep auth out of prompts. Handle tokens/keys inside the server, scope them minimally, and redact sensitive fields from logs.
Q: How do I prevent the model from spamming tools?
A: Rate-limit per conversation and add lightweight debouncing (ignore identical repeated calls within a short window). Clarify prompts with confirmations for expensive operations.
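A minimal sketch of that debouncing idea, keeping a timestamp per identical call; the two-second window is illustrative:

import time

_recent_calls: dict[tuple, float] = {}
DEBOUNCE_SECONDS = 2.0  # illustrative window

def should_run(tool_name: str, params: dict) -> bool:
    """Return False if the exact same call was seen within the debounce window."""
    key = (tool_name, tuple(sorted(params.items())))
    now = time.monotonic()
    last = _recent_calls.get(key)
    _recent_calls[key] = now
    return last is None or now - last > DEBOUNCE_SECONDS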
Q: How do I test this without an LLM client?
A: Write thin unit tests that call your tool functions directly with representative inputs/outputs. Then add an integration test that runs the server and exercises a happy-path conversation script.
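As a minimal sketch of that unit-test idea, assuming pytest plus the respx library for mocking httpx, and a tiny stand-in for the fastmcp Context:

# test_precioil_mcp_server.py
import httpx
import respx

import precioil_mcp_server as server

class StubContext:
    """Minimal stand-in for fastmcp.Context: just collects log lines."""
    def __init__(self):
        self.logs = []

    def log(self, message):
        self.logs.append(message)

@respx.mock
def test_make_request_returns_json_payload():
    # Intercept the outgoing HTTP call so the test never hits the real API.
    respx.get("https://api.precioil.es/estaciones/radio").mock(
        return_value=httpx.Response(200, json={"estaciones": []})
    )
    data = server._make_request(StubContext(), "/estaciones/radio")
    assert data == {"estaciones": []}

For decorated tools, your FastMCP version may expose the underlying function (often via a .fn attribute); otherwise keep the tools thin and test the plain helpers directly.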
Wrap-up
REST made services accessible. MCP makes them usable in natural language. You don’t need to rebuild your product—just expose the capabilities you already have as tools, with contracts that make sense to both machines and people. Start small, learn from real conversations, and iterate quickly.
What changed:
- From endpoints to tools: contracts that encode intent, not just transport.
- From flows to reasoning: the assistant plans and calls your capabilities.
- From rigid UI paths to dialogue: users describe outcomes, systems execute.
What to do next (practical sequence):
- Pick one high-impact intent (e.g., “cheapest 95 within 5 km”).
- Expose a minimal set of tools that make that answer possible.
- Add observability (logs, latencies, outcomes) from day one.
- Handle failure cases explicitly (timeouts, empty results, validation errors).
- Ship, watch transcripts, and refine prompts and contracts together.
Pitfalls to avoid:
- Vague inputs and implicit defaults that confuse the model.
- Overfetching large payloads that slow down every hop.
- “Do-everything” tools that are hard to reason about or test.
A short roadmap you can borrow:
- Add a geocoding tool to resolve places to coordinates.
- Cache station lookups and average prices briefly to cut latency.
- Persist usage metrics and surface a small analytics dashboard.
- Introduce auth and per-tool rate limits for multi-tenant scenarios.
- Track cost/latency budgets and enforce them per conversation.
Explore and contribute:
- Code and working PoC: fuel-mcp
- Ideas and PRs welcome; try adding a new tool or transport and open an issue.
Questions? Let’s chat!