Beyond request-response: How MCP servers are learning to collaborate
How MCP is evolving from model-driven execution to collaborative workflows with sampling and elicitation.
The Model Context Protocol (MCP) started with a clean architectural principle: servers expose tools, and LLMs decide when to use them. It’s a simple request–response model that mirrors decades of API design. The LLM asks, the server answers, and control flows in one direction.
That model worked well at first. But as MCP moved into production, it began to show its limits.
Real systems need to handle sensitive authentication flows, resolve ambiguity without guessing, and sometimes rely on the server’s own intelligence to move work forward. These needs exposed gaps in a strictly model-driven execution model.
MCP’s response was not to abandon its original design, but to extend it. The protocol introduced a set of collaboration patterns (sampling, URL-mode elicitation, and form-mode elicitation) that let servers participate more actively in workflows while keeping control explicit and reviewable.
In this article, we’ll look at how those patterns emerged, what production problems they solve, and how they are reshaping the balance of responsibility between models, servers, and users. We’ll also look at the growing debate around bidirectional tool calls and what it reveals about how far server-side intelligence should go.
1. Sampling: Servers that think
Sampling allows MCP servers to ask the model for help, rather than treating it as a one-way caller.
Instead of the model unilaterally deciding everything, the server can:
- Request a completion
- Ask for reasoning about intermediate state
- Validate assumptions before proceeding
This changes the relationship subtly but significantly.
The server is no longer just executing commands. It is collaborating with the model, using it as a reasoning engine while still retaining ownership of state and execution.
And here’s where it gets powerful: The user is always in the loop.
Before the request is sent to the model, the user can see the message, edit it, or reject it altogether if something looks off. After the model returns a completion, the same review applies. The user can:
- Review the output
- Edit or improve it, especially for quality or safety
- Reject it before it is passed back to the server
This built-in review loop makes sampling ideal for use cases where accuracy, transparency, or human judgment is essential, like content moderation, customer support, or decision-making systems.
Example: Using sampling to validate assumptions before execution
Imagine an MCP server that manages access to internal datasets and generates analytical reports. A user asks: “Generate a report on customer churn for last quarter.”
The model interprets the request and proposes querying a churn dataset and generating a summary. However, in production, the server knows that:
- there are multiple churn definitions
- the wrong choice could lead to incorrect reporting
- this decision should not be made silently
Instead of executing immediately, the server initiates sampling.
First, the server requests a completion from the model asking it to explain:
- which churn definition it intends to use
- how it interprets “last quarter”
- what assumptions it is making
That sampled output is returned to the client and shown to a human reviewer, for example: “I plan to use the paid-customer churn dataset and interpret last quarter as Q4 of the calendar year.”
At this point, the human can approve, edit (for example, change the quarter definition), or reject the plan. Only after this review does the server proceed.
Sampling supports this loop by letting the server surface the ambiguity, ask the model to state its assumptions, and have a human confirm the plan before anything runs.
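The server side of this example can be sketched as follows. The request shape loosely follows MCP's `sampling/createMessage` method, but the helper names and exact fields are simplified for illustration:

```python
# Hypothetical sketch: the server builds a sampling request asking the
# model to state its assumptions, then proceeds only after approval.

def build_sampling_request(user_query: str) -> dict:
    prompt = (
        f"Before acting on '{user_query}', state which churn definition "
        "you intend to use, how you interpret the date range, and any "
        "other assumptions you are making."
    )
    return {
        "method": "sampling/createMessage",
        "params": {
            "messages": [
                {"role": "user", "content": {"type": "text", "text": prompt}}
            ],
            "maxTokens": 300,
        },
    }

def proceed(sampled_plan: str, approved: bool) -> str:
    # The server moves from planning to execution only after explicit approval.
    return f"EXECUTE: {sampled_plan}" if approved else "HALTED: plan rejected"
```

Note that the server, not the model, decides when a sampling round is needed; the model is consulted as a reasoning engine, and the human remains the final gate.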
2. URL-mode elicitation: Secure user interactions
Some interactions cannot safely happen inside an MCP client or model context.
OAuth authorization, credential entry, payment flows, and enterprise SSO all require trusted, external surfaces. Passing these through the model is not just risky; it is often unacceptable.
URL-mode elicitation gives MCP servers a first-class way to say: “This workflow must leave the model and client context before it can continue.”
Instead of relying on ad-hoc errors or undocumented redirects, servers can explicitly pause execution and require an out-of-band user interaction.
Example: Using URL-mode elicitation to enforce an OAuth security boundary
Imagine an MCP server that integrates with GitHub to fetch private repositories on behalf of a user.
A user asks: “Show me all open pull requests in my private repositories.”
The model understands the intent and suggests calling a tool that queries the GitHub API. However, the server immediately knows there is a problem: it does not yet have permission to access the user’s private repositories.
Critically, the server cannot safely ask the user for a GitHub access token inside the MCP client. Doing so would expose sensitive credentials to the client and potentially to the model context, which is unacceptable.
Instead, the server initiates URL-mode elicitation.
The server pauses the workflow and sends a URL-mode elicitation request to the client with a message like: “To continue, you need to authorize access to your GitHub account.”
Along with the message, the server provides a URL pointing to GitHub’s official OAuth authorization page. The client presents this message to the user and opens the URL in a trusted browser environment.
At this point:
- The user completes the OAuth flow directly with GitHub
- Credentials and authorization codes never pass through the MCP client
- The server receives the OAuth callback on its own endpoint and exchanges the code for an access token
Only after the server has securely obtained and stored the token does it mark the elicitation as complete and resume the MCP workflow.
From the model’s perspective, nothing sensitive ever occurred. From the client’s perspective, the interaction was explicit and understandable. From the server’s perspective, the security boundary was enforced without relying on undocumented behavior or special cases.
This is the core value of URL-mode elicitation: it allows MCP servers to require sensitive interactions to happen outside the protocol, while still integrating them cleanly into the workflow’s control flow.
3. Form-mode elicitation: Runtime user input
Not all problems are about security or intelligence. Some are simply ambiguous.
Form-mode elicitation allows a server to pause and ask the user for structured input when multiple valid paths exist. Instead of guessing, the system can ask for clarification and then resume execution.
This is especially important for workflows where incorrect assumptions are costly.
Form-mode elicitation addresses a subtle but common issue in LLM-driven systems: models are too willing to guess when they should ask.
Example: Using form-mode elicitation to resolve runtime ambiguity
Imagine an MCP server that manages database migrations across multiple environments.
A user asks: “Apply the latest schema changes.”
The model understands the request and proposes running a migration tool. However, from the server’s perspective, the request is ambiguous: there are multiple environments, and applying the migration to the wrong one could cause downtime or data loss.
In a purely model-driven setup, the model might guess based on prior context or typical usage. That guess could be wrong.
Instead, the server initiates form-mode elicitation to disambiguate the request.
The server pauses execution and asks the client to collect structured input from the user, such as:
- Which environment should the migration run against?
- Should the migration be applied immediately or scheduled?
The server includes a JSON Schema that defines the expected fields and valid values. The client presents this to the user using its own interface and returns the response as structured data.
Once the server receives the response, it can:
- Validate the input against the schema
- Select the correct execution path
- Proceed deterministically with the migration
No guessing is involved, and no sensitive data is exposed. The ambiguity is resolved explicitly at runtime.
This is the core purpose of form-mode elicitation. It provides a controlled way to stop execution when multiple valid interpretations exist and to resume only once the missing information is supplied. In practice, it helps prevent a common failure mode of LLM-based systems: confidently choosing the wrong path when the correct move is to ask a question.
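The migration example can be sketched like this. The `requestedSchema` field mirrors what MCP's form-mode elicitation uses to describe expected input, but the schema contents and the minimal validator are illustrative:

```python
# Hypothetical sketch of the migration example: the server describes the
# fields it needs as a JSON Schema, then validates the structured response
# before choosing an execution path.

MIGRATION_SCHEMA = {
    "type": "object",
    "properties": {
        "environment": {"type": "string", "enum": ["staging", "production"]},
        "timing": {"type": "string", "enum": ["immediately", "scheduled"]},
    },
    "required": ["environment", "timing"],
}

def build_form_elicitation() -> dict:
    return {
        "method": "elicitation/create",
        "params": {
            "message": "Confirm where and when the migration should run.",
            "requestedSchema": MIGRATION_SCHEMA,
        },
    }

def validate(response: dict) -> bool:
    # Minimal required/enum check; a real server would use a full JSON
    # Schema validator such as the `jsonschema` package.
    props = MIGRATION_SCHEMA["properties"]
    return all(
        field in response and response[field] in props[field]["enum"]
        for field in MIGRATION_SCHEMA["required"]
    )
```

Because the answer comes back as structured data rather than free text, the server can branch on it deterministically instead of re-interpreting natural language.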
Bidirectional tool calls: The next frontier?
The patterns we’ve covered so far are production-ready and officially supported. But there’s an active debate in the MCP community about whether the protocol should go further.
A proposal in the MCP GitHub repository (SEP-1006) asks a provocative question:
What if servers could proactively call tools on the agent, instead of only responding to requests?
Consider this scenario.
An expense-management server detects that a new corporate policy takes effect tomorrow. Instead of waiting for the user to ask about expenses, the server initiates an interaction: “I noticed you have pending receipts. The new policy starts tomorrow. Do you want to submit these under the current rules?”
This would flip the current MCP dynamic entirely.
Under this model:
- Servers could push notifications that trigger agent workflows.
- Event-driven architectures would become native to MCP.
- Real-time systems could react immediately to external changes.
- Multi-agent coordination could be handled directly at the protocol level.
In other words, MCP would move beyond request–response into something closer to collaborative, event-driven execution.
The proposal does not ignore security concerns. It includes mechanisms such as:
- Explicit capability handshakes.
- Whitelisting which tools a server is allowed to invoke.
- Client-side rejection and filtering of server-initiated actions.
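Since SEP-1006 is still a proposal, no official message shape exists; the following is a hypothetical sketch of the client-side safeguards listed above, combining a whitelist agreed at handshake time with rejection of anything outside it:

```python
# Hypothetical sketch of client-side filtering of server-initiated tool
# calls. The tool names and result shape are illustrative only.

ALLOWED_SERVER_TOOLS = {"notify_user", "schedule_reminder"}  # from capability handshake

def filter_server_call(tool: str, args: dict) -> dict:
    """Decide whether a server-initiated tool call may run on the agent."""
    if tool not in ALLOWED_SERVER_TOOLS:
        return {"accepted": False, "reason": f"'{tool}' is not whitelisted"}
    return {"accepted": True, "tool": tool, "args": args}
```

Whatever final shape the proposal takes, the principle is the same one the rest of MCP follows: the client, acting for the user, retains veto power over anything the server initiates.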
Even so, the idea is controversial precisely because it challenges one of MCP’s original principles:
the model initiates, the server responds.
The community discussion around SEP-1006 makes the tension clear. On one hand, these capabilities would unlock powerful and pragmatic patterns that production systems clearly want. On the other hand, allowing servers to initiate actions raises real questions about trust, consent, predictability, and abuse.
Notably, many teams already work around this limitation today using side channels such as WebSockets, background polling, or external event buses. The proposal surfaces an uncomfortable truth: the need for bidirectional interaction hasn’t gone away just because the protocol doesn’t formally support it.
Future directions: Collaboration without autonomy
The direction MCP appears to be heading toward is not autonomous servers replacing model-driven workflows. Instead, it points to more structured collaboration, with clearer boundaries between roles.
A likely future MCP model looks something like this:
- Models remain responsible for reasoning, interpretation, and user-facing language.
- Servers increasingly own policy, state, and execution control.
- Humans are involved at explicit, reviewable checkpoints rather than implicitly through model guesses.
- The protocol encodes these boundaries instead of leaving them to convention.
The unresolved question is how proactive servers should be allowed to become.
Bidirectional interaction could make MCP more expressive for event-driven and multi-agent systems. But every step in that direction must be weighed against the risks of unintended execution, loss of user intent, and expanded attack surfaces.
If MCP’s recent history is any guide, the protocol will likely continue to evolve cautiously. Features that make collaboration explicit, reviewable, and enforceable will win out over those that introduce silent autonomy.
Why this debate matters
The discussion around true bidirectional tool calls isn’t just about adding another feature to MCP. It exposes a deeper question about how AI systems should be structured as they move from experiments to infrastructure.
So far, MCP’s evolution has followed a consistent pattern. Each new capability has emerged in response to a specific production failure mode:
- Sampling exists because models need guidance and oversight before execution.
- Form elicitation exists because ambiguity should be resolved explicitly, not guessed.
- URL-mode elicitation exists because security boundaries must be enforced outside the model and client context.
Each of these features increases collaboration without handing full control to the server.
The bidirectional proposal pushes on that boundary. It asks whether collaboration should remain reactive or whether servers should sometimes be allowed to initiate workflows themselves. That question matters because it determines who ultimately owns control, intent, and authority in MCP-based systems.
Conclusion
MCP did not abandon its original design. It evolved beyond it.
Sampling, elicitation, and the ongoing debate around bidirectionality all point to the same lesson: real systems require collaboration, not just delegation.
As MCP matures, servers are no longer passive endpoints. They are becoming active collaborators in execution, shaping workflows alongside models and users rather than simply responding to requests.
The open question is no longer whether servers should become more capable. It is how far that capability should extend, and where the protocol should draw its boundaries.
How MCP answers that question will define what kind of infrastructure it becomes.