MCP is not dead, it's being misused

Since the end of February 2026, the Model Context Protocol (MCP) has been under heavy criticism.

“MCP is dead, long live the CLI” is the title of an Eric Holmes article that went viral.

Perplexity’s CTO announced that the company was abandoning the protocol internally.

Garry Tan, president of Y Combinator, tweeted bluntly: “MCP sucks honestly.”

The OpenClaw project deliberately chose not to support it.

The grievances are well known:

Exploding token consumption,
Immature authentication,
Questionable operational reliability.

And they are legitimate — within a very specific architectural model. One that we never adopted at Agora.

The real problem: the LLM at the helm of MCP

The pattern everyone is attacking is the LLM-as-router. You inject into the model’s context the schemas of dozens of MCP tools — their descriptions, parameters, and constraints. The LLM must then choose which tool to call, with which arguments, and in what order. This is the dominant pattern in the current AI agent ecosystem.
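To make the pattern concrete, here is a minimal sketch of what ends up in the model's context under LLM-as-router. The tool shape (name, description, inputSchema) follows the MCP tool definition. Everything else is an illustrative assumption rather than a measurement: the tool names, the `build_router_context` helper, and the rough four-characters-per-token estimate.

```python
import json

def make_tool(name: str, description: str, params: dict[str, str]) -> dict:
    """Build an MCP-style tool definition (name, description, inputSchema)."""
    return {
        "name": name,
        "description": description,
        "inputSchema": {
            "type": "object",
            "properties": {p: {"type": t} for p, t in params.items()},
            "required": list(params),
        },
    }

def build_router_context(tools: list[dict]) -> str:
    """Serialize every tool schema into the prompt, as LLM-as-router does."""
    return "You can call these tools:\n" + "\n".join(json.dumps(t) for t in tools)

def estimate_tokens(text: str) -> int:
    """Very rough heuristic (assumption): about 4 characters per token."""
    return len(text) // 4

# Hypothetical HR tools; the names and parameters are made up for the example.
tools = [
    make_tool(f"hr_action_{i}", "Hypothetical HR operation.",
              {"employee_id": "string", "date": "string"})
    for i in range(50)
]

small = estimate_tokens(build_router_context(tools[:5]))
large = estimate_tokens(build_router_context(tools))
print(small, large)  # the fixed context cost grows with the number of tools
```

The point of the sketch is only the shape of the cost: it is paid on every request, and it scales with the size of the tool catalog rather than with what the user actually asked.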

And this is precisely where the criticisms hit the mark.

Cloudflare measured the gap on its own API: 1.17 million tokens to expose its 2,500 endpoints via native MCP schemas — more than the complete context window of the most advanced models. Even keeping only the required parameters, it remains at 244,000 tokens. Their solution, Code Mode, brings everything down to ~1,000 tokens through two generic tools (search() and execute()), a 99.9% reduction.
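The arithmetic behind these figures is easy to check. The short sketch below only reuses the numbers quoted above; the `reduction_pct` helper is a name introduced for the example.

```python
def reduction_pct(before: int, after: int) -> float:
    """Percentage of tokens saved when going from `before` to `after`."""
    return (1 - after / before) * 100

full_schemas = 1_170_000   # 2,500 endpoints exposed as native MCP schemas
required_only = 244_000    # same endpoints, required parameters only
code_mode = 1_000          # approximate figure for search() and execute()

print(f"{reduction_pct(full_schemas, code_mode):.1f}%")       # 99.9%
print(f"{reduction_pct(full_schemas, required_only):.1f}%")   # 79.1%
print(f"{full_schemas / 2_500:.0f} tokens per endpoint")      # 468 tokens per endpoint
```

Trimming the schemas to required parameters saves about 79%, which still leaves a context-sized payload; only removing the schemas from the context altogether reaches the 99.9% figure.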

When every token has a cost in latency, energy, and euros, the native MCP equation no longer holds up.

Not to mention the risk of error: an LLM choosing among 50 tools can make the wrong call, invent parameters, or trigger a destructive action on a misunderstanding.

The diagnosis is correct. But the conclusion — “we must abandon MCP” — confuses the protocol with an architectural pattern.

Our approach: separating understanding from execution


At Agora, we made a fundamentally different architectural choice from the very design of our agentic platform.

The LLM never sees the MCP schemas. It does not choose which tool to call. It does not construct the parameters of an API call.

The flow is as follows:

The LLM intervenes on what it does best: understanding natural language.

It classifies the user’s intent (“I want to request time off,” “show my January payslip”),
It extracts the mentioned entities (dates, names, amounts),
It detects conversation follow-ups.
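As an illustration of what this understanding step could hand back to the code, here is a minimal sketch. The JSON shape, the field names (`intent`, `entities`, `is_follow_up`), and the intent identifiers are assumptions made for the example, not our actual format.

```python
import json
from dataclasses import dataclass, field

@dataclass
class Understanding:
    """What the LLM returns: an intent label, entities, a follow-up flag."""
    intent: str
    entities: dict[str, str] = field(default_factory=dict)
    is_follow_up: bool = False

def parse_understanding(raw: str, known_intents: set[str]) -> Understanding:
    """Parse the model's JSON reply and reject intents the agent does not know."""
    data = json.loads(raw)
    intent = data.get("intent", "unknown")
    if intent not in known_intents:
        intent = "unknown"  # the model cannot invent a new capability
    return Understanding(
        intent=intent,
        entities=dict(data.get("entities", {})),
        is_follow_up=bool(data.get("is_follow_up", False)),
    )

# Hypothetical model reply for "I want to request time off from March 15".
raw_reply = (
    '{"intent": "request_time_off", '
    '"entities": {"start_date": "2026-03-15"}, "is_follow_up": false}'
)
result = parse_understanding(raw_reply, {"request_time_off", "show_payslip"})
print(result)
```

Note that the output contains no tool name and no API parameter: it is a description of what the user wants, nothing more.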

MCP execution driven by a DSL and a dedicated SDK

It is then our SDK, driven by a declarative DSL, that takes over.

A DSL — Domain-Specific Language — is a language specialized for a specific domain. Unlike a general-purpose language like Python, a DSL expresses business rules concisely and readably.

In our case, this DSL is injected into the LLM’s context as a structured prompt: it describes the intents recognized by the agent, the expected parameters for each intent, and the dialog rules to follow to collect them.
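To give an idea of the principle, the sketch below uses a plain Python dictionary as a stand-in for a declarative description of intents, and renders it into a classification prompt. It is not our DSL: the structure, the intent names, the `render_classification_prompt` helper, and some of the questions are hypothetical.

```python
# Stand-in for a declarative agent description: intents, parameters, questions.
AGENT_SPEC = {
    "request_time_off": {
        "description": "The user wants to request time off.",
        "parameters": {
            "start_date": "For which dates would you like to request this time off?",
            "end_date": "Until which date?",
            "leave_type": "Is this paid leave or a compensatory day off?",
        },
    },
    "show_payslip": {
        "description": "The user wants to see a payslip.",
        "parameters": {"month": "For which month?"},
    },
}

def render_classification_prompt(spec: dict) -> str:
    """Turn the declarative spec into a structured prompt for the LLM."""
    lines = [
        "Classify the user's message into one of these intents "
        "and extract the listed parameters.",
        "",
    ]
    for intent, body in spec.items():
        params = ", ".join(body["parameters"])
        lines.append(f"- {intent}: {body['description']} Parameters: {params}")
    lines.append("")
    lines.append('Reply with JSON: {"intent": ..., "entities": {...}, "is_follow_up": ...}')
    return "\n".join(lines)

prompt = render_classification_prompt(AGENT_SPEC)
print(prompt)
```

The rendered prompt lists business intents and their parameters only. No MCP tool name, endpoint, or schema appears in it.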

What is the key difference from the LLM-as-router pattern?

Instead of injecting hundreds of MCP tool schemas into the context and asking the LLM to choose the right endpoint with the right parameters, we provide it with a targeted business grammar. The LLM only needs to understand what the user wants to do — not how the API works.

This prompt specialization makes classification significantly more stable and considerably reduces the cognitive load imposed on the model.

The result: fewer tokens in context, less ambiguity, fewer errors.

Once the intent is classified and entities are extracted, the SDK — on the code side — knows which intent corresponds to which MCP call. And it orchestrates a collection dialog when information is missing:

“For which dates would you like to request this time off?”
“Is this paid leave or a compensatory day off?”

The MCP call is only triggered once all arguments are collected and validated. Not before.
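The logic can be sketched as a small deterministic function: given an intent and the arguments collected so far, it either returns the next question or builds the MCP request. The `tools/call` JSON-RPC method is the one defined by MCP. The routing table, the tool name `create_leave_request`, the validators, and the `next_step` helper are assumptions made for the example.

```python
from datetime import date

def is_iso_date(value: str) -> bool:
    """Accept only values that parse as an ISO date such as 2026-03-15."""
    try:
        date.fromisoformat(value)
        return True
    except ValueError:
        return False

def is_leave_type(value: str) -> bool:
    return value in {"paid", "compensatory"}

# Hypothetical parameter catalog: question to ask and validator to apply.
PARAMETERS = {
    "start_date": ("For which dates would you like to request this time off?", is_iso_date),
    "end_date": ("Until which date?", is_iso_date),
    "leave_type": ("Is this paid leave or a compensatory day off?", is_leave_type),
}

# Hypothetical routing table: intent -> (MCP tool name, required arguments).
ROUTES = {
    "request_time_off": ("create_leave_request", ["start_date", "end_date", "leave_type"]),
}

def next_step(intent: str, collected: dict[str, str], request_id: int = 1) -> dict:
    """Ask for the first missing or invalid argument, else build the MCP call."""
    tool, required = ROUTES[intent]
    for name in required:
        question, is_valid = PARAMETERS[name]
        value = collected.get(name)
        if value is None or not is_valid(value):
            return {"action": "ask", "parameter": name, "question": question}
    return {
        "action": "call",
        "request": {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {"name": tool, "arguments": {k: collected[k] for k in required}},
        },
    }

partial = next_step("request_time_off", {"start_date": "2026-03-15", "end_date": "2026-03-22"})
complete = next_step(
    "request_time_off",
    {"start_date": "2026-03-15", "end_date": "2026-03-22", "leave_type": "paid"},
)
print(partial["question"])
print(complete["request"])
```

The choice of tool and the shape of the arguments live in code and configuration. The model contributes values, never the call itself.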

What this changes in practice

Tokens used for understanding, not for routing

In the LLM-as-router model, most of the token budget is consumed by tool descriptions injected into the context. And that is before the user has even asked a question.

In our system, the LLM’s context contains the conversation history and the classification prompt. The MCP schemas do not appear there.

Result: token consumption per request is considerably reduced, which translates directly into higher throughput on our local inference infrastructure.

Guaranteed argument collection

The classic pattern relies on the hope that the LLM will correctly extract all parameters from the user’s message in a single pass. When a required parameter is missing, the behavior is unpredictable: an invented parameter, a partial call, or a silent failure.

Our automatic dialog system detects missing arguments and engages a structured conversation to collect them.

Disambiguation (“you have two managers, which one?”) and confirmation (“I’m going to request leave from March 15 to 22, is that correct?”) are workflow steps, not emergent model behaviors.
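One way to picture "workflow steps" is an explicit state function, where the order collect, disambiguate, confirm, execute is fixed by code. This is a simplified sketch under assumptions: the `Step` enum, the `next_workflow_step` signature, and the message wording are hypothetical.

```python
from enum import Enum, auto

class Step(Enum):
    COLLECT = auto()
    DISAMBIGUATE = auto()
    CONFIRM = auto()
    EXECUTE = auto()

def next_workflow_step(
    missing: list[str],
    ambiguous: dict[str, list[str]],
    summary: str,
    confirmed: bool,
) -> tuple[Step, str]:
    """Return the next step and the message to show, in a fixed order."""
    if missing:
        return Step.COLLECT, f"Please provide: {missing[0]}"
    if ambiguous:
        name, options = next(iter(ambiguous.items()))
        return Step.DISAMBIGUATE, f"Several values match '{name}': {', '.join(options)}. Which one?"
    if not confirmed:
        return Step.CONFIRM, f"{summary}, is that correct?"
    return Step.EXECUTE, ""

summary = "I'm going to request leave from March 15 to 22"
step, message = next_workflow_step([], {"manager": ["A. Martin", "B. Durand"]}, summary, False)
print(step, message)
```

Because the ordering is a plain conditional, the agent cannot skip confirmation or execute while an ambiguity remains, whatever the model outputs.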

Authentication: a non-problem

Among the recurring criticisms directed at MCP, authentication comes up systematically.

Eric Holmes sums up the prevailing sentiment: “Why should a protocol for giving tools to an LLM need to worry about authentication?”

CLIs rely on proven mechanisms — aws sso login, gh auth login, kubeconfig — and it works.

But this criticism confuses two things:

The immaturity of auth implementation in the community MCP ecosystem, and
The protocol’s intrinsic ability to integrate with robust authentication solutions.

MCP, as a protocol based on JSON-RPC, does not need to reinvent authentication. It can — and should — rely on proven standards that already exist: SAML for enterprise SSO, OAuth 2.1 for access delegation, OpenID Connect for federated authentication.

The protocol’s 2026 roadmap continues in this direction, building on the OAuth 2.1-based authorization framework and the streamable HTTP transport that the specification already defines.

At Agora, MCP call authentication is handled upstream by the platform. The SDK controls the entire chain: user authentication, permission verification, then transmission of a validated identity context to the MCP server.

The conversational agent never directly handles credentials. This is not a limitation but a separation of responsibilities. It is the same principle that keeps a controller in a web application from managing JWTs itself: it delegates to an authentication middleware.
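The middleware analogy can be sketched in a few lines. This is a simplified illustration, not our SDK: the `IdentityContext` type, the permission names, the `call_tool` wrapper, and the way identity is passed to the transport are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class IdentityContext:
    """Identity already authenticated upstream (SSO, OAuth, ...)."""
    user_id: str
    permissions: frozenset[str]

class PermissionDenied(Exception):
    pass

# Hypothetical mapping from MCP tool name to the permission it requires.
REQUIRED_PERMISSION = {"create_leave_request": "leave:write"}

def authorize(identity: IdentityContext, tool: str) -> None:
    """Deny by default: unknown tools and missing permissions are rejected."""
    needed = REQUIRED_PERMISSION.get(tool)
    if needed is None or needed not in identity.permissions:
        raise PermissionDenied(f"{identity.user_id} may not call {tool}")

def call_tool(
    identity: IdentityContext,
    tool: str,
    arguments: dict,
    transport: Callable[[str, dict, dict], dict],
) -> dict:
    """Check permissions, then forward the call with the validated identity."""
    authorize(identity, tool)
    return transport(tool, arguments, {"user_id": identity.user_id})

def fake_transport(tool: str, arguments: dict, identity_headers: dict) -> dict:
    """Stand-in for the real MCP client; it just echoes what it received."""
    return {"tool": tool, "arguments": arguments, "on_behalf_of": identity_headers["user_id"]}

alice = IdentityContext("alice", frozenset({"leave:write"}))
response = call_tool(alice, "create_leave_request", {"leave_type": "paid"}, fake_transport)
print(response)
```

Neither the prompt nor the model's output appears anywhere in this path: credentials and permission checks stay entirely on the code side.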

In fact, organizations struggling with MCP auth are generally those trying to have the LLM manage authentication itself, or deploying community MCP servers without an upstream security layer. In a controlled architecture, authentication has been a solved problem for a long time.

MCP as an interoperability standard, not as an agent engine

The current debate pits two caricatured visions against each other: “MCP everywhere” versus “MCP nowhere.” As is often the case, reality is more nuanced.

MCP remains an excellent interoperability standard between an AI platform and the business applications it must control. It offers a structured, versionable, documentable interface contract.

For an HRIS, ERP, or CRM vendor looking to make its application accessible through a conversational agent, MCP provides a clear framework, far more so than “expose a CLI and let the LLM figure it out.”

What is problematic is not MCP as a protocol. It is the architectural pattern of exposing the entire MCP surface to the LLM and entrusting it with routing and parameter decisions.

Separate natural language understanding from action execution.
Give the LLM the role of understanding the user.
Give deterministic code the role of driving the calls.

And you get an MCP that does exactly what it was designed for: a reliable integration standard between systems.

MCP is not dead

MCP is not dead. But the naive architecture that leaves the LLM alone at the controls — choosing the tool, guessing the parameters, triggering execution — deserves the criticism it receives.

At Agora, we build conversational agents for business software vendors who process sensitive data on sovereign infrastructure.

Entrusting MCP routing to a probabilistic model was never an option. Our stack relies on an LLM for understanding, a DSL for routing, dialogs for collection, and MCP for interoperability.

The result: fewer tokens consumed, reliable actions, protected data, controlled authentication, and a protocol that does what is asked of it — nothing more, nothing less.