MCP Is Not a Connector — It's Your Governance Surface for the Agent Era

By SocialHub.AI Team

Most coverage frames the Model Context Protocol as plumbing. The more consequential enterprise reframe: MCP is the single auditable place where identity, authorization, logging and cost control for AI tool-use get decided.

The framing problem: "connector" undersells the decision you're actually making

When Anthropic introduced the Model Context Protocol (MCP) in November 2024 as an open standard for connecting AI systems to data sources and business tools, the industry did what it usually does with a new integration spec: it filed MCP under plumbing. Adapters, transports, a tidy way to stop writing one-off glue for every model and every tool. All true, and all beside the point if you sit in the CIO, CTO, or CISO chair.

Here is the reframe that matters for an enterprise. The moment an agent can call a tool that reads your customer data or writes to your systems of record, you have created a new control point — whether or not you named it. MCP is the place where that control point can be made explicit. It is not merely how the agent reaches your tools; it is where you decide who is calling, on whose identity, which tools are even visible, how each call is logged, and what each call is allowed to cost. Those are governance questions, not connectivity questions.

Treat MCP as a connectivity feature and you'll optimize for the wrong thing — coverage, integration count, time-to-first-call. Treat it as a governance surface and you'll optimize for what actually determines whether agents are safe to deploy at scale: a single, standardized, auditable interface where policy is enforced once and observed everywhere.

Why "many custom integrations" quietly recreates governance N times

Most enterprises don't arrive at the agent era with a clean slate. They arrive with a portfolio of point integrations: a bespoke connector here, a service account there, an API wrapper a team built two quarters ago that nobody fully remembers. Each of these was a reasonable local decision. Collectively, they are a governance problem, because each one re-implements identity, authorization, logging, and rate/cost control in its own idiom — or skips some of those entirely.

Three failure modes follow predictably. First, identity drifts: one integration runs as a shared service account, another impersonates the end user, a third uses a long-lived key minted by someone who has since changed teams — so "who did this?" has N different answers, none complete. Second, authorization fragments, because the rules for what a tool may do live in N codebases with N review cadences; a control you tightened in one place silently stays loose in another. Third — the one finance and audit feel first — the numbers drift, as the same question answered through two integration paths returns two answers. Once stakeholders stop trusting the numbers, the agent program stops mattering.

The standardized alternative is neither magic nor free, but it changes the shape of the work. Instead of governing N integrations N times, you govern one interface once and expose tools through it deliberately. Identity is established in a single place. Authorization is a property of the interface, not a habit of each connector. Logging is uniform, so the audit trail is actually a trail. And the numbers stop drifting because there is one path to compute them.

Identity and tenant context: decide it at the door, not in the tool

The first question a governance surface has to answer is whose authority a call carries. In a multi-tenant SaaS context this is subtle, because there are two distinct facts in play: the identity of the caller (which agent, acting for which user or service) and the tenant context the call operates within (which customer's data and entitlements apply). Conflate them and you get the classic confused-deputy bug, where an agent legitimately authenticated for one purpose is steered into acting with privileges it should never have inherited.

The discipline worth adopting is to derive tenant context from the server-side tenant binding, not from whatever the caller's session happens to assert. In our own implementation, the customer tier that gates feature access is resolved from tenant context, never read back from the caller's session — precisely so that a caller cannot widen its own blast radius by presenting a richer-looking session. Identity tells you who is asking; tenant context tells you what universe of data and entitlements the answer must be computed within. Both are decided at the door, before any tool runs.
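A minimal sketch of that rule, not our production code: the lookup tables, the `CallContext` type, and the `resolve_context` function are hypothetical names chosen for illustration. The point it shows is that tenant and tier come from a server-side record keyed by the credential, while anything the session asserts about them is ignored.

```python
from dataclasses import dataclass

# Hypothetical server-side bindings: credential -> tenant, tenant -> tier.
KEY_TO_TENANT = {"key-alpha": "tenant-a", "key-beta": "tenant-b"}
TENANT_TIER = {"tenant-a": "basic", "tenant-b": "enterprise"}


@dataclass(frozen=True)
class CallContext:
    caller_id: str  # who is asking (agent or service identity)
    tenant_id: str  # which customer's data applies
    tier: str       # entitlement level, always from the server-side record


def resolve_context(key_id: str, caller_id: str, session_claims: dict) -> CallContext:
    """Build the call context at the door, before any tool runs."""
    tenant_id = KEY_TO_TENANT.get(key_id)
    if tenant_id is None:
        raise PermissionError("unknown credential")
    # session_claims is deliberately not consulted for tenant or tier:
    # a caller cannot widen its entitlements by asserting a richer session.
    return CallContext(caller_id=caller_id, tenant_id=tenant_id,
                       tier=TENANT_TIER[tenant_id])
```

A caller holding `key-alpha` that presents `{"tier": "enterprise"}` in its session still resolves to the `basic` tier, because the session claim never reaches the lookup.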

This is also where you make the call that agents are not users. A human in a console has eyes, hesitation, and a manager. An agent has a loop. The identity and context you grant it should reflect that it can act faster and more literally than any person you've provisioned.

Per-tool authorization: exposure is a decision, not a default

The most common and most dangerous default in early agent deployments is all-or-nothing exposure: connect the agent to the platform and let it reach everything the underlying credentials can reach. Convenient in a demo, indefensible in production. A governance surface treats each tool's visibility and permission as a deliberate, reviewable decision.

Concretely, authorization should be multi-axial rather than a single on/off switch. We use a three-axis model. The two axes that matter here are scope and tier, with tier sourced from tenant context as described above, so that a given tool is exposed only when the caller's scope permits it and the tenant's tier includes it. The point of the tier axis is that capability is not just a function of who is asking but of which customer's entitlements are in force. This keeps the exposure surface honest: adding a new tool to the catalog does not silently make it reachable by every agent in every tenant.

The mental model to give your architects is a catalog, not a switchboard. Every tool on the catalog earned its place through a decision about who may call it and under what tenant conditions. Tools that no agent currently needs are simply not on the catalog. "All-or-nothing" is a false economy; "deliberate per-tool" is the only posture that survives contact with a real audit.
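The catalog idea can be sketched in a few lines. This is an illustration under assumptions, covering only the scope and tier axes: the tier names, tool names, scope strings, and the `visible_tools` function are all hypothetical, and the tenant tier is assumed to have been resolved server-side already.

```python
from dataclasses import dataclass

# Hypothetical tier ordering; a higher rank includes everything below it.
TIER_RANK = {"basic": 0, "pro": 1, "enterprise": 2}


@dataclass(frozen=True)
class ToolEntry:
    name: str
    required_scope: str  # scope the caller must hold
    min_tier: str        # lowest tenant tier that includes this tool


# Every entry is an explicit exposure decision; unlisted tools do not exist
# as far as any agent is concerned.
CATALOG = [
    ToolEntry("segments.read", "read:segments", "basic"),
    ToolEntry("campaign.send", "write:campaigns", "enterprise"),
]


def visible_tools(caller_scopes: set, tenant_tier: str) -> list:
    """Return the names of tools this caller may see within this tenant."""
    rank = TIER_RANK.get(tenant_tier)
    if rank is None:
        return []  # unknown tier: expose nothing
    return [
        tool.name
        for tool in CATALOG
        if tool.required_scope in caller_scopes
        and TIER_RANK[tool.min_tier] <= rank
    ]
```

Adding a third entry to `CATALOG` changes nothing for existing agents unless their scopes and their tenant's tier both admit it, which is the property the catalog model is meant to guarantee.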

Fail-closed writes: the asymmetry between reading and acting

Reads and writes are not morally equivalent, and your governance surface should not pretend they are. A read that errs returns stale or empty data — recoverable, often invisible. A write that errs mutates a system of record, sends a message to a real customer, or moves money. The cost of being wrong is asymmetric, so the default posture must be asymmetric too.

The rule we hold to is that write tools fail closed. If authorization is ambiguous, if tenant context can't be cleanly resolved, if a guard can't be evaluated, the write does not happen. This is the opposite of the convenience-driven default many integrations inherit, where an unevaluated check resolves to "allow" so the happy path stays smooth. Fail-open is how you discover, weeks later, that an agent was writing under conditions nobody ever approved. For anything that mutates state or touches a customer, the safe default is refusal, and the burden of proof is on the call to demonstrate it is permitted.
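One way to express fail-closed behavior in code is sketched below. The `write_allowed` function and the idea of guards as zero-argument callables are assumptions made for illustration, not a description of any particular framework. The write proceeds only when every guard returns exactly `True`; a missing guard list, an ambiguous result, or an exception all resolve to refusal.

```python
from typing import Callable, Iterable


def write_allowed(guards: Iterable[Callable[[], bool]]) -> bool:
    """Fail closed: permit a write only if every guard returns exactly True."""
    guard_list = list(guards)
    if not guard_list:
        return False  # nothing approved this write, so it does not happen
    for guard in guard_list:
        try:
            if guard() is not True:
                return False  # explicit deny, or an ambiguous result like None
        except Exception:
            return False  # a guard that cannot be evaluated counts as a refusal
    return True
```

The fail-open variant differs by a single line (returning `True` in the `except` branch), which is why this default is worth reviewing explicitly rather than inheriting.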

There is a cultural dimension here. Engineers optimize for things working; security optimizes for things not breaking in the worst way. A governance surface is where you encode that the worst way — an unauthorized write executed silently — is unacceptable, even at the cost of occasionally refusing a write that would have been fine.

Budget ledgers and the kill-switch: governing cost as a first-class control

Agents introduce a category of risk that traditional integrations mostly didn't: autonomous spend. A loop that calls tools can, through a bug or an adversarial prompt, call them far more than anyone intended — burning API budget, triggering downstream charges, or flooding customers. Cost control therefore belongs in the governance surface, not bolted on afterward as a billing alert that fires once the damage is done.

The mechanism we run is a budget ledger evaluated before any spend, with per-run, per-day, and rolling 30-day limits, plus a kill-switch checked ahead of execution. Note the ordering: the budget and the kill-switch are evaluated before the spend, not reconciled after it. An alert that tells you that you overspent is a postmortem; a ledger that refuses the over-budget call is a control. The three time horizons matter because failure modes have different shapes — a runaway loop blows the per-run cap, a misconfigured schedule blows the per-day cap, and slow creep shows up only against the 30-day window.
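The ordering described above can be made concrete with a small in-memory sketch. Everything here is hypothetical (the `BudgetLedger` class, its field names, the cost units), and it assumes "per-day" means a rolling 24-hour window; a calendar-day boundary would be an equally reasonable reading. What it demonstrates is that the kill-switch and all three limits are checked before the spend is recorded.

```python
from datetime import datetime, timedelta


class BudgetLedger:
    def __init__(self, per_run: float, per_day: float, per_30d: float):
        self.limits = {"run": per_run, "day": per_day, "30d": per_30d}
        self.entries = []         # list of (timestamp, run_id, cost)
        self.kill_switch = False  # when True, every spend is refused

    def _spent(self, since=None, run_id=None) -> float:
        return sum(
            cost for ts, rid, cost in self.entries
            if (since is None or ts >= since) and (run_id is None or rid == run_id)
        )

    def authorize(self, run_id: str, cost: float, now: datetime) -> bool:
        """Check the kill-switch and all three limits BEFORE spending."""
        if self.kill_switch:
            return False
        spent = {
            "run": self._spent(run_id=run_id),
            "day": self._spent(since=now - timedelta(days=1)),
            "30d": self._spent(since=now - timedelta(days=30)),
        }
        if any(spent[k] + cost > self.limits[k] for k in self.limits):
            return False  # refused up front; nothing is recorded or spent
        self.entries.append((now, run_id, cost))  # reserve the spend
        return True
```

The caller invokes the tool only when `authorize` returns `True`. A production ledger would also need durable storage and atomic check-and-reserve across concurrent runs, which this sketch leaves out.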

The kill-switch deserves its own line in the design. It is the control you reach for when you don't yet understand what's happening and you need the activity to stop now. Evaluated before execution, it is a circuit breaker, not a cleanup tool. Every agent program should be able to answer one question instantly: how do we stop it? If the answer involves paging three teams and revoking keys by hand, you don't have a kill-switch — you have a hope.

Audit and redaction: scoped, time-bounded, and observable

Governance that can't be observed isn't governance; it's intention. The final layer of the surface is the one that makes everything above it provable after the fact. Agent keys should be scoped to the tools they actually need and time-bounded so that a credential's blast radius shrinks to zero on a schedule rather than lingering indefinitely. A key that can do anything forever is the single most common artifact found in the wreckage of an incident review.
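As a rough illustration of scoping and time-bounding together, the sketch below models a key as an immutable record with a tool allowlist and an expiry. The `AgentKey` type and the `issue_key` and `key_permits` helpers are hypothetical names; a real system would also sign or store keys server-side and support revocation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class AgentKey:
    key_id: str
    allowed_tools: frozenset  # the only tools this key can ever call
    expires_at: datetime      # after this instant the key permits nothing


def issue_key(key_id: str, tools: set, now: datetime, ttl: timedelta) -> AgentKey:
    """Mint a key limited to the given tools and lifetime."""
    return AgentKey(key_id, frozenset(tools), now + ttl)


def key_permits(key: AgentKey, tool: str, now: datetime) -> bool:
    """A key is usable only for its listed tools and only before expiry."""
    return now < key.expires_at and tool in key.allowed_tools
```

Because expiry is part of the key itself, the answers to "what could this credential do" and "how long could it do it" can be read directly from the record.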

Equally, the audit trail itself has to respect the data it records. We apply per-call redaction so the log captures what happened — which tool, under which identity and tenant context, with what outcome — without becoming a secondary copy of the very sensitive data the controls exist to protect. An audit log that quietly accumulates plaintext customer records is not a control; it is a new liability with a friendly name.
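A simple version of per-call redaction might look like the following. The field list, the `[REDACTED]` marker, and the `redact` and `audit_record` functions are assumptions for illustration; real deployments usually drive the sensitive-field list from a data classification policy rather than a hard-coded set.

```python
import json

# Hypothetical list of argument fields that must never reach the log.
SENSITIVE_FIELDS = {"email", "phone", "name"}


def redact(value):
    """Recursively replace sensitive field values, keeping the structure."""
    if isinstance(value, dict):
        return {
            k: "[REDACTED]" if k in SENSITIVE_FIELDS else redact(v)
            for k, v in value.items()
        }
    if isinstance(value, list):
        return [redact(v) for v in value]
    return value


def audit_record(tool: str, caller_id: str, tenant_id: str,
                 args: dict, outcome: str) -> str:
    """Serialize one call: which tool, whose authority, what outcome."""
    return json.dumps({
        "tool": tool,
        "caller": caller_id,
        "tenant": tenant_id,
        "args": redact(args),
        "outcome": outcome,
    }, sort_keys=True)
```

The record still shows the shape of the call (for example, that a recipient list was passed) while the values that would turn the log into a second copy of customer data are gone.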

Together, these properties — scoped keys, time bounds, per-call redaction, uniform audit — let a CISO answer the questions that actually get asked after an incident: what could this credential do, how long could it do it, what did it in fact do, and can we prove it without leaking more in the proving.

Tying it to continuous risk — and being honest about what's left

None of this is a one-time setup. The NIST AI Risk Management Framework (AI RMF 1.0) is explicit that managing AI risk is a continuous process, not a checklist you complete and shelve. A governance surface is valuable precisely because it gives that continuous process a single place to act: when a new risk emerges, you adjust authorization, budgets, exposure, or audit in one interface rather than chasing the change across N integrations. The standardization is what makes continuous governance tractable instead of aspirational. It is also why a thoughtful exposure decision pays off twice — once when you make it, and again every time you have to revisit it.

Be honest about residual risk, because your board will be. A governance surface narrows the attack surface and makes activity auditable; it does not make agents infallible. Prompt injection can still try to steer a legitimately authorized agent toward legitimate-looking misuse. A tool deliberately exposed can still be misused within its permissions. Budgets bound cost but not judgment. The Deloitte "State of AI in the Enterprise" research has repeatedly shown the gap between AI ambition and the operational maturity to govern it — and that gap is exactly where these residual risks live. The right response is not to claim the gap is closed but to make it small, observable, and continuously managed.

We've built our platform on this conviction. SocialHub.AI exposes semantic-layer tools over MCP using the Streamable HTTP transport, reachable from agents like Claude, GitHub Copilot, and Microsoft 365 Copilot, with the controls described here applied at the interface rather than scattered across connectors: three-axis authorization with tier from tenant context, fail-closed writes, a per-run/per-day/30-day budget ledger with a pre-execution kill-switch, and scoped, time-bounded agent keys with per-call redaction and audit. It is the same governance discipline that underpins the loyalty work where we helped grow McDonald's China member GMV contribution from roughly 5% to 85%. Outcomes at that scale only survive when the interface to the data is trustworthy by construction.

If you're deciding how agents will touch your systems of record, the question to put to your team isn't "which connector" — it's "where do identity, authorization, logging, and cost get decided, and is that one place or N." If you'd like to see what a governance-first MCP surface looks like in practice, book a demo, or read more about our approach at /platform/ai-frontier.
