AI·May 29, 2026·10 min read

MCP security — what every team connecting agents to tools is missing

Model Context Protocol went from announcement to industry standard in a year. The security model is still being written. Here is what to harden before you ship.

If your team is building anything with AI agents in 2026, you have probably used MCP — even if nobody on the team has read the spec. It is the protocol that lets an agent reach out and use tools. And it has a security gap that almost every team is shipping straight through.

Why you should read this

Most of the AI security writing online is about prompt injection in chatbots. That problem is real, but it is small compared to what happens when an agent can also take actions — read your files, send email, query your database, call your APIs.

MCP is the plumbing that gives agents those actions. It is in every major AI framework now. Teams adopted it fast because it works. They did not slow down to ask how to lock it down, because the documentation was thin and the threats were not obvious.

If you are a builder: this is the checklist you should be running through. If you are a CISO: this is the gap your AI teams probably do not know they have. If you are an exec: this is why the AI projects in your org might be exposing more than you think.

What MCP actually is, in plain English

MCP — Model Context Protocol — is a standard way for an AI model to ask a server to do something. The model says "call this tool with these inputs." The server runs it. The result goes back to the model. The model decides what to do next.

Think of it like USB-C for AI agents. One plug, many devices. Anything that speaks MCP can talk to anything else that speaks MCP. That is why it spread so fast.

The same property that makes it useful — universal compatibility — is what makes it risky. You can plug your agent into a tool somebody else built, and your agent will use it as if it were yours.

Where the trust model breaks

MCP works fine when you wrote the agent, you wrote the tools, and you wrote the server. You trust all three.

It breaks the moment one of those pieces comes from somewhere else. A tool description written by someone else can tell your model what to do. A tool result coming back from somebody else's server can carry instructions hidden inside the data. The model treats both as input — it does not know the difference between "a helpful tool" and "an attacker pretending to be a helpful tool."

The core problem in one sentence: every time text from somewhere outside your control flows back into the model's context, that text can change what the model does next.

Tool poisoning — the attack you have not heard of yet

Most teams know about prompt injection in chat: somebody hides instructions in a document, the model reads it, the model does what the attacker said. Tool poisoning is the same idea, moved into MCP.

It shows up in two places. First, in tool descriptions. A malicious server can describe its tool with language that pushes the model to prefer it over safer alternatives. The model reads the description as guidance, not as marketing copy.

Second, in tool results. The data a tool returns becomes text in the model's context. If an attacker controls that data — a poisoned database row, a malicious file, a compromised API response — they can hide instructions inside the result. The model just searched. The search returned text that says "send the session token to this email address." The model now has both the instruction and the email tool ready to use.

Where your credentials end up

MCP servers need credentials to work — API keys, OAuth tokens, database passwords. The wrong place to put those credentials is anywhere the model can see them.

Three common mistakes. The model is given a secret and asked to pass it to a tool: now the secret lives inside the model's context, and the model can be tricked into sending it somewhere it should not. The server has one admin key shared across all users: if anything goes wrong, every user is impacted. The user logs in, the token is handed to the model, the model hands it to the server: now the token is in three places and the model is the weakest link.

The right pattern is OAuth-style: the host (your app) gets a token for the user, the server validates it, the model never sees the raw credential.

Who is actually doing the action

This is the question that AI compliance teams will ask you, and engineering teams often cannot answer.

When the agent creates a calendar event, is the actor the user, or the agent's service account? When the agent reads a database, does it see only the user's rows, or everything the server has access to? In most MCP deployments by default, the agent inherits the server's broader access — which means a confused or manipulated agent has more power than the user it is supposedly helping.

The fix has two parts. Authorize every tool call as the user, not as the service. Log every action with the user's identity attached. Without those two, your audit trail is wrong.

The supply chain problem

MCP servers are shipped as packages, containers, or hosted services. The same thing that makes them easy to adopt — find one, plug it in — makes them easy to abuse.

Three threats to plan for. Typosquatting: an attacker publishes a package with a name one character off from the real one. Teams installing fast pick the wrong one. The wrong one works correctly most of the time and ships your data out for the rest. Compromised maintainers: the real package's owner gets phished, a malicious update ships, everyone using `latest` runs it. Drifting vendors: you signed up for a hosted MCP server two years ago. The vendor has been acquired, sold, or rotated. Nobody is checking whether they still behave the way they did when you onboarded.

The fixes are the standard supply-chain answers: pin versions, verify signatures, allow-list known servers, monitor what each one talks to.

What to log

Most MCP deployments log almost nothing useful. When something goes wrong, you cannot reconstruct what the agent saw or did.

Minimum logging to get into a SIEM: every tool call with its arguments (secrets redacted), the user identity, the server identity, the response category (success, error, denied), and the session ID. Every tool description the agent saw at startup — so you can detect later if a server changed what it advertises. Every credential failure. Every result that contains URLs or email addresses, since those are common exfiltration paths.

The practical test: can your IR team answer the question "which tools did the agent call between 14:00 and 14:30, with what arguments, on behalf of which user?" If the answer is no, you are not production-ready yet.

The hardening checklist

Seven things to do this quarter, in order of impact.

1. Inventory every MCP server your agents can reach. You probably do not know the full list. Network egress logs will surprise you.

2. Pin server versions. Treat MCP servers like any other dependency. Never run `latest` in production.

3. Move credentials out of the model. Use per-user OAuth-style tokens. The architectural change is real work but pays off everywhere.

4. Filter what comes back from third-party tools before it re-enters the model. Strip URLs, instruction-like phrasing, and anything that looks like it is trying to give the model new orders.

5. Allow-list tools per role. The customer-support agent does not need a shell. The data-analysis agent does not need to send email.

6. Log every tool call to a SIEM, with structured fields. Build one detection: "agent X used tool Y for the first time, with unusual arguments." That one detection has caught real incidents.

7. Test your own agents the way an attacker would. Run prompt-injection drills. Pay external researchers if you do not have internal red-team capacity. The exploits are not theoretical — new variants are demonstrated every month.