AI is moving fast, faster than most security teams expected.
Not long ago, most AI systems were just chatbots. You asked a question, the model answered, and that was mostly where the risk ended. Now we are in a different world: agentic AI. These systems do not just respond. They retrieve files, query databases, call APIs, write code, send messages, create tickets, and sometimes make decisions across business systems.
That is powerful. It is also risky.
This is where the Model Context Protocol, or MCP, comes in. MCP was introduced by Anthropic as an open standard for connecting AI assistants to the systems where data actually lives, including business tools, content repositories, development environments, and other data sources. Its goal is simple: instead of building a custom connector for every model and every tool, developers can use one protocol to connect AI applications with external systems.
In plain English, MCP gives AI agents a more standard way to use tools.
But here is the part that matters for security: when you give an AI system tools, you are not just improving its usefulness. You are expanding what it can touch, what it can change, and what it can accidentally leak.
MCP is not just another integration layer. It is becoming part of the security boundary for AI agents. If that boundary is weak, the model does not need to be “hacked” in the traditional sense. It only needs to be tricked into using the wrong tool, with the wrong permission, at the wrong time.
What Is MCP Security?
MCP provides a standardized way for applications to share context with language models, expose tools and capabilities to AI systems, and build composable integrations and workflows. The official specification describes three main roles: hosts, which are the LLM applications; clients, which are connectors inside the host; and servers, which provide context and capabilities. MCP uses JSON-RPC 2.0 messages, and servers can expose resources, prompts, and tools.
A simple way to think about it is this:
- The MCP Host is the AI application the user interacts with. This could be an IDE, an enterprise copilot, a desktop assistant, or a cloud-based agent.
- The MCP Client is the connector inside that host. It manages communication with MCP servers.
- The MCP Server is the service that exposes capabilities to the AI system. Those capabilities usually fall into three groups: resources, prompts, and tools. Resources provide context, such as files, database schemas, or application data. Prompts provide reusable instruction templates. Tools are executable functions the model can invoke, such as querying a database, calling an API, or performing a computation.
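To make the message format concrete, here is a minimal sketch of a tool invocation on the wire. The `tools/call` method with `name` and `arguments` parameters follows the MCP tools specification; the helper `build_tool_call`, the ticket ID, and the argument name `ticket_id` are assumptions made up for illustration.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical tool argument, for illustration only.
raw = build_tool_call(1, "get_customer_ticket_status", {"ticket_id": "T-1001"})
print(raw)
```

Everything the model decides ends up as a small structured message like this one, which is exactly why the server can, and should, inspect it before acting.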
That architecture is elegant. It solves a real problem. Before MCP, every AI application needed custom integrations for every system it wanted to reach. MCP makes that cleaner.
But cleaner does not automatically mean safer.
The moment an AI model can use tools, the model becomes more than a text generator. It becomes a user inside your architecture. And not always a user that understands boundaries.
Why MCP Changes AI Agent Security
In traditional software, access control usually happens around users, services, APIs, and networks. With MCP, we now have another actor in the middle: the AI agent.
That agent can read external context, reason over it, and decide which tool to call next. But LLMs are not deterministic security engines. They are pattern-based systems. They can be influenced by instructions hidden inside emails, web pages, tickets, documents, logs, or even tool descriptions.
This is why MCP security is not just about writing secure code. It is about securing the entire chain:
User → AI Host → MCP Client → MCP Server → External Service
Every arrow in that chain is a trust boundary.
And trust boundaries are where security problems like to hide.
Recent Research on MCP
A recent measurement study on real-world remote MCP servers found some worrying results. The researchers identified 7,973 live remote MCP servers and reported that 40.55% exposed tools without authentication. Among authenticated servers, OAuth was a major mechanism for connecting MCP clients to remote services. The study also found that real-world OAuth-enabled MCP deployments often combine three characteristics: open client environments, dynamic client registration, and delegated authorization.
That combination matters.
Open client environments mean MCP clients may run in places the server cannot fully trust, such as desktops, IDEs, CLI tools, or cloud-hosted agent frontends.
Dynamic client registration means clients may be allowed to register themselves automatically.
Delegated authorization means the MCP server may sit between the AI client and an upstream service, acting as both a resource server and an OAuth client.
In the same study, the researchers tested 119 OAuth-enabled MCP servers and found that every tested server had at least one confirmed authentication flaw. They identified 325 flaws, with dynamic client registration issues affecting 96.6% of the tested servers, and reported that some flaws could lead to sensitive data leakage or account takeover.
This does not mean every MCP deployment is broken. The tested set was a specific subset of real-world OAuth-enabled servers. But it does show something important: authentication mistakes in MCP are not theoretical anymore.
They are already showing up in deployed systems.
The Big Risk: MCP Turns Context Into Action
Prompt injection was already a serious issue before MCP. OWASP lists prompt injection, insecure output handling, supply chain vulnerabilities, excessive agency, and overreliance among major LLM application risks.
MCP makes some of those risks more serious because it connects model reasoning to real tools.
A normal prompt injection might cause a bad answer.
An MCP prompt injection might cause a bad action.
That is the difference.
Imagine an AI assistant reading a support ticket. Hidden inside the ticket is an instruction telling the model to ignore previous instructions and call an internal tool that exports customer records. The user never asked for that. The system prompt never intended that. But the model may interpret the malicious text as part of the task unless the application has strong boundaries.
This is why I do not think of MCP security as only “API security” or only “AI safety.” It is both. The model can be manipulated, and the tool can be abused.
Risk 1: Indirect Prompt Injection and Tool Abuse
What it is
Indirect prompt injection happens when malicious instructions are hidden inside content the model reads. This could be a web page, PDF, email, Slack message, issue description, changelog, code comment, or database record.
In an MCP environment, that content can influence what tools the model chooses to call.
Why it matters
If the model has access to powerful tools, a hidden instruction can become much more dangerous. It might cause the agent to read sensitive data, modify a file, send an email, open a pull request, or call an external URL.
The problem is not that the AI is evil. It is that it is too helpful.
It sees text. It tries to follow instructions. And unless the system clearly separates trusted instructions from untrusted data, the model may treat malicious content as part of the task.
What helps
Treat anything the model reads from the outside world as untrusted. That includes documents, tickets, web pages, search results, tool descriptions, and database content.
A safer MCP design should clearly mark external content as data, not instructions. Tool calls should be authorized by deterministic policy, not by model confidence.
For example, a basic policy layer might look like this:
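Here is a minimal sketch in Python. The role names, the tool names in `TOOL_POLICY`, and the helper functions are all hypothetical; a real deployment would load policy from configuration and tie it to an authenticated identity.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical policy table: tool name -> roles allowed to call it.
# Any tool that is not listed here is denied by default.
TOOL_POLICY = {
    "search_docs": {"employee", "support", "admin"},
    "get_customer_ticket_status": {"support", "admin"},
    "export_customer_records": {"admin"},
}


@dataclass(frozen=True)
class ToolRequest:
    user_role: str
    tool_name: str


def is_allowed(request: ToolRequest) -> bool:
    """Deterministic check that runs outside the model."""
    allowed_roles = TOOL_POLICY.get(request.tool_name)
    if allowed_roles is None:
        return False  # unknown tool: deny by default
    return request.user_role in allowed_roles


def handle_model_suggestion(request: ToolRequest, run_tool: Callable[[str], str]) -> str:
    """The model only suggests a call; this function decides whether it runs."""
    if not is_allowed(request):
        return "denied by policy"
    return run_tool(request.tool_name)
```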
This is not a full security system, but it shows the right mindset: the LLM can suggest an action, but the application must decide whether that action is allowed.
Risk 2: Overpowered MCP Tools
What it is
Overpowered tools are tools that give the AI more access than it actually needs. This could be a shell tool, a broad database connector, a file writer, an unrestricted HTTP client, or an admin API token.
Why it matters
MCP tools are model-controlled by design, meaning the model can discover and invoke them based on the user’s prompt and the context it sees. The official MCP tools specification says there should always be a human in the loop with the ability to deny tool invocations, and applications should show which tools are exposed, when tools are invoked, and when confirmation is needed.
That guidance exists for a reason.
If an AI agent only has a read-only documentation search tool, a prompt injection is limited. If it has a shell, write access to production, and an outbound network tool, that same prompt injection becomes a serious incident.
What helps
Start with read-only. Add write permissions only when there is a clear business need.
Split tools into small, specific MCP servers instead of one giant “do everything” server. A documentation server should not be able to execute shell commands. A customer lookup server should not be able to export the entire CRM. A code review server should not have production deployment rights unless that action is separately approved.
What really matters is blast radius.
If the model gets confused, manipulated, or overconfident, how much damage can it do?
Risk 3: Unauthenticated Remote MCP Servers
What it is
Some remote MCP servers expose their tools without requiring authentication.
Why it matters
The recent measurement study found that 40.55% of validated live remote MCP servers exposed tools without authentication. The researchers also described an unauthenticated CRM-related MCP server that exposed thousands of internal enterprise records, including customer contact details.
That is the kind of issue that should make every security team stop and check their own environment.
An unauthenticated MCP server is not just an exposed API. It may be an exposed AI tool surface, designed to perform useful actions quickly.
Useful to the business can also mean useful to an attacker.
What helps
No production MCP server should expose sensitive tools without authentication.
At a minimum, remote MCP servers need strong identity, user-level authorization, scoped tokens, audit logs, and a deny-by-default policy. Internal-only is not the same as safe. A misconfigured internal MCP server can still be abused by compromised laptops, exposed dev tunnels, SSRF, forgotten staging environments, or overly trusted internal users.
Zero trust is not about being paranoid. It is about not assuming anything is safe by default.
Risk 4: OAuth and Dynamic Client Registration Weaknesses
What it is
For remote MCP servers, OAuth is commonly used to authorize access to user-linked services. The current MCP authorization specification is based on OAuth 2.1 and related standards, including authorization server metadata, protected resource metadata, dynamic client registration, and Client ID Metadata Documents.
The tricky part is that MCP is often used in open client environments where clients cannot safely keep long-term secrets. That makes protections like PKCE, redirect URI validation, audience validation, and state handling very important.
Why it matters
The study on real-world remote MCP servers found several OAuth-related flaw categories, including malicious dynamic client registration, blind client trust, PKCE downgrade, consent page bypass, open redirects, weak state, and code replay. In the tested OAuth-enabled subset, dynamic client registration flaws were especially common.
This is where “normal OAuth mistakes” become more dangerous.
In a regular web app, an OAuth bug might expose a user session.
In MCP, that session may unlock tools connected to SaaS accounts, cloud resources, internal documents, or enterprise workflows.
What helps
Use OAuth properly, not just nominally.
MCP’s current authorization spec says authorization servers must implement OAuth 2.1 with proper security measures, should support Client ID Metadata Documents, and may support Dynamic Client Registration. It also says MCP servers must implement OAuth Protected Resource Metadata and clients must use it for authorization server discovery.
Some practical rules:
- Use PKCE with S256, and refuse flows where PKCE support is missing. The MCP authorization specification says MCP clients must implement PKCE and verify PKCE support before proceeding.
- Validate redirect URIs exactly. Do not allow wildcard callbacks, partial matches, strange encodings, or attacker-controlled domains. The MCP authorization spec specifically warns about malicious redirect URIs and says authorization servers must validate exact redirect URIs against registered values.
- Prefer pre-registration or Client ID Metadata Documents where possible. Treat open Dynamic Client Registration as high risk, not as a harmless convenience.
- Make authorization codes one-time use.
- Use strong, unpredictable state values and bind them to the session.
- Display the redirect destination clearly to the user, especially for localhost redirects.
Most importantly, do not only check that “the AI app” has access. Check whether the actual requesting user has access.
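Two of these rules are easy to show in code. The sketch below generates and verifies a PKCE pair with the S256 method defined in RFC 7636, and checks redirect URIs by exact string match. The function names and the example URIs are assumptions for illustration, not part of any MCP SDK.

```python
import base64
import hashlib
import hmac
import secrets


def s256_challenge(verifier: str) -> str:
    """code_challenge = BASE64URL(SHA256(code_verifier)) without padding (RFC 7636)."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def make_pkce_pair() -> tuple[str, str]:
    """Return a fresh (code_verifier, code_challenge) pair."""
    verifier = secrets.token_urlsafe(64)  # 86 URL-safe characters
    return verifier, s256_challenge(verifier)


def verify_pkce(verifier: str, stored_challenge: str) -> bool:
    """Server side: recompute the challenge and compare in constant time."""
    try:
        expected = s256_challenge(verifier)
    except UnicodeEncodeError:
        return False  # a valid verifier is always ASCII
    return hmac.compare_digest(expected, stored_challenge)


def redirect_uri_allowed(requested: str, registered: set[str]) -> bool:
    """Exact match only: no prefixes, wildcards, or normalization tricks."""
    return requested in registered


verifier, challenge = make_pkce_pair()
registered_uris = {"http://localhost:8765/callback"}  # hypothetical registration
```

The exact-match check is deliberately boring. Prefix or pattern matching is where open redirects tend to creep in.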
Risk 5: Confused Deputy and Token Passthrough Problems
What it is
A confused deputy problem happens when a trusted service is tricked into using its authority on behalf of the wrong party.
In MCP, this can happen when an MCP server connects to third-party APIs and acts as an intermediary between the AI client and upstream services. The official MCP security guidance warns that MCP proxy servers can create confused deputy vulnerabilities, especially when static client IDs, dynamic registration, and consent cookies interact badly.
Why it matters
This is one of those security problems that looks boring until it becomes critical.
A user may think they are authorizing one MCP client. The upstream service may think it is authorizing one trusted server. The MCP server may be forwarding requests between both. If the identity and consent boundaries are not preserved, the wrong client can end up with access it should never have.
Token passthrough makes this worse. The MCP security guidance says MCP servers must not accept tokens that were not explicitly issued for the MCP server.
What helps
Keep tokens audience-bound.
The MCP authorization spec says MCP clients must include the resource parameter in authorization and token requests, and MCP servers must validate that tokens presented to them were issued specifically for their use.
That means one token should not silently work everywhere.
Do not pass upstream service tokens through MCP as if they were MCP tokens. Separate the MCP-layer token from the upstream API token. Validate audience, issuer, scopes, expiry, and user identity at every boundary.
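As a rough sketch, the checks at the MCP server boundary could look like the function below. It assumes the token signature has already been verified by a proper JWT library and only inspects the resulting claims. The audience and issuer URLs, the scope name, and the function itself are hypothetical.

```python
import time
from typing import Optional

# Hypothetical identifiers for this MCP server and its authorization server.
EXPECTED_AUDIENCE = "https://mcp.example.com"
EXPECTED_ISSUER = "https://auth.example.com"


def validate_claims(claims: dict, required_scope: str, now: Optional[float] = None) -> bool:
    """Check claims from a token whose signature was verified elsewhere."""
    current = time.time() if now is None else now

    # "aud" may be a single string or a list of strings.
    audience = claims.get("aud")
    if isinstance(audience, str):
        audiences = [audience]
    elif isinstance(audience, list):
        audiences = audience
    else:
        audiences = []
    if EXPECTED_AUDIENCE not in audiences:
        return False  # token was issued for some other service

    if claims.get("iss") != EXPECTED_ISSUER:
        return False

    expiry = claims.get("exp")
    if not isinstance(expiry, (int, float)) or expiry <= current:
        return False  # missing or expired

    if not claims.get("sub"):
        return False  # no user identity attached

    granted_scopes = str(claims.get("scope", "")).split()
    return required_scope in granted_scopes
```

A token minted for an upstream API fails the audience check here, which is the point: it should never be accepted as an MCP-layer token.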
It may feel like extra work, but this is the kind of work that prevents one compromised token from becoming an enterprise-wide problem.
Risk 6: Context Poisoning and Data Exfiltration
What it is
MCP servers can expose resources that feed context into the model. These resources may include file contents, database schemas, logs, tickets, repository data, or business records.
If an attacker can poison those resources, they may influence the model’s behavior.
Why it matters
A poisoned resource can do two things at once. It can change the model’s reasoning, and it can guide the model toward unsafe tool use.
For example, a malicious document might tell the model to summarize a folder, extract secrets, encode them into a URL, and call a network tool. The model may not understand that this is exfiltration. It may just see it as a sequence of instructions inside the task.
This is why context is not passive anymore.
In agentic systems, context can become a command path.
What helps
- Separate trusted instructions from untrusted data.
- Do not let retrieved content override system instructions.
- Do not allow resource content to directly trigger tools without policy checks.
- Add egress controls so an AI agent cannot casually send data to arbitrary external domains.
- Use content scanning, secret detection, and data-loss-prevention checks before sending tool outputs back into the model or out to external systems.
- For sensitive workflows, consider canary secrets. If a fake secret placed in a test resource ever appears in a model output, tool call, log, or outbound request, you know something is wrong.
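Two items from the list above, egress controls and canary secrets, fit in a few lines. This is only a sketch: the allowlisted host, the canary value, and the function names are invented for illustration, and a real deployment would enforce egress at the network layer as well.

```python
from urllib.parse import urlparse

# Hypothetical values for illustration.
ALLOWED_EGRESS_HOSTS = {"api.internal.example.com"}
CANARY_SECRETS = {"CANARY-7f3a9c-do-not-use"}


def egress_allowed(url: str) -> bool:
    """Allow only HTTPS requests to explicitly listed hosts."""
    parsed = urlparse(url)
    return parsed.scheme == "https" and parsed.hostname in ALLOWED_EGRESS_HOSTS


def contains_canary(text: str) -> bool:
    """True if any planted fake secret shows up in the given text."""
    return any(canary in text for canary in CANARY_SECRETS)


def check_outbound_request(url: str, body: str) -> str:
    """Decide what to do with an outbound request the agent wants to make."""
    if contains_canary(url) or contains_canary(body):
        return "block-and-alert"  # a planted secret is leaving the system
    if not egress_allowed(url):
        return "block"
    return "allow"
```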
Risk 7: Session Hijacking and Stateful Transport Issues
What it is
Remote MCP systems may use sessions to manage ongoing client-server communication. If those sessions are predictable, reused incorrectly, or treated as authentication, attackers may be able to impersonate clients or inject malicious events.
The MCP security best practices warn that session hijacking can let an unauthorized party reuse a session ID and perform actions as the original client. The guidance says MCP servers that implement authorization must verify all inbound requests, must not use sessions for authentication, and should use secure, non-deterministic session IDs.
Why it matters
Session bugs are not new. What is new is what an attacker may get after hijacking the session.
They may not just see a page. They may gain access to tools.
What helps
Sessions should identify the conversation state, not prove identity.
Every tool request still needs authentication and authorization. Bind sessions to authenticated users. Rotate and expire session IDs. Use secure random generation. Log unusual session reuse. Never let possession of a session ID become the only gate protecting tool access.
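A minimal sketch of that idea: session IDs come from a secure random source, are bound to the user who created them, and expire. The class name, the in-memory storage, and the 15-minute lifetime are assumptions for illustration only.

```python
import secrets
import time
from typing import Optional

SESSION_TTL_SECONDS = 900  # hypothetical 15-minute lifetime


class SessionStore:
    """Tracks conversation state. It does not replace authentication."""

    def __init__(self) -> None:
        self._sessions: dict[str, tuple[str, float]] = {}

    def create(self, user_id: str, now: Optional[float] = None) -> str:
        current = time.time() if now is None else now
        session_id = secrets.token_urlsafe(32)  # unpredictable, not sequential
        self._sessions[session_id] = (user_id, current + SESSION_TTL_SECONDS)
        return session_id

    def belongs_to(self, session_id: str, authenticated_user_id: str,
                   now: Optional[float] = None) -> bool:
        """Accept a session only for the authenticated user who owns it."""
        current = time.time() if now is None else now
        entry = self._sessions.get(session_id)
        if entry is None:
            return False
        owner, expires_at = entry
        if current >= expires_at:
            del self._sessions[session_id]  # expired sessions are dropped
            return False
        return owner == authenticated_user_id
```

Note that `belongs_to` takes the authenticated user as an input. The identity has to come from a verified token on every request, never from the session ID itself.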
Risk 8: Local MCP Server Compromise
What it is
Not all MCP servers are remote. Some run locally on a user’s machine through stdio transport or local configuration.
That can be useful for developer tools. It can also be dangerous.
The MCP security best practices warn that local MCP servers may have direct access to the user’s system and can become attractive targets if users download and execute untrusted servers, accept malicious startup commands, or leave insecure local services running.
Why it matters
A local MCP server can be close to everything developers care about: source code, SSH keys, environment variables, cloud credentials, local databases, browser sessions, and build tools.
This is where supply chain security meets AI agent security.
A malicious MCP server does not need to defeat the model. It can become the tool the model trusts.
What helps
- Treat MCP servers like executable dependencies.
- Pin versions.
- Verify sources.
- Use SBOMs.
- Review startup commands.
- Sandbox local servers.
- Run them with least privilege.
- Do not give a local MCP server access to your whole home directory just because it was easy during setup.
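Pinning can be as simple as refusing to launch a server bundle whose hash does not match a recorded value. The sketch below shows the idea with SHA-256. The file name and the way the pinned digest is obtained are hypothetical; in practice the digest would come from a reviewed lockfile or a signed release.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large bundles do not need to fit in memory."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches_pin(path: Path, pinned_hex_digest: str) -> bool:
    """True only if the file on disk matches the digest recorded at review time."""
    return hmac.compare_digest(sha256_of(path), pinned_hex_digest.lower())


# Small demonstration with a throwaway file standing in for a server bundle.
with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "server-bundle.tar.gz"
    artifact.write_bytes(b"example server bundle")
    pinned = hashlib.sha256(b"example server bundle").hexdigest()
    ok_before = matches_pin(artifact, pinned)
    artifact.write_bytes(b"tampered bundle")
    ok_after = matches_pin(artifact, pinned)

print(ok_before, ok_after)  # True False
```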
In a world where everything is connected, trust has to be earned, not assumed.
How To Secure MCP Servers
After looking at the research, the official MCP guidance, and the kinds of failures already showing up in real deployments, I would approach MCP security in layers.
1. Build an MCP Inventory First
You cannot secure what you cannot see.
Track every MCP server in use, whether local or remote. Record what tools it exposes, what resources it can read, what users can access it, what tokens it uses, and whether it can make outbound network requests.
For each server, ask:
- What can it read?
- What can it write?
- What can it execute?
- What identity does it run as?
- What happens if the model is tricked?
That last question is the one people forget.
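An inventory does not need special tooling to start. A rough sketch of one record per server, plus a few automatic review flags, could look like this. The field names and the flag rules are my own assumptions, not a standard schema.

```python
from dataclasses import dataclass, field


@dataclass
class McpServerRecord:
    name: str
    transport: str            # "stdio" for local servers, "http" for remote ones
    runs_as: str              # identity the server process or token acts as
    tools: list[str] = field(default_factory=list)
    can_write: bool = False
    can_execute: bool = False
    outbound_network: bool = False
    requires_auth: bool = True


def review_flags(record: McpServerRecord) -> list[str]:
    """Return reasons this server deserves a closer look."""
    flags = []
    if record.transport == "http" and not record.requires_auth:
        flags.append("remote server without authentication")
    if record.can_execute and record.outbound_network:
        flags.append("can execute code and reach the network")
    if record.can_write and record.runs_as in {"root", "admin"}:
        flags.append("write access with a privileged identity")
    return flags
```

For example, `review_flags(McpServerRecord("crm", "http", "svc-crm", requires_auth=False))` returns a single flag about missing authentication.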
2. Default to Least Privilege
Do not expose broad tools when narrow ones will work.
A tool called run_sql_query is risky. A tool called get_customer_ticket_status is safer.
A shell tool is risky. A tool that runs one approved script with fixed arguments is safer.
A full CRM export tool is risky. A paginated, user-scoped customer lookup tool is safer.
The goal is not to stop AI from being useful. The goal is to make unsafe actions structurally harder.
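To show the difference between a broad and a narrow tool, here is a sketch of `get_customer_ticket_status` backed by SQLite. The table layout and column names are hypothetical. The key points are that the SQL text is fixed, the inputs are bound as parameters, and the query is scoped to the requesting user.

```python
import sqlite3
from typing import Optional


def get_customer_ticket_status(conn: sqlite3.Connection, user_id: str,
                               ticket_id: str) -> Optional[str]:
    """Return the status of one ticket, only if it belongs to the requesting user."""
    row = conn.execute(
        "SELECT status FROM tickets WHERE id = ? AND owner_id = ?",
        (ticket_id, user_id),
    ).fetchone()
    return row[0] if row else None


# Hypothetical schema and data for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (id TEXT PRIMARY KEY, owner_id TEXT, status TEXT)")
conn.execute("INSERT INTO tickets VALUES ('T-1', 'alice', 'open')")
conn.execute("INSERT INTO tickets VALUES ('T-2', 'bob', 'closed')")

print(get_customer_ticket_status(conn, "alice", "T-1"))  # open
print(get_customer_ticket_status(conn, "alice", "T-2"))  # None
```

Even if the model is manipulated into passing a hostile `ticket_id`, the worst outcome is an empty result, not an arbitrary query.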
3. Put Deterministic Controls Between the Model and the Tool
The model should not be the final authority.
Before an MCP tool executes, the server should validate:
- User identity
- User permissions
- Tool scope
- Input schema
- File paths
- Network destinations
- Database query shape
- Rate limits
- Business logic constraints
- Risk level
- Human approval requirements
This is where traditional application security still matters. The LLM is not a trusted backend. Treat it like an untrusted user with a very convincing writing style.
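File paths are a good example of a check that must be deterministic. The sketch below confines a model-supplied path to a base directory. The function name and the `workspace` directory are assumptions for illustration.

```python
from pathlib import Path


def resolve_inside(base_dir: Path, requested: str) -> Path:
    """Resolve a model-supplied path and refuse anything outside base_dir."""
    base = base_dir.resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    candidate = (base / requested).resolve()
    if candidate != base and base not in candidate.parents:
        raise PermissionError(f"path escapes the allowed directory: {requested}")
    return candidate


workspace = Path("workspace")  # hypothetical directory the tool may read
inside = resolve_inside(workspace, "notes/summary.txt")
```

Relative traversal such as `../secrets.txt` and absolute paths such as `/etc/passwd` both raise `PermissionError`, because the check runs on the fully resolved path rather than on the raw string.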
4. Require Human Approval for High-Risk Actions
Some actions should never happen fully autonomously.
Sending emails externally, modifying production data, executing code, changing permissions, exporting records, approving payments, deleting files, or calling arbitrary external URLs should require a human checkpoint.
The official MCP specification also emphasizes user consent and control, stating that users must understand and consent to data access and operations, and that hosts must obtain explicit user consent before invoking tools.
This is not friction for the sake of friction.
It is a safety brake.
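In code, the brake can be a small wrapper that refuses to run high-risk tools without an explicit yes. This is a sketch under stated assumptions: the tool names in `HIGH_RISK_TOOLS` are hypothetical, and `ask_human` stands in for whatever confirmation prompt the host application provides.

```python
from typing import Callable

# Hypothetical names for tools that should never run fully autonomously.
HIGH_RISK_TOOLS = {"send_external_email", "delete_file", "export_records"}


def execute_with_approval(
    tool_name: str,
    arguments: dict,
    run_tool: Callable[[str, dict], str],
    ask_human: Callable[[str, dict], bool],
) -> str:
    """Run low-risk tools directly; require a human decision for high-risk ones."""
    if tool_name in HIGH_RISK_TOOLS and not ask_human(tool_name, arguments):
        return "denied: human approval was not granted"
    return run_tool(tool_name, arguments)
```

The approval prompt should show the real tool name and arguments, not the model's summary of them, so the person approving sees what will actually happen.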
5. Harden OAuth Like It Is Production Infrastructure
For remote MCP, OAuth is not just login. It is the gate to tools.
- Use OAuth 2.1 patterns carefully.
- Use PKCE with S256.
- Validate exact redirect URIs.
- Use strong, unpredictable state values.
- Reject code replay.
- Use short-lived tokens.
- Rotate refresh tokens for public clients.
- Validate token audience.
- Do not pass tokens across trust boundaries.
- Do not rely on Dynamic Client Registration unless you have strict policies around it.
The MCP authorization spec also says authorization server endpoints must be served over HTTPS, and redirect URIs must either be localhost or use HTTPS.
These details may seem small, but small OAuth mistakes can become account takeover paths.
6. Sandbox MCP Servers
If an MCP server gets abused, the damage should be contained.
Run MCP servers in restricted environments. Use containers, read-only file systems, non-root users, limited environment variables, minimal secrets, and outbound network allowlists.
Do not give every MCP server access to the same credentials.
Do not run local MCP servers with broad filesystem access unless there is a strong reason.
Do not let development convenience become production exposure.
7. Monitor Tool Use Like You Monitor Admin Activity
MCP tool calls should be logged.
Not just errors. Not just API calls. The actual tool invocation should be visible:
- Who requested it?
- Which model or host initiated it?
- Which MCP client was used?
- Which server executed it?
- What tool was called?
- What arguments were passed?
- Was human approval required?
- What data was returned?
- Was anything sent externally?
Logging is not only for forensics. It is how you learn what your agents are really doing.
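Most of the questions above map directly onto fields in a structured log entry. Here is a rough sketch that emits one JSON line per tool call and redacts obviously sensitive arguments. The field names and the `SENSITIVE_KEYS` list are assumptions, not a standard format.

```python
import json
import time
from typing import Optional

# Hypothetical argument names whose values should never reach the log.
SENSITIVE_KEYS = {"password", "token", "api_key", "secret"}


def audit_record(user_id: str, host: str, client: str, server: str, tool: str,
                 arguments: dict, approved_by_human: bool,
                 timestamp: Optional[float] = None) -> str:
    """Build one JSON log line describing a single tool invocation."""
    safe_arguments = {
        key: "[REDACTED]" if key.lower() in SENSITIVE_KEYS else value
        for key, value in arguments.items()
    }
    record = {
        "timestamp": time.time() if timestamp is None else timestamp,
        "user_id": user_id,
        "host": host,
        "client": client,
        "server": server,
        "tool": tool,
        "arguments": safe_arguments,
        "approved_by_human": approved_by_human,
    }
    return json.dumps(record, sort_keys=True)
```

Returned data and external sends can be recorded the same way, ideally as sizes, hashes, or destinations rather than raw content, so the log itself does not become a second copy of sensitive data.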
8. Red Team the Full Agent Chain
Traditional pentesting is still useful, but MCP needs agent-aware testing.
You need to test prompt injection, tool poisoning, context poisoning, OAuth flows, dynamic client registration, redirect handling, token audience validation, session handling, privilege boundaries, and human approval bypasses.
This is where AI red teaming becomes important. A good red team does not just ask, “Can I break the web app?” It asks, “Can I manipulate the model into misusing a legitimate tool?”
That is a different question.
And it is now one of the most important questions for MCP deployments.
What I’ve Learned
MCP is not dangerous because it exists.
MCP becomes dangerous when it is invisible.
When teams do not know which servers are running, which tools are exposed, which tokens are being passed around, or which actions the model can take, the organization is already behind.
What stood out most to me is that MCP security is not one control. It is not “just add OAuth.” It is not “just add a system prompt.” It is not “just ask the user for approval.”
It is a layered design problem.
- Authentication matters.
- Authorization matters.
- Tool design matters.
- Prompt boundaries matter.
- Human approval matters.
- Sandboxing matters.
- Monitoring matters.
And above all, assumptions matter.
Because once an AI agent can act, every assumption becomes a possible attack path.
Building a Safer Future for MCP-Based AI Agents
The future of AI agents will not be built on isolated chat windows. It will be built on connected systems, shared context, delegated permissions, and automated workflows.
MCP is one of the protocols helping make that future possible.
But if MCP becomes the bridge between AI and production systems, then security teams need to treat that bridge like critical infrastructure.
Frameworks like the NIST AI Risk Management Framework, the EU AI Act, and the OWASP LLM Top 10 all point in the same direction: AI systems need governance, measurement, security controls, transparency, and continuous risk management, not just clever prompts.
At Hacken, this is exactly where AI red teaming becomes valuable. The goal is not only to test whether the model says something unsafe. The goal is to test whether the whole AI system can be pushed into unsafe behavior: leaking data, abusing tools, bypassing authorization, ignoring user boundaries, or chaining small weaknesses into a real incident.




