Taming the Wild West of MCP Security

I’ve been deep in the weeds of MCP security lately while building Membrane, a security shim for the Model Context Protocol. What started as a straightforward project quickly revealed just how complex securing AI tool integrations can be. The Model Context Protocol has become the de facto standard for connecting AI assistants to external tools and data sources, but with great power comes great responsibility.

Cybersecurity shield protecting digital data

The promise of MCP is compelling. Instead of building custom integrations for every tool an AI might need, you get a universal interface that lets AI systems interact with databases, APIs, file systems, and more through natural language. The reality is messier. When you give an AI system the ability to execute commands based on natural language instructions, you’re opening doors that traditional security models weren’t designed to handle.

“The most dangerous vulnerabilities aren’t in your code—they’re in the way your AI interprets natural language.”

The Threat Landscape Nobody Talks About

The most insidious attacks on MCP systems don’t come through network vulnerabilities or buffer overflows. They come through the AI itself. Prompt injection attacks represent a fundamental shift in how we think about security. An attacker doesn’t need to exploit code; they just need to craft input that manipulates the AI into doing something it shouldn’t.

Consider this scenario: your AI assistant has access to your company’s database through MCP. Someone sends an email containing what looks like innocent text, but buried within it are instructions that cause the AI to extract and leak sensitive data. The AI isn’t malfunctioning, it’s doing exactly what it thinks it’s been asked to do. Traditional input validation falls short because the “input” is natural language that needs to be understood contextually.

Network security concept with digital lock

Then there’s the confused deputy problem, which sounds academic until you see it in action. When MCP servers act as proxies to third-party APIs, attackers can exploit situations where “the third-party authorisation server detects the cookie and skips the consent screen” and “the MCP authorisation code is redirected to the attacker’s server.” Essentially, the MCP server becomes an unwitting accomplice in an attack, using its legitimate credentials to perform unauthorised actions.

The protocol itself doesn’t help much here. Many implementations suffer from “lack of authentication standards” and “missing integrity controls,” leading to “inconsistent and often weak security implementations.” This isn’t entirely surprising for a relatively new protocol, but it means security becomes the responsibility of individual implementers, with predictably mixed results.

Building Defence in Depth

After wrestling with these challenges while building Membrane, I’ve learned that MCP security requires a fundamentally different approach. You can’t just bolt on traditional security measures and call it a day. The security needs to be woven into the fabric of how the system operates.

Authentication and authorisation become more complex when you’re dealing with AI agents that might be making decisions autonomously. Multi-factor authentication is still important, but you also need to think about how to validate that an AI’s request is legitimate and within scope. OAuth 2.0 with PKCE provides a solid foundation, but for MCP servers acting as OAuth proxies, you must “obtain user consent for each dynamically registered client before forwarding to third-party authorisation servers” to prevent those confused deputy attacks.

Layers of security protection

Input validation becomes an art form when your inputs are natural language instructions.

Key Security Principle: Implement defense in depth with multiple validation layers that understand both the syntax and semantics of AI requests.

You can’t just check for SQL injection patterns when the AI might be asked to “find all users whose names start with Robert and create a summary report.” Instead, you need to understand the intent behind requests and validate that the intended action is something the AI should be allowed to perform.

The networking layer needs attention too, but it’s the more straightforward part. TLS 1.3 everywhere, proper certificate validation, network segmentation. The usual suspects, but they’re table stakes, not solutions.

Where things get interesting is in the monitoring and logging. Traditional log analysis doesn’t work well when you’re trying to understand whether an AI’s behaviour is anomalous. You need to log not just what happened, but why the AI thought it should happen. Context becomes crucial. A request to delete files might be perfectly legitimate as part of a cleanup operation, or it might be the result of a successful prompt injection attack.

Why I Built a Security Shim

The more I worked with MCP implementations, the more convinced I became that we needed a different architectural approach. Individual MCP servers trying to implement their own security controls led to inconsistent protection and a lot of duplicated effort. That’s why I built Membrane as a security shim that sits between AI clients and MCP servers.

Security architecture diagram

Membrane Architecture: A security shim that provides centralized policy enforcement, consistent logging, and threat detection across all MCP interactions.

The shim approach gives you centralised control over security policies without requiring changes to existing MCP servers. You can implement organisation-wide security standards, get consistent logging across all your MCP interactions, and respond quickly to new threats without updating dozens of individual servers.

Performance was a major concern during development. Adding another layer to the stack inevitably introduces latency, but I found that intelligent caching and efficient security checks could minimise the impact. The key is doing expensive operations, like complex prompt injection detection, asynchronously when possible and caching results for similar requests.

The centralised architecture also makes it easier to implement sophisticated threat detection. When you can see patterns across all MCP interactions, you can spot anomalies that might not be obvious at the individual server level. An AI making an unusual number of database queries might be normal for that particular server, but concerning when viewed in the context of all system activity.

Beyond the Protocol

Working on MCP security has reinforced some broader lessons about security in the age of AI. The traditional boundaries between different types of security threats are blurring. A social engineering attack might now target an AI system rather than a human, using carefully crafted prompts instead of emotional manipulation.

AI security concept

Development practices need to evolve too.

“In the age of AI, your security perimeter extends to the very language your systems understand.”

Code reviews need to consider not just traditional security vulnerabilities, but also how AI systems might misinterpret or be manipulated into misusing the code. Static analysis tools are getting better at catching some of these issues, but they’re still playing catch-up with the rapidly evolving threat landscape.

Dependency management becomes more critical when you’re working with AI systems that might have access to sensitive resources. A compromised library could potentially be exploited through prompt injection to perform unauthorised actions. The supply chain attacks we’ve seen in traditional software are likely to evolve to target AI-specific vulnerabilities.

The operational side requires new thinking too. Incident response procedures need to account for the possibility that an AI system might be compromised not through traditional means, but through manipulation of its behaviour. How do you investigate an incident where the AI was tricked into performing unauthorised actions? How do you prevent similar attacks in the future?

The Road Ahead

The MCP ecosystem is still young, and the security landscape is evolving rapidly. New attack techniques are being discovered regularly, and defence mechanisms are struggling to keep pace. Organisations adopting MCP need to be prepared for this dynamic environment.

Future technology security

Regulatory frameworks are starting to catch up with AI security concerns, but they’re still largely focused on model training and deployment rather than the operational security of AI systems in production. I expect this to change rapidly as AI systems become more integrated into critical business processes.

The development of security standards and best practices is accelerating, but it’s still largely driven by individual organisations sharing their experiences rather than coordinated industry efforts. This is both good and bad - innovation happens quickly, but inconsistency makes it harder to build robust defences.

Looking at the broader trends, I think we’re going to see a convergence of traditional cybersecurity and AI safety practices. The techniques used to make AI systems robust and aligned are increasingly relevant to making them secure, and vice versa.

Final Thoughts

Building Membrane has been an education in the unique challenges of securing AI systems. The traditional security playbook doesn’t just need updates, it needs fundamental rethinking for an era where the primary attack vector might be a carefully crafted sentence rather than a buffer overflow.

The Model Context Protocol represents a significant step forward in AI capabilities, but it also represents a significant expansion of the attack surface. Organisations adopting MCP need to think carefully about security from the start, not as an afterthought. The cost of getting it wrong is only going to increase as these systems become more deeply integrated into critical business processes.

Security in the age of AI isn’t just about protecting against traditional threats. It’s about building systems that can maintain their integrity even when facing attacks specifically designed to exploit the unique characteristics of AI systems. It’s a challenging problem, but one that’s essential to solve if we want to realise the full potential of AI-powered tools and services.

The future of AI security will be built by people who understand both the traditional cybersecurity landscape and the emerging world of AI-specific threats. It’s an exciting time to be working in this space, even if it sometimes feels like we’re building the airplane while flying it.

Taming the Wild West of MCP Security.

The Threat Landscape Nobody Talks About

Building Defence in Depth

Why I Built a Security Shim

Beyond the Protocol

The Road Ahead

Final Thoughts