AI Agent Security Needs a Systems Overhaul, Researchers Say
CMC Crypto News

AI Agent Security Needs a Systems Overhaul, Researchers Say

According to the researchers, three core mechanisms could eliminate the majority of known attack types. The first is a clear separation between instructions and untrusted data

AI Agent Security Needs a Systems Overhaul, Researchers Say

A research paper published on May 20 by teams from Google, Gray Swan AI, EmbraceTheRed, and several universities argues that securing AI agents requires rethinking how the entire system is built, not just how the model itself behaves. The paper contends that treating the AI model as the sole security perimeter leaves too many attack surfaces unaddressed. Researchers said efforts focused only on model robustness are insufficient on their own.

The paper calls for AI agents to be treated as untrusted components within a wider system, drawing on established principles from computer security. "This domain has long dealt with powerful attackers and motivated decades of research on principles and techniques that deal with such adversaries," the researchers wrote. They argue the same adversarial framework should now apply to AI agentsecurity.

According to the researchers, three core mechanisms could eliminate the majority of known attack types. The first is a clear separation between instructions and untrusted data, so that attackers cannot embed malicious commands inside content the agent is processing. Without this boundary, a bad actor can hijack an agent's behavior by hiding instructions inside what appears to be ordinary input.

The second mechanism limits permissions. The paper argues agents should only hold the minimum access required to complete a task, rather than broad system-level rights. The third transfers control of sensitive data flow away from the agent entirely, placing it at the system level, so the agent cannot be manipulated into routing private information to unauthorized destinations. These three controls, applied together, address what the researchers describe as the structural root of most AI attack scenarios.

The paper arrives as AI agents are seeing rapid adoption in crypto. Circle CEO Jeremy Allaire predicted in January that billions of AI agents would be operating on users' behalf within five years. Trading assistant Bankr disabled transactions on May 20 after identifying an attacker who had accessed at least 14 wallets. Security experts speculated the exploit may have involved prompt injection, one of the attack types the paper addresses directly.

Aaron Ratcliff, attributions lead at blockchain intelligence firm Merkle Science, said giving an AI agent access to a wallet introduces a layer of trust into a system designed to be trustless. He said the setup can be safe if built correctly, but listed several conditions, including the ability to catch front-running, apply slippage limits, audit contracts in real time, sandbox prompts, prevent injection, and block man-in-the-middle access. Ratcliff said he would want proof of all those capabilities before the agent executes a trade.

Sean Ren, co-founder of AI-native blockchain platform Sahara AI, said model context protocols are the current standard for safety when configured correctly, but added that users should still monitor every action an agent takes. He described the protocols as a gatekeeper that sits between the AI model and a user's wallet, limiting the agent to specific approved actions such as checking balances or preparing a payment for user confirmation, rather than moving funds freely. "The agent can only perform specific, approved actions […] rather than freely moving funds or changing wallet settings," Ren said.

AI agents are currently being used to build Web3 applications, launch tokens, and interact autonomously with services and protocols. Some platforms are also exploring AI for trading, and the combination of autonomous decision-making with on-chain execution is drawing both developer interest and security scrutiny. The researchers said the goal of their framework is to apply the same systematic controls that have protected conventional software systems to this emerging class of autonomous agents.

This article contains links to third-party websites or other content for information purposes only (“Third-Party Sites”). The Third-Party Sites are not under the control of CoinMarketCap, and CoinMarketCap is not responsible for the content of any Third-Party Site, including without limitation any link contained in a Third-Party Site, or any changes or updates to a Third-Party Site. CoinMarketCap is providing these links to you only as a convenience, and the inclusion of any link does not imply endorsement, approval or recommendation by CoinMarketCap of the site or any association with its operators. This article is intended to be used and must be used for informational purposes only. It is important to do your own research and analysis before making any material decisions related to any of the products or services described. This article is not intended as, and shall not be construed as, financial advice. The views and opinions expressed in this article are the author’s [company’s] own and do not necessarily reflect those of CoinMarketCap.
0 people liked this article