Google has detailed the security approach behind Chrome’s agentic features, which can automate actions such as booking tickets or shopping and which therefore carry potential data and financial risks.
Google employs several models to manage agentic actions. A User Alignment Critic, built with Gemini, scrutinizes the action items generated by the planner model. If the critic determines that planned tasks do not align with the user’s goals, it prompts the planner to re-evaluate its strategy. The critic sees only the metadata of proposed actions, not actual web content.
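The planner/critic loop described above can be sketched roughly as follows. This is a hypothetical illustration, not Chrome's implementation: the `PlannedAction` type, the `critic_approves` heuristic, and the replanning loop are all assumptions made for the example; the key property it preserves is that the critic judges only action metadata.

```python
# Hypothetical sketch of a planner/critic alignment loop. Names and logic
# are illustrative assumptions, not Chrome's actual implementation.
from dataclasses import dataclass

@dataclass
class PlannedAction:
    # The critic sees only metadata about the action, never page content.
    description: str    # e.g. "add_to_cart"
    target_origin: str  # e.g. "https://shop.example.com"

def critic_approves(action: PlannedAction, user_goal: str) -> bool:
    # Stand-in for the Gemini-based User Alignment Critic: a trivial check
    # that the action's verb appears in the stated user goal.
    return action.description.split("_")[0] in user_goal.lower()

def plan_and_execute(user_goal: str, planner, executor, max_replans: int = 3):
    for _ in range(max_replans):
        actions = planner(user_goal)
        if all(critic_approves(a, user_goal) for a in actions):
            return [executor(a) for a in actions]
        # Critic rejected the plan: the planner is asked to re-evaluate.
    raise RuntimeError("No plan aligned with the user's goal")
```

The point of the pattern is the separation of duties: the planner proposes, a distinct model vets the proposal against the user's intent, and execution happens only after approval.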
To keep the agent off disallowed or untrustworthy sites, Google uses Agent Origin Sets, which confine the model to specific read-only and writable origins. Read-only origins define the content Gemini can consume: product listings on a shopping site are relevant, for example, but banner ads are not. The agent can interact only with designated iframes on a page.
In a blog post, the company stated, “This delineation enforces that only data from a limited set of origins is available to the agent, and this data can only be passed on to the writable origins. This bounds the threat vector of cross-origin data leaks. This also gives the browser the ability to enforce some of that separation, such as by not even sending to the model data that is outside the readable set.”
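The separation the quote describes can be sketched in a few lines. The set contents, function names, and enforcement points here are assumptions for illustration; the invariants they model come from the blog post: data outside the readable set is never sent to the model, and readable data can flow only to writable origins.

```python
# Illustrative sketch of Agent Origin Set enforcement; the data structures
# and function names are assumptions, not Chrome APIs.
from urllib.parse import urlsplit

READABLE = {"https://shop.example.com"}  # origins the model may consume
WRITABLE = {"https://shop.example.com"}  # origins the agent may act on

def origin(url: str) -> str:
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"

def fetch_for_model(url: str, page_data: str):
    # Enforced by the browser: data outside the readable set is not even
    # sent to the model.
    return page_data if origin(url) in READABLE else None

def may_submit(url: str) -> bool:
    # Readable data may only be passed on to writable origins, bounding
    # the threat vector of cross-origin data leaks.
    return origin(url) in WRITABLE
```

Enforcing the readable set in the browser, before data reaches the model, is what makes the guarantee robust even if the model itself is confused or manipulated.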
Google also monitors page navigation with a separate observer model to keep the agent from visiting harmful model-generated URLs.
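A navigation guard of this kind might look like the following minimal sketch. The hostname blocklist is a crude stand-in for Google's observer model, whose internals are not public; the hook point, checking every model-generated URL before navigation, is the part the article describes.

```python
# Minimal sketch of a navigation guard; the blocklist is a hypothetical
# stand-in for the observer model Google describes.
from urllib.parse import urlsplit

BLOCKED_HOSTS = {"malware.example.net"}  # illustrative blocklist

def allow_navigation(url: str) -> bool:
    # Every model-generated URL is vetted before the agent navigates.
    host = urlsplit(url).hostname or ""
    return host not in BLOCKED_HOSTS
```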
For sensitive tasks, Google requires user consent. If an agent attempts to access sensitive sites, such as banking or medical platforms, it requests user permission. Should a site require signing in, Chrome will prompt for user permission to utilize the password manager; the agent’s model does not access password data. Users will be asked before the agent initiates actions like making a purchase or sending a message.
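The consent gating above amounts to a simple pattern: sensitive actions block until the user explicitly approves. The action categories and callback shape below are assumptions for illustration, not Chrome's interface.

```python
# Hypothetical consent gate: sensitive actions pause for user approval.
# The SENSITIVE categories are illustrative, not Chrome's actual list.
SENSITIVE = {"purchase", "send_message", "sign_in"}

def run_action(action: str, ask_user) -> bool:
    # ask_user is a callback that surfaces a permission prompt and
    # returns the user's decision.
    if action in SENSITIVE and not ask_user(action):
        return False  # user declined; the agent does not proceed
    return True       # non-sensitive actions run without a prompt
```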
Google also employs a prompt-injection classifier to prevent unwanted actions and is evaluating agentic capabilities against attacks developed by researchers. Earlier this month, Perplexity released an open-source content detection model to counter prompt injection attacks against agents.