AI News

86% of Tested AI Systems Failed a Basic Security Attack. Your Browser Agent Might Be One of Them

AI browser agent UAE risk

AI browser agent UAE risk

Researchers tested 36 LLM-integrated applications against a known attack technique called prompt injection. Thirty-one failed, an 86% failure rate. That matters more than it sounds, because the same underlying technology now sits inside Comet, Atlas, and Gemini in Chrome, the AI browser agents that can open tabs, compare prices, fill forms, and complete a checkout while you do something else entirely.

This is not a future risk. Security researchers at Brave already demonstrated it directly against Comet, with no extra clicks required.

VERDICT: AI browser agents are genuinely useful for research and comparison. They are not ready to be left alone on anything involving a payment, a password, or a government portal. Brave’s security team showed that a Reddit post with concealed instructions caused Comet to access a victim’s email, extract their address, retrieve a one-time password, and send both to an attacker’s server, triggered by nothing more than asking the agent to summarize the page. Google separately patched a high-severity vulnerability, CVE-2026-0628, that let a low-privilege browser extension hijack the Gemini panel inside Chrome and gain access to camera, microphone, and local files. These are documented, disclosed incidents, not theoretical concerns.

What an Agent Actually Does Behind the Scenes

Strip away the marketing and every one of these tools does the same three things on a loop. It reads the page the way you would. It decides what to click next based on your instruction. It repeats that until the task looks done. The fundamental weakness underneath all of it is that these systems cannot reliably distinguish your actual instruction from a malicious one hidden inside the content they’re reading. Prompt injection currently sits at the top of OWASP’s list of the most serious risks facing AI-integrated applications, and security researchers describe it as still unsolved, not nearly solved.

OpenAI’s own chief information security officer has publicly acknowledged prompt injection as exactly that, an unsolved security problem, when defending Atlas against early attacks. That is a genuinely honest admission from inside one of the companies building these tools, and it is worth taking at face value rather than assuming the issue has since been quietly fixed.

The Documented Incidents Worth Knowing About

Brave’s demonstration against Comet is the clearest, most concrete example. A Reddit comment containing concealed text instructed the AI to retrieve the user’s email address and a one-time password, then transmit both to an external server, all triggered by the simple, ordinary act of asking the agent to summarise a webpage. No unusual permission was requested mid-task. The user never saw anything resembling a request to share sensitive data.

Separately, security firm LayerX found that OpenAI’s Atlas browser, which introduced a feature called Browser Memories for persistent storage of browsing behaviour, is vulnerable to cross-site request forgery attacks that inject malicious instructions directly into that persistent memory. Once corrupted, the memory carries the injected instruction across devices and sessions, activating later whenever the user makes an entirely unrelated, legitimate query.

And Google’s Gemini Live panel inside Chrome carried a high-severity vulnerability, CVE-2026-0628, disclosed responsibly by Palo Alto Networks researchers in October 2025 and patched in early January 2026. The flaw allowed a malicious browser extension with only basic permissions to hijack the privileged Gemini panel and inherit its access to local files, camera, and microphone. Anyone running a version of Chrome from before that January patch remains exposed.

Where It Gets Risky in the UAE Specifically

Three categories of UAE digital life carry more downside than upside if an agent misreads a field or follows a hidden instruction, precisely because the systems behind them are unforgiving of small mistakes.

Banking and payment pages. An agent that misreads an amount or a recipient field can submit a transfer that was never intended, and reversing a UAE bank transfer once it clears is rarely instant or guaranteed.

Government portals. ICP, GDRFA, and MOHRE applications often need a precise document upload tied to a specific field. A misread field here does not just fail quietly, it can trigger a rejected application that costs time, a resubmission fee, and in some visa categories, a delay with knock-on effects on employment or residency status.

Saved credentials and browser-level access. Several agents request broad browser permissions to function smoothly. If the underlying account or device is compromised at any point, that standing access exposes far more than the single task the agent was given.

How to Actually Use One Without Getting Burned

Keep agents on read-only tasks first. Research, comparison shopping, and summarising are where these tools earn their keep with almost no downside if something goes slightly wrong.

Review before you confirm. Never let an agent complete a financial transaction without a manual look at the final screen before you hit confirm. Treat that final click as yours, always.

Use a clean browser profile for experimentation, one with no saved card details and limited account access, so a mistake costs you nothing real.

Keep your browser updated. The January 2026 Chrome patch for CVE-2026-0628 only protects you if you’ve actually installed it. Outdated versions remain exposed to a vulnerability that’s already been publicly documented.

Log out of government portals the moment an agent’s task is done, rather than leaving an authenticated session open in the background.

The Bottom Line

Gartner’s own guidance to enterprises evaluating these tools is blunt: block AI browser agents until the risks are better understood, or at minimum, keep them away from financial, HR, and authentication workflows. That is institutional caution from analysts with no reason to be alarmist. For an individual resident, the technology is genuinely getting useful for low-stakes tasks. Treat it like a fast, capable intern, excellent at the busywork, not yet trusted with the company card, and definitely not with your visa file.

Robius.news — Dubai, UAE — 2026 | Built to be first. Built to be trusted.

Shares:

Related Posts