Gemini 3.5 Flash secures computer use in the main model, adds enterprise safeguards

Google DeepMind integrates computer use as a native tool in Gemini 3.5 Flash, letting agents see, reason, and act across browsers, mobile devices, and desktops, with targeted adversarial training against prompt injection.

Tags: Google DeepMind, Gemini 3.5 Flash, computer use, agentic AI, prompt injection, enterprise automation

Google DeepMind just turned computer use from a standalone model into a native tool inside Gemini 3.5 Flash, eliminating the round-trip overhead of calling a separate model for agentic tasks that require screen interaction.

Why You Care if You Build Agents

Computer use in Gemini was previously locked behind a dedicated Gemini 2.5 computer use model. That meant any agent that needed to click, scroll, or type across a browser or desktop had to pipe everything through a separate inference endpoint, adding latency and complexity. Mateo Quiros, product manager at Google DeepMind, announced today that computer use is now a built-in tool in 3.5 Flash, alongside function calling and existing tool integrations like Search and Maps grounding.

3.5 Flash can now see, reason, and take action across browser, mobile, and desktop environments in a single model call. The tight integration delivers “our best performance yet for agentic computer use tasks,” according to Quiros. Initial use cases include continuous software testing and knowledge work across professional applications, both long-horizon automation scenarios that previously suffered from model-switching overhead.
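The see, reason, act cycle described above can be sketched as a plain loop. This is a minimal illustration, not the Gemini API: `Action`, `run_agent`, `model_step`, `take_screenshot`, and `execute` are all hypothetical names, and the model call is assumed to be a single function that takes the goal, the current screenshot, and the action history and returns the next action.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One UI step proposed by the model (hypothetical shape)."""
    kind: str          # e.g. "click", "type", "scroll", or "done"
    target: str = ""   # element the action applies to, if any
    text: str = ""     # text to type, if any


# Assumed signature: (goal, screenshot bytes, history) -> next Action.
ModelStep = Callable[[str, bytes, List[Action]], Action]


def run_agent(
    goal: str,
    model_step: ModelStep,
    take_screenshot: Callable[[], bytes],
    execute: Callable[[Action], None],
    max_steps: int = 20,
) -> List[Action]:
    """Observe the screen, ask the model for one action, execute it, repeat."""
    history: List[Action] = []
    for _ in range(max_steps):
        screen = take_screenshot()                  # see
        action = model_step(goal, screen, history)  # reason (one model call)
        history.append(action)
        if action.kind == "done":                   # model reports completion
            break
        execute(action)                             # act
    return history
```

In this sketch, `max_steps` bounds long-horizon tasks so that a confused agent cannot loop indefinitely. A real integration would replace `model_step` with an actual API call.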

Enterprise Safeguards That Actually Matter

Prompt injection is the perennial headache for any agent that operates on live web pages. A malicious button label or hidden instruction could hijack an agent mid-task. Google DeepMind addressed this at training time: 3.5 Flash uses targeted adversarial training for computer use to reduce the model’s susceptibility to injected prompts.

On top of that, two optional enterprise safeguard systems give developers control. The first requires explicit user confirmation for sensitive or irreversible actions (think “are you sure you want to delete this database row?”). The second automatically stops tasks when the system detects an indirect prompt injection. Google recommends a defense-in-depth approach: combine these with secure sandboxing, human-in-the-loop verification, and strict access controls. Best practices docs are live.
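The source does not say how these safeguards are configured, so the following is only a client-side sketch of the same policy, not the Gemini API. `ProposedAction`, `Verdict`, `gate`, and the `sensitive` and `injection_suspected` flags are hypothetical names. The flags are assumed to be supplied by whatever system classifies actions and detects injections.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Verdict(Enum):
    EXECUTE = "execute"  # safe to run the action
    SKIP = "skip"        # user declined a sensitive action
    HALT = "halt"        # stop the whole task


@dataclass
class ProposedAction:
    """An action awaiting approval (hypothetical shape)."""
    kind: str
    description: str
    sensitive: bool = False            # assumed: set by a policy or classifier
    injection_suspected: bool = False  # assumed: set by an injection detector


def gate(action: ProposedAction, confirm: Callable[[str], bool]) -> Verdict:
    """Apply both safeguards before an action reaches the environment."""
    # Safeguard 2: a suspected indirect prompt injection stops the task.
    if action.injection_suspected:
        return Verdict.HALT
    # Safeguard 1: sensitive or irreversible actions need explicit consent.
    if action.sensitive and not confirm(action.description):
        return Verdict.SKIP
    return Verdict.EXECUTE
```

The injection check comes first on purpose: a compromised task should stop outright instead of asking the user to approve an action the attacker may have chosen.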

Starting Today, Not Next Quarter

Developers can start building with computer use in 3.5 Flash immediately via the Gemini API and the Gemini Enterprise Agent Platform. A demo environment hosted by Browserbase lets you test capabilities without spinning up infrastructure. Quiros also said customers are already seeing value; the post includes quotes from early adopters (names not disclosed in the source).

With the safeguard systems in place, the open question is how enterprise deployments hold up on long-horizon automation such as continuous software testing across professional apps, where even a single misclick can cascade into a failed pipeline.


Source: Computer use in Gemini 3.5 Flash
Domain: blog.google
