NVIDIA just dropped the public beta of XR AI, a developer library that wires AR glasses into a full agentic stack - perception, reasoning, tool use, and orchestration - and it's already guiding stem cell protocols in Stanford's Cong Lab and being tested with surgical teams at UPMC's Surreality Lab.
Most AR demos are glorified heads-up displays. XR AI is different: it treats the glasses as a sensor hub and pipes video, audio, depth, and pose data into a coordinated set of AI models and enterprise tools. The result is a spatially aware assistant that can see what you're doing, hear your questions, retrieve documentation, and guide your next move without you touching a screen.
Four Core Capabilities That Make It Work
XR AI pulls together four components that most agent frameworks leave as homework. First, it ingests real-world signals from AR and XR devices - video, audio, depth, pose, and raw sensor data. Second, it connects those signals to specialized tools: NVIDIA Metropolis for visual AI and video search/summarization, and NeMo Retriever for enterprise knowledge retrieval and RAG.
Third, it supports a broad model ecosystem including NVIDIA's own Nemotron reasoning models, Cosmos Reason for spatial reasoning, and any compatible foundation model you want to plug in. Fourth, it wraps everything in an orchestration layer powered by NeMo Agent Toolkit for tool use, multi-agent coordination, and accelerated runtime services on DGX Spark, DGX Station, or RTX PRO systems.
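To make the four-part split concrete, here is a minimal Python sketch of how ingestion, tools, reasoning, and orchestration might fit together. It is a toy illustration only, not the XR AI API: every name in it (SensorFrame, lookup_manual, describe_scene, choose_tool, Orchestrator) is hypothetical, and a keyword rule stands in for a real reasoning model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class SensorFrame:
    """Stage 1 (ingestion): one bundle of signals from the glasses."""
    transcript: str                # speech already converted to text
    detected_objects: List[str]    # labels produced by some vision model
    pose: tuple = (0.0, 0.0, 0.0)  # head position; unused in this toy


# Stage 2 (tools): plain callables keyed by name. Both are stand-ins.
Tool = Callable[[SensorFrame], str]


def lookup_manual(frame: SensorFrame) -> str:
    # Hypothetical stand-in for a retrieval call over enterprise documents.
    manuals = {"plc": "Manual section 4.2: check the PLC status LED."}
    for label in frame.detected_objects:
        if label in manuals:
            return manuals[label]
    return "No matching manual entry."


def describe_scene(frame: SensorFrame) -> str:
    # Hypothetical stand-in for a visual AI summary.
    return "I can see: " + ", ".join(frame.detected_objects)


def choose_tool(frame: SensorFrame) -> str:
    # Stage 3 (reasoning): a keyword rule in place of a reasoning model.
    return "manual" if "manual" in frame.transcript.lower() else "scene"


@dataclass
class Orchestrator:
    """Stage 4 (orchestration): route each frame to a tool and log it."""
    tools: Dict[str, Tool]
    log: List[str] = field(default_factory=list)

    def handle(self, frame: SensorFrame) -> str:
        name = choose_tool(frame)
        answer = self.tools[name](frame)
        self.log.append(f"{name}: {answer}")
        return answer


agent = Orchestrator(tools={"manual": lookup_manual, "scene": describe_scene})
frame = SensorFrame(transcript="Pull up the manual for this unit",
                    detected_objects=["plc", "cable"])
print(agent.handle(frame))
```

Running it prints the made-up manual entry for the detected "plc" label. The point of the split is that each stage can be swapped out independently: a different model in choose_tool, or a different retrieval backend in lookup_manual, without touching the routing.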
Real-World Deployments: From Gene Editing to Surgery
Siemens is using XR AI with DGX Spark to help factory engineers troubleshoot PLC issues hands-free. An engineer wearing lightweight glasses talks to an agent that can pull up maintenance manuals, verify work, and log what happened - all without breaking focus on the machinery.
Rana, an AutoBio company, built its LabOS system on XR AI for the Cong Lab at Stanford and the Wang Lab at Princeton. LabOS guides researchers through stem cell therapy and CRISPR gene-editing protocols in real time, identifying the right sample, confirming each step, and capturing a structured experimental record. It works with Meta, Rokid, and VITURE glasses.
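The identify, confirm, record loop described for LabOS can be pictured as a small step tracker. The Python sketch below is an assumption-laden illustration, not LabOS code: the Step and ProtocolRun classes, the sample labels, and the step instructions are all invented for the example, and the "seen" label is assumed to come from some upstream vision model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Step:
    instruction: str
    required_sample: str  # label that must be seen before the step confirms


@dataclass
class ProtocolRun:
    steps: List[Step]
    index: int = 0
    record: List[dict] = field(default_factory=list)  # structured run log

    def current(self) -> Optional[Step]:
        return self.steps[self.index] if self.index < len(self.steps) else None

    def observe(self, seen_sample: str) -> str:
        """Check one observation against the current step and log it."""
        step = self.current()
        if step is None:
            return "Protocol complete."
        confirmed = seen_sample == step.required_sample
        self.record.append({
            "step": self.index + 1,
            "instruction": step.instruction,
            "seen": seen_sample,
            "confirmed": confirmed,
        })
        if not confirmed:
            return f"Wrong sample: expected {step.required_sample}, saw {seen_sample}."
        self.index += 1
        nxt = self.current()
        return f"Next: {nxt.instruction}" if nxt else "Protocol complete."


# Made-up two-step protocol, for illustration only.
run = ProtocolRun([Step("Pick up sample A", "sample_A"),
                   Step("Add reagent B", "reagent_B")])
print(run.observe("sample_C"))
print(run.observe("sample_A"))
print(run.observe("reagent_B"))
```

The three calls print a wrong-sample warning, then the next instruction, then the completion message. Every observation, including the rejected one, lands in run.record, which is the sense in which the record is "structured": it is a list of per-step entries rather than free-form notes.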
At UPMC's Surreality Lab, the system is being tested to support surgical teams. The pipeline runs on XR AI and DGX Station, and its key trick is knowing what not to occlude - it surfaces context without cluttering the surgeon's view. Innoactive is using it for automotive design workflows.
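One simple way to think about "knowing what not to occlude" in the surgical case is as a placement check: an overlay is only shown in a slot that does not overlap a protected region of the view. The Python sketch below illustrates that idea with axis-aligned rectangles in normalized screen coordinates. It is an assumption about how such a rule could work, not a description of the UPMC pipeline; Rect and place_overlay are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rect:
    x: float  # left edge, 0..1
    y: float  # top edge, 0..1
    w: float  # width
    h: float  # height

    def overlaps(self, other: "Rect") -> bool:
        # Standard axis-aligned overlap test; touching edges do not count.
        return (self.x < other.x + other.w and other.x < self.x + self.w and
                self.y < other.y + other.h and other.y < self.y + self.h)


def place_overlay(candidates: List[Rect], protected: List[Rect]) -> Optional[Rect]:
    """Return the first candidate slot that covers no protected region."""
    for slot in candidates:
        if not any(slot.overlaps(region) for region in protected):
            return slot
    return None  # nothing safe: show no overlay rather than block the view


# Hypothetical layout: keep the center of the view clear.
protected = [Rect(0.3, 0.3, 0.4, 0.4)]
candidates = [Rect(0.4, 0.4, 0.2, 0.1),    # would cover the protected area
              Rect(0.0, 0.0, 0.25, 0.1)]   # top-left corner, clear
print(place_overlay(candidates, protected))
```

Here the first candidate is rejected and the top-left slot is chosen. Returning None when no slot is clear is a deliberate choice in this sketch: staying silent is treated as safer than covering the protected area.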
What This Changes for Developers
Before XR AI, building an agent that could see a lab bench and reason about a protocol meant stitching together half a dozen separate SDKs, managing latency yourself, and praying your model chain held up at the edge. NVIDIA's bet is that a unified runtime with GPU acceleration makes that stack production-grade out of the box.
The public beta is available now. Any developer with an AR headset and a GPU can start wiring up perception, retrieval, and reasoning into an agent that doesn't just answer questions - it sees your workspace, hears your commands, and guides your hands through the next step.
Source: Hands Free, AIs Forward: NVIDIA XR AI Brings Agents to AR Glasses
Domain: blogs.nvidia.com