NVIDIA and Microsoft Ship Unified Stack for Agentic AI from Windows to Cloud

Q: What is the significance of NVIDIA and Microsoft shipping a unified stack for agentic AI from Windows to cloud?

An expanded NVIDIA-Microsoft partnership brings purpose-built hardware, namely RTX Spark and DGX Station, to Windows, alongside GPU-accelerated Microsoft Fabric for enterprise data.

Developers can now deploy autonomous agents across a continuous spectrum of hardware, ranging from 1-petaflop Windows laptops to 20-petaflop deskside supercomputers and massive Azure AI factories.

Reimagining Windows for Autonomous Agents

NVIDIA and Microsoft are fundamentally changing the Windows PC architecture to support the long-running reasoning required by AI agents. The new RTX Spark line introduces Windows PCs purpose-built for personal agents, delivering 1 petaflop of AI performance and up to 128GB of unified memory. These systems, arriving this fall from partners like ASUS, Dell, HP, Lenovo, and MSI, are designed to maintain full AI and graphics performance even when unplugged.

For enterprise-scale agentic workflows, the DGX Station for Windows provides a deskside supercomputer powered by the NVIDIA GB300 Grace Blackwell Ultra Desktop Superchip. This hardware delivers 20 petaflops of FP4 performance and up to 748GB of coherent memory, enough to run frontier models with up to 1 trillion parameters locally for Windows enterprise applications. Both RTX Spark and DGX Station use NVIDIA OpenShell, a secure-by-design, sandboxed runtime that isolates agents to prevent unauthorized access to files or networks.
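
As a rough check on the 1-trillion-parameter figure, the weights-only footprint of a model can be estimated from its parameter count and the number of bits stored per weight. The sketch below is a back-of-envelope calculation, not NVIDIA's sizing method: it assumes every weight is stored at the stated precision and ignores KV cache, activations, and runtime overhead. The function name is illustrative.

```python
def weights_footprint_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate size of the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


PARAMS = 1e12      # 1 trillion parameters, the figure quoted in the article
MEMORY_GB = 748    # coherent memory quoted for DGX Station

# Weights-only footprint at three common precisions.
fp4_gb = weights_footprint_gb(PARAMS, 4)
fp8_gb = weights_footprint_gb(PARAMS, 8)
fp16_gb = weights_footprint_gb(PARAMS, 16)

for label, size in (("FP4", fp4_gb), ("FP8", fp8_gb), ("FP16", fp16_gb)):
    verdict = "fits" if size <= MEMORY_GB else "does not fit"
    print(f"{label}: {size:.0f} GB of weights, {verdict} in {MEMORY_GB} GB")
```

Under these assumptions only the 4-bit case leaves headroom within 748GB, which is consistent with the article pairing the 1-trillion-parameter claim with FP4 performance.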

Accelerating Enterprise Data and Model Orchestration

Agentic AI requires high-concurrency access to massive datasets, which can turn the data layer into a bottleneck. NVIDIA accelerated computing now addresses this within Microsoft Fabric. Internal benchmarking shows that SQL execution in the Microsoft Fabric Data Warehouse is up to 6x faster than CPU-powered baselines and up to 7x faster than other leading cloud data warehouse providers for high-concurrency workloads. The goal is for the enterprise data layer to keep pace with agents that continuously query and reason over live data.

On the model side, Microsoft Foundry is expanding its hosted agent services to include NVIDIA, Anthropic, and OpenAI models. Anthropic's Claude models are now running natively on NVIDIA GB300 Blackwell Ultra systems on Azure. NVIDIA is also introducing Nemotron 3 Ultra, a frontier reasoning model optimized for long-running tasks in coding and research, alongside the Cosmos 3 omnimodel for physical AI, which uses a mixture-of-transformers architecture to simulate and act in the physical world.

Scaling the AI Factory

Beyond the desktop and the cloud, the partnership is scaling into massive, distributed AI factories. Microsoft's Fairwater Wisconsin facility is now live, running hundreds of thousands of NVIDIA Grace Blackwell systems as a single, unified AI factory. This infrastructure, combined with the validated NVIDIA Vera Rubin platform, is designed to optimize token economics, delivering up to 10x inference throughput per megawatt and reducing the cost per agentic token by an order of magnitude.
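
The link between throughput per megawatt and cost per token can be illustrated with simple arithmetic. The sketch below covers the energy component only, and the baseline throughput and electricity price are hypothetical placeholders, not figures from NVIDIA or Microsoft. It shows that, with power price held fixed, energy cost per token scales inversely with tokens per second per megawatt.

```python
def energy_cost_per_million_tokens(tokens_per_sec_per_mw: float,
                                   usd_per_mwh: float) -> float:
    """Energy cost in USD to generate one million tokens.

    One megawatt sustained for one hour consumes 1 MWh, so tokens produced
    per MWh equals tokens/sec/MW multiplied by 3600 seconds.
    """
    tokens_per_mwh = tokens_per_sec_per_mw * 3600
    return usd_per_mwh / tokens_per_mwh * 1_000_000


# Hypothetical inputs, chosen only to make the arithmetic concrete.
BASELINE_TOKENS_PER_SEC_PER_MW = 1_000_000
USD_PER_MWH = 80.0
THROUGHPUT_GAIN = 10  # the "up to 10x" figure from the article

baseline = energy_cost_per_million_tokens(
    BASELINE_TOKENS_PER_SEC_PER_MW, USD_PER_MWH)
improved = energy_cost_per_million_tokens(
    BASELINE_TOKENS_PER_SEC_PER_MW * THROUGHPUT_GAIN, USD_PER_MWH)

print(f"baseline: ${baseline:.4f} per million tokens")
print(f"improved: ${improved:.4f} per million tokens")
print(f"reduction factor: {baseline / improved:.1f}x")
```

Energy is only one part of the total cost per token. Hardware, cooling, and networking costs are outside this sketch.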

This unified deployment model enables a future where agentic intelligence is no longer confined to the cloud, but is a pervasive, secure, and high-performance layer across every tier of computing infrastructure.

Source: NVIDIA Partners With Microsoft on Unified Stack for Agentic AI Deployment, From Windows Devices to Cloud to Local
Domain: blogs.nvidia.com
