NVIDIA Ships Agent Skills That Automate the Groundwork of Physical AI

CVPR 2026: new AI agent skills for scenario reconstruction, synthetic anomaly generation, and policy deployment across AVs, robotics, and vision, closing the workflow gap that slows every lab.

Tags: nvidia, cvpr 2026, physical ai, autonomous vehicles, robotics, vision ai

The bottleneck in physical AI research isn't model architecture; it's the grunt work of turning real-world data into testable scenarios. At CVPR 2026, NVIDIA announced a set of agent skills that automate that work across autonomous vehicles, robotics, and vision AI.

Agent Skills That Reconstruct, Generate, and Evaluate

NVIDIA Cosmos 3, the open frontier model for physical AI, is now paired with skills that turn fragmented toolchains into single-agent workflows. Neural Reconstruction skills, for example, take fleet-captured video and turn it into editable 3D scenes usable in simulation. InstantNuRec performs fast 3D Gaussian road-scene reconstruction from images without per-scene optimization. For AV researchers working through the long tail of edge-case driving, skipping that optimization step is a direct speed-up.

Defect Image Generation for vision AI automates what used to require manual image editing: starting from real images, it synthesizes rare defects such as scratches, dents, and discolorations on different surfaces. The pipeline combines Isaac Sim, Cosmos 3, and NVIDIA OSMO for orchestration. Researchers get synthetic anomalies they can label and train on without waiting for those defects to appear on a factory floor.
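To make the appeal of synthetic anomalies concrete, here is a toy sketch in plain Python. It is not the Isaac Sim / Cosmos 3 / OSMO pipeline and uses none of its APIs; the function `add_scratch` and the 8x8 grayscale "image" are hypothetical. It only illustrates one property that holds for synthetic defects generally: because the generator places the defect, the pixel-level label comes for free.

```python
import random


def add_scratch(image, length, seed=0):
    """Paint a bright diagonal scratch onto a grayscale image (list of rows).

    Hypothetical helper for illustration only. Returns (defective, mask),
    where mask holds 1 at every pixel the scratch touched and 0 elsewhere.
    """
    rng = random.Random(seed)
    height, width = len(image), len(image[0])
    defective = [row[:] for row in image]          # copy; leave the input intact
    mask = [[0] * width for _ in range(height)]

    # Choose a start position so the whole diagonal fits inside the image.
    row0 = rng.randrange(0, height - length + 1)
    col0 = rng.randrange(0, width - length + 1)

    for i in range(length):
        defective[row0 + i][col0 + i] = 255        # scratch pixel
        mask[row0 + i][col0 + i] = 1               # its label, known exactly

    return defective, mask


clean = [[100] * 8 for _ in range(8)]              # stand-in for a real photo
defective, mask = add_scratch(clean, length=4, seed=42)
defect_pixels = sum(map(sum, mask))
print(f"labeled defect pixels: {defect_pixels}")
```

A real generator would produce far more realistic defects, but the training-data contract is the same: an image paired with a mask that needed no human annotator.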

32 Billion Parameters for the Full Driving Stack

NVIDIA Alpamayo 2 Super is an open 32-billion-parameter reasoning vision-language-action (VLA) model that reasons, plans, and acts across the entire driving pipeline. The point is not just scale: a single model closes the loop from perception to control. The PAI-AV Reasoning Challenge at CVPR will benchmark how well such models explain driving decisions using chain-of-causation labels.

For closed-loop reinforcement learning, AlpaGym scales policy rollouts across thousands of GPUs, while OmniDreams renders photorealistic camera frames that respond to policy actions in real time. AV researchers can now simulate corner cases that would take years to capture in real-world logs.
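The difference between closed-loop simulation and replaying logs can be shown with a toy example. The sketch below does not use AlpaGym or OmniDreams; the policy, the one-dimensional "lane offset" state, and the dynamics are all assumptions made up for illustration. In the closed-loop rollout each observation depends on the previous action, while in the open-loop replay the policy's actions never change what it sees next.

```python
def lane_keeping_policy(offset):
    """Hypothetical policy: steer proportionally back toward the lane center."""
    return -0.5 * offset


def closed_loop_rollout(policy, initial_offset, steps):
    """Roll the policy out in a toy simulator that reacts to every action."""
    offset = initial_offset
    trajectory = [offset]
    for _ in range(steps):
        action = policy(offset)
        offset = offset + action       # toy dynamics: the action moves the car
        trajectory.append(offset)
    return trajectory


def open_loop_replay(policy, logged_offsets):
    """Query the policy on a fixed log; the log ignores the policy's actions."""
    return [policy(offset) for offset in logged_offsets]


trajectory = closed_loop_rollout(lane_keeping_policy, initial_offset=2.0, steps=4)
replayed_actions = open_loop_replay(lane_keeping_policy, [2.0, 2.0, 2.0])

print(trajectory)          # offsets shrink because the simulator responds
print(replayed_actions)    # same action every time; the log never changes
```

Only the closed-loop version can reveal whether small errors compound or get corrected over time, which is why a renderer that responds to policy actions matters for reinforcement learning.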

Robotics Workflows That Don't Require Stitching

Isaac Sim 6.0 and Isaac Lab now include agent-friendly skills for scene authoring, simulation launch, data capture, and environment validation. Mobility skills automate the steps of a navigation workflow: scene search, USD conversion, environment registration, residual RL, and policy evaluation. Specialized agentic workflows help with sim-to-sim and sim-to-real tasks such as physics tuning and debugging.
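Of the navigation steps listed above, residual RL is the least self-explanatory. The general idea is that a learned policy outputs a correction that is added to the action of an existing controller, rather than learning the whole behavior from scratch. The sketch below illustrates only that composition; it does not use Isaac Lab, the names are hypothetical, the "learned" residual is a hard-coded stand-in for a trained network, and clamping the correction is a design choice assumed here, not something the source describes.

```python
def base_controller(distance_to_goal):
    """Hand-written controller: speed proportional to distance, capped at 1.0."""
    return min(1.0, 0.5 * distance_to_goal)


def learned_residual(distance_to_goal):
    """Stand-in for a trained network: brake harder when close to the goal."""
    return -0.5 if distance_to_goal < 1.0 else 0.0


def make_residual_policy(base, residual, limit=0.125):
    """Return a policy whose action is the base action plus a bounded correction."""
    def policy(observation):
        # Clamp the correction so the learned part can only nudge the base action.
        correction = max(-limit, min(limit, residual(observation)))
        return base(observation) + correction
    return policy


policy = make_residual_policy(base_controller, learned_residual)

far_speed = policy(4.0)     # base action only; the residual is zero here
near_speed = policy(0.5)    # base action 0.25 plus a clamped correction
print(far_speed, near_speed)
```

Keeping the base controller in the loop means the robot behaves sensibly even before training has produced a useful residual.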

For healthcare robotics, Cosmos-H-Surgical-Simulator generates realistic surgical data directly from real procedures, skipping hand-engineered physics models. Cosmos 3 then generates scene variations and supports post-training with embodiment-specific data. That's a direct attack on the sim-to-real gap that has plagued surgical autonomy.

Open Infrastructure, No Gatekeeping

All agent skills and tools are available now on GitHub. Preconfigured environments called Physical AI Launchables run on hosted NVIDIA H100 Tensor Core GPUs with free trial credits, so no cluster is required to get started. The NVIDIA Physical AI Dataset has passed 15 million downloads on Hugging Face, and the Isaac GR00T X Embodiment Sim is one of the most-downloaded robotics datasets.

NVIDIA's CVPR presence includes three open research challenges: the AI City Challenge (now in its tenth year), the new PAI-AV Reasoning Challenge, and the AlpaSim Closed-Loop End-to-End Driving Challenge. All three are designed to benchmark progress in the workflows these agent skills automate.

I'm watching whether other labs adopt these skills as the default workflow layer, or whether the fragmentation simply moves up a level. Either way, the bar for what a single researcher can prototype in a day just got higher.


Source: NVIDIA Enables the Next Era Of Physical AI Research With Agent Skills For Autonomous Vehicles, Robotics And Vision AI
Domain: blogs.nvidia.com
