GraspGen-X and LCDrive Close Scaling Gaps in Robotics and Autonomy

NVIDIA Research introduces GraspGen-X for zero-shot robotic grasping and LCDrive to speed up autonomous vehicle reasoning on embedded hardware.

GraspGen-X applies geometric and contact understanding to any robotic gripper it encounters, eliminating the need for per-embodiment training cycles.

Most robotic grasping systems are specialists. A vision-language-action policy trained for a two-finger gripper cannot grasp with a multi-fingered dexterous hand without extensive retraining, fine-tuning, and validation. This constraint forces many robotics companies to pick a specific gripper and stick with it. GraspGen-X removes this bottleneck by functioning as a foundation model for zero-shot grasping. Given the geometry of a new gripper and an unknown object, the model generates reliable grasp pose proposals immediately.

Training on 2 Billion Simulated Grasps

To achieve generalization, researchers generated a dataset of 2 billion simulated grasps across thousands of object shapes and synthetic gripper configurations. This scale of diversity is impossible to collect in the real world. For developers, GraspGen-X can be used out of the box for several common grippers and integrates with curoboV2, a new CUDA-accelerated motion planning library, to execute these poses in unknown environments. Building on this, the Grasp-MPC framework presented at ICRA 2026 moves the pipeline from mere generation to closed-loop grasp execution.

Latent Reasoning for Faster Autonomous Driving

While chain-of-thought reasoning improves decision-making in AI, text-based reasoning is a massive bottleneck for autonomous vehicles. Every word generated is a token that consumes time and computational resources on the embedded hardware inside a car. LCDrive solves this by replacing human-readable text with compressed latent representations. Instead of producing words, the system thinks in a compact latent space that captures spatial information.

The architecture alternates between proposing candidate actions and predicting the resulting world state to refine its next step. This reasoning loop provides output trajectory quality comparable to text-based methods while using roughly half the tokens. The model was built on NVIDIA Alpamayo and trained using supervision derived from existing vehicle data, allowing for much faster response times on vehicle-grade processors.
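The alternating loop described above can be illustrated with a toy sketch. It is not LCDrive's implementation: the State class, the propose_actions and predict_next_state functions, the cost function, and the one-dimensional kinematics are all hypothetical stand-ins, and the state here is an explicit position and velocity rather than a learned latent vector. The sketch only shows the control flow of proposing candidate actions, predicting the resulting state for each, and using that prediction to choose the next step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    """Toy stand-in for a world state (hypothetical, not a learned latent)."""
    position: float  # metres along a straight lane
    velocity: float  # metres per second


def propose_actions(state: State) -> list[float]:
    """Hypothetical proposer: a fixed set of candidate accelerations (m/s^2)."""
    return [-2.0, -1.0, 0.0, 1.0, 2.0]


def predict_next_state(state: State, accel: float, dt: float = 0.5) -> State:
    """Hypothetical world model: constant-acceleration kinematics for one step."""
    position = state.position + state.velocity * dt + 0.5 * accel * dt * dt
    velocity = state.velocity + accel * dt
    return State(position, velocity)


def cost(state: State, goal_position: float) -> float:
    """Distance to the goal plus a small penalty on speed (assumed objective)."""
    return abs(goal_position - state.position) + 0.1 * abs(state.velocity)


def plan(start: State, goal_position: float, steps: int = 8) -> list[State]:
    """Alternate between proposing actions and predicting their outcomes."""
    state = start
    trajectory = [state]
    for _ in range(steps):
        # 1. Propose candidate actions from the current state.
        candidates = propose_actions(state)
        # 2. Predict the world state each candidate would lead to.
        predicted = [predict_next_state(state, a) for a in candidates]
        # 3. Keep the best predicted state and refine from there.
        state = min(predicted, key=lambda s: cost(s, goal_position))
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    for s in plan(State(0.0, 0.0), goal_position=50.0):
        print(f"position={s.position:.2f} velocity={s.velocity:.2f}")
```

In a latent-reasoning system, each of these intermediate states would presumably be a compact vector rather than text, which is where the token savings described above would come from.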

Scaling Agent Training via Virtual Worlds

NitroGen extends the principles of the Isaac GR00T humanoid robot foundation model to virtual environments. By treating video games as high-quality, structured training grounds, NitroGen harnesses over 40,000 hours of interaction across 1,000 games. This approach allows embodied agents to learn complex behaviors like combat, navigation, and exploration. In low-data scenarios, NitroGen improves agent performance by up to 52% over previous state-of-the-art methods. These advancements in physical AI enable more adaptive autonomous systems that can generalize from simulation to the complexities of the real world.

Source: NVIDIA Research Unlocks Advanced Grasping, Smarter Autonomous Driving and Agent Training at Scale
Domain: blogs.nvidia.com
