# NVIDIA’s GR00T Whole-Body Control stack in MuJoCo
20th February 2026
I’ve been running NVIDIA’s GR00T Whole-Body Control stack in MuJoCo — the sim-to-real bridge for humanoid robot locomotion. A MuJoCo viewer showing a simulated robot walking might look like a toy, but the neural network policy inside it is the same binary that runs on a real Unitree G1. Here’s what’s actually going on.
## What sim2mujoco Actually Is
It’s the sim-to-real transfer bridge. The workflow:
- Train a neural network policy in NVIDIA Isaac Sim (GPU-accelerated, thousands of robots in parallel)
- Validate in MuJoCo — lightweight, accurate physics, catches policy failures before touching hardware
- Deploy the same ONNX policy onto a real Unitree G1 robot
The policy file is the same binary (.onnx) at every stage. MuJoCo is just the test bench, not the end product. Think of it like watching a self-driving car in a simulator — looks like a video game, but the model inside it drives real cars.
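The "same binary at every stage" claim is mechanically checkable: hash the .onnx file in each environment and compare. A minimal sketch using only the standard library (the filename is the one from this repo; any policy file works):

```python
import hashlib
from pathlib import Path

def policy_fingerprint(path: str) -> str:
    """SHA-256 of the policy file; identical bytes => identical policy."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

# Run this in the Isaac Sim export directory, on the MuJoCo test bench,
# and on the robot's onboard computer; all three hashes should match.
# print(policy_fingerprint("GR00T-WholeBodyControl-Balance.onnx"))
```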
## Real-World Applications
This exact stack (GR00T WBC) is used for:
- Warehouse automation — humanoid robots walking, picking, placing in unstructured environments
- Manufacturing — robots that can navigate factory floors, climb stairs, handle objects
- Hazardous environments — inspection in places unsafe for humans (nuclear, disaster zones)
- General-purpose humanoid robotics — NVIDIA’s GR00T project is their bet on foundation models for humanoid control
## Why It Matters Technically
- The policy running here is the same binary (.onnx) that runs on real hardware, not a separate simulation-only build
- Whole-body control with 29 DOFs (legs + arms + torso) is an unsolved hard problem; this is state of the art
- The PD controller, observation space, and action scaling are all tuned to match real actuator dynamics
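That last point, the PD controller, is simple enough to write down. A sketch of the standard per-joint PD law (the gain values here are illustrative defaults, not the repo's tuned ones):

```python
def pd_torques(q_target, q, qd, kp=60.0, kd=2.0):
    """Per-joint PD law: torque = kp * (position error) - kd * (velocity).
    The same math runs on the real motor controllers."""
    return [kp * (qt - qi) - kd * qdi for qt, qi, qdi in zip(q_target, q, qd)]

# At the setpoint with zero velocity the commanded torque is zero:
# pd_torques([0.5], [0.5], [0.0]) -> [0.0]
```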
## What Runs When You Launch the Simulation
Every 0.005 s timestep (200 Hz), the simulation executes this loop:
```
reads sensors: joint positions, velocities, IMU (gravity, angular vel)
      ↓
builds observation vector (86 dims × 6 history frames = 516)
      ↓
feeds into ONNX policy (neural net)
      ↓
target joint angles (15 values)
      ↓
PD controller → joint torques
      ↓
MuJoCo physics engine steps the simulation
      ↓
repeat
```
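Stubbed down to its shape, the loop above fits in a few lines. The dimensions are the ones from this post (86-dim observation, 6-frame history, 15 action values); the policy, sensors, PD controller, and physics step are all placeholders passed in:

```python
from collections import deque

OBS_DIM, HISTORY, ACT_DIM = 86, 6, 15
DT = 0.005  # one control timestep

def dummy_policy(stacked_obs):
    """Stand-in for the ONNX network: 516-dim input, 15 target angles out."""
    assert len(stacked_obs) == OBS_DIM * HISTORY  # 86 * 6 = 516
    return [0.0] * ACT_DIM

# Rolling buffer of the last 6 observation frames
history = deque([[0.0] * OBS_DIM] * HISTORY, maxlen=HISTORY)

def control_step(read_sensors, pd, step_physics):
    obs = read_sensors()                                # joint pos/vel + IMU
    history.append(obs)                                 # push newest frame
    stacked = [x for frame in history for x in frame]   # flatten to 516 dims
    targets = dummy_policy(stacked)                     # 15 target angles
    torques = pd(targets)                               # PD -> joint torques
    step_physics(torques, DT)                           # physics advances
```

The real harness has this same shape, with MuJoCo's step function where `step_physics` is and an ONNX Runtime session where `dummy_policy` is.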
What each piece is:
| Component | What It Does |
|---|---|
| GR00T-WholeBodyControl-Balance.onnx | The trained neural network; the same file you’d load on a real G1 robot |
| g1_gear_wbc.xml | MuJoCo model of the G1 (masses, joint limits, meshes, actuators) standing in for real hardware |
| PD controller | Converts target joint angles to torques; the same math runs on real motor controllers |
| compute_observation | Simulates what real sensors (IMU, joint encoders) would report |
| Keyboard input | Simulates the command interface (joystick/autonomy stack on the real robot) |
MuJoCo replaces the physical robot. Everything else — the policy, the PD controller, the observation pipeline — is identical to what runs on real hardware. If the robot walks here, it has a high chance of walking on the real G1. If it falls here, it would fall in real life too. That’s the whole point: break things in simulation, not on a $50k+ robot.
## Which Policy Is Actually Running
The gait script only loads one policy: GR00T-WholeBodyControl-Balance.onnx — the balance/standing policy. It does not load GR00T-WholeBodyControl-Walk.onnx.
Compare with the other script which loads both:
```python
# run_mujoco_gear_wbc.py
self.policy = self.load_onnx_policy(self.config["policy_path"])            # Balance
self.walk_policy = self.load_onnx_policy(self.config["walk_policy_path"])  # Walk
```
So the gait script runs the Balance policy with gait logic layered on top in Python code. The Walk neural network (GR00T-WholeBodyControl-Walk.onnx) sits unused. These are NVIDIA’s pre-trained policies, trained in Isaac Sim using reinforcement learning on the Unitree G1.
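What "gait logic layered on top in Python" can mean, structurally: the script shapes the commands fed to the Balance policy over time rather than swapping neural networks. A purely hypothetical sketch; the phase oscillator and command names below are mine, not the repo's:

```python
import math

def gait_commands(t, period=0.7, lift_height=0.08):
    """Rhythmic commands for a balance policy, generated in plain Python.
    Hypothetical illustration: the Walk network stays unused; walking-like
    behavior comes from time-varying commands to the Balance policy."""
    phase = (t % period) / period                        # gait phase in [0, 1)
    lift = lift_height * max(0.0, math.sin(2 * math.pi * phase))
    return {"phase": phase, "foot_lift": lift}
```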
## Where Does GR00T Fit In
GR00T = Generalist Robot 00 Technology. It’s NVIDIA’s initiative to build foundation models that can control any humanoid robot. The hierarchy:
```
NVIDIA GR00T (project/platform)
└── GR00T-WholeBodyControl (this repo)
    └── decoupled_wbc (the control framework)
        └── sim2mujoco (what we're running)
```
The full GR00T project has multiple layers:
| Layer | What It Does |
|---|---|
| GR00T Foundation Model | Large multimodal model — understands language and vision, generates robot actions |
| GR00T Whole-Body Control | Locomotion policies — walk, balance, recover from pushes |
| GR00T Dexterity | Hand and manipulation policies |
The .onnx files in this codebase are low-level controllers produced by the GR00T project. They handle: don’t fall over, walk when told to walk, track height/orientation commands.
## The Bigger Picture
In a full GR00T deployment:
```
Human says: "Go pick up that box"
      ↓
GR00T Foundation Model (vision + language → high-level plan)
      ↓
Commands: walk forward, turn left, reach arm...
      ↓
GR00T WholeBodyControl   ← YOU ARE HERE
(balance + locomotion neural net)
      ↓
Joint torques → real robot moves
```
We’re running the legs of the stack. GR00T is the brain that would sit on top, sending the locomotion commands that you currently send with the keyboard.