Teaching a Humanoid Robot to Wave: Custom Motions with GEAR-SONIC
24th February 2026
GEAR-SONIC can track arbitrary motions — not just its built-in locomotion styles. You define joint angles in a CSV, preview them with direct replay in MuJoCo, then deploy through the SONIC neural network. Here’s how the three-stage pipeline works.
Stage 1: Create the Motion
The script create_motion.py lets you define joint angles for each frame at 50 Hz. The Unitree G1 has 29 joints in IsaacLab ordering:
Legs (0-11) → Waist (12-14) → Left Arm (15-21) → Right Arm (22-28)
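The four groups above can be written down as index ranges and sanity-checked (names are mine; the ranges come straight from the ordering above):

```python
# 29 G1 joints in IsaacLab ordering, grouped per the article.
G1_JOINT_GROUPS = {
    "legs": range(0, 12),       # joints 0-11
    "waist": range(12, 15),     # joints 12-14
    "left_arm": range(15, 22),  # joints 15-21
    "right_arm": range(22, 29), # joints 22-28
}

# The groups must tile all 29 joints exactly.
assert sum(len(r) for r in G1_JOINT_GROUPS.values()) == 29
```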
For example, a wave gesture builds up per frame using a smooth envelope that ramps 0→1, holds, then ramps 1→0:
for each frame:
    t = frame / 50                         # time in seconds at 50 Hz
    env = smooth_envelope(t)               # ramps up 0→1, holds, ramps down 1→0
    poses[R_SH_P] = env * -0.8             # shoulder pitch (arm slightly forward)
    poses[R_SH_R] = env * -1.2             # shoulder roll (arm out to the side)
    poses[R_ELB]  = env * 1.5              # elbow bent ~90° (forearm up)
    poses[R_WR_Y] = env * sin(2π · 2.5 · t)  # wrist oscillates side-to-side at 2.5 Hz
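The pseudocode above can be fleshed out into a runnable sketch. The joint indices, envelope timings, and the 0.4 rad wrist amplitude are my assumptions, not the actual values in create_motion.py:

```python
import numpy as np

FPS = 50
NUM_FRAMES = 250   # 5 seconds at 50 Hz
NUM_JOINTS = 29
# Hypothetical IsaacLab indices within the right-arm block (22-28).
R_SH_P, R_SH_R, R_ELB, R_WR_Y = 22, 23, 25, 28

def smooth_envelope(t, ramp=1.0, hold=3.0):
    """Ramp 0→1 over `ramp` seconds, hold at 1, then ramp back down to 0."""
    total = 2 * ramp + hold
    if t < ramp:
        return t / ramp
    if t < ramp + hold:
        return 1.0
    if t < total:
        return (total - t) / ramp
    return 0.0

poses = np.zeros((NUM_FRAMES, NUM_JOINTS))
for frame in range(NUM_FRAMES):
    t = frame / FPS
    env = smooth_envelope(t)
    poses[frame, R_SH_P] = env * -0.8                          # arm slightly forward
    poses[frame, R_SH_R] = env * -1.2                          # arm out to the side
    poses[frame, R_ELB]  = env * 1.5                           # elbow bent ~90°
    poses[frame, R_WR_Y] = env * 0.4 * np.sin(2 * np.pi * 2.5 * t)  # 2.5 Hz wave
```

Because the envelope starts and ends at zero, the first and last frames blend smoothly with a neutral standing pose.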
This outputs six CSV files plus a metadata file:
wave/
├── joint_pos.csv (250 × 29) — joint angles per frame
├── joint_vel.csv (250 × 29) — finite-difference velocities
├── body_pos.csv (250 × 42) — 14 body positions (x,y,z)
├── body_quat.csv (250 × 56) — 14 body orientations (w,x,y,z)
├── body_lin_vel.csv (250 × 42) — body linear velocities
├── body_ang_vel.csv (250 × 42) — body angular velocities
└── metadata.txt — body-part indices, frame count
250 frames at 50 Hz gives a 5-second motion clip. Every quantity the SONIC encoder needs — joint positions, velocities, body orientations, angular velocities — is precomputed and written to disk.
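One of those precomputed quantities, joint_vel.csv, can be derived from joint_pos.csv by finite differences at 50 Hz. A minimal sketch, assuming a central-difference scheme (the exact scheme in create_motion.py may differ) and using random placeholder data:

```python
import numpy as np

DT = 1.0 / 50.0  # 50 Hz frame spacing

def finite_difference(positions: np.ndarray, dt: float = DT) -> np.ndarray:
    """Central differences in the interior, one-sided at the endpoints."""
    return np.gradient(positions, dt, axis=0)

# Placeholder joint angles standing in for the real (250 x 29) motion.
joint_pos = np.random.default_rng(0).normal(size=(250, 29))
joint_vel = finite_difference(joint_pos)

np.savetxt("joint_pos.csv", joint_pos, delimiter=",")
np.savetxt("joint_vel.csv", joint_vel, delimiter=",")
```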
Stage 2: Visualize the Motion
Before involving any neural network, you verify the motion with direct replay:
python visualize_motion.py --motion_dir wave/
This reads joint_pos.csv, reorders the joints from IsaacLab ordering to MuJoCo ordering, sets qpos directly, and renders. No policy, no controller — just raw joint angles applied frame by frame.
The reordering is necessary because IsaacLab and MuJoCo number joints differently:
isaaclab_to_mujoco = [0,3,6,9,13,17, 1,4,7,10,14,18, 2,5,8,11,15,19, ...]
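Applying the mapping is a single fancy-indexing operation per frame. A toy 6-joint sketch (the real 29-entry table lives in visualize_motion.py; I'm assuming `isaaclab_to_mujoco[i]` holds the IsaacLab index of the joint at MuJoCo position i):

```python
import numpy as np

# Made-up 6-joint permutation, standing in for the full 29-entry table.
isaaclab_to_mujoco = [0, 3, 1, 4, 2, 5]

def reorder(frame_isaaclab: np.ndarray, mapping) -> np.ndarray:
    """Permute one frame of joint angles from IsaacLab order into MuJoCo order."""
    return frame_isaaclab[np.asarray(mapping)]

frame = np.arange(6, dtype=float)  # IsaacLab-ordered angles 0..5
mujoco_frame = reorder(frame, isaaclab_to_mujoco)
# mujoco_frame is [0., 3., 1., 4., 2., 5.]
```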
What you see here is the ground truth — exactly the motion you defined. If it looks wrong at this stage, the CSV needs fixing before going further.
Stage 3: Deploy Through SONIC
This is the full pipeline with the neural network. Two terminals:
# Terminal 1: MuJoCo physics simulation
python gear_sonic/scripts/run_sim_loop.py
# Terminal 2: SONIC C++ inference with your motion data
bash deploy.sh --motion-data wave/ sim
Here’s what happens inside the SONIC C++ binary every 20ms (50 Hz):
┌──────────────────────────────────────────────────────┐
│ 1. Read CSV files (joint_pos, body_quat, etc.) │
│ │
│ 2. Every 20ms (50Hz): │
│ ┌──────────────────────────────────────────┐ │
│ │ Planner assembles reference: │ │
│ │ - next motion reference frames from CSV │ │
│ │ - current robot joint positions │ │
│ │ - current robot joint velocities │ │
│ │ - body angular velocity │ │
│ │ - last actions │ │
│ └──────────────────┬───────────────────────┘ │
│ ↓ │
│ ┌──────────────────────────────────────────┐ │
│ │ Encoder (TensorRT GPU): │ │
│ │ 290D motion joints + velocities │ │
│ │ → 64D latent token │ │
│ └──────────────────┬───────────────────────┘ │
│ ↓ │
│ ┌──────────────────────────────────────────┐ │
│ │ Decoder (TensorRT GPU): │ │
│ │ latent + robot state → 29 joint targets │ │
│ └──────────────────┬───────────────────────┘ │
│ ↓ │
│ Send motor commands via DDS │
└──────────────────────────────────────────────────────┘
↓
┌──────────────────────────────────────────────────────┐
│ MuJoCo (Terminal 1): │
│ Receive motor commands (kp, kd, q_target) │
│ torque = kp*(q_target - q) + kd*(dq_target - dq) │
│ Step physics → render │
└──────────────────────────────────────────────────────┘
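The PD law in the final box takes only a few lines. The gains here are placeholders, not the values deployed on the G1:

```python
import numpy as np

def pd_torque(q, dq, q_target, dq_target, kp, kd):
    """Per-joint PD control: torque = kp*(q_target - q) + kd*(dq_target - dq)."""
    return kp * (q_target - q) + kd * (dq_target - dq)

q         = np.zeros(29)          # current joint positions
dq        = np.zeros(29)          # current joint velocities
q_target  = np.full(29, 0.1)      # targets from the SONIC decoder
dq_target = np.zeros(29)
tau = pd_torque(q, dq, q_target, dq_target, kp=100.0, kd=2.0)
# tau = 100 * 0.1 = 10.0 Nm for every joint
```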
Direct Replay vs Neural Network
This is the critical distinction: SONIC does not directly replay your CSV. It uses your motion as a hint to a trained neural network. The encoder compresses the motion reference into a 64-dimensional latent token. The decoder combines that latent with the robot’s actual physical state to produce joint targets that are physically feasible — respecting balance, joint limits, and contact dynamics.
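The shape flow, stripped of the learned weights, looks like this. The real encoder and decoder are TensorRT engines; these random matrices only illustrate the 290 → 64 → 29 dimensionality, and the 93D robot-state size is my assumption:

```python
import numpy as np

rng = np.random.default_rng(0)
W_enc = rng.normal(size=(64, 290))      # encoder: motion reference → latent
W_dec = rng.normal(size=(29, 64 + 93))  # decoder: latent + robot state → targets

motion_ref  = rng.normal(size=290)      # reference joints + velocities from CSV
latent      = W_enc @ motion_ref        # 64D latent motion token
robot_state = rng.normal(size=93)       # actual joint positions, velocities, etc.
joint_targets = W_dec @ np.concatenate([latent, robot_state])  # 29 joint targets
```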
The policy decides how closely to follow your motion. It was trained primarily on locomotion data, so:
- Leg tracking — accurate, the policy has seen thousands of walking/running examples
- Arm gestures — may be attenuated, the policy is conservative about upper-body motions it hasn’t seen during training
This is a feature, not a bug. The neural network keeps the robot balanced even if your motion CSV would cause it to fall over with direct replay.
Summary
| Stage | What It Does | Command |
|---|---|---|
| Create | Define joint angles at 50 Hz, output CSVs | python create_motion.py |
| Verify | Direct replay in MuJoCo — no neural net | python visualize_motion.py --motion_dir wave/ |
| Deploy | Through SONIC neural network | bash deploy.sh --motion-data wave/ sim |
Create defines the intent. Verify confirms the intent looks right. Deploy runs it through a physics-aware neural network that keeps the robot on its feet.
Links
- Sim2Sim repo: github.com/avparkhi/GEAR-SONIC-Sim2Sim
- SONIC paper: arxiv.org/abs/2511.07820
- Model weights: huggingface.co/nvidia/GEAR-SONIC