| id | title | status | source_sections | related_topics | key_equations | key_terms | images | examples | open_questions |
|---|---|---|---|---|---|---|---|---|---|
| whole-body-control | Whole-Body Control | established | reference/sources/paper-groot-wbc.md, reference/sources/paper-h2o.md, reference/sources/paper-omnih2o.md, reference/sources/paper-humanplus.md, reference/sources/paper-softa.md, reference/sources/github-groot-wbc.md, reference/sources/github-pinocchio.md | [locomotion-control manipulation motion-retargeting push-recovery-balance learning-and-ai joint-configuration] | [com zmp inverse_dynamics] | [whole_body_control task_space_inverse_dynamics operational_space_control centroidal_dynamics qp_solver groot_wbc residual_policy h2o omnih2o pinocchio] | [] | [] | [Can GR00T-WBC run at 500 Hz on the Jetson Orin NX? Does the stock locomotion computer expose a low-level override interface? What is the practical latency penalty of the overlay (residual) approach vs. full replacement?] |
Whole-Body Control
Frameworks and architectures for coordinating balance, locomotion, and upper-body motion simultaneously. This is the unifying layer that enables motion capture playback with robust balance.
1. What Is Whole-Body Control?
Whole-body control (WBC) treats the entire robot as a single coordinated system rather than controlling legs (balance) and arms (task) independently. A WBC controller solves for all joint commands simultaneously, subject to:
- Task objectives: Track a desired upper-body trajectory (e.g., from mocap)
- Balance constraints: Keep the center of mass (CoM) within the support polygon
- Physical limits: Joint position/velocity/torque limits, self-collision avoidance
- Contact constraints: Maintain foot contact, manage ground reaction forces
The key insight: balance is formulated as a constraint, not a separate controller. This lets the robot move its arms freely while the optimizer automatically adjusts the legs to stay stable. [T1 — Established robotics paradigm]
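To make "balance as a constraint" concrete, here is a minimal sketch (pure Python, all geometry illustrative, not real G1 foot dimensions) of the support-polygon membership test that a WBC optimizer enforces on the CoM ground projection:

```python
def com_inside_support_polygon(com_xy, polygon):
    """Ray-casting test: is the CoM ground projection inside the support polygon?

    com_xy: (x, y) projection of the center of mass onto the ground plane.
    polygon: list of (x, y) vertices of the support polygon, in order.
    """
    x, y = com_xy
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Count crossings of a horizontal ray from (x, y) toward +x
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Double-support polygon spanning both feet (meters, illustrative numbers)
feet = [(-0.10, -0.15), (0.10, -0.15), (0.10, 0.15), (-0.10, 0.15)]
print(com_inside_support_polygon((0.0, 0.0), feet))   # True: CoM centered
print(com_inside_support_polygon((0.25, 0.0), feet))  # False: CoM far forward
```

A real WBC formulation enforces this as a linear constraint inside the optimizer rather than a post-hoc check, but the geometric condition is the same.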
2. The G1 Architectural Constraint
The G1 has a dual-computer architecture (see locomotion-control): [T0]
┌─────────────────────────┐ ┌─────────────────────────┐
│ Locomotion Computer │ │ Development Computer │
│ 192.168.123.161 │ │ 192.168.123.164 │
│ (proprietary, locked) │◄───►│ Jetson Orin NX 16GB │
│ │ DDS │ (user-accessible) │
│ Stock RL balance policy │ │ Custom code runs here │
└─────────────────────────┘ └─────────────────────────┘
This creates two fundamentally different WBC approaches:
Approach A: Overlay (Residual) — Safer
- Keep the stock locomotion controller running on the locomotion computer
- Send high-level commands (velocity, posture) via the sport mode API
- Add upper-body joint commands for individual arm joints via `rt/lowcmd`
- The stock controller handles balance; you only control the upper body
- Pro: Low risk, stock balance is well-tuned
- Con: Limited authority — can't deeply coordinate leg and arm motions; the stock controller may fight your arm commands if they shift the CoM significantly
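A hedged sketch of the command shaping an overlay loop would apply before publishing, with hypothetical joint names and illustrative limits (the real loop would publish the result on `rt/lowcmd` via unitree_sdk2py):

```python
# Sketch of the overlay approach's command assembly (all names hypothetical;
# limits are illustrative, not official G1 values).
ARM_JOINT_LIMITS = {
    "left_shoulder_pitch": (-3.0, 2.6),
    "left_elbow": (-1.0, 2.1),
}

def clamp_arm_targets(targets, limits=ARM_JOINT_LIMITS, max_step=0.05, current=None):
    """Clamp mocap arm targets to joint limits and a per-tick rate limit.

    Rate limiting keeps large mocap jumps from shifting the CoM faster
    than the stock balance controller can compensate for.
    """
    safe = {}
    for name, q_des in targets.items():
        lo, hi = limits[name]
        q = min(max(q_des, lo), hi)            # hard position limit
        if current is not None:                # per-tick rate limit
            q_now = current[name]
            q = q_now + min(max(q - q_now, -max_step), max_step)
        safe[name] = q
    return safe

cmd = clamp_arm_targets(
    {"left_shoulder_pitch": 5.0, "left_elbow": 0.3},
    current={"left_shoulder_pitch": 0.0, "left_elbow": 0.0},
)
print(cmd)  # both joints limited to a 0.05 rad step this tick
```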
Approach B: Full Replacement — Maximum Control
- Bypass the stock controller entirely
- Send raw joint commands to ALL joints (legs + arms + waist) via `rt/lowcmd` at 500 Hz
- Must implement your own balance and locomotion from scratch
- Pro: Full authority over all joints, true whole-body optimization
- Con: High risk of falls, requires validated balance policy, significant development effort
Approach C: GR00T-WBC Framework — Best of Both (Recommended)
- Uses a trained RL locomotion policy for lower body (replaceable, not the stock one)
- Provides a separate interface for upper-body control
- Coordinates both through a unified framework
- Pro: Validated on G1, open-source, designed for this exact use case
- Con: Requires training a new locomotion policy (but provides tools to do so)
[T1 — Confirmed from developer documentation and GR00T-WBC architecture]
3. GR00T-WholeBodyControl (NVIDIA)
The most G1-relevant WBC framework. Open-source, designed specifically for Unitree humanoids. [T1]
| Property | Value |
|---|---|
| Repository | NVIDIA-Omniverse/gr00t-wbc (GitHub) |
| License | Apache 2.0 |
| Target robots | Unitree G1, H1 |
| Integration | LeRobot (HuggingFace), Isaac Lab |
| Architecture | Decoupled locomotion (RL) + upper-body (task policy) |
| Deployment | unitree_sdk2_python on Jetson Orin |
Architecture
┌──────────────────┐
│ Task Policy │ (mocap tracking, manipulation, etc.)
│ (upper body) │
└────────┬─────────┘
│ desired upper-body joints
┌────────▼─────────┐
│ WBC Coordinator │ ← balance constraints
│ (optimization) │ ← joint limits
└────────┬─────────┘
│
┌──────────────────┼──────────────────┐
│ │ │
┌──────▼──────┐ ┌──────▼──────┐ ┌───────▼──────┐
│ Locomotion │ │ Arm Joints │ │ Waist Joint │
│ Policy (RL) │ │ │ │ │
│ (lower body) │ │ │ │ │
└─────────────┘ └─────────────┘ └──────────────┘
Key Features
- Locomotion policy: RL-trained, handles balance and walking. Can be retrained with perturbation robustness.
- Upper-body interface: Accepts target joint positions for arms/waist from any source (mocap, learned policy, teleoperation)
- LeRobot integration: Data collection via teleoperation → behavior cloning → deployment, all within the GR00T-WBC framework
- Sim-to-real: Trained in Isaac Lab, deployed on real G1 via unitree_sdk2
- G1 configs: Supports 29-DOF and 23-DOF variants
Why This Matters for Mocap + Balance
GR00T-WBC is the most direct path to the user's goal: the locomotion policy maintains balance (including push recovery if trained with perturbations) while the upper body tracks mocap reference trajectories. The two are coordinated through the WBC layer.
Deployment on Dell Pro Max GB10 — Verified (2026-02-14) [T1]
GR00T-WBC has been successfully deployed and tested on the Dell Pro Max GB10 (NVIDIA Grace Blackwell, aarch64, Ubuntu 24.04). Key findings:
Pre-trained ONNX Policies:
- `GR00T-WholeBodyControl-Balance.onnx` — standing balance (15 lower-body joint targets)
- `GR00T-WholeBodyControl-Walk.onnx` — locomotion with velocity commands
- Both: 516-dim observation → 15-dim action. Pre-trained by NVIDIA (training code not open-sourced).
- Training: PPO via RSL-RL in Isaac Lab, domain randomization, zero-shot sim-to-real. Exact reward function and perturbation curriculum not published.
Performance on GB10:
- ~3.5 ms per control loop iteration at 50 Hz (sync mode) — only 17.5% of time budget
- 401% CPU usage (4 cores) — MuJoCo physics dominates
- Both Balance and Walk policies load and execute successfully
- Robot walks, turns, and strafes in simulation via keyboard velocity commands
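The 17.5% figure is simple budget arithmetic; a small helper (hypothetical, not part of GR00T-WBC) makes it explicit and shows why the same loop time would not fit the 500 Hz low-level rate:

```python
def budget_utilization(loop_ms, rate_hz):
    """Percentage of one control period consumed by a loop iteration."""
    period_ms = 1000.0 / rate_hz
    return 100.0 * loop_ms / period_ms

# GB10 measurement from above: 3.5 ms per iteration at 50 Hz
print(budget_utilization(3.5, 50))   # 17.5 (% of the 20 ms period)
# The same 3.5 ms loop at the 500 Hz low-level rate would overrun its 2 ms budget:
print(budget_utilization(3.5, 500))  # 175.0
```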
Critical Fixes Required for GB10 (aarch64):
- CycloneDDS buffer overflow: The `<Tracing>` XML section in `unitree_sdk2py/core/channel_config.py` triggers a glibc FORTIFY_SOURCE buffer overflow on aarch64. Fix: remove the `<Tracing>` section entirely. (See dev-environment for patch details.)
- ROS2 Python path: the venv needs a `.pth` file pointing to `/opt/ros/jazzy/lib/python3.12/site-packages/`
- ROS2 shared libraries: `export LD_LIBRARY_PATH=/opt/ros/jazzy/lib:$LD_LIBRARY_PATH`
- Sync mode bug: `run_g1_control_loop.py` checks for a sim thread in sync mode where none exists. Patch: add a `not config.sim_sync_mode` guard.
Keyboard Control:
- The `sshkeyboard` library fails in remote terminals (SSH, NoMachine). Workaround: use `--keyboard-dispatcher-type ros` and publish to the `/keyboard_input` ROS topic from a separate process.
- Keys: `]` = enable Walk, `w`/`s` = fwd/back, `a`/`d` = strafe, `q`/`e` = rotate, `z` = stop, `backspace` = reset
Visualization:
- GLFW passive viewer freezes on virtual/remote displays (Xvfb, NoMachine) after a few seconds
- VNC (x11vnc) cannot capture OpenGL framebuffer updates
- Partial solution: NoMachine virtual desktop (NX protocol) — the viewer works at first, but GLFW eventually stalls
- Best solution: Web-based MJPEG streaming via MuJoCo offscreen renderer (bypasses all X11/GLFW issues)
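As a sketch of how such a stream is framed, the helper below (hypothetical, stdlib-only; a real server would pull JPEG frames from MuJoCo's offscreen renderer) builds one part of a `multipart/x-mixed-replace` MJPEG response:

```python
BOUNDARY = b"frame"

def mjpeg_part(jpeg_bytes):
    """Wrap one JPEG frame as a multipart/x-mixed-replace part.

    A browser pointed at a handler that streams these parts back-to-back
    (response header: Content-Type: multipart/x-mixed-replace; boundary=frame)
    renders them as live video, with no X11/GLFW involvement.
    """
    return (
        b"--" + BOUNDARY + b"\r\n"
        + b"Content-Type: image/jpeg\r\n"
        + b"Content-Length: " + str(len(jpeg_bytes)).encode() + b"\r\n\r\n"
        + jpeg_bytes + b"\r\n"
    )

fake_frame = b"\xff\xd8...jpeg data...\xff\xd9"  # stand-in for renderer output
part = mjpeg_part(fake_frame)
print(part.startswith(b"--frame\r\n"))  # True
```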
4. Pinocchio + TSID (Model-Based WBC)
An alternative to RL-based WBC using classical optimization. [T1 — Established framework]
| Property | Value |
|---|---|
| Library | Pinocchio (stack-of-tasks/pinocchio) |
| Language | C++ with Python bindings |
| License | BSD-2-Clause |
| Key capability | Rigid body dynamics, forward/inverse kinematics, Jacobians, dynamics |
Task-Space Inverse Dynamics (TSID)
Pinocchio + TSID solves a QP at each timestep:
minimize || J_task * qdd + Jdot_task * qd - xdd_task_desired ||^2 (track task-space acceleration)
subject to CoM ∈ support polygon (balance)
q_min ≤ q ≤ q_max (joint limits)
tau_min ≤ tau ≤ tau_max (torque limits)
contact constraints (feet on ground)
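In the single-joint case (J_task = 1, no velocity-product term) the QP above has a closed-form solution; the sketch below works it through under assumed dynamics parameters to show how torque limits reshape the tracked acceleration:

```python
def tsid_1dof(qdd_des, m, h, tau_min, tau_max):
    """Closed-form solution of the 1-DOF version of the TSID QP.

    Dynamics: tau = m * qdd + h, where h is the gravity/Coriolis bias.
    minimize (qdd - qdd_des)^2  subject to  tau_min <= tau <= tau_max
    With one joint, the QP reduces to clamping qdd to the
    torque-feasible interval [(tau_min - h)/m, (tau_max - h)/m].
    """
    qdd_lo = (tau_min - h) / m
    qdd_hi = (tau_max - h) / m
    qdd = min(max(qdd_des, qdd_lo), qdd_hi)
    return qdd, m * qdd + h

# Desired acceleration is torque-infeasible, so the optimizer saturates torque:
qdd, tau = tsid_1dof(qdd_des=50.0, m=2.0, h=5.0, tau_min=-40.0, tau_max=40.0)
print(qdd, tau)  # 17.5 40.0
```

The full 29-DOF problem adds the CoM and contact constraints and requires a numerical QP solver, but the structure (quadratic tracking cost, linear limits) is identical.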
- Pros: Interpretable, respects physics exactly, no training required
- Cons: Requires accurate dynamics model (masses, inertias), computationally heavier than RL at runtime, less robust to model errors
- G1 compatibility: Needs URDF with accurate dynamics. MuJoCo Menagerie model or unitree_ros URDF provide this.
Use Cases for G1
- Offline trajectory optimization (plan mocap-feasible trajectories ahead of time)
- Real-time WBC if dynamics model is accurate enough
- Validation tool: check if a retargeted motion is physically feasible before executing
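A minimal sketch of that validation step for a single joint, with illustrative limits (a full checker would iterate over all 29 DOF and also test torque feasibility against the dynamics model):

```python
def check_trajectory(traj, dt, q_limits, qd_max):
    """Flag samples of a retargeted trajectory that violate joint limits.

    traj: positions (rad) for one joint, sampled every dt seconds.
    q_limits: (q_min, q_max) position limits; qd_max: velocity limit (rad/s).
    Returns a list of (sample_index, reason); empty means feasible.
    """
    q_min, q_max = q_limits
    violations = []
    for i, q in enumerate(traj):
        if not q_min <= q <= q_max:
            violations.append((i, "position"))
        # Finite-difference velocity between consecutive samples
        if i > 0 and abs(q - traj[i - 1]) / dt > qd_max:
            violations.append((i, "velocity"))
    return violations

# A mocap jump between samples 2 and 3 exceeds the velocity limit:
traj = [0.0, 0.05, 0.10, 0.80, 0.85]
print(check_trajectory(traj, dt=0.02, q_limits=(-1.0, 1.0), qd_max=10.0))
# → [(3, 'velocity')]  (0.70 rad in 20 ms = 35 rad/s)
```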
5. RL-Based WBC Approaches
SoFTA — Slow-Fast Two-Agent RL
Decouples whole-body control into two agents operating at different frequencies: [T1 — Research]
- Slow agent (lower body): Locomotion at standard control frequency, handles balance and walking
- Fast agent (upper body): Manipulation at higher frequency for precision tasks
- Key insight: upper body needs faster updates for manipulation precision; lower body is slower but more stable
H2O — Human-to-Humanoid Real-Time Teleoperation
(arXiv:2403.01623) [T1 — Validated on humanoid hardware]
- Real-time human motion retargeting to humanoid robot
- RL policy trained to imitate human demonstrations while maintaining balance
- Demonstrated whole-body teleoperation including walking + arm motion
- Relevant to G1: proves the combined mocap + balance paradigm works
OmniH2O — Universal Teleoperation and Autonomy
(arXiv:2406.08858) [T1]
- Extends H2O with multiple input modalities (VR, RGB camera, motion capture)
- Trains a universal policy that generalizes across different human operators
- Supports both teleoperation (real-time) and autonomous replay (offline)
- Directly relevant: could drive G1 from mocap recordings
HumanPlus — Humanoid Shadowing and Imitation
(arXiv:2406.10454) [T1]
- "Shadow mode": real-time mimicry of human motion using RGB camera
- Collects demonstrations during shadowing, then trains autonomous policy via imitation learning
- Complete pipeline from human motion → robot imitation → autonomous execution
- Validated on full-size humanoid with walking + manipulation
6. WBC Implementation Considerations for G1
Compute Budget
- Jetson Orin NX: 100 TOPS (AI), 8-core ARM CPU
- RL policy inference: typically < 1ms per step
- QP-based TSID: typically 1-5ms per step (depends on DOF count)
- 500 Hz control loop = 2ms budget per step
- RL approach fits comfortably; QP-based requires careful optimization
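A small sketch of the deadline accounting this implies, using synthetic step times rather than real measurements (the helper name is hypothetical):

```python
def deadline_report(step_times_ms, rate_hz):
    """Summarize control-loop deadline behavior from measured step times."""
    budget_ms = 1000.0 / rate_hz
    overruns = [t for t in step_times_ms if t > budget_ms]
    return {
        "budget_ms": budget_ms,
        "worst_ms": max(step_times_ms),
        "overruns": len(overruns),
    }

# RL inference (<1 ms) fits the 2 ms budget at 500 Hz; a 5 ms QP spike does not.
report = deadline_report([0.8, 0.9, 5.0, 0.7], rate_hz=500)
print(report)  # {'budget_ms': 2.0, 'worst_ms': 5.0, 'overruns': 1}
```

Worst-case step time, not the average, is what matters: a single overrun at 500 Hz means a missed command tick on the real robot.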
Sensor Requirements
- IMU (onboard): orientation, angular velocity — available
- Joint encoders (onboard): position, velocity — available at 500 Hz
- Foot contact sensing: NOT standard on G1 — must infer from joint torques or add external sensors [T3]
- External force estimation: possible from IMU + dynamics model, or add force/torque sensors [T3]
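One common workaround for the missing foot contact sensing is inferring contact from joint torque magnitude; the sketch below uses a hysteresis threshold on knee torque (thresholds illustrative, not calibrated for the G1):

```python
def update_contact(in_contact, knee_torque_abs, t_on=25.0, t_off=10.0):
    """Hysteresis-based foot contact estimate from knee torque magnitude.

    Thresholds (N*m) are illustrative, not calibrated G1 values. Using
    t_on > t_off prevents chattering when torque hovers near one threshold.
    """
    if not in_contact and knee_torque_abs > t_on:
        return True   # touchdown: torque rose above the high threshold
    if in_contact and knee_torque_abs < t_off:
        return False  # liftoff: torque fell below the low threshold
    return in_contact

states = []
state = False
for tau in [5.0, 30.0, 18.0, 8.0]:  # swing -> touchdown -> stance -> liftoff
    state = update_contact(state, tau)
    states.append(state)
print(states)  # [False, True, True, False]
```

Note that 18 N*m keeps the stance estimate even though it is below `t_on`; that is the hysteresis doing its job.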
Communication Path
Jetson Orin (user code) ──DDS──► rt/lowcmd ──► Joint Actuators
◄──DDS── rt/lowstate ◄── Joint Encoders + IMU
- Latency: ~2ms DDS round trip [T1]
- Frequency: 500 Hz control loop [T0]
- Both overlay and replacement approaches use this same DDS path
Key Relationships
- Builds on: locomotion-control (balance as a component of WBC)
- Enables: motion-retargeting (WBC provides the balance guarantee during mocap playback)
- Enables: push-recovery-balance (WBC can incorporate perturbation robustness)
- Uses: joint-configuration (all joints coordinated as one system)
- Uses: sensors-perception (IMU + encoders for state estimation)
- Trained via: learning-and-ai (RL training for locomotion component)
- Bounded by: equations-and-bounds (CoM, ZMP, joint limits)