---
id: locomotion-control
title: Locomotion & Balance Control
status: established
source_sections:
  - reference/sources/paper-gait-conditioned-rl.md
  - reference/sources/paper-getting-up-policies.md
  - reference/sources/official-product-page.md
related_topics: [joint-configuration, sensors-perception, equations-and-bounds, learning-and-ai, safety-limits, whole-body-control, push-recovery-balance, motion-retargeting]
key_equations: [zmp, com, inverse_dynamics]
key_terms: [gait, state_estimation, gait_conditioned_rl, curriculum_learning, sim_to_real]
images: []
examples: []
open_questions:
  - Exact RL policy observation/action space dimensions
  - How to replace the stock locomotion policy with a custom one
  - Stair climbing capability and limits
  - Running gait availability (H1-2 can run at 3.3 m/s — can G1?)
---

Locomotion & Balance Control

Walking, balance, gait generation, and whole-body control for bipedal locomotion.

1. Control Architecture

The G1 uses a reinforcement-learning-based locomotion controller running on the proprietary locomotion computer. Users interact with it via high-level commands; the low-level balance and gait control is handled internally. [T1 — Confirmed from RL papers and developer docs]

User Commands (high-level API)
        │
        ▼
┌──────────────────────────┐
│  Locomotion Computer     │  (192.168.123.161, proprietary)
│                          │
│  RL Policy (gait-        │  ← IMU, joint encoders (500 Hz)
│  conditioned, multi-     │
│  phase curriculum)       │
│                          │
│  Motor Commands ─────────┼──→ Joint Actuators
└──────────────────────────┘

Key Architecture Details

  • Framework: Gait-conditioned reinforcement learning with multi-phase curriculum (arXiv:2505.20619) [T1]
  • Gait switching: One-hot gait ID enables dynamic switching between gaits [T1]
  • Reward design: Gait-specific reward routing mechanism with biomechanically inspired shaping [T1]
  • Training: Policies trained in simulation (Isaac Gym / MuJoCo), transferred to physical hardware [T1]
  • Biomechanical features: Straight-knee stance promotion, coordinated arm-leg swing, natural motion without motion capture data [T1]
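To make the one-hot gait conditioning concrete, here is a minimal sketch of how a gait ID can be appended to a proprioceptive observation. The gait vocabulary, observation layout, and dimensions are illustrative assumptions — the actual observation/action space of the G1 policy is an open question (see frontmatter).

```python
GAITS = ["stand", "walk"]  # illustrative vocabulary; the real gait set is not published

def build_observation(imu, joint_pos, joint_vel, gait):
    """Concatenate proprioception with a one-hot gait ID.

    A gait-conditioned policy receives the commanded gait as part of its
    observation, so a single network can switch gaits at run time
    (the mechanism described in arXiv:2505.20619).
    """
    one_hot = [1.0 if g == gait else 0.0 for g in GAITS]
    return list(imu) + list(joint_pos) + list(joint_vel) + one_hot

# 6-D IMU + 12 joint positions + 12 joint velocities + 2-way gait ID = 32-D
obs = build_observation([0.0] * 6, [0.0] * 12, [0.0] * 12, "walk")
```

Changing only the final one-hot entries switches the commanded gait without retraining or swapping networks.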

2. Gait Modes

| Mode | Description | Verified | Tier |
|------|-------------|----------|------|
| Standing | Static balance, both feet grounded | Yes | T1 |
| Walking | Dynamic bipedal walking | Yes | T1 |
| Walk-to-stand | Smooth transition from walking to standing | Yes | T1 |
| Stand-to-walk | Smooth transition from standing to walking | Yes | T1 |

[T1 — Validated in arXiv:2505.20619 on real G1 hardware]
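The four verified modes above form a small state machine. A minimal sketch, assuming transitions other than stand↔walk are disallowed (the controller's internal state machine is not documented):

```python
# Allowed mode transitions: the walk-to-stand and stand-to-walk rows above.
TRANSITIONS = {
    "standing": {"walking"},
    "walking": {"standing"},
}

def request_transition(current, target):
    """Return the new mode, rejecting undefined transitions."""
    if target == current:
        return current  # no-op request
    if target not in TRANSITIONS.get(current, set()):
        raise ValueError(f"no transition {current} -> {target}")
    return target

mode = request_transition("standing", "walking")
```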

3. Performance

| Metric | Value | Notes | Tier |
|--------|-------|-------|------|
| Maximum walking speed | 2.0 m/s | 7.2 km/h | T0 |
| Verified terrain | Tile, concrete, carpet | Office-environment surfaces | T1 |
| Balance recovery | Light push recovery | Stable recovery from perturbations | T1 |
| Gait transition | Smooth | No abrupt mode switches | T1 |

For comparison, the H1-2 (larger Unitree humanoid) achieves 3.3 m/s running. Whether the G1 has a running gait is unconfirmed. [T3]

4. Balance Control

The RL-based locomotion policy implicitly handles balance through learned behavior rather than explicit ZMP or capture-point controllers: [T1]

  • Inputs: IMU data (orientation, angular velocity), joint encoder feedback (position, velocity), gait command
  • Outputs: Target joint positions/torques for all leg joints
  • Rate: 500 Hz control loop
  • Learned behaviors: Center-of-mass tracking, foot placement, push recovery, arm counterbalancing

While classical bipedal control uses explicit ZMP constraints (see equations-and-bounds), the G1's RL policy learns these constraints implicitly during training.
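For reference, the classical ZMP quantity the policy learns to respect implicitly can be computed in one line under the linear inverted-pendulum assumption. This is a textbook sketch, not code from the G1's controller:

```python
G = 9.81  # gravitational acceleration, m/s^2

def zmp_x(com_x, com_z, com_accel_x, g=G):
    """Zero-moment point (x component) under the linear inverted-pendulum
    model: p_x = x_c - (z_c / g) * x_c_ddot.

    A classical controller explicitly keeps p_x inside the support polygon;
    the G1's RL policy satisfies this constraint implicitly via training.
    """
    return com_x - (com_z / g) * com_accel_x

# CoM at 0.6 m height accelerating forward at 1.0 m/s^2 shifts the ZMP backward:
p = zmp_x(0.0, 0.6, 1.0)
```

With zero CoM acceleration the ZMP coincides with the CoM ground projection, which is the static-balance case of the Standing mode.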

For deep coverage of enhanced push recovery, perturbation training, and always-on balance architectures, see push-recovery-balance.

5. Fall Recovery

Multiple research approaches have been validated on the G1: [T1 — Research papers]

  • Two-stage RL: Supine and prone recovery policies (arXiv:2502.12152) — overcome limitations of hand-crafted controllers
  • HoST framework: Multi-critic RL with curriculum training for diverse posture recovery (arXiv:2502.08378)
  • Unified fall-safety: Combined fall prevention + impact mitigation + recovery from sparse demonstrations (arXiv:2511.07407) — zero-shot sim-to-real transfer

6. Terrain Adaptation

| Terrain Type | Status | Notes | Tier |
|--------------|--------|-------|------|
| Flat tile | Verified | Standard office floor | T1 |
| Concrete | Verified | Indoor/outdoor flat surfaces | T1 |
| Carpet | Verified | Standard office carpet | T1 |
| Stairs | Unconfirmed | Research papers suggest capability | T4 |
| Rough terrain | Sim only | Trained in sim, real-world unconfirmed | T3 |
| Slopes | Unconfirmed | | T4 |

7. User Control Interface

Users control locomotion through the high-level sport mode API: [T0]

  • Velocity commands: Set forward/lateral velocity and yaw rate
  • Posture commands: Stand, sit, lie down
  • Attitude adjustment: Modify body orientation
  • Trajectory tracking: Follow waypoint sequences

Low-level joint control is also possible (bypassing the locomotion controller) but requires the user to implement their own balance control. This is advanced and carries significant fall risk. [T2]
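Whichever interface is used, it is prudent to clamp velocity commands client-side before sending them. In this sketch only the 2.0 m/s forward limit comes from the spec (Section 3); the lateral and yaw limits are placeholder assumptions, and the function stands in for whatever API call actually transmits the command:

```python
# Documented forward limit plus assumed lateral/yaw limits (placeholders).
LIMITS = {"vx": 2.0, "vy": 0.5, "vyaw": 1.0}

def clamp_velocity_command(vx, vy, vyaw):
    """Saturate a (forward, lateral, yaw-rate) command to safe bounds
    before handing it to the high-level locomotion API."""
    clamp = lambda v, lim: max(-lim, min(lim, v))
    return (clamp(vx, LIMITS["vx"]),
            clamp(vy, LIMITS["vy"]),
            clamp(vyaw, LIMITS["vyaw"]))

cmd = clamp_velocity_command(3.5, 0.0, -2.0)  # -> (2.0, 0.0, -1.0)
```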

8. Locomotion Computer Internals

The locomotion computer is a Rockchip RK3588 (8-core ARM Cortex-A76/A55, 8GB LPDDR4X, 32GB eMMC) running Linux kernel 5.10.176-rt86+ (real-time patched). [T1 — Security research papers arXiv:2509.14096, arXiv:2509.14139]

Software Architecture

A centralized master_service orchestrator (9.2 MB binary) supervises 26 daemons: [T1]

| Daemon | Role | Resource Usage |
|--------|------|----------------|
| ai_sport | Primary locomotion/balance policy | 145% CPU, 135 MB RAM |
| state_estimator | IMU + encoder fusion | ~30% CPU |
| motion_switcher | Gait mode management | |
| robot_state_service | State broadcasting | |
| dex3_service_l/r | Left/right hand control | |
| webrtc_bridge | Video streaming | |
| ros_bridge | ROS2 interface | |
| Others | OTA, BLE, WiFi, telemetry, etc. | |

The ai_sport daemon is the stock RL policy. When you enter debug mode (L2+R2), this daemon is shut down, allowing direct motor control via rt/lowcmd.

Configuration files use proprietary FMX encryption (Blowfish-ECB + LCG stream cipher with static keys). This has been partially reverse-engineered by security researchers but not fully cracked. [T1]
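To illustrate the LCG-stream-cipher half of that scheme, here is a generic keystream XOR sketch. The LCG constants below are glibc `rand()`'s, used purely for illustration — they are NOT the FMX parameters or keys recovered by the researchers:

```python
def lcg_keystream(seed, n, a=1103515245, c=12345, m=2**31):
    """Generate n keystream bytes from a linear congruential generator."""
    state = seed
    out = []
    for _ in range(n):
        state = (a * state + c) % m
        out.append((state >> 16) & 0xFF)  # take bits from the high half
    return bytes(out)

def xor_cipher(data, seed):
    """XOR data with the LCG keystream; applying it twice decrypts."""
    ks = lcg_keystream(seed, len(data))
    return bytes(d ^ k for d, k in zip(data, ks))

ct = xor_cipher(b"config", seed=42)
```

The weakness such research exploits is visible in the sketch: once the generator's parameters and seed are known, the entire keystream is reproducible.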

Can You Access the Locomotion Computer?

Root access is technically possible via known exploits (the UniPwn BLE exploit, the FreeBOT app-WiFi jailbreak), but no one has publicly documented deploying a custom policy to it: [T1]

| Method | Status | Notes |
|--------|--------|-------|
| SSH from network | Blocked | No SSH server exposed by default |
| FreeBOT jailbreak (app WiFi field injection) | Works on firmware ≤1.6.0 | Patched Oct 2025 |
| UniPwn BLE exploit (Bin4ry/UniPwn on GitHub) | Works on unpatched firmware | Hardcoded AES keys + command injection |
| RockUSB physical flash | Blocked by SecureBoot on G1 | Works on Go2 only |
| Replacing ai_sport binary after root | Not documented | Nobody has published doing this |
| Extracting stock policy weights | Not documented | Binary analysis not published |

Bottom line: Getting root on the RK3588 is solved. Getting a custom locomotion policy running natively on it is not — the master_service orchestrator, FMX encryption, and lack of documentation are barriers nobody has publicly overcome. [T1]

How Every Research Group Actually Deploys

All published research (BFM-Zero, gait-conditioned RL, fall recovery, etc.) uses the same approach: [T1]

  1. Enter debug mode (L2+R2) — shuts down ai_sport
  2. Run custom policy on the Jetson Orin NX or an external computer
  3. Read rt/lowstate, compute actions, publish rt/lowcmd via DDS
  4. Motor commands travel over the internal DDS network to the RK3588, which passes them to motor drivers

This works but has inherent limitations:

  • DDS network latency (~2ms round trip) vs. native on-board execution
  • No access to the RK3588's real-time Linux kernel guarantees
  • Policy frequency limited by DDS throughput and compute (typically 200-500 Hz from Jetson)
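The loop-rate ceiling implied by these numbers is simple back-of-envelope arithmetic. The round-trip figure is from the list above; the policy inference time is an assumed placeholder:

```python
dds_round_trip_s = 0.002   # ~2 ms observation -> command round trip (above)
policy_compute_s = 0.001   # assumed inference time on the Jetson (placeholder)

# Each control step must fit the network round trip plus compute:
period_s = dds_round_trip_s + policy_compute_s
max_rate_hz = 1.0 / period_s  # lands inside the 200-500 Hz range cited above
```

Shaving compute time (e.g., a smaller policy or C++ inference) raises the ceiling, but the DDS round trip sets a hard floor of ~2 ms per step that native on-board execution would avoid.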

9. Custom Policy Replacement (Practical)

When to Replace

  • You need whole-body coordination (mocap + balance)
  • You need push recovery beyond what the stock controller provides
  • You want to run a custom RL policy trained with perturbation curriculum

How to Replace (Debug Mode)

  1. Suspend robot on stand or harness
  2. Enter damping state, press L2+R2 (ai_sport shuts down)
  3. Send MotorCmd_ messages on rt/lowcmd from Jetson or external PC
  4. Read rt/lowstate for joint positions, velocities, and IMU data
  5. Publish at 500 Hz for smooth control (C++ recommended over Python for lower latency)
  6. To exit debug mode: reboot the robot (no other way)
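The read/compute/publish cycle in steps 3–5 can be sketched as a fixed-rate loop. The three callables below stand in for the actual DDS subscribe/publish operations on rt/lowstate and rt/lowcmd — this sketch only shows the 500 Hz timing discipline, not the SDK API:

```python
import time

def control_loop(read_lowstate, compute_action, publish_lowcmd,
                 rate_hz=500, steps=1000):
    """Fixed-rate control loop: read state, compute, publish, then sleep
    until the next tick. Accumulating next_tick (rather than sleeping a
    fixed period) prevents drift when compute time varies."""
    period = 1.0 / rate_hz
    next_tick = time.monotonic()
    for _ in range(steps):
        state = read_lowstate()
        publish_lowcmd(compute_action(state))
        next_tick += period
        sleep = next_tick - time.monotonic()
        if sleep > 0:
            time.sleep(sleep)

# Stub run: count published commands instead of driving motors.
sent = []
control_loop(lambda: 0, lambda s: s, sent.append, steps=10)
```

In C++ the same pattern would use a steady clock and `sleep_until`, which is part of why C++ is recommended for lower jitter at 500 Hz.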

Risks

  • Fall risk: If your policy fails, the robot falls immediately — no stock controller safety net
  • Hardware damage: Incorrect joint commands can damage actuators
  • Always test in simulation first (see simulation)

Alternative: Residual Overlay

Instead of full replacement, train a residual policy that adds small corrections to the stock controller output. See push-recovery-balance for details.
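The residual idea reduces to adding a bounded correction on top of the stock output. A minimal sketch — the 0.05 rad clip bound is an illustrative assumption, not a recommended value:

```python
def residual_overlay(stock_cmd, residual, limit=0.05):
    """Add a small learned correction to the stock controller's joint
    targets, clipped per joint so the residual cannot override the base
    behavior even if the residual policy misbehaves."""
    clip = lambda r: max(-limit, min(limit, r))
    return [q + clip(r) for q, r in zip(stock_cmd, residual)]

# A large residual (0.3 rad) is clipped to the 0.05 rad bound:
out = residual_overlay([0.1, -0.2], [0.3, -0.01])
```

The clip is the safety argument for this approach: the worst-case deviation from the stock controller is bounded by construction.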

WBC Frameworks

For coordinated whole-body control (balance + task), see whole-body-control, particularly GR00T-WBC which is designed for exactly this use case on G1.

Key Relationships