| id | title | status | source_sections | related_topics | key_equations | key_terms | images | examples | open_questions |
|---|---|---|---|---|---|---|---|---|---|
| locomotion-control | Locomotion & Balance Control | established | reference/sources/paper-gait-conditioned-rl.md, reference/sources/paper-getting-up-policies.md, reference/sources/official-product-page.md | [joint-configuration sensors-perception equations-and-bounds learning-and-ai safety-limits whole-body-control push-recovery-balance motion-retargeting] | [zmp com inverse_dynamics] | [gait state_estimation gait_conditioned_rl curriculum_learning sim_to_real] | [] | [] | [Exact RL policy observation/action space dimensions; How to replace the stock locomotion policy with a custom one; Stair climbing capability and limits; Running gait availability (H1-2 can run at 3.3 m/s — can G1?)] |
# Locomotion & Balance Control
Walking, balance, gait generation, and whole-body control for bipedal locomotion.
## 1. Control Architecture
The G1 uses a reinforcement-learning-based locomotion controller running on the proprietary locomotion computer. Users interact with it via high-level commands; the low-level balance and gait control is handled internally. [T1 — Confirmed from RL papers and developer docs]
```
User Commands (high-level API)
          │
          ▼
┌─────────────────────────┐
│  Locomotion Computer    │  (192.168.123.161, proprietary)
│                         │
│  RL Policy (gait-       │ ← IMU, joint encoders (500 Hz)
│  conditioned, multi-    │
│  phase curriculum)      │
│                         │
│  Motor Commands ────────┼──→ Joint Actuators
└─────────────────────────┘
```
### Key Architecture Details
- Framework: Gait-conditioned reinforcement learning with multi-phase curriculum (arXiv:2505.20619) [T1]
- Gait switching: One-hot gait ID enables dynamic switching between gaits [T1]
- Reward design: Gait-specific reward routing mechanism with biomechanically inspired shaping [T1]
- Training: Policies trained in simulation (Isaac Gym / MuJoCo), transferred to physical hardware [T1]
- Biomechanical features: Straight-knee stance promotion, coordinated arm-leg swing, natural motion without motion capture data [T1]
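The one-hot gait conditioning above can be sketched in a few lines. This is an illustrative example, not the stock policy's actual interface: the gait names and observation dimensions are assumptions.

```python
import numpy as np

# Hypothetical gait IDs covering the modes listed in section 2.
GAITS = ["stand", "walk", "transition"]

def gait_one_hot(gait: str) -> np.ndarray:
    """Encode a gait name as a one-hot vector for policy conditioning."""
    vec = np.zeros(len(GAITS), dtype=np.float32)
    vec[GAITS.index(gait)] = 1.0
    return vec

def condition_observation(obs: np.ndarray, gait: str) -> np.ndarray:
    """Append the one-hot gait ID so a single policy can switch gaits at runtime."""
    return np.concatenate([obs, gait_one_hot(gait)])
```

Because the gait ID is part of the observation, switching gaits is just a change of input vector; no separate policy needs to be loaded.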
## 2. Gait Modes
| Mode | Description | Verified | Tier |
|---|---|---|---|
| Standing | Static balance, all feet grounded | Yes | T1 |
| Walking | Dynamic bipedal walking | Yes | T1 |
| Walk-to-stand | Smooth transition from walking to standing | Yes | T1 |
| Stand-to-walk | Smooth transition from standing to walking | Yes | T1 |
[T1 — Validated in arXiv:2505.20619 on real G1 hardware]
## 3. Performance
| Metric | Value | Notes | Tier |
|---|---|---|---|
| Maximum walking speed | 2.0 m/s | 7.2 km/h | T0 |
| Verified terrain | Tile, concrete, carpet | Office-environment surfaces | T1 |
| Balance recovery | Light push recovery | Stable recovery from perturbations | T1 |
| Gait transition | Smooth | No abrupt mode switches | T1 |
For comparison, the H1-2 (larger Unitree humanoid) achieves 3.3 m/s running. Whether the G1 has a running gait is unconfirmed. [T3]
## 4. Balance Control
The RL-based locomotion policy implicitly handles balance through learned behavior rather than explicit ZMP or capture-point controllers: [T1]
- Inputs: IMU data (orientation, angular velocity), joint encoder feedback (position, velocity), gait command
- Outputs: Target joint positions/torques for all leg joints
- Rate: 500 Hz control loop
- Learned behaviors: Center-of-mass tracking, foot placement, push recovery, arm counterbalancing
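The input side of this contract can be sketched as an observation assembler. The dimensions here (4 + 3 + 12 + 12 + 3 = 34) are illustrative only; the stock policy's exact observation layout is an open question.

```python
import numpy as np

def build_observation(imu_quat, imu_gyro, q, dq, gait_cmd) -> np.ndarray:
    """Assemble a policy observation from the inputs listed above:
    IMU orientation and angular velocity, joint encoder feedback,
    and the gait/velocity command. Layout is illustrative."""
    return np.concatenate([
        np.asarray(imu_quat, dtype=np.float32),  # base orientation (quaternion, 4)
        np.asarray(imu_gyro, dtype=np.float32),  # base angular velocity (3)
        np.asarray(q, dtype=np.float32),         # joint positions (e.g. 12 leg joints)
        np.asarray(dq, dtype=np.float32),        # joint velocities
        np.asarray(gait_cmd, dtype=np.float32),  # gait / velocity command
    ])
```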
While classical bipedal control uses explicit ZMP constraints (see equations-and-bounds), the G1's RL policy learns these constraints implicitly during training.
For deep coverage of enhanced push recovery, perturbation training, and always-on balance architectures, see push-recovery-balance.
## 5. Fall Recovery
Multiple research approaches have been validated on the G1: [T1 — Research papers]
- Two-stage RL: Supine and prone recovery policies (arXiv:2502.12152) — overcome limitations of hand-crafted controllers
- HoST framework: Multi-critic RL with curriculum training for diverse posture recovery (arXiv:2502.08378)
- Unified fall-safety: Combined fall prevention + impact mitigation + recovery from sparse demonstrations (arXiv:2511.07407) — zero-shot sim-to-real transfer
## 6. Terrain Adaptation
| Terrain Type | Status | Notes | Tier |
|---|---|---|---|
| Flat tile | Verified | Standard office floor | T1 |
| Concrete | Verified | Indoor/outdoor flat surfaces | T1 |
| Carpet | Verified | Standard office carpet | T1 |
| Stairs | Unconfirmed | Research papers suggest capability | T4 |
| Rough terrain | Sim only | Trained in sim, real-world unconfirmed | T3 |
| Slopes | Unconfirmed | — | T4 |
## 7. User Control Interface
Users control locomotion through the high-level sport mode API: [T0]
- Velocity commands: Set forward/lateral velocity and yaw rate
- Posture commands: Stand, sit, lie down
- Attitude adjustment: Modify body orientation
- Trajectory tracking: Follow waypoint sequences
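As an illustration of the velocity-command pathway, here is a hypothetical wrapper that clamps commands to the spec'd 2.0 m/s forward maximum. The type name is not the SDK's, and the lateral and yaw bounds are assumed conservative values, not published limits.

```python
from dataclasses import dataclass

MAX_FORWARD = 2.0   # m/s, from the performance table (T0 spec)
MAX_LATERAL = 0.5   # m/s, assumed conservative bound
MAX_YAW = 1.0       # rad/s, assumed conservative bound

@dataclass
class VelocityCommand:
    vx: float        # forward velocity, m/s
    vy: float        # lateral velocity, m/s
    yaw_rate: float  # rad/s

def clamp_command(cmd: VelocityCommand) -> VelocityCommand:
    """Clamp a user velocity command to safe bounds before sending it on."""
    clip = lambda v, lim: max(-lim, min(lim, v))
    return VelocityCommand(clip(cmd.vx, MAX_FORWARD),
                           clip(cmd.vy, MAX_LATERAL),
                           clip(cmd.yaw_rate, MAX_YAW))
```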
Low-level joint control is also possible (bypassing the locomotion controller) but requires the user to implement their own balance control. This is advanced and carries significant fall risk. [T2]
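If you do take over low-level control, the minimum viable stabilizing primitive is a joint-space PD law. A sketch, with placeholder gains that are not tuned G1 values:

```python
import numpy as np

def pd_torque(q_target, q, dq, kp=60.0, kd=2.0):
    """Joint-space PD law: torque toward a target position, damped by
    joint velocity. When bypassing the stock controller, the user-side
    balance loop must produce commands of roughly this form per joint."""
    q_target, q, dq = map(np.asarray, (q_target, q, dq))
    return kp * (q_target - q) - kd * dq
```

Real deployments tune `kp`/`kd` per joint and respect the actuator torque limits (see safety-limits).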
## 8. Locomotion Computer Internals
The locomotion computer is a Rockchip RK3588 (8-core ARM Cortex-A76/A55, 8GB LPDDR4X, 32GB eMMC) running Linux kernel 5.10.176-rt86+ (real-time patched). [T1 — Security research papers arXiv:2509.14096, arXiv:2509.14139]
### Software Architecture
A centralized `master_service` orchestrator (9.2 MB binary) supervises 26 daemons: [T1]
| Daemon | Role | Resource Usage |
|---|---|---|
| `ai_sport` | Primary locomotion/balance policy | 145% CPU, 135 MB RAM |
| `state_estimator` | IMU + encoder fusion | ~30% CPU |
| `motion_switcher` | Gait mode management | — |
| `robot_state_service` | State broadcasting | — |
| `dex3_service_l/r` | Left/right hand control | — |
| `webrtc_bridge` | Video streaming | — |
| `ros_bridge` | ROS2 interface | — |
| Others | OTA, BLE, WiFi, telemetry, etc. | — |
The `ai_sport` daemon is the stock RL policy. When you enter debug mode (L2+R2), this daemon is shut down, allowing direct motor control via `rt/lowcmd`.
Configuration files use proprietary FMX encryption (Blowfish-ECB + LCG stream cipher with static keys). This has been partially reverse-engineered by security researchers but not fully cracked. [T1]
### Can You Access the Locomotion Computer?
Root access is technically possible via known BLE exploits (UniPwn, FreeBOT jailbreak), but no one has publicly documented deploying a custom policy to it: [T1]
| Method | Status | Notes |
|---|---|---|
| SSH from network | Blocked | No SSH server exposed by default |
| FreeBOT jailbreak (app WiFi field injection) | Works on firmware ≤1.6.0 | Patched Oct 2025 |
| UniPwn BLE exploit (Bin4ry/UniPwn on GitHub) | Works on unpatched firmware | Hardcoded AES keys + command injection |
| RockUSB physical flash | Blocked by SecureBoot on G1 | Works on Go2 only |
| Replacing `ai_sport` binary after root | Not documented | Nobody has published doing this |
| Extracting stock policy weights | Not documented | Binary analysis not published |
Bottom line: Getting root on the RK3588 is solved. Getting a custom locomotion policy running natively on it is not — the `master_service` orchestrator, FMX encryption, and lack of documentation are barriers nobody has publicly overcome. [T1]
### How Every Research Group Actually Deploys
All published research (BFM-Zero, gait-conditioned RL, fall recovery, etc.) uses the same approach: [T1]
- Enter debug mode (L2+R2) — shuts down `ai_sport`
- Run custom policy on the Jetson Orin NX or an external computer
- Read `rt/lowstate`, compute actions, publish `rt/lowcmd` via DDS
- Motor commands travel over the internal DDS network to the RK3588, which passes them to motor drivers
This works but has inherent limitations:
- DDS network latency (~2ms round trip) vs. native on-board execution
- No access to the RK3588's real-time Linux kernel guarantees
- Policy frequency limited by DDS throughput and compute (typically 200-500 Hz from Jetson)
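The last limitation is simple arithmetic: each off-board control cycle pays the DDS round trip plus inference time, which bounds the achievable loop rate.

```python
def max_loop_rate_hz(dds_round_trip_s: float, compute_s: float) -> float:
    """Upper bound on the off-board closed-loop rate: one cycle costs
    the DDS round trip plus the policy's compute time."""
    return 1.0 / (dds_round_trip_s + compute_s)

# With the ~2 ms round trip above and, say, 1 ms of policy inference,
# the loop tops out near 333 Hz, consistent with the 200-500 Hz range.
```

The 1 ms inference figure is an assumed example, not a measured value.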
## 9. Custom Policy Replacement (Practical)
### When to Replace
- You need whole-body coordination (mocap + balance)
- You need push recovery beyond what the stock controller provides
- You want to run a custom RL policy trained with perturbation curriculum
### How to Replace (Debug Mode)
- Suspend robot on stand or harness
- Enter damping state, press L2+R2 — `ai_sport` shuts down
- Send `MotorCmd_` messages on `rt/lowcmd` from Jetson or external PC
- Read `rt/lowstate` for joint positions, velocities, and IMU data
- Publish at 500 Hz for smooth control (C++ recommended over Python for lower latency)
- To exit debug mode: reboot the robot (no other way)
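The 500 Hz publish step can be paced with a drift-free fixed-rate loop. This sketch uses a stand-in `step` callback where a real deployment would read `rt/lowstate`, run the policy, and publish `rt/lowcmd` via the vendor SDK; those calls are not shown because their exact signatures vary.

```python
import time

def run_control_loop(step, rate_hz=500, duration_s=0.02):
    """Fixed-rate loop skeleton. Scheduling off an absolute deadline
    (next_tick += period) avoids cumulative drift from sleep jitter."""
    period = 1.0 / rate_hz
    next_tick = time.monotonic()
    ticks = 0
    while ticks * period < duration_s:
        step(ticks)          # stand-in for read / compute / publish
        ticks += 1
        next_tick += period
        remaining = next_tick - time.monotonic()
        if remaining > 0:
            time.sleep(remaining)
    return ticks  # a 20 ms demo run at 500 Hz yields 10 ticks
```

Python's `time.sleep` has millisecond-scale jitter on a non-real-time kernel, which is one reason C++ is recommended for the actual control loop.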
### Risks
- Fall risk: If your policy fails, the robot falls immediately — no stock controller safety net
- Hardware damage: Incorrect joint commands can damage actuators
- Always test in simulation first (see simulation)
### Alternative: Residual Overlay
Instead of full replacement, train a residual policy that adds small corrections to the stock controller output. See push-recovery-balance for details.
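A minimal sketch of the overlay idea: clamp the learned correction so a misbehaving residual stays within a small band around the stock output. The 0.05 rad bound is an assumed safety margin, not a documented value.

```python
import numpy as np

def overlay_action(stock_action, residual, limit=0.05):
    """Residual overlay: add a clamped correction to the stock
    controller's joint targets. Clamping bounds the worst case a
    bad residual policy can inflict."""
    residual = np.clip(np.asarray(residual), -limit, limit)
    return np.asarray(stock_action) + residual
```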
### WBC Frameworks
For coordinated whole-body control (balance + task), see whole-body-control, particularly GR00T-WBC which is designed for exactly this use case on G1.
## Key Relationships
- Uses: joint-configuration (leg joints as actuators, 500 Hz commands)
- Uses: sensors-perception (IMU + encoders for state estimation)
- Trained via: learning-and-ai (RL training pipeline)
- Bounded by: equations-and-bounds (ZMP, joint limits)
- Governed by: safety-limits (fall detection, torque limits)
- Extended by: push-recovery-balance (enhanced perturbation robustness)
- Coordinated by: whole-body-control (WBC for combined loco-manipulation)
- Enables: motion-retargeting (balance during mocap playback)