
Learning Human-to-Humanoid Real-Time Whole-Body Teleoperation

arXiv: 2403.04436
Authors: Tairan He, Zhengyi Luo, Wenli Xiao, Chong Zhang, Kris Kitani, Changliu Liu, Guanya Shi
Fetched: 2026-02-13
Type: Research Paper (IROS 2024)


Note: The user-provided arXiv ID (2403.01623) was incorrect — that ID corresponds to an unrelated machine-learning-for-physics-simulation challenge paper. The correct arXiv ID for the H2O paper is 2403.04436.

Abstract

We present Human to Humanoid (H2O), a reinforcement learning (RL) based framework that enables real-time whole-body teleoperation of a full-sized humanoid robot with only an RGB camera. To create a large-scale retargeted motion dataset of human movements for humanoid robots, we propose a scalable "sim-to-data" process to filter and pick feasible motions using a privileged motion imitator. Afterwards, we train a robust real-time humanoid motion imitator in simulation using these refined motions and transfer it to the real humanoid robot in a zero-shot manner. We successfully achieve teleoperation of dynamic whole-body motions in real-world scenarios, including walking, back jumping, kicking, turning, waving, pushing, boxing, etc. To the best of our knowledge, this is the first demonstration to achieve learning-based real-time whole-body humanoid teleoperation.

Key Contributions

  • RL-based teleoperation framework: Enables real-time whole-body humanoid teleoperation using only an RGB camera as input
  • Scalable "sim-to-data" process: A novel methodology for filtering and selecting feasible human-to-humanoid retargeted motions at scale using a privileged motion imitator
  • Zero-shot sim-to-real transfer: Trained simulation policies transfer directly to real-world humanoid robots without additional real-world fine-tuning
  • Dynamic whole-body motions: Successfully demonstrates diverse motions including walking, back jumping, kicking, turning, waving, pushing, and boxing in real-world environments
  • First of its kind: Claims to be the first demonstration of learning-based real-time whole-body humanoid teleoperation
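
The "sim-to-data" filtering step above can be sketched as a simple feasibility loop: roll out a privileged motion imitator on each retargeted clip in simulation and keep only the clips it can track within an error budget. This is a minimal illustrative sketch; the function names, trajectory representation, and error threshold are assumptions, not the paper's actual implementation.

```python
# Illustrative sketch of "sim-to-data" motion filtering.
# A trajectory is represented here as a flat list of floats; the real
# system would use full-body joint/root states per frame.

ERROR_THRESHOLD = 0.5  # max mean tracking error to accept a clip (assumed)

def tracking_error(reference, rollout):
    """Mean absolute per-frame error between reference and simulated rollout."""
    return sum(abs(r, ) if False else abs(r - s) for r, s in zip(reference, rollout)) / len(reference)

def filter_feasible(motions, track):
    """Keep retargeted motions the privileged imitator tracks successfully.

    `motions` is a list of reference trajectories; `track` is a callable
    standing in for the privileged imitator's simulated rollout.
    """
    return [m for m in motions if tracking_error(m, track(m)) < ERROR_THRESHOLD]
```

Clips rejected here never reach the second stage, so the real-time imitator is trained only on motions that are dynamically feasible for the robot's morphology.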

G1 Relevance

H2O is directly relevant to the Unitree G1 as a framework for enabling teleoperation of full-sized humanoid robots. The same research group (LeCAR Lab at CMU) that produced H2O also developed OmniH2O, which explicitly targets the G1. The sim-to-data pipeline and zero-shot transfer methods are applicable to the G1's morphology. The teleoperation capabilities demonstrated (walking, manipulation, dynamic motions) align with the G1's intended use cases. The approach requires only an RGB camera, making it highly accessible for G1 operators.
