Open-TeleVision: Teleoperation with Immersive Active Visual Feedback
arXiv: 2407.01512 Authors: Xuxin Cheng, Jialong Li, Shiqi Yang, Ge Yang, Xiaolong Wang Fetched: 2026-02-13 Type: Research Paper (CoRL 2024)
Note: The user-provided arXiv ID (2409.07455) and title ("TWIST: Teleoperation with Immersive Active Visual Streaming") were incorrect. arXiv 2409.07455 corresponds to an unrelated astronomy paper ("Genesis-Metallicity"). The paper matching the described topic — immersive teleoperation with active visual feedback/streaming — is Open-TeleVision (arXiv 2407.01512). There is also a separate, later paper called "TWIST: Teleoperated Whole-Body Imitation System" (arXiv 2505.02833, 2025) which is a distinct work. This file archives Open-TeleVision as the best match for the user's intent.
Abstract
Teleoperation serves as a powerful method for collecting on-robot data essential for robot learning from demonstrations. The intuitiveness and ease of use of the teleoperation system are crucial for ensuring high-quality, diverse, and scalable data. To achieve this, we propose an immersive teleoperation system Open-TeleVision that allows operators to actively perceive the robot's surroundings in a stereoscopic manner. Additionally, the system mirrors the operator's arm and hand movements on the robot, creating an immersive experience as if the operator's mind is transmitted to a robot embodiment. We validate the effectiveness of our system by collecting data and training imitation learning policies on four long-horizon, precise tasks (Can Sorting, Can Insertion, Folding, and Unloading) for 2 different humanoid robots and deploy them in the real world. The system is open-sourced.
Key Contributions
- Immersive stereoscopic perception: Operators actively perceive the robot's surroundings through a stereoscopic display, with the robot's head camera mirroring the operator's head movements (2-3 DoF actuation)
- Intuitive kinesthetic mirroring: The system mirrors the operator's arm and hand movements directly onto the robot, creating an embodiment experience
- Long-horizon task validation: Validated on four complex, long-horizon manipulation tasks (Can Sorting, Can Insertion, Folding, Unloading) requiring precision
- Cross-platform deployment: Demonstrated on two different humanoid robots with real-world policy deployment
- Imitation learning pipeline: Collected teleoperation data used to train imitation learning policies that execute autonomously
- Open-source release: The complete system is publicly available for the research community
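The active-perception idea above amounts to mapping the operator's headset orientation onto the robot's actuated neck joints each control tick. A minimal sketch of that mapping, assuming a (w, x, y, z) unit quaternion from the headset and hypothetical joint limits (the real Open-TeleVision code and the robot's actual limits differ):

```python
import math

# Hypothetical neck joint limits in radians; actual hardware limits differ.
YAW_LIMIT = math.radians(60.0)
PITCH_LIMIT = math.radians(45.0)

def quat_to_yaw_pitch(w, x, y, z):
    """Extract yaw (turn) and pitch (nod) angles from a unit quaternion (w, x, y, z)."""
    yaw = math.atan2(2.0 * (w * z + x * y), 1.0 - 2.0 * (y * y + z * z))
    # Clamp the asin argument to guard against floating-point drift.
    pitch = math.asin(max(-1.0, min(1.0, 2.0 * (w * y - z * x))))
    return yaw, pitch

def neck_command(headset_quat):
    """Map a headset orientation to clamped 2-DoF neck joint targets."""
    yaw, pitch = quat_to_yaw_pitch(*headset_quat)
    yaw = max(-YAW_LIMIT, min(YAW_LIMIT, yaw))
    pitch = max(-PITCH_LIMIT, min(PITCH_LIMIT, pitch))
    return yaw, pitch
```

In practice this would run inside the teleoperation loop at the headset's tracking rate, with the clamped targets sent to the neck actuators so the stereo camera tracks the operator's gaze.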
G1 Relevance
Open-TeleVision is directly relevant to the Unitree G1 as a teleoperation and data collection framework for humanoid robots. The system has been validated on humanoid platforms and provides an intuitive VR-based interface for collecting demonstration data — a critical capability for training G1 manipulation and loco-manipulation policies. The immersive visual feedback addresses a key challenge in remote teleoperation by giving operators natural depth perception. The open-source nature makes it directly usable with the G1. The Unitree XR-Teleoperate project (already in this knowledge base) draws on similar principles.
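The demonstration-collection workflow described above boils down to logging synchronized observation-action pairs per episode for downstream imitation learning. A minimal sketch of such an episode buffer, where all field names and the episode structure are illustrative assumptions rather than the actual Open-TeleVision data format:

```python
from dataclasses import dataclass, field

@dataclass
class Step:
    t: float        # timestamp in seconds
    image: bytes    # encoded stereo frame pair from the head camera
    qpos: list      # robot joint positions (proprioception)
    action: list    # commanded arm/hand targets from the operator

@dataclass
class Episode:
    task: str
    steps: list = field(default_factory=list)

    def record(self, t, image, qpos, action):
        """Append one synchronized observation-action pair."""
        self.steps.append(Step(t, image, qpos, action))

# Example: one recorded step of a hypothetical can-sorting demonstration.
ep = Episode(task="can_sorting")
ep.record(0.0, b"", [0.0] * 7, [0.0] * 7)
```

An imitation-learning pipeline would then train on batches of (image, qpos) → action pairs drawn from many such episodes.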