diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..b72f8e3 --- /dev/null +++ b/.gitignore @@ -0,0 +1,2 @@ +# Local copies of upstream repos (not part of this project) +apps/ diff --git a/CLAUDE.md b/CLAUDE.md index 6e28d5c..2ca913e 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -56,6 +56,7 @@ | What we don't know, gaps, uncertainties | `context/open-questions.md` | | Term definitions, units, acronyms | `reference/glossary.yaml` | | Dell Pro Max GB10, offboard AI, external compute | `context/gb10-offboard-compute.md` | +| Teleoperation, Vision Pro, xr_teleoperate, WebXR | `context/teleoperation.md` | | Worked calculations, code examples | `examples/*.md` | --- @@ -80,6 +81,28 @@ When the user asks you to reason about something novel: - **SDK version:** Always specify which SDK version (SDK2, unitree_sdk2_python, etc.) when discussing API calls. APIs differ between versions. - **Model variant:** The G1 has multiple configurations (e.g., different DOF counts, with/without dexterous hands). Always clarify which variant is being discussed. +## ROBOT SAFETY — CRITICAL + +- **NEVER deactivate the robot's policy, disable servos, or change the robot's physical state (mode, position, orientation) without explicitly warning the user first and getting confirmation.** The user is physically next to the robot and needs to be prepared before any state change. This includes sending keys like `o` (deactivate), `]` (activate), height/pitch adjustments, or any command that changes motor behavior. +- Always tell the user what you're about to do and wait for their go-ahead before sending any command that affects the robot's physical state. + +## SSH TO THE ROBOT — CRITICAL + +- **ALWAYS use paramiko (Python) for SSH connections to the robot.** NEVER use the `ssh` command directly via Bash — it triggers Git for Windows' credential manager popup and blocks the session. 
+- Example pattern (used by all deploy scripts): + ```python + import paramiko + ssh = paramiko.SSHClient() + ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy()) + ssh.connect('10.0.0.64', username='unitree', password='123', + timeout=10, look_for_keys=False, allow_agent=False) + _, stdout, stderr = ssh.exec_command("your command here", timeout=30) + print(stdout.read().decode('utf-8', errors='replace')) + ssh.close() + ``` +- Robot SSH: `unitree@10.0.0.64` password `123` +- GB10 SSH: `mitchaiet@10.0.0.68` password `Strat3*gb10` + ## DO NOT - Do not assume G1 specs are the same as H1 or other Unitree robots — they differ significantly. @@ -157,3 +180,5 @@ Operations | 3 | 2026-02-14 | GR00T-WBC deployed on GB10. 4 critical aarch64 bugs fixed (CycloneDDS buffer overflow, ROS2 path, shared libs, sync mode). Walking sim verified. NoMachine + web viewer for remote viz. 5 new open questions, 3 resolved. gb10-offboard-compute promoted to established. | | 3.5 | 2026-02-15 | MuJoCo Playground training pipeline deployed on GB10. G1JoystickFlatTerrain verified (29-DOF, 103-dim obs). Brax PPO training at ~17K steps/sec on Blackwell. Locomotion-only baseline: 200M steps, reward +12.1, 3:08 training time. Researched unified WBC approach (ExBody2 paradigm) for AVP telepresence. Plan saved for Phase 4. | | 4 | 2026-02-15 | DDS network bridge: GB10 ↔ G1 robot. Diagnosed DDS multicast blocked by UFW firewall (root cause). Confirmed DDS active inside robot (342 pkts/4s). SSH password documented (123). GR00T-WBC real robot config + launch script created. Researched Vision Pro telepresence: xr_teleoperate (WebXR, no app) and VisionProTeleop (native, open-source). GR00T-WBC architecture fully documented (decoupled upper/lower body, CONTROL_GOAL_TOPIC integration point). Vision Pro selected as primary telepresence device. | +| 5 | 2026-02-15 | **GR00T-WBC running on real G1 robot.** Root-caused persistent backward lean to IMU mounting offset (~6° pitch). 
Fixed negative KD bug (MOTOR_KD[14]=-5). Confirmed action clipping is wrong (removed). Applied PR #11 for dynamic mode_machine detection. Iterative PD gain tuning: sim-trained → custom → Unitree teleop gains. IMU offset calibration implemented with live adjustment (keys 9/0). Confirmed no encoder offset, correct quaternion convention, correct JOINT2MOTOR mapping, no competing controller. Best config: IMU offset -6° + Unitree teleop gains (KP 300/300/300/300/80/80). GB10 SSH credentials corrected (mitchaiet/Strat3*gb10). WBC-AGILE identified as training framework. 5 open questions resolved. | +| 5.5 | 2026-02-18 | **Vision Pro WebXR teleoperation working.** xr_teleoperate arm tracking verified via Safari WebXR on VP. Root-caused WSS cert trust separation (Safari treats HTTPS and WebSocket trust separately). Vuer v0.0.60 required (v0.0.40 client JS incompatible with visionOS). Patched: JS port fix (hostname→host in 4 chunk files), aiohttp SSL assertion fix. VP factory reset needed to clear stale cert state. IK configuration flipping identified as known issue near singularities. Comprehensive pipeline logging plan created (--debug flag). Lab network topology documented (NETGEAR AP + AT&T BGW320 hybrid). 3 glossary terms added, 2 open questions resolved. | diff --git a/context/deployment-operations.md b/context/deployment-operations.md index 62dbf5a..86f7b54 100644 --- a/context/deployment-operations.md +++ b/context/deployment-operations.md @@ -9,10 +9,10 @@ key_terms: [ota_update, development_computer, locomotion_computer] images: [] examples: [] open_questions: - - "Complete pre-flight checklist from official manual" - "Field transport case specifications" - "Multi-robot coordination capabilities" - "Firmware update procedure (OTA details)" + - "Exact firmware version on our robot (V1.0.2 vs V1.0.4+) — test L2+B vs L1+A to determine" --- # Deployment & Field Operations @@ -27,9 +27,73 @@ Based on official quick start guide and community experience: [T0/T1] 1. 
Unpack robot and charger from shipping container 2. Install battery (quick-release smart battery connector) 3. Verify battery charge level -4. Power on the robot -5. Wait for system boot (locomotion + development computers) -6. Robot should enter default standing mode +4. Power on the robot (short press, then hold 2+ seconds) +5. Wait for system boot (~1 minute for locomotion + development computers) +6. Robot enters **Zero Torque** state (limp) — NOT standing + +### R3 Controller: Startup Sequence (Power-On → Regular Mode) [T1] + +The robot enforces a mandatory state machine. You CANNOT skip steps — each mode transition must happen in order. Button combos differ by firmware version. + +**State machine:** `Zero Torque (boot) → Damping → Locked Standing → Regular Mode` + +#### Firmware V1.0.4+ (Newer — L2-based combos) + +| Step | Button Combo | Result | +|------|-------------|--------| +| 1 | *(robot boots)* | Zero Torque — all joints limp | +| 2 | **Hold L2 + press B** | **Damping** — joints resist movement passively | +| 3 | **Hold L2 + press D-pad UP** | **Locked Standing** — robot stands up. **Hold its shoulder for support!** | +| 4 | **R2 + A** | **Regular Mode** — locomotion active, ready for teleop | + +#### Firmware V1.0.2 (Older — L1-based combos) + +| Step | Button Combo | Result | +|------|-------------|--------| +| 2 | **Hold L1 + press A** | **Damping** | +| 3 | **Hold L1 + press D-pad UP** | **Locked Standing** | +| 4 | **R1 + X** | **Regular Mode** | + +**How to tell which firmware:** Try L2+B first. If it works (joints become stiff), you're on V1.0.4+. If nothing happens, try L1+A (V1.0.2). 
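The mandatory ordering described above can be modelled as a small lookup table. This is an illustrative sketch only and does not talk to the robot: `STARTUP_ORDER`, `COMBOS`, `request_mode`, and `startup_plan` are hypothetical names, and treating Damping as reachable from any state is an assumption based on its use as the emergency stop. Only the startup path is modelled.

```python
# Illustrative model of the documented startup order; not a robot API.
STARTUP_ORDER = ["zero_torque", "damping", "locked_standing", "regular"]

# Button combos per firmware, copied from the startup tables above.
COMBOS = {
    "V1.0.4+": {"damping": "L2+B", "locked_standing": "L2+UP", "regular": "R2+A"},
    "V1.0.2": {"damping": "L1+A", "locked_standing": "L1+UP", "regular": "R1+X"},
}


def request_mode(current: str, target: str) -> str:
    """Return the new mode, or raise ValueError if a step would be skipped."""
    if target == "damping":
        # Assumption: Damping doubles as the emergency stop, so it is always allowed.
        return target
    cur = STARTUP_ORDER.index(current)
    tgt = STARTUP_ORDER.index(target)
    if tgt != cur + 1:
        # Corresponds to the haptic rejection pulses on the real controller.
        raise ValueError(f"rejected: {current} -> {target} skips a step")
    return target


def startup_plan(firmware: str) -> list[tuple[str, str]]:
    """List (mode, combo) pairs in the order they must be pressed."""
    return [(mode, COMBOS[firmware][mode]) for mode in STARTUP_ORDER[1:]]
```

For example, `startup_plan("V1.0.4+")` lists the three combos in pressing order, while `request_mode("zero_torque", "regular")` raises because Damping and Locked Standing were skipped.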
+ +#### Full Posture Mode Table — Firmware V1.0.4+ [T1] + +| Mode | Button Combo | Notes | +|------|-------------|-------| +| Zero Torque | Hold L2 + Y | All joints limp — robot will collapse if standing | +| Damping | Hold L2 + B | Passive resistance, safe descent | +| Locked Standing | Hold L2 + D-pad UP | Robot stands up autonomously | +| Seated | Hold L2 + D-pad LEFT | | +| Lying/Standing toggle | Hold L2 + X | | +| Squat | Hold L2 + A | | +| Debug Mode | Hold L2 + click R2 | For low-level SDK control (robot must be suspended!) | + +#### Full Posture Mode Table — Firmware V1.0.2 [T1] + +| Mode | Button Combo | Notes | +|------|-------------|-------| +| Zero Torque | Hold L1 + B | All joints limp | +| Damping | Hold L1 + A | Passive resistance | +| Locked Standing | Hold L1 + D-pad UP | Robot stands up | +| Seated | Hold L1 + D-pad LEFT | | +| Lying/Standing toggle | Hold L1 + X | | +| Squat | Hold L1 + D-pad DOWN | | +| Debug Mode | Hold L2 + R2 | Same across firmware versions | + +#### Controls in Regular Mode [T1] + +| Function | Control | +|----------|---------| +| Walk forward/backward | Left joystick up/down | +| Turn left/right | Right joystick left/right | +| Stand/Walk toggle | START | +| Step in place | Double-click START | +| Low speed mode | Double-click L2 | +| High speed mode | Double-click L1 | +| Emergency stop (damping) | L2+B (V1.0.4) or L1+A (V1.0.2) | + +#### Common Mistake: Haptic Rejection Pulses +If you press R1+X or R2+A and get haptic pulses, the robot is rejecting the command because you haven't gone through the required intermediate states. Go back to step 2 (Damping) and work forward. ### Network Setup 1. Connect to the robot's WiFi network from your development machine @@ -62,8 +126,8 @@ Recommended checks before each operation session: [T2 — Best practice from com To gain full low-level joint control (bypassing the stock locomotion controller): [T1 — Weston Robot guide] 1. 
**Suspend the robot** on a stand or safety harness (it WILL fall without support) -2. Put the robot into **damping state** using the remote -3. Press **L2 + R2** on the wireless remote simultaneously +2. Put the robot into **damping state** (Hold L2+B on V1.0.4+, or L1+A on V1.0.2) +3. **Hold L2 and click R2** on the wireless remote 4. The stock locomotion controller is now disabled 5. You have full control via `rt/lowcmd` for all joints 6. To exit debug mode: power cycle the robot @@ -140,6 +204,73 @@ Common issues and resolutions from community experience: [T2 — Robonomics repo | RoboStore startup guide | robostore.com/blogs/news/unitree-g1-startup-guide | | Weston Robot dev guide | docs.westonrobot.com/tutorial/unitree/g1_dev_guide/ | +## 9. GR00T-WBC Deployment on Real G1 (Verified 2026-02-15) [T1] + +### Architecture +- **GB10** (10.0.0.68) runs GR00T-WBC, sends DDS commands to robot via 192.168.123.100 +- **Robot Jetson** (192.168.123.164) runs web control panel (`server.py:8080`) and audio driver — NO stock locomotion controller +- **Locomotion MCU** (192.168.123.161) receives `rt/lowcmd`, publishes `rt/lowstate` — no SSH access +- SSH to Jetson: `sshpass -p '123' ssh unitree@192.168.123.164` +- SSH to GB10: `ssh mitchaiet@10.0.0.68` (password: `Strat3*gb10`) + +### Confirmed Hardware Parameters (This Robot) +- **mode_machine = 5** (g1_29dof_rev_1_0) — confirmed via 63 DDS samples +- **No ai_sport service running** — Jetson only runs server.py and audio driver +- When policy deactivates, last PD gains remain active (motors hold position, do NOT actively balance) + +### Launch Procedure +1. Start Xvfb: `Xvfb :99 -screen 0 1024x768x24 &` (GLFW needs a display) +2. 
Launch script `/tmp/launch_groot.sh` in tmux: + ```bash + cd ~/GR00T-WholeBodyControl && source .venv/bin/activate + export LD_LIBRARY_PATH=/opt/ros/jazzy/lib:$LD_LIBRARY_PATH + export DISPLAY=:99 + export CYCLONEDDS_URI=' + ' + python3 -u gr00t_wbc/control/main/teleop/run_g1_control_loop.py --no-with-hands --interface real + ``` +3. Activate policy: `]` key in tmux +4. Deactivate policy: `o` key in tmux +5. Keyboard: `1/2` height, `5/6` pitch, `w/s/a/d` locomotion, `9/0` IMU offset ±1° + +### Known Issues & Fixes Applied +| Issue | Root Cause | Fix | +|-------|-----------|-----| +| GLFW crash on headless GB10 | `simulator_factory.py` eagerly imports mujoco | Made BaseSimulator import lazy + Xvfb :99 | +| DDS selects loopback | No CYCLONEDDS_URI set | Set explicit URI with address=192.168.123.100 | +| Negative KD bug | `configs.py` line: `MOTOR_KD[14] -= 10` → KD = -5 | Commented out the line | +| ONNX actions exceed [-1,1] | Normal RSL-RL behavior — do NOT clip | No clipping needed (NVIDIA reference has none) | +| Backward lean bias | **IMU mounting offset** — physical IMU in pelvis is pitched ~6° from sim assumption | Apply quaternion pitch rotation before gravity computation (see IMU Offset section) | +| Action clipping wrong | Added np.clip(action,-1,1) based on Isaac Lab issue — caused policy saturation | Removed clipping — NVIDIA reference code does not clip, policy needs full range | +| mode_machine hardcoded | GR00T-WBC hardcodes mode_machine=5, should read from robot | Applied PR #11: dynamic detection from rt/lowstate | + +### IMU Pitch Offset Calibration [T1 — Verified on this robot] +The real G1's pelvis IMU has a physical mounting offset vs what simulation assumes. GR00T-WBC reads raw IMU data via DDS (bypassing the stock controller's calibration), so the policy's reference frame for "upright" is wrong. This causes persistent backward lean that cannot be fixed by PD gain tuning or pitch commands (the balance loop fights back). 
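To see why a fixed pitch offset shifts the policy's notion of "upright", the sketch below computes the gravity direction in the body frame from a (w, x, y, z) quaternion, with and without a pitch correction. It is a self-contained illustration, not GR00T-WBC code: `quat_multiply`, `pitch_quat`, and `projected_gravity` are hypothetical helpers written for this example, and the choice of +6° for the raw reading with -6° for the correction is an assumed sign convention.

```python
import math


def quat_multiply(a, b):
    """Hamilton product of two (w, x, y, z) quaternions."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def pitch_quat(angle_rad):
    """Quaternion for a rotation of angle_rad about the body y-axis."""
    half = angle_rad / 2.0
    return (math.cos(half), 0.0, math.sin(half), 0.0)


def projected_gravity(q):
    """Unit gravity direction in the body frame: R(q)^T applied to (0, 0, -1)."""
    w, x, y, z = q
    return (
        2.0 * (w * y - x * z),
        -2.0 * (y * z + w * x),
        -(1.0 - 2.0 * (x * x + y * y)),
    )


# Assumed scenario: robot physically upright, IMU reports +6 degrees of pitch.
raw = pitch_quat(math.radians(6.0))
g_raw = projected_gravity(raw)  # x-component is sin(6 deg), about 0.1045

# Applying a -6 degree correction restores gravity to straight down.
corrected = quat_multiply(pitch_quat(math.radians(-6.0)), raw)
g_corrected = projected_gravity(corrected)
```

Under these assumptions the uncorrected gravity vector has an x-component of about 0.1045 instead of 0, so a policy that uses it as an observation sees a tilt that is not physically there and leans to compensate.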
+ +**Fix:** Rotate the IMU quaternion by a pitch offset before computing gravity orientation in `g1_gear_wbc_policy.py`: +```python +# In compute_observation(), before get_gravity_orientation(quat): +half = self.imu_pitch_offset / 2.0 +pitch_quat = np.array([np.cos(half), 0.0, np.sin(half), 0.0]) # w,x,y,z +quat = quat_multiply(pitch_quat, quat) +``` +- **Current calibrated value: -6.0° (np.deg2rad(-6.0))** — negative = forward correction +- Keys `9`/`0` adjust by ±1° live without relaunching +- This is NOT a hack — it's an IMU calibration parameter that GR00T-WBC is missing +- Multiple GitHub users (Issues #21, #22, #23) report the same backward lean — likely the same root cause + +### PD Gain Tuning History +The ONNX policy was trained in sim with gains: `kps: [150,150,150,200,40,40, ...]` + +Best results achieved with Unitree teleop gains + IMU offset: + +| Source | Hip pitch | Hip roll | Hip yaw | Knee | Ankle pitch | Ankle roll | Waist | +|--------|:-:|:-:|:-:|:-:|:-:|:-:|:-:| +| GR00T-WBC sim (trained) | 150 | 150 | 150 | 200 | 40 | 40 | 250 | +| Unitree SDK example | 60 | 60 | 60 | 100 | 40 | 40 | 60/40/40 | +| Unitree xr_teleoperate | 300 | 300 | 300 | 300 | 80 | 80 | 300 | +| **Current best (teleop + IMU offset)** | **300** | **300** | **300** | **300** | **80** | **80** | **300** | + ## Key Relationships - Governed by: [[safety-limits]] (operational constraints, pre-op checks) - Constrained by: [[power-system]] (battery runtime limits deployment duration) diff --git a/context/gb10-offboard-compute.md b/context/gb10-offboard-compute.md index 663fe94..8b8a938 100644 --- a/context/gb10-offboard-compute.md +++ b/context/gb10-offboard-compute.md @@ -68,14 +68,14 @@ curl http://192.168.123.100:30000/v1/chat/completions \ Both systems are ARM64-native. Model files (.pt, .onnx, .gguf) trained on GB10 deploy directly to Orin NX without architecture conversion. Container images are interoperable (both aarch64). -## 7. 
GR00T-WBC Deployment — Verified (2026-02-14) [T1] +## 7. GR00T-WBC Deployment — Verified (2026-02-14/15) [T1] -GR00T-WBC runs successfully on the GB10. This is the primary use case: offboard whole-body control simulation and (eventually) real-time control relay. +GR00T-WBC runs successfully on the GB10 for both simulation and **real robot control**. The GB10 relays DDS commands to the G1 over Ethernet at 50 Hz. **Network Configuration:** -- GB10 at `10.0.0.68` on LAN (not on robot's 192.168.123.0/24 subnet yet) -- SSH key auth configured for user `mitchaiet` -- Firewall (ufw) open for: SSH (22), VNC (5900), NoMachine (4000), Sunshine (47984-47990), web viewer (8080) +- GB10 at `10.0.0.68` on LAN, also at `192.168.123.100` on robot subnet +- SSH: `ssh mitchaiet@10.0.0.68` (password: `Strat3*gb10`) +- Firewall (ufw) open for: SSH (22), VNC (5900), NoMachine (4000), Sunshine (47984-47990), web viewer (8080), robot subnet (192.168.123.0/24) **Software Stack on GB10:** - Ubuntu 24.04.3 LTS (Noble), kernel 6.14.0-1015-nvidia @@ -98,6 +98,16 @@ GR00T-WBC runs successfully on the GB10. 
This is the primary use case: offboard **Launch Commands:** ```bash +# Real robot control (primary use case) +Xvfb :99 -screen 0 1024x768x24 & +tmux new-session -d -s groot "cd ~/GR00T-WholeBodyControl && source .venv/bin/activate && \ + export LD_LIBRARY_PATH=/opt/ros/jazzy/lib:\$LD_LIBRARY_PATH && \ + export DISPLAY=:99 && \ + export CYCLONEDDS_URI='\ + ' && \ + python3 -u gr00t_wbc/control/main/teleop/run_g1_control_loop.py --no-with-hands --interface real \ + 2>&1 | tee /tmp/groot_diag.log" + # Simulation with viewer (from NoMachine terminal) bash ~/GR00T-WholeBodyControl/launch_sim.sh --sync diff --git a/context/learning-and-ai.md b/context/learning-and-ai.md index e91d3fe..9e7ab1e 100644 --- a/context/learning-and-ai.md +++ b/context/learning-and-ai.md @@ -55,6 +55,21 @@ Advanced RL training on NVIDIA Isaac Lab: [T0] | Fall-Safety (arXiv:2511.07407) | Unified prevention + mitigation + recovery | Yes (zero-shot) | T1 | | Vision Locomotion (arXiv:2602.06382) | End-to-end depth-based locomotion | Yes | T1 | | Safe Control (arXiv:2502.02858) | Projected Safe Set for collision avoidance | Yes | T1 | +| ASAP (sim-to-real correction) | Aligning Simulation and Real-World Physics — residual network corrects sim-trained policy using real-world data. 52.7% tracking error reduction on G1. 
| Yes | T1 | + +### WBC-AGILE — Open-Source Training Framework [T1] + +NVIDIA's **WBC-AGILE** (`nvidia-isaac/WBC-AGILE`) provides the training framework for GR00T-WBC policies: + +- **Repository:** nvidia-isaac/WBC-AGILE (GitHub) +- **Purpose:** Train locomotion + WBC policies for G1 and other humanoids +- **Framework:** Isaac Lab + RSL-RL (PPO) +- **G1 support:** Built-in G1 configuration +- **Deployment:** Exports to ONNX, drop-in replacement for GR00T-WBC pre-trained policies +- **Use cases:** Retraining with corrected dynamics, fine-tuning PD gains, adding push recovery curriculum +- **GB10 compatible:** Isaac Sim/Lab officially supported on GB10/DGX Spark + +**Note:** The GR00T-WBC repository is inference-only — it does NOT contain training code. WBC-AGILE is the separate training framework. ## 2. Imitation Learning diff --git a/context/networking-comms.md b/context/networking-comms.md index 0b401aa..2a048a8 100644 --- a/context/networking-comms.md +++ b/context/networking-comms.md @@ -42,7 +42,8 @@ Computer Computer (Livox MID360) | Device | IP Address | Access Level | |----------------------|--------------------|--------------------| | Locomotion Computer | 192.168.123.161 | Not user-accessible | -| Development Computer | 192.168.123.164 | User-accessible (SSH) | +| Development Computer (eth0) | 192.168.123.164 | User-accessible (SSH) via internal network | +| Development Computer (wlan0) | 10.0.0.64 | User-accessible (SSH) via WiFi LAN — same Jetson Orin NX | | Livox MID360 LiDAR | 192.168.123.20 | Ethernet, driver required | | External dev machine | 192.168.123.x | Via WiFi or Ethernet | @@ -140,13 +141,34 @@ Connecting the Dell Pro Max GB10 to the G1's internal DDS network via Ethernet s - Use `ChannelFactoryInitialize(0, "enP7s7")` or `--interface real` (auto-detects 192.168.123.x interface) ### SSH Credentials (Verified) -- **Jetson:** `ssh unitree@192.168.123.164` — password: `123` (NOT "unitree") [T1] +- **Jetson (internal):** `ssh 
unitree@192.168.123.164` — password: `123` (NOT "unitree") [T1] +- **Jetson (WiFi direct):** `ssh unitree@10.0.0.64` — password: `123` — same Jetson Orin NX, reachable directly on LAN via wlan0 [T1] - **Locomotion computer (192.168.123.161):** Not user-accessible via SSH ### Tools on GB10 - `/tmp/dds_test.py` — Quick DDS connectivity test (ping → multicast → SDK rt/lowstate) - `~/GR00T-WholeBodyControl/launch_real.sh` — Launches GR00T-WBC with `--interface real` +## 7. Lab Network Topology (Verified 2026-02-18) [T1] + +``` +[AT&T BGW320-505 Gateway] + └── [NETGEAR WiFi AP] (BSSID 34:98:B5:xx:xx:xx) + │ SSID: "Super Exmodiar LVL5G" (5GHz, 802.11ax) + │ Subnet: 10.0.0.0/24 + │ + ├── Windows PC: 10.0.0.53 (WiFi) + ├── G1 Robot: 10.0.0.64 (wlan0, Jetson Orin NX) + ├── Vision Pro: 10.0.0.74 (WiFi, after factory reset; was 10.0.0.65) + └── GB10: 10.0.0.68 (DHCP, primary) +``` + +### Notes +- **Hybrid router system**: AT&T BGW320-505 is the gateway/modem; NETGEAR provides WiFi AP. This is a common AT&T Fiber setup where the BGW320 handles routing and the NETGEAR extends WiFi. +- **Windows WiFi profile**: Classified as "Public" network (strictest Windows Firewall profile). This blocks network discovery and file sharing but does NOT block outgoing WebSocket connections. +- **Vision Pro IP changes on factory reset**: DHCP assigns a new IP after factory reset. Check with router admin page or `arp -a` if VP is unreachable at old IP. +- **WebSocket traffic**: Port 8012 (Vuer HTTPS + WSS) works fine across this network. No NAT, firewall, or routing issues observed between any devices on the 10.0.0.0/24 subnet. 
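When a device seems unreachable after a DHCP change, a quick scripted check can complement `arp -a`. The sketch below is a minimal example using only the Python standard library: `DEVICES` copies the addresses from the topology diagram (which DHCP may reassign), and `on_lab_subnet` and `tcp_port_open` are hypothetical helper names. It assumes the service being probed accepts plain TCP connections on the given port.

```python
import ipaddress
import socket

LAB_SUBNET = ipaddress.ip_network("10.0.0.0/24")

# Addresses taken from the topology diagram; DHCP may reassign them.
DEVICES = {
    "windows_pc": "10.0.0.53",
    "g1_jetson_wlan0": "10.0.0.64",
    "gb10": "10.0.0.68",
    "vision_pro": "10.0.0.74",
}


def on_lab_subnet(ip: str) -> bool:
    """True if the address belongs to the 10.0.0.0/24 WiFi LAN."""
    return ipaddress.ip_address(ip) in LAB_SUBNET


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Attempt a TCP connection; True if the handshake succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A call such as `tcp_port_open(DEVICES["g1_jetson_wlan0"], 8012)` would indicate whether the Vuer port accepts connections, assuming the Vuer server is running on that host. It does not validate TLS or WebSocket certificate trust.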
+ ## Key Relationships - Transports: [[sdk-programming]] (SDK commands and data via DDS) - Transports: [[ros2-integration]] (ROS2 topics via same DDS layer) diff --git a/context/open-questions.md b/context/open-questions.md index f665717..2130231 100644 --- a/context/open-questions.md +++ b/context/open-questions.md @@ -152,17 +152,16 @@ This file catalogs what we know, what we don't know, and what would resolve the ### Open - **Q:** How does GR00T-WBC Walk policy compare to stock G1 controller for push recovery? - - _Partial:_ Walk policy demonstrated working in sim (walking, turning, strafing). Push recovery not yet quantified. + - _Partial:_ Balance policy tested on real robot. Robot can recover from light pushes with IMU offset + teleop gains. Not yet quantified with force measurement. Walk policy not yet tested on real robot. - _Would resolve:_ Side-by-side push testing on real robot with force measurement. - **Q:** What is the exact training recipe for the pre-trained ONNX policies (Balance, Walk)? - - _Partial:_ PPO via RSL-RL in Isaac Lab. MLP [512, 256, 128]. Domain randomization. Zero-shot transfer. Exact reward function and perturbation curriculum not published by NVIDIA. + - _Partial:_ PPO via RSL-RL in Isaac Lab. MLP [512, 256, 128]. Domain randomization. Zero-shot transfer. Exact reward function and perturbation curriculum not published by NVIDIA. WBC-AGILE (nvidia-isaac/WBC-AGILE) provides training framework but may differ from pre-trained models. - _Would resolve:_ NVIDIA publishing training code, or reverse-engineering from ONNX model + observation/action analysis. -- **Q:** Can GR00T-WBC relay real-time control from GB10 to G1 over the network? - - _Partial:_ Network bridge established (GB10 at 192.168.123.100, ping <1ms to .161/.164). DDS multicast confirmed active inside robot. UFW firewall was blocking — now resolved. Full rt/lowstate subscription test pending (robot was disconnected before final verification). 
- - _Update (2026-02-15):_ UFW firewall identified as root cause. `sudo ufw allow from 192.168.123.0/24` fixes it. DDS test script ready at `/tmp/dds_test.py`. Launch script at `~/GR00T-WholeBodyControl/launch_real.sh`. - - _Would resolve:_ Final DDS data verification with robot reconnected, then `launch_real.sh` test. +- **Q:** What is the optimal IMU pitch offset for the G1? + - _Partial:_ Approximately -6° (np.deg2rad(-6.0)) calibrated on one G1 EDU Ultimate E (U7). May vary per unit due to manufacturing tolerance. The stock Unitree controller handles this calibration internally. + - _Would resolve:_ Testing on multiple G1 units to determine if offset is consistent or per-unit. - **Q:** What camera-based mocap solution integrates best with GR00T-WBC's upper body teleop? - _Partial:_ GR00T-WBC supports Pico VR, LeapMotion, HTC Vive, iPhone natively. Camera-based (MediaPipe, OpenPose) not built-in but could publish to the same ROS topic. @@ -196,8 +195,8 @@ This file catalogs what we know, what we don't know, and what would resolve the ### Open - **Q:** Which Vision Pro integration path works best with GR00T-WBC? - - _Partial:_ Two paths identified. Path 1 (xr_teleoperate WebXR) is fastest but bypasses GR00T-WBC. Path 2 (VisionProTeleop native app → avp_stream → GR00T-WBC) is higher quality. - - _Would resolve:_ Prototype both, compare latency and tracking quality. + - _Partial:_ xr_teleoperate WebXR path verified working (arm tracking confirmed 2026-02-18). Bypasses GR00T-WBC (uses stock controller + `rt/arm_sdk`). VisionProTeleop native app path blocked — `avp_stream` server can't bind port 12345 on robot (port held by Unitree process). GR00T-WBC bridge untested. + - _Would resolve:_ Resolve port 12345 conflict on robot, or find alternate gRPC port for avp_stream. - **Q:** Can the GR00T-WBC Walk policy maintain balance with Vision Pro-driven arm poses? - _Partial:_ The Walk ONNX policy receives upper body joint angles as observation input and can compensate. 
But it was NOT trained with arbitrary arm configurations — conservative motions likely fine, extreme poses may destabilize. @@ -205,12 +204,37 @@ This file catalogs what we know, what we don't know, and what would resolve the - _Would resolve:_ Test with real Vision Pro driving arms while walking. If unstable, proceed with unified training plan. - **Q:** Does the Unitree wireless remote work under GR00T-WBC? - - **A:** No. GR00T-WBC takes over low-level motor control via rt/lowcmd, bypassing the stock controller that reads the remote. The rt/wirelesscontroller DDS topic is still published by the robot but nothing in GR00T-WBC subscribes to it on real hardware. Keyboard control (w/s/a/d/q/e) is the built-in alternative. [T1 — Source inspection, 2026-02-15] + - **A:** No. GR00T-WBC takes over low-level motor control via rt/lowcmd, bypassing the stock controller that reads the remote. The rt/wirelesscontroller DDS topic is still published by the robot but nothing in GR00T-WBC subscribes to it on real hardware. Keyboard control (w/s/a/d/q/e) is the built-in alternative. The `wireless_remote` field is present in rt/lowstate and could be bridged to GR00T-WBC's command system. [T1 — Source inspection, 2026-02-15] --- ## Resolved +### Vision Pro xr_teleoperate WebXR (Resolved 2026-02-18) + +- **Q:** Can xr_teleoperate's WebXR pipeline (Vuer/TeleVuer) work with Apple Vision Pro for arm teleoperation? + - **A:** Yes. Requires: (1) vuer v0.0.60 (v0.0.40 client JS incompatible with visionOS Safari), (2) JS port fix (`hostname` → `host` in all chunk files with `wss://`), (3) aiohttp SSL assertion fix (`assert self._paused` → `if not self._paused: return`), (4) CA-signed certs with rootCA installed + full trust enabled on VP, (5) launch from `~/xr_teleoperate/teleop/` for URDF relative paths. Arms track hand movements at 30 Hz IK loop. IK configuration flipping is a known issue near singularities (recoverable by restarting teleop). 
[T1 — Verified on real robot, 2026-02-18] + +- **Q:** Why does Safari on visionOS fail to establish WebSocket connections to self-signed HTTPS servers? + - **A:** Safari treats HTTPS page trust and WebSocket (`wss://`) trust separately. Clicking "Accept" on the browser cert warning only trusts the page load — WebSocket connections are silently rejected with no error and no prompt. The root CA must be installed as a device profile AND explicitly enabled in Settings → General → About → Certificate Trust Settings. If stale cert state has accumulated from debugging, a factory reset of the Vision Pro may be needed for a clean install. [T1 — Verified 2026-02-18] + +### GR00T-WBC Real Robot (Resolved 2026-02-15) + +- **Q:** Can GR00T-WBC relay real-time control from GB10 to G1 over the network? + - **A:** Yes. GB10 at 192.168.123.100 sends rt/lowcmd and receives rt/lowstate via DDS. CYCLONEDDS_URI must specify the network interface explicitly. UFW firewall must allow 192.168.123.0/24. Robot stands and balances autonomously with the Balance ONNX policy. [T1 — Verified on real robot, 2026-02-15] + +- **Q:** Why does GR00T-WBC cause persistent backward lean on the real G1? + - **A:** **IMU mounting offset.** The G1's pelvis IMU has a physical pitch offset (~6°) relative to simulation. The stock Unitree controller compensates internally, but GR00T-WBC reads raw IMU data via DDS and has no calibration step. Fix: apply a quaternion pitch rotation before gravity computation. This is NOT a sim-to-real gap in the policy — it's a missing sensor calibration. Multiple GitHub users (Issues #21, #22, #23) report the same problem. [T1 — Root-caused and verified on real robot, 2026-02-15] + +- **Q:** Should GR00T-WBC ONNX policy actions be clipped to [-1, 1]? + - **A:** No. NVIDIA's reference code does NOT clip actions. RSL-RL does not clip by default. Policy outputs exceeding [-1,1] are intentional for push recovery and large corrections. 
Adding np.clip() causes policy saturation — outputs rail at clip boundaries with no room for balance corrections. [T1 — Verified via NVIDIA source inspection and real robot testing, 2026-02-15] + +- **Q:** What is the G1's mode_machine value? + - **A:** mode_machine=5 (g1_29dof_rev_1_0) on our G1 EDU Ultimate E (U7), confirmed via 63 DDS samples. GR00T-WBC should read this dynamically from rt/lowstate (PR #11) rather than hardcoding. [T1 — Verified 2026-02-15] + +- **Q:** What PD gains work best for GR00T-WBC on the real G1? + - **A:** Unitree xr_teleoperate gains (KP: 300/300/300/300/80/80 per leg, 300/300/300 waist; KD: 3/3/3/3/2/2 per leg, 5/5/5 waist) combined with IMU offset calibration give the best results. The sim-trained gains (KP: 150/150/150/200/40/40) give better push recovery but less tracking precision. The policy was trained with the sim gains, so its corrections are calibrated for those — but with the correct IMU reference frame, stiffer gains improve tracking without the policy fighting itself. [T1 — Iterative tuning on real robot, 2026-02-15] + ### MuJoCo Playground Training (Resolved) - **Q:** Can MuJoCo Playground train G1 policies on Blackwell (sm_121)? diff --git a/context/robot-modifications-log.md b/context/robot-modifications-log.md new file mode 100644 index 0000000..c5bafe1 --- /dev/null +++ b/context/robot-modifications-log.md @@ -0,0 +1,91 @@ +# Robot Modifications Log — Pre-Wipe Snapshot + +**Date:** 2026-02-17 +**Purpose:** Document all changes made to the robot before resetting to clean upstream code. + +--- + +## 1. 
xr_teleoperate (~/xr_teleoperate on robot) + +**Upstream:** https://github.com/unitreerobotics/xr_teleoperate.git +**Robot commit:** 9fadc51 (upstream HEAD) + +### Modified files (tracked, dirty): + +#### teleop/teleop_hand_and_arm.py +- Added `import numpy as np` at top +- Added `--avp-ip` argparse flag (Vision Pro IP for avp_stream bypass) +- Added `--debug` argparse flag (enables per-frame file logging) +- Added `teleop_logger` import and `setup_teleop_logging()` call +- Added conditional: if `--avp-ip` given, uses `AVPStreamWrapper` instead of `TeleVuerWrapper` +- Added per-frame IK debug logging block (writes to `/tmp/teleop_debug_*.log`) +- Added `[IK_DBG]` print every 30 frames (sol_q, cur_q, delta) + +#### teleop/robot_control/robot_arm_ik.py +- **IK cost function:** zeroed rotation cost (`0 * self.rotation_cost`) for position-only mode — applied to ALL 4 IK classes (G1_29, G1_23, H1_2, H1) +- Added `self.last_solve_info = {}` dict to all 4 classes +- Added solve info capture in try/except blocks (status, raw_sol, cost on success; status, error on fail) + +### Untracked files (our additions): +- `teleop/avp_stream_wrapper.py` — Drop-in AVPStream wrapper using gRPC (Tracking Streamer app). Full coordinate transform pipeline matching TeleVuerWrapper. +- `teleop/teleop_logger.py` — Debug logging utility (setup_teleop_logging, fmt_arr, fmt_pos, fmt_rpy, rot_to_euler) +- `teleop/rotation_diagnostic.py` — Diagnostic script for rotation analysis +- `teleop/test_avp_stream.py` — Test script for avp_stream connectivity +- `teleop/robot_control/robot_arm_ik.py.bak_debug` — Backup before debug changes +- `teleop/teleop_hand_and_arm.py.bak_debug` — Backup before debug changes + +--- + +## 2. 
vuer package (~/miniforge3/envs/tv/lib/python3.10/site-packages/vuer/) + +**Package:** vuer 0.0.60 (pip install) +**Upstream:** https://github.com/vuer-ai/vuer + +### Patches applied (with .bak backups): + +#### base_protocol.py (aiohttp, NOT vuer) +**File:** `~/miniforge3/envs/tv/lib/python3.10/site-packages/aiohttp/base_protocol.py` +**Bug:** `assert self._paused` in `resume_writing()` crashes on SSL WebSocket connections (aiohttp 3.10.5, Python 3.10, aarch64) +**Fix:** Changed line 36 from `assert self._paused` to `if not self._paused: return` +**Backup:** `.bak` exists + +#### chunk-Dd3xtWba.js (Vuer client JS bundle) +**File:** `~/miniforge3/envs/tv/lib/python3.10/site-packages/vuer/client_build/assets/chunks/chunk-Dd3xtWba.js` +**Bug:** `getSocketURI()` uses `window.location.hostname` (no port) for HTTPS case, causing WebSocket to connect to port 443 instead of 8012 +**Fix:** Changed `wss://${window.location.hostname}` to `wss://${window.location.host}` +**Backup:** `.bak` exists + +#### index.html (Vuer client) +**File:** `~/miniforge3/envs/tv/lib/python3.10/site-packages/vuer/client_build/index.html` +**Change:** Added `?cb=1771359809` cache-busting to all JS/CSS references +**Backup:** `.bak` exists + +#### server.py (Vuer server) +**File:** `~/miniforge3/envs/tv/lib/python3.10/site-packages/vuer/server.py` +**Change:** Added request logging line at start of `socket_index()`: `print(f"[REQ] {request.method} {request.path_qs} upgrade=...")` +**Backup:** No .bak (logging only, can be dropped) + +--- + +## 3. Current Issue (at time of wipe) + +WebSocket connects then immediately disconnects in pass-through mode. 3 connect/disconnect cycles (react-use-websocket retries 3x then gives up). User sees hands (client-side WebXR) but no data flows to robot. Root cause investigation was in progress — examining Vuer's `downlink()` handler in `server.py` lines 597-670. + +The `downlink()` handler: +1. Creates a VuerProxy session +2. 
Calls `self.bound_fn(vuer_proxy)` which returns a generator +3. Calls `await generator.__anext__()` to get the first server event +4. Enters `async for msg in ws:` loop to process incoming client messages +5. When client stops sending → "websocket is now disconnected" + +**Hypothesis:** The pass-through handler pushes `Hands(stream=True)` then sleeps forever. The Vuer server's downlink expects the `socket_handler` (set by `@app.add_handler`) to be an async generator that yields events. If the handler is a plain async function (not a generator), the `hasattr(generator, "__anext__")` check may fail, causing it to call `next(generator)` on a coroutine, which could raise TypeError silently. + +--- + +## 4. Key Findings to Preserve + +1. **Vuer JS WebSocket port bug** — MUST fix in any vuer version used with HTTPS on non-443 port +2. **aiohttp SSL assertion bug** — MUST fix for aiohttp 3.10.5 on aarch64 +3. **display-mode pass-through** is required when Vision Pro can't reach robot's internal network (192.168.123.164) for WebRTC +4. **IK rotation cost = 0** was set for position-only tracking during debugging — should be restored to original values once rotation corrections are working +5. 
**avp_stream_wrapper.py** is a complete working wrapper for the native Tracking Streamer pipeline (gRPC, port 12345) — should be committed as a proper feature diff --git a/context/sdk-programming.md b/context/sdk-programming.md index 0fc94a4..9b47214 100644 --- a/context/sdk-programming.md +++ b/context/sdk-programming.md @@ -127,15 +127,42 @@ The SDK supports multiple control modes via the `MotorCmd_` structure: [T0 — D | `rt/dex3/left/state` | `unitree_hg` protocol | Subscribe | Left hand state | | `rt/dex3/right/state` | `unitree_hg` protocol | Subscribe | Right hand state | -### LowState_ Contents +### LowState_ Contents (HG variant for G1) - `mode_pr`: Parallel mechanism control mode -- `mode_machine`: G1 type identifier +- `mode_machine`: G1 type identifier (5=g1_29dof_rev_1_0, 11-16=newer revisions) - `tick`: Timer (increments every 1ms) -- `imu_state`: IMU sensor data (orientation, angular velocity, acceleration) +- `imu_state`: IMU sensor data (quaternion [w,x,y,z], angular velocity, acceleration) - `motor_state`: State for all body motors (position, velocity, torque) -- `wireless_remote`: Remote control data +- `wireless_remote`: Remote control data (all zeros until button pressed) - `crc`: Checksum for data integrity +### LowCmd_ Variants [T1 — Verified 2026-02-15] +The G1 uses the HG message variant (`unitree_hg_msg_dds__LowCmd_`), NOT the GO variant: + +| Field | HG (G1) | GO (Go2) | +|-------|:---:|:---:| +| `mode_pr` | Yes | No | +| `mode_machine` | Yes | No | +| `motor_cmd` | Yes | Yes | +| `reserve` | Yes | No | +| `crc` | Yes | Yes | +| `head` | **No** | Yes | +| `level_flag` | **No** | Yes | +| `gpio` | **No** | Yes | + +Using the wrong variant will cause `AttributeError` at runtime. Always use `unitree_hg_msg_dds__LowCmd_()` for G1. 
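A minimal sketch of a pre-flight guard for the variant table above, assuming only the field names listed there. `is_hg_lowcmd` and the `SimpleNamespace` stand-ins are hypothetical illustrations, not part of unitree_sdk2py; in real use the argument would be the message object built by the SDK.

```python
from types import SimpleNamespace

# Field names taken from the HG vs GO table above.
HG_ONLY_FIELDS = ("mode_pr", "mode_machine", "reserve")
GO_ONLY_FIELDS = ("head", "level_flag", "gpio")


def is_hg_lowcmd(cmd) -> bool:
    """Return True if cmd exposes the HG-only fields and none of the GO-only ones."""
    has_hg = all(hasattr(cmd, name) for name in HG_ONLY_FIELDS)
    has_go = any(hasattr(cmd, name) for name in GO_ONLY_FIELDS)
    return has_hg and not has_go


# Hypothetical stand-ins that only mimic the attribute sets of each variant.
hg_like = SimpleNamespace(mode_pr=0, mode_machine=5, motor_cmd=[], reserve=[0] * 4, crc=0)
go_like = SimpleNamespace(head=[0, 0], level_flag=0, motor_cmd=[], gpio=0, crc=0)

print(is_hg_lowcmd(hg_like))  # HG-shaped object passes the guard
print(is_hg_lowcmd(go_like))  # GO-shaped object is rejected
```

Running a check like this once at startup turns a mid-loop `AttributeError` into an early, explicit failure.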
+ +### mode_machine Values [T1] +| Value | Model | Hip Pitch Ratio | Hip Roll Ratio | Notes | +|:---:|---|:---:|:---:|---| +| 2 | g1_29dof (old) | 14.3 | 14.5 | Pre-rev-1.0 | +| 5 | g1_29dof_rev_1_0 | 14.3 | 22.5 | Common for EDU units | +| 11 | g1_29dof_mode_11 | 22.5 | 22.5 | Newer — both hips upgraded | +| 12 | g1_29dof_mode_12 | 22.5 | 22.5 | Newer variant | +| 13-16 | g1_29dof_mode_13-16 | varies | 22.5 | 5010 wrist motor variants | + +**Important:** Always read `mode_machine` from `rt/lowstate` rather than hardcoding. The firmware uses this value to interpret joint commands with the correct gear ratios. + ## 5. Development Workflow 1. **Simulation first:** Test with `unitree_mujoco` (same DDS interface, switch domain ID) diff --git a/context/sensors-perception.md b/context/sensors-perception.md index d2b2091..7f9141f 100644 --- a/context/sensors-perception.md +++ b/context/sensors-perception.md @@ -58,10 +58,48 @@ The primary visual sensor for RGB-D perception, obstacle detection, and environm Critical for balance and state estimation during locomotion. - **Type:** 6-axis (3-axis accelerometer + 3-axis gyroscope) +- **Location:** Pelvis (body torso) — measures pelvis orientation only, NOT total body lean [T1] - **Access:** Available in `rt/lowstate` → `imu_state` field +- **Quaternion format:** [w, x, y, z] — confirmed via code inspection and accelerometer cross-validation [T1] - **Used by:** Locomotion computer for real-time balance control, state estimation - **User access:** Via SDK2 LowState subscription +### IMU Mounting Offset [T1 — Verified on real G1, 2026-02-15] + +**CRITICAL:** The physical IMU in the G1's pelvis has a mounting offset of approximately **6° in pitch** relative to what simulation assumes. 
This means: + +- In simulation, the IMU is perfectly aligned with the body frame (zero offset by definition) +- On the real robot, the IMU's "zero" is ~6° off from true upright +- The **stock Unitree controller compensates** for this offset internally (calibration in the locomotion MCU firmware) +- **Third-party controllers (e.g., GR00T-WBC) that bypass the stock controller and read raw IMU data via DDS will see a biased reference frame**, causing persistent backward lean + +**Impact:** Any RL policy trained in simulation that reads IMU orientation will have a systematically wrong reference for "upright" on the real robot. This cannot be fixed by PD gain tuning or pitch commands alone — the policy's balance loop actively fights corrections because it believes its (wrong) reference is correct. + +**Fix:** Apply a quaternion pitch rotation to the IMU quaternion before computing the gravity vector: +```python +# Rotate quaternion by IMU pitch offset (around Y axis) +half = imu_pitch_offset / 2.0 # ~np.deg2rad(-6.0) for forward correction +pitch_quat = np.array([np.cos(half), 0.0, np.sin(half), 0.0]) # w,x,y,z +corrected_quat = quat_multiply(pitch_quat, raw_imu_quat) +gravity = get_gravity_orientation(corrected_quat) +``` + +**Note:** The exact offset may vary per unit. Calibrate by adjusting until the robot stands upright. On our G1 EDU Ultimate E (U7), -6° was the calibrated value. + +### Total Body Lean vs IMU Measurement [T1] + +The IMU only measures pelvis tilt. Total visible body lean compounds across the kinematic chain: + +| Source | Contribution | +|--------|:-:| +| Pelvis IMU (measured) | ~1-3° | +| Waist pitch sag | ~1-2° | +| Knee compression | ~1-2° | +| Ankle pitch error | ~1-2° | +| **Total visible lean** | **~5-10°** | + +When the user reports "8-10° backward lean" but the IMU shows only 1.5-3°, this is NOT an error — it's the compounding effect across joints. + ## 5. 
Joint Proprioception Dual encoders per joint provide high-accuracy feedback: diff --git a/context/simulation.md b/context/simulation.md index 40b54d3..66d10e2 100644 --- a/context/simulation.md +++ b/context/simulation.md @@ -126,6 +126,17 @@ Standard RL training uses domain randomization for robustness: - Latency simulation - **External force perturbations** — random pushes applied to torso during training for push recovery robustness (see [[push-recovery-balance]]) +### Known Sim-to-Real Gaps [T1 — Verified on real G1, 2026-02-15] + +| Gap | Impact | Fix | +|-----|--------|-----| +| **IMU mounting offset** | Persistent backward lean (~6° pitch offset between sim and real IMU) | Quaternion pitch rotation before gravity computation. See [[sensors-perception]] §4. | +| **PD gain calibration** | Policy trained with specific gains; changing gains degrades balance even if tracking improves | Use sim-trained gains OR recalibrate after gain change. Policy corrections are calibrated for training gains. | +| **Action clipping** | Not present in sim (RSL-RL default); adding clip in deployment causes saturation | Do NOT clip ONNX outputs — policy needs full range for push recovery | +| **Surface compliance** | Carpet adds ~1° effective ankle sag vs rigid sim floors | Minor — within policy tolerance | + +**Key insight:** When a sim-trained policy produces poor real-world behavior, check sensor calibration BEFORE tuning gains or modifying the policy. An incorrect IMU reference frame cannot be fixed by any amount of gain tuning — the policy's balance loop actively fights the correction. + ### Perturbation Testing in Simulation Both MuJoCo and Isaac Gym/Lab support applying external forces to the robot body during simulation. 
This is essential for: - Training push-robust locomotion policies ([[push-recovery-balance]]) diff --git a/context/teleoperation.md b/context/teleoperation.md new file mode 100644 index 0000000..885b7de --- /dev/null +++ b/context/teleoperation.md @@ -0,0 +1,363 @@ +# Teleoperation & Telepresence + +**Status:** Active — xr_teleoperate working with Vision Pro (v0.0.60 + patches, verified 2026-02-18) +**Evidence tier:** T1 (code-verified) unless otherwise noted + +--- + +## 1. xr_teleoperate (Unitree Official) + +**Repository:** https://github.com/unitreerobotics/xr_teleoperate +**Location on robot:** `/home/unitree/xr_teleoperate/` (standalone clone) and `/home/unitree/g1-control/repos/xr-teleoperate/` (submodule of friend's g1-control) +**Conda environment:** `tv` (Python 3.10, pinocchio 3.1.0, casadi 3.6.7, unitree_sdk2py 1.0.1) + +### Architecture + +``` +Apple Vision Pro (Safari WebXR) + │ + │ HTTPS + WebSocket (port 8012) + │ Data: wrist 4x4 SE3 + finger joints + head pose + ▼ +Robot Jetson (10.0.0.64): teleop_hand_and_arm.py + │ TeleVuerWrapper.get_tele_data() → wrist poses + │ → Pinocchio + CasADi IK → 14 arm joint angles (7 left + 7 right) + │ → G1_29_ArmController publishes to DDS + ▼ +DDS topic: "rt/arm_sdk" (motion mode) or "rt/lowcmd" (debug mode) + │ Motor commands at 250 Hz (internal interpolation from 30 Hz IK) + ▼ +Robot motors (arms only in motion mode; all joints in debug mode) +``` + +### Two Operating Modes + +| Aspect | Debug Mode (no `--motion`) | Motion Mode (`--motion`) | +|--------|---------------------------|--------------------------| +| MotionSwitcher called? | Yes — enters Debug Mode | No — skipped entirely | +| DDS topic | `rt/lowcmd` (full control) | `rt/arm_sdk` (arm overlay) | +| Joint control scope | ALL joints (legs locked at current pos) | Arms only; legs under built-in locomotion | +| Robot can walk? 
| No — frozen in place | Yes — via R3 physical controller | +| Blend weight signal | Not used | `motor_cmd[29].q = 1.0` (tells internal controller to apply arm commands) | +| Required robot state | Any (debug mode takes over) | **Must be in Regular mode, set on the R3 controller (R1+X on base G1, R1+Y on EDU)** | +| Exit behavior | Arms go home | Arms go home with gradual weight ramp (1.0→0.0 over ~2s) | + +### Correct Startup Procedure (`--motion` mode) [T1] + +**Prerequisites:** +- Robot powered on, standing +- R3 physical controller available +- SSL certs at `~/.config/xr_teleoperate/{cert.pem, key.pem}` with correct IPs in SAN +- `rootCA.pem` installed and trusted on Vision Pro (Settings → General → About → Certificate Trust Settings) +- Vision Pro on same WiFi network as robot + +**Step 1: R3 controller → Regular mode** +The robot must be in Regular mode BEFORE launching teleop. The teleop script does NOT handle mode switching itself. + +From power-on: **L2+B** (Damping) → **L2+D-pad UP** (Locked Standing) → **Regular mode:** + +| Variant | Regular Mode Button | Waist DOFs | Notes | +|---------|-------------------|------------|-------| +| G1 Base (23 DOF) | **R1 + X** | 1 waist joint | | +| G1 EDU (29 DOF) | **R1 + Y** | 3 waist joints | R1+X is disabled on EDU hardware | + +**Running mode (R2+A) is NOT supported** — causes lower body joints to lock rigid during teleop (xr_teleoperate Issue #251). [T1 — Verified 2026-02-18] + +**Step 2: Launch teleop on robot** +```bash +conda activate tv +cd ~/xr_teleoperate/teleop +python3 teleop_hand_and_arm.py --arm=G1_29 --motion +``` +Wait for: `Press [r] to start syncing the robot with your movements.` + +**Step 3: Connect Vision Pro** +Open Safari → `https://10.0.0.64:8012` → tap "Virtual Reality" + +**Step 4: Align arms** +Position your arms matching the robot's initial pose (arms at sides) to avoid sudden movement. + +**Step 5: Press `r` in terminal** +Starts the IK control loop. Arms begin tracking hand movements. Velocity ramp-up over 5 seconds. 
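The ramp-up in Step 5 can be sketched as follows. This is an illustration, not the xr_teleoperate source: it assumes the velocity limit rises linearly from 20 to 30 rad/s over the first 5 seconds and that joint targets are rate-limited at the 250 Hz motor command rate. `velocity_limit` and `clip_joint_target` are hypothetical names.

```python
import numpy as np


def velocity_limit(t_elapsed, v_start=20.0, v_end=30.0, ramp_s=5.0):
    """Velocity limit in rad/s, rising linearly from v_start to v_end (assumed shape)."""
    frac = min(1.0, max(0.0, t_elapsed / ramp_s))
    return v_start + (v_end - v_start) * frac


def clip_joint_target(target_q, current_q, v_limit, dt=1.0 / 250.0):
    """Scale the joint step so no joint moves more than v_limit * dt in one tick."""
    target_q = np.asarray(target_q, dtype=float)
    current_q = np.asarray(current_q, dtype=float)
    delta = target_q - current_q
    max_step = v_limit * dt
    # Uniform scaling keeps the direction of motion in joint space unchanged.
    scale = max(np.max(np.abs(delta)) / max_step, 1.0)
    return current_q + delta / scale


# 14 arm joints, a large jump requested right after pressing `r` (t = 0 s).
current = np.zeros(14)
target = np.ones(14)
print(clip_joint_target(target, current, velocity_limit(0.0)))
```

Scaling the whole delta vector, rather than clipping each joint independently, keeps the arm moving along the requested joint-space direction while the limit ramps up.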
+ +**During operation:** +- Walk the robot using the R3 physical controller +- Press `s` to toggle recording (if `--record` enabled) +- Press `q` to exit (arms return to rest over ~5 seconds via weight ramp-down) + +### Running from GB10 (offboard) + +When running on GB10 instead of the robot itself, must specify the network interface: +```bash +python3 teleop_hand_and_arm.py \ + --arm G1_29 --motion \ + --network-interface 192.168.123.100 \ + --img-server-ip 192.168.123.164 +``` +Also requires `LD_LIBRARY_PATH` for OpenSSL compatibility: +```bash +export LD_LIBRARY_PATH=$HOME/miniforge3/envs/tv/lib:$LD_LIBRARY_PATH +``` + +### Key CLI Arguments + +| Argument | Default | Description | +|----------|---------|-------------| +| `--arm` | required | `G1_29`, `G1_23`, `H1_2`, `H1` | +| `--ee` | none | End-effector: `dex3`, `dex1`, `inspire_ftp`, `inspire_dfx`, `brainco` | +| `--motion` | false | Use `rt/arm_sdk` with built-in locomotion (vs `rt/lowcmd` debug) | +| `--frequency` | 30 | IK control loop Hz | +| `--input-mode` | `hand` | `hand` (tracking) or `controller` | +| `--display-mode` | `immersive` | `immersive`, `ego`, `pass-through` | +| `--network-interface` | none | DDS network interface IP (required when offboard) | +| `--img-server-ip` | none | Camera image server IP (teleimager on PC2) | +| `--sim` | false | Isaac Sim mode (DDS domain 1) | +| `--record` | false | Record episodes for imitation learning | +| `--headless` | false | No XR visualization | +| `--ipc` | false | Enable ZMQ IPC for external control (used by g1-control server.py) | + +### SSL Certificate Setup [T1] + +The televuer WebXR server requires HTTPS (WebXR mandates secure context). Certificate resolution order: +1. Environment variables: `XR_TELEOP_CERT` and `XR_TELEOP_KEY` +2. User config: `~/.config/xr_teleoperate/cert.pem` and `key.pem` +3. 
Fallback: bundled in televuer package directory + +**Current certs on robot** (regenerated 2026-02-16): +- SAN includes: `10.0.0.64` (wlan0), `192.168.123.164` (eth0), `192.168.1.21`, `127.0.0.1` +- Issuer: `CN = G1 Robot Local CA` +- Valid until: 2028-05-21 +- `rootCA.pem` served via `http://10.0.0.64:9090/rootCA.crt` for Vision Pro download + +**To regenerate certs for a new IP:** +```bash +# Generate CA +openssl req -x509 -new -nodes -newkey rsa:2048 -keyout rootCA.key -out rootCA.pem -days 825 -subj "/CN=G1 Robot Local CA" +# Generate server key + CSR with SAN +openssl req -new -nodes -newkey rsa:2048 -keyout key.pem -out server.csr -config san.cnf +# Sign with CA +openssl x509 -req -in server.csr -CA rootCA.pem -CAkey rootCA.key -CAcreateserial -out cert.pem -days 825 -extfile v3_ext.cnf +``` + +### Safari/visionOS Certificate Trust (CRITICAL) [T1 — Verified 2026-02-18] + +Safari on visionOS treats HTTPS and WebSocket (`wss://`) certificate trust **separately**. Clicking "Accept" on Safari's HTTPS cert warning only trusts the page load — **WebSocket connections are silently rejected** with no error, no prompt. This is documented Apple/WebKit behavior, not a bug. + +**The ONLY way to get `wss://` working with self-signed certs on Vision Pro:** + +1. Generate certs using a CA structure (rootCA.pem → cert.pem), NOT bare self-signed +2. Serve `rootCA.crt` via plain HTTP with MIME type `application/x-x509-ca-cert` + - **`python3 -m http.server` does NOT work** — it serves `.pem` with wrong MIME type and Safari won't trigger profile install + - Use a custom handler or serve as `.crt` with correct Content-Type header +3. On Vision Pro: download from `http://10.0.0.64:9090/rootCA.crt` +4. Install profile: Settings → General → VPN & Device Management +5. 
**Enable full trust**: Settings → General → About → Certificate Trust Settings → toggle ON "G1 Robot Local CA" + +**If Safari has accumulated stale state** from debugging (old certs, cached JS, partial profile installs), a **factory reset of the Vision Pro** may be necessary. After reset, do a clean cert install on the fresh OS. + +### Vuer Package Patches (Required for v0.0.60) [T1 — Verified 2026-02-18] + +The `vuer` pip package (v0.0.60) has bugs that must be patched after install. **These patches are lost on every `pip install`.** + +**1. JS Port Fix — `hostname` → `host` in WebSocket URI construction** + +The `getDefaultSocketURI()` function in the bundled JS client uses `window.location.hostname` for HTTPS, which strips the port. The WebSocket tries port 443 instead of 8012. + +Fix: In ALL chunk files containing `wss://` under `vuer/client_build/assets/chunks/`: +``` +# Find affected files +grep -l "wss://" .../vuer/client_build/assets/chunks/*.js +# In each file, replace: +wss://${window.location.hostname} → wss://${window.location.host} +``` + +On v0.0.60 this affects 4 chunk files: `chunk-Bf98F3Ua.js`, `chunk-BU6qPyb1.js`, `chunk-Dd3xtWba.js`, `chunk-DmvjxeUa.js`. Filenames change per version. + +**2. aiohttp SSL Assertion Fix** + +In `aiohttp/base_protocol.py`, the `resume_writing()` method has `assert self._paused` which crashes on SSL WebSocket connections. + +Fix: `assert self._paused` → `if not self._paused: return` + +**3. Vuer version compatibility** + +| Version | visionOS Safari client | WebSocket stability | Status | +|---------|----------------------|---------------------|--------| +| v0.0.40 | Client JS does NOT work (zero WS connections) | N/A | Do NOT use | +| v0.0.60 | Client JS works (with port fix) | Connects, known disconnect bug (vuer #85) | **Use this** | + +Community reports (vuer #85, televuer #1, xr_teleoperate #241, #242) confirm v0.0.60 has WebSocket disconnect issues. 
With a factory-reset VP + clean cert install, connections are stable. + +### IK Solver Details [T1] + +- **Library:** Pinocchio 3.1.0 + CasADi 3.6.7 (constrained nonlinear optimization) +- **URDF:** `~/xr_teleoperate/assets/g1/g1_body29_hand14.urdf` (29 DOF body + 14 DOF Dex3 hands) +- **Reduced model:** Locks all non-arm joints → 14 DOF (7 per arm) +- **End-effector frames:** 0.05m offset along local x-axis from wrist yaw joint +- **Cost function:** 50 × translational_error + rotation_error + 0.02 × regularization + 0.1 × smoothness +- **Max iterations:** 30, tolerance 1e-4, warm-start enabled +- **Model cache:** `g1_29_model_cache.pkl` (172 KB, avoids slow URDF parsing) +- **Control rate:** 30 Hz outer IK loop, 250 Hz internal motor command interpolation + +### IK Configuration Flipping (Known Issue) [T1 — Observed 2026-02-18] + +During teleoperation, the IK solver (CasADi/IPOPT) can suddenly jump to a different valid joint configuration. Symptoms: +- Arm suddenly sticks out to the side +- The "zero" position appears shifted (e.g., arm horizontal becomes the new resting position) +- User can only move the arm in the "wrong" direction + +**Root cause:** For any given hand position, multiple joint configurations are valid (elbow-up vs elbow-down, different shoulder rotations). The solver uses warm-starting + smoothness cost (0.1 weight) to stay in one configuration. But near singularities (fully extended arm, workspace boundary), it can jump to an alternate solution. Once jumped, the smoothness regularization keeps it in the new (wrong) configuration. + +**This is NOT an encoder issue.** The G1 has dual absolute encoders per joint [T0]. The motors faithfully follow the IK solver's commands — the solver is commanding the wrong configuration. + +**Recovery:** Press `q` to stop, restart teleop, press `r` again. The IK solver reinitializes to the default arm configuration. + +**Potential fixes (not yet implemented):** +1. 
Null-space bias toward preferred configuration (elbow-down, shoulder neutral) +2. Discontinuity detection — reject solutions that jump > threshold between frames +3. Tighter joint-space regularization (increase smoothness weight) +4. Workspace clamping near singularities + +### Internal Control Details [T1] + +- **Arm velocity limit:** Clips to 20–30 rad/s with gradual ramp-up over 5 seconds at start +- **PD gains (arm joints):** Shoulder: kp=80–300, kd=3. Wrist: kp=40, kd=1.5 +- **State feedback:** Always from `rt/lowstate` regardless of mode +- **mode_machine echo:** Reads current `mode_machine` from robot state, echoes it back in commands +- **Blend weight (motion mode):** `motor_cmd[kNotUsedJoint0].q` set to 1.0 during operation, ramped to 0.0 on exit + +--- + +## 2. g1-control (Friend's Custom System) + +**Location on robot:** `/home/unitree/g1-control/` +**Origin:** `experientialtech/g1-control` + +A comprehensive web-based control panel built on top of xr_teleoperate submodule. Provides: + +### Components + +| File | Purpose | +|------|---------| +| `server.py` (59 KB) | aiohttp web server (HTTP 8080, HTTPS 8443) — process management, robot mode control, camera streaming, audio, LiDAR | +| `start_teleop.sh` | Automated launch: kills videohub → starts teleimager → starts Inspire hand drivers → launches teleop | +| `hand_control.py` | Inspire FTP hand control via Modbus TCP (left: 192.168.123.210, right: 192.168.123.211) | +| `loco_helper.py` | Persistent DDS LocoClient subprocess — avoids SIGSEGV from repeated DDS init/cleanup | + +### server.py Features +- **Process management:** Start/stop teleop, image_server, hand drivers, voice/vision chat, LiDAR +- **Robot FSM control:** Via loco_helper.py — modes: damp (1), stand_up (4), walk (501) +- **TeleopIPC:** ZMQ IPC to xr_teleoperate (CMD_START, CMD_STOP, CMD_RECORD_TOGGLE) +- **Camera streaming:** MJPEG from ZMQ, single JPEG snapshots +- **Audio:** DDS-based mic/speaker, TTS, volume/LED control +- **LiDAR:** 
UDP listener, GLB/USDZ export, WebSocket streaming +- **Web UI:** Mobile-friendly on port 8080/8443 + +### start_teleop.sh Launch Sequence +1. Activates conda `tv` +2. Fixes OpenSSL LD_LIBRARY_PATH +3. Checks SSL certs +4. Kills `videohub_pc4` to free `/dev/video2` +5. Starts teleimager image server +6. Starts Inspire hand Modbus drivers (right=192.168.123.211/LR='l', left=192.168.123.210/LR='r' — labels intentionally swapped) +7. Launches: `python3 teleop_hand_and_arm.py --arm=G1_29 --ee=inspire_ftp` +8. **Does NOT handle robot mode switching** — assumes robot is already in correct state + +--- + +## 3. GR00T-WBC AVP Bridge (Alternative Approach) + +**Location on GB10:** `/home/mitchaiet/GR00T-WholeBodyControl/scripts/avp_wbc_bridge.py` +**Status:** Code written, not yet tested + +Alternative approach using VisionProTeleop's native "Tracking Streamer" app (App Store, free) + `avp_stream` Python library → bridge to GR00T-WBC's `ControlPolicy/upper_body_pose` ROS2 topic. + +### Why This Exists + +xr_teleoperate and GR00T-WBC are **incompatible** when running simultaneously: +- xr_teleoperate (debug mode) publishes to `rt/lowcmd` — conflicts with GR00T-WBC's 50 Hz motor commands +- xr_teleoperate (motion mode) uses `rt/arm_sdk` with Unitree's built-in locomotion — bypasses GR00T-WBC entirely + +The AVP bridge approach uses GR00T-WBC's native upper body topic, allowing: +- Arm teleoperation via Vision Pro hand tracking (native ARKit quality) +- GR00T-WBC's RL-based balance and locomotion (trained policy) +- Joystick walking via GR00T-WBC's wireless_remote integration + +### Architecture + +``` +Vision Pro (Tracking Streamer app, gRPC port 12345) + ▼ +GB10: avp_wbc_bridge.py (avp_stream.VisionProStreamer) + │ Coordinate transform (AVP → Unitree frame) + │ Pinocchio damped least-squares IK → 14 arm joints + │ Prepend 3 waist zeros → 17 upper body DOFs + ▼ +ROS2 topic: "ControlPolicy/upper_body_pose" (msgpack over ByteMultiArray) + ▼ +GB10: GR00T-WBC control loop (merges 
upper + lower body) + ▼ +DDS rt/lowcmd → Robot +``` + +### Dependencies (installed on GB10) +- `avp_stream` v2.51 +- `pinocchio` 2.7.0 +- `msgpack_numpy` +- URDF: `~/xr_teleoperate/assets/g1/g1_body29_hand14.urdf` + +--- + +## 4. FSM Mode Reference [T1] + +Robot FSM states (from unitree_sdk2py): + +| FSM ID | Name | Description | +|--------|------|-------------| +| 0 | ZeroTorque | Motors off | +| 1 | Damp | Soft stop, motors damped | +| 3 | Sit | Sit down | +| 4 | Stand up | Robot stands from sitting | +| 5 | Locked standing | Standing, position held (observed) [T2] | +| 200 | Start | Start locomotion | +| 501 | Walk/AI | Full AI locomotion active | +| 702 | Lie2StandUp | Stand up from lying | +| 706 | Squat2StandUp | Toggle squat/stand | + +MotionSwitcher modes (separate from FSM): +- `"ai"` — AI/RL locomotion +- `"normal"` — Normal locomotion +- `"advanced"` — Advanced mode +- Released (empty) — Debug mode (full low-level access) + +**For xr_teleoperate `--motion`:** Robot must be in Regular mode via the R3 physical controller — **R1+X** (1-DOF waist, base G1) or **R1+Y** (3-DOF waist, EDU G1). Running mode (R2+A) is NOT supported and will lock legs. This is distinct from the FSM IDs above. + +--- + +## 5. Pitfalls & Known Issues + +1. **`--motion` requires R3 controller Regular mode (R1+X base / R1+Y EDU) FIRST** — the teleop script does NOT handle mode switching [T1] +2. **Running mode (R2+A) is NOT supported** — only Regular mode works with xr_teleoperate [T1] +3. **xr_teleoperate + GR00T-WBC conflict** — both publish motor commands, cannot run simultaneously [T1] +4. **SSL cert SAN must include robot's IP** — Vision Pro will reject connection if IP not in cert SAN [T1] +5. **`rootCA.pem` must be explicitly trusted on Vision Pro** — installing the profile is not enough, must enable trust in Certificate Trust Settings [T1] +6. **`LD_LIBRARY_PATH` needed for conda OpenSSL** — without it, SSL handshake may fail with version mismatch [T1] +7. 
**Vuer WebSocket session mismatch** — if teleop is started before Vision Pro connects, may need to restart teleop for clean session [T2] +8. **Inspire hand labels are swapped** — right hand at 192.168.123.211 uses DDS label 'l', left uses 'r' [T1] +9. **Vuer v0.0.40 client JS incompatible with visionOS** — zero WebSocket connections. Must use v0.0.60 (with patches). [T1 — Verified 2026-02-18] +10. **`pip install` overwrites patches** — reinstalling vuer resets JS port fix and aiohttp SSL fix. Re-apply ALL patches after every pip install. [T1] +11. **Must launch from `~/xr_teleoperate/teleop/`** — URDF path is `../assets/g1/g1_body29_hand14.urdf` (relative). Launching from parent directory causes FileNotFoundError. [T1] +12. **Safari caches aggressively on visionOS** — after server restart, the VP may resume an old cached page with a dead WebSocket. Hard-reload (`Aa` menu → Reload) required. [T1 — Observed 2026-02-18] +13. **Running mode (R2+A) locks legs during teleop** — by design, `rt/arm_sdk` in Running mode causes lower body joints to lock rigid. Walking only works in Regular mode (R1+X base / R1+Y EDU). [T1 — xr_teleoperate Issue #251] +14. **EDU units use R1+Y, not R1+X** — R1+X enters 1-DOF waist Regular mode (disabled on 29-DOF EDU hardware). R1+Y enters 3-DOF waist Regular mode. 
[T1 — Verified on G1 EDU Ultimate E, 2026-02-18] + +--- + +## Key Relationships +- Hardware: [[joint-configuration]] (arm DOFs, joint limits) +- Control: [[whole-body-control]] (GR00T-WBC upper body integration) +- Control: [[manipulation]] (arm IK, end-effector control) +- Retargeting: [[motion-retargeting]] (human → robot pose mapping) +- Network: [[networking-comms]] (DDS topics, robot IPs, SSL) +- Compute: [[gb10-offboard-compute]] (offboard teleop hosting) +- Safety: [[safety-limits]] (joint velocity limits, mode switching) diff --git a/context/whole-body-control.md b/context/whole-body-control.md index 0ead674..c8ef73d 100644 --- a/context/whole-body-control.md +++ b/context/whole-body-control.md @@ -114,33 +114,58 @@ The most G1-relevant WBC framework. Open-source, designed specifically for Unitr ### Why This Matters for Mocap + Balance GR00T-WBC is the most direct path to the user's goal: the locomotion policy maintains balance (including push recovery if trained with perturbations) while the upper body tracks mocap reference trajectories. The two are coordinated through the WBC layer. -### Deployment on Dell Pro Max GB10 — Verified (2026-02-14) [T1] +### Deployment on Dell Pro Max GB10 — Verified (2026-02-14/15) [T1] -GR00T-WBC has been **successfully deployed and tested** on the Dell Pro Max GB10 (NVIDIA Grace Blackwell, aarch64, Ubuntu 24.04). Key findings: +GR00T-WBC has been **successfully deployed on a real G1 robot** via Dell Pro Max GB10 (NVIDIA Grace Blackwell, aarch64, Ubuntu 24.04). The robot stands and balances autonomously. **Pre-trained ONNX Policies:** - `GR00T-WholeBodyControl-Balance.onnx` — standing balance (15 lower-body joint targets) - `GR00T-WholeBodyControl-Walk.onnx` — locomotion with velocity commands - Both: 516-dim observation → 15-dim action. Pre-trained by NVIDIA (training code not open-sourced). - Training: PPO via RSL-RL in Isaac Lab, domain randomization, zero-shot sim-to-real. 
Exact reward function and perturbation curriculum not published. +- **Training code available separately** via WBC-AGILE (nvidia-isaac/WBC-AGILE) for retraining/fine-tuning + +**ONNX Policy Details:** +- Observation layout (86 dims per history step, 6 steps = 516): + - `[0:3]` = velocity commands × cmd_scale + - `[3:4]` = height command + - `[4:7]` = [roll_cmd, pitch_cmd, yaw_cmd] + - `[7:10]` = angular velocity × 0.5 + - `[10:13]` = gravity orientation (from IMU quaternion) + - `[13:42]` = (joint_pos - defaults) × 1.0 (29 DOF) + - `[42:71]` = joint_vel × 0.05 (29 DOF) + - `[71:86]` = previous action (15) +- Action transform: `cmd_q = action * 0.25 + default_angles` +- Action bounds: **No clipping** — policy outputs can exceed [-1,1], this is intentional for push recovery. Do NOT add np.clip(). NVIDIA reference code does not clip. +- Policy selection: Balance when `np.linalg.norm(cmd) < 0.05`, Walk otherwise **Performance on GB10:** - ~3.5 ms per control loop iteration at 50 Hz (sync mode) — only 17.5% of time budget - 401% CPU usage (4 cores) — MuJoCo physics dominates - Both Balance and Walk policies load and execute successfully -- Robot walks, turns, and strafes in simulation via keyboard velocity commands -**Critical Fixes Required for GB10 (aarch64):** +**Critical Fixes Required for Real Robot Deployment:** +1. **IMU pitch offset calibration** — The G1's pelvis IMU has a physical mounting offset (~6°) that sim doesn't model. Must rotate quaternion before gravity computation. See [[sensors-perception]] §4. Without this fix, robot leans backward persistently. +2. **Negative KD bug** — `configs.py` has `MOTOR_KD[14] -= 10` which makes waist_pitch KD negative (-5). Comment out this line. +3. **Dynamic mode_machine detection** — GR00T-WBC hardcodes `mode_machine=5`. Apply PR #11 to read from `rt/lowstate` instead. +4. **GLFW crash on headless** — `simulator_factory.py` eagerly imports mujoco. Make BaseSimulator import lazy + run Xvfb :99. +5. 
**CYCLONEDDS_URI** — Must set explicit network interface: `address="192.168.123.100"` for GB10. +6. **Do NOT clip actions** — ONNX policy outputs intentionally exceed [-1,1]. Clipping causes policy saturation at clip boundaries with no room for balance corrections. + +**Critical Fixes Required for GB10 Simulation (aarch64):** 1. **CycloneDDS buffer overflow:** The `` XML section in `unitree_sdk2py/core/channel_config.py` triggers a glibc FORTIFY_SOURCE buffer overflow on aarch64. Fix: remove the `` section entirely. (See [[dev-environment]] for patch details.) 2. **ROS2 Python path:** venv needs `.pth` file pointing to `/opt/ros/jazzy/lib/python3.12/site-packages/` 3. **ROS2 shared libraries:** `export LD_LIBRARY_PATH=/opt/ros/jazzy/lib:$LD_LIBRARY_PATH` 4. **Sync mode bug:** `run_g1_control_loop.py` checks for sim thread in sync mode where none exists. Patch: add `not config.sim_sync_mode` guard. -**Keyboard Control:** -- `sshkeyboard` library fails in remote terminals (SSH, NoMachine). Workaround: use `--keyboard-dispatcher-type ros` and publish to `/keyboard_input` ROS topic from a separate process. 
-- Keys: `]`=enable Walk, `w/s`=fwd/back, `a/d`=strafe, `q/e`=rotate, `z`=stop, `backspace`=reset +**Keyboard Control (Real Robot via tmux):** +- Keys sent via `tmux send-keys -t groot "key"` from remote machine +- `]`=activate policy, `o`=deactivate policy +- `w/s`=fwd/back, `a/d`=strafe, `q/e`=rotate, `z`=stop +- `1/2`=height up/down, `5/6`=pitch, `3/4`=roll, `7/8`=yaw +- `9/0`=IMU pitch offset ±1° (custom addition for calibration) -**Visualization:** +**Visualization (Simulation):** - GLFW passive viewer freezes on virtual/remote displays (Xvfb, NoMachine) after a few seconds - VNC (x11vnc) cannot capture OpenGL framebuffer updates - Working solution: NoMachine virtual desktop (NX protocol) — viewer works initially but GLFW stalls diff --git a/reference/glossary.yaml b/reference/glossary.yaml index 23ba97a..d7a0351 100644 --- a/reference/glossary.yaml +++ b/reference/glossary.yaml @@ -817,3 +817,91 @@ terms: typical_range: "4096+ parallel environments on single GPU" related_terms: ["ppo", "rsl_rl", "groot_wbc", "sim_to_real"] related_topics: ["simulation", "learning-and-ai", "whole-body-control"] + + - term: "imu_pitch_offset" + full_name: "IMU Pitch Mounting Offset" + definition: | + Physical mounting angle offset of the G1's pelvis IMU relative to the + body frame assumed in simulation. Approximately 6 degrees on tested units. + The stock Unitree controller compensates internally, but third-party + controllers (e.g., GR00T-WBC) that read raw IMU via DDS need explicit + calibration. Without correction, causes persistent backward lean. + Fix: apply quaternion pitch rotation before gravity computation. + unit: "degrees (or radians)" + typical_range: "~-6° (forward correction)" + related_terms: ["imu", "sim_to_real", "groot_wbc"] + related_topics: ["sensors-perception", "deployment-operations", "simulation"] + + - term: "wbc_agile" + full_name: "WBC-AGILE (NVIDIA)" + definition: | + NVIDIA's open-source training framework for whole-body control policies. 
+ Repository: nvidia-isaac/WBC-AGILE. Provides Isaac Lab + RSL-RL training + pipeline with G1 support. Exports ONNX policies compatible with GR00T-WBC. + The GR00T-WBC repo is inference-only; WBC-AGILE is the training counterpart. + unit: null + typical_range: null + related_terms: ["groot_wbc", "isaac_lab", "ppo", "rsl_rl"] + related_topics: ["learning-and-ai", "whole-body-control"] + + - term: "configuration_flipping" + full_name: "IK Configuration Flipping" + definition: | + Phenomenon where a nonlinear IK solver (e.g., CasADi/IPOPT) suddenly jumps + from one valid joint configuration to another (e.g., elbow-up to elbow-down). + Occurs near kinematic singularities where the solver's warm-start fails to + constrain the solution to the current configuration. In teleoperation, manifests + as a sudden arm position shift. Recovery: restart the IK solver session. + unit: null + typical_range: null + related_terms: ["inverse_kinematics", "xr_teleoperate"] + related_topics: ["teleoperation", "manipulation"] + + - term: "vuer" + full_name: "Vuer (WebXR Server Library)" + definition: | + Python library providing a WebXR server with HTTPS + WebSocket. Used by + xr_teleoperate's TeleVuer component to serve the 3D scene to Vision Pro + via Safari. v0.0.60 works with visionOS (with patches); v0.0.40 does not. + Bundled client JS uses aiohttp for the WebSocket backend. + unit: null + typical_range: "v0.0.60 (required for visionOS)" + related_terms: ["televuer", "xr_teleoperate"] + related_topics: ["teleoperation"] + + - term: "televuer" + full_name: "TeleVuer (Teleoperation Vuer Wrapper)" + definition: | + Component of xr_teleoperate that wraps the Vuer WebXR server with + teleoperation-specific handlers: wrist pose extraction, finger joint + tracking, head pose, and hand tracking data. Runs in a separate + multiprocessing.Process from the main teleop loop. 
+ unit: null + typical_range: null + related_terms: ["vuer", "xr_teleoperate"] + related_topics: ["teleoperation"] + + - term: "mode_machine" + full_name: "G1 Hardware Revision Identifier" + definition: | + Integer field in rt/lowstate identifying the G1 hardware revision. + Determines gear ratios used for interpreting joint commands. + Values: 2 (old), 5 (rev 1.0, common), 11-12 (newer, upgraded hip pitch), + 13-16 (5010 wrist variants). Must be read dynamically from the robot, + not hardcoded. GR00T-WBC PR #11 adds this capability. + unit: null + typical_range: "5 (g1_29dof_rev_1_0)" + related_terms: ["unitree_sdk2", "dds"] + related_topics: ["sdk-programming", "deployment-operations"] + + - term: "motion_switcher_client" + full_name: "MotionSwitcherClient" + definition: | + SDK utility that controls the stock locomotion controller (ai_sport). + ReleaseMode() disables the stock controller, giving full low-level + joint control via rt/lowcmd. Used by GR00T-WBC at startup. + Equivalent to entering debug mode (L2+R2) programmatically. + unit: null + typical_range: null + related_terms: ["unitree_sdk2", "groot_wbc"] + related_topics: ["sdk-programming", "deployment-operations", "locomotion-control"] diff --git a/scripts/diagnose_r3_buttons.py b/scripts/diagnose_r3_buttons.py new file mode 100644 index 0000000..e0ace5f --- /dev/null +++ b/scripts/diagnose_r3_buttons.py @@ -0,0 +1,112 @@ +"""Diagnose R3 controller button bitmask positions via wireless_remote in rt/lowstate. + +Tries different network interfaces and LowState types to find where +wireless_remote data flows. 
+""" +import paramiko, sys, time +sys.stdout.reconfigure(encoding='utf-8', errors='replace') + +ssh = paramiko.SSHClient() +ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy()) +ssh.connect('10.0.0.64', username='unitree', password='123', + timeout=15, look_for_keys=False, allow_agent=False) + +# Deploy diagnostic script +script = r''' +import sys, time, struct +sys.path.insert(0, "/home/unitree/miniforge3/envs/tv/lib/python3.11/site-packages") +from unitree_sdk2py.core.channel import ChannelSubscriber, ChannelFactoryInitialize +from unitree_sdk2py.idl.unitree_hg.msg.dds_ import LowState_ as hg_LowState + +# Try different interfaces +import subprocess +result = subprocess.run(['ip', 'link', 'show'], capture_output=True, text=True) +print("=== Network interfaces ===") +for line in result.stdout.split('\n'): + if ': ' in line and 'state' in line: + print(f" {line.strip()}") + +# Use eth0 which connects to the internal bus +iface = "eth0" +print(f"\nUsing interface: {iface}") + +ChannelFactoryInitialize(0, iface) + +count = [0] +last_wr = [None] + +def handler(msg): + count[0] += 1 + # Check if wireless_remote exists and has data + try: + wr = msg.wireless_remote + data = bytes(wr) + + # Only print if changed or first time or periodic + if data != last_wr[0] or count[0] % 200 == 1: + btn1 = data[2] if len(data) > 2 else 0 + btn2 = data[3] if len(data) > 3 else 0 + + btn1_names = [] + if btn1 & 0x01: btn1_names.append("R1") + if btn1 & 0x02: btn1_names.append("L1") + if btn1 & 0x04: btn1_names.append("Start") + if btn1 & 0x08: btn1_names.append("Select") + if btn1 & 0x10: btn1_names.append("R2") + if btn1 & 0x20: btn1_names.append("L2") + if btn1 & 0x40: btn1_names.append("F1") + if btn1 & 0x80: btn1_names.append("F3") + + btn2_names = [] + if btn2 & 0x01: btn2_names.append("A") + if btn2 & 0x02: btn2_names.append("B") + if btn2 & 0x04: btn2_names.append("X") + if btn2 & 0x08: btn2_names.append("Y") + if btn2 & 0x10: btn2_names.append("Up") + if btn2 & 0x20: 
btn2_names.append("Right") + if btn2 & 0x40: btn2_names.append("Down") + if btn2 & 0x80: btn2_names.append("Left") + + any_nonzero = any(b != 0 for b in data[:4]) + ts = time.strftime('%H:%M:%S') + print(f"[{ts}] #{count[0]:4d} len={len(data)} " + f"btn1=0x{btn1:02x}[{','.join(btn1_names) or '-'}] " + f"btn2=0x{btn2:02x}[{','.join(btn2_names) or '-'}] " + f"raw={data[:8].hex()} " + f"{'*** BUTTON ***' if any_nonzero else ''}", flush=True) + last_wr[0] = data + except Exception as e: + if count[0] <= 3: + print(f"Error reading wireless_remote: {e}", flush=True) + +sub = ChannelSubscriber("rt/lowstate", hg_LowState) +sub.Init(handler, 10) + +print("\n=== R3 Button Diagnostic (hg_LowState on eth0) ===", flush=True) +print("Press buttons on R3 controller. Running 15 seconds...\n", flush=True) + +time.sleep(15) +print(f"\nTotal messages received: {count[0]}", flush=True) +''' + +print("Deploying R3 button diagnostic to robot...") +sftp = ssh.open_sftp() +with sftp.file('/tmp/r3_diag.py', 'w') as f: + f.write(script) +sftp.close() +print("Deployed.") + +print("\n=== Running diagnostic (15 seconds) ===") +print("Press each R3 button now!\n") + +_, o, e = ssh.exec_command( + '/home/unitree/miniforge3/envs/tv/bin/python /tmp/r3_diag.py', + timeout=30) + +output = o.read().decode('utf-8', errors='replace') +print(output) +err = e.read().decode('utf-8', errors='replace') +if err: + print(f"Stderr: {err}") + +ssh.close() diff --git a/scripts/launch_teleop.py b/scripts/launch_teleop.py new file mode 100644 index 0000000..31456d1 --- /dev/null +++ b/scripts/launch_teleop.py @@ -0,0 +1,76 @@ +"""Launch teleop server from correct directory + ensure cert server running.""" +import paramiko, sys, time +sys.stdout.reconfigure(encoding='utf-8', errors='replace') +ssh = paramiko.SSHClient() +ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy()) +ssh.connect('10.0.0.64', username='unitree', password='123', + timeout=15, look_for_keys=False, allow_agent=False) + +# Kill any existing 
teleop screen +ssh.exec_command('screen -S teleop -X quit 2>/dev/null; sleep 1', timeout=5) +time.sleep(2) + +# Ensure cert server is running on 9090 +_, o, _ = ssh.exec_command('ss -tlnp | grep 9090', timeout=5) +if not o.read().decode().strip(): + print('Starting cert server on port 9090...') + ssh.exec_command( + 'cd ~/.config/xr_teleoperate && nohup python3 -m http.server 9090 --bind 0.0.0.0 > /dev/null 2>&1 &', + timeout=5) + time.sleep(2) +else: + print('Cert server already running on 9090') + +# Launch teleop from ~/xr_teleoperate/teleop (correct dir for URDF paths) +cmd = ( + 'screen -dmS teleop bash -c "' + 'source ~/miniforge3/etc/profile.d/conda.sh && conda activate tv && ' + 'cd ~/xr_teleoperate/teleop && ' + 'python3 teleop_hand_and_arm.py --arm=G1_29 --input-mode hand --motion --display-mode pass-through ' + '2>&1 | tee /tmp/teleop_v60.log' + '"' +) +print('Starting teleop server...') +ssh.exec_command(cmd, timeout=10) +time.sleep(1) + +# Wait for server to start +for i in range(10): + time.sleep(2) + _, o, _ = ssh.exec_command('ss -tlnp | grep 8012', timeout=5) + port = o.read().decode().strip() + if port: + print(f'Server listening on port 8012! ({(i+1)*2}s)') + break + # Check if it crashed + _, o, _ = ssh.exec_command('tail -3 /tmp/teleop_v60.log', timeout=5) + log = o.read().decode('utf-8', errors='replace').strip() + if 'exiting program' in log.lower() or 'error' in log.lower(): + print(f'Server crashed! Log:') + _, o, _ = ssh.exec_command('tail -20 /tmp/teleop_v60.log', timeout=5) + print(o.read().decode('utf-8', errors='replace')) + ssh.close() + sys.exit(1) + print(f' waiting... ({(i+1)*2}s)') +else: + print('Server did not start in 20s. 
Log:') + _, o, _ = ssh.exec_command('tail -20 /tmp/teleop_v60.log', timeout=5) + print(o.read().decode('utf-8', errors='replace')) + ssh.close() + sys.exit(1) + +# Show final state +_, o, _ = ssh.exec_command('tail -5 /tmp/teleop_v60.log', timeout=5) +print('\nLog (last 5 lines):') +print(o.read().decode('utf-8', errors='replace')) + +# Verify cert server accessible +_, o, _ = ssh.exec_command('curl -s http://localhost:9090/rootCA.pem | head -1', timeout=5) +cert = o.read().decode().strip() +print(f'\nCert server: {"OK" if "BEGIN" in cert else "NOT WORKING"}') + +print('\n=== READY ===') +print('Cert install: http://10.0.0.64:9090/rootCA.pem') +print('Teleop page: https://10.0.0.64:8012') + +ssh.close() diff --git a/scripts/pull_on_robot.py b/scripts/pull_on_robot.py new file mode 100644 index 0000000..0973bc9 --- /dev/null +++ b/scripts/pull_on_robot.py @@ -0,0 +1,47 @@ +"""Pull xr_teleoperate changes from Gitea on the robot.""" +import paramiko, sys +sys.stdout.reconfigure(encoding='utf-8', errors='replace') +ssh = paramiko.SSHClient() +ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy()) +ssh.connect('10.0.0.64', username='unitree', password='123', + timeout=15, look_for_keys=False, allow_agent=False) + +# Check current remotes - origin already points to gitea +print('=== Current remotes ===') +_, o, _ = ssh.exec_command('cd ~/xr_teleoperate && git remote -v', timeout=10) +print(o.read().decode().strip()) + +# Check for local changes +print('\n=== Git status ===') +_, o, _ = ssh.exec_command('cd ~/xr_teleoperate && git status --short', timeout=10) +status = o.read().decode().strip() +print(status if status else '(clean)') + +# Pull from origin (which is already gitea) +print('\n=== Pulling from origin main ===') +_, o, e = ssh.exec_command('cd ~/xr_teleoperate && git pull origin main', timeout=30) +pull_out = o.read().decode().strip() +pull_err = e.read().decode().strip() +print(pull_out) +if pull_err: + print(pull_err) + +# Verify the changes are present 
+print('\n=== Verifying compute_fk exists ===') +_, o, _ = ssh.exec_command('grep -n "def compute_fk" ~/xr_teleoperate/teleop/robot_control/robot_arm_ik.py', timeout=5) +print(o.read().decode().strip() or 'NOT FOUND') + +print('\n=== Verifying --ki arg exists ===') +_, o, _ = ssh.exec_command('grep -n "\\-\\-ki" ~/xr_teleoperate/teleop/teleop_hand_and_arm.py', timeout=5) +print(o.read().decode().strip() or 'NOT FOUND') + +print('\n=== Verifying I-term logic exists ===') +_, o, _ = ssh.exec_command('grep -n "I-term" ~/xr_teleoperate/teleop/teleop_hand_and_arm.py', timeout=5) +print(o.read().decode().strip() or 'NOT FOUND') + +print('\n=== Latest commit ===') +_, o, _ = ssh.exec_command('cd ~/xr_teleoperate && git log --oneline -3', timeout=5) +print(o.read().decode().strip()) + +ssh.close() +print('\nDone!')
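Note on the scripts above: `diagnose_r3_buttons.py`, `launch_teleop.py`, and `pull_on_robot.py` each repeat the same `exec_command` / `read()` / `decode()` sequence for every remote command. A minimal sketch of a helper that factors this out follows. The names `run_remote` and `RemoteResult` are hypothetical and do not appear in this diff. The sketch assumes only that the client behaves like `paramiko.SSHClient`, that is, `exec_command(command, timeout=...)` returns `(stdin, stdout, stderr)` and `stdout.channel.recv_exit_status()` is available.

```python
from typing import NamedTuple


class RemoteResult(NamedTuple):
    """Hypothetical container for the outcome of one remote command."""
    exit_code: int
    out: str
    err: str


def run_remote(ssh, command: str, timeout: float = 10.0) -> RemoteResult:
    """Run `command` on an already-connected paramiko.SSHClient-like object.

    `ssh` only needs an exec_command(command, timeout=...) method that
    returns (stdin, stdout, stderr), as paramiko.SSHClient does.
    """
    _, stdout, stderr = ssh.exec_command(command, timeout=timeout)
    # Drain both streams first, then ask the channel for the exit status.
    # Decoding with errors='replace' mirrors the scripts in this diff and
    # avoids a UnicodeDecodeError on unexpected bytes in remote logs.
    out = stdout.read().decode('utf-8', errors='replace').strip()
    err = stderr.read().decode('utf-8', errors='replace').strip()
    exit_code = stdout.channel.recv_exit_status()
    return RemoteResult(exit_code, out, err)
```

With such a helper, the status check in `pull_on_robot.py` could read `res = run_remote(ssh, 'cd ~/xr_teleoperate && git status --short')` followed by `print(res.out or '(clean)')`, and a non-zero `res.exit_code` could be used to detect a failed `git pull` instead of inspecting stderr text. Like the original scripts, this reads stdout to completion before stderr, so it is suited to short commands with modest output.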