| id | title | status | source_sections | related_topics | key_equations | key_terms | images | examples | open_questions |
|---|---|---|---|---|---|---|---|---|---|
| g1-integration | Unitree G1 Robot Integration | proposed | Cross-referenced from git/unitree-g1 context system + Dell Pro Max GB10 context | [connectivity ai-workloads ai-frameworks multi-unit-stacking] | [] | [unitree-g1 jetson-orin-nx offboard-compute cyclonedds dds teleoperation isaac-lab lerobot] | [] | [] | [DDS latency over Wi-Fi between GB10 and G1 under realistic conditions; Optimal LLM size for real-time task planning (latency vs. capability tradeoff); Isaac Lab on GB10 vs. dedicated x86 workstation for RL training throughput; Docker container availability for CycloneDDS 0.10.2 on ARM64 DGX OS; Power-efficient always-on inference mode for persistent LLM service] |
Unitree G1 Robot Integration
Architecture and use cases for pairing the Dell Pro Max GB10 as an offboard AI companion to the Unitree G1 humanoid robot.
1. Why Integrate?
The G1's development computer (Jetson Orin NX 16GB, 100 TOPS) is excellent for real-time control but severely limited for large-scale AI. The GB10 removes that ceiling.
| Capability | G1 Orin NX | GB10 | Factor |
|---|---|---|---|
| AI compute | 100 TOPS | 1,000 TFLOPS (FP4) | ~10x (nominal; different precisions) |
| Memory for models | 16 GB | 128 GB unified | 8x |
| Max LLM (quantized) | ~7B | ~200B | ~30x |
| CUDA architecture | sm_87 (Orin) | sm_121 (Blackwell) | 2 gens newer |
| Storage | 32 GB eMMC (loco) | 2-4 TB NVMe | 60-125x |
The GB10 acts as an offboard AI brain — handling reasoning, planning, training, and heavy inference that the Orin NX cannot.
2. Connection Architecture
┌──────────────────────────┐ ┌───────────────────────────┐
│ Unitree G1 │ │ Dell Pro Max GB10 │
│ │ │ │
│ ┌─────────────────┐ │ │ 128 GB unified memory │
│ │ Locomotion CPU │ │ │ 1 PFLOP FP4 │
│ │ RK3588 │ │ │ DGX OS (Ubuntu 24.04) │
│ │ 192.168.123.161 │ │ │ │
│ │ 500 Hz control │ │ │ ┌──────────────────┐ │
│ └──────┬───────────┘ │ │ │ LLM Server │ │
│ │ CycloneDDS │ │ │ (llama.cpp/ │ │
│ ┌──────┴───────────┐ │ Wi-Fi 6/7 │ │ TensorRT-LLM/ │ │
│ │ Development CPU │◄──┼──────────────┼──►│ Ollama) │ │
│ │ Jetson Orin NX │ │ or 10GbE │ └──────────────────┘ │
│ │ 192.168.123.164 │ │ │ │
│ │ 100 TOPS │ │ │ ┌──────────────────┐ │
│ └──────────────────┘ │ │ │ Training Server │ │
│ │ │ │ (Isaac Lab / │ │
│ Sensors: │ │ │ PyTorch / │ │
│ - D435i (RGB-D) │ │ │ LeRobot) │ │
│ - MID360 (LiDAR) │ │ └──────────────────┘ │
│ - 6-axis IMU │ │ │
│ - 4-mic array │ │ ┌──────────────────┐ │
│ - Joint encoders (500Hz)│ │ │ VLM Server │ │
│ │ │ │ (vision-language) │ │
│                          │    │  └──────────────────┘    │
└──────────────────────────┘ └───────────────────────────┘
Connection Options
| Method | Bandwidth | Latency | Best For |
|---|---|---|---|
| Wi-Fi (G1 Wi-Fi 6 ↔ GB10 Wi-Fi 7) | ~1 Gbps theoretical | 5-50 ms variable | Untethered operation, API calls |
| 10GbE Ethernet (GB10 RJ45) | 10 Gbps | <1 ms | Lab/tethered, sensor streaming |
| QSFP (via switch) | 200 Gbps | <1 ms | Multi-robot + GB10 cluster (T4) |
Recommended: Wi-Fi for mobile operation; 10GbE for lab/training workflows where the robot is stationary or tethered.
Network Configuration
The G1 uses a fixed 192.168.123.0/24 subnet. The GB10 must join this subnet or use a router/bridge.
Option A — Direct subnet join (simplest):
# On GB10, configure 10GbE or Wi-Fi interface
sudo nmcli con add type ethernet ifname eth0 con-name g1-net \
ip4 192.168.123.100/24
Option B — Dual-network (GB10 keeps internet access):
- GB10's 10GbE on 192.168.123.0/24 (to G1)
- GB10's Wi-Fi on home network (internet access for model downloads, updates)
Option C — G1 joins GB10's network:
- Configure G1's Wi-Fi to connect to the same network as GB10
- G1's internal subnet (192.168.123.0/24) remains separate for internal DDS
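For Option B, a plausible nmcli setup looks like the following. This is a sketch: the interface name eth0 and the connection name g1-net are assumptions; check `nmcli device` for the actual names, and substitute your own SSID and password.

```shell
# 10GbE link onto the G1's fixed subnet; never-default keeps the
# default route (and therefore internet traffic) off this interface
sudo nmcli con add type ethernet ifname eth0 con-name g1-net \
    ip4 192.168.123.100/24
sudo nmcli con modify g1-net ipv4.never-default yes

# Wi-Fi to the home network for model downloads and updates
sudo nmcli device wifi connect "<home-ssid>" password "<password>"
```

With this split, DDS and sensor traffic stay on the wired 192.168.123.0/24 link while package and model downloads go out over Wi-Fi.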
3. Use Cases
3a. LLM-Based Task Planning
The G1's learning-and-ai.md lists "LLM-based task planning integration (firmware v3.2+)" as preliminary. The GB10 can serve as the LLM backend.
Architecture:
User (voice/text) → G1 mic array → STT on Orin NX
→ HTTP/gRPC to GB10 → LLM (Llama 70B+) generates plan
→ Plan sent back to G1 → G1 executes via locomotion policy + manipulation
GB10 serves an OpenAI-compatible API via llama.cpp or Ollama:
# On GB10: start LLM server
./llama-server --model ~/models/llama-70b-q4.gguf \
--host 0.0.0.0 --port 30000 --n-gpu-layers 99
# On G1 Orin NX: call the API
curl http://192.168.123.100:30000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"llama","messages":[{"role":"user","content":"Walk to the kitchen table and pick up the red cup"}]}'
Recommended models:
| Model | Size | Speed (est.) | Task Planning Quality |
|---|---|---|---|
| Llama 3.2 3B | 3B | ~100 tok/s | Basic commands |
| Nemotron-3-Nano 30B | 3B active | Fast | Good (built-in reasoning) |
| Llama 3.1 70B (Q4) | 70B | ~15-20 tok/s | Strong |
| Llama 3.1 70B (Q8) | 70B | ~10-15 tok/s | Strong, higher quality |
For task planning, 1-5 second response time is acceptable — the robot doesn't need instant responses for high-level commands.
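The request/response flow above can be sketched as a small stdlib-only client for the Orin NX side. This is a minimal sketch, assuming the GB10 serves an OpenAI-compatible endpoint at 192.168.123.100:30000 as configured earlier; the URL, model name, and helper names are illustrative, not part of any Unitree API.

```python
import json
import urllib.request

GB10_URL = "http://192.168.123.100:30000/v1/chat/completions"  # assumed address

def build_request(command: str) -> dict:
    """Wrap a natural-language command in an OpenAI-style chat payload."""
    return {"model": "llama",
            "messages": [{"role": "user", "content": command}]}

def extract_plan(response: dict) -> str:
    """Pull the generated plan text out of an OpenAI-style response."""
    return response["choices"][0]["message"]["content"]

def plan_task(command: str, url: str = GB10_URL, timeout: float = 30.0) -> str:
    """Send one command to the GB10's LLM server and return the plan text."""
    req = urllib.request.Request(
        url,
        data=json.dumps(build_request(command)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return extract_plan(json.load(resp))
```

On the G1 this would run on the Orin NX, downstream of STT; the 30-second timeout reflects the relaxed latency budget for high-level planning.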
3b. Vision-Language Models
Stream camera frames from the G1's Intel RealSense D435i to the GB10 for analysis by large VLMs that can't run on the Orin NX.
Use cases:
- Scene understanding ("describe what you see")
- Object identification ("find the red cup on the table")
- Spatial reasoning ("is the path ahead clear?")
- Anomaly detection ("does anything look unusual?")
Data path:
G1 D435i (1920x1080 @ 30fps) → RealSense SDK on Orin NX
→ JPEG compress → HTTP POST to GB10
→ VLM inference → JSON response back to G1
Bandwidth: A 1080p JPEG frame is ~100-500 KB. At 5 fps sampling: ~2.5 MB/s (~20 Mbps). Well within Wi-Fi capacity.
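The bandwidth figure above is simple arithmetic; a small helper (illustrative only) makes the budget explicit and easy to re-run for other frame sizes or sampling rates.

```python
def stream_bandwidth_mbps(frame_kb: float, fps: float) -> float:
    """Sustained bandwidth in megabits/s for streaming JPEG frames."""
    return frame_kb * 1024 * 8 * fps / 1e6

# Worst case from the text: 500 KB frames sampled at 5 fps
print(round(stream_bandwidth_mbps(500, 5), 2))  # 20.48 (Mbps)
```

Even the worst case stays around 20 Mbps, far below Wi-Fi 6/7 throughput, which is why per-frame sampling at a few fps is the sensible data path rather than streaming the full 30 fps feed.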
3c. RL Policy Training
Train locomotion and manipulation policies on the GB10 using Isaac Lab or MuJoCo, then deploy to the G1.
Why GB10 over the Orin NX:
- Isaac Lab runs thousands of parallel environments — needs GPU memory
- GB10's 128 GB unified memory holds large batches + simulation state
- Blackwell GPU (6,144 CUDA cores, sm_121) accelerates physics simulation
- CUDA 13.0 + PyTorch container: nvcr.io/nvidia/pytorch:25.11-py3
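A plausible way to launch that container for training work (the flags are standard Docker/NVIDIA runtime options; the workspace path is a placeholder):

```shell
# Pull the CUDA 13.0 PyTorch image and run it with GPU access and
# host networking, mounting a local workspace for checkpoints
docker pull nvcr.io/nvidia/pytorch:25.11-py3
docker run --rm -it --gpus all --network host \
  -v "$HOME/isaac-workspace:/workspace" \
  nvcr.io/nvidia/pytorch:25.11-py3
```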
Workflow:
1. GB10: Train policy in Isaac Lab (unitree_rl_lab, G1-29dof config)
2. GB10: Validate in MuJoCo (sim2sim cross-validation)
3. GB10 → G1: Transfer trained model file (.pt / .onnx)
4. G1 Orin NX: Deploy via unitree_sdk2_python (low-gain start)
5. G1: Gradual gain ramp-up with tethered safety testing
Key: Both systems are ARM64. Model files trained on GB10 (ARM) deploy directly to Orin NX (ARM) without architecture-related issues.
3d. Imitation Learning Data Processing
Process teleoperation demonstration data collected on the G1 using larger models on the GB10.
Workflow:
1. G1: Collect demonstrations via XR teleoperation (Vision Pro / Quest 3)
2. Transfer data to GB10 (SCP over network)
3. GB10: Train behavior cloning / diffusion policy (LeRobot, 128GB memory)
4. GB10: Validate in simulation
5. Transfer model to G1 for deployment
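Steps 2 and 5 above are plain file transfers over the shared subnet. A hedged sketch (the `unitree` user, host address, and paths are placeholders; adjust to your G1's actual login and directory layout):

```shell
# Step 2: pull demonstration episodes from the G1's dev computer to the GB10
rsync -avz unitree@192.168.123.164:~/demos/ ~/lerobot-data/g1-demos/

# Step 5: push the trained policy back for deployment on the Orin NX
scp ~/lerobot-runs/latest/policy.pt unitree@192.168.123.164:~/models/
```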
The GB10's 128 GB memory enables training larger LeRobot policies and processing more demonstration episodes than the Orin NX's 16 GB allows.
3e. Multi-Modal Interaction
Combine the G1's sensor suite with GB10's compute for rich interaction:
G1 Microphones (4-mic array) → Speech-to-Text (Orin NX: Whisper small)
→ Text to GB10 → LLM reasoning → Response text
→ TTS on Orin NX → G1 Speaker (5W stereo)
G1 Camera (D435i) → Frames to GB10 → VLM → Situational awareness
G1 LiDAR (MID360) → Point cloud to GB10 → Semantic mapping
3f. Simulation Environment
Run full physics simulation of the G1 on the GB10:
- MuJoCo: CPU-based simulation for policy validation
- Isaac Lab: GPU-accelerated parallel environments for training
- Isaac Gym: Legacy GPU simulation (unitree_rl_gym)
The GB10 can run these natively (ARM64 NVIDIA stack) while the G1 operates in the real world, enabling sim2real transfer between GB10 simulation and the real G1 hardware.
4. Latency Budget
| Operation | Latency | Acceptable? |
|---|---|---|
| G1 locomotion control loop | 2 ms (500 Hz) | N/A — stays on-robot |
| DDS internal (loco ↔ dev) | 2 ms | N/A — stays on-robot |
| Wi-Fi round-trip (G1 ↔ GB10) | 5-50 ms | Yes for planning, No for control |
| 10GbE round-trip (G1 ↔ GB10) | <1 ms | Yes for most tasks |
| LLM inference (70B Q4) | 1-5 seconds | Yes for task planning |
| VLM inference (per frame) | 0.5-2 seconds | Yes for scene understanding |
Critical rule: Real-time control (500 Hz locomotion, balance, joint commands) MUST stay on the G1's locomotion computer. The GB10 handles only high-level reasoning with relaxed latency requirements.
5. Communication Protocol Options
Option A: REST/gRPC API (Simplest)
GB10 runs an HTTP API server (llama.cpp, Ollama, or custom FastAPI). G1's Orin NX sends requests and receives responses.
- Pros: Simple, stateless, easy to debug, OpenAI-compatible
- Cons: Higher latency per request, no streaming sensor data
- Best for: Task planning, VLM queries, one-shot inference
Option B: CycloneDDS (Native G1 protocol)
Install CycloneDDS 0.10.2 on the GB10, making it appear as another node on the G1's DDS network.
- Pros: Native integration, low latency, pub/sub for streaming data
- Cons: More complex setup, must match exact DDS version (0.10.2)
- Best for: Streaming sensor data, continuous perception, tight integration
# On GB10: Install CycloneDDS (must be 0.10.2 to match G1)
pip install cyclonedds==0.10.2
# Subscribe to G1 state
# (requires unitree_sdk2 IDL definitions)
Option C: ROS 2 Bridge
Both systems support ROS 2 (G1 via unitree_ros2, GB10 via standard Ubuntu packages).
- Pros: Rich ecosystem, visualization (RViz), bag recording
- Cons: Additional overhead, ROS 2 setup complexity
- Best for: Research workflows, multi-sensor fusion, SLAM
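As a sketch of what Option C enables on the GB10 side (the ROS 2 distro is an assumption based on Ubuntu 24.04, and the topic name is taken from unitree_ros2 conventions; verify both against your installation):

```shell
# Assumes ROS 2 (e.g. Jazzy on Ubuntu 24.04) and the unitree_ros2
# workspace are installed and sourced
source /opt/ros/jazzy/setup.bash
ros2 topic list                    # G1 topics should appear once DDS discovery works
ros2 topic echo /lowstate --once   # sample robot state (topic name per unitree_ros2)
ros2 bag record -a -o g1_session   # record all streams for offline analysis
```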
6. What NOT to Use the GB10 For
- Real-time joint control — Too much latency over network. Locomotion stays on RK3588.
- Balance/stability — 500 Hz control loop cannot tolerate network jitter.
- Safety-critical decisions — Emergency stop must be on-robot, not dependent on network.
- Field deployment brain — GB10 requires wall power (280W). It's a lab/indoor companion only.
7. Hardware Compatibility Notes
| Aspect | G1 | GB10 | Compatible? |
|---|---|---|---|
| CPU architecture | ARM (A76/A55, Orin ARM) | ARM (Cortex-X925/A725) | Yes |
| CUDA | sm_87 (Orin) | sm_121 (Blackwell) | Model files portable |
| Linux | Custom (kernel 5.10 RT) | Ubuntu 24.04 (kernel 6.8) | Yes (userspace) |
| Python | 3.x on Orin NX | 3.x on DGX OS | Yes |
| DDS | CycloneDDS 0.10.2 | Not pre-installed (installable) | Yes (with setup) |
| Docker | Available on Orin NX | Pre-installed + NVIDIA Runtime | Yes |
| Network | Wi-Fi 6, Ethernet | Wi-Fi 7, 10GbE, QSFP | Yes |
ARM alignment is a significant advantage. Both systems are ARM-native, eliminating cross-architecture issues with model files, shared libraries, and container images.
8. Cost Context
| Component | Price | Role |
|---|---|---|
| G1 EDU Standard | $43,500 | Robot platform |
| G1 EDU Ultimate B | $73,900 | Robot + full DOF + tactile |
| Dell Pro Max GB10 (2TB) | $3,699 | Offboard AI brain |
| Dell Pro Max GB10 (4TB) | $3,999 | Offboard AI brain |
The GB10 adds <10% to the cost of a G1 EDU while multiplying AI capability by orders of magnitude. Even at the $13,500 base G1 price, the GB10 is a ~27% add-on that transforms the AI capabilities.
9. Quick Start (Proposed)
- Network: Connect GB10 to G1's subnet (Wi-Fi or Ethernet to 192.168.123.100)
- LLM server: Start llama.cpp or Ollama on GB10 (port 30000 or 12000)
- Test connectivity: curl http://192.168.123.100:30000/v1/models from the G1 Orin NX
- First integration: Simple Python script on Orin NX that sends voice-to-text to GB10 LLM, receives plan, prints it
- Iterate: Add VLM, sensor streaming, RL training as needed
Key Relationships
- GB10 compute: gb10-superchip, ai-frameworks, ai-workloads
- GB10 networking: connectivity
- G1 reference: git/unitree-g1/context/ (separate knowledge base)