| id | title | status | source_sections | related_topics | key_equations | key_terms | images | examples | open_questions |
|---|---|---|---|---|---|---|---|---|---|
| multi-unit-stacking | Multi-Unit Stacking | provisional | Web research: WCCFTech, NVIDIA newsroom | [connectivity gb10-superchip ai-workloads memory-and-storage] | [] | [connectx-7 smartnic qsfp stacking nvlink] | [] | [] | [Exact cable/interconnect required between units (QSFP type, length limits); Software configuration steps for multi-unit mode; Performance overhead of inter-unit communication vs. single unit; Does stacking appear as a single device to frameworks, or does it require explicit multi-node code?; Can more than 2 units be stacked?] |
Multi-Unit Stacking
Two Dell Pro Max GB10 units can be connected to form a single, more powerful combined system, effectively doubling the available compute and memory.
1. How It Works
Each Dell Pro Max GB10 has two QSFP ports at 200 Gbps each, provided by the NVIDIA ConnectX-7 SmartNIC. These ports enable a direct unit-to-unit connection:
- Combined memory: 256 GB unified (128 GB per unit)
- Combined compute: 2 PFLOP FP4 (1 PFLOP per unit)
- Interconnect bandwidth: Up to 400 Gbps (2x 200 Gbps QSFP)
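A quick back-of-envelope view of what the interconnect figures above imply for inter-unit traffic, assuming the full 2x 200 Gbps aggregate is usable (real-world throughput will be lower due to protocol and transport overhead):

```python
# Aggregate inter-unit link: two QSFP ports, 200 Gbps each (per the specs above).
LINK_GBPS = 2 * 200

# Convert gigabits/s to gigabytes/s.
link_gb_per_s = LINK_GBPS / 8

# Naive transfer time for a 1 GB payload (e.g. activations or weight shards
# shipped between units) -- an illustrative figure, not a measured number.
payload_gb = 1.0
transfer_ms = payload_gb / link_gb_per_s * 1000

print(f"Aggregate link: {link_gb_per_s:.0f} GB/s")
print(f"1 GB transfer: ~{transfer_ms:.0f} ms")
```

At the rated 400 Gbps this works out to roughly 50 GB/s, so bulk transfers between units are far slower than local unified-memory access, which is why the inter-unit communication overhead (see Open Areas) matters for workload placement.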
2. Model Capacity
| Configuration | Memory | Max Model Size (approx) |
|---|---|---|
| Single unit | 128 GB | ~200B parameters (FP4) |
| Dual stacked | 256 GB | ~400B parameters (FP4) |
This enables running models like Llama 3.1 405B (with quantization) that would not fit in a single unit's memory.
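The capacity figures in the table can be sanity-checked with simple arithmetic: at FP4, each parameter occupies 0.5 bytes, and some fraction of memory must be reserved for KV cache, activations, and the OS. The 0.75 usable fraction below is an assumption for illustration, not a vendor number:

```python
def max_params_fp4(memory_gib: float, usable_fraction: float = 0.75) -> float:
    """Rough upper bound, in billions of parameters, on an FP4 model that
    fits in `memory_gib` of unified memory.

    FP4 stores 0.5 bytes per parameter; `usable_fraction` reserves headroom
    for KV cache, activations, and the OS (0.75 is an assumption).
    """
    usable_bytes = memory_gib * 1024**3 * usable_fraction
    return usable_bytes / 0.5 / 1e9

print(f"Single unit (128 GB): ~{max_params_fp4(128):.0f}B parameters")
print(f"Dual stacked (256 GB): ~{max_params_fp4(256):.0f}B parameters")
```

This yields roughly 206B and 412B parameters respectively, consistent with the ~200B / ~400B figures in the table, and shows why a model like Llama 3.1 405B only becomes feasible (with quantization) on the dual-stacked configuration.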
3. Physical Configuration
The compact form factor (150 x 150 x 51 mm per unit) is designed to be stackable: two units can sit on top of each other on a desk, connected via short QSFP cables.
4. Open Areas
This feature is one of the less-documented aspects of the system. Key unknowns include the exact software configuration, whether the pair presents as a single logical device to frameworks, and the inter-unit communication overhead. See the open questions in the frontmatter.
Key Relationships
- Connected via: connectivity (QSFP/ConnectX-7 ports)
- Extends capacity of: ai-workloads
- Doubles resources from: gb10-superchip, memory-and-storage