
---
id: multi-unit-stacking
title: Multi-Unit Stacking
status: provisional
source_sections: "Web research: WCCFTech, NVIDIA newsroom"
related_topics: [connectivity, gb10-superchip, ai-workloads, memory-and-storage]
key_equations: []
key_terms: [connectx-7, smartnic, qsfp, stacking, nvlink]
images: []
examples: []
open_questions:
  - Exact cable/interconnect required between units (QSFP type, length limits)
  - Software configuration steps for multi-unit mode
  - Performance overhead of inter-unit communication vs. single unit
  - Does stacking appear as a single device to frameworks, or does it require explicit multi-node code?
  - Can more than 2 units be stacked?
---

# Multi-Unit Stacking

Two Dell Pro Max GB10 units can be connected together to form a more powerful combined system, doubling the available memory and nominal compute (real-world scaling depends on inter-unit communication overhead, which remains an open question).

## 1. How It Works

Each Dell Pro Max GB10 has two QSFP ports, each rated at 200 Gbps, driven by the NVIDIA ConnectX-7 SmartNIC. These ports enable a direct unit-to-unit connection:

- Combined memory: 256 GB unified (128 GB per unit)
- Combined compute: 2 PFLOP FP4 (1 PFLOP per unit)
- Interconnect bandwidth: up to 400 Gbps (2x 200 Gbps QSFP)
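The aggregation above is simple addition of the per-unit figures. A minimal sketch, assuming ideal scaling (the per-unit numbers come from the text; actual scaling efficiency is an open question):

```python
# Per-unit figures as quoted in the text; combined figures assume ideal scaling.
PER_UNIT = {
    "memory_gb": 128,      # unified memory per unit
    "fp4_pflop": 1.0,      # FP4 compute per unit
    "qsfp_ports": 2,       # ConnectX-7 QSFP ports per unit
    "gbps_per_port": 200,  # line rate per QSFP port
}

def combined(units: int = 2) -> dict:
    """Aggregate nominal specs for a stack of `units` machines."""
    return {
        "memory_gb": PER_UNIT["memory_gb"] * units,
        "fp4_pflop": PER_UNIT["fp4_pflop"] * units,
        # Inter-unit bandwidth is bounded by the ports on one unit,
        # not by the sum of ports across the stack.
        "interconnect_gbps": PER_UNIT["qsfp_ports"] * PER_UNIT["gbps_per_port"],
    }

print(combined())
# {'memory_gb': 256, 'fp4_pflop': 2.0, 'interconnect_gbps': 400}
```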

## 2. Model Capacity

| Configuration | Memory | Max Model Size (approx.) |
|---------------|--------|--------------------------|
| Single unit   | 128 GB | ~200B parameters (FP4)   |
| Dual stacked  | 256 GB | ~400B parameters (FP4)   |

This enables running models like Llama 3.1 405B (with quantization) that would not fit in a single unit's memory.
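The capacity figures follow from FP4 weights occupying 0.5 bytes per parameter. A rough back-of-envelope sketch; the 20% headroom reserved for KV cache and activations is an illustrative assumption, chosen so a 128 GB unit lands near the ~200B figure above:

```python
# Weights-only capacity estimate at FP4 precision (4 bits = 0.5 bytes/param).
BYTES_PER_PARAM_FP4 = 0.5
HEADROOM = 0.80  # fraction of memory usable for weights (assumption)

def max_params_billion(memory_gb: float) -> float:
    """Approximate largest FP4 model (in billions of params) for a memory size."""
    usable_bytes = memory_gb * 1e9 * HEADROOM
    return usable_bytes / BYTES_PER_PARAM_FP4 / 1e9

def fits(params_billion: float, memory_gb: float) -> bool:
    """Would a model of this size fit, under the assumptions above?"""
    return params_billion <= max_params_billion(memory_gb)

print(round(max_params_billion(128)))   # ~205B for a single 128 GB unit
print(fits(405, 128), fits(405, 256))   # False True
```

Llama 3.1 405B at FP4 needs roughly 405e9 × 0.5 B ≈ 202.5 GB for weights alone, which exceeds a single unit's 128 GB but fits in the stacked 256 GB.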

## 3. Physical Configuration

The compact form factor (150 x 150 x 51 mm per unit) is designed to be stackable: two units can sit on top of each other on a desk, connected via short QSFP cables.

## 4. Open Areas

This feature is one of the less-documented aspects of the system. Key unknowns include the exact software configuration steps, whether the stack presents to frameworks as a single logical device or requires explicit multi-node code, and the inter-unit communication overhead. See the open questions in the frontmatter.

## Key Relationships