--- id: equations-and-bounds title: "Equations and Bounds" status: established source_sections: "Derived from context files and official specifications" related_topics: [gb10-superchip, memory-and-storage, ai-workloads, connectivity] key_equations: [flops-fp4, memory-bandwidth, model-memory-estimate, nvlink-c2c-bandwidth, storage-throughput] key_terms: [tflops, pflop, bandwidth, throughput, fp4, fp8, fp16, fp32] images: [] examples: [llm-memory-estimation.md] open_questions: - "Sustained vs. peak TFLOPS under real workloads" - "Actual memory bandwidth under mixed CPU+GPU access patterns" --- # Equations and Bounds Reference for all quantitative specifications, formulas, and validation ranges for the Dell Pro Max GB10. ## 1. Compute Performance ### Peak TFLOPS by Precision | Precision | Peak TFLOPS | Source | Notes | |-----------|-------------|----------|------------------------------------| | FP4 | 1,000 | T0 Spec | Headline figure, 1 PFLOP (w/ sparsity) | | FP8 | ~500 | T3 Infer | Typical 2:1 ratio from FP4 | | FP16 | ~250 | T3 Infer | Typical 4:1 ratio from FP4 | | FP32 | ~125 | T3 Infer | Typical 8:1 ratio from FP4 | | FP64 | ~0.675 | T2 Bench | HPL Linpack, Jeff Geerling | *Note: FP8/FP16/FP32 values are inferred from typical Blackwell architecture ratios. FP64 is benchmarked. FP4 headline includes sparsity.* ### GPU Cores - **CUDA cores:** 6,144 (T0 Spec) - **Tensor Cores:** 5th generation (count TBD) - **RT Cores:** 4th generation (T0 Spec) - **Copy engines:** 2 (T0 Spec) - **NVENC:** 1 (T0 Spec) - **NVDEC:** 1 (T0 Spec) ## 2. Memory ### Bandwidth - **Memory bandwidth:** 273 GB/s (T0 Spec, LPDDR5X at 9,400 MT/s) - **NVLink-C2C bandwidth:** 600 GB/s bidirectional (T0 Spec, CPU-GPU interconnect) ### Configuration - **Interface:** 256-bit, 16 channels LPDDR5X 8533 (T0 Spec, NVIDIA DGX Spark User Guide) ### Capacity - **Total unified memory:** 128 GB LPDDR5X (T0 Spec) - **Usable for models:** ~109-115 GB (T3 Infer, after OS/framework/KV cache overhead) ## 3. Model Memory Estimation ### Formula: Memory Required for Model Weights ``` Memory (GB) = Parameters (billions) × Bytes_per_parameter ``` | Precision | Bytes/Param | Formula | |-----------|-------------|-----------------------------------| | FP4 | 0.5 | Params_B × 0.5 | | FP8/INT8 | 1.0 | Params_B × 1.0 | | FP16 | 2.0 | Params_B × 2.0 | | FP32 | 4.0 | Params_B × 4.0 | ### Total Inference Memory (approximate) ``` Total Memory ≈ Model_Weights + KV_Cache + Activation_Memory + Framework_Overhead ``` Rule of thumb: budget **1.2-1.5x** the raw model weight size for total inference memory. ### Maximum Model Sizes (single unit, 128 GB) | Precision | Max Params (raw) | Max Params (practical, ~110 GB usable) | |-----------|-------------------|----------------------------------------| | FP4 | 256B | ~200B | | FP8/INT8 | 128B | ~100B | | FP16 | 64B | ~55B | | FP32 | 32B | ~27B | ## 4. Networking Bounds | Interface | Spec Bandwidth | Measured | Direction | |---------------------|--------------------|-------------------|-----------------| | NVLink-C2C | 600 GB/s | — | Bidirectional | | LPDDR5X memory | 273 GB/s | — | System memory | | QSFP (per port) | 200 Gbps (25 GB/s) | ~106 Gbps single stream | Network | | QSFP (total) | 400 Gbps (50 GB/s) | 200+ Gbps RDMA | 2 ports combined| | QSFP PCIe link | x4 PCIe Gen 5 | — | Internal | | 10 GbE Ethernet | 10 Gbps (1.25 GB/s)| — | Network | | USB-C (per port) | 20 Gbps (2.5 GB/s) | — | I/O | *QSFP measured throughput from Jeff Geerling (T2 Benchmarked).* ## 5. Power Bounds | Parameter | Dell Pro Max | DGX Spark | Source | |-------------------------|-------------|-----------|-------------| | PSU rating | 280W | 240W | T0 Spec | | GPU TDP | 140W | 140W | T0 Spec | | Idle draw | ~30W | ~40-45W | T2 Bench | | AI inference (LLM) | 60-90W | 60-90W | T2 Bench | | CPU-only load | 120-140W | 120-130W | T2 Bench | | CPU + GPU load | ~200W | ~200W | T2 Bench | ## 6. Environmental Bounds | Parameter | Dell (T0 Owner's Manual) | DGX Spark (T0 User Guide) | |------------------------|------------------------------------|---------------------------------| | Operating temperature | 0°C to 35°C (32°F to 95°F) | 5°C to 30°C (41°F to 86°F) | | Storage temperature | -40°C to 65°C | — | | Operating humidity | 10% to 90% (non-condensing) | 10% to 90% (non-condensing) | | Operating altitude | -15.2 m to 3,048 m | Up to 3,000 m | | Operating vibration | 0.66 GRMS | — | | Operating shock | 110 G (2ms half-sine) | — | | Noise | < 40 dB at 1-1.5m (T2 Bench) | — | ## 7. Physical Bounds | Parameter | Value | |---------------|---------------| | Volume | ~1.15 L | | Weight | 1.31 kg | | Footprint | 150 × 150 mm | | Height | 51 mm | ## 8. Validation Rules When checking calculations: - Model size estimates should not exceed 128 GB (single) or 256 GB (stacked) - TFLOPS claims must specify precision — reject unqualified "1 PFLOP" statements - Memory bandwidth (273 GB/s) is the system memory bus, NOT the NVLink-C2C (600 GB/s) - Network bandwidth (QSFP) is in Gbps, not GB/s — divide by 8 for bytes