# Initial knowledge base for Dell Pro Max GB10 expert agent

Bootstrap expert agent context system with 12 topic files, glossary, equations/bounds reference, open questions tracker, worked example, and CLAUDE.md agent operating manual.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Commit `1a2e908a14` on `master` — 17 changed files with 1,562 additions and 0 deletions
- `.claude/settings.local.json` (+18)
- `CLAUDE.md` (+167)
- `context/ai-frameworks.md` (+76)
- `context/ai-workloads.md` (+78)
- `context/connectivity.md` (+61)
- `context/dgx-os-software.md` (+81)
- `context/equations-and-bounds.md` (+113)
- `context/gb10-superchip.md` (+72)
- `context/memory-and-storage.md` (+50)
- `context/multi-unit-stacking.md` (+52)
- `context/open-questions.md` (+125)
- `context/physical-specs.md` (+62)
- `context/setup-and-config.md` (+83)
- `context/skus-and-pricing.md` (+62)
- `examples/llm-memory-estimation.md` (+48)
- `phases/phase-01-initial-build.md` (+48)
- `reference/glossary.yaml` (+366)
+++ .claude/settings.local.json
@@ -0,0 +1,18 @@

```json
{
  "permissions": {
    "allow": [
      "Bash",
      "Edit",
      "Read",
      "Write",
      "Glob",
      "Grep",
      "WebFetch",
      "WebSearch",
      "Skill(constraint-lookup)",
      "Skill(phase-analysis)"
    ],
    "deny": [],
    "ask": []
  }
}
```
+++ CLAUDE.md
@@ -0,0 +1,167 @@

# Dell Pro Max GB10 - Expert Knowledge Base

**Project:** Domain expert agent for the Dell Pro Max with NVIDIA GB10 Grace Blackwell desktop AI system
**Format:** Linked context files (Markdown + YAML) with cross-references
**Status:** Active research

## YOU ARE THE EXPERT AGENT

**You (Claude) are the Dell Pro Max GB10 expert.** The `context/` files, `reference/glossary.yaml`,
`examples/`, and source materials are YOUR knowledge base. They exist so you can give accurate,
deeply sourced answers to technical questions about the Dell Pro Max GB10 hardware, software,
configuration, AI development workflows, and troubleshooting.

**ALWAYS consult the context system before answering any Dell Pro Max GB10 question or proposing
new ideas.** Do not rely on your training data alone — the context files contain curated,
cross-validated data that is more precise and more specific than general knowledge.

---

## How to Answer a Question

1. **Identify the topic(s).** Use the Quick Topic Lookup table (below) to determine
   which context file(s) are relevant. Most questions touch 1-3 topics.

2. **Read the relevant context file(s).** Each file in `context/` is a self-contained
   deep dive on one topic. Read the full file — don't guess from the filename.

3. **Follow cross-references.** Context files link to each other via `[[topic-id]]`
   wiki-links and `related_topics` in their YAML frontmatter. If a question spans
   topics, follow these links.

4. **Check `equations-and-bounds.md` for numbers.** If the question involves a number,
   formula, or physical bound, check here first.

5. **Check `glossary.yaml` for definitions.** Use this when the user asks "what is X?"
   or when you need to verify a term's meaning.

6. **Check `open-questions.md` for known unknowns.** If the question touches something
   uncertain, this file catalogs what is known vs. unknown.

7. **Cite your sources.** Reference the specific context file and section. If data
   came from external literature, include the citation.
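The `[[topic-id]]` wiki-links mentioned in step 3 can be followed mechanically. A minimal sketch of a link extractor (the regex and function name are illustrative, not part of this repo):

```python
import re

# Matches [[topic-id]] and [[topic-id|display label]] wiki-links.
WIKI_LINK = re.compile(r"\[\[([a-z0-9-]+)(?:\|[^\]]*)?\]\]")

def extract_links(markdown_text: str) -> list[str]:
    """Return the topic ids referenced by wiki-links, in order of appearance."""
    return WIKI_LINK.findall(markdown_text)
```

For example, `extract_links("Memory: [[memory-and-storage]] via [[gb10-superchip|NVLink-C2C]]")` returns `['memory-and-storage', 'gb10-superchip']`.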
---

## Quick Topic Lookup

| User asks about...                                 | Read this file                    |
|----------------------------------------------------|-----------------------------------|
| GB10 chip, Grace Blackwell, SoC, CPU, GPU cores    | `context/gb10-superchip.md`       |
| Memory, LPDDR5X, unified memory, bandwidth         | `context/memory-and-storage.md`   |
| SSD, NVMe, storage options, 2TB, 4TB               | `context/memory-and-storage.md`   |
| Ports, USB-C, HDMI, ethernet, QSFP, connectivity   | `context/connectivity.md`         |
| Network, 10GbE, ConnectX-7, SmartNIC, Wi-Fi 7      | `context/connectivity.md`         |
| DGX OS, Ubuntu, Linux, OS setup, drivers           | `context/dgx-os-software.md`      |
| CUDA, PyTorch, NeMo, RAPIDS, AI frameworks         | `context/ai-frameworks.md`        |
| LLM, model inference, Llama, 200B parameters       | `context/ai-workloads.md`         |
| Stacking, multi-unit, ConnectX-7, 400B models      | `context/multi-unit-stacking.md`  |
| Physical size, dimensions, weight, form factor     | `context/physical-specs.md`       |
| Power, 280W adapter, TDP, thermals                 | `context/physical-specs.md`       |
| Price, SKUs, configurations, purchasing            | `context/skus-and-pricing.md`     |
| Setup, first boot, initial config, wizard          | `context/setup-and-config.md`     |
| Troubleshooting, reinstall OS, recovery            | `context/setup-and-config.md`     |
| Formulas, bounds, constants, performance numbers   | `context/equations-and-bounds.md` |
| What we don't know, gaps, unknowns                 | `context/open-questions.md`       |
| Term definitions, units, acronyms                  | `reference/glossary.yaml`         |
| Worked calculations, example workflows             | `examples/*.md`                   |
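The lookup table above is essentially a keyword-to-file routing map. A minimal sketch of that routing, with abridged, illustrative keyword lists (the dictionary and function names are hypothetical, not part of the repo):

```python
# Abridged keyword index derived from the Quick Topic Lookup table.
TOPIC_INDEX = {
    "context/gb10-superchip.md": ["gb10", "grace", "blackwell", "soc"],
    "context/memory-and-storage.md": ["memory", "lpddr5x", "nvme", "storage"],
    "context/connectivity.md": ["usb-c", "hdmi", "qsfp", "ethernet"],
    "context/physical-specs.md": ["tdp", "weight", "dimensions", "thermals"],
}

def route(question: str) -> list[str]:
    """Return context files whose keywords appear in the question (most hit 1-3)."""
    q = question.lower()
    return [path for path, words in TOPIC_INDEX.items() if any(w in q for w in words)]
```

`route("How much LPDDR5X memory does the GB10 have?")` matches both the superchip and memory files, mirroring the "most questions touch 1-3 topics" guidance above.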
---

## How to Formulate New Ideas

When the user asks you to reason about something novel:

1. **Ground it in existing data.** Read relevant context files first.
2. **Check the bounds.** Verify reasoning doesn't violate known constraints
   (e.g., memory limits, TFLOPS ceilings, power envelope).
3. **Cross-validate.** Multiple sources often cover the same quantity — use them as
   cross-checks.
4. **Flag uncertainty honestly.** If reasoning depends on uncertain parameters, say so.
5. **Preserve new insights.** If reasoning produces a genuinely new finding, offer to
   add it to the appropriate context file so it persists for future sessions.

---

## Conventions (CRITICAL)

- **Architecture is ARM, not x86.** The GB10 uses ARMv9.2 cores. Never assume x86 compatibility.
- **Memory is unified.** CPU and GPU share 128GB LPDDR5X — there is no separate VRAM pool.
- **OS is Linux only.** DGX OS 7 is based on Ubuntu 24.04. Windows is not supported.
- **Power is via USB-C.** The 280W adapter connects over USB Type-C, not a barrel jack or ATX PSU.
- **Units:** Use metric (mm, kg) for physical specs. Use binary (GB, TB) for memory/storage.
- **Model names:** "Dell Pro Max GB10" or "Dell Pro Max with GB10" — this is the Dell-branded product. "DGX Spark" is NVIDIA's own-brand equivalent using the same GB10 superchip.
- **TFLOPS figures:** 1 PFLOP (1,000 TFLOPS) is at FP4 precision. Always state the precision when quoting performance.

## DO NOT

- Do not assume x86 software compatibility — this is an ARM system
- Do not confuse the Dell Pro Max GB10 with Dell's other Pro Max desktops (which use Intel/AMD)
- Do not state the 1 PFLOP figure without specifying FP4 precision
- Do not assume Windows can be installed
- Do not confuse "unified memory" with "system RAM + VRAM" — it is a single shared pool
- Do not assume standard PCIe GPU upgrades are possible — the GPU is part of the SoC
- Do not quote bandwidth numbers without specifying the interface (NVLink-C2C, memory bus, network)

---

## Evidence Tiers

| Tier | Label         | Meaning                                                   |
|------|---------------|-----------------------------------------------------------|
| T0   | Spec Sheet    | Official Dell/NVIDIA published specifications             |
| T1   | Documented    | In official manuals, user guides, or support articles     |
| T2   | Benchmarked   | Independent review measurements (Phoronix, etc.)          |
| T3   | Inferred      | Grounded reasoning from known specs, not directly tested  |
| T4   | Speculative   | Consistent with architecture but no confirming data       |

- Tag individual claims, not sections. One paragraph can mix tiers.
- A derivation inherits the highest (least certain) tier of its inputs.
- Mention the tier to the user when presenting T3 or T4 claims.
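The inheritance rule above (a derivation takes the least-certain tier of its inputs) reduces to a max over tier indices. A minimal sketch (function name is illustrative):

```python
TIERS = ["T0", "T1", "T2", "T3", "T4"]  # ordered most certain -> least certain

def derived_tier(input_tiers: list[str]) -> str:
    """A derivation inherits the highest-numbered (least certain) tier of its inputs."""
    return max(input_tiers, key=TIERS.index)

# A figure computed from a T0 spec value and a T3 inference is itself T3.
```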
---

## Key Concepts Quick Map

```
Dell Pro Max GB10 (product)
│
├── GB10 Superchip (SoC) ──── Grace CPU (ARM), Blackwell GPU, NVLink-C2C
│        │
│        ├── Memory System ──── 128GB unified LPDDR5X, 273 GB/s
│        │
│        └── AI Compute ──── 1 PFLOP FP4, Tensor Cores (5th gen), CUDA cores
│                 │
│                 ├── AI Frameworks ──── PyTorch, NeMo, RAPIDS, CUDA
│                 │
│                 └── AI Workloads ──── LLM inference (up to 200B), fine-tuning
│
├── Connectivity ──── USB-C, HDMI 2.1b, 10GbE, ConnectX-7 QSFP
│        │
│        └── Multi-Unit Stacking ──── 2x units via ConnectX-7, up to 400B models
│
├── DGX OS 7 ──── Ubuntu 24.04, NVIDIA drivers, CUDA toolkit
│
├── Physical ──── 150x150x51mm, 1.31kg, 280W USB-C PSU
│
└── SKUs ──── 2TB ($3,699) / 4TB ($3,999)
```

---

## How to Add Content

- **New findings on existing topic:** Edit the relevant `context/*.md` file
- **New topic:** Create a new file in `context/`, add cross-references to related topics, and add a row to the Quick Topic Lookup table above
- **Split a topic:** When a context file exceeds ~500 lines, decompose into subtopics
- **New research phase:** Create a new file in `phases/`
- **New worked example:** Add to `examples/`
- **Archive, never delete:** Move superseded files to `_archive/`

---

## History

| Phase | Date       | Summary                                          |
|-------|------------|--------------------------------------------------|
| 1     | 2026-02-14 | Initial knowledge base created from web research |
+++ context/ai-frameworks.md
@@ -0,0 +1,76 @@

---
id: ai-frameworks
title: "AI Frameworks and Development Tools"
status: established
source_sections: "Web research: NVIDIA newsroom, Arm learning paths"
related_topics: [dgx-os-software, gb10-superchip, ai-workloads]
key_equations: []
key_terms: [pytorch, nemo, rapids, cuda, ngc, jupyter, tensorrt, llama-cpp]
images: []
examples: []
open_questions:
- "TensorFlow support status on ARM GB10 (official vs. community)"
- "Full NGC catalog availability — which containers work on GB10?"
- "vLLM or other inference server support on ARM Blackwell"
- "JAX support status"
---

# AI Frameworks and Development Tools

The Dell Pro Max GB10 supports a broad AI software ecosystem, pre-configured through DGX OS.

## 1. Core Frameworks

### PyTorch
- Primary deep learning framework
- ARM64-native builds available
- Full CUDA support on the Blackwell GPU

### NVIDIA NeMo
- Framework for fine-tuning and customizing large language models
- Supports supervised fine-tuning (SFT), RLHF, and other alignment techniques
- Optimized for NVIDIA hardware

### NVIDIA RAPIDS
- GPU-accelerated data science libraries
- Includes cuDF (DataFrames), cuML (machine learning), cuGraph (graph analytics)
- Drop-in replacements for pandas, scikit-learn, and NetworkX

## 2. Compute and Inference Tools

### CUDA Toolkit
- Low-level GPU compute API
- Compiler (nvcc) for custom CUDA kernels
- Profiling and debugging tools

### llama.cpp
- Quantized LLM inference engine
- ARM-optimized builds available for GB10
- Supports the GGUF model format
- Documented in an [Arm Learning Path](https://learn.arm.com/learning-paths/laptops-and-desktops/dgx_spark_llamacpp/)

### TensorRT (expected)
- NVIDIA's inference optimizer
- Blackwell architecture support expected

## 3. Development Environment

- **Jupyter Notebooks** — pre-installed for interactive development
- **Python** — system Python with the AI/ML package ecosystem
- **NVIDIA NGC Catalog** — library of pre-trained models, containers, and SDKs
- **Containers** — Docker/container support for reproducible environments

## 4. Software Compatibility Notes

Since the GB10 is an ARM system:

- All Python packages must have ARM64 wheels or be compilable from source
- Most popular ML libraries (PyTorch, NumPy, etc.) have ARM64 support
- Some niche packages may require building from source
- x86-only binary packages will not work
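Before installing a package, it can help to confirm the interpreter is running on ARM64. A minimal sketch using the standard `platform` module (function name is illustrative):

```python
import platform

def is_arm64() -> bool:
    """True on ARM64/aarch64 systems such as the GB10.

    x86-only binary wheels will not run here, so check before pinning
    packages that ship platform-specific binaries."""
    return platform.machine().lower() in {"aarch64", "arm64"}
```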
## Key Relationships

- Runs on: [[dgx-os-software]]
- Accelerated by: [[gb10-superchip]]
- Powers: [[ai-workloads]]
+++ context/ai-workloads.md
@@ -0,0 +1,78 @@

---
id: ai-workloads
title: "AI Workloads and Model Capabilities"
status: established
source_sections: "Web research: NVIDIA newsroom, Dell product page, WCCFTech"
related_topics: [gb10-superchip, memory-and-storage, ai-frameworks, multi-unit-stacking]
key_equations: [model-memory-estimate]
key_terms: [llm, inference, fine-tuning, quantization, fp4, fp8, fp16, parameter-count]
images: []
examples: [llm-memory-estimation.md]
open_questions:
- "Actual tokens/sec benchmarks for common models (Llama 3.3 70B, Mixtral, etc.)"
- "Maximum batch size for inference at various model sizes"
- "Fine-tuning performance — how long to SFT a 7B model on this hardware?"
- "Stable Diffusion / image generation performance"
- "Training from scratch — is it practical for any meaningful model size?"
---

# AI Workloads and Model Capabilities

The Dell Pro Max GB10 is designed primarily for **local AI inference and fine-tuning**, bringing capabilities that previously required cloud or data center hardware to a desktop form factor.

## 1. Headline Capabilities

- **Up to 200 billion parameter models** locally (with quantization)
- **1 PFLOP (1,000 TFLOPS)** at FP4 precision
- **Llama 3.3 70B** confirmed to run locally (single unit)
- **Up to 400B parameter models** with two-unit stacking (see [[multi-unit-stacking]])

## 2. Model Size vs. Memory

With 128 GB of unified memory, the system can hold:

| Precision | Bytes/Param | Max Params (approx) | Example Models                  |
|-----------|-------------|---------------------|---------------------------------|
| FP4       | 0.5         | ~200B+              | Large quantized models          |
| FP8/INT8  | 1.0         | ~100B               | Llama 3.3 70B, Mixtral          |
| FP16      | 2.0         | ~50-55B             | Medium models at full precision |
| FP32      | 4.0         | ~25-28B             | Small models, debugging         |

*Note: Actual usable capacity is less than 128 GB due to OS, KV cache, framework overhead, and activation memory. Estimates assume ~85-90% of memory is available for model weights.*
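The table above follows from a single division. A minimal sketch, assuming ~110 GB of the 128 GB is usable for weights (within the ~85-90% note; the function name is illustrative):

```python
BYTES_PER_PARAM = {"FP4": 0.5, "FP8": 1.0, "FP16": 2.0, "FP32": 4.0}

def max_params_billions(usable_gb: float, precision: str) -> float:
    """Largest model (billions of parameters) whose raw weights fit in usable_gb."""
    return usable_gb / BYTES_PER_PARAM[precision]

# With 110 GB usable: FP4 -> 220.0, FP8 -> 110.0, FP16 -> 55.0, FP32 -> 27.5
```

These raw figures sit slightly above the table's rounded-down entries, which also reserve headroom for KV cache and activations.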
## 3. Primary Use Cases

### Local LLM Inference
- Run large language models privately, with no cloud dependency
- Interactive chat, code generation, document analysis
- Privacy-sensitive applications (medical, legal, financial)

### Fine-Tuning
- Supervised fine-tuning (SFT) of models using NVIDIA NeMo
- LoRA/QLoRA for parameter-efficient fine-tuning of larger models
- Custom domain adaptation

### AI Prototyping
- Rapid iteration on model architectures
- Dataset preprocessing with RAPIDS
- Experiment tracking and evaluation

### Data Science
- GPU-accelerated analytics with RAPIDS
- Large-scale data processing
- Graph analytics

## 4. Target Users

- AI researchers and developers
- Privacy-conscious organizations
- Academic institutions
- AI prototyping teams
- Independent developers building AI applications

## Key Relationships

- Compute provided by: [[gb10-superchip]]
- Memory constraints: [[memory-and-storage]]
- Frameworks used: [[ai-frameworks]]
- Scaling beyond a single unit: [[multi-unit-stacking]]
+++ context/connectivity.md
@@ -0,0 +1,61 @@

---
id: connectivity
title: "Connectivity and Networking"
status: established
source_sections: "Web research: Dell product page, WCCFTech, Phoronix"
related_topics: [gb10-superchip, multi-unit-stacking, physical-specs, setup-and-config]
key_equations: []
key_terms: [usb-c, hdmi, connectx-7, smartnic, qsfp, wifi-7, bluetooth, displayport-alt-mode, 10gbe]
images: []
examples: []
open_questions:
- "Which USB-C ports support DisplayPort Alt Mode (all or specific ones)?"
- "Maximum display resolution and refresh rate via HDMI 2.1b and DP Alt Mode"
- "Can the QSFP ports be used for general networking or only for multi-unit stacking?"
---

# Connectivity and Networking

The Dell Pro Max GB10 provides extensive I/O for a system of its size, including high-speed networking for multi-unit configurations.

## 1. USB Ports

- **1x USB Type-C (20 Gbps)** — power input port (the 280W adapter connects here)
- **3x USB Type-C (20 Gbps)** — general purpose
- USB-C ports support **DisplayPort Alt Mode** for display output (whether all ports do is an open question — see frontmatter)

## 2. Display Output

- **1x HDMI 2.1b** — dedicated display output
- **USB-C DisplayPort Alt Mode** — additional display(s) via USB-C

## 3. Wired Networking

- **1x 10 GbE Ethernet (RJ45)** — standard network connectivity
- **2x QSFP ports, 200 Gbps each** — via the NVIDIA ConnectX-7 SmartNIC
  - Primary use: [[multi-unit-stacking]] for scaling to 2-unit configurations

## 4. Wireless

- **Wi-Fi 7** (IEEE 802.11be)
- **Bluetooth 5.4**

## 5. Port Summary Table

| Port          | Count | Speed/Spec    | Notes                 |
|---------------|-------|---------------|-----------------------|
| USB-C (power) | 1     | 20 Gbps       | 280W power delivery   |
| USB-C (data)  | 3     | 20 Gbps       | DP Alt Mode supported |
| HDMI          | 1     | 2.1b          | Display output        |
| RJ45 Ethernet | 1     | 10 GbE        | Standard networking   |
| QSFP          | 2     | 200 Gbps each | ConnectX-7 SmartNIC   |
| Wi-Fi         | 1     | Wi-Fi 7       | 802.11be              |
| Bluetooth     | 1     | 5.4           | Integrated            |

## Key Relationships

- Enables: [[multi-unit-stacking]]
- Setup guide: [[setup-and-config]]
- Physical port locations: [[physical-specs]]
+++ context/dgx-os-software.md
@@ -0,0 +1,81 @@

---
id: dgx-os-software
title: "DGX OS and System Software"
status: established
source_sections: "Web research: NVIDIA DGX OS 7 User Guide, Dell support articles, Phoronix"
related_topics: [ai-frameworks, setup-and-config, gb10-superchip]
key_equations: []
key_terms: [dgx-os, ubuntu, cuda, nvidia-driver, dgx-spark, kernel]
images: []
examples: []
open_questions:
- "Can a stock Ubuntu 24.04 ARM be installed instead of DGX OS?"
- "Full list of pre-installed NVIDIA packages and versions"
- "OTA update mechanism and cadence for DGX OS"
- "Does DGX OS include Docker/container runtime by default?"
---

# DGX OS and System Software

The Dell Pro Max GB10 ships with NVIDIA DGX OS 7, a purpose-built Linux distribution for AI development.

## 1. DGX OS 7 Overview

- **Base:** Ubuntu 24.04 LTS (Noble Numbat)
- **Kernel:** Linux 6.8
- **Architecture:** ARM64 (aarch64)
- **NVIDIA branding:** Also called "DGX OS for DGX Spark"

DGX OS is not a separate distribution — it is Ubuntu 24.04 with NVIDIA's customizations layered on top:

- Pre-configured NVIDIA GPU drivers
- CUDA toolkit and libraries
- Platform-specific optimizations and configurations
- Diagnostic and monitoring tools
- System-specific firmware management

## 2. Pre-installed Software Stack

The system ships ready to run AI workloads with:

- **CUDA toolkit** — GPU compute API and compiler
- **NVIDIA drivers** — optimized for the GB10 Blackwell GPU
- **Python** — system Python plus development environments
- **GCC** — ARM-native compiler toolchain
- **OpenJDK** — Java runtime
- **Jupyter notebooks** — interactive development environment

For AI frameworks, see [[ai-frameworks]].

## 3. First Boot and Setup

DGX OS uses a **setup wizard** on first boot that handles:

- User account creation
- Network configuration
- System preferences
- Software configuration

The process is designed for fast onboarding. See [[setup-and-config]] for a detailed walkthrough.

## 4. OS Reinstallation

Dell provides a documented process for reinstalling DGX OS:

- Boot to the GRUB menu
- Select "Install DGX OS 7.2.1 for DGX Spark" from the DGX Spark Installation Options
- Installation takes approximately **25-30 minutes**

Source: [Dell Support KB Article](https://www.dell.com/support/kbdoc/en-us/000382042/how-to-reinstall-the-nvidia-dgx-operating-system-on-dell-pro-max-with-grace-blackwell-systems)

## 5. Important Notes

- **ARM-only:** All software must be ARM64/aarch64 compatible. x86 binaries will not run natively.
- **No Windows:** This system does not support Windows installation.
- **Package management:** Standard Ubuntu `apt` package manager, plus NVIDIA's own repositories.

## Key Relationships

- Runs on: [[gb10-superchip]]
- Provides platform for: [[ai-frameworks]]
- Setup process: [[setup-and-config]]
+++ context/equations-and-bounds.md
@@ -0,0 +1,113 @@

---
id: equations-and-bounds
title: "Equations and Bounds"
status: established
source_sections: "Derived from context files and official specifications"
related_topics: [gb10-superchip, memory-and-storage, ai-workloads, connectivity]
key_equations: [flops-fp4, memory-bandwidth, model-memory-estimate, nvlink-c2c-bandwidth, storage-throughput]
key_terms: [tflops, pflop, bandwidth, throughput, fp4, fp8, fp16, fp32]
images: []
examples: [llm-memory-estimation.md]
open_questions:
- "Sustained vs. peak TFLOPS under real workloads"
- "Actual memory bandwidth under mixed CPU+GPU access patterns"
---

# Equations and Bounds

Reference for all quantitative specifications, formulas, and validation ranges for the Dell Pro Max GB10.

## 1. Compute Performance

### Peak TFLOPS by Precision

| Precision | Peak TFLOPS | Source   | Notes                      |
|-----------|-------------|----------|----------------------------|
| FP4       | 1,000       | T0 Spec  | Headline figure, 1 PFLOP   |
| FP8       | ~500        | T3 Infer | Typical 2:1 ratio from FP4 |
| FP16      | ~250        | T3 Infer | Typical 4:1 ratio from FP4 |
| FP32      | ~125        | T3 Infer | Typical 8:1 ratio from FP4 |

*Note: FP8/FP16/FP32 values are inferred from typical Blackwell architecture ratios. Actual values have not yet been independently confirmed.*

### GPU Cores

- **CUDA cores:** 6,144 (T0 Spec)
- **Tensor Cores:** 5th generation (count TBD)

## 2. Memory

### Bandwidth

- **Memory bandwidth:** 273 GB/s (T0 Spec, LPDDR5X at 9,400 MT/s)
- **NVLink-C2C bandwidth:** 600 GB/s bidirectional (T0 Spec, CPU-GPU interconnect)

### Capacity

- **Total unified memory:** 128 GB LPDDR5X (T0 Spec)
- **Usable for models:** ~109-115 GB (T3 Infer, after OS/framework/KV cache overhead)

## 3. Model Memory Estimation

### Formula: Memory Required for Model Weights

```
Memory (GB) = Parameters (billions) × Bytes_per_parameter
```

| Precision | Bytes/Param | Formula        |
|-----------|-------------|----------------|
| FP4       | 0.5         | Params_B × 0.5 |
| FP8/INT8  | 1.0         | Params_B × 1.0 |
| FP16      | 2.0         | Params_B × 2.0 |
| FP32      | 4.0         | Params_B × 4.0 |

### Total Inference Memory (approximate)

```
Total Memory ≈ Model_Weights + KV_Cache + Activation_Memory + Framework_Overhead
```

Rule of thumb: budget **1.2-1.5x** the raw model weight size for total inference memory.

### Maximum Model Sizes (single unit, 128 GB)

| Precision | Max Params (raw) | Max Params (practical, ~110 GB usable) |
|-----------|------------------|----------------------------------------|
| FP4       | 256B             | ~200B                                  |
| FP8/INT8  | 128B             | ~100B                                  |
| FP16      | 64B              | ~55B                                   |
| FP32      | 32B              | ~27B                                   |
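The weight formula and the 1.2-1.5x rule of thumb combine into a quick estimator. A minimal sketch (the default overhead factor is the midpoint of the rule of thumb, not a measured value; the function name is illustrative):

```python
def inference_memory_gb(params_billions: float, bytes_per_param: float,
                        overhead: float = 1.35) -> float:
    """Rough total inference footprint: weights x overhead.

    The overhead factor covers KV cache, activations, and framework
    overhead; 1.35 is the midpoint of the 1.2-1.5x rule of thumb."""
    return params_billions * bytes_per_param * overhead

# Llama 3.3 70B at FP8: 70 x 1.0 x 1.35 = 94.5 GB, inside the ~110 GB usable budget.
```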
## 4. Networking Bounds

| Interface        | Bandwidth           | Direction        |
|------------------|---------------------|------------------|
| NVLink-C2C       | 600 GB/s            | Bidirectional    |
| LPDDR5X memory   | 273 GB/s            | System memory    |
| QSFP (per port)  | 200 Gbps (25 GB/s)  | Network          |
| QSFP (total)     | 400 Gbps (50 GB/s)  | 2 ports combined |
| 10 GbE Ethernet  | 10 Gbps (1.25 GB/s) | Network          |
| USB-C (per port) | 20 Gbps (2.5 GB/s)  | I/O              |

## 5. Power Bounds

| Parameter      | Value    |
|----------------|----------|
| PSU rating     | 280W     |
| System TDP     | ~140W    |
| Power delivery | USB-C PD |

## 6. Physical Bounds

| Parameter | Value        |
|-----------|--------------|
| Volume    | ~1.15 L      |
| Weight    | 1.31 kg      |
| Footprint | 150 × 150 mm |
| Height    | 51 mm        |

## 7. Validation Rules

When checking calculations:

- Model size estimates should not exceed 128 GB (single) or 256 GB (stacked)
- TFLOPS claims must specify precision — reject unqualified "1 PFLOP" statements
- Memory bandwidth (273 GB/s) is the system memory bus, NOT the NVLink-C2C (600 GB/s)
- Network bandwidth (QSFP) is in Gbps, not GB/s — divide by 8 for bytes
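Two of the validation rules above are mechanical and can be encoded directly. A minimal sketch (function names are illustrative):

```python
def gbps_to_gb_per_s(gbps: float) -> float:
    """Network rates are quoted in Gbps; divide by 8 for GB/s."""
    return gbps / 8.0

def model_fits(weights_gb: float, stacked: bool = False) -> bool:
    """Model weights must not exceed 128 GB (single unit) or 256 GB (stacked)."""
    return weights_gb <= (256 if stacked else 128)

# One 200 Gbps QSFP port moves 25 GB/s; a 200 GB model needs the stacked configuration.
```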
+++ context/gb10-superchip.md
@@ -0,0 +1,72 @@

---
id: gb10-superchip
title: "NVIDIA GB10 Grace Blackwell Superchip"
status: established
source_sections: "Web research: NVIDIA newsroom, WCCFTech, Phoronix, The Register, Arm"
related_topics: [memory-and-storage, ai-frameworks, ai-workloads, connectivity, physical-specs]
key_equations: [flops-fp4, nvlink-c2c-bandwidth]
key_terms: [gb10, grace-blackwell, superchip, cortex-x925, cortex-a725, blackwell-gpu, tensor-core, cuda-core, nvlink-c2c, soc]
images: []
examples: []
open_questions:
- "Exact clock speeds for CPU and GPU dies under sustained load"
- "Detailed per-precision TFLOPS breakdown (FP4/FP8/FP16/FP32/FP64)"
- "Thermal throttling behavior and sustained vs. peak performance"
---

# NVIDIA GB10 Grace Blackwell Superchip

The GB10 is a system-on-a-chip (SoC) that combines an NVIDIA Grace CPU and an NVIDIA Blackwell GPU on a single package, connected via the NVLink Chip-to-Chip (NVLink-C2C) interconnect. It is the core silicon in the Dell Pro Max GB10 and the NVIDIA DGX Spark.

## 1. Architecture Overview

The GB10 is composed of two distinct compute dies:

- **CPU tile:** Designed by MediaTek, based on the ARMv9.2 architecture
- **GPU tile:** Designed by NVIDIA, based on the Blackwell architecture

These are stitched together using TSMC's 2.5D advanced packaging technology and connected via NVIDIA's proprietary NVLink-C2C interconnect, which provides **600 GB/s of bidirectional bandwidth** between the CPU and GPU dies.

## 2. CPU: Grace (ARM)

The Grace CPU portion contains **20 cores** in a big.LITTLE-style configuration:

- **10x ARM Cortex-X925** — high-performance cores
- **10x ARM Cortex-A725** — efficiency cores

Architecture: ARMv9.2

This is the same Grace CPU lineage used in NVIDIA's data center Grace Hopper and Grace Blackwell products, adapted for desktop power envelopes.

## 3. GPU: Blackwell

The Blackwell GPU portion features:

- **6,144 CUDA cores** (comparable to the RTX 5070 core count)
- **5th-generation Tensor Cores** — optimized for AI inference and training
- Peak performance: **1 PFLOP (1,000 TFLOPS) at FP4 precision**

The Tensor Cores are the key differentiator for AI workloads, providing hardware acceleration for the mixed-precision matrix operations used in deep learning.

## 4. NVLink-C2C Interconnect

The CPU and GPU communicate via NVLink Chip-to-Chip:

- **Bidirectional bandwidth:** 600 GB/s
- Enables **unified coherent memory** — both CPU and GPU see the same 128GB LPDDR5X pool
- Eliminates the PCIe bottleneck found in traditional discrete-GPU systems

This coherent memory architecture means there is no need to explicitly copy data between "host" and "device" memory, simplifying AI development workflows.

## 5. Power Envelope

- **System TDP:** ~140W (from related specifications)
- **External PSU:** 280W USB Type-C adapter (headroom for storage, networking, peripherals)

## Key Relationships

- Provides compute for: [[ai-workloads]], [[ai-frameworks]]
- Memory subsystem: [[memory-and-storage]]
- Housed in: [[physical-specs]]
- Connected externally via: [[connectivity]]
- Scales via: [[multi-unit-stacking]]
+++ context/memory-and-storage.md
@@ -0,0 +1,50 @@

---
id: memory-and-storage
title: "Memory and Storage"
status: established
source_sections: "Web research: Dell product page, WCCFTech, Phoronix"
related_topics: [gb10-superchip, ai-workloads, skus-and-pricing]
key_equations: [memory-bandwidth, storage-throughput]
key_terms: [lpddr5x, unified-memory, nvme, pcie-gen4, sed]
images: []
examples: []
open_questions:
- "Is the M.2 SSD user-replaceable or soldered?"
- "Exact sequential and random IOPS for the included NVMe drives"
- "Memory channel configuration (number of channels)"
---

# Memory and Storage

The Dell Pro Max GB10 features a unified memory architecture and NVMe solid-state storage.

## 1. System Memory

- **Capacity:** 128 GB LPDDR5X
- **Speed:** Up to 9,400 MT/s (megatransfers per second)
- **Bandwidth:** 273 GB/s
- **Architecture:** Unified coherent memory shared between CPU and GPU via [[gb10-superchip|NVLink-C2C]]

### Unified Memory Model

Unlike traditional desktop systems with separate system RAM and GPU VRAM, the GB10's memory is a **single shared pool**. Both the Grace CPU and the Blackwell GPU access the same 128 GB with full cache coherence. This means:

- No PCIe transfer bottleneck between CPU and GPU memory
- AI models up to ~200B parameters can fit in memory (with quantization)
- Frameworks see the full 128 GB as available device memory

The LPDDR5X is likely soldered to the SoC package (not user-upgradeable), consistent with the compact form factor.

## 2. Storage

- **Interface:** PCIe Gen 4 M.2 NVMe
- **Options:** 2 TB or 4 TB
- **SED-ready:** Self-Encrypting Drive support available on the 4 TB option

Storage configurations map to SKU pricing — see [[skus-and-pricing]].

## Key Relationships

- Accessed by: [[gb10-superchip]]
- Determines model capacity: [[ai-workloads]]
- SKU differentiation: [[skus-and-pricing]]
@ -0,0 +1,52 @@ |
|||
---
id: multi-unit-stacking
title: "Multi-Unit Stacking"
status: provisional
source_sections: "Web research: WCCFTech, NVIDIA newsroom"
related_topics: [connectivity, gb10-superchip, ai-workloads, memory-and-storage]
key_equations: []
key_terms: [connectx-7, smartnic, qsfp, stacking, nvlink]
images: []
examples: []
open_questions:
- "Exact cable/interconnect required between units (QSFP type, length limits)"
- "Software configuration steps for multi-unit mode"
- "Performance overhead of inter-unit communication vs. single unit"
- "Does stacking appear as a single device to frameworks or require explicit multi-node code?"
- "Can more than 2 units be stacked?"
---

# Multi-Unit Stacking

Two Dell Pro Max GB10 units can be connected to form a more powerful combined system, effectively doubling the available compute and memory.

## 1. How It Works

Each Dell Pro Max GB10 has **2x QSFP 200 Gbps ports** powered by the NVIDIA ConnectX-7 SmartNIC. These ports enable direct unit-to-unit connection:

- **Combined memory:** 256 GB total (128 GB per unit; whether this appears as a single unified pool is an open question)
- **Combined compute:** 2 PFLOP FP4 (1 PFLOP per unit)
- **Interconnect bandwidth:** Up to 400 Gbps (2x 200 Gbps QSFP)

## 2. Model Capacity

| Configuration | Memory | Max Model Size (approx.) |
|---------------|--------|--------------------------|
| Single unit   | 128 GB | ~200B parameters (FP4)   |
| Dual stacked  | 256 GB | ~400B parameters (FP4)   |

This enables running models like **Llama 3.1 405B** (with quantization) that would not fit in a single unit's memory.

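The capacity figures above can be sanity-checked with a rough estimator. The usable-memory fraction and overhead multiplier below are illustrative assumptions, not published figures:

```python
def max_params_billion(memory_gb: float, bytes_per_param: float,
                       usable_frac: float = 0.85, overhead: float = 1.1) -> float:
    """Rough upper bound on model size, in billions of parameters.

    usable_frac discounts OS/driver reservations; overhead covers KV cache
    and runtime buffers. Both defaults are assumptions for illustration.
    """
    usable_bytes = memory_gb * usable_frac * 1e9
    return usable_bytes / (bytes_per_param * overhead) / 1e9

print(round(max_params_billion(128, 0.5)))  # single unit, FP4: ~198B
print(round(max_params_billion(256, 0.5)))  # dual stacked, FP4: ~396B
```

Under these assumptions the estimate lands near the ~200B / ~400B figures in the table.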
## 3. Physical Configuration

The compact form factor (150 x 150 x 51 mm per unit) is designed to be **stackable** — two units can sit on top of each other on a desk, connected via short QSFP cables.

## 4. Open Areas

This feature is one of the less-documented aspects of the system. Key unknowns include the exact software configuration, whether the pair presents as a single logical device, and inter-unit communication overhead. See the open questions in the frontmatter.

## Key Relationships

- Connected via: [[connectivity]] (QSFP/ConnectX-7 ports)
- Extends capacity of: [[ai-workloads]]
- Doubles resources from: [[gb10-superchip]], [[memory-and-storage]]

@ -0,0 +1,125 @@
---
id: open-questions
title: "Open Questions"
status: active
source_sections: "Aggregated from all context files"
related_topics: [gb10-superchip, memory-and-storage, connectivity, dgx-os-software, ai-frameworks, ai-workloads, multi-unit-stacking, physical-specs, setup-and-config, skus-and-pricing]
---

# Open Questions

Catalog of known unknowns, research gaps, and unresolved questions about the Dell Pro Max GB10.

## Hardware

### GB10 Superchip
- **Q:** What are the exact clock speeds for CPU and GPU dies under sustained load?
  - *Status:* Unknown. No official boost/base clocks published.
  - *Would resolve:* Performance prediction, thermal modeling
- **Q:** What is the detailed per-precision TFLOPS breakdown (FP4/FP8/FP16/FP32/FP64)?
  - *Status:* Only FP4 (1,000 TFLOPS) is officially published. Others are inferred.
  - *Would resolve:* Accurate workload performance estimation
- **Q:** What is the thermal throttling behavior?
  - *Status:* Unknown. Sustained vs. peak performance delta not documented.
  - *Would resolve:* Real-world performance expectations

### Memory
- **Q:** Is the LPDDR5X soldered or socketed?
  - *Status:* Almost certainly soldered (given LPDDR5X and the form factor), but not confirmed.
  - *Would resolve:* Upgradeability
- **Q:** What is the memory channel configuration?
  - *Status:* Unknown. Number of channels not published.
  - *Would resolve:* Memory performance modeling

### Storage
- **Q:** Is the M.2 SSD user-replaceable?
  - *Status:* Unknown. Owner's manual may clarify.
  - *Would resolve:* Storage upgrade path
- **Q:** What are the exact sequential throughput and random IOPS figures?
  - *Status:* Unknown. Drive model not publicly identified.
  - *Would resolve:* Storage performance expectations

## Software

### DGX OS
- **Q:** Can stock Ubuntu 24.04 ARM be installed instead of DGX OS?
  - *Status:* Likely possible but unsupported. Not documented.
  - *Would resolve:* OS flexibility
- **Q:** Full list of pre-installed NVIDIA packages and versions?
  - *Status:* Partially known. Full manifest not published.
  - *Would resolve:* Development environment baseline
- **Q:** Does DGX OS include Docker/container runtime by default?
  - *Status:* Unknown.
  - *Would resolve:* Container workflow setup
- **Q:** OTA update mechanism and cadence?
  - *Status:* Unknown.
  - *Would resolve:* Maintenance planning

### AI Frameworks
- **Q:** TensorFlow support status on ARM GB10?
  - *Status:* Unknown. Official vs. community builds unclear.
  - *Would resolve:* Framework selection for TF users
- **Q:** Full NGC catalog availability for GB10?
  - *Status:* Unknown; unclear which containers have ARM builds.
  - *Would resolve:* Software ecosystem breadth
- **Q:** vLLM or other inference server support on ARM Blackwell?
  - *Status:* Unknown.
  - *Would resolve:* Production inference deployment options
- **Q:** JAX support status?
  - *Status:* Unknown.
  - *Would resolve:* Framework selection for JAX users

## Networking / Multi-Unit

- **Q:** What cable/interconnect is required for multi-unit stacking?
  - *Status:* QSFP cables, but the exact type/spec is not documented.
  - *Would resolve:* Multi-unit setup purchasing
- **Q:** Software configuration steps for multi-unit mode?
  - *Status:* Not documented publicly.
  - *Would resolve:* Multi-unit deployment
- **Q:** Does stacking appear as a single logical device to frameworks?
  - *Status:* Unknown. May require explicit multi-node code.
  - *Would resolve:* Development complexity for stacked setups
- **Q:** Can more than 2 units be stacked?
  - *Status:* Only the 2-unit configuration is documented.
  - *Would resolve:* Maximum scaling potential
- **Q:** Can the QSFP ports be used for general networking?
  - *Status:* Unknown. May be reserved for stacking.
  - *Would resolve:* Network architecture options

## Physical / Environmental

- **Q:** Noise levels under load?
  - *Status:* No dB measurements published.
  - *Would resolve:* Office/desk suitability
- **Q:** Operating temperature range?
  - *Status:* Unknown.
  - *Would resolve:* Deployment environment requirements
- **Q:** VESA mount compatibility?
  - *Status:* Unknown.
  - *Would resolve:* Mounting options
- **Q:** Cooling solution details (fan count, heatsink type)?
  - *Status:* Unknown.
  - *Would resolve:* Thermal management understanding

## Performance Benchmarks

- **Q:** Actual tokens/sec for common LLMs (Llama 3.3 70B, Mixtral, etc.)?
  - *Status:* No published benchmarks from Dell or independent reviewers yet.
  - *Would resolve:* Real-world inference performance expectations
- **Q:** Fine-tuning time estimates for common model sizes?
  - *Status:* Unknown.
  - *Would resolve:* Training workflow planning
- **Q:** Stable Diffusion / image generation performance?
  - *Status:* Unknown.
  - *Would resolve:* Non-LLM AI workload suitability

---

## Resolved Questions

*(Move questions here as they get answered, with date and resolution)*

| Date | Question | Resolution | Source |
|------|----------|------------|--------|
| —    | —        | —          | —      |

@ -0,0 +1,62 @@
---
id: physical-specs
title: "Physical Specifications"
status: established
source_sections: "Web research: Dell product page, WCCFTech"
related_topics: [connectivity, gb10-superchip, skus-and-pricing]
key_equations: [volume-calculation]
key_terms: [form-factor, micro-desktop, usb-c-psu, tdp]
images: []
examples: []
open_questions:
- "Noise levels under load (dB)"
- "Operating temperature range"
- "VESA mount compatibility"
- "Cooling solution details (fan count, heatsink type)"
---

# Physical Specifications

The Dell Pro Max GB10 is an ultra-compact mini desktop designed to sit on or near a desk.

## 1. Dimensions and Weight

| Spec   | Value                   |
|--------|-------------------------|
| Width  | 150 mm (5.9 in)         |
| Depth  | 150 mm (5.9 in)         |
| Height | 51 mm (2.0 in)          |
| Volume | ~1.15 liters            |
| Weight | 1.31 kg (2.89 lbs) base |

For reference, the footprint is roughly the size of a large coaster or small book.

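The volume row is simple arithmetic from the published dimensions (the `volume-calculation` entry in this file's frontmatter):

```python
# Chassis volume from the published dimensions; 1 liter = 1e6 mm^3.
width_mm, depth_mm, height_mm = 150, 150, 51
volume_l = width_mm * depth_mm * height_mm / 1e6
print(volume_l)  # 1.1475, i.e. the ~1.15 L in the table above
```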
## 2. Power Supply

- **External adapter:** 280W USB Type-C
- **Connection:** USB-C Power Delivery
- **System TDP:** ~140W

The PSU is external, keeping the unit itself compact and cool. The 280W rating provides headroom beyond the ~140W system TDP for peripherals, storage, and networking.

## 3. Form Factor

- **Classification:** Micro desktop / mini PC
- **Design:** Stackable (for [[multi-unit-stacking]])
- **Chassis:** Compact rectangular enclosure

## 4. Scale Comparison

| Compared to...      | Dell Pro Max GB10          |
|---------------------|----------------------------|
| Mac Mini M4 Pro     | Similar footprint, thinner |
| NVIDIA DGX Spark    | Identical hardware         |
| Traditional desktop | ~20x smaller by volume     |
| Laptop              | Comparable weight          |

## Key Relationships

- Houses: [[gb10-superchip]]
- External ports: [[connectivity]]
- Stacking design: [[multi-unit-stacking]]
- Pricing: [[skus-and-pricing]]

@ -0,0 +1,83 @@
---
id: setup-and-config
title: "Setup and Configuration"
status: provisional
source_sections: "Web research: NVIDIA DGX OS 7 User Guide, Dell support KB"
related_topics: [dgx-os-software, connectivity, physical-specs]
key_equations: []
key_terms: [first-boot, setup-wizard, grub, reinstall, dgx-os]
images: []
examples: []
open_questions:
- "Full first-boot wizard steps with screenshots"
- "BIOS/firmware update procedure"
- "Network boot (PXE) capabilities"
- "Remote management / BMC / IPMI availability"
- "Factory reset procedure beyond OS reinstall"
---

# Setup and Configuration

Guide for initial setup, configuration, and recovery of the Dell Pro Max GB10.

## 1. Initial Setup (First Boot)

### Physical Setup
1. Place the unit on a stable surface (the stackable design allows multiple units)
2. Connect the **280W USB-C power adapter** to the designated power USB-C port
3. Connect a display via **HDMI 2.1b** or **USB-C DisplayPort Alt Mode**
4. Connect a keyboard and mouse (USB-C or Bluetooth)
5. Optionally connect **10GbE Ethernet** for wired networking

### First Boot Wizard
On first power-on, DGX OS presents a setup wizard:
1. Language and locale selection
2. User account creation
3. Network configuration (Wi-Fi 7 or Ethernet)
4. System preferences
5. Software configuration

The wizard is designed for fast onboarding — the system is ready to use shortly after completing it.

## 2. OS Reinstallation

If you need to reinstall DGX OS from scratch:

1. Power on or reboot the system
2. Access the **GRUB boot menu**
3. Navigate to **DGX Spark Installation Options**
4. Select **"Install DGX OS 7.2.1 for DGX Spark"**
5. Follow the on-screen prompts
6. Installation takes approximately **25-30 minutes**

Source: [Dell Support — How to Reinstall DGX OS](https://www.dell.com/support/kbdoc/en-us/000382042/how-to-reinstall-the-nvidia-dgx-operating-system-on-dell-pro-max-with-grace-blackwell-systems)

## 3. Post-Setup Configuration

### Recommended Steps
- Update DGX OS packages: `sudo apt update && sudo apt upgrade`
- Verify the GPU is detected: `nvidia-smi`
- Verify the CUDA toolkit: `nvcc --version`
- Configure SSH for remote access
- Set up a development environment (Jupyter, conda/venv, etc.)

### Network Configuration
- **Wi-Fi 7:** Configure via NetworkManager or `nmcli`
- **10GbE Ethernet:** Auto-configured via DHCP, or set a manual static IP
- **QSFP ports:** For [[multi-unit-stacking]] configuration

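The GPU check above can also be scripted. A minimal sketch, assuming the standard CSV query format of `nvidia-smi`; the sample string below is hypothetical, and on a real system you would capture the command's output (e.g. via `subprocess.run`):

```python
def parse_gpu_query(output: str) -> list[dict]:
    """Parse `nvidia-smi --query-gpu=name,memory.total --format=csv,noheader` lines."""
    gpus = []
    for line in output.strip().splitlines():
        name, mem_total = (field.strip() for field in line.split(","))
        gpus.append({"name": name, "memory_total": mem_total})
    return gpus

sample = "NVIDIA GB10, 131072 MiB\n"  # hypothetical output, for illustration only
print(parse_gpu_query(sample))
```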
## 4. Troubleshooting

| Symptom                | Check                                        |
|------------------------|----------------------------------------------|
| No display output      | Try both HDMI and USB-C DP Alt Mode          |
| GPU not detected       | Run `nvidia-smi`, check driver installation  |
| Network not connecting | Verify cable/Wi-Fi config, run `ip addr`     |
| System won't boot      | Access GRUB menu, try OS reinstall           |
| Slow AI performance    | Check `nvidia-smi` for thermal throttling    |

## Key Relationships

- Operating system: [[dgx-os-software]]
- Physical ports: [[connectivity]]
- Hardware: [[physical-specs]]

@ -0,0 +1,62 @@
---
id: skus-and-pricing
title: "SKUs and Pricing"
status: established
source_sections: "Web research: Dell product page, WCCFTech, Phoronix"
related_topics: [memory-and-storage, physical-specs]
key_equations: []
key_terms: [fcm1253, sku]
images: []
examples: []
open_questions:
- "Are there additional SKU variants beyond 2TB/4TB?"
- "Enterprise/volume pricing"
- "Warranty and support tiers available"
- "Availability by region"
---

# SKUs and Pricing

The Dell Pro Max GB10 is available in two primary storage configurations.

## 1. Available Models

| Model         | Storage | SED | Price (USD) |
|---------------|---------|-----|-------------|
| FCM1253 (2TB) | 2 TB    | No  | $3,699      |
| FCM1253 (4TB) | 4 TB    | Yes | $3,999      |

Both models share identical compute and memory specifications:

- NVIDIA GB10 Superchip
- 128 GB LPDDR5X
- All connectivity options

The only differentiators between SKUs are storage capacity and SED (Self-Encrypting Drive) support.

## 2. Model Number

- **Dell model identifier:** Dell Pro Max FCM1253
- **Form factor designation:** Micro

## 3. Release Timeline

- **Announced:** CES 2025 (as NVIDIA Project DIGITS)
- **Available:** October 15, 2025
- **Current status:** Shipping

## 4. Competitive Positioning

| Product                 | Price  | Memory | AI Compute     |
|-------------------------|--------|--------|----------------|
| Dell Pro Max GB10 (2TB) | $3,699 | 128 GB | 1 PFLOP FP4    |
| Dell Pro Max GB10 (4TB) | $3,999 | 128 GB | 1 PFLOP FP4    |
| NVIDIA DGX Spark        | $2,999 | 128 GB | 1 PFLOP FP4    |
| Mac Studio M4 Ultra     | $3,999 | 192 GB | ~55 TOPS (ANE) |

*Note: The NVIDIA DGX Spark uses the same GB10 hardware at a lower price point. The Dell version adds Dell's enterprise support, warranty, and supply chain.*

## Key Relationships

- Storage options: [[memory-and-storage]]
- Physical form factor: [[physical-specs]]

@ -0,0 +1,48 @@
# Worked Example: LLM Memory Estimation on Dell Pro Max GB10

## Problem

Estimate whether Llama 3.3 70B can run on a single Dell Pro Max GB10, and at what precision.

## Given

- **Model:** Llama 3.3 70B (70 billion parameters)
- **Available memory:** 128 GB unified LPDDR5X
- **Usable memory:** ~110 GB (after OS, framework, overhead)

## Calculation

### Step 1: Raw Model Weight Memory

| Precision | Bytes/Param | Memory for 70B    |
|-----------|-------------|-------------------|
| FP4       | 0.5         | 70 × 0.5 = 35 GB  |
| FP8/INT8  | 1.0         | 70 × 1.0 = 70 GB  |
| FP16      | 2.0         | 70 × 2.0 = 140 GB |
| FP32      | 4.0         | 70 × 4.0 = 280 GB |

### Step 2: Total Memory with Overhead (1.3x multiplier)

| Precision | Weights | Total (~1.3x) | Fits in 110 GB? |
|-----------|---------|---------------|-----------------|
| FP4       | 35 GB   | ~46 GB        | Yes             |
| FP8/INT8  | 70 GB   | ~91 GB        | Yes             |
| FP16      | 140 GB  | ~182 GB       | No              |
| FP32      | 280 GB  | ~364 GB       | No              |

### Step 3: Conclusion

- **FP4 quantized:** Fits comfortably (46/110 GB ≈ 42% utilization). Plenty of room for a large KV cache and batch sizes.
- **FP8/INT8 quantized:** Fits (91/110 GB ≈ 83% utilization). Tight but workable for single-request inference.
- **FP16 (half precision):** Does NOT fit in a single unit. Would require 2-unit stacking (see [[multi-unit-stacking]]).
- **FP32 (full precision):** Does NOT fit even with stacking.

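The steps above can be expressed as a short script, using the same assumptions (~110 GB usable, 1.3x overhead multiplier):

```python
BYTES_PER_PARAM = {"FP4": 0.5, "FP8/INT8": 1.0, "FP16": 2.0, "FP32": 4.0}

def total_memory_gb(params_b: float, bytes_per_param: float,
                    overhead: float = 1.3) -> float:
    """Estimated total memory: raw weights times overhead multiplier."""
    return params_b * bytes_per_param * overhead

for precision, bpp in BYTES_PER_PARAM.items():
    total = total_memory_gb(70, bpp)
    verdict = "fits" if total <= 110 else "does not fit"
    print(f"{precision}: ~{total:.0f} GB -> {verdict} in 110 GB")
```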
## Verification

NVIDIA confirms Llama 3.3 70B runs locally on a single GB10 unit. This is consistent with FP8 or FP4 quantized inference, which our calculation shows fitting within memory bounds.

## Sources

- Memory specs: [[memory-and-storage]]
- Estimation formulas: [[equations-and-bounds]]
- Model capabilities: [[ai-workloads]]

@ -0,0 +1,48 @@
# Phase 1: Initial Knowledge Base Build

**Date:** 2026-02-14
**Goal:** Bootstrap the expert agent context system for the Dell Pro Max GB10

## What Was Done

1. Created full directory structure following the expert agent template
2. Researched Dell Pro Max GB10 specifications from multiple sources
3. Created 10 context files covering all major topics:
   - `gb10-superchip.md` — SoC architecture, CPU/GPU details, NVLink-C2C
   - `memory-and-storage.md` — 128GB LPDDR5X, NVMe storage options
   - `connectivity.md` — All ports, networking, wireless
   - `dgx-os-software.md` — DGX OS 7, Ubuntu 24.04, software stack
   - `ai-frameworks.md` — PyTorch, NeMo, RAPIDS, CUDA, llama.cpp
   - `ai-workloads.md` — LLM inference, fine-tuning, model capacity
   - `multi-unit-stacking.md` — Dual-unit configuration via ConnectX-7
   - `physical-specs.md` — Dimensions, weight, power supply
   - `skus-and-pricing.md` — 2TB/4TB models, pricing, competitive positioning
   - `setup-and-config.md` — First boot, OS reinstall, troubleshooting
4. Created `equations-and-bounds.md` with formulas and validation ranges
5. Created `open-questions.md` with 25+ tracked unknowns
6. Created `reference/glossary.yaml` with 35 term definitions
7. Created a worked example: LLM memory estimation
8. Created `CLAUDE.md` with the full agent operating manual

## Sources Used

- Dell product page (dell.com)
- NVIDIA newsroom (nvidianews.nvidia.com)
- WCCFTech review/specs article
- Phoronix Linux benchmarking preview
- NVIDIA DGX OS 7 User Guide (docs.nvidia.com)
- Dell Support KB articles
- Arm Learning Paths (learn.arm.com)
- The Register GB10 architecture article

## What Changed

- All files are new (initial build)

## Known Gaps

- No independent benchmark data yet (Phoronix review in progress)
- Multi-unit stacking details are sparse
- Some TFLOPS figures are inferred (only FP4 is officially published)
- Owner's manual details not yet integrated (403 from Dell support)
- No hands-on configuration walkthrough yet

@ -0,0 +1,366 @@
terms:
- term: "gb10"
  full_name: "NVIDIA GB10 Superchip"
  definition: |
    System-on-chip combining an NVIDIA Grace CPU and Blackwell GPU
    connected via NVLink-C2C. The core silicon in the Dell Pro Max GB10
    and NVIDIA DGX Spark.
  unit: null
  typical_range: null
  related_terms: ["grace-blackwell", "superchip", "nvlink-c2c"]
  related_topics: ["gb10-superchip"]

- term: "grace-blackwell"
  full_name: "Grace Blackwell Architecture"
  definition: |
    NVIDIA's combined CPU+GPU architecture pairing a Grace ARM CPU
    with a Blackwell GPU via NVLink-C2C coherent interconnect.
  unit: null
  typical_range: null
  related_terms: ["gb10", "blackwell-gpu", "grace-cpu"]
  related_topics: ["gb10-superchip"]

- term: "superchip"
  full_name: "Superchip"
  definition: |
    NVIDIA's term for a system-on-chip that integrates both CPU and GPU
    dies on a single package with high-bandwidth interconnect.
  unit: null
  typical_range: null
  related_terms: ["gb10", "soc"]
  related_topics: ["gb10-superchip"]

- term: "soc"
  full_name: "System-on-Chip"
  definition: |
    An integrated circuit that combines multiple components (CPU, GPU,
    memory controller, I/O) on a single die or package.
  unit: null
  typical_range: null
  related_terms: ["gb10", "superchip"]
  related_topics: ["gb10-superchip"]

- term: "cortex-x925"
  full_name: "ARM Cortex-X925"
  definition: |
    ARM's high-performance CPU core design (ARMv9.2 architecture).
    The GB10 contains 10 of these as its "big" cores.
  unit: null
  typical_range: null
  related_terms: ["cortex-a725", "gb10"]
  related_topics: ["gb10-superchip"]

- term: "cortex-a725"
  full_name: "ARM Cortex-A725"
  definition: |
    ARM's efficiency-focused CPU core design (ARMv9.2 architecture).
    The GB10 contains 10 of these as its "LITTLE" cores.
  unit: null
  typical_range: null
  related_terms: ["cortex-x925", "gb10"]
  related_topics: ["gb10-superchip"]

- term: "blackwell-gpu"
  full_name: "NVIDIA Blackwell GPU"
  definition: |
    NVIDIA's GPU architecture generation. In the GB10, it provides
    6,144 CUDA cores and 5th-gen Tensor Cores.
  unit: null
  typical_range: null
  related_terms: ["cuda-core", "tensor-core", "gb10"]
  related_topics: ["gb10-superchip"]

- term: "cuda-core"
  full_name: "CUDA Core"
  definition: |
    NVIDIA's basic parallel processing unit for general-purpose GPU
    computing. The GB10 has 6,144 CUDA cores.
  unit: "cores"
  typical_range: "6,144 in GB10"
  related_terms: ["blackwell-gpu", "tensor-core"]
  related_topics: ["gb10-superchip"]

- term: "tensor-core"
  full_name: "Tensor Core (5th Generation)"
  definition: |
    Specialized GPU cores for matrix multiply-accumulate operations,
    critical for deep learning inference and training. 5th-gen Tensor
    Cores in Blackwell support FP4, FP8, FP16, and other precisions.
  unit: "cores"
  typical_range: null
  related_terms: ["blackwell-gpu", "fp4", "fp8"]
  related_topics: ["gb10-superchip", "ai-workloads"]

- term: "nvlink-c2c"
  full_name: "NVLink Chip-to-Chip"
  definition: |
    NVIDIA's proprietary die-to-die interconnect connecting the Grace CPU
    and Blackwell GPU within the GB10 superchip. Provides 600 GB/s
    bidirectional bandwidth and enables unified coherent memory.
  unit: "GB/s"
  typical_range: "600 GB/s bidirectional"
  related_terms: ["gb10", "unified-memory"]
  related_topics: ["gb10-superchip", "memory-and-storage"]

- term: "unified-memory"
  full_name: "Unified Coherent Memory"
  definition: |
    Memory architecture where CPU and GPU share the same physical memory
    pool with hardware cache coherence. Eliminates explicit host-device
    memory copies. In the GB10, both processors see the full 128 GB.
  unit: "GB"
  typical_range: "128 GB in GB10"
  related_terms: ["lpddr5x", "nvlink-c2c"]
  related_topics: ["memory-and-storage", "gb10-superchip"]

- term: "lpddr5x"
  full_name: "Low-Power DDR5X"
  definition: |
    Latest generation of low-power DRAM. In the GB10, runs at up to
    9,400 MT/s providing 273 GB/s of memory bandwidth.
  unit: "MT/s"
  typical_range: "9,400 MT/s in GB10"
  related_terms: ["unified-memory"]
  related_topics: ["memory-and-storage"]

- term: "tflops"
  full_name: "Tera Floating-Point Operations Per Second"
  definition: |
    Unit of compute performance. 1 TFLOPS = 10^12 floating-point
    operations per second. ALWAYS specify the precision (FP4, FP8,
    FP16, FP32) when quoting TFLOPS figures.
  unit: "TFLOPS"
  typical_range: "1,000 TFLOPS FP4 for GB10"
  related_terms: ["pflop", "fp4"]
  related_topics: ["gb10-superchip", "equations-and-bounds"]

- term: "pflop"
  full_name: "Peta Floating-Point Operations Per Second"
  definition: |
    1 PFLOP = 1,000 TFLOPS = 10^15 floating-point operations per second.
    The GB10's headline figure is 1 PFLOP at FP4 precision.
  unit: "PFLOP"
  typical_range: "1 PFLOP FP4 for GB10"
  related_terms: ["tflops", "fp4"]
  related_topics: ["gb10-superchip", "equations-and-bounds"]

- term: "fp4"
  full_name: "4-bit Floating Point"
  definition: |
    Ultra-low precision numerical format using 4 bits per value.
    Used for quantized inference. The GB10's 1 PFLOP headline
    is measured at FP4 precision.
  unit: "bits"
  typical_range: null
  related_terms: ["fp8", "fp16", "quantization", "tflops"]
  related_topics: ["ai-workloads", "equations-and-bounds"]

- term: "fp8"
  full_name: "8-bit Floating Point"
  definition: |
    Low-precision numerical format using 8 bits per value. Common
    for quantized LLM inference with a good accuracy/performance tradeoff.
  unit: "bits"
  typical_range: null
  related_terms: ["fp4", "fp16", "quantization"]
  related_topics: ["ai-workloads", "equations-and-bounds"]

- term: "fp16"
  full_name: "16-bit Floating Point (Half Precision)"
  definition: |
    Standard training precision for many deep learning models.
    Good balance of range, precision, and memory efficiency.
  unit: "bits"
  typical_range: null
  related_terms: ["fp4", "fp8", "fp32"]
  related_topics: ["ai-workloads", "equations-and-bounds"]

- term: "quantization"
  full_name: "Model Quantization"
  definition: |
    Technique for reducing model memory footprint by using lower-precision
    number formats (FP4, FP8, INT4, INT8) for model weights. Enables
    running larger models in limited memory at some accuracy cost.
  unit: null
  typical_range: null
  related_terms: ["fp4", "fp8", "parameter-count"]
  related_topics: ["ai-workloads"]

- term: "parameter-count"
  full_name: "Model Parameter Count"
  definition: |
    The number of trainable weights in a neural network, typically
    expressed in billions (B). Determines memory requirements and
    roughly correlates with model capability.
  unit: "billions (B)"
  typical_range: "7B-200B on single GB10, up to 400B stacked"
  related_terms: ["quantization", "unified-memory"]
  related_topics: ["ai-workloads", "memory-and-storage"]

- term: "dgx-os"
  full_name: "NVIDIA DGX OS 7"
  definition: |
    NVIDIA's customized Linux distribution based on Ubuntu 24.04 LTS.
    Includes pre-configured GPU drivers, CUDA toolkit, and platform
    optimizations for DGX/DGX Spark hardware.
  unit: null
  typical_range: null
  related_terms: ["ubuntu", "cuda"]
  related_topics: ["dgx-os-software"]

- term: "dgx-spark"
  full_name: "NVIDIA DGX Spark"
  definition: |
    NVIDIA's own-branded desktop AI computer using the GB10 superchip.
    Same hardware as the Dell Pro Max GB10, different branding and
    support channel. Priced at $2,999.
  unit: null
  typical_range: null
  related_terms: ["gb10"]
  related_topics: ["skus-and-pricing"]

- term: "connectx-7"
  full_name: "NVIDIA ConnectX-7 SmartNIC"
  definition: |
    High-performance network interface card integrated into the
    Dell Pro Max GB10. Provides 2x QSFP 200 Gbps ports, primarily
    used for multi-unit stacking.
  unit: "Gbps"
  typical_range: "200 Gbps per port"
  related_terms: ["qsfp", "smartnic"]
  related_topics: ["connectivity", "multi-unit-stacking"]

- term: "qsfp"
  full_name: "Quad Small Form-factor Pluggable"
  definition: |
    High-speed networking connector standard. The Dell Pro Max GB10
    has 2x QSFP ports supporting 200 Gbps each via ConnectX-7.
  unit: "Gbps"
  typical_range: "200 Gbps per port in GB10"
  related_terms: ["connectx-7"]
  related_topics: ["connectivity", "multi-unit-stacking"]

- term: "smartnic"
  full_name: "Smart Network Interface Card"
  definition: |
    Network adapter with onboard processing capability for offloading
    network tasks from the main CPU. The ConnectX-7 in the GB10 is
    a SmartNIC.
  unit: null
  typical_range: null
  related_terms: ["connectx-7", "qsfp"]
  related_topics: ["connectivity"]

- term: "10gbe"
  full_name: "10 Gigabit Ethernet"
  definition: |
    Standard Ethernet networking at 10 Gbps. The Dell Pro Max GB10
    includes one 10GbE RJ45 port for general network connectivity.
  unit: "Gbps"
  typical_range: "10 Gbps"
  related_terms: []
  related_topics: ["connectivity"]

- term: "pytorch"
  full_name: "PyTorch"
  definition: |
    Open-source deep learning framework. Primary ML framework
    supported on the GB10, with ARM64-native builds and full
    CUDA acceleration.
  unit: null
  typical_range: null
  related_terms: ["cuda", "nemo"]
  related_topics: ["ai-frameworks"]

- term: "nemo"
  full_name: "NVIDIA NeMo"
  definition: |
    NVIDIA's framework for building, customizing, and deploying
    generative AI models. Supports fine-tuning (SFT, RLHF) and
    is optimized for NVIDIA hardware.
  unit: null
  typical_range: null
  related_terms: ["pytorch", "cuda"]
  related_topics: ["ai-frameworks"]

- term: "rapids"
  full_name: "NVIDIA RAPIDS"
  definition: |
    Suite of GPU-accelerated data science libraries including cuDF
    (DataFrames), cuML (ML), and cuGraph (graph analytics). Drop-in
    replacements for pandas, scikit-learn, and NetworkX.
  unit: null
  typical_range: null
  related_terms: ["cuda"]
  related_topics: ["ai-frameworks"]

- term: "cuda"
  full_name: "Compute Unified Device Architecture"
  definition: |
    NVIDIA's parallel computing platform and API for GPU-accelerated
    computing. Pre-installed on the GB10 via DGX OS.
  unit: null
  typical_range: null
  related_terms: ["cuda-core", "pytorch", "nemo"]
  related_topics: ["ai-frameworks", "dgx-os-software"]

- term: "ngc"
  full_name: "NVIDIA NGC Catalog"
  definition: |
    NVIDIA's hub for GPU-optimized AI software including pre-trained
    models, containers, SDKs, and Helm charts.
  unit: null
  typical_range: null
  related_terms: ["cuda", "nemo"]
  related_topics: ["ai-frameworks"]

- term: "llama-cpp"
  full_name: "llama.cpp"
  definition: |
    Open-source C/C++ inference engine for running quantized LLMs.
    Supports ARM-optimized builds for the GB10 and the GGUF model format.
  unit: null
  typical_range: null
  related_terms: ["quantization"]
  related_topics: ["ai-frameworks", "ai-workloads"]

- term: "fcm1253"
  full_name: "Dell Pro Max FCM1253"
  definition: |
    Dell's model number for the Pro Max with GB10 desktop system.
    Available in 2TB and 4TB storage configurations.
  unit: null
  typical_range: null
  related_terms: ["gb10"]
  related_topics: ["skus-and-pricing"]

- term: "sed"
  full_name: "Self-Encrypting Drive"
  definition: |
    Storage drive with built-in hardware encryption. Available
    on the 4TB configuration of the Dell Pro Max GB10.
  unit: null
  typical_range: null
  related_terms: []
  related_topics: ["memory-and-storage", "skus-and-pricing"]

- term: "tdp"
  full_name: "Thermal Design Power"
  definition: |
    Maximum amount of heat a cooling system must dissipate.
    The GB10 system TDP is approximately 140W.
  unit: "watts"
  typical_range: "~140W for GB10 system"
  related_terms: []
  related_topics: ["physical-specs", "gb10-superchip"]

- term: "displayport-alt-mode"
  full_name: "DisplayPort Alternate Mode"
  definition: |
    Protocol allowing DisplayPort video signals to be carried
    over a USB Type-C connector. Used for display output on
    the GB10's USB-C ports.
  unit: null
  typical_range: null
  related_terms: ["usb-c", "hdmi"]
  related_topics: ["connectivity"]
|||