| id | title | status | source_sections | related_topics | key_equations | key_terms | images | examples | open_questions |
|---|---|---|---|---|---|---|---|---|---|
| ai-frameworks | AI Frameworks and Development Tools | established | Web research: NVIDIA newsroom, Arm learning paths, NVIDIA DGX Spark User Guide, build.nvidia.com/spark playbooks | [dgx-os-software gb10-superchip ai-workloads] | [] | [pytorch nemo rapids cuda ngc jupyter tensorrt tensorrt-llm llama-cpp docker nvidia-container-runtime fex ollama comfyui sm_121 cu130 speculative-decoding] | [] | [] | [TensorFlow support status on ARM GB10 (official vs. community); full NGC catalog availability — which containers work on GB10?; vLLM or other inference server support on ARM Blackwell; JAX support status] |
# AI Frameworks and Development Tools
The Dell Pro Max GB10 supports a broad AI software ecosystem, pre-configured through DGX OS.
## 1. Core Frameworks

### PyTorch
- Primary deep learning framework
- ARM64-native builds available
- Full CUDA support on Blackwell GPU
### NVIDIA NeMo
- Framework for fine-tuning and customizing large language models
- Supports supervised fine-tuning (SFT), RLHF, and other alignment techniques
- Optimized for NVIDIA hardware
### NVIDIA RAPIDS
- GPU-accelerated data science libraries
- Includes cuDF (DataFrames), cuML (machine learning), cuGraph (graph analytics)
- Drop-in replacements for pandas, scikit-learn, and NetworkX
## 2. Inference Tools

### CUDA Toolkit (v13.0)
- CUDA compute capability: `sm_121` (Blackwell on GB10) — use `-DCMAKE_CUDA_ARCHITECTURES="121"` when compiling
- PyTorch CUDA wheels: `cu130` (e.g., `pip3 install torch --index-url https://download.pytorch.org/whl/cu130`)
- Low-level GPU compute API, compiler (nvcc), profiling and debugging tools
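The `sm_121` target and the CMake flag encode the same compute capability; as a trivial illustration, here is a hypothetical helper (the `arch_flag` name is ours, not part of any toolkit):

```python
def arch_flag(compute_capability: str) -> str:
    """Map a CUDA compute capability string (e.g. "12.1") to the value
    expected by CMAKE_CUDA_ARCHITECTURES (e.g. "121")."""
    major, minor = compute_capability.split(".")
    return f"{major}{minor}"

# Build the cmake define for GB10's Blackwell GPU (compute capability 12.1):
print(f'-DCMAKE_CUDA_ARCHITECTURES="{arch_flag("12.1")}"')
```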
### llama.cpp
- Quantized LLM inference engine
- ARM-optimized builds available for GB10
- Supports GGUF model format
- Build with CUDA: `cmake .. -DGGML_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES="121"` (T1, build.nvidia.com/spark)
- Provides an OpenAI-compatible API via `llama-server` (chat completions, streaming, function calling)
- Documented in an Arm Learning Path
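A minimal sketch of calling `llama-server`'s OpenAI-compatible endpoint from Python; the host and port (`localhost:8080`) and the `"local"` model name are assumptions, and only the request construction is shown here:

```python
import json

def build_chat_request(prompt: str, host: str = "http://localhost:8080"):
    """Build an OpenAI-style chat-completions request for llama-server.
    Returns (url, body) ready to send; the endpoint path follows the
    OpenAI API convention that llama-server implements."""
    url = f"{host}/v1/chat/completions"
    payload = {
        "model": "local",  # llama-server serves whichever model it loaded
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return url, json.dumps(payload).encode()

url, body = build_chat_request("Explain GGUF in one sentence.")
# To actually send it:
#   from urllib.request import Request, urlopen
#   urlopen(Request(url, body, {"Content-Type": "application/json"}))
```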
### TensorRT-LLM
- NVIDIA's LLM inference optimizer — confirmed available (T1, build.nvidia.com/spark)
- Container: `tensorrt-llm/release:1.2.0rc6`
- Supports speculative decoding for faster inference:
- EAGLE-3: Built-in drafting head, no separate draft model needed
- Draft-Target: Pairs small (8B) and large (70B) models, uses FP4 quantization
- Configurable KV cache memory fraction for memory management
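The draft-target scheme above can be sketched in miniature with plain Python; the toy `draft`/`target` functions stand in for real models, and the accept-or-correct loop mirrors the idea, not TensorRT-LLM's actual implementation:

```python
def speculative_step(draft_next, target_next, context, k=4):
    """One draft-target speculative decoding step (toy sketch).

    draft_next/target_next map a token sequence to the next token.
    The draft proposes k tokens; the target accepts the longest prefix
    it agrees with, then appends its own next token as the correction.
    """
    # Draft phase: cheap model proposes k tokens autoregressively.
    proposal, ctx = [], list(context)
    for _ in range(k):
        tok = draft_next(ctx)
        proposal.append(tok)
        ctx.append(tok)
    # Verify phase: target checks proposals in order, stops at first mismatch.
    accepted, ctx = [], list(context)
    for tok in proposal:
        if target_next(ctx) != tok:
            break
        accepted.append(tok)
        ctx.append(tok)
    accepted.append(target_next(ctx))  # target's correction / continuation
    return accepted

# Toy "models": draft always says n+1; target agrees until 3, then says 99.
draft = lambda ctx: ctx[-1] + 1
target = lambda ctx: ctx[-1] + 1 if ctx[-1] < 3 else 99
print(speculative_step(draft, target, [1]))  # [2, 3, 99]
```

When the draft agrees with the target most of the time, several tokens are committed per target pass, which is where the speed-up comes from.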
### Ollama
- LLM runtime with model library — runs via Docker on GB10 (T1, build.nvidia.com/spark)
- Container: `ghcr.io/open-webui/open-webui:ollama` (bundles Open WebUI + Ollama)
- Models available from ollama.com/library (e.g., `gpt-oss:20b`)
- Port: 12000 (via NVIDIA Sync) or 8080 (direct)
## 3. Development Environment

- DGX Dashboard — web-based system monitor at `http://localhost:11000` with integrated JupyterLab (T0 Spec); JupyterLab ports are configured in `/opt/nvidia/dgx-dashboard-service/jupyterlab_ports.yaml`
- VS Code — ARM64 .deb available; also remote SSH via NVIDIA Sync or manual SSH (T1, build.nvidia.com/spark)
- Cursor — supported via NVIDIA Sync remote SSH launch (T1, build.nvidia.com/spark)
- NVIDIA AI Workbench — launchable via NVIDIA Sync (T1, build.nvidia.com/spark)
- Python — system Python with AI/ML package ecosystem
- NVIDIA NGC Catalog — library of pre-trained models, containers, and SDKs
- Docker + NVIDIA Container Runtime — pre-installed for containerized workflows (T0 Spec)
- NVIDIA AI Enterprise — enterprise-grade AI software and services
- Tutorials & Playbooks: https://build.nvidia.com/spark
### Key NGC Containers (confirmed ARM64)
| Container | Tag | Use Case |
|---|---|---|
| `nvcr.io/nvidia/pytorch` | `25.11-py3` | PyTorch training & fine-tuning |
| `tensorrt-llm/release` | `1.2.0rc6` | Optimized LLM inference |
| RAPIDS | `25.10` | GPU-accelerated data science |
| `ghcr.io/open-webui/open-webui` | `ollama` | Open WebUI + Ollama LLM chat |
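Running one of these containers typically means a `docker run` with the NVIDIA Container Runtime's `--gpus all` flag; this hypothetical helper just assembles the argument list (the flags beyond `--gpus all` are illustrative defaults, not requirements):

```python
def docker_run_args(image: str, tag: str, extra=()):
    """Compose a docker run command for an NGC container on GB10.
    --gpus all exposes the Blackwell GPU via the NVIDIA Container
    Runtime; --rm/-it are common interactive-session defaults."""
    return ["docker", "run", "--rm", "-it", "--gpus", "all",
            *extra, f"{image}:{tag}"]

cmd = docker_run_args("nvcr.io/nvidia/pytorch", "25.11-py3")
print(" ".join(cmd))
# docker run --rm -it --gpus all nvcr.io/nvidia/pytorch:25.11-py3
```

Building the list programmatically (rather than one shell string) avoids quoting bugs when it is later handed to `subprocess.run`.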
## 4. Image Generation

### ComfyUI
- Node-based image generation UI for Stable Diffusion, SDXL, Flux, etc. (T1, build.nvidia.com/spark)
- Runs natively on GB10 Blackwell GPU
- Requires: Python 3.8+, CUDA toolkit, PyTorch with `cu130`
- Port: 8188 (`--listen 0.0.0.0` for remote access)
- Storage: ~20 GB minimum (plus model files, e.g., SD 1.5 ~2 GB)
## 5. UMA Memory Management Tip
DGX Spark uses Unified Memory Architecture (UMA) — CPU and GPU share the same LPDDR5X pool. If GPU memory appears low due to filesystem buffer cache:
```shell
sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches'
```
This frees cached memory back to the unified pool without data loss. (T1, build.nvidia.com/spark)
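A rough way to see how much page cache is currently occupying the unified pool is to read `Buffers` and `Cached` from `/proc/meminfo`; this sketch parses that text (it ignores other reclaimable fields such as `SReclaimable`, so treat the figure as an estimate):

```python
def reclaimable_kib(meminfo_text: str) -> int:
    """Estimate page-cache memory (in KiB) that drop_caches could
    return to the unified CPU/GPU pool, from /proc/meminfo contents."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if rest:
            fields[key] = int(rest.split()[0])  # values are reported in kB
    return fields.get("Buffers", 0) + fields.get("Cached", 0)

sample = "MemTotal: 125000000 kB\nBuffers: 512000 kB\nCached: 8192000 kB\n"
print(reclaimable_kib(sample))  # 8704000
```

On the device itself you would pass `open("/proc/meminfo").read()` instead of the sample string.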
## 6. Software Compatibility Notes
Since the GB10 is an ARM system:
- All Python packages must have ARM64 wheels or be compilable from source
- Most popular ML libraries (PyTorch, NumPy, etc.) have ARM64 support
- Some niche packages may require building from source
- x86-only binary packages will not run natively
- FEX emulator can translate x86 binaries to ARM at a performance cost (used for Steam/Proton gaming — see ai-workloads)
- Container images must be ARM64/aarch64 builds
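Whether a prebuilt Python wheel can run on the GB10 comes down to its platform tag, the last dash-separated field of the filename; here is a small sketch (the filenames are illustrative, and compound multi-platform tags are not handled):

```python
def runs_on_arm64(wheel_filename: str) -> bool:
    """Check whether a wheel's platform tag is usable on an ARM64
    (aarch64) Linux system like the GB10. Pure-Python wheels are
    tagged 'any'; compiled wheels need an aarch64 platform tag."""
    platform_tag = wheel_filename.removesuffix(".whl").split("-")[-1]
    return platform_tag == "any" or "aarch64" in platform_tag

print(runs_on_arm64("torch-2.9.0-cp312-cp312-manylinux_2_28_aarch64.whl"))  # True
print(runs_on_arm64("somepkg-1.0-cp312-cp312-win_amd64.whl"))               # False
print(runs_on_arm64("pure-1.0-py3-none-any.whl"))                           # True
```

The `win_amd64`/`manylinux_*_x86_64` cases are exactly the "x86-only binary packages" noted above; only FEX translation or a source build can make their native code run.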
## Key Relationships
- Runs on: dgx-os-software
- Accelerated by: gb10-superchip
- Powers: ai-workloads