# Running Benchmarks
Learn how to run MIND's performance benchmarks and verify the results yourself.
## Prerequisites

```bash
# Clone the MIND repository
git clone https://github.com/cputer/mind.git
cd mind

# Build MIND in release mode
cargo build --release
```
## Determinism Benchmark
Verify that MIND produces bit-identical compilation output.
```bash
python3 benchmarks/determinism/benchmark_determinism.py
```
### Expected Output

```text
SUMMARY: 4/4 tests DETERMINISTIC
✅ DETERMINISM VERIFIED: All outputs are bit-identical across runs
```
**What it tests:** four programs (scalar_math, small_matmul, medium_matmul, mlp), each compiled 10 times; every run's output is hashed with SHA256 and the hashes are compared. The test passes only if 100% of the hashes are identical.
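The core of that check is small enough to sketch. The following is illustrative only (the `-o` output flag and the example source path are assumptions; the real logic lives in `benchmark_determinism.py`):

```python
# Sketch of the determinism check: compile the same program repeatedly,
# hash each output, and require exactly one unique hash.
import hashlib
import subprocess
from pathlib import Path

def compile_and_hash(source: str, out: str = "out.bin") -> str:
    # Assumes a `mind compile <src> -o <out>` CLI; adjust to the actual flags.
    subprocess.run(["mind", "compile", source, "-o", out], check=True)
    return hashlib.sha256(Path(out).read_bytes()).hexdigest()

hashes = {compile_and_hash("examples/scalar_math.mind") for _ in range(10)}
# Zero tolerance: a single differing byte produces a second hash in the set.
print("DETERMINISTIC" if len(hashes) == 1 else "NON-DETERMINISTIC")
```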
## PyTorch Comparison Benchmark
Compare MIND compilation speed vs PyTorch 2.0.
```bash
# Install PyTorch if needed
pip install torch

# Run comparison
python3 benchmarks/pytorch_comparison/benchmark_pytorch_compile.py
```
### Expected Output

```text
Benchmark      MIND      PyTorch 2.0    MIND Speedup
--------------------------------------------------------
scalar_math    5.5 ms    2.4 ms         (see note below)
conv2d         5.4 ms    9.4 ms         2× faster
```
**Note:** MIND times include ~5 ms of subprocess overhead. See the next section for the real compilation time.
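You can get a feel for the PyTorch side independently by timing the first call to a `torch.compile`d function, which is when compilation actually happens (a standalone sketch, not the benchmark script itself):

```python
import time
import torch

def f(x):
    return torch.relu(x @ x + x)

compiled = torch.compile(f)   # compilation is deferred until the first call
x = torch.randn(64, 64)

t0 = time.perf_counter()
compiled(x)                   # first call: compile + execute
print(f"torch.compile first call: {(time.perf_counter() - t0) * 1e3:.1f} ms")
```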
## Real Compilation Time (Python Bindings)
Measure MIND's true compilation time without subprocess overhead.
```bash
# Build Python bindings
maturin build --release --features python-bindings,autodiff

# Install the wheel
pip install target/wheels/mind-*.whl

# Run test
python3 test_real_compile_time.py
```
### Expected Output

```text
Real MIND Compilation Time (NO subprocess overhead):
  Mean:   38.3 µs
  StdDev:  4.3 µs
  Min:    35.7 µs
  Max:    53.4 µs
```
This is the TRUE compilation time — no process spawning, no IPC overhead.
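The measurement itself reduces to a timing loop like the sketch below. It assumes the wheel built above exposes a `mind.compile()` function that takes source text; the exact binding API and the example path are assumptions here:

```python
import statistics
import time

import mind  # Python binding installed from target/wheels/

SOURCE = open("examples/scalar_math.mind").read()  # illustrative path

samples_us = []
for _ in range(100):
    t0 = time.perf_counter_ns()       # nanosecond-resolution clock
    mind.compile(SOURCE)              # direct call into the Rust compiler
    samples_us.append((time.perf_counter_ns() - t0) / 1_000)

print(f"Mean:   {statistics.mean(samples_us):.1f} µs")
print(f"StdDev: {statistics.stdev(samples_us):.1f} µs")
print(f"Min:    {min(samples_us):.1f} µs")
print(f"Max:    {max(samples_us):.1f} µs")
```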
## GPU Benchmarks (Enterprise)
The Enterprise runtime includes CUDA GPU benchmarks. Contact sales for access to:
- Memory allocation: CachingAllocator vs cudaMalloc (180x improvement)
- MatMul performance: cuBLAS with TF32/FP16 Tensor Cores (35-40% faster than PyTorch)
- Elementwise operations: float4 vectorized kernels (98% bandwidth utilization)
- Supported GPUs: NVIDIA SM_80+ (Ampere, Ada Lovelace, Hopper)
See Enterprise for licensing details.
## Understanding the Results
### Why Python Bindings?
The Python bindings (PyO3) allow calling the Rust compiler directly from Python, eliminating:
- Process spawning overhead (~2-3 ms)
- Inter-process communication (~1-2 ms)
- Total overhead: ~5 ms
This reveals MIND's true compilation performance: ~38 µs.
### Subprocess vs Direct Call
**`subprocess.run("mind compile")`**
- Spawn process: ~2-3 ms
- IPC overhead: ~1-2 ms
- Actual compile: ~38 µs
- TOTAL: ~5 ms
**`mind.compile()`** (Python binding)
- Direct function call: ~0 µs
- Actual compile: ~38 µs
- TOTAL: ~38 µs
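Measuring both paths side by side makes the overhead visible. Again a sketch, with the same assumed `mind.compile()` binding and an illustrative source path:

```python
import subprocess
import time

import mind  # Python binding built with maturin

SRC = "examples/scalar_math.mind"  # illustrative path

# Subprocess path: process spawn and IPC dominate the measurement.
t0 = time.perf_counter()
subprocess.run(["mind", "compile", SRC], check=True, capture_output=True)
print(f"subprocess: {(time.perf_counter() - t0) * 1e3:.2f} ms")

# Direct path: a plain function call into the already-loaded compiler.
source = open(SRC).read()
t0 = time.perf_counter()
mind.compile(source)
print(f"direct:     {(time.perf_counter() - t0) * 1e6:.1f} µs")
```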
## Benchmark Methodology
### Same-Machine Testing
All comparisons performed on identical hardware:
- Same CPU, RAM, OS
- Same Python version
- Sequential testing (no parallel interference)
- Controlled environment
### Statistical Rigor
- Warmup: 10 runs (eliminate cold-start)
- Sample size: 100 measurements
- Outlier detection: Tukey's method
- Confidence intervals: 95% CI
- Precision: Nanosecond resolution (perf_counter)
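Taken together, the harness looks roughly like this sketch (methodology only, not the actual benchmark code):

```python
import statistics
import time

def measure(fn, warmup: int = 10, samples: int = 100):
    """Time fn() with warmup, Tukey outlier rejection, and a 95% CI."""
    for _ in range(warmup):              # discard cold-start runs
        fn()
    xs = []
    for _ in range(samples):
        t0 = time.perf_counter_ns()      # nanosecond resolution
        fn()
        xs.append(time.perf_counter_ns() - t0)
    # Tukey's method: drop points outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR].
    q1, _, q3 = statistics.quantiles(xs, n=4)
    iqr = q3 - q1
    kept = [x for x in xs if q1 - 1.5 * iqr <= x <= q3 + 1.5 * iqr]
    mean = statistics.mean(kept)
    # 95% confidence interval for the mean (normal approximation).
    half_width = 1.96 * statistics.stdev(kept) / len(kept) ** 0.5
    return mean, half_width
```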
### Determinism Verification
- SHA256 hashing: Cryptographic-strength verification
- Byte-level comparison: Exact binary match
- Multiple runs: 10+ per test
- Zero tolerance: Any mismatch = failure
## Reproducing Published Results
The published benchmark results are from:
| Field | Value |
|---|---|
| Date | December 2025 |
| Platform | Linux 4.4.0 x86_64 |
| Python | 3.11.14 |
| PyTorch | 2.9.1+cpu |
To reproduce exactly:
```bash
cargo build --release
# Run benchmarks as shown above
```
Your results should fall within ±10% of the published numbers; the remaining variation comes from hardware differences.
## MIC/MAP Format Benchmark
Compare MIC format efficiency against JSON, TOML, and TOON.
```bash
cd benchmarks
python3 format_benchmark.py
```
### Token Efficiency Results
| Format | Tokens | vs JSON | Parse Speed | Annual Cost (1M IRs) |
|---|---|---|---|---|
| JSON | 278 | baseline | 5.31 µs | $487 |
| TOML | 151 | 1.8x | 137.06 µs | $264 |
| TOON | 67 | 4.1x | 2.67 µs | $117 |
| MIC | 52 | 5.3x | 2.26 µs | $91 |
MIC saves $396/year per million IR operations vs JSON at GPT-5.2 pricing ($0.00175/1K input tokens).
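The cost column is plain arithmetic and easy to reproduce from the figures above:

```python
# Reproduce the "Annual Cost (1M IRs)" column from the quoted pricing.
PRICE_PER_TOKEN = 0.00175 / 1000        # $0.00175 per 1K input tokens
IRS_PER_YEAR = 1_000_000

tokens_per_ir = {"JSON": 278, "TOML": 151, "TOON": 67, "MIC": 52}
for fmt, tokens in tokens_per_ir.items():
    annual = tokens * IRS_PER_YEAR * PRICE_PER_TOKEN
    print(f"{fmt:>4}: ${annual:,.0f}/year")
# JSON ≈ $487 and MIC ≈ $91, i.e. about $396/year saved per million IRs.
```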
### MAP vs JSON-RPC
| Protocol | Size | Tokens | vs JSON-RPC |
|---|---|---|---|
| JSON-RPC | 1,004 bytes | 251 | baseline |
| MAP | 234 bytes | 58 | 4.3x fewer tokens |
## Next Steps
- View Full Results — Complete benchmark data
- Performance Overview — Understand the performance characteristics
- Performance FAQ — Common questions answered