Running Benchmarks

Learn how to run MIND's performance benchmarks and verify the results yourself.

Prerequisites

# Clone the MIND repository
git clone https://github.com/cputer/mind.git
cd mind

# Build MIND in release mode
cargo build --release

Determinism Benchmark

Verify that MIND produces bit-identical compilation output.

python3 benchmarks/determinism/benchmark_determinism.py

Expected Output

SUMMARY: 4/4 tests DETERMINISTIC
✅ DETERMINISM VERIFIED: All outputs are bit-identical across runs

What it tests: four programs (scalar_math, small_matmul, medium_matmul, mlp), each compiled 10 times, with the SHA256 hash of every output compared. The test passes only if 100% of the hashes are identical.
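
A minimal sketch of this kind of check, assuming a hypothetical compile_to_bytes() callable that returns the compiled artifact as bytes (the real harness lives in benchmarks/determinism/):

import hashlib

def is_deterministic(compile_to_bytes, runs=10):
    # Compile the same program repeatedly and hash each output artifact.
    hashes = {hashlib.sha256(compile_to_bytes()).hexdigest() for _ in range(runs)}
    # Deterministic only if every run produced exactly the same bytes.
    return len(hashes) == 1

Because any single differing byte changes the SHA256 digest, one mismatch out of ten runs is enough to fail the test.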

PyTorch Comparison Benchmark

Compare MIND compilation speed vs PyTorch 2.0.

# Install PyTorch if needed
pip install torch

# Run comparison
python3 benchmarks/pytorch_comparison/benchmark_pytorch_compile.py

Expected Output

Benchmark         MIND      PyTorch 2.0    MIND Speedup
--------------------------------------------------------
scalar_math       5.5 ms    2.4 ms         (see note below)
conv2d            5.4 ms    9.4 ms         2× faster

Note: MIND times include ~5 ms of subprocess overhead. See the next section for the real compilation time.
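
For reference, one common way to time the PyTorch 2.0 side of such a comparison is to measure the first call to a torch.compile-wrapped function (which triggers compilation) against a warm call. This is only an illustrative sketch, not the benchmark script itself:

import time
import torch

def scalar_math(x):
    return x * 2.0 + 1.0

compiled = torch.compile(scalar_math)
x = torch.randn(16)

t0 = time.perf_counter()
compiled(x)              # first call triggers compilation
t1 = time.perf_counter()
compiled(x)              # warm call, compilation already cached
t2 = time.perf_counter()

print(f"compile + run: {(t1 - t0) * 1e3:.2f} ms")
print(f"warm run:      {(t2 - t1) * 1e3:.2f} ms")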

Real Compilation Time (Python Bindings)

Measure MIND's true compilation time without subprocess overhead.

# Build Python bindings
maturin build --release --features python-bindings,autodiff

# Install the wheel
pip install target/wheels/mind-*.whl

# Run test
python3 test_real_compile_time.py

Expected Output

Real MIND Compilation Time (NO subprocess overhead):
  Mean:   38.3 µs
  StdDev: 4.3 µs
  Min:    35.7 µs
  Max:    53.4 µs

This is the TRUE compilation time — no process spawning, no IPC overhead.
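
A sketch of what such a measurement loop looks like, assuming the binding exposes a mind.compile() entry point that takes the program source (the exact signature is an assumption; test_real_compile_time.py is the authoritative harness):

import statistics
import time
import mind

SOURCE = "..."  # any small MIND program

# Warmup runs so cold-start effects are not measured.
for _ in range(10):
    mind.compile(SOURCE)

samples_us = []
for _ in range(100):
    start = time.perf_counter_ns()
    mind.compile(SOURCE)
    samples_us.append((time.perf_counter_ns() - start) / 1_000)

print(f"Mean:   {statistics.mean(samples_us):.1f} µs")
print(f"StdDev: {statistics.stdev(samples_us):.1f} µs")
print(f"Min:    {min(samples_us):.1f} µs")
print(f"Max:    {max(samples_us):.1f} µs")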

GPU Benchmarks (Enterprise)

The Enterprise runtime includes CUDA GPU benchmarks. Contact sales for access to:

  • Memory allocation: CachingAllocator vs cudaMalloc (180x improvement)
  • MatMul performance: cuBLAS with TF32/FP16 Tensor Cores (35-40% faster than PyTorch)
  • Elementwise operations: float4 vectorized kernels (98% bandwidth utilization)
  • Supported GPUs: NVIDIA SM_80+ (Ampere, Ada Lovelace, Hopper)

See Enterprise for licensing details.

Understanding the Results

Why Python Bindings?

The Python bindings (PyO3) allow calling the Rust compiler directly from Python, eliminating:

  • Process spawning overhead (~2-3 ms)
  • Inter-process communication (~1-2 ms)
  • Total overhead: ~5 ms

This reveals MIND's true compilation performance: ~38 µs.

Subprocess vs Direct Call

subprocess.run(["mind", "compile"])

  • Spawn process: ~2-3 ms
  • IPC overhead: ~1-2 ms
  • Actual compile: ~38 µs
  • TOTAL: ~5 ms

mind.compile() (Python binding)

  • Direct function call: ~0 µs
  • Actual compile: ~38 µs
  • TOTAL: ~38 µs
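
A sketch of the two paths side by side, again assuming a mind CLI and a mind.compile() binding as above; the file name and arguments are purely illustrative:

import subprocess
import time
import mind

# "model.mind" is a hypothetical input file; substitute any small MIND program.
start = time.perf_counter()
subprocess.run(["mind", "compile", "model.mind"], check=True)  # spawn + IPC + compile
subprocess_ms = (time.perf_counter() - start) * 1e3

source = open("model.mind").read()
start = time.perf_counter()
mind.compile(source)                                           # direct in-process call
binding_us = (time.perf_counter() - start) * 1e6

print(f"subprocess path: {subprocess_ms:.2f} ms")
print(f"binding path:    {binding_us:.1f} µs")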

Benchmark Methodology

Same-Machine Testing

All comparisons performed on identical hardware:

  • Same CPU, RAM, OS
  • Same Python version
  • Sequential testing (no parallel interference)
  • Controlled environment

Statistical Rigor

  • Warmup: 10 runs (eliminate cold-start)
  • Sample size: 100 measurements
  • Outlier detection: Tukey's method
  • Confidence intervals: 95% CI
  • Precision: Nanosecond resolution (perf_counter)
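
A rough sketch of how the outlier filtering and confidence interval can be computed from a list of raw timings (illustrative only; the benchmark scripts carry the authoritative implementation):

import statistics

def summarize(samples):
    # Tukey's method: drop points outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR].
    q1, _, q3 = statistics.quantiles(samples, n=4)
    iqr = q3 - q1
    kept = [s for s in samples if q1 - 1.5 * iqr <= s <= q3 + 1.5 * iqr]

    mean = statistics.mean(kept)
    sem = statistics.stdev(kept) / len(kept) ** 0.5
    # 95% confidence interval using the normal approximation (z = 1.96).
    return mean, (mean - 1.96 * sem, mean + 1.96 * sem)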

Determinism Verification

  • SHA256 hashing: Cryptographic-strength verification
  • Byte-level comparison: Exact binary match
  • Multiple runs: 10+ per test
  • Zero tolerance: Any mismatch = failure

Reproducing Published Results

The published benchmark results are from:

Date        December 2025
Platform    Linux 4.4.0 x86_64
Python      3.11.14
PyTorch     2.9.1+cpu

To reproduce exactly:

cargo build --release
# Run benchmarks as shown above

Results should match the published numbers to within ±10%; the remaining variation comes from hardware differences.

MIC/MAP Format Benchmark

Compare MIC format efficiency against JSON, TOML, and TOON.

cd benchmarks
python3 format_benchmark.py

Token Efficiency Results

Format    Tokens    vs JSON     Parse Speed    Annual Cost (1M IRs)
-------------------------------------------------------------------
JSON      278       baseline    5.31 µs        $487
TOML      151       1.8x        137.06 µs      $264
TOON      67        4.1x        2.67 µs        $117
MIC       52        5.3x        2.26 µs        $91

MIC saves $396/year per million IR operations vs JSON at GPT-5.2 pricing ($0.00175/1K input tokens).
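
The annual-cost column follows directly from the token counts. A quick check of the arithmetic, using the same assumed volume of one million IR payloads per year:

PRICE_PER_1K_TOKENS = 0.00175   # GPT-5.2 input pricing from the table above
IRS_PER_YEAR = 1_000_000

def annual_cost(tokens_per_ir):
    return tokens_per_ir * IRS_PER_YEAR / 1_000 * PRICE_PER_1K_TOKENS

print(annual_cost(278))                    # JSON: ~$486.50
print(annual_cost(52))                     # MIC:  ~$91.00
print(annual_cost(278) - annual_cost(52))  # savings: ~$395.50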

MAP vs JSON-RPC

Protocol    Size           Tokens    vs JSON-RPC
-------------------------------------------------
JSON-RPC    1,004 bytes    251       baseline
MAP         234 bytes      58        4.3x fewer tokens

Next Steps