# Running Benchmarks
Learn how to run MIND's performance benchmarks and verify the results yourself.
## Prerequisites

```bash
# Clone the MIND repository
git clone https://github.com/cputer/mind.git
cd mind

# Build MIND in release mode
cargo build --release
```
## Determinism Benchmark
Verify that MIND produces bit-identical compilation output.
```bash
python3 benchmarks/determinism/benchmark_determinism.py
```
### Expected Output

```text
SUMMARY: 4/4 tests DETERMINISTIC
✅ DETERMINISM VERIFIED: All outputs are bit-identical across runs
```
**What it tests:** four programs (scalar_math, small_matmul, medium_matmul, mlp), each compiled 10 times; every run's output is hashed with SHA256 and the hashes are compared. The test passes only if 100% of the hashes are identical.
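The core of that check is small enough to sketch. The following is illustrative only (the `-o` output flag and the example source path are assumptions; the real logic lives in `benchmark_determinism.py`):

```python
# Sketch of the determinism check: compile the same program repeatedly,
# hash each output, and require exactly one unique hash.
import hashlib
import subprocess
from pathlib import Path

def compile_and_hash(source: str, out: str = "out.bin") -> str:
    # Assumes a `mind compile <src> -o <out>` CLI; adjust to the actual flags.
    subprocess.run(["mind", "compile", source, "-o", out], check=True)
    return hashlib.sha256(Path(out).read_bytes()).hexdigest()

hashes = {compile_and_hash("examples/scalar_math.mind") for _ in range(10)}
# Zero tolerance: a single differing byte produces a second hash in the set.
print("DETERMINISTIC" if len(hashes) == 1 else "NON-DETERMINISTIC")
```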
## PyTorch Comparison Benchmark
Compare MIND compilation speed vs PyTorch 2.0.
```bash
# Install PyTorch if needed
pip install torch

# Run comparison
python3 benchmarks/pytorch_comparison/benchmark_pytorch_compile.py
```
### Expected Output

```text
Benchmark      MIND      PyTorch 2.0    MIND Speedup
--------------------------------------------------------
scalar_math    5.5 ms    2.4 ms         (see note below)
conv2d         5.4 ms    9.4 ms         2× faster
```
**Note:** MIND times include ~5 ms of subprocess overhead. See the next section for the real compilation time.
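You can get a feel for the PyTorch side independently by timing the first call to a `torch.compile`d function, which is when compilation actually happens (a standalone sketch, not the benchmark script itself):

```python
import time
import torch

def f(x):
    return torch.relu(x @ x + x)

compiled = torch.compile(f)   # compilation is deferred until the first call
x = torch.randn(64, 64)

t0 = time.perf_counter()
compiled(x)                   # first call: compile + execute
print(f"torch.compile first call: {(time.perf_counter() - t0) * 1e3:.1f} ms")
```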
## Real Compilation Time (Python Bindings)
Measure MIND's true compilation time without subprocess overhead.
```bash
# Build Python bindings
maturin build --release --features python-bindings,autodiff

# Install the wheel
pip install target/wheels/mind-*.whl

# Run test
python3 test_real_compile_time.py
```
### Expected Output

```text
Real MIND Compilation Time (NO subprocess overhead):
  Mean:   38.3 µs
  StdDev:  4.3 µs
  Min:    35.7 µs
  Max:    53.4 µs
```
This is the TRUE compilation time — no process spawning, no IPC overhead.
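The measurement itself reduces to a timing loop like the sketch below. It assumes the wheel built above exposes a `mind.compile()` function that takes source text; the exact binding API and the example path are assumptions here:

```python
import statistics
import time

import mind  # Python binding installed from target/wheels/

SOURCE = open("examples/scalar_math.mind").read()  # illustrative path

samples_us = []
for _ in range(100):
    t0 = time.perf_counter_ns()       # nanosecond-resolution clock
    mind.compile(SOURCE)              # direct call into the Rust compiler
    samples_us.append((time.perf_counter_ns() - t0) / 1_000)

print(f"Mean:   {statistics.mean(samples_us):.1f} µs")
print(f"StdDev: {statistics.stdev(samples_us):.1f} µs")
print(f"Min:    {min(samples_us):.1f} µs")
print(f"Max:    {max(samples_us):.1f} µs")
```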
## GPU Benchmarks (Enterprise)
The Enterprise runtime includes CUDA GPU benchmarks. Contact sales for access to:
- Memory allocation: CachingAllocator vs cudaMalloc (180x improvement)
- MatMul performance: cuBLAS with TF32/FP16 Tensor Cores (35-40% faster than PyTorch)
- Elementwise operations: float4 vectorized kernels (98% bandwidth utilization)
- Supported GPUs: NVIDIA SM_80+ (Ampere, Ada Lovelace, Hopper)
See Enterprise for licensing details.
## Understanding the Results
### Why Python Bindings?
The Python bindings (PyO3) allow calling the Rust compiler directly from Python, eliminating:
- Process spawning overhead (~2-3 ms)
- Inter-process communication (~1-2 ms)
- Total overhead: ~5 ms
This reveals MIND's true compilation performance: ~38 µs.
### Subprocess vs Direct Call
**`subprocess.run("mind compile")`**
- Spawn process: ~2-3 ms
- IPC overhead: ~1-2 ms
- Actual compile: ~38 µs
- TOTAL: ~5 ms
**`mind.compile()`** (Python binding)
- Direct function call: ~0 µs
- Actual compile: ~38 µs
- TOTAL: ~38 µs
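Measuring both paths side by side makes the overhead visible. Again a sketch, with the same assumed `mind.compile()` binding and an illustrative source path:

```python
import subprocess
import time

import mind  # Python binding built with maturin

SRC = "examples/scalar_math.mind"  # illustrative path

# Subprocess path: process spawn and IPC dominate the measurement.
t0 = time.perf_counter()
subprocess.run(["mind", "compile", SRC], check=True, capture_output=True)
print(f"subprocess: {(time.perf_counter() - t0) * 1e3:.2f} ms")

# Direct path: a plain function call into the already-loaded compiler.
source = open(SRC).read()
t0 = time.perf_counter()
mind.compile(source)
print(f"direct:     {(time.perf_counter() - t0) * 1e6:.1f} µs")
```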
## Benchmark Methodology
### Same-Machine Testing
All comparisons performed on identical hardware:
- Same CPU, RAM, OS
- Same Python version
- Sequential testing (no parallel interference)
- Controlled environment
### Statistical Rigor
- Warmup: 10 runs (eliminate cold-start)
- Sample size: 100 measurements
- Outlier detection: Tukey's method
- Confidence intervals: 95% CI
- Precision: Nanosecond resolution (perf_counter)
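Taken together, the harness looks roughly like this sketch (methodology only, not the actual benchmark code):

```python
import statistics
import time

def measure(fn, warmup: int = 10, samples: int = 100):
    """Time fn() with warmup, Tukey outlier rejection, and a 95% CI."""
    for _ in range(warmup):              # discard cold-start runs
        fn()
    xs = []
    for _ in range(samples):
        t0 = time.perf_counter_ns()      # nanosecond resolution
        fn()
        xs.append(time.perf_counter_ns() - t0)
    # Tukey's method: drop points outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR].
    q1, _, q3 = statistics.quantiles(xs, n=4)
    iqr = q3 - q1
    kept = [x for x in xs if q1 - 1.5 * iqr <= x <= q3 + 1.5 * iqr]
    mean = statistics.mean(kept)
    # 95% confidence interval for the mean (normal approximation).
    half_width = 1.96 * statistics.stdev(kept) / len(kept) ** 0.5
    return mean, half_width
```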
### Determinism Verification
- SHA256 hashing: Cryptographic-strength verification
- Byte-level comparison: Exact binary match
- Multiple runs: 10+ per test
- Zero tolerance: Any mismatch = failure
## Reproducing Published Results
The published benchmark results are from:
| Field | Value |
|---|---|
| Date | December 2025 |
| Platform | Linux 4.4.0 x86_64 |
| Python | 3.11.14 |
| PyTorch | 2.9.1+cpu |
To reproduce exactly:
```bash
cargo build --release
# Run benchmarks as shown above
```
Your results should fall within ±10% of the published numbers; the remaining variation comes from hardware differences.
## MIC/MAP Format Benchmark
Compare MIC format efficiency against JSON, TOML, and TOON.
```bash
cd benchmarks
python3 format_benchmark.py
```
### Token Efficiency Results
| Format | Tokens | vs JSON | Parse Speed | Annual Cost (1M IRs) |
|---|---|---|---|---|
| JSON | 278 | baseline | 5.31 µs | $487 |
| TOML | 151 | 1.8x | 137.06 µs | $264 |
| TOON | 67 | 4.1x | 2.67 µs | $117 |
| MIC | 52 | 5.3x | 2.26 µs | $91 |
MIC saves $396/year per million IR operations vs JSON at GPT-5.2 pricing ($0.00175/1K input tokens).
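The cost column is plain arithmetic and easy to reproduce from the figures above:

```python
# Reproduce the "Annual Cost (1M IRs)" column from the quoted pricing.
PRICE_PER_TOKEN = 0.00175 / 1000        # $0.00175 per 1K input tokens
IRS_PER_YEAR = 1_000_000

tokens_per_ir = {"JSON": 278, "TOML": 151, "TOON": 67, "MIC": 52}
for fmt, tokens in tokens_per_ir.items():
    annual = tokens * IRS_PER_YEAR * PRICE_PER_TOKEN
    print(f"{fmt:>4}: ${annual:,.0f}/year")
# JSON ≈ $487 and MIC ≈ $91, i.e. about $396/year saved per million IRs.
```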
### MAP vs JSON-RPC
| Protocol | Size | Tokens | vs JSON-RPC |
|---|---|---|---|
| JSON-RPC | 1,004 bytes | 251 | baseline |
| MAP | 234 bytes | 58 | 4.3x fewer tokens |
## Next Steps
- View Full Results — Complete benchmark data
- Performance Overview — Understand the performance characteristics
- Performance FAQ — Common questions answered