End-to-End Performance

This benchmark compares the full basin stability estimation pipeline across MATLAB and Python implementations.

Methodology

All implementations use the same:

  • ODE system: Damped driven pendulum
  • Parameters: α=0.1, T=0.5, K=1.0
  • Integration: t_span=(0, 1000), rtol=1e-8, atol=1e-6
  • Sample sizes: 100 to 100,000 initial conditions
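For concreteness, a minimal sketch of one trajectory of the damped driven pendulum with the parameters above. The right-hand side θ̈ = −α·θ̇ + T − K·sin θ is an assumed formulation (the bSTAB-style convention); the fixed-step RK4 integrator here is illustrative only and is not the adaptive rtol/atol scheme used in the benchmark:

```python
import math

# Benchmark parameters from the methodology section; the RHS form itself is assumed.
ALPHA, T_DRIVE, K = 0.1, 0.5, 1.0

def pendulum_rhs(state):
    """Assumed damped driven pendulum: theta'' = -alpha*theta' + T - K*sin(theta)."""
    theta, omega = state
    return (omega, -ALPHA * omega + T_DRIVE - K * math.sin(theta))

def rk4_step(state, dt):
    """One classical fixed-step Runge-Kutta (RK4) step, for illustration only."""
    k1 = pendulum_rhs(state)
    k2 = pendulum_rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = pendulum_rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = pendulum_rhs(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(
        s + dt / 6.0 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

state = (0.5, 0.0)  # one example initial condition (theta, omega)
for _ in range(1000):
    state = rk4_step(state, 0.01)
```

The benchmark repeats this kind of integration for every sampled initial condition, then classifies which attractor each trajectory settles into.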

Implementations Compared

| Implementation  | Platform | Parallelization     |
|-----------------|----------|---------------------|
| MATLAB bSTAB-M  | CPU      | MATLAB `parfor`     |
| pybasin + JAX   | CPU      | Vectorized (`vmap`) |
| pybasin + JAX   | CUDA GPU | Vectorized (`vmap`) |
| Attractors.jl   | CPU      | Threaded            |

Results

Performance Comparison

| N       | MATLAB  | Python CPU        | Python CUDA        | Attractors.jl     | pynamicalsys   |
|---------|---------|-------------------|--------------------|-------------------|----------------|
| 100     | 0.76s   | 1.30s (0.6×)      | 12.86s (0.1×)      | **0.12s (6.2×)**  | 0.16s (4.7×)   |
| 200     | 1.02s   | 1.50s (0.7×)      | 12.87s (0.1×)      | **0.16s (6.4×)**  | 0.28s (3.6×)   |
| 500     | 1.90s   | 1.62s (1.2×)      | 12.98s (0.1×)      | **0.41s (4.6×)**  | 0.62s (3.1×)   |
| 1,000   | 3.27s   | 2.00s (1.6×)      | 12.05s (0.3×)      | **0.85s (3.9×)**  | 1.15s (2.8×)   |
| 2,000   | 6.29s   | 2.72s (2.3×)      | 12.32s (0.5×)      | **1.68s (3.8×)**  | 2.51s (2.5×)   |
| 5,000   | 15.90s  | 5.73s (2.8×)      | 12.82s (1.2×)      | **4.22s (3.8×)**  | 5.55s (2.9×)   |
| 10,000  | 31.01s  | 10.52s (2.9×)     | 12.64s (2.5×)      | **9.12s (3.4×)**  | 11.11s (2.8×)  |
| 20,000  | 62.73s  | **20.94s (3.0×)** | **12.27s (5.1×)**  | 21.08s (3.0×)     | 23.03s (2.7×)  |
| 50,000  | 153.04s | **30.07s (5.1×)** | **12.40s (12.3×)** | 86.38s (1.8×)     | 56.02s (2.7×)  |
| 100,000 | 309.07s | **62.94s (4.9×)** | **12.57s (24.6×)** | —                 | 115.96s (2.7×) |

Bold marks the fastest per row. When the GPU wins, the best CPU-only option is also bolded — use it as the recommended alternative when no GPU is available.

Attractors.jl memory limit

Attractors.jl is not benchmarked at N=100,000. The GroupViaClustering (DBSCAN) step allocates a full N×N pairwise distance matrix, requiring ~80 GB of RAM at that scale. The practical ceiling on this machine is N=50,000 (~20 GB).
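The quoted figures follow directly from storing N² pairwise distances as float64 (8 bytes each). A quick back-of-the-envelope check (the helper name is illustrative):

```python
def pairwise_matrix_gb(n, bytes_per_element=8):
    """RAM needed for a dense n x n float64 pairwise distance matrix, in GB."""
    return n * n * bytes_per_element / 1e9

print(pairwise_matrix_gb(100_000))  # 80.0 GB
print(pairwise_matrix_gb(50_000))   # 20.0 GB
```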

Scaling Analysis

| Implementation | Scaling   | Exponent α   | R²    |
|----------------|-----------|--------------|-------|
| Attractors.jl  | O(N)      | 1.05 ± 0.08  | 0.989 |
| pynamicalsys   | O(N)      | 0.96 ± 0.02  | 0.999 |
| Python CPU     | O(N^0.59) | 0.59 ± 0.10  | 0.942 |
| MATLAB         | O(N)      | 0.90 ± 0.06  | 0.992 |
| Python CUDA    | O(1)      | -0.00 ± 0.01 | 0.168 |
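Exponents like these are typically obtained by a least-squares fit of log t against log N, where the slope is the scaling exponent α. A sketch of the procedure on synthetic timings (not the measured data above):

```python
import math

# Synthetic timings following t = c * N^alpha with alpha = 1.0
ns = [100, 1_000, 10_000, 100_000]
ts = [0.01 * n for n in ns]

# Slope of the least-squares line through (log N, log t) is the exponent alpha
xs = [math.log(n) for n in ns]
ys = [math.log(t) for t in ts]
mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
alpha = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
print(round(alpha, 3))  # 1.0
```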

Comparison Plot

*(Figure: benchmark comparison)*

Scaling Plot (Log-Log)

*(Figure: scaling analysis)*

Key Findings

  1. Python CPU is roughly 3-5× faster than MATLAB for N > 5,000
  2. Python CUDA runs in near-constant time (~12s) regardless of N, since the GPU processes all initial conditions in parallel
  3. At N=100,000, the GPU is ~25× faster than MATLAB, provided the data fits in GPU memory
  4. Attractors.jl scales linearly (O(N)) and matches Python CPU throughput up to N=20,000, but falls behind at N=50,000
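The near-constant GPU time in finding 2 comes from batching: each integration step is one array operation over all N initial conditions at once, so wall time barely grows with N until the device saturates. The idea can be sketched in NumPy (a portable stand-in for the `jax.vmap`-vectorized version; the function names here are illustrative, not pybasin's API):

```python
import numpy as np

ALPHA, T_DRIVE, K = 0.1, 0.5, 1.0  # benchmark parameters; RHS form is assumed

def rhs(states):
    """Vectorized pendulum RHS over a (N, 2) batch of (theta, omega) states."""
    theta, omega = states[:, 0], states[:, 1]
    return np.stack([omega, -ALPHA * omega + T_DRIVE - K * np.sin(theta)], axis=1)

def rk4_batch(states, dt, steps):
    """Fixed-step RK4 advancing all N trajectories simultaneously."""
    for _ in range(steps):
        k1 = rhs(states)
        k2 = rhs(states + 0.5 * dt * k1)
        k3 = rhs(states + 0.5 * dt * k2)
        k4 = rhs(states + dt * k3)
        states = states + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return states

rng = np.random.default_rng(0)
batch = rng.uniform(-np.pi, np.pi, size=(1000, 2))  # 1,000 initial conditions
final = rk4_batch(batch, dt=0.01, steps=500)
```

On a GPU, each `rhs` call maps to one kernel launch over the whole batch, which is why doubling N changes the runtime far less than it does for a loop over trajectories on the CPU.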

Hardware

All benchmarks were run on:

  • CPU: Intel Core Ultra 9 275HX
  • GPU: NVIDIA GeForce RTX 5070 Ti Laptop GPU (12 GB VRAM)