End-to-End Performance
This benchmark compares the full basin stability estimation pipeline across MATLAB, Python, and Julia implementations.
Methodology
All implementations use the same:
- ODE system: Damped driven pendulum
- Parameters: α=0.1, T=0.5, K=1.0
- Integration: t_span=(0, 1000), rtol=1e-8, atol=1e-6
- Sample sizes: 100 to 100,000 initial conditions
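As a rough sketch of the shared setup, the snippet below writes the pendulum right-hand side and the settings listed above in plain Python. The equation form (θ' = ω, ω' = −αω + T − K sin θ) is an assumption, since this page only names the system and its parameters; `pendulum_rhs`, `SOLVER`, and `SAMPLE_SIZES` are illustrative names, not part of any of the benchmarked libraries.

```python
import math

# Parameters and integration settings as listed in the methodology.
ALPHA, T, K = 0.1, 0.5, 1.0
SOLVER = {"t_span": (0.0, 1000.0), "rtol": 1e-8, "atol": 1e-6}
# Sample sizes taken from the rows of the results table.
SAMPLE_SIZES = [100, 200, 500, 1_000, 2_000, 5_000,
                10_000, 20_000, 50_000, 100_000]


def pendulum_rhs(t, state, alpha=ALPHA, torque=T, k=K):
    """Assumed form: theta' = omega, omega' = -alpha*omega + T - K*sin(theta)."""
    theta, omega = state
    return (omega, -alpha * omega + torque - k * math.sin(theta))


# Under this assumed form, T < K gives a fixed point at theta* = asin(T / K),
# omega* = 0, where both derivatives vanish (up to floating-point error).
theta_star = math.asin(T / K)
print(pendulum_rhs(0.0, (theta_star, 0.0)))
```

The `(t, state)` signature matches what most ODE solvers expect, so the same function could be handed to an adaptive integrator together with the tolerances in `SOLVER`.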
Implementations Compared
| Implementation | Platform | Parallelization |
|---|---|---|
| MATLAB bSTAB-M | CPU | MATLAB parfor |
| pybasin + JAX | CPU | Vectorized (vmap) |
| pybasin + JAX | CUDA GPU | Vectorized (vmap) |
| Attractors.jl | CPU | Threaded |
Results
Performance Comparison
| N | MATLAB | Python CPU | Python CUDA | Attractors.jl | pynamicalsys |
|---|---|---|---|---|---|
| 100 | 0.76s | 1.30s (0.6×) | 12.86s (0.1×) | **0.12s (6.2×)** | 0.16s (4.7×) |
| 200 | 1.02s | 1.50s (0.7×) | 12.87s (0.1×) | **0.16s (6.4×)** | 0.28s (3.6×) |
| 500 | 1.90s | 1.62s (1.2×) | 12.98s (0.1×) | **0.41s (4.6×)** | 0.62s (3.1×) |
| 1,000 | 3.27s | 2.00s (1.6×) | 12.05s (0.3×) | **0.85s (3.9×)** | 1.15s (2.8×) |
| 2,000 | 6.29s | 2.72s (2.3×) | 12.32s (0.5×) | **1.68s (3.8×)** | 2.51s (2.5×) |
| 5,000 | 15.90s | 5.73s (2.8×) | 12.82s (1.2×) | **4.22s (3.8×)** | 5.55s (2.9×) |
| 10,000 | 31.01s | 10.52s (2.9×) | 12.64s (2.5×) | **9.12s (3.4×)** | 11.11s (2.8×) |
| 20,000 | 62.73s | **20.94s (3.0×)** | **12.27s (5.1×)** | 21.08s (3.0×) | 23.03s (2.7×) |
| 50,000 | 153.04s | **30.07s (5.1×)** | **12.40s (12.3×)** | 86.38s (1.8×) | 56.02s (2.7×) |
| 100,000 | 309.07s | **62.94s (4.9×)** | **12.57s (24.6×)** | n/a | 115.96s (2.7×) |
Times are in seconds; values in parentheses are speedups relative to MATLAB. Bold marks the fastest implementation in each row. When the GPU wins, the best CPU-only option is also bolded; use it as the recommended alternative when no GPU is available.
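The parenthesised factors in the table are consistent with dividing the MATLAB time by each implementation's time. A minimal sketch of that calculation for the N=100,000 row follows; `speedup_vs_baseline` is a hypothetical helper written for illustration.

```python
# Timings in seconds, copied from the N=100,000 row of the results table.
timings = {
    "MATLAB": 309.07,
    "Python CPU": 62.94,
    "Python CUDA": 12.57,
    "pynamicalsys": 115.96,
}


def speedup_vs_baseline(timings, baseline="MATLAB"):
    """Return baseline_time / implementation_time, rounded to one decimal."""
    base = timings[baseline]
    return {
        name: round(base / seconds, 1)
        for name, seconds in timings.items()
        if name != baseline
    }


speedups = speedup_vs_baseline(timings)
print(speedups)  # {'Python CPU': 4.9, 'Python CUDA': 24.6, 'pynamicalsys': 2.7}
```

A factor below 1 (for example 0.1× for Python CUDA at N=100) means the implementation is slower than MATLAB at that sample size.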
Attractors.jl memory limit
Attractors.jl is not benchmarked at N=100,000. The GroupViaClustering (DBSCAN) step
allocates a full N×N pairwise distance matrix, requiring ~80 GB of RAM at that scale.
The practical ceiling on this machine is N=50,000 (~20 GB).
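The memory figures above follow from simple arithmetic if the distance matrix is stored densely with 8-byte (float64) entries, which is an assumption here rather than something this page states. A short check, with `dense_distance_matrix_gb` as an illustrative helper:

```python
def dense_distance_matrix_gb(n, bytes_per_entry=8):
    """Size in GB (10^9 bytes) of a dense n x n matrix; float64 entries assumed."""
    return n * n * bytes_per_entry / 1e9


for n in (50_000, 100_000):
    print(f"N={n:,}: {dense_distance_matrix_gb(n):.0f} GB")
# N=50,000: 20 GB
# N=100,000: 80 GB
```

Because the cost grows with N², doubling the sample size quadruples the memory needed for this step.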
Scaling Analysis
| Implementation | Scaling | Exponent α | R² |
|---|---|---|---|
| Attractors.jl | O(N) | 1.05 ± 0.08 | 0.989 |
| pynamicalsys | O(N) | 0.96 ± 0.02 | 0.999 |
| Python CPU | O(N^0.59) | 0.59 ± 0.10 | 0.942 |
| MATLAB | O(N) | 0.90 ± 0.06 | 0.992 |
| Python CUDA | O(1) | -0.00 ± 0.01 | 0.168 |
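The exponents in this table are consistent with an ordinary least-squares fit of log(time) against log(N). The sketch below applies such a fit to the MATLAB column of the results table using only the standard library; `fit_power_law` is a hypothetical helper, and the fitting procedure itself is an assumption about how the table was produced.

```python
import math


def fit_power_law(ns, times):
    """Fit log10(t) = alpha * log10(N) + c by least squares.

    Returns (alpha, r_squared).
    """
    xs = [math.log10(n) for n in ns]
    ys = [math.log10(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    r_squared = sxy * sxy / (sxx * syy)
    return alpha, r_squared


# N and MATLAB timings (seconds) from the results table.
ns = [100, 200, 500, 1_000, 2_000, 5_000, 10_000, 20_000, 50_000, 100_000]
matlab = [0.76, 1.02, 1.90, 3.27, 6.29, 15.90, 31.01, 62.73, 153.04, 309.07]

alpha, r_squared = fit_power_law(ns, matlab)
print(f"alpha = {alpha:.2f}, R^2 = {r_squared:.3f}")
```

For the MATLAB column this should land close to the tabulated exponent of 0.90 and R² of 0.992. The same function does not reproduce the ± uncertainties, which would need the standard error of the slope.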
Comparison Plot

Scaling Plot (Log-Log)

Key Findings
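- Python CPU is roughly 3-5× faster than MATLAB for N ≥ 10,000 (2.9× at N=10,000, up to 5.1× at N=50,000)
- Python CUDA achieves near-constant time (about 12 to 13 s) regardless of N due to GPU parallelization, and becomes the fastest option from N=20,000 upward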
- At N=100,000: GPU is ~25× faster than MATLAB (as long as data fits in GPU memory)
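- Attractors.jl scales approximately linearly (fitted exponent 1.05) and is the fastest option up to N=10,000; it is on par with Python CPU at N=20,000 (21.08s vs 20.94s) but about 2.9× slower at N=50,000 (86.38s vs 30.07s)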
Hardware
Benchmarks run on:
- CPU: Intel Core Ultra 9 275HX
- GPU: NVIDIA GeForce RTX 5070 Ti Laptop GPU (12 GB VRAM)