Run | Configuration |
---|---|
Run 1x1 | Number processes: 1; Number nodes: 1; Number processes per node: 1; Run Command: `<executable>`; MPI Command: `mpirun -np <number_processes> -ppn <number_processes_per_node>`; Dataset: (empty); Run Directory: /home/eoseret/tst_HACCmk; I_MPI_PIN_DOMAIN: auto:scatter; OMP_PLACES: threads; OMP_PROC_BIND: spread; OMP_NUM_THREADS: 1 |
Run 1x2 | OMP_NUM_THREADS: 2; I_MPI_PIN_DOMAIN: auto:scatter; OMP_PLACES: threads; OMP_PROC_BIND: spread |
Run 1x4 | OMP_NUM_THREADS: 4; I_MPI_PIN_DOMAIN: auto:scatter; OMP_PLACES: threads; OMP_PROC_BIND: spread |
Run 1x8 | OMP_NUM_THREADS: 8; I_MPI_PIN_DOMAIN: auto:scatter; OMP_PLACES: threads; OMP_PROC_BIND: spread |
Run 1x16 | OMP_NUM_THREADS: 16; I_MPI_PIN_DOMAIN: auto:scatter; OMP_PLACES: threads; OMP_PROC_BIND: spread |
Run 1x32 | OMP_NUM_THREADS: 32; I_MPI_PIN_DOMAIN: auto:scatter; OMP_PLACES: threads; OMP_PROC_BIND: spread |
Run 1x64 | OMP_NUM_THREADS: 64; I_MPI_PIN_DOMAIN: auto:scatter; OMP_PLACES: threads; OMP_PROC_BIND: spread |
Run 1x128 | OMP_NUM_THREADS: 128; I_MPI_PIN_DOMAIN: auto:scatter; OMP_PLACES: threads; OMP_PROC_BIND: spread |
Run 1x192 | OMP_NUM_THREADS: 192; I_MPI_PIN_DOMAIN: auto:scatter; OMP_PLACES: threads; OMP_PROC_BIND: spread |

Loop id | Source Location | Source Function | Level | Coverage 1x1 (%) | Coverage 1x2 (%) | Coverage 1x4 (%) | Coverage 1x8 (%) | Coverage 1x16 (%) | Coverage 1x32 (%) | Coverage 1x64 (%) | Coverage 1x128 (%) | Coverage 1x192 (%) | Max Time Over Threads 1x1 (s) | Max Time Over Threads 1x2 (s) | Max Time Over Threads 1x4 (s) | Max Time Over Threads 1x8 (s) | Max Time Over Threads 1x16 (s) | Max Time Over Threads 1x32 (s) | Max Time Over Threads 1x64 (s) | Max Time Over Threads 1x128 (s) | Max Time Over Threads 1x192 (s) | Time w.r.t. Wall Time 1x1 (s) | Time w.r.t. Wall Time 1x2 (s) | Time w.r.t. Wall Time 1x4 (s) | Time w.r.t. Wall Time 1x8 (s) | Time w.r.t. Wall Time 1x16 (s) | Time w.r.t. Wall Time 1x32 (s) | Time w.r.t. Wall Time 1x64 (s) | Time w.r.t. Wall Time 1x128 (s) | Time w.r.t. Wall Time 1x192 (s) | Nb Threads 1x1 | Nb Threads 1x2 | Nb Threads 1x4 | Nb Threads 1x8 | Nb Threads 1x16 | Nb Threads 1x32 | Nb Threads 1x64 | Nb Threads 1x128 | Nb Threads 1x192 | GFLOPS 1x1 | GFLOPS 1x2 | GFLOPS 1x4 | GFLOPS 1x8 | GFLOPS 1x16 | GFLOPS 1x32 | GFLOPS 1x64 | GFLOPS 1x128 | GFLOPS 1x192 | Vectorization Ratio (%) | Vector Length Use (%) | Speedup If No Scalar Integer | Speedup If FP Vectorized | Speedup If Fully Vectorized | Speedup If Perfect Load Balancing 1x1 | Speedup If Perfect Load Balancing 1x2 | Speedup If Perfect Load Balancing 1x4 | Speedup If Perfect Load Balancing 1x8 | Speedup If Perfect Load Balancing 1x16 | Speedup If Perfect Load Balancing 1x32 | Speedup If Perfect Load Balancing 1x64 | Speedup If Perfect Load Balancing 1x128 | Speedup If Perfect Load Balancing 1x192 | Stride 0 | Stride 1 | Stride n | Stride Unknown | Stride Indirect | (1x1) Efficiency | (1x1) Potential Speed-Up (%) | (1x2) Efficiency | (1x2) Potential Speed-Up (%) | (1x4) Efficiency | (1x4) Potential Speed-Up (%) | (1x8) Efficiency | (1x8) Potential Speed-Up (%) | (1x16) Efficiency | (1x16) Potential Speed-Up (%) | (1x32) Efficiency | (1x32) Potential Speed-Up (%) | (1x64) Efficiency | (1x64) Potential Speed-Up (%) | (1x128) Efficiency | (1x128) Potential Speed-Up (%) | (1x192) Efficiency | (1x192) Potential Speed-Up (%) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
3 | exec - Step10_orig.c:19-31 | Step10_orig | Single | 99.87 | 98.63 | 97.2 | 94.61 | 89.83 | 79.44 | 67.81 | 51.57 | 43.11 | 661.1 | 331.1 | 165.68 | 83.25 | 42.28 | 22.12 | 11.96 | 6.67 | 5.16 | 661.1 | 331.04 | 165.5 | 82.94 | 41.92 | 21.33 | 11.29 | 6.04 | 4.51 | 1 | 2 | 4 | 8 | 16 | 32 | 64 | 128 | 192 | 3.54 | 7.08 | 14.16 | 28.25 | 55.89 | 109.85 | 207.54 | 387.83 | 519.28 | 100 | 92.05 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1.01 | 1.04 | 1.06 | 1.11 | 1.14 | 0 | 4 | 0 | 0 | 0 | 1 | 0 | 1 | 0.15 | 1 | 0.13 | 1 | 0.35 | 0.99 | 1.29 | 0.97 | 2.5 | 0.91 | 5.77 | 0.86 | 7.47 | 0.76 | 10.2 |
0 | exec - main.c:111-116 | main | Innermost | 0.04 | 0.61 | 0.61 | 0.61 | 0.56 | 0.51 | 0.41 | 0.32 | 0.24 | 0.25 | 4.08 | 4.14 | 4.3 | 4.17 | 4.39 | 4.41 | 4.74 | 4.74 | 0.25 | 2.04 | 1.04 | 0.54 | 0.26 | 0.14 | 0.07 | 0.04 | 0.02 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 10.01 | 1.22 | 2.40 | 4.62 | 9.62 | 17.95 | 35.61 | 62.44 | 124.69 | 0 | 6.25 | 1 | 1.5 | 13.71 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 4 | 0 | 0 | 0 | 1 | 0 | 0.06 | 0.57 | 0.06 | 0.57 | 0.06 | 0.57 | 0.06 | 0.53 | 0.06 | 0.48 | 0.06 | 0.39 | 0.05 | 0.3 | 0.07 | 0.22 |
2 | exec - Step10_orig.c:19-35 | Step10_orig | Single | 0.02 | 0.01 | 0.01 | 0.01 | 0.01 | 0.02 | 0.01 | 0.01 | 0.01 | 0.11 | 0.05 | 0.04 | 0.01 | 0.02 | 0.01 | 0.01 | 0 | 0.01 | 0.11 | 0.04 | 0.02 | 0.01 | 0.01 | 0 | 0 | 0 | 0 | 1 | 2 | 4 | 8 | 16 | 32 | 64 | 124 | 176 | 4.99 | 13.59 | 26.63 | 55.25 | 50.88 | 0.00 | 0.00 | 0.00 | 0.00 | 6.45 | 8.47 | 1 | 4 | 4 | 1 | 1.25 | 2 | 1 | 2 | 0 | 0 | 0 | 0 | NA | NA | NA | NA | NA | 1 | 0 | 1.38 | -0 | 1.38 | -0 | 1.38 | -0 | 0.69 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
4 | exec - main.c:142-146 | main._omp_fn.1 | Single | 0.02 | 0.02 | 0.02 | 0.02 | 0.02 | 0.04 | 0.03 | 0.03 | 0.02 | 0.12 | 0.07 | 0.05 | 0.02 | 0.03 | 0.03 | 0.01 | 0.02 | 0.04 | 0.12 | 0.06 | 0.03 | 0.02 | 0.01 | 0.01 | 0 | 0 | 0 | 1 | 2 | 4 | 8 | 16 | 32 | 64 | 124 | 181 | 3.01 | 6.23 | 12.96 | 18.38 | 52.75 | 54.00 | 0.00 | 0.00 | 0.00 | 0 | 6.25 | 1.06 | 1.06 | 9.26 | 1 | 1.17 | 1.67 | 1 | 3 | 3 | 0 | 0 | 0 | 1 | 0 | 0 | 2 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0.75 | 0.01 | 0.75 | 0.01 | 0.38 | 0.03 | 1 | 0 | 1 | 0 | 1 | 0 |
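
For loop 3 (Step10_orig.c:19-31), the Efficiency columns appear consistent with the usual definition of parallel efficiency, T(1) / (n × T(n)), applied to the "Time w.r.t. Wall Time" values. That reading is an assumption drawn from the numbers in the table, not a documented formula of the reporting tool. A minimal Python sketch under that assumption, with the hypothetical helper `scaling` and the times copied from the row for loop 3:

```python
# Wall-clock time of loop 3 per run, copied from the
# "Time w.r.t. Wall Time" columns (thread count -> seconds).
WALL_TIME_S = {
    1: 661.1, 2: 331.04, 4: 165.5, 8: 82.94, 16: 41.92,
    32: 21.33, 64: 11.29, 128: 6.04, 192: 4.51,
}


def scaling(times):
    """Return {threads: (speedup, efficiency)} relative to the smallest run.

    Assumed definitions (not taken from the tool's documentation):
      speedup    = T(base) / T(n)
      efficiency = speedup * base_threads / n
    """
    base_threads = min(times)
    base_time = times[base_threads]
    result = {}
    for threads, seconds in sorted(times.items()):
        speedup = base_time / seconds
        efficiency = speedup * base_threads / threads
        result[threads] = (speedup, efficiency)
    return result


if __name__ == "__main__":
    for threads, (speedup, eff) in scaling(WALL_TIME_S).items():
        print(f"1x{threads:<4} speedup {speedup:7.2f}  efficiency {eff:.2f}")
```

Rounded to two decimals, the efficiencies computed this way reproduce the values listed for loop 3 (1, 1, 1, 1, 0.99, 0.97, 0.91, 0.86, 0.76). The same check does not apply directly to the other loops, whose timings are too small or rounded to 0 in the table.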