Run | Configuration |
---|---|
Run 2x1 | Number processes: 2; Number nodes: 1; Number processes per node: 2; Run Command: `<executable> -x 100 -y 100 -z 100 --xproc=2 --yproc=1 --zproc=1`; MPI Command: `mpirun -np <number_processes> /usr/bin/numactl -m 8-15`; Dataset: (empty); Run Directory: /home/eoseret/qaas_runs_CPU_9468/171-110-4860/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_run_1711106847; I_MPI_PIN_DOMAIN: auto:scatter; OMP_PLACES: threads; OMP_PROC_BIND: spread; OMP_NUM_THREADS: 1 |
Run 2x2 | OMP_NUM_THREADS: 2; I_MPI_PIN_DOMAIN: auto:scatter; OMP_PLACES: threads; OMP_PROC_BIND: spread |
Run 2x4 | OMP_NUM_THREADS: 4; I_MPI_PIN_DOMAIN: auto:scatter; OMP_PLACES: threads; OMP_PROC_BIND: spread |
Run 2x8 | OMP_NUM_THREADS: 8; I_MPI_PIN_DOMAIN: auto:scatter; OMP_PLACES: threads; OMP_PROC_BIND: spread |
Run 2x16 | OMP_NUM_THREADS: 16; I_MPI_PIN_DOMAIN: auto:scatter; OMP_PLACES: threads; OMP_PROC_BIND: spread |
Run 2x32 | OMP_NUM_THREADS: 32; I_MPI_PIN_DOMAIN: auto:scatter; OMP_PLACES: threads; OMP_PROC_BIND: spread |
Run 2x48 | OMP_NUM_THREADS: 48; I_MPI_PIN_DOMAIN: auto:scatter; OMP_PLACES: threads; OMP_PROC_BIND: spread |
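
The run labels above appear to follow the pattern "MPI processes x OpenMP threads per process": every run uses 2 processes, and only `OMP_NUM_THREADS` changes. The sketch below rebuilds that run matrix under this assumption. The names `MPI_RANKS`, `COMMON_ENV`, and `run_matrix` are illustrative and do not come from the report or from any tool.

```python
# Hypothetical helper: rebuilds the run matrix listed in the table above.
# Assumption: "Run 2xN" means 2 MPI processes with N OpenMP threads each.
MPI_RANKS = 2
THREADS_PER_RANK = [1, 2, 4, 8, 16, 32, 48]

# Environment settings that the table lists identically for every run.
COMMON_ENV = {
    "I_MPI_PIN_DOMAIN": "auto:scatter",
    "OMP_PLACES": "threads",
    "OMP_PROC_BIND": "spread",
}


def run_matrix():
    """Return one dict per run: label, environment, and total thread count."""
    runs = []
    for n in THREADS_PER_RANK:
        env = dict(COMMON_ENV, OMP_NUM_THREADS=str(n))
        runs.append(
            {
                "label": f"{MPI_RANKS}x{n}",
                "env": env,
                "total_threads": MPI_RANKS * n,
            }
        )
    return runs


for run in run_matrix():
    print(run["label"], run["total_threads"], run["env"]["OMP_NUM_THREADS"])
```

Under this reading, the total thread counts come out as 2, 4, 8, 16, 32, 64, and 96, which agrees with the "Nb Threads" columns reported for the fully parallel loops in the loop table.
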
Loop id | Source Location | Source Function | Level | Coverage 2x1 (%) | Coverage 2x2 (%) | Coverage 2x4 (%) | Coverage 2x8 (%) | Coverage 2x16 (%) | Coverage 2x32 (%) | Coverage 2x48 (%) | Max Time Over Threads 2x1 (s) | Max Time Over Threads 2x2 (s) | Max Time Over Threads 2x4 (s) | Max Time Over Threads 2x8 (s) | Max Time Over Threads 2x16 (s) | Max Time Over Threads 2x32 (s) | Max Time Over Threads 2x48 (s) | Time w.r.t. Wall Time 2x1 (s) | Time w.r.t. Wall Time 2x2 (s) | Time w.r.t. Wall Time 2x4 (s) | Time w.r.t. Wall Time 2x8 (s) | Time w.r.t. Wall Time 2x16 (s) | Time w.r.t. Wall Time 2x32 (s) | Time w.r.t. Wall Time 2x48 (s) | Nb Threads 2x1 | Nb Threads 2x2 | Nb Threads 2x4 | Nb Threads 2x8 | Nb Threads 2x16 | Nb Threads 2x32 | Nb Threads 2x48 | GFLOPS 2x1 | GFLOPS 2x2 | GFLOPS 2x4 | GFLOPS 2x8 | GFLOPS 2x16 | GFLOPS 2x32 | GFLOPS 2x48 | Vectorization Ratio (%) | Vector Length Use (%) | Speedup If No Scalar Integer | Speedup If FP Vectorized | Speedup If Fully Vectorized | Speedup If Perfect Load Balancing 2x1 | Speedup If Perfect Load Balancing 2x2 | Speedup If Perfect Load Balancing 2x4 | Speedup If Perfect Load Balancing 2x8 | Speedup If Perfect Load Balancing 2x16 | Speedup If Perfect Load Balancing 2x32 | Speedup If Perfect Load Balancing 2x48 | Stride 0 | Stride 1 | Stride n | Stride Unknown | Stride Indirect | (2x1) Efficiency | (2x1) Potential Speed-Up (%) | (2x2) Efficiency | (2x2) Potential Speed-Up (%) | (2x4) Efficiency | (2x4) Potential Speed-Up (%) | (2x8) Efficiency | (2x8) Potential Speed-Up (%) | (2x16) Efficiency | (2x16) Potential Speed-Up (%) | (2x32) Efficiency | (2x32) Potential Speed-Up (%) | (2x48) Efficiency | (2x48) Potential Speed-Up (%) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
93 | exec - ljForce.c:191-216 [...] | ljForce.extracted | Innermost | 92.58 | 90.96 | 88.36 | 83.08 | 74.49 | 53.72 | 44.31 | 238.84 | 119.78 | 59.99 | 30.31 | 15.26 | 9.7 | 7.68 | 238.88 | 119.86 | 60.03 | 30.05 | 15.03 | 8.85 | 6.76 | 2 | 4 | 8 | 16 | 32 | 64 | 96 | 8.58 | 17.09 | 34.12 | 68.17 | 136.28 | 231.46 | 303.00 | 35.93 | 16.99 | 1 | 2.59 | 5.18 | 1 | 1 | 1 | 1.01 | 1.02 | 1.1 | 1.14 | 1.33 | 0 | 1 | 0 | 0.67 | 1 | 0 | 1 | 0.32 | 0.99 | 0.46 | 0.99 | 0.53 | 0.99 | 0.5 | 0.84 | 8.41 | 0.74 | 11.69 |
101 | exec - timestep.c:74-78 | advanceVelocity.extracted | Innermost | 1.33 | 1.36 | 1.41 | 1.4 | 1.5 | 2.75 | 4.41 | 3.45 | 1.88 | 1.04 | 0.54 | 0.34 | 0.62 | 0.94 | 3.44 | 1.79 | 0.96 | 0.51 | 0.3 | 0.45 | 0.67 | 2 | 4 | 8 | 16 | 32 | 64 | 96 | 1.31 | 2.52 | 4.75 | 8.92 | 15.23 | 10.00 | 6.77 | 0 | 12.5 | 1.08 | 1.13 | 8 | 1 | 1.05 | 1.08 | 1.06 | 1.13 | 1.38 | 1.4 | 0 | 2 | 0 | 0 | 0 | 1 | 0 | 0.96 | 0.05 | 0.9 | 0.15 | 0.84 | 0.22 | 0.72 | 0.42 | 0.24 | 2.09 | 0.11 | 3.94 |
59 | exec - haloExchange.c:621-629 | sortAtomsInCell | Single | 1.14 | 1.2 | 1.19 | 1.17 | 1.27 | 3 | 4.21 | 2.98 | 1.69 | 0.89 | 0.5 | 0.3 | 0.62 | 0.87 | 2.95 | 1.59 | 0.81 | 0.42 | 0.26 | 0.49 | 0.64 | 2 | 4 | 8 | 16 | 32 | 64 | 96 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 50 | 22.92 | 1.6 | 1 | 5.55 | 1.01 | 1.07 | 1.1 | 1.19 | 1.15 | 1.27 | 1.36 | 0 | 2 | 3 | 0 | 0 | 1 | 0 | 0.93 | 0.09 | 0.91 | 0.11 | 0.88 | 0.14 | 0.71 | 0.37 | 0.19 | 2.44 | 0.1 | 3.81 |
92 | exec - ljForce.c:187-216 [...] | ljForce.extracted | InBetween | 0.99 | 0.96 | 0.94 | 0.86 | 0.78 | 0.55 | 0.47 | 2.61 | 1.31 | 0.69 | 0.37 | 0.2 | 0.16 | 0.12 | 2.55 | 1.26 | 0.64 | 0.31 | 0.16 | 0.09 | 0.07 | 2 | 4 | 8 | 16 | 32 | 64 | 96 | 12.51 | 25.41 | 49.98 | 103.34 | 199.16 | 354.47 | 457.76 | 0 | 12.5 | 1 | 1 | 8 | 1.02 | 1.04 | 1.08 | 1.19 | 1.25 | 1.78 | 1.71 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 1.01 | 0 | 1 | 0 | 1.03 | 0 | 1 | 0 | 0.89 | 0.06 | 0.76 | 0.11 |
85 | exec - linkCells.c:209-371 [...] | updateLinkCells | Innermost | 0.79 | 0.77 | 0.76 | 0.73 | 0.64 | 0.42 | 0.3 | 2.05 | 2.05 | 2.07 | 2.17 | 2.11 | 2.32 | 2.17 | 2.03 | 1.02 | 0.52 | 0.26 | 0.13 | 0.07 | 0.05 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 1.16 | 2.30 | 4.53 | 9.00 | 18.11 | 33.62 | 47.03 | 34.43 | 14.45 | 2.65 | 2.55 | 11.22 | 1.01 | 1.01 | 1 | 1.03 | 1.02 | 1.05 | 1 | NA | NA | NA | NA | NA | 1 | 0 | 1 | 0 | 0.98 | 0.02 | 0.98 | 0.02 | 0.98 | 0.02 | 0.91 | 0.04 | 0.85 | 0.05 |
103 | exec - timestep.c:88-94 | advancePosition.extracted | Innermost | 0.7 | 0.72 | 0.72 | 0.72 | 0.8 | 1.67 | 2.57 | 1.84 | 0.99 | 0.51 | 0.31 | 0.19 | 0.39 | 0.53 | 1.81 | 0.94 | 0.49 | 0.26 | 0.16 | 0.28 | 0.39 | 2 | 4 | 8 | 16 | 32 | 64 | 96 | 1.45 | 2.79 | 5.39 | 10.12 | 16.62 | 9.31 | 6.59 | 0 | 12.5 | 1 | 1.68 | 2 | 1.02 | 1.05 | 1.04 | 1.19 | 1.19 | 1.44 | 1.36 | 0 | 3 | 0 | 0 | 1 | 1 | 0 | 0.96 | 0.03 | 0.92 | 0.06 | 0.87 | 0.09 | 0.71 | 0.23 | 0.2 | 1.33 | 0.1 | 2.32 |
91 | exec - ljForce.c:178-216 [...] | ljForce.extracted | InBetween | 0.42 | 0.41 | 0.38 | 0.36 | 0.36 | 0.25 | 0.21 | 1.08 | 0.59 | 0.33 | 0.18 | 0.1 | 0.08 | 0.06 | 1.08 | 0.54 | 0.26 | 0.13 | 0.07 | 0.04 | 0.03 | 2 | 4 | 8 | 16 | 32 | 64 | 96 | 11.89 | 24.46 | 49.77 | 99.34 | 188.32 | 326.07 | 436.25 | 0 | 9.71 | 1 | 1 | 13.87 | 1 | 1.09 | 1.27 | 1.38 | 1.43 | 2 | 2 | 1 | 1 | 0.75 | 1.75 | 0.75 | 1 | 0 | 1 | 0 | 1.04 | 0 | 1.04 | 0 | 0.96 | 0.01 | 0.84 | 0.04 | 0.75 | 0.05 |
89 | exec - ljForce.c:157-158 [...] | ljForce.extracted.25 | Single | 0.31 | 0.34 | 0.34 | 0.38 | 0.62 | 4.95 | 5.49 | 0.84 | 0.5 | 0.27 | 0.19 | 0.16 | 0.98 | 1.1 | 0.81 | 0.44 | 0.23 | 0.14 | 0.13 | 0.82 | 0.84 | 2 | 4 | 8 | 16 | 32 | 64 | 96 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 50 | 15.63 | 2 | 1 | 6.4 | 1.05 | 1.14 | 1.17 | 1.36 | 1.23 | 1.21 | 1.33 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0.92 | 0.03 | 0.88 | 0.04 | 0.72 | 0.11 | 0.39 | 0.38 | 0.03 | 4.8 | 0.02 | 5.38 |
47 | exec - haloExchange.c:380-389 | loadAtomsBuffer | Innermost | 0.12 | 0.11 | 0.1 | 0.1 | 0.09 | 0.05 | 0.05 | 0.31 | 0.3 | 0.3 | 0.32 | 0.31 | 0.28 | 0.41 | 0.31 | 0.14 | 0.07 | 0.04 | 0.02 | 0.01 | 0.01 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 0.46 | 1.12 | 2.01 | 3.65 | 7.93 | 15.50 | 13.30 | 30.77 | 13.94 | 1.58 | 1.06 | 7.24 | 1 | 1.03 | 1.11 | 1.1 | 1.03 | 1.08 | 1.24 | 0 | 2 | 3 | 0 | 0 | 1 | 0 | 1.11 | 0 | 1.11 | 0 | 0.97 | 0 | 0.97 | 0 | 0.97 | 0 | 0.65 | 0.02 |
60 | exec - haloExchange.c:633-642 | sortAtomsInCell | Single | 0.07 | 0.07 | 0.07 | 0.06 | 0.06 | 0.05 | 0.04 | 0.17 | 0.11 | 0.07 | 0.05 | 0.02 | 0.03 | 0.02 | 0.17 | 0.1 | 0.05 | 0.02 | 0.01 | 0.01 | 0.01 | 2 | 4 | 8 | 16 | 29 | 51 | 68 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 33.33 | 14.58 | 1.5 | 1 | 6.86 | 1 | 1.1 | 1.4 | 2.5 | 2 | 3 | 2 | 0 | 2 | 3 | 0 | 0 | 1 | 0 | 0.85 | 0.01 | 0.85 | 0.01 | 1.06 | -0 | 1.06 | -0 | 0.53 | 0.02 | 0.35 | 0.03 |
99 | exec - timestep.c:110-116 | kineticEnergy.extracted | Innermost | 0.05 | 0.06 | 0.05 | 0.05 | 0.07 | 0.14 | 0.19 | 0.13 | 0.08 | 0.05 | 0.03 | 0.03 | 0.04 | 0.05 | 0.14 | 0.07 | 0.03 | 0.02 | 0.01 | 0.02 | 0.03 | 2 | 4 | 8 | 16 | 32 | 64 | 96 | 2.56 | 4.79 | 12.03 | 17.28 | 32.00 | 17.70 | 12.35 | 91.3 | 36.96 | 1.17 | 1.24 | 1.31 | 1 | 1.14 | 1.67 | 1.5 | 3 | 2 | 1.67 | 0 | 1 | 0 | 0 | 2 | 1 | 0 | 1 | 0 | 1.17 | 0 | 0.88 | 0.01 | 0.88 | 0.01 | 0.22 | 0.11 | 0.1 | 0.17 |
100 | exec - timestep.c:71-78 | advanceVelocity.extracted | Outermost | 0.04 | 0.06 | 0.04 | 0.03 | 0.03 | 0.07 | 0.12 | 0.1 | 0.11 | 0.04 | 0.02 | 0.01 | 0.03 | 0.07 | 0.1 | 0.08 | 0.02 | 0.01 | 0.01 | 0.01 | 0.02 | 2 | 4 | 8 | 16 | 32 | 64 | 96 | 2.78 | 3.53 | 11.83 | 24.25 | 22.40 | 29.85 | 12.88 | 0 | 9.82 | 1.41 | 1.54 | 12.43 | 1 | 1.38 | 2 | 2 | 1 | 3 | 3.5 | 0.75 | 2 | 1 | 0 | 0.75 | 1 | 0 | 0.63 | 0.02 | 1.25 | 0 | 1.25 | 0 | 0.63 | 0.01 | 0.31 | 0.05 | 0.1 | 0.11 |
73 | exec - initAtoms.c:154-162 [...] | setTemperature.extracted.30 | InBetween | 0.03 | 0.04 | 0.04 | 0.02 | 0.02 | 0.02 | 0.01 | 0.08 | 0.06 | 0.05 | 0.02 | 0.01 | 0.01 | 0.01 | 0.08 | 0.05 | 0.02 | 0.01 | 0 | 0 | 0 | 2 | 4 | 8 | 16 | 32 | 64 | 96 | 4.77 | 7.56 | 19.05 | 36.25 | 0.00 | 0.00 | 0.00 | 4.11 | 12.93 | 1 | 2 | 2 | 1 | 1.2 | 2.5 | 2 | 0 | 0 | 0 | 1 | 0 | 0 | 3 | 0 | 1 | 0 | 0.8 | 0.01 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
77 | exec - initAtoms.c:197-202 [...] | randomDisplacements.extracted | Innermost | 0.03 | 0.03 | 0.03 | 0.02 | 0.02 | 0.03 | 0.02 | 0.07 | 0.04 | 0.02 | 0.01 | 0.01 | 0.01 | 0.01 | 0.07 | 0.04 | 0.02 | 0.01 | 0.01 | 0.01 | 0 | 2 | 4 | 8 | 16 | 32 | 64 | 54 | 0.69 | 1.20 | 2.40 | 4.80 | 4.80 | 1.55 | 0.00 | 3.67 | 12.96 | 10.8 | 1.45 | 8 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0.88 | 0 | 0.88 | 0 | 0.88 | 0 | 0.44 | 0.01 | 0.22 | 0.02 | 1 | 0 |
102 | exec - timestep.c:85-94 | advancePosition.extracted | Outermost | 0.03 | 0.03 | 0.03 | 0.02 | 0.02 | 0.04 | 0.06 | 0.08 | 0.04 | 0.03 | 0.01 | 0.02 | 0.02 | 0.04 | 0.07 | 0.04 | 0.02 | 0.01 | 0 | 0.01 | 0.01 | 2 | 4 | 8 | 16 | 32 | 64 | 96 | 2.52 | 4.40 | 8.05 | 17.30 | 0.00 | 18.60 | 20.00 | 0 | 10.12 | 1.3 | 1.61 | 3.83 | 1.14 | 1 | 1.5 | 1 | 0 | 2 | 4 | 0.75 | 2 | 1.75 | 0 | 1.75 | 1 | 0 | 0.88 | 0 | 0.88 | 0 | 0.88 | 0 | 1 | 0 | 0.22 | 0.03 | 0.15 | 0.05 |
90 | exec - ljForce.c:172-216 [...] | ljForce.extracted | Outermost | 0.02 | 0.04 | 0.02 | 0.03 | 0.02 | 0.02 | 0.02 | 0.06 | 0.08 | 0.03 | 0.02 | 0.02 | 0.01 | 0.02 | 0.05 | 0.05 | 0.01 | 0.01 | 0 | 0 | 0 | 2 | 4 | 8 | 16 | 32 | 64 | 96 | 14.79 | 15.26 | 75.35 | 77.30 | 0.00 | 0.00 | 0.00 | 0 | 7.34 | 1 | 1 | 15.7 | 1.2 | 1.6 | 3 | 2 | 0 | 0 | 0 | 1 | 0 | 2 | 0.5 | 0.5 | 1 | 0 | 0.5 | 0.02 | 1.25 | 0 | 0.63 | 0.01 | 1 | 0 | 1 | 0 | 1 | 0 |
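
The "Efficiency" and "Potential Speed-Up (%)" columns can be cross-checked against the timing columns. The sketch below does this for loop 93 (`ljForce.c:191-216`), using the "Time w.r.t. Wall Time" and "Coverage" values from its row. The formulas are an assumption inferred from the numbers, not taken from the tool's documentation: efficiency is taken as T(2x1) / (N × T(2xN)), and potential speed-up as coverage × (1 − efficiency). The function name `scaling_metrics` is illustrative.

```python
# Values copied from the row for loop 93 (ljForce.c:191-216) in the table above.
THREADS_PER_RANK = [1, 2, 4, 8, 16, 32, 48]
WALL_TIME_S = [238.88, 119.86, 60.03, 30.05, 15.03, 8.85, 6.76]
COVERAGE_PCT = [92.58, 90.96, 88.36, 83.08, 74.49, 53.72, 44.31]


def scaling_metrics(threads, times, coverage):
    """Return (label, speedup, efficiency, potential speed-up %) per run.

    Assumed formulas (inferred from the reported values):
      speedup     = T(2x1) / T(2xN)
      efficiency  = speedup / N
      potential % = coverage % * (1 - efficiency)
    """
    base = times[0]  # the 2x1 run is the reference
    rows = []
    for n, t, cov in zip(threads, times, coverage):
        speedup = base / t
        efficiency = speedup / n
        potential = cov * (1.0 - efficiency)
        rows.append((f"2x{n}", speedup, efficiency, potential))
    return rows


METRICS = scaling_metrics(THREADS_PER_RANK, WALL_TIME_S, COVERAGE_PCT)
for label, speedup, efficiency, potential in METRICS:
    print(f"{label:>5}  speedup {speedup:6.2f}  "
          f"efficiency {efficiency:4.2f}  potential {potential:5.2f} %")
```

For loop 93 these formulas reproduce the reported efficiency and potential speed-up values to two decimal places, for example 0.84 and 8.41 % at 2x32 and 0.74 and 11.69 % at 2x48. Rows with very small times are printed with few significant digits, so the same check may match them less precisely.
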