Loops
MultiBsplineRef.hpp: 68 - 146.25%
Runs, left to right (seven columns per run): orig_HBM_CACHE | gcc_1_HBM_CACHE | icx_5_HBM_CACHE | orig_DDR | gcc_11_DDR | icx_7_DDR
ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
873 | 22.74 | 22.19 | 27.19 | 100 | 25 | 260.15 | 820 | 21.38 | 23 | 23.17 | 100 | 50 | 266.42 | 768 | 22.51 | 21.54 | 26.89 | 100 | 25 | 267.31 | 873 | 27.97 | 26.69 | 23.66 | 100 | 25 | 228.14 | 676 | 27.45 | 26.12 | 22.24 | 100 | 50 | 234.6 | 748 | 27.55 | 26 | 23.1 | 100 | 25 | 225.8 |
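Each full row in these tables holds six groups of seven cells, one group per run, in the order of the run list above the column header. As a reading aid, the sketch below splits one such row into per-run records and compares two runs. The names `RUNS`, `FIELDS`, `to_number`, and `split_row` are hypothetical and not part of the profiler's output; only the column layout is taken from the table. It assumes a complete 42-cell row. Rows in later tables where empty cells were collapsed cannot be assigned to runs reliably, so the sketch rejects them.

```python
# Hypothetical helper names; only the column layout comes from the table above.
RUNS = ["orig_HBM_CACHE", "gcc_1_HBM_CACHE", "icx_5_HBM_CACHE",
        "orig_DDR", "gcc_11_DDR", "icx_7_DDR"]
FIELDS = ["loop_id", "max_time_s", "wall_time_s", "cov_pct",
          "vect_ratio_pct", "vec_len_use_pct", "gflops"]


def to_number(cell):
    """Convert a cell to float; the report's 'NA' marker becomes None."""
    return None if cell == "NA" else float(cell)


def split_row(row):
    """Split one flattened row into {run_name: {field_name: value}}."""
    cells = [c.strip() for c in row.strip().strip("|").split("|")]
    expected = len(RUNS) * len(FIELDS)
    if len(cells) != expected:
        # Collapsed or partial rows are ambiguous, so refuse to guess.
        raise ValueError(f"expected {expected} cells, got {len(cells)}")
    width = len(FIELDS)
    parsed = {}
    for i, run in enumerate(RUNS):
        group = cells[i * width:(i + 1) * width]
        record = {"loop_id": group[0]}  # keep the loop ID as text
        for name, cell in zip(FIELDS[1:], group[1:]):
            record[name] = to_number(cell)
        parsed[run] = record
    return parsed


# The single data row of the MultiBsplineRef.hpp: 68 table, copied verbatim.
ROW = ("873 | 22.74 | 22.19 | 27.19 | 100 | 25 | 260.15 | "
       "820 | 21.38 | 23 | 23.17 | 100 | 50 | 266.42 | "
       "768 | 22.51 | 21.54 | 26.89 | 100 | 25 | 267.31 | "
       "873 | 27.97 | 26.69 | 23.66 | 100 | 25 | 228.14 | "
       "676 | 27.45 | 26.12 | 22.24 | 100 | 50 | 234.6 | "
       "748 | 27.55 | 26 | 23.1 | 100 | 25 | 225.8 |")

parsed = split_row(ROW)
slowdown = (parsed["orig_DDR"]["max_time_s"]
            / parsed["orig_HBM_CACHE"]["max_time_s"])
print(f"orig DDR / HBM_CACHE max-time ratio: {slowdown:.2f}")
```

For this row the ratio comes out at about 1.23, i.e. the original build's hottest loop takes roughly 23% longer per thread in the DDR run than in the HBM cache run.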
SoaDistanceTableAAOMPTarget.h: 440 - 34.91%
Runs, left to right (seven columns per run): orig_HBM_CACHE | gcc_1_HBM_CACHE | icx_5_HBM_CACHE | orig_DDR | gcc_11_DDR | icx_7_DDR
ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1864 | 4.58 | 4.34 | 5.32 | 54.55 | 15.91 | 0 | 2289 | 4.5 | 4.78 | 4.81 | 27.27 | 15.91 | 0 | 1742 | 4.57 | 4.22 | 5.27 | 54.55 | 15.91 | 0 | 1864 | 8.17 | 7.71 | 6.83 | 54.55 | 15.91 | 0 | 184 | 7.49 | 7.09 | 6.03 | 27.27 | 15.91 | 0 | 1724 | 8.22 | 7.49 | 6.65 | 54.55 | 15.91 | 0 |
BsplineFunctor.h: 236 - 8.72%
Runs, left to right (seven columns per run): orig_HBM_CACHE | gcc_1_HBM_CACHE | icx_5_HBM_CACHE | orig_DDR | gcc_11_DDR | icx_7_DDR
ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
393 | 1.27 | 0.99 | 1.22 | 89.47 | 44.08 | 0.74 | 261 | 0.06 | 0.04 | 0.04 | 0 | 10 | 0.54 | 274 | 1.28 | 1.09 | 1.36 | 84.44 | 43.19 | 0.74 | 393 | 1.98 | 1.47 | 1.3 | 89.47 | 44.08 | 0.53 | 642 | 0.12 | 0.05 | 0.04 | 0 | 10 | 0.57 | 252 | 2.15 | 1.7 | 1.51 | 0 | 11.16 | 0.18 |
308 | 0.06 | 0.03 | 0.03 | 88.24 | 43.38 | 1.78 | 342 | 1.58 | 1.55 | 1.56 | 0 | 10 | 0.3 | 308 | 0.1 | 0.04 | 0.04 | 88.24 | 43.38 | 1.76 | 558 | 2.2 | 1.91 | 1.62 | 0 | 10 | 0.25 |
inner_product.hpp: 155 - 6.44%
Runs, left to right (seven columns per run): orig_HBM_CACHE | gcc_1_HBM_CACHE | icx_5_HBM_CACHE | orig_DDR | gcc_11_DDR | icx_7_DDR
ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
979 | 0.1 | 0.07 | 0.09 | 33.33 | 16.67 | 181.25 | 1000 | 0.07 | 0.06 | 0.06 | 100 | 50 | 212.31 | 885 | 0.4 | 0.35 | 0.43 | 33.33 | 16.67 | 181.68 | 979 | 0.28 | 0.13 | 0.12 | 33.33 | 16.67 | 97.54 | 658 | 0.22 | 0.14 | 0.12 | 100 | 44.87 | 451.06 | 866 | 0.81 | 0.62 | 0.55 | 33.33 | 16.67 | 102 |
981 | 0.48 | 0.38 | 0.46 | 33.33 | 16.67 | 167.82 | 977 | 0.39 | 0.31 | 0.32 | 100 | 50 | 204.81 | 870 | 0.09 | 0.07 | 0.08 | 33.33 | 16.67 | 182.03 | 981 | 0.78 | 0.62 | 0.55 | 33.33 | 16.67 | 103.17 | 854 | 0.41 | 0.27 | 0.24 | 33.33 | 16.67 | 234.17 | |||||||
982 | 0.38 | 0.28 | 0.35 | 33.33 | 16.67 | 225.34 | 981 | 0.38 | 0.32 | 0.32 | 100 | 50 | 197.36 | 873 | 0.35 | 0.22 | 0.27 | 33.33 | 16.67 | 288 | 982 | 0.42 | 0.32 | 0.28 | 33.33 | 16.67 | 197.18 | 853 | 0.72 | 0.59 | 0.52 | 33.33 | 16.67 | 108.63 | |||||||
994 | 0.43 | 0.36 | 0.44 | 33.33 | 16.67 | 176.37 | 971 | 0.16 | 0.12 | 0.12 | 100 | 50 | 529.62 | 872 | 0.46 | 0.35 | 0.44 | 33.33 | 16.67 | 181.79 | 994 | 0.83 | 0.65 | 0.58 | 33.33 | 16.67 | 97.65 | 851 | 0.18 | 0.11 | 0.1 | 33.33 | 16.67 | 116.08 |
einspline_spo_ref.hpp: 223 - 5.28%
Runs, left to right (seven columns per run): orig_HBM_CACHE | gcc_1_HBM_CACHE | icx_5_HBM_CACHE | orig_DDR | gcc_11_DDR | icx_7_DDR
ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
875 | 1.04 | 0.81 | 0.99 | 30 | 15.31 | 0 | 890 | 0.96 | 0.91 | 0.91 | 11.11 | 13.89 | 0 | 770 | 0.78 | 0.63 | 0.79 | 20 | 13.13 | 0 | 875 | 1.23 | 1.02 | 0.91 | 30 | 15.31 | 0 | 682 | 1.16 | 1.02 | 0.87 | 11.11 | 13.89 | 0 | 750 | 1.13 | 0.91 | 0.81 | 0 | 11.93 | 0 |
<unknown>: 0 - 4.12%
Runs, left to right (seven columns per run): orig_HBM_CACHE | gcc_1_HBM_CACHE | icx_5_HBM_CACHE | orig_DDR | gcc_11_DDR | icx_7_DDR
ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1345 | 0 | 0 | 0 | 0 | 0 | NA | 170 | 0.02 | 0 | 0 | 0 | 0 | NA | 2280 | 0.88 | 0.76 | 0.95 | 100 | 50 | 0.39 | 2345 | 1.39 | 1.24 | 1.1 | 100 | 50 | 0.28 | 92 | 0 | 0 | 0 | 0 | 0 | NA | 830 | 0 | 0 | 0 | 0 | 0 | NA |
2345 | 0.94 | 0.79 | 0.97 | 100 | 50 | 0.34 | 816 | 0.02 | 0 | 0 | 0 | 0 | NA | 244 | 0.01 | 0 | 0 | 0 | 0 | NA | 353 | 0 | 0 | 0 | 0 | 0 | NA | 81 | 0 | 0 | 0 | 0 | 0 | NA | 82 | 0.01 | 0 | 0 | 0 | 0 | NA |
353 | 0 | 0 | 0 | 0 | 0 | NA | 1891 | 0.04 | 0 | 0 | 0 | 0 | NA | 246 | 0 | 0 | 0 | 0 | 0 | NA | 355 | 0.02 | 0 | 0 | 0 | 0 | NA | 80 | 0 | 0 | 0 | 0 | 0 | NA | 178 | 0 | 0 | 0 | 0 | 0 | NA |
355 | 0.02 | 0 | 0 | 0 | 0 | NA | 815 | 0.01 | 0 | 0 | 0 | 0 | NA | 240 | 0.01 | 0 | 0 | 0 | 0 | NA | 365 | 0.01 | 0 | 0 | 0 | 0 | NA | 87 | 0 | 0 | 0 | 0 | 0 | NA | 179 | 0.02 | 0 | 0 | 0 | 0 | NA |
365 | 0.01 | 0 | 0 | 0 | 0 | NA | 124 | 0 | 0 | 0 | 0 | 0 | NA | 1068 | 0 | 0 | 0 | 0 | 0 | NA | 1058 | 0.01 | 0 | 0 | 0 | 0 | NA | 45 | 0 | 0 | 0 | 0 | 0 | NA | 212 | 0.02 | 0 | 0 | 0 | 0 | NA |
1058 | 0.01 | 0 | 0 | 0 | 0 | NA | 2533 | 0.02 | 0 | 0 | 0 | 0 | NA | 784 | 0 | 0 | 0 | 0 | 0 | NA | 367 | 0 | 0 | 0 | 0 | 0 | NA | 85 | 0.01 | 0 | 0 | 0 | 0 | NA | 858 | 0 | 0 | 0 | 0 | 0 | NA |
367 | 0 | 0 | 0 | 0 | 0 | NA | 2453 | 0.01 | 0 | 0 | 0 | 0 | NA | 62 | 0 | 0 | 0 | 0 | 0 | NA | 1544 | 0 | 0 | 0 | 0 | 0 | NA | 94 | 0.01 | 0 | 0 | 0 | 0 | NA | 1198 | 0 | 0 | 0 | 0 | 0 | NA |
1283 | 0 | 0 | 0 | 0 | 0 | NA | 127 | 0 | 0 | 0 | 0 | 0 | NA | 1243 | 0 | 0 | 0 | 0 | 0 | NA | 1283 | 0 | 0 | 0 | 0 | 0 | NA | 75 | 0.03 | 0.01 | 0.01 | 30.95 | 14.96 | 8.75 | 29 | 0 | 0 | 0 | 0 | 0 | NA |
372 | 0 | 0 | 0 | 0 | 0 | NA | 2418 | 0.01 | 0 | 0 | 0 | 0 | NA | 61 | 0 | 0 | 0 | 0 | 0 | NA | 372 | 0 | 0 | 0 | 0 | 0 | NA | 742 | 0 | 0 | 0 | 0 | 0 | NA | 744 | 0.01 | 0 | 0 | 0 | 0 | NA |
369 | 0 | 0 | 0 | 0 | 0 | NA | 1743 | 0 | 0 | 0 | 0 | 0 | NA | 249 | 0.01 | 0 | 0 | 0 | 0 | NA | 1323 | 0 | 0 | 0 | 0 | 0 | NA | 206 | 0.02 | 0 | 0 | 0 | 0 | NA | 1441 | 0.07 | 0 | 0 | 0 | 0 | NA |
371 | 0 | 0 | 0 | 0 | 0 | NA | 1739 | 0 | 0 | 0 | 0 | 0 | NA | 764 | 0.03 | 0 | 0 | 0 | 0 | NA | 371 | 0 | 0 | 0 | 0 | 0 | NA | 204 | 0 | 0 | 0 | 0 | 0 | NA | 204 | 0.01 | 0 | 0 | 0 | 0 | NA |
20 | 0 | 0 | 0 | 0 | 0 | NA | 379 | 0 | 0 | 0 | 0 | 0 | NA | 29 | 0 | 0 | 0 | 0 | 0 | NA | 380 | 0 | 0 | 0 | 0 | 0 | NA | 205 | 0.01 | 0 | 0 | 0 | 0 | NA | 976 | 0 | 0 | 0 | 0 | 0 | NA |
382 | 0 | 0 | 0 | 0 | 0 | NA | 1532 | 0 | 0 | 0 | 0 | 0 | NA | 228 | 0 | 0 | 0 | 0 | 0 | NA | 382 | 0 | 0 | 0 | 0 | 0 | NA | 692 | 0 | 0 | 0 | 0 | 0 | NA | 730 | 0.01 | 0 | 0 | 0 | 0 | NA |
388 | 0 | 0 | 0 | 0 | 0 | NA | 988 | 0 | 0 | 0 | 0 | 0 | NA | 292 | 0 | 0 | 0 | 0 | 0 | NA | 388 | 0 | 0 | 0 | 0 | 0 | NA | 691 | 0 | 0 | 0 | 0 | 0 | NA | 728 | 0 | 0 | 0 | 0 | 0 | NA |
384 | 0 | 0 | 0 | 0 | 0 | NA | 294 | 0 | 0 | 0 | 0 | 0 | NA | 313 | 0 | 0 | 0 | 0 | 0 | NA | 384 | 0 | 0 | 0 | 0 | 0 | NA | 684 | 0.01 | 0 | 0 | 0 | 0 | NA | 984 | 0 | 0 | 0 | 0 | 0 | NA |
258 | 0.01 | 0 | 0 | 0 | 0 | NA | 986 | 0 | 0 | 0 | 0 | 0 | NA | 270 | 0 | 0 | 0 | 0 | 0 | NA | 387 | 0.02 | 0.01 | 0 | 0 | 0 | 233.81 | 672 | 0.01 | 0 | 0 | 0 | 0 | NA | 1966 | 0.02 | 0 | 0 | 0 | 0 | NA |
987 | 0 | 0 | 0 | 0 | 0 | NA | 701 | 0 | 0 | 0 | 0 | 0 | NA | 315 | 0 | 0 | 0 | 0 | 0 | NA | 101 | 0.02 | 0 | 0 | 0 | 0 | NA | 208 | 0.01 | 0 | 0 | 0 | 0 | NA | 280 | 0.03 | 0 | 0 | 0 | 0 | NA |
989 | 0.01 | 0 | 0 | 0 | 0 | NA | 707 | 0.01 | 0 | 0 | 0 | 0 | NA | 1242 | 0 | 0 | 0 | 0 | 0 | NA | 258 | 0.02 | 0 | 0 | 0 | 0 | NA | 210 | 0 | 0 | 0 | 0 | 0 | NA | 872 | 0 | 0 | 0 | 0 | 0 | NA |
1203 | 0 | 0 | 0 | 0 | 0 | NA | 708 | 0.01 | 0 | 0 | 0 | 0 | NA | 1434 | 0 | 0 | 0 | 0 | 0 | NA | 986 | 0 | 0 | 0 | 0 | 0 | NA | 443 | 0 | 0 | 0 | 0 | 0 | NA | 282 | 0 | 0 | 0 | 0 | 0 | NA |
284 | 0 | 0 | 0 | 0 | 0 | NA | 702 | 0 | 0 | 0 | 0 | 0 | NA | 997 | 0 | 0 | 0 | 0 | 0 | NA | 989 | 0.01 | 0 | 0 | 0 | 0 | NA | 442 | 0 | 0 | 0 | 0 | 0 | NA | 297 | 0 | 0 | 0 | 0 | 0 | NA |
969 | 0 | 0 | 0 | 0 | 0 | NA | 600 | 0 | 0 | 0 | 0 | 0 | NA | 1215 | 0 | 0 | 0 | 0 | 0 | NA | 102 | 0.01 | 0 | 0 | 0 | 0 | NA | 605 | 0 | 0 | 0 | 0 | 0 | NA | 1719 | 0 | 0 | 0 | 0 | 0 | NA |
1115 | 0 | 0 | 0 | 0 | 0 | NA | 1406 | 0 | 0 | 0 | 0 | 0 | NA | 1447 | 0 | 0 | 0 | 0 | 0 | NA | 284 | 0 | 0 | 0 | 0 | 0 | NA | 444 | 0 | 0 | 0 | 0 | 0 | NA | 1093 | 0 | 0 | 0 | 0 | 0 | NA |
378 | 0 | 0 | 0 | 0 | 0 | NA | 689 | 0.02 | 0 | 0 | 0 | 0 | NA | 1159 | 0 | 0 | 0 | 0 | 0 | NA | 969 | 0.01 | 0 | 0 | 0 | 0 | NA | 448 | 0 | 0 | 0 | 0 | 0 | NA | 287 | 0 | 0 | 0 | 0 | 0 | NA |
1368 | 0.01 | 0 | 0 | 0 | 0 | NA | 690 | 0.01 | 0 | 0 | 0 | 0 | NA | 317 | 0 | 0 | 0 | 0 | 0 | NA | 951 | 0 | 0 | 0 | 0 | 0 | NA | 449 | 0 | 0 | 0 | 0 | 0 | NA | 272 | 0.02 | 0 | 0 | 0 | 0 | NA |
99 | 0 | 0 | 0 | 0 | 0 | NA | 566 | 0.02 | 0 | 0 | 0 | 0 | NA | 1458 | 0.01 | 0 | 0 | 0 | 0 | NA | 987 | 0 | 0 | 0 | 0 | 0 | NA | 618 | 0 | 0 | 0 | 0 | 0 | NA | 268 | 0 | 0 | 0 | 0 | 0 | NA |
302 | 0.01 | 0 | 0 | 0 | 0 | NA | 120 | 0 | 0 | 0 | 0 | 0 | NA | 877 | 0 | 0 | 0 | 0 | 0 | NA | 984 | 0 | 0 | 0 | 0 | 0 | NA | 617 | 0 | 0 | 0 | 0 | 0 | NA | 270 | 0.01 | 0 | 0 | 0 | 0 | NA |
295 | 0.01 | 0 | 0 | 0 | 0 | NA | 174 | 0 | 0 | 0 | 0 | 0 | NA | 1733 | 0 | 0 | 0 | 0 | 0 | NA | 49 | 0 | 0 | 0 | 0 | 0 | NA | 615 | 0 | 0 | 0 | 0 | 0 | NA | 295 | 0 | 0 | 0 | 0 | 0 | NA |
984 | 0 | 0 | 0 | 0 | 0 | NA | 77 | 0.01 | 0 | 0 | 0 | 0 | NA | 1444 | 0 | 0 | 0 | 0 | 0 | NA | 1534 | 0 | 0 | 0 | 0 | 0 | NA | 613 | 0 | 0 | 0 | 0 | 0 | NA | 856 | 0 | 0 | 0 | 0 | 0 | NA |
1000 | 0 | 0 | 0 | 0 | 0 | NA | 1883 | 0 | 0 | 0 | 0 | 0 | NA | 74 | 0 | 0 | 0 | 0 | 0 | NA | 295 | 0.02 | 0 | 0 | 0 | 0 | NA | 463 | 0.02 | 0.01 | 0 | 0 | 0 | 630.47 | 61 | 0 | 0 | 0 | 0 | 0 | NA |
1536 | 0 | 0 | 0 | 0 | 0 | NA | 295 | 0.01 | 0 | 0 | 0 | 0 | NA | 202 | 0.01 | 0 | 0 | 0 | 0 | NA | 369 | 0 | 0 | 0 | 0 | 0 | NA | 609 | 0 | 0 | 0 | 0 | 0 | NA | 1089 | 0 | 0 | 0 | 0 | 0 | NA |
1107 | 0 | 0 | 0 | 0 | 0 | NA | 601 | 0 | 0 | 0 | 0 | 0 | NA | 749 | 0 | 0 | 0 | 0 | 0 | NA | 1857 | 0 | 0 | 0 | 0 | 0 | NA | 606 | 0 | 0 | 0 | 0 | 0 | NA | 293 | 0 | 0 | 0 | 0 | 0 | NA |
2095 | 0 | 0 | 0 | 0 | 0 | NA | 296 | 0.01 | 0 | 0 | 0 | 0 | NA | 1737 | 0 | 0 | 0 | 0 | 0 | NA | 1115 | 0 | 0 | 0 | 0 | 0 | NA | 466 | 0.02 | 0 | 0 | 0 | 0 | NA | 291 | 0 | 0 | 0 | 0 | 0 | NA |
1555 | 0.01 | 0 | 0 | 0 | 0 | NA | 1168 | 0.01 | 0 | 0 | 0 | 0 | NA | 21 | 0 | 0 | 0 | 0 | 0 | NA | 2095 | 0 | 0 | 0 | 0 | 0 | NA | 598 | 0.01 | 0 | 0 | 0 | 0 | NA | 289 | 0 | 0 | 0 | 0 | 0 | NA |
1557 | 0.01 | 0 | 0 | 0 | 0 | NA | 599 | 0 | 0 | 0 | 0 | 0 | NA | 225 | 0 | 0 | 0 | 0 | 0 | NA | 1555 | 0 | 0 | 0 | 0 | 0 | NA | 43 | 0 | 0 | 0 | 0 | 0 | NA | 1191 | 0.01 | 0 | 0 | 0 | 0 | NA |
1857 | 0.01 | 0 | 0 | 0 | 0 | NA | 609 | 0 | 0 | 0 | 0 | 0 | NA | 1194 | 0 | 0 | 0 | 0 | 0 | NA | 1557 | 0.01 | 0 | 0 | 0 | 0 | NA | 619 | 0 | 0 | 0 | 0 | 0 | NA | 205 | 0 | 0 | 0 | 0 | 0 | NA |
285 | 0 | 0 | 0 | 0 | 0 | NA | 2736 | 0 | 0 | 0 | 0 | 0 | NA | 294 | 0.02 | 0 | 0 | 0 | 0 | NA | 378 | 0 | 0 | 0 | 0 | 0 | NA | 683 | 0.02 | 0 | 0 | 0 | 0 | NA | 301 | 0.01 | 0 | 0 | 0 | 0 | NA |
300 | 0.01 | 0 | 0 | 0 | 0 | NA | 1001 | 0.01 | 0 | 0 | 0 | 0 | NA | 1097 | 0.01 | 0 | 0 | 0 | 0 | NA | 48 | 0 | 0 | 0 | 0 | 0 | NA | 330 | 0.01 | 0 | 0 | 0 | 0 | NA | 77 | 0.01 | 0 | 0 | 0 | 0 | NA |
986 | 0 | 0 | 0 | 0 | 0 | NA | 591 | 0.01 | 0 | 0 | 0 | 0 | NA | 1103 | 0.01 | 0 | 0 | 0 | 0 | NA | 53 | 0 | 0 | 0 | 0 | 0 | NA | 331 | 0.01 | 0 | 0 | 0 | 0 | NA | 299 | 0 | 0 | 0 | 0 | 0 | NA |
109 | 0 | 0 | 0 | 0 | 0 | NA | 1242 | 0.01 | 0 | 0 | 0 | 0 | NA | 782 | 0 | 0 | 0 | 0 | 0 | NA | 109 | 0 | 0 | 0 | 0 | 0 | NA | 48 | 0 | 0 | 0 | 0 | 0 | NA | 924 | 0.01 | 0 | 0 | 0 | 0 | NA |
297 | 0.01 | 0 | 0 | 0 | 0 | NA | 1236 | 0.01 | 0 | 0 | 0 | 0 | NA | 303 | 0.01 | 0 | 0 | 0 | 0 | NA | 296 | 0.01 | 0 | 0 | 0 | 0 | NA | 607 | 0 | 0 | 0 | 0 | 0 | NA | 1083 | 0.01 | 0 | 0 | 0 | 0 | NA |
296 | 0.01 | 0 | 0 | 0 | 0 | NA | 968 | 0 | 0 | 0 | 0 | 0 | NA | 319 | 0 | 0 | 0 | 0 | 0 | NA | 285 | 0 | 0 | 0 | 0 | 0 | NA | 207 | 0 | 0 | 0 | 0 | 0 | NA | 59 | 0 | 0 | 0 | 0 | 0 | NA |
882 | 0 | 0 | 0 | 0 | 0 | NA | 880 | 0 | 0 | 0 | 0 | 0 | NA | 301 | 0.01 | 0 | 0 | 0 | 0 | NA | 110 | 0 | 0 | 0 | 0 | 0 | NA | 298 | 0 | 0 | 0 | 0 | 0 | NA | 764 | 0 | 0 | 0 | 0 | 0 | NA |
110 | 0 | 0 | 0 | 0 | 0 | NA | 119 | 0 | 0 | 0 | 0 | 0 | NA | 77 | 0.01 | 0 | 0 | 0 | 0 | NA | 1332 | 0 | 0 | 0 | 0 | 0 | NA | 297 | 0.01 | 0 | 0 | 0 | 0 | NA | 1428 | 0 | 0 | 0 | 0 | 0 | NA |
48 | 0 | 0 | 0 | 0 | 0 | NA | 169 | 0.01 | 0 | 0 | 0 | 0 | NA | 311 | 0 | 0 | 0 | 0 | 0 | NA | 1560 | 0.01 | 0 | 0 | 0 | 0 | NA | 223 | 0 | 0 | 0 | 0 | 0 | NA | 79 | 0 | 0 | 0 | 0 | 0 | NA |
885 | 0 | 0 | 0 | 0 | 0 | NA | 891 | 0.02 | 0 | 0 | 0 | 0 | NA | 309 | 0 | 0 | 0 | 0 | 0 | NA | 306 | 0 | 0 | 0 | 0 | 0 | NA | 399 | 0 | 0 | 0 | 0 | 0 | NA | 1431 | 0 | 0 | 0 | 0 | 0 | NA |
306 | 0 | 0 | 0 | 0 | 0 | NA | 1533 | 0.01 | 0 | 0 | 0 | 0 | NA | 1453 | 0.01 | 0 | 0 | 0 | 0 | NA | 300 | 0.01 | 0 | 0 | 0 | 0 | NA | 220 | 0 | 0 | 0 | 0 | 0 | NA | 220 | 0.01 | 0 | 0 | 0 | 0 | NA |
988 | 0 | 0 | 0 | 0 | 0 | NA | 616 | 0 | 0 | 0 | 0 | 0 | NA | 1989 | 0.02 | 0 | 0 | 0 | 0 | NA | 954 | 0 | 0 | 0 | 0 | 0 | NA | 345 | 0.01 | 0 | 0 | 0 | 0 | NA | 766 | 0 | 0 | 0 | 0 | 0 | NA |
1560 | 0.01 | 0 | 0 | 0 | 0 | NA | 615 | 0 | 0 | 0 | 0 | 0 | NA | 879 | 0 | 0 | 0 | 0 | 0 | NA | 988 | 0 | 0 | 0 | 0 | 0 | NA | 38 | 0.03 | 0 | 0 | 0 | 0 | NA | 1442 | 0.02 | 0 | 0 | 0 | 0 | NA |
1559 | 0.03 | 0 | 0 | 0 | 0 | NA | 117 | 0.01 | 0 | 0 | 0 | 0 | NA | 34 | 0 | 0 | 0 | 0 | 0 | NA | 104 | 0.01 | 0 | 0 | 0 | 0 | NA | 39 | 0.08 | 0 | 0 | 0 | 0 | NA | 1437 | 0 | 0 | 0 | 0 | 0 | NA |
104 | 0.01 | 0 | 0 | 0 | 0 | NA | 606 | 0 | 0 | 0 | 0 | 0 | NA | 891 | 0 | 0 | 0 | 0 | 0 | NA | 1238 | 0.01 | 0 | 0 | 0 | 0 | NA | 185 | 0 | 0 | 0 | 0 | 0 | NA | 222 | 0.01 | 0 | 0 | 0 | 0 | NA |
1238 | 0.02 | 0 | 0 | 0 | 0 | NA | 605 | 0 | 0 | 0 | 0 | 0 | NA | 229 | 0 | 0 | 0 | 0 | 0 | NA | 1232 | 0.01 | 0 | 0 | 0 | 0 | NA | 152 | 0 | 0 | 0 | 0 | 0 | NA | 1145 | 0 | 0 | 0 | 0 | 0 | NA |
1232 | 0.02 | 0 | 0 | 0 | 0 | NA | 592 | 0.02 | 0 | 0 | 0 | 0 | NA | 1455 | 0 | 0 | 0 | 0 | 0 | NA | 1368 | 0.01 | 0 | 0 | 0 | 0 | NA | 443 | 0 | 0 | 0 | 0 | 0 | NA | 2247 | 1.44 | 1.23 | 1.09 | 100 | 50 | 0.26 |
954 | 0 | 0 | 0 | 0 | 0 | NA | 112 | 0 | 0 | 0 | 0 | 0 | NA | 878 | 0 | 0 | 0 | 0 | 0 | NA | 882 | 0 | 0 | 0 | 0 | 0 | NA | 442 | 0 | 0 | 0 | 0 | 0 | NA | 216 | 0.02 | 0 | 0 | 0 | 0 | NA |
106 | 0 | 0 | 0 | 0 | 0 | NA | 172 | 0.01 | 0 | 0 | 0 | 0 | NA | 321 | 0.01 | 0 | 0 | 0 | 0 | NA | 106 | 0 | 0 | 0 | 0 | 0 | NA | 130 | 0 | 0 | 0 | 0 | 0 | NA | 860 | 0 | 0 | 0 | 0 | 0 | NA |
281 | 0 | 0 | 0 | 0 | 0 | NA | 1887 | 0.02 | 0 | 0 | 0 | 0 | NA | 290 | 0.01 | 0 | 0 | 0 | 0 | NA | 281 | 0.01 | 0 | 0 | 0 | 0 | NA | 150 | 0 | 0 | 0 | 0 | 0 | NA | 207 | 0.01 | 0 | 0 | 0 | 0 | NA |
869 | 0.03 | 0 | 0 | 0 | 0 | NA | 2290 | 0.01 | 0 | 0 | 0 | 0 | NA | 85 | 0 | 0 | 0 | 0 | 0 | NA | 869 | 0.03 | 0 | 0 | 0 | 0 | NA | 74 | 0.03 | 0 | 0 | 0 | 0 | NA | |||||||
1859 | 0 | 0 | 0 | 0 | 0 | NA | 880 | 0.01 | 0 | 0 | 0 | 0 | NA | 1559 | 0.04 | 0 | 0 | 0 | 0 | NA | 1226 | 0.01 | 0 | 0 | 0 | 0 | NA | ||||||||||||||
52 | 0 | 0 | 0 | 0 | 0 | NA | 849 | 0 | 0 | 0 | 0 | 0 | NA | 1859 | 0.01 | 0 | 0 | 0 | 0 | NA | 1439 | 0.01 | 0 | 0 | 0 | 0 | NA | ||||||||||||||
1369 | 0 | 0 | 0 | 0 | 0 | NA | 860 | 0 | 0 | 0 | 0 | 0 | NA | 302 | 0 | 0 | 0 | 0 | 0 | NA | 75 | 0.02 | 0 | 0 | 0 | 0 | NA | ||||||||||||||
2091 | 0.01 | 0 | 0 | 0 | 0 | NA | 1457 | 0.09 | 0 | 0 | 0 | 0 | NA | 2091 | 0.01 | 0 | 0 | 0 | 0 | NA | 52 | 0 | 0 | 0 | 0 | 0 | NA | ||||||||||||||
855 | 0 | 0 | 0 | 0 | 0 | NA | 1208 | 0 | 0 | 0 | 0 | 0 | NA | 855 | 0 | 0 | 0 | 0 | 0 | NA | 249 | 0.03 | 0.01 | 0 | 0 | 0 | 270.71 | ||||||||||||||
331 | 0 | 0 | 0 | 0 | 0 | NA | 236 | 0.01 | 0 | 0 | 0 | 0 | NA | 1369 | 0.01 | 0 | 0 | 0 | 0 | NA | 225 | 0.01 | 0 | 0 | 0 | 0 | NA | ||||||||||||||
57 | 0 | 0 | 0 | 0 | 0 | NA | 239 | 0.01 | 0 | 0 | 0 | 0 | NA | 1126 | 0 | 0 | 0 | 0 | 0 | NA | 861 | 0.01 | 0 | 0 | 0 | 0 | NA | ||||||||||||||
846 | 0 | 0 | 0 | 0 | 0 | NA | 57 | 0 | 0 | 0 | 0 | 0 | NA | 859 | 0.01 | 0 | 0 | 0 | 0 | NA | |||||||||||||||||||||
751 | 0 | 0 | 0 | 0 | 0 | NA | 298 | 0.04 | 0.01 | 0 | 0 | 0 | 205.26 | ||||||||||||||||||||||||||||
942 | 0.01 | 0 | 0 | 0 | 0 | NA | 841 | 0.01 | 0 | 0 | 0 | 0 | NA | ||||||||||||||||||||||||||||
81 | 0 | 0 | 0 | 0 | 0 | NA | 215 | 0.02 | 0 | 0 | 0 | 0 | NA | ||||||||||||||||||||||||||||
79 | 0.01 | 0 | 0 | 0 | 0 | NA | 827 | 0 | 0 | 0 | 0 | 0 | NA |
inner_product.hpp: 82 - 3.89%
Runs, left to right (seven columns per run): orig_HBM_CACHE | gcc_1_HBM_CACHE | icx_5_HBM_CACHE | orig_DDR | gcc_11_DDR | icx_7_DDR
ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
985 | 0.13 | 0.08 | 0.1 | 100 | 50 | 193.41 | 999 | 0.04 | 0.02 | 0.02 | 100 | 50 | 204.21 | 783 | 0.37 | 0.28 | 0.35 | 100 | 50 | 297.69 | 985 | 0.16 | 0.11 | 0.1 | 100 | 50 | 186.17 | 730 | 0.15 | 0.1 | 0.09 | 100 | 50 | 209.73 | 765 | 0.46 | 0.36 | 0.32 | 100 | 50 | 234.25 |
883 | 0.37 | 0.29 | 0.36 | 100 | 50 | 290.63 | 982 | 0.22 | 0.18 | 0.18 | 100 | 50 | 117.49 | 876 | 0.11 | 0.07 | 0.08 | 100 | 50 | 287.81 | 883 | 0.46 | 0.36 | 0.31 | 100 | 50 | 234.08 | 728 | 0.44 | 0.29 | 0.24 | 100 | 50 | 72.52 | 849 | 0.07 | 0.03 | 0.03 | 100 | 50 | 139.74 |
996 | 0.23 | 0.17 | 0.2 | 100 | 50 | 123.02 | 972 | 0.11 | 0.07 | 0.07 | 100 | 50 | 295.41 | 887 | 0.24 | 0.16 | 0.2 | 100 | 50 | 130.97 | 996 | 0.44 | 0.3 | 0.26 | 100 | 50 | 70.12 | 41 | 0.44 | 0.36 | 0.3 | 100 | 50 | 232.96 | 857 | 0.16 | 0.1 | 0.09 | 100 | 50 | 210.05 |
977 | 0.04 | 0.02 | 0.02 | 100 | 50 | 210.76 | 976 | 0.33 | 0.24 | 0.25 | 100 | 50 | 347.87 | 868 | 0.03 | 0.02 | 0.02 | 100 | 50 | 210.33 | 977 | 0.08 | 0.04 | 0.03 | 100 | 50 | 106.99 | 733 | 0.06 | 0.03 | 0.02 | 100 | 50 | 140.67 | 868 | 0.46 | 0.28 | 0.25 | 100 | 50 | 75.39 |
MultiBsplineRef.hpp: 276 - 3.7%
Runs, left to right (seven columns per run): orig_HBM_CACHE | gcc_1_HBM_CACHE | icx_5_HBM_CACHE | orig_DDR | gcc_11_DDR | icx_7_DDR
ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
877 | 0.46 | 0.37 | 0.46 | 100 | 50 | 190.73 | 885 | 2.01 | 1.14 | 1.15 | 0 | 12.5 | 33.54 | 772 | 0.46 | 0.32 | 0.4 | 100 | 50 | 291.47 | 877 | 0.55 | 0.42 | 0.37 | 100 | 50 | 273.62 | 677 | 1.47 | 1.18 | 1 | 0 | 12.5 | 119.33 | 753 | 0.45 | 0.36 | 0.32 | 100 | 50 | 305.85 |
BsplineFunctor.h: 291 - 3.19%
Runs, left to right (seven columns per run): orig_HBM_CACHE | gcc_1_HBM_CACHE | icx_5_HBM_CACHE | orig_DDR | gcc_11_DDR | icx_7_DDR
ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
332 | 0.56 | 0.44 | 0.54 | 86.96 | 44.57 | 0.35 | 565 | 0.6 | 0.52 | 0.53 | 0 | 9.38 | 0.07 | 271 | 0.51 | 0.38 | 0.48 | 83.48 | 42.77 | 0.26 | 332 | 0.73 | 0.59 | 0.52 | 86.96 | 44.57 | 0.44 | 485 | 0.34 | 0.26 | 0.22 | 0 | 9.38 | 0.18 | 248 | 0.75 | 0.62 | 0.55 | 0 | 9.94 | 0.03 |
590 | 0.06 | 0.04 | 0.04 | 0 | 9.38 | 0.26 | 551 | 0.4 | 0.33 | 0.28 | 0 | 9.38 | 0.1 | ||||||||||||||||||||||||||||
465 | 0.07 | 0.03 | 0.03 | 0 | 9.38 | 0.6 |
TwoBodyJastrowRef.h: 324 - 1.88%
Runs, left to right (seven columns per run): orig_HBM_CACHE | gcc_1_HBM_CACHE | icx_5_HBM_CACHE | orig_DDR | gcc_11_DDR | icx_7_DDR
ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
363 | 0.28 | 0.21 | 0.26 | 100 | 50 | 322.05 | 571 | 0.38 | 0.35 | 0.35 | 0 | 12.5 | 90.59 | 299 | 0.27 | 0.19 | 0.24 | 100 | 50 | 428.97 | 363 | 0.45 | 0.37 | 0.33 | 100 | 50 | 257.2 | 490 | 0.55 | 0.45 | 0.38 | 0 | 12.5 | 198.77 | 278 | 0.47 | 0.36 | 0.32 | 100 | 50 | 263.08 |
inner_product.hpp: 211 - 1.19%
Runs, left to right (seven columns per run): orig_HBM_CACHE | gcc_1_HBM_CACHE | icx_5_HBM_CACHE | orig_DDR | gcc_11_DDR | icx_7_DDR
ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
962 | 0.16 | 0.12 | 0.14 | 85.71 | 41.07 | 0 | 996 | 0.22 | 0.23 | 0.24 | 33.33 | 16.67 | 0 | 854 | 0.06 | 0.06 | 0.07 | 85.71 | 41.07 | 0 | 962 | 0.22 | 0.15 | 0.13 | 85.71 | 41.07 | 0 | 695 | 0.35 | 0.27 | 0.23 | 0 | 12.5 | 0 | 835 | 0.32 | 0.15 | 0.13 | 66.67 | 31.25 | 0 |
961 | 0.16 | 0.11 | 0.14 | 85.71 | 41.07 | 0 | 961 | 0.19 | 0.13 | 0.11 | 85.71 | 41.07 | 0 |
TwoBodyJastrowRef.h: 381 - 0.4%
Run orig_HBM_CACHE | Run gcc_1_HBM_CACHE | Run icx_5_HBM_CACHE | Run orig_DDR | Run gcc_11_DDR | Run icx_7_DDR
ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
381 | 0.05 | 0.02 | 0.03 | 100 | 50 | 204.06 | 587 | 0.11 | 0.07 | 0.07 | 100 | 50 | 178.57 | 310 | 0.05 | 0.02 | 0.02 | 100 | 50 | 208.68 | 381 | 0.06 | 0.02 | 0.02 | 100 | 50 | 214.96 | 462 | 0.11 | 0.06 | 0.05 | 100 | 50 | 207.86 | 294 | 0.05 | 0.02 | 0.02 | 100 | 50 | 205.98 |
383 | 0.05 | 0.02 | 0.03 | 100 | 50 | 206.76 | | | | | | | | 314 | 0.05 | 0.02 | 0.03 | 100 | 50 | 208.96 | 383 | 0.05 | 0.02 | 0.02 | 100 | 50 | 209.88 | | | | | | | | 292 | 0.06 | 0.02 | 0.02 | 100 | 50 | 212.93
379 | 0.05 | 0.02 | 0.03 | 100 | 50 | 209.43 | | | | | | | | 312 | 0.04 | 0.02 | 0.02 | 100 | 50 | 215.31 | 379 | 0.05 | 0.02 | 0.02 | 100 | 50 | 209.71 | | | | | | | | 290 | 0.05 | 0.02 | 0.02 | 100 | 50 | 209.93
TwoBodyJastrowRef.h: 388 - 0.09%
Run orig_HBM_CACHE | Run gcc_1_HBM_CACHE | Run icx_5_HBM_CACHE | Run orig_DDR | Run gcc_11_DDR | Run icx_7_DDR
ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
377 | 0.04 | 0.02 | 0.02 | 100 | 50 | 430.14 | 586 | 0.04 | 0.02 | 0.02 | 100 | 50 | 417.19 | 308 | 0.03 | 0.01 | 0.02 | 100 | 50 | 840.08 | 377 | 0.04 | 0.02 | 0.01 | 100 | 50 | 416.69 | 461 | 0.04 | 0.02 | 0.01 | 100 | 50 | 425.76 | 288 | 0.03 | 0.02 | 0.01 | 100 | 50 | 422.51 |
OneBodyJastrowRef.h: 214 - 0.06%
Run orig_HBM_CACHE | Run gcc_1_HBM_CACHE | Run icx_5_HBM_CACHE | Run orig_DDR | Run gcc_11_DDR | Run icx_7_DDR
ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Assembly Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
329 | 0.04 | 0.01 | 0.01 | 0 | 11.11 | 2.2 | 217 | 0.02 | 0.01 | 0.01 | 0 | 12.5 | 0.2 | 268 | 0.03 | 0.01 | 0.01 | 0 | 11.61 | 0.8 | 329 | 0.03 | 0.01 | 0.01 | 0 | 11.11 | 1.8 | 596 | 0.04 | 0.01 | 0.01 | 0 | 12.5 | 0.25 | 244 | 0.03 | 0.01 | 0.01 | 0 | 11.61 | 1.5 |