Run | Source Lines | ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads (s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s |
G++ O3 + Vectorize + Funroll + Ffastmath | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 70-79 | 7 | 92.26 | 16.53 | 11.08 | 10 | 12.28 | 0.13 | 45.35 |
Clang++ O3 + Ffastmath | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 70-79 | 9 | 71.53 | 14.81 | 12.89 | 10 | 3.33 | 0.00 | 42.19 |
ICPX O3 + More Aggressive Flags | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 70-82 | 11 | 65.73 | 12.07 | 10.26 | 10 | 4.31 | 0.15 | 51.74 |

Run | ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads (s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s |
G++ O3 + Vectorize + Funroll + Ffastmath | 336 | 0.75 | 0.13 | 0.12 | 10 | 0.09 | 0.01 | 0.00 |
G++ O3 + Vectorize + Funroll + Ffastmath | 332 | 0.90 | 0.16 | 0.12 | 9 | 0.05 | 0.01 | 0.00 |
G++ O3 + Vectorize + Funroll + Ffastmath | -1 | 0.02 | 0.00 | 0.01 | 4 | 0.04 | 0.00 | 6.60 |
Clang++ O3 + Ffastmath | 1164 | 0.02 | 0.00 | 0.01 | 5 | 0.01 | 0.00 | 0.00 |
Clang++ O3 + Ffastmath | 700 | 0.02 | 0.00 | 0.01 | 7 | 0.01 | 0.00 | 0.00 |
Clang++ O3 + Ffastmath | 651 | 18.92 | 3.92 | 3.89 | 9 | 0.38 | 0.07 | 0.00 |
Clang++ O3 + Ffastmath | 1055 | 5.13 | 1.06 | 1.13 | 10 | 1.86 | 0.33 | 0.00 |
Clang++ O3 + Ffastmath | -1 | 0.04 | 0.01 | 0.01 | 8 | 0.02 | 0.00 | 0.00 |
ICPX O3 + More Aggressive Flags | 1164 | 0.04 | 0.01 | 0.01 | 7 | 0.03 | 0.00 | 0.00 |
ICPX O3 + More Aggressive Flags | 1106 | 0.60 | 0.11 | 0.15 | 10 | 0.27 | 0.04 | 0.00 |
ICPX O3 + More Aggressive Flags | 1163 | 0.00 | 0.00 | 0.00 | 1 | 0.00 | 0.00 | 0.00 |
ICPX O3 + More Aggressive Flags | 2813 | 0.02 | 0.00 | 0.00 | 6 | 0.00 | 0.00 | 0.00 |
ICPX O3 + More Aggressive Flags | 1102 | 28.47 | 5.23 | 5.08 | 10 | 8.94 | 1.29 | 0.00 |
ICPX O3 + More Aggressive Flags | -1 | 0.04 | 0.01 | 0.02 | 8 | 0.05 | 0.01 | 0.00 |

Run | Source Lines | ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads (s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s |
G++ O3 + Vectorize + Funroll + Ffastmath | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 67-70; /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 85-108 | 12 | 6.07 | 1.09 | 7.18 | 1 | 0.00 | 0.00 | 4.59 |
Clang++ O3 + Ffastmath | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 67-70; /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 85-108 | 8 | 4.34 | 0.90 | 7.81 | 1 | 0.00 | 0.00 | 5.56 |
ICPX O3 + More Aggressive Flags | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 67-71; /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 85-108 | 9 | 5.09 | 0.94 | 7.76 | 1 | 0.00 | 0.00 | 5.35 |

Name | Module | Run | Coverage (%) | Inclusive Time w.r.t. Wall Time (s) | Max Inc. Time over Threads (s) | Nb Threads | GFLOP/s | Deviation (coverage) | Deviation (time) |
k_means(int, point_t*, point_t*, int*, point_t*, int, int) [clone ._omp_fn.0] | binary | G++ O3 + Vectorize + Funroll + Ffastmath | 92.26 | 16.53 | 11.08 | 10 | 45.35 | 12.28 | 0.13 |
k_means(int, point_t*, point_t*, int*, point_t*, int, int) [clone .omp_outlined] | binary | Clang++ O3 + Ffastmath | 71.53 | 14.81 | 12.89 | 10 | 42.19 | 3.33 | 0.00 |
k_means(int, point_t*, point_t*, int*, point_t*, int, int) [clone .extracted] | binary | ICPX O3 + More Aggressive Flags | 65.73 | 12.07 | 10.26 | 10 | 51.74 | 4.31 | 0.15 |
kmp_flag_64<false, true>::wait(kmp_info*, int, void*) | libiomp5.so | ICPX O3 + More Aggressive Flags | 28.47 | 5.23 | 5.08 | 10 | 0.00 | 8.94 | 1.29 |
__kmpc_threadprivate_register_vec | libomp.so | Clang++ O3 + Ffastmath | 18.92 | 3.92 | 3.89 | 9 | 0.00 | 0.38 | 0.07 |
k_means(int, point_t*, point_t*, int*, point_t*, int, int) | binary | G++ O3 + Vectorize + Funroll + Ffastmath | 6.07 | 1.09 | 7.18 | 1 | 4.59 | 0.00 | 0.00 |
k_means(int, point_t*, point_t*, int*, point_t*, int, int) | binary | Clang++ O3 + Ffastmath | 4.34 | 0.90 | 7.81 | 1 | 5.56 | 0.00 | 0.00 |
k_means(int, point_t*, point_t*, int*, point_t*, int, int) | binary | ICPX O3 + More Aggressive Flags | 5.09 | 0.94 | 7.76 | 1 | 5.35 | 0.00 | 0.00 |
__kmp_invoke_microtask | libomp.so | Clang++ O3 + Ffastmath | 5.13 | 1.06 | 1.13 | 10 | 0.00 | 1.86 | 0.33 |
gomp_barrier_wait_end | libgomp.so.1.0.0 | G++ O3 + Vectorize + Funroll + Ffastmath | 0.90 | 0.16 | 0.12 | 9 | 0.00 | 0.05 | 0.01 |
gomp_team_barrier_wait_end | libgomp.so.1.0.0 | G++ O3 + Vectorize + Funroll + Ffastmath | 0.75 | 0.13 | 0.12 | 10 | 0.00 | 0.09 | 0.01 |
kmp_flag_native<unsigned long long, (flag_type)1, true>::notdone_check() | libiomp5.so | ICPX O3 + More Aggressive Flags | 0.60 | 0.11 | 0.15 | 10 | 0.00 | 0.27 | 0.04 |
unknown_kernel_region | kernel | G++ O3 + Vectorize + Funroll + Ffastmath | 0.02 | 0.00 | 0.01 | 4 | 6.60 | 0.04 | 0.00 |
unknown_kernel_region | kernel | Clang++ O3 + Ffastmath | 0.04 | 0.01 | 0.01 | 8 | 0.00 | 0.02 | 0.00 |
unknown_kernel_region | kernel | ICPX O3 + More Aggressive Flags | 0.04 | 0.01 | 0.02 | 8 | 0.00 | 0.05 | 0.01 |
__sched_yield | libc.so.6 | Clang++ O3 + Ffastmath | 0.02 | 0.00 | 0.01 | 5 | 0.00 | 0.01 | 0.00 |
__sched_yield | libc.so.6 | ICPX O3 + More Aggressive Flags | 0.04 | 0.01 | 0.01 | 7 | 0.00 | 0.03 | 0.00 |
__kmp_reap_worker | libomp.so | Clang++ O3 + Ffastmath | 0.02 | 0.00 | 0.01 | 7 | 0.00 | 0.01 | 0.00 |
__kmp_yield | libiomp5.so | ICPX O3 + More Aggressive Flags | 0.02 | 0.00 | 0.00 | 6 | 0.00 | 0.00 | 0.00 |
__kmpc_for_static_fini | libiomp5.so | ICPX O3 + More Aggressive Flags | 0.00 | 0.00 | 0.00 | 1 | 0.00 | 0.00 | 0.00 |