Run G++ O3 + Vectorize + Funroll + Ffastmath | Run ACFL O3 + Vectorize + Funroll + Ffastmath |
- /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 58-67 | - /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 58-67 |
ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads (s) | Nb Threads | Deviation (coverage) | Deviation (time) | GFLOP/s | ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads (s) | Nb Threads | Deviation (coverage) | Deviation (time) | GFLOP/s |
10 | 90.88 | 9.68 | 1.60 | 64 | 10.59 | 0.05 | 0.23 | 13 | 27.92 | 2.93 | 1.52 | 64 | 1.81 | 0.01 | 23.76 |
Run G++ O3 + Vectorize + Funroll + Ffastmath | Run ACFL O3 + Vectorize + Funroll + Ffastmath |
ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads (s) | Nb Threads | Deviation (coverage) | Deviation (time) | GFLOP/s | ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads (s) | Nb Threads | Deviation (coverage) | Deviation (time) | GFLOP/s |
647 | 0.00 | 0.00 | 0.00 | 1 | 0.00 | 0.00 | 0.00 | 1 | 0.16 | 0.02 | 0.03 | 55 | 0.11 | 0.01 | 0.00 |
330 | 0.00 | 0.00 | 0.00 | 1 | 0.00 | 0.00 | 0.00 | 528 | 3.83 | 0.40 | 0.42 | 64 | 0.85 | 0.04 | 0.00 |
276 | 0.20 | 0.02 | 0.01 | 62 | 0.10 | 0.00 | 0.00 | 1259 | 0.00 | 0.00 | 0.00 | 1 | 0.00 | 0.00 | 0.00 |
280 | 0.17 | 0.02 | 0.01 | 57 | 0.11 | 0.00 | 0.00 | 1281 | 0.06 | 0.01 | 0.02 | 27 | 0.06 | 0.00 | 0.00 |
-1 | 0.00 | 0.00 | 0.00 | 12 | 0.00 | 0.00 | NA | 1241 | 0.07 | 0.01 | 0.02 | 32 | 0.07 | 0.00 | 0.00 |
| | | | | | | | 854 | 62.23 | 6.53 | 3.62 | 64 | 8.09 | 0.43 | 0.00 |
| | | | | | | | 731 | 0.00 | 0.00 | 0.00 | 1 | 0.00 | 0.00 | 0.00 |
| | | | | | | | 1137 | 2.63 | 0.28 | 0.23 | 63 | 0.48 | 0.03 | 0.00 |
| | | | | | | | -1 | 0.48 | 0.05 | 0.06 | 63 | 0.22 | 0.01 | 0.00 |
| | | | | | | | -1 | 0.00 | 0.00 | 0.00 | 49 | 0.00 | 0.00 | NA |
Run G++ O3 + Vectorize + Funroll + Ffastmath | Run ACFL O3 + Vectorize + Funroll + Ffastmath |
- /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 55-58; - /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 73-96 | - /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 55-58; - /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 73-96 |
ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads (s) | Nb Threads | Deviation (coverage) | Deviation (time) | GFLOP/s | ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads (s) | Nb Threads | Deviation (coverage) | Deviation (time) | GFLOP/s |
14 | 8.75 | 0.93 | 9.06 | 1 | 0.00 | 0.00 | 3.99 | 12 | 2.61 | 0.27 | 8.98 | 1 | 0.00 | 0.00 | 8.10 |
Legend: G++ = run G++ O3 + Vectorize + Funroll + Ffastmath; ACFL = run ACFL O3 + Vectorize + Funroll + Ffastmath.
Name | Module | Coverage (%) G++ | Coverage (%) ACFL | Inclusive Time w.r.t. Wall Time (s) G++ | Inclusive Time w.r.t. Wall Time (s) ACFL | Max Inc. Time over Threads (s) G++ | Max Inc. Time over Threads (s) ACFL | Nb Threads G++ | Nb Threads ACFL | GFLOP/s G++ | GFLOP/s ACFL | Deviation (coverage) G++ | Deviation (coverage) ACFL | Deviation (time) G++ | Deviation (time) ACFL |
k_means(int, point_t*, point_t*, int*, point_t*, int, int) [clone ._omp_fn.0] | binary | 90.88 | NA | 9.68 | NA | 1.60 | NA | 64 | NA | 0.23 | NA | 10.59 | NA | 0.05 | NA |
kmp_flag_64<false, true>::wait(kmp_info*, int, void*) | libomp.so | NA | 62.23 | NA | 6.53 | NA | 3.62 | NA | 64 | NA | 0.00 | NA | 8.09 | NA | 0.43 |
k_means(int, point_t*, point_t*, int*, point_t*, int, int) [clone .omp_outlined] | binary | NA | 27.92 | NA | 2.93 | NA | 1.52 | NA | 64 | NA | 23.76 | NA | 1.81 | NA | 0.01 |
k_means(int, point_t*, point_t*, int*, point_t*, int, int) | binary | 8.75 | 2.61 | 0.93 | 0.27 | 9.06 | 8.98 | 1 | 1 | 3.99 | 8.10 | 0.00 | 0.00 | 0.00 | 0.00 |
kmp_flag_native<unsigned long long, (flag_type)1, true>::notdone_check() | libomp.so | NA | 3.83 | NA | 0.40 | NA | 0.42 | NA | 64 | NA | 0.00 | NA | 0.85 | NA | 0.04 |
__sched_yield | libc.so.6 | NA | 2.63 | NA | 0.28 | NA | 0.23 | NA | 63 | NA | 0.00 | NA | 0.48 | NA | 0.03 |
unknown_function | [vdso] | NA | 0.48 | NA | 0.05 | NA | 0.06 | NA | 63 | NA | 0.00 | NA | 0.22 | NA | 0.01 |
gomp_barrier_wait_end | libgomp.so.1.0.0 | 0.20 | NA | 0.02 | NA | 0.01 | NA | 62 | NA | 0.00 | NA | 0.10 | NA | 0.00 | NA |
gomp_team_barrier_wait_end | libgomp.so.1.0.0 | 0.17 | NA | 0.02 | NA | 0.01 | NA | 57 | NA | 0.00 | NA | 0.11 | NA | 0.00 | NA |
@plt_start@ | libomp.so | NA | 0.16 | NA | 0.02 | NA | 0.03 | NA | 55 | NA | 0.00 | NA | 0.11 | NA | 0.01 |
__kmp_yield | libomp.so | NA | 0.07 | NA | 0.01 | NA | 0.02 | NA | 32 | NA | 0.00 | NA | 0.07 | NA | 0.00 |
__kmp_now_nsec | libomp.so | NA | 0.06 | NA | 0.01 | NA | 0.02 | NA | 27 | NA | 0.00 | NA | 0.06 | NA | 0.00 |
__kmp_suspend_initialize_thread | libomp.so | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 |
__kmp_finish_implicit_task | libomp.so | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 |
__xpg_basename | libc.so.6 | 0.00 | NA | 0.00 | NA | 0.00 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA |
pthread_condattr_setpshared | libc.so.6 | 0.00 | NA | 0.00 | NA | 0.00 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA |
unknown_kernel_region | kernel | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 12 | 49 | NA | NA | 0.00 | 0.00 | 0.00 | 0.00 |
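
As a rough reading aid for the function table above, the following sketch sums the Coverage (%) values per run, separating the k_means entries from everything else (OpenMP runtime, libc, vdso). The grouping by name prefix, the shortened dictionary keys, and the helper name `split_coverage` are assumptions of this sketch, not part of the report. The numbers are copied from the table; entries reported as 0.00 or NA are omitted.

```python
# Coverage (%) per function, copied from the function table above.
# Keys are shortened function names (an assumption made for readability).
gxx = {
    "k_means [clone ._omp_fn.0]": 90.88,
    "k_means": 8.75,
    "gomp_barrier_wait_end": 0.20,
    "gomp_team_barrier_wait_end": 0.17,
}
acfl = {
    "k_means [clone .omp_outlined]": 27.92,
    "k_means": 2.61,
    "kmp_flag_64::wait": 62.23,
    "kmp_flag_native::notdone_check": 3.83,
    "__sched_yield": 2.63,
    "unknown_function [vdso]": 0.48,
    "@plt_start@": 0.16,
    "__kmp_yield": 0.07,
    "__kmp_now_nsec": 0.06,
}


def split_coverage(coverage):
    """Return (k_means coverage, all other coverage), rounded to 2 decimals.

    Hypothetical helper: classifies entries purely by the "k_means" name prefix.
    """
    in_kmeans = sum(c for name, c in coverage.items() if name.startswith("k_means"))
    elsewhere = sum(c for name, c in coverage.items() if not name.startswith("k_means"))
    return round(in_kmeans, 2), round(elsewhere, 2)


if __name__ == "__main__":
    for label, run in (("G++", gxx), ("ACFL", acfl)):
        in_kmeans, elsewhere = split_coverage(run)
        print(f"{label}: k_means {in_kmeans:.2f} %, other {elsewhere:.2f} %")
```

Splitting by name prefix is deliberately crude: it only restates what the table already reports and does not attribute the non-k_means time to any particular cause.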