| Run G++ O3 + Vectorize + Funroll + Ffastmath | Run ACFL O3 + Vectorize + Funroll + Ffastmath |
| /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 58-67 | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 58-67 |
| ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads(s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s | ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads(s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s |
| 10 | 90.88 | 9.68 | 1.60 | 64 | 10.59 | 0.05 | 0.23 | 13 | 27.77 | 2.96 | 1.50 | 64 | 1.79 | 0.00 | 0.75 |
| Run G++ O3 + Vectorize + Funroll + Ffastmath | Run ACFL O3 + Vectorize + Funroll + Ffastmath |
| ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads(s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s | ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads(s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s |
| 647 | 0.00 | 0.00 | 0.00 | 1 | 0.00 | 0.00 | 0.00 | 1 | 0.18 | 0.02 | 0.02 | 63 | 0.06 | 0.00 | 0.00 |
| 330 | 0.00 | 0.00 | 0.00 | 1 | 0.00 | 0.00 | 0.00 | 528 | 5.25 | 0.56 | 0.41 | 64 | 0.96 | 0.05 | 0.00 |
| 276 | 0.20 | 0.02 | 0.01 | 62 | 0.10 | 0.00 | 0.00 | 1281 | 0.06 | 0.01 | 0.01 | 62 | 0.03 | 0.00 | 0.00 |
| 280 | 0.17 | 0.02 | 0.01 | 57 | 0.11 | 0.00 | 0.00 | 1241 | 0.09 | 0.01 | 0.01 | 62 | 0.05 | 0.00 | 0.00 |
| -1 | 0.00 | 0.00 | 0.00 | 12 | 0.00 | 0.00 | NA | 854 | 60.88 | 6.49 | 3.45 | 64 | 7.89 | 0.42 | 0.00 |
| | | | | | | | | 730 | 0.00 | 0.00 | 0.00 | 1 | 0.00 | 0.00 | 0.00 |
| | | | | | | | | 10 | 0.00 | 0.00 | 0.00 | 1 | 0.00 | 0.00 | 0.00 |
| | | | | | | | | 1137 | 2.66 | 0.28 | 0.18 | 63 | 0.25 | 0.01 | 0.00 |
| | | | | | | | | 789 | 0.00 | 0.00 | 0.00 | 1 | 0.00 | 0.00 | 0.00 |
| | | | | | | | | -1 | 0.44 | 0.05 | 0.04 | 63 | 0.10 | 0.01 | 0.00 |
| | | | | | | | | -1 | 0.00 | 0.00 | 0.00 | 13 | 0.00 | 0.00 | NA |
| Run G++ O3 + Vectorize + Funroll + Ffastmath | Run ACFL O3 + Vectorize + Funroll + Ffastmath |
| /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 55-58; /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 73-96 | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 55-58; /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 73-96 |
| ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads(s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s | ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads(s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s |
| 14 | 8.75 | 0.93 | 9.06 | 1 | 0.00 | 0.00 | 3.99 | 12 | 2.67 | 0.28 | 9.16 | 1 | 0.00 | 0.00 | 7.24 |
| Name | Module | Coverage (%) | | Inclusive Time w.r.t. Wall Time(s) | | Max Inc. Time over Threads(s) | | Nb Threads | | GFLOP/s | | Deviation (coverage) | | Deviation (time) | |
| | | G++ O3 + Vectorize + Funroll + Ffastmath | ACFL O3 + Vectorize + Funroll + Ffastmath | G++ O3 + Vectorize + Funroll + Ffastmath | ACFL O3 + Vectorize + Funroll + Ffastmath | G++ O3 + Vectorize + Funroll + Ffastmath | ACFL O3 + Vectorize + Funroll + Ffastmath | G++ O3 + Vectorize + Funroll + Ffastmath | ACFL O3 + Vectorize + Funroll + Ffastmath | G++ O3 + Vectorize + Funroll + Ffastmath | ACFL O3 + Vectorize + Funroll + Ffastmath | G++ O3 + Vectorize + Funroll + Ffastmath | ACFL O3 + Vectorize + Funroll + Ffastmath | G++ O3 + Vectorize + Funroll + Ffastmath | ACFL O3 + Vectorize + Funroll + Ffastmath |
| k_means(int, point_t*, point_t*, int*, point_t*, int, int) [clone ._omp_fn.0] | binary | 90.88 | NA | 9.68 | NA | 1.60 | NA | 64 | NA | 0.23 | NA | 10.59 | NA | 0.05 | NA |
| kmp_flag_64<false, true>::wait(kmp_info*, int, void*) | libomp.so | NA | 60.88 | NA | 6.49 | NA | 3.45 | NA | 64 | NA | 0.00 | NA | 7.89 | NA | 0.42 |
| k_means(int, point_t*, point_t*, int*, point_t*, int, int) [clone .omp_outlined] | binary | NA | 27.77 | NA | 2.96 | NA | 1.50 | NA | 64 | NA | 0.75 | NA | 1.79 | NA | 0.00 |
| k_means(int, point_t*, point_t*, int*, point_t*, int, int) | binary | 8.75 | 2.67 | 0.93 | 0.28 | 9.06 | 9.16 | 1 | 1 | 3.99 | 7.24 | 0.00 | 0.00 | 0.00 | 0.00 |
| kmp_flag_native<unsigned long long, (flag_type)1, true>::notdone_check() | libomp.so | NA | 5.25 | NA | 0.56 | NA | 0.41 | NA | 64 | NA | 0.00 | NA | 0.96 | NA | 0.05 |
| __sched_yield | libc.so.6 | NA | 2.66 | NA | 0.28 | NA | 0.18 | NA | 63 | NA | 0.00 | NA | 0.25 | NA | 0.01 |
| unknown_function | [vdso] | NA | 0.44 | NA | 0.05 | NA | 0.04 | NA | 63 | NA | 0.00 | NA | 0.10 | NA | 0.01 |
| gomp_barrier_wait_end | libgomp.so.1.0.0 | 0.20 | NA | 0.02 | NA | 0.01 | NA | 62 | NA | 0.00 | NA | 0.10 | NA | 0.00 | NA |
| @plt_start@ | libomp.so | NA | 0.18 | NA | 0.02 | NA | 0.02 | NA | 63 | NA | 0.00 | NA | 0.06 | NA | 0.00 |
| gomp_team_barrier_wait_end | libgomp.so.1.0.0 | 0.17 | NA | 0.02 | NA | 0.01 | NA | 57 | NA | 0.00 | NA | 0.11 | NA | 0.00 | NA |
| __kmp_yield | libomp.so | NA | 0.09 | NA | 0.01 | NA | 0.01 | NA | 62 | NA | 0.00 | NA | 0.05 | NA | 0.00 |
| __kmp_now_nsec | libomp.so | NA | 0.06 | NA | 0.01 | NA | 0.01 | NA | 62 | NA | 0.00 | NA | 0.03 | NA | 0.00 |
| pthread_condattr_setpshared | libc.so.6 | 0.00 | NA | 0.00 | NA | 0.00 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA |
| __xpg_basename | libc.so.6 | 0.00 | NA | 0.00 | NA | 0.00 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA |
| __default_morecore | libc.so.6 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 |
| _dl_rtld_di_serinfo | ld-linux-aarch64.so.1 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 |
| __kmp_init_implicit_task | libomp.so | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 |
| unknown_kernel_region | kernel | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 12 | 13 | NA | NA | 0.00 | 0.00 | 0.00 | 0.00 |