| Run | Source lines | ASM Fct ID | Coverage (%) | Inc. Time w.r.t. Wall Time (s) | Max Inc. Time over Threads (s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s |
| G++ O3 + Funroll | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 58-67 | 9 | 92.47 | 7.83 | 1.06 | 96 | 8.91 | 0.01 | 55.50 |
| ACFL O3 + Vectorize + Funroll + Ffastmath | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 58-67 | 13 | 20.85 | 1.89 | 1.19 | 96 | 0.92 | 0.02 | 255.65 |
| Run | ASM Fct ID | Coverage (%) | Inc. Time w.r.t. Wall Time (s) | Max Inc. Time over Threads (s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s |
| G++ O3 + Funroll | 101 | 0.00 | 0.00 | 0.00 | 1 | 0.00 | 0.00 | 0.00 |
| G++ O3 + Funroll | 276 | 0.29 | 0.02 | 0.01 | 46 | 0.23 | 0.00 | 0.00 |
| G++ O3 + Funroll | 280 | 0.26 | 0.02 | 0.01 | 43 | 0.23 | 0.00 | 0.00 |
| G++ O3 + Funroll | 10 | 0.00 | 0.00 | 0.00 | 1 | 0.00 | 0.00 | 0.00 |
| G++ O3 + Funroll | -1 | 0.00 | 0.00 | 0.00 | 74 | 0.00 | 0.00 | NA |
| ACFL O3 + Vectorize + Funroll + Ffastmath | 1 | 0.17 | 0.02 | 0.03 | 80 | 0.11 | 0.01 | 0.00 |
| ACFL O3 + Vectorize + Funroll + Ffastmath | 529 | 3.57 | 0.32 | 0.26 | 96 | 0.66 | 0.04 | 0.00 |
| ACFL O3 + Vectorize + Funroll + Ffastmath | 1242 | 0.13 | 0.01 | 0.04 | 71 | 0.11 | 0.01 | 0.00 |
| ACFL O3 + Vectorize + Funroll + Ffastmath | 855 | 71.19 | 6.47 | 4.08 | 96 | 7.37 | 0.40 | 0.00 |
| ACFL O3 + Vectorize + Funroll + Ffastmath | 1282 | 0.07 | 0.01 | 0.01 | 50 | 0.06 | 0.00 | 0.00 |
| ACFL O3 + Vectorize + Funroll + Ffastmath | 71 | 1.95 | 0.18 | 0.19 | 96 | 0.44 | 0.02 | 0.00 |
| ACFL O3 + Vectorize + Funroll + Ffastmath | 984 | 0.00 | 0.00 | 0.00 | 1 | 0.00 | 0.00 | 0.00 |
| ACFL O3 + Vectorize + Funroll + Ffastmath | -1 | 0.57 | 0.05 | 0.05 | 93 | 0.22 | 0.01 | 0.00 |
| ACFL O3 + Vectorize + Funroll + Ffastmath | -1 | 0.00 | 0.00 | 0.00 | 1 | 0.00 | 0.00 | NA |
| ACFL O3 + Vectorize + Funroll + Ffastmath | -1 | 0.00 | 0.00 | 0.00 | 59 | 0.00 | 0.00 | NA |
| Run | Source lines | ASM Fct ID | Coverage (%) | Inc. Time w.r.t. Wall Time (s) | Max Inc. Time over Threads (s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s |
| G++ O3 + Funroll | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 55-58; /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 73-96 | 13 | 6.97 | 0.59 | 7.43 | 1 | 0.00 | 0.00 | 8.47 |
| ACFL O3 + Vectorize + Funroll + Ffastmath | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 55-58; /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 73-96 | 12 | 1.50 | 0.14 | 7.88 | 1 | 0.00 | 0.00 | 36.77 |
| Name | Module | Coverage (%): G++ O3 + Funroll | Coverage (%): ACFL O3 + Vectorize + Funroll + Ffastmath | Inclusive Time w.r.t. Wall Time (s): G++ O3 + Funroll | Inclusive Time w.r.t. Wall Time (s): ACFL O3 + Vectorize + Funroll + Ffastmath | Max Inc. Time over Threads (s): G++ O3 + Funroll | Max Inc. Time over Threads (s): ACFL O3 + Vectorize + Funroll + Ffastmath | Nb Threads: G++ O3 + Funroll | Nb Threads: ACFL O3 + Vectorize + Funroll + Ffastmath | GFLOP/s: G++ O3 + Funroll | GFLOP/s: ACFL O3 + Vectorize + Funroll + Ffastmath | Deviation (coverage): G++ O3 + Funroll | Deviation (coverage): ACFL O3 + Vectorize + Funroll + Ffastmath | Deviation (time): G++ O3 + Funroll | Deviation (time): ACFL O3 + Vectorize + Funroll + Ffastmath |
| k_means(int, point_t*, point_t*, int*, point_t*, int, int) [clone ._omp_fn.0] | binary | 92.47 | NA | 7.83 | NA | 1.06 | NA | 96 | NA | 55.50 | NA | 8.91 | NA | 0.01 | NA |
| kmp_flag_64<false, true>::wait(kmp_info*, int, void*) | libomp.so | NA | 71.19 | NA | 6.47 | NA | 4.08 | NA | 96 | NA | 0.00 | NA | 7.37 | NA | 0.40 |
| k_means(int, point_t*, point_t*, int*, point_t*, int, int) [clone .omp_outlined] | binary | NA | 20.85 | NA | 1.89 | NA | 1.19 | NA | 96 | NA | 255.65 | NA | 0.92 | NA | 0.02 |
| k_means(int, point_t*, point_t*, int*, point_t*, int, int) | binary | 6.97 | 1.50 | 0.59 | 0.14 | 7.43 | 7.88 | 1 | 1 | 8.47 | 36.77 | 0.00 | 0.00 | 0.00 | 0.00 |
| kmp_flag_native<unsigned long long, (flag_type)1, true>::notdone_check() | libomp.so | NA | 3.57 | NA | 0.32 | NA | 0.26 | NA | 96 | NA | 0.00 | NA | 0.66 | NA | 0.04 |
| __sched_yield | libc.so.6 | NA | 1.95 | NA | 0.18 | NA | 0.19 | NA | 96 | NA | 0.00 | NA | 0.44 | NA | 0.02 |
| unknown_function | [vdso] | NA | 0.57 | NA | 0.05 | NA | 0.05 | NA | 93 | NA | 0.00 | NA | 0.22 | NA | 0.01 |
| gomp_barrier_wait_end | libgomp.so.1.0.0 | 0.29 | NA | 0.02 | NA | 0.01 | NA | 46 | NA | 0.00 | NA | 0.23 | NA | 0.00 | NA |
| gomp_team_barrier_wait_end | libgomp.so.1.0.0 | 0.26 | NA | 0.02 | NA | 0.01 | NA | 43 | NA | 0.00 | NA | 0.23 | NA | 0.00 | NA |
| @plt_start@ | libomp.so | NA | 0.17 | NA | 0.02 | NA | 0.03 | NA | 80 | NA | 0.00 | NA | 0.11 | NA | 0.01 |
| __kmp_yield | libomp.so | NA | 0.13 | NA | 0.01 | NA | 0.04 | NA | 71 | NA | 0.00 | NA | 0.11 | NA | 0.01 |
| __kmp_now_nsec | libomp.so | NA | 0.07 | NA | 0.01 | NA | 0.01 | NA | 50 | NA | 0.00 | NA | 0.06 | NA | 0.00 |
| syscall | libc.so.6 | 0.00 | NA | 0.00 | NA | 0.00 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA |
| std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&) [clone .isra.0] | binary | 0.00 | NA | 0.00 | NA | 0.00 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA |
| __GI___nptl_deallocate_tsd | libc.so.6 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 |
| unknown_kernel_region | kernel | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 74 | 59 | NA | NA | 0.00 | 0.00 | 0.00 | 0.00 |
| unknown_function | binary | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 1 | NA | NA | NA | 0.00 | NA | 0.00 |