| Run | Loop Source Regions | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
|---|---|---|---|---|---|---|---|---|
| G++ O3 + Vectorize + Funroll + Ffastmath | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 73-79 | 1 | 10.32 | 15.30 | 85.41 | 57.14 | 18.75 | 45.43 |
| Clang++ O3 + Ffastmath | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 73-79 | 9 | 12.60 | 14.43 | 69.69 | 55 | 18.59 | 42.17 |
| ICPX O3 + More Aggressive Flags | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 73-79 | 18 | 10.16 | 11.94 | 65.02 | 57.89 | 18.86 | 51.88 |

Sum on 1 analyzed binary loop per run. Column headers give the binary name and ASM loop ID; cells give the count of loops flagged by each analysis.

| Category | Analysis | kmeans-gcc-O3-all - 1 | kmeans-clang-O3-ffast-math - 9 | kmeans-icpx-O3-aggressive - 18 |
|---|---|---|---|---|
| Loop Computation Issues | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 1 | 1 | 1 |
| Loop Computation Issues | Presence of a large number of scalar integer instructions | 1 | 1 | 0 |
| Control Flow Issues | Presence of 2 to 4 paths | 0 | 1 | 0 |
| Control Flow Issues | Presence of more than 4 paths | 1 | 0 | 1 |
| Data Access Issues | Presence of special instructions executing on a single port | 1 | 1 | 1 |
| Vectorization Roadblocks | Presence of 2 to 4 paths | 0 | 1 | 0 |
| Vectorization Roadblocks | Presence of more than 4 paths | 1 | 0 | 1 |
| Inefficient Vectorization | Presence of special instructions executing on a single port | 1 | 1 | 1 |
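
All three builds are flagged for low FMA usage and for multiple execution paths in the hot loop. The code at main.cpp lines 73-79 is not shown in this report, so the following is only a minimal sketch of what a k-means nearest-centroid step could look like when written with explicit fused multiply-adds and select-style updates instead of `if` branches. The function names `squared_distance` and `nearest_centroid`, the row-major centroid layout, and the use of `double` are all assumptions made for illustration, not the benchmark's actual code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper (not from the benchmark): squared Euclidean distance
// between a point and a centroid, both given as `dim` contiguous doubles.
inline double squared_distance(const double* p, const double* c, std::size_t dim) {
    double acc = 0.0;
    for (std::size_t d = 0; d < dim; ++d) {
        const double diff = p[d] - c[d];
        // std::fma computes diff * diff + acc with a single rounding step.
        acc = std::fma(diff, diff, acc);
    }
    return acc;
}

// Hypothetical helper (not from the benchmark): index of the nearest centroid.
// Assumes `centroids` holds k rows of `dim` values in row-major order.
inline std::size_t nearest_centroid(const double* p,
                                    const std::vector<double>& centroids,
                                    std::size_t k, std::size_t dim) {
    assert(k > 0 && dim > 0 && centroids.size() == k * dim);
    std::size_t best = 0;
    double best_dist = std::numeric_limits<double>::infinity();
    for (std::size_t j = 0; j < k; ++j) {
        const double dist = squared_distance(p, &centroids[j * dim], dim);
        // Conditional expressions rather than an if-block: a compiler may
        // lower these to select/blend instructions, but is not required to.
        const bool closer = dist < best_dist;
        best = closer ? j : best;
        best_dist = closer ? dist : best_dist;
    }
    return best;
}
```

Whether `std::fma` becomes a hardware FMA instruction depends on the target options (on x86, for example, `-mfma` or `-march=native`); without hardware support it may fall back to a slower library routine. Any such change would need to be re-profiled to see whether the FMA and path counts in the table actually move.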