
Help is available by hovering the cursor over any symbol or by consulting the MAQAO website.

Global Metrics

| Metric | r0 | r1 | r2 | r3 |
|---|---|---|---|---|
| Total Time (s) | 20.56 | 28.39 | 19.88 | 17.77 |
| Max (Thread Active Time) (s) | 20.51 | 28.30 | 19.78 | 17.70 |
| Average Active Time (s) | 17.66 | 22.71 | 15.24 | 11.23 |
| Activity Ratio (%) | 85.9 | 80.1 | 76.7 | 63.2 |
| Average number of active threads | 6.872 | 6.399 | 6.134 | 5.057 |
| Affinity Stability (%) | 99.5 | 99.1 | 99.1 | 98.4 |
| GFLOPS | 30.631 | 22.186 | 0.300 | 23.093 |
| Time in analyzed loops (%) | 75.9 | 81.6 | 78.2 | 99.9 |
| Time in analyzed innermost loops (%) | 75.2 | 80.7 | 77.2 | 14.4 |
| Time in user code (%) | 75.9 | 81.6 | 78.2 | 100.0 |
| Compilation Options Score (%) | 100 | 100 | 100 | 100 |
| Array Access Efficiency (%) | 40.0 | 40.0 | 94.3 | 41.7 |
| Potential Speedups | | | | |
| Perfect Flow Complexity | 2.36 | 2.62 | 1.00 | 1.00 |
| Perfect OpenMP + MPI + Pthread | 1.02 | 1.00 | 1.00 | 1.00 |
| Perfect OpenMP + MPI + Pthread + Perfect Load Distribution | 1.53 | 1.53 | 1.66 | 1.58 |
| No Scalar Integer: Potential Speedup | 1.08 | 1.08 | 1.02 | 3.19 |
| No Scalar Integer: Nb Loops to get 80% | 1 | 1 | 1 | 2 |
| FP Vectorised: Potential Speedup | 1.59 | 1.66 | 1.32 | 1.42 |
| FP Vectorised: Nb Loops to get 80% | 1 | 1 | 1 | 1 |
| Fully Vectorised: Potential Speedup | 2.72 | 3.13 | 2.17 | 1.44 |
| Fully Vectorised: Nb Loops to get 80% | 2 | 2 | 2 | 1 |
| Only FP Arithmetic: Potential Speedup | 1.32 | 1.35 | 1.05 | 1.56 |
| Only FP Arithmetic: Nb Loops to get 80% | 2 | 2 | 1 | 2 |
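The activity figures in the Global Metrics table appear to be related by simple arithmetic: the Activity Ratio is close to Average Active Time divided by Total Time, and the average number of active threads is close to that ratio multiplied by the 8 threads observed in each run. This is an observation drawn from the reported numbers, not a definition taken from the MAQAO documentation, and the function names below are hypothetical. A minimal sketch that rechecks the relationship:

```python
# Values copied from the Global Metrics table of this report.
# Assumption (inferred from the data, not from MAQAO documentation):
#   activity ratio      ~ average active time / total time
#   avg. active threads ~ threads observed * average active time / total time
THREADS_OBSERVED = 8  # "Number of threads observed" in the experiment summary

RUNS = {
    "r0": {"total_s": 20.56, "avg_active_s": 17.66, "ratio_pct": 85.9, "avg_threads": 6.872},
    "r1": {"total_s": 28.39, "avg_active_s": 22.71, "ratio_pct": 80.1, "avg_threads": 6.399},
    "r2": {"total_s": 19.88, "avg_active_s": 15.24, "ratio_pct": 76.7, "avg_threads": 6.134},
    "r3": {"total_s": 17.77, "avg_active_s": 11.23, "ratio_pct": 63.2, "avg_threads": 5.057},
}


def activity_ratio_pct(avg_active_s, total_s):
    """Share of the total time that a thread is active on average, in percent."""
    return 100.0 * avg_active_s / total_s


def avg_active_threads(avg_active_s, total_s, n_threads=THREADS_OBSERVED):
    """Estimated number of threads active at any instant."""
    return n_threads * avg_active_s / total_s


if __name__ == "__main__":
    for name, run in RUNS.items():
        ratio = activity_ratio_pct(run["avg_active_s"], run["total_s"])
        threads = avg_active_threads(run["avg_active_s"], run["total_s"])
        print(f"{name}: ratio {ratio:.1f}% (reported {run['ratio_pct']}), "
              f"threads {threads:.3f} (reported {run['avg_threads']})")
```

The recomputed values differ from the reported ones only by rounding of the displayed inputs, which suggests the three rows carry the same information expressed in different units.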

Cumulated Speedup If No Scalar Integer

Cumulated Speedup If FP Vectorized

Cumulated Speedup If Fully Vectorized

Cumulated Speedup If Only FP Arithmetic

Loop Based Profiles

Innermost / Single Loops

Inbetween Loops

Outermost Loops

Cumulated Coverage With All Loops

Innermost Loop Based Profiles

Coverage

Count

Application Categorization

Time

Coverage

Compilation Options

| Source Object | Issue |
|---|---|
| kmeans-icpx-O3-aggressive | |
| main.cpp | |

Path Count Profiles

Coverage

Count

Low Iteration Count Profiles

Coverage

Count

Average Number of Active Threads

Run 1 - CASCADE LAKE | ICPX O3 + More Aggressive Flags

Run 2 - SKYLAKE | ICPX O3 + More Aggressive Flags

Run 3 - NEOVERSE V1 | ACFL O3 + Funroll + Ffastmath

Run 4 - NEOVERSE V2 | G++ O3 + Funroll

Experiment Summaries

| | r0 | r1 | r2 | r3 |
|---|---|---|---|---|
| Experiment Name | K-Means scalability icpx-O3-aggressive 100000000 | K-Means scalability icpx-O3-aggressive 100000000 | K-Means scalability acfl-O3-all 100000000 | K-Means scalability gcc-O3-funroll 100000000 |
| Application | ./kmeans/kmeans-icpx-O3-aggressive | same as r0 | ./kmeans/kmeans-acfl-O3-all | ./kmeans/kmeans-gcc-O3-funroll |
| Timestamp | 2025-06-25 14:19:27 | 2025-06-23 16:14:50 | 2025-07-07 09:20:12 | 2025-06-24 09:38:07 |
| Experiment Type | OpenMP | same as r0 | same as r0 | same as r0 |
| Machine | otterfall | skylake | ip-172-31-18-66 | ip-172-31-47-249.ec2.internal |
| Architecture | x86_64 | same as r0 | aarch64 | same as r2 |
| Micro Architecture | SKYLAKE | same as r0 | ARM_NEOVERSE_V1 | ARM_NEOVERSE_V2 |
| Model Name | Intel(R) Xeon(R) Silver 4210R CPU @ 2.40GHz | Intel(R) Xeon(R) Platinum 8170 CPU @ 2.10GHz | | |
| Cache Size | 14080 KB | 36608 KB | | |
| Number of Cores | 10 | 26 | | |
| Maximal Frequency | 3.2 GHz | 2.1 GHz | 0 GHz | same as r2 |
| OS Version | Linux 6.12.1-arch1-1 #1 SMP PREEMPT_DYNAMIC Fri, 22 Nov 2024 16:04:27 +0000 | Linux 6.10.10-arch1-1 #1 SMP PREEMPT_DYNAMIC Thu, 12 Sep 2024 17:21:02 +0000 | Linux 6.8.0-1030-aws #32-Ubuntu SMP Wed May 28 19:33:40 UTC 2025 | Linux 6.1.109-118.189.amzn2023.aarch64 #1 SMP Tue Sep 10 08:58:40 UTC 2024 |
| Architecture used during static analysis | x86_64 | same as r0 | aarch64 | same as r2 |
| Micro Architecture used during static analysis | SKYLAKE | same as r0 | ARM_NEOVERSE_V1 | ARM_NEOVERSE_V2 |
| Compilation Options | kmeans-icpx-O3-aggressive: clang based Intel(R) oneAPI DPC++/C++ Compiler 2024.2.1 (2024.2.1.20240711) --driver-mode=g++ --intel -I . -MMD -MP -march=native -std=c++14 -g -fno-omit-frame-pointer -fiopenmp -O3 -x Host -funroll-loops -ffast-math -c -o main.o main.cpp -fveclib=SVML -faltmathlib=SVML -fheinous-gnu-extensions | kmeans-icpx-O3-aggressive: clang based Intel(R) oneAPI DPC++/C++ Compiler 2024.2.1 (2024.2.1.20240711) --driver-mode=g++ --intel -I . -MMD -MP -march=native -std=c++14 -g -fno-omit-frame-pointer -fiopenmp -O3 -x Host -funroll-loops -ffast-math -qopt-report=5 -c -o main.o main.cpp -fveclib=SVML -faltmathlib=SVML -fheinous-gnu-extensions | kmeans-acfl-O3-all: Arm C/C++/Fortran Compiler version 24.10.1 (build number 4) (based on LLVM 19.1.0) /opt/arm/arm-linux-compiler-24.10.1_Ubuntu-20.04/llvm-bin/clang-19 --driver-mode=g++ -I . -MMD -MP -march=native -std=c++14 -g -fno-omit-frame-pointer -fopenmp -O3 -funroll-loops -ffast-math -grecord-command-line -c -o main.o main.cpp | kmeans-gcc-O3-funroll: GNU C++14 14.2.0 -mlittle-endian -mabi=lp64 -mcpu=neoverse-v2+crc+sve2-aes+sve2-sha3+nossbs -g -O3 -std=c++14 -fno-omit-frame-pointer -fopenmp -funroll-loops |
| Number of processes observed | 1 | same as r0 | same as r0 | same as r0 |
| Number of threads observed | 8 | same as r0 | same as r0 | same as r0 |
| Frequency Driver | intel_pstate | intel_cpufreq | NA | same as r2 |
| Frequency Governor | performance | same as r0 | NA | same as r2 |
| Huge Pages | always | same as r0 | madvise | same as r2 |
| Hyperthreading | off | same as r0 | same as r0 | same as r0 |
| Number of sockets | 1 | 2 | same as r0 | same as r0 |
| Number of cores per socket | 10 | 26 | 64 | 96 |
| MAQAO version | 2025.1.0 | same as r0 | 2025.1.1 | same as r0 |
| MAQAO build | 1cd8232d3b2009bc695f526f903b266bda9bb996::20250623-181852 | e913a471001afb562449956a906221b5bfa8ea0d::20250617-163738 | f3e40b5f1dbd62488bc0cc5f885d40677c87bfe8::20250630-094248 | same as r0 |
| Comments | Intel Xeon 4210R (Cascade Lake CPU), 1-10 thread runs | Intel Xeon Platinum 8170 (Skylake CPU), 1-26 thread runs | AWS Graviton 3 (Neoverse V1) CPU, 1-64 thread runs | AWS Graviton 4 (Neoverse V2) CPU, 1-96 thread runs |