OV - K-Means scalability acfl-Ofast 100000000

********************************************************************************
MAQAO 2025.1.1 - f3e40b5f1dbd62488bc0cc5f885d40677c87bfe8::20250630-094248 || 2025/06/30
/home/fmusial/MAQAO/bin/maqao oneview --create-report=one --with-FLOPS --replace --run-directory=/home/fmusial/KMEANS_Benchmarks --executable=kmeans/kmeans-acfl-Ofast "--experiment-name=K-Means scalability acfl-Ofast 100000000" "--run-command= input/100000000.in 1000 100000000 50 25" -c=/home/fmusial/KMEANS_Benchmarks/kmeans_multiruns_conf_neoverse_v2.json -WS --debug=1 -xp=/home/fmusial/KMEANS_Benchmarks/results/scalability/acfl/acfl-Ofast/points_100000000 
CPY:  [true] ./kmeans/kmeans-acfl-Ofast --> /home/fmusial/KMEANS_Benchmarks/results/scalability/acfl/acfl-Ofast/points_100000000/binaries/kmeans-acfl-Ofast
CMD:  OMP_PROC_BIND=true  OMP_NUM_THREADS=1   /home/fmusial/MAQAO/bin/maqao lprof _caller=oneview  --xp="/home/fmusial/KMEANS_Benchmarks/results/scalability/acfl/acfl-Ofast/points_100000000/tools/lprof_npsu_run_0" --mpi-command="" --collect-CPU-time-intervals -p=NEON_SVE_FLOP  --collect-topology tpp=1   -- /home/fmusial/KMEANS_Benchmarks/results/scalability/acfl/acfl-Ofast/points_100000000/binaries/kmeans-acfl-Ofast input/100000000.in 1000 100000000 50 25
CMD:  OMP_PROC_BIND=true  OMP_NUM_THREADS=2   /home/fmusial/MAQAO/bin/maqao lprof _caller=oneview  --xp="/home/fmusial/KMEANS_Benchmarks/results/scalability/acfl/acfl-Ofast/points_100000000/tools/lprof_npsu_run_1" --mpi-command="" --collect-CPU-time-intervals -p=NEON_SVE_FLOP  --collect-topology tpp=2   -- /home/fmusial/KMEANS_Benchmarks/results/scalability/acfl/acfl-Ofast/points_100000000/binaries/kmeans-acfl-Ofast input/100000000.in 1000 100000000 50 25
CMD:  OMP_PROC_BIND=true  OMP_NUM_THREADS=4   /home/fmusial/MAQAO/bin/maqao lprof _caller=oneview  --xp="/home/fmusial/KMEANS_Benchmarks/results/scalability/acfl/acfl-Ofast/points_100000000/tools/lprof_npsu_run_2" --mpi-command="" --collect-CPU-time-intervals -p=NEON_SVE_FLOP  --collect-topology tpp=4   -- /home/fmusial/KMEANS_Benchmarks/results/scalability/acfl/acfl-Ofast/points_100000000/binaries/kmeans-acfl-Ofast input/100000000.in 1000 100000000 50 25
CMD:  OMP_PROC_BIND=true  OMP_NUM_THREADS=8   /home/fmusial/MAQAO/bin/maqao lprof _caller=oneview  --xp="/home/fmusial/KMEANS_Benchmarks/results/scalability/acfl/acfl-Ofast/points_100000000/tools/lprof_npsu_run_3" --mpi-command="" --collect-CPU-time-intervals -p=NEON_SVE_FLOP  --collect-topology tpp=8   -- /home/fmusial/KMEANS_Benchmarks/results/scalability/acfl/acfl-Ofast/points_100000000/binaries/kmeans-acfl-Ofast input/100000000.in 1000 100000000 50 25
CMD:  OMP_PROC_BIND=true  OMP_NUM_THREADS=16   /home/fmusial/MAQAO/bin/maqao lprof _caller=oneview  --xp="/home/fmusial/KMEANS_Benchmarks/results/scalability/acfl/acfl-Ofast/points_100000000/tools/lprof_npsu_run_4" --mpi-command="" --collect-CPU-time-intervals -p=NEON_SVE_FLOP  --collect-topology tpp=16   -- /home/fmusial/KMEANS_Benchmarks/results/scalability/acfl/acfl-Ofast/points_100000000/binaries/kmeans-acfl-Ofast input/100000000.in 1000 100000000 50 25
CMD:  OMP_PROC_BIND=true  OMP_NUM_THREADS=32   /home/fmusial/MAQAO/bin/maqao lprof _caller=oneview  --xp="/home/fmusial/KMEANS_Benchmarks/results/scalability/acfl/acfl-Ofast/points_100000000/tools/lprof_npsu_run_5" --mpi-command="" --collect-CPU-time-intervals -p=NEON_SVE_FLOP  --collect-topology tpp=32   -- /home/fmusial/KMEANS_Benchmarks/results/scalability/acfl/acfl-Ofast/points_100000000/binaries/kmeans-acfl-Ofast input/100000000.in 1000 100000000 50 25
CMD:  OMP_PROC_BIND=true  OMP_NUM_THREADS=48   /home/fmusial/MAQAO/bin/maqao lprof _caller=oneview  --xp="/home/fmusial/KMEANS_Benchmarks/results/scalability/acfl/acfl-Ofast/points_100000000/tools/lprof_npsu_run_6" --mpi-command="" --collect-CPU-time-intervals -p=NEON_SVE_FLOP  --collect-topology tpp=48   -- /home/fmusial/KMEANS_Benchmarks/results/scalability/acfl/acfl-Ofast/points_100000000/binaries/kmeans-acfl-Ofast input/100000000.in 1000 100000000 50 25
CMD:  OMP_PROC_BIND=true  OMP_NUM_THREADS=64   /home/fmusial/MAQAO/bin/maqao lprof _caller=oneview  --xp="/home/fmusial/KMEANS_Benchmarks/results/scalability/acfl/acfl-Ofast/points_100000000/tools/lprof_npsu_run_7" --mpi-command="" --collect-CPU-time-intervals -p=NEON_SVE_FLOP  --collect-topology tpp=64   -- /home/fmusial/KMEANS_Benchmarks/results/scalability/acfl/acfl-Ofast/points_100000000/binaries/kmeans-acfl-Ofast input/100000000.in 1000 100000000 50 25
CMD:  OMP_PROC_BIND=true  OMP_NUM_THREADS=80   /home/fmusial/MAQAO/bin/maqao lprof _caller=oneview  --xp="/home/fmusial/KMEANS_Benchmarks/results/scalability/acfl/acfl-Ofast/points_100000000/tools/lprof_npsu_run_8" --mpi-command="" --collect-CPU-time-intervals -p=NEON_SVE_FLOP  --collect-topology tpp=80   -- /home/fmusial/KMEANS_Benchmarks/results/scalability/acfl/acfl-Ofast/points_100000000/binaries/kmeans-acfl-Ofast input/100000000.in 1000 100000000 50 25
CMD:  OMP_PROC_BIND=true  OMP_NUM_THREADS=96   /home/fmusial/MAQAO/bin/maqao lprof _caller=oneview  --xp="/home/fmusial/KMEANS_Benchmarks/results/scalability/acfl/acfl-Ofast/points_100000000/tools/lprof_npsu_run_9" --mpi-command="" --collect-CPU-time-intervals -p=NEON_SVE_FLOP  --collect-topology tpp=96   -- /home/fmusial/KMEANS_Benchmarks/results/scalability/acfl/acfl-Ofast/points_100000000/binaries/kmeans-acfl-Ofast input/100000000.in 1000 100000000 50 25
In run run_1_thread, 1 loops were discarded from static analysis because their coverage
are lower than object_coverage_threshold value (0.01%).
That represents 0.0051647555083036% of the execution time. To include them, change the value
in the experiment directory configuration file, then rerun the command with the additionnal parameter
--force-static-analysis
2 functions were discarded from static analysis because their coverage
are lower than object_coverage_threshold value (0.01%).
That represents 0.0051647555083036% of the execution time. To include them, change the value
in the experiment directory configuration file, then rerun the command with the additionnal parameter
--force-static-analysis
In run run_2_threads, 1 loops were discarded from static analysis because their coverage
are lower than object_coverage_threshold value (0.01%).
That represents 1.0122745037079% of the execution time. To include them, change the value
in the experiment directory configuration file, then rerun the command with the additionnal parameter
--force-static-analysis
5 functions were discarded from static analysis because their coverage
are lower than object_coverage_threshold value (0.01%).
That represents 0.3716563358903% of the execution time. To include them, change the value
in the experiment directory configuration file, then rerun the command with the additionnal parameter
--force-static-analysis
In run run_4_threads, 1 loops were discarded from static analysis because their coverage
are lower than object_coverage_threshold value (0.01%).
That represents 0.93137925863266% of the execution time. To include them, change the value
in the experiment directory configuration file, then rerun the command with the additionnal parameter
--force-static-analysis
1 functions were discarded from static analysis because their coverage
are lower than object_coverage_threshold value (0.01%).
That represents 0.0045655844733119% of the execution time. To include them, change the value
in the experiment directory configuration file, then rerun the command with the additionnal parameter
--force-static-analysis
In run run_8_threads, 1 loops were discarded from static analysis because their coverage
are lower than object_coverage_threshold value (0.01%).
That represents 0.75441741943359% of the execution time. To include them, change the value
in the experiment directory configuration file, then rerun the command with the additionnal parameter
--force-static-analysis
In run run_16_threads, 1 loops were discarded from static analysis because their coverage
are lower than object_coverage_threshold value (0.01%).
That represents 0.64757162332535% of the execution time. To include them, change the value
in the experiment directory configuration file, then rerun the command with the additionnal parameter
--force-static-analysis
1 functions were discarded from static analysis because their coverage
are lower than object_coverage_threshold value (0.01%).
That represents 0.0031133247539401% of the execution time. To include them, change the value
in the experiment directory configuration file, then rerun the command with the additionnal parameter
--force-static-analysis
In run run_32_threads, 1 loops were discarded from static analysis because their coverage
are lower than object_coverage_threshold value (0.01%).
That represents 0.43711671233177% of the execution time. To include them, change the value
in the experiment directory configuration file, then rerun the command with the additionnal parameter
--force-static-analysis
1 functions were discarded from static analysis because their coverage
are lower than object_coverage_threshold value (0.01%).
That represents 0.0021219258196652% of the execution time. To include them, change the value
in the experiment directory configuration file, then rerun the command with the additionnal parameter
--force-static-analysis
In run run_48_threads, 1 loops were discarded from static analysis because their coverage
are lower than object_coverage_threshold value (0.01%).
That represents 0.36270013451576% of the execution time. To include them, change the value
in the experiment directory configuration file, then rerun the command with the additionnal parameter
--force-static-analysis
1 functions were discarded from static analysis because their coverage
are lower than object_coverage_threshold value (0.01%).
That represents 0.0016191970789805% of the execution time. To include them, change the value
in the experiment directory configuration file, then rerun the command with the additionnal parameter
--force-static-analysis
In run run_64_threads, 1 loops were discarded from static analysis because their coverage
are lower than object_coverage_threshold value (0.01%).
That represents 0.30305817723274% of the execution time. To include them, change the value
in the experiment directory configuration file, then rerun the command with the additionnal parameter
--force-static-analysis
1 functions were discarded from static analysis because their coverage
are lower than object_coverage_threshold value (0.01%).
That represents 0% of the execution time. To include them, change the value
in the experiment directory configuration file, then rerun the command with the additionnal parameter
--force-static-analysis
In run run_80_threads, 1 loops were discarded from static analysis because their coverage
are lower than object_coverage_threshold value (0.01%).
That represents 0.22031334042549% of the execution time. To include them, change the value
in the experiment directory configuration file, then rerun the command with the additionnal parameter
--force-static-analysis
2 functions were discarded from static analysis because their coverage
are lower than object_coverage_threshold value (0.01%).
That represents 0.0022253871429712% of the execution time. To include them, change the value
in the experiment directory configuration file, then rerun the command with the additionnal parameter
--force-static-analysis
In run run_96_threads, 1 loops were discarded from static analysis because their coverage
are lower than object_coverage_threshold value (0.01%).
That represents 0.20362547039986% of the execution time. To include them, change the value
in the experiment directory configuration file, then rerun the command with the additionnal parameter
--force-static-analysis
3 functions were discarded from static analysis because their coverage
are lower than object_coverage_threshold value (0.01%).
That represents 0.0028412858373485% of the execution time. To include them, change the value
in the experiment directory configuration file, then rerun the command with the additionnal parameter
--force-static-analysis
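
Note: the discard warnings above repeat the same message for each run, varying only in the run name, the object count, and the share of execution time. A minimal sketch of how they could be collected into one summary per run follows. It assumes the message wording stays exactly as printed in this log. The regular expressions and the `summarize_discarded` helper are illustrative, not part of MAQAO. Because the "functions" warning does not name its run, the sketch attributes it to the most recent "In run ..." line.

```python
import re

# Patterns mirror the wording of the warnings in this log. They are
# assumptions about the message format, not a documented MAQAO interface.
RUN_RE = re.compile(r"In run (\S+), (\d+) loops were discarded")
FUNC_RE = re.compile(r"^(\d+) functions were discarded")
PCT_RE = re.compile(r"That represents ([0-9.]+(?:[eE][-+]?\d+)?)% of the execution time")


def summarize_discarded(log_text):
    """Return {run_name: {"loops": (count, pct), "functions": (count, pct)}}."""
    summary = {}
    run = None      # most recent run named by an "In run ..." line
    pending = None  # (kind, count) waiting for its "That represents" line
    for line in log_text.splitlines():
        line = line.strip()
        m = RUN_RE.search(line)
        if m:
            run = m.group(1)
            summary.setdefault(run, {})
            pending = ("loops", int(m.group(2)))
            continue
        m = FUNC_RE.search(line)
        if m and run is not None:
            # The functions warning carries no run name of its own.
            pending = ("functions", int(m.group(1)))
            continue
        m = PCT_RE.search(line)
        if m and pending is not None:
            kind, count = pending
            summary[run][kind] = (count, float(m.group(1)))
            pending = None
    return summary


# Excerpt copied from the run_2_threads warnings in this log.
sample = """\
In run run_2_threads, 1 loops were discarded from static analysis because their coverage
are lower than object_coverage_threshold value (0.01%).
That represents 1.0122745037079% of the execution time. To include them, change the value
in the experiment directory configuration file, then rerun the command with the additionnal parameter
--force-static-analysis
5 functions were discarded from static analysis because their coverage
are lower than object_coverage_threshold value (0.01%).
That represents 0.3716563358903% of the execution time. To include them, change the value
"""

if __name__ == "__main__":
    for run_name, kinds in summarize_discarded(sample).items():
        print(run_name, kinds)
```

Feeding the whole log to `summarize_discarded` would give one entry per run. A run such as run_8_threads, which reports only discarded loops, would simply have no "functions" key.
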
Report Configuration