* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com
* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 37761)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS
Correctness check
Success, correct result: 234.15621 +- 0.000001. Correct Result: 234.156208
Configuration              
Number of Threads:         1
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt
Time measurements          
Total experiment time:     590.168
Minimum kernel time:       0.00586987
Maximum kernel time:       0.00662208
Arithm. Mean kernel time:  0.00590157
Performance results        
Total GFlops/s:            2.45582
Minimum GFlops/s:          2.18866
Maximum GFlops/s:          2.46914
Arithm. Mean GFlops/s:     2.45587
* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 37761)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 37761)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 37761)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 37761)
Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_0
To display your profiling results:
#################################################################################################################################
#    LEVEL    |     REPORT     |                                            COMMAND                                             #
#################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_0      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_0  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_0  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_0  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_0      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_0  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_0  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_0  #
#################################################################################################################################
* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com
* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 38017)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS
Correctness check
Success, correct result: 234.62428 +- 0.000001. Correct Result: 234.624284
Configuration              
Number of Threads:         2
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt
Time measurements          
Total experiment time:     323.397
Minimum kernel time:       0.00319886
Maximum kernel time:       0.00425601
Arithm. Mean kernel time:  0.00323376
Performance results        
Total GFlops/s:            4.48164
Minimum GFlops/s:          3.40542
Maximum GFlops/s:          4.53083
Arithm. Mean GFlops/s:     4.48193
* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 38017)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 38017)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 38017)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 38017)
Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_1
To display your profiling results:
#################################################################################################################################
#    LEVEL    |     REPORT     |                                            COMMAND                                             #
#################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_1      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_1  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_1  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_1  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_1      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_1  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_1  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_1  #
#################################################################################################################################
* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com
* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 38257)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS
Correctness check
Success, correct result: 234.58571 +- 0.000001. Correct Result: 234.585715
Configuration              
Number of Threads:         4
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt
Time measurements          
Total experiment time:     208.93
Minimum kernel time:       0.00207186
Maximum kernel time:       0.00274611
Arithm. Mean kernel time:  0.0020892
Performance results        
Total GFlops/s:            6.937
Minimum GFlops/s:          5.27784
Maximum GFlops/s:          6.99541
Arithm. Mean GFlops/s:     6.93735
* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 38257)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 38257)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 38257)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 38257)
Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_2
To display your profiling results:
#################################################################################################################################
#    LEVEL    |     REPORT     |                                            COMMAND                                             #
#################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_2      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_2  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_2  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_2  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_2      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_2  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_2  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_2  #
#################################################################################################################################
* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com
* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 38496)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS
Correctness check
Success, correct result: 233.85259 +- 0.000001. Correct Result: 233.852586
Configuration              
Number of Threads:         8
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt
Time measurements          
Total experiment time:     137.136
Minimum kernel time:       0.00135493
Maximum kernel time:       0.00195289
Arithm. Mean kernel time:  0.00137131
Performance results        
Total GFlops/s:            10.5687
Minimum GFlops/s:          7.42157
Maximum GFlops/s:          10.6968
Arithm. Mean GFlops/s:     10.5691
* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 38496)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 38496)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 38496)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 38496)
Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_3
To display your profiling results:
#################################################################################################################################
#    LEVEL    |     REPORT     |                                            COMMAND                                             #
#################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_3      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_3  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_3  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_3  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_3      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_3  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_3  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_3  #
#################################################################################################################################
* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com
* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 38734)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS
Correctness check
Success, correct result: 233.44446 +- 0.000001. Correct Result: 233.444460
Configuration              
Number of Threads:         12
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt
Time measurements          
Total experiment time:     127.941
Minimum kernel time:       0.00127006
Maximum kernel time:       0.00197387
Arithm. Mean kernel time:  0.00127936
Performance results        
Total GFlops/s:            11.3283
Minimum GFlops/s:          7.34269
Maximum GFlops/s:          11.4117
Arithm. Mean GFlops/s:     11.3287
* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 38734)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 38734)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 38734)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 38734)
Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_4
To display your profiling results:
#################################################################################################################################
#    LEVEL    |     REPORT     |                                            COMMAND                                             #
#################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_4      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_4  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_4  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_4  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_4      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_4  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_4  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_4  #
#################################################################################################################################
* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com
* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 38977)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS
Correctness check
Success, correct result: 233.44792 +- 0.000001. Correct Result: 233.447917
Configuration              
Number of Threads:         16
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt
Time measurements          
Total experiment time:     98.4199
Minimum kernel time:       0.000740051
Maximum kernel time:       0.00429988
Arithm. Mean kernel time:  0.000984149
Performance results        
Total GFlops/s:            14.7262
Minimum GFlops/s:          3.37067
Maximum GFlops/s:          19.5844
Arithm. Mean GFlops/s:     14.7269
* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 38977)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 38977)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 38977)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 38977)
Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_5
To display your profiling results:
#################################################################################################################################
#    LEVEL    |     REPORT     |                                            COMMAND                                             #
#################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_5      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_5  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_5  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_5  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_5      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_5  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_5  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_5  #
#################################################################################################################################
* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com
* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 39221)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS
Correctness check
Success, correct result: 234.59363 +- 0.000001. Correct Result: 234.593630
Configuration              
Number of Threads:         20
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt
Time measurements          
Total experiment time:     64.5958
Minimum kernel time:       0.000545979
Maximum kernel time:       0.00401187
Arithm. Mean kernel time:  0.00064581
Performance results        
Total GFlops/s:            22.4372
Minimum GFlops/s:          3.61265
Maximum GFlops/s:          26.5459
Arithm. Mean GFlops/s:     22.4424
* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 39221)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 39221)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 39221)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 39221)
Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_6
To display your profiling results:
#################################################################################################################################
#    LEVEL    |     REPORT     |                                            COMMAND                                             #
#################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_6      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_6  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_6  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_6  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_6      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_6  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_6  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_6  #
#################################################################################################################################
* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com
* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 39467)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS
Correctness check
Success, correct result: 234.56958 +- 0.000001. Correct Result: 234.569577
Configuration              
Number of Threads:         24
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt
Time measurements          
Total experiment time:     60.4525
Minimum kernel time:       0.000437975
Maximum kernel time:       0.00449586
Arithm. Mean kernel time:  0.000604366
Performance results        
Total GFlops/s:            23.975
Minimum GFlops/s:          3.22374
Maximum GFlops/s:          33.0921
Arithm. Mean GFlops/s:     23.9813
* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 39467)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 39467)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 39467)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 39467)
Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_7
To display your profiling results:
#################################################################################################################################
#    LEVEL    |     REPORT     |                                            COMMAND                                             #
#################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_7      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_7  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_7  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_7  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_7      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_7  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_7  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_7  #
#################################################################################################################################
* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com
* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 39715)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS
Correctness check
Success, correct result: 235.15984 +- 0.000001. Correct Result: 235.159842
Configuration              
Number of Threads:         28
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt
Time measurements          
Total experiment time:     66.5848
Minimum kernel time:       0.000463963
Maximum kernel time:       0.00509214
Arithm. Mean kernel time:  0.000665786
Performance results        
Total GFlops/s:            21.767
Minimum GFlops/s:          2.84625
Maximum GFlops/s:          31.2385
Arithm. Mean GFlops/s:     21.769
* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 39715)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 39715)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 39715)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 39715)
Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_8
To display your profiling results:
#################################################################################################################################
#    LEVEL    |     REPORT     |                                            COMMAND                                             #
#################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_8      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_8  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_8  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_8  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_8      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_8  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_8  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_8  #
#################################################################################################################################
* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com
* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 39969)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS
Correctness check
Success, correct result: 234.50150 +- 0.000001. Correct Result: 234.501495
Configuration              
Number of Threads:         32
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt
Time measurements          
Total experiment time:     55.5627
Minimum kernel time:       0.000385046
Maximum kernel time:       0.00746298
Arithm. Mean kernel time:  0.000555561
Performance results        
Total GFlops/s:            26.085
Minimum GFlops/s:          1.94205
Maximum GFlops/s:          37.6409
Arithm. Mean GFlops/s:     26.088
* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 39969)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 39969)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 39969)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 39969)
Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_9
To display your profiling results:
#################################################################################################################################
#    LEVEL    |     REPORT     |                                            COMMAND                                             #
#################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_9      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_9  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_9  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_9  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_9      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_9  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_9  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_9  #
#################################################################################################################################
* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com
* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 40226)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS
Correctness check
Success, correct result: 234.49471 +- 0.000001. Correct Result: 234.494710
Configuration              
Number of Threads:         36
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt
Time measurements          
Total experiment time:     46.1579
Minimum kernel time:       0.00035882
Maximum kernel time:       0.00494289
Arithm. Mean kernel time:  0.00046152
Performance results        
Total GFlops/s:            31.3998
Minimum GFlops/s:          2.93219
Maximum GFlops/s:          40.3921
Arithm. Mean GFlops/s:     31.4038
* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 40226)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 40226)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 40226)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 40226)
Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_10
To display your profiling results:
##################################################################################################################################
#    LEVEL    |     REPORT     |                                             COMMAND                                             #
##################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_10      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_10  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_10  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_10  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_10      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_10  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_10  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_10  #
##################################################################################################################################
* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com
* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 40488)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS
Correctness check
Success, correct result: 232.04594 +- 0.000001. Correct Result: 232.045941
Configuration              
Number of Threads:         40
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt
Time measurements          
Total experiment time:     37.8883
Minimum kernel time:       0.000302076
Maximum kernel time:       0.00499606
Arithm. Mean kernel time:  0.000378823
Performance results        
Total GFlops/s:            38.2532
Minimum GFlops/s:          2.90098
Maximum GFlops/s:          47.9796
Arithm. Mean GFlops/s:     38.2593
* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 40488)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 40488)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 40488)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 40488)
Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_11
To display your profiling results:
##################################################################################################################################
#    LEVEL    |     REPORT     |                                             COMMAND                                             #
##################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_11      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_11  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_11  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_11  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_11      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_11  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_11  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_11  #
##################################################################################################################################
* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com
* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 40754)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS
Correctness check
Success, correct result: 234.89474 +- 0.000001. Correct Result: 234.894736
Configuration              
Number of Threads:         44
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt
Time measurements          
Total experiment time:     48.6906
Minimum kernel time:       0.00028801
Maximum kernel time:       0.0108039
Arithm. Mean kernel time:  0.000486843
Performance results        
Total GFlops/s:            29.7665
Minimum GFlops/s:          1.3415
Maximum GFlops/s:          50.3229
Arithm. Mean GFlops/s:     29.7704
* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 40754)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 40754)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 40754)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 40754)
Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_12
To display your profiling results:
##################################################################################################################################
#    LEVEL    |     REPORT     |                                             COMMAND                                             #
##################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_12      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_12  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_12  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_12  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_12      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_12  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_12  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_12  #
##################################################################################################################################
* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com
* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 41020)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS
Correctness check
Success, correct result: 234.20207 +- 0.000001. Correct Result: 234.202071
Configuration              
Number of Threads:         48
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt
Time measurements          
Total experiment time:     34.0594
Minimum kernel time:       0.000223875
Maximum kernel time:       0.00328398
Arithm. Mean kernel time:  0.000340534
Performance results        
Total GFlops/s:            42.5535
Minimum GFlops/s:          4.4134
Maximum GFlops/s:          64.7392
Arithm. Mean GFlops/s:     42.5611
* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 41020)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 41020)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 41020)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 41020)
Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_13
To display your profiling results:
##################################################################################################################################
#    LEVEL    |     REPORT     |                                             COMMAND                                             #
##################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_13      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_13  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_13  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_13  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_13      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_13  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_13  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_13  #
##################################################################################################################################
* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com
* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 41294)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS
Correctness check
Success, correct result: 234.09174 +- 0.000001. Correct Result: 234.091745
Configuration              
Number of Threads:         52
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt
Time measurements          
Total experiment time:     32.4846
Minimum kernel time:       0.000189066
Maximum kernel time:       0.00662804
Arithm. Mean kernel time:  0.00032477
Performance results        
Total GFlops/s:            44.6165
Minimum GFlops/s:          2.18669
Maximum GFlops/s:          76.6584
Arithm. Mean GFlops/s:     44.627
* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 41294)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 41294)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 41294)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 41294)
Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_14
To display your profiling results:
##################################################################################################################################
#    LEVEL    |     REPORT     |                                             COMMAND                                             #
##################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_14      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_14  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_14  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_14  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_14      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_14  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_14  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_14  #
##################################################################################################################################
* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com
* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 41572)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS
Correctness check
Success, correct result: 234.67576 +- 0.000001. Correct Result: 234.675759
Configuration              
Number of Threads:         56
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt
Time measurements          
Total experiment time:     21.7361
Minimum kernel time:       0.000133991
Maximum kernel time:       0.00357389
Arithm. Mean kernel time:  0.0002173
Performance results        
Total GFlops/s:            66.6794
Minimum GFlops/s:          4.05538
Maximum GFlops/s:          108.167
Arithm. Mean GFlops/s:     66.6981
* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 41572)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 41572)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 41572)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 41572)
Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_15
To display your profiling results:
##################################################################################################################################
#    LEVEL    |     REPORT     |                                             COMMAND                                             #
##################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_15      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_15  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_15  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_15  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_15      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_15  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_15  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_15  #
##################################################################################################################################
* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com
* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 41851)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS
Correctness check
Success, correct result: 235.59394 +- 0.000001. Correct Result: 235.593936
Configuration              
Number of Threads:         60
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt
Time measurements          
Total experiment time:     17.426
Minimum kernel time:       0.000114918
Maximum kernel time:       0.00346708
Arithm. Mean kernel time:  0.000174228
Performance results        
Total GFlops/s:            83.1715
Minimum GFlops/s:          4.18031
Maximum GFlops/s:          126.121
Arithm. Mean GFlops/s:     83.1869
* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 41851)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 41851)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 41851)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 41851)
Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_16
To display your profiling results:
##################################################################################################################################
#    LEVEL    |     REPORT     |                                             COMMAND                                             #
##################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_16      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_16  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_16  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_16  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_16      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_16  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_16  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_16  #
##################################################################################################################################
* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com
* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 42133)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS
Correctness check
Success, correct result: 234.14378 +- 0.000001. Correct Result: 234.143784
Configuration              
Number of Threads:         64
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt
Time measurements          
Total experiment time:     13.993
Minimum kernel time:       9.58443e-05
Maximum kernel time:       0.00653195
Arithm. Mean kernel time:  0.000139869
Performance results        
Total GFlops/s:            103.576
Minimum GFlops/s:          2.21886
Maximum GFlops/s:          151.219
Arithm. Mean GFlops/s:     103.622
* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 42133)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 42133)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 42133)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 42133)
Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_17
To display your profiling results:
##################################################################################################################################
#    LEVEL    |     REPORT     |                                             COMMAND                                             #
##################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_17      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_17  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_17  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_17  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_17      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_17  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_17  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_17  #
##################################################################################################################################
* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com
* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 42418)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS
Correctness check
Success, correct result: 233.48366 +- 0.000001. Correct Result: 233.483659
Configuration              
Number of Threads:         68
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt
Time measurements          
Total experiment time:     11.2414
Minimum kernel time:       7.79629e-05
Maximum kernel time:       0.00682306
Arithm. Mean kernel time:  0.000112379
Performance results        
Total GFlops/s:            128.93
Minimum GFlops/s:          2.12419
Maximum GFlops/s:          185.903
Arithm. Mean GFlops/s:     128.97
* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 42418)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 42418)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 42418)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 42418)
Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_18
To display your profiling results:
##################################################################################################################################
#    LEVEL    |     REPORT     |                                             COMMAND                                             #
##################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_18      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_18  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_18  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_18  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_18      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_18  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_18  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_18  #
##################################################################################################################################
* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com
* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 42709)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS
Correctness check
Success, correct result: 234.57436 +- 0.000001. Correct Result: 234.574356
Configuration              
Number of Threads:         72
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt
Time measurements          
Total experiment time:     8.93133
Minimum kernel time:       6.69956e-05
Maximum kernel time:       0.00545096
Arithm. Mean kernel time:  8.92672e-05
Performance results        
Total GFlops/s:            162.277
Minimum GFlops/s:          2.65889
Maximum GFlops/s:          216.335
Arithm. Mean GFlops/s:     162.361
* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 42709)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 42709)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 42709)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 42709)
Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_19
To display your profiling results:
##################################################################################################################################
#    LEVEL    |     REPORT     |                                             COMMAND                                             #
##################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_19      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_19  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_19  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_19  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_19      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_19  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_19  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_19  #
##################################################################################################################################
* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com
* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 43002)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS
Correctness check
Success, correct result: 234.59649 +- 0.000001. Correct Result: 234.596489
Configuration              
Number of Threads:         76
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt
Time measurements          
Total experiment time:     7.97909
Minimum kernel time:       6.10352e-05
Maximum kernel time:       0.00944781
Arithm. Mean kernel time:  7.97562e-05
Performance results        
Total GFlops/s:            181.643
Minimum GFlops/s:          1.53406
Maximum GFlops/s:          237.461
Arithm. Mean GFlops/s:     181.722
* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 43002)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 43002)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 43002)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 43002)
Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_20
To display your profiling results:
##################################################################################################################################
#    LEVEL    |     REPORT     |                                             COMMAND                                             #
##################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_20      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_20  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_20  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_20  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_20      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_20  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_20  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_20  #
##################################################################################################################################
* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com
* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 43301)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS
Correctness check
Success, correct result: 234.10632 +- 0.000001. Correct Result: 234.106322
Configuration              
Number of Threads:         80
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt
Time measurements          
Total experiment time:     7.33808
Minimum kernel time:       5.57899e-05
Maximum kernel time:       0.0101359
Arithm. Mean kernel time:  7.33123e-05
Performance results        
Total GFlops/s:            197.511
Minimum GFlops/s:          1.42992
Maximum GFlops/s:          259.787
Arithm. Mean GFlops/s:     197.695
* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 43301)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 43301)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 43301)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 43301)
Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_21
To display your profiling results:
##################################################################################################################################
#    LEVEL    |     REPORT     |                                             COMMAND                                             #
##################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_21      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_21  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_21  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_21  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_21      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_21  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_21  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_21  #
##################################################################################################################################
* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com
* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 43603)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS
Correctness check
Success, correct result: 235.07646 +- 0.000001. Correct Result: 235.076459
Configuration              
Number of Threads:         84
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt
Time measurements          
Total experiment time:     6.94959
Minimum kernel time:       5.48363e-05
Maximum kernel time:       0.00653505
Arithm. Mean kernel time:  6.94627e-05
Performance results        
Total GFlops/s:            208.552
Minimum GFlops/s:          2.21781
Maximum GFlops/s:          264.305
Arithm. Mean GFlops/s:     208.651
* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 43603)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 43603)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 43603)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 43603)
Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_22
To display your profiling results:
##################################################################################################################################
#    LEVEL    |     REPORT     |                                             COMMAND                                             #
##################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_22      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_22  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_22  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_22  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_22      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_22  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_22  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_22  #
##################################################################################################################################
* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com
* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 43910)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS
Correctness check
Success, correct result: 235.59482 +- 0.000001. Correct Result: 235.594825
Configuration              
Number of Threads:         88
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt
Time measurements          
Total experiment time:     6.97071
Minimum kernel time:       5.19753e-05
Maximum kernel time:       0.00780201
Arithm. Mean kernel time:  6.96405e-05
Performance results        
Total GFlops/s:            207.92
Minimum GFlops/s:          1.85766
Maximum GFlops/s:          278.854
Arithm. Mean GFlops/s:     208.119
* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 43910)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 43910)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 43910)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 43910)
Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_23
To display your profiling results:
##################################################################################################################################
#    LEVEL    |     REPORT     |                                             COMMAND                                             #
##################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_23      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_23  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_23  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_23  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_23      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_23  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_23  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_23  #
##################################################################################################################################
* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com
* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 44219)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS
Correctness check
Success, correct result: 234.21674 +- 0.000001. Correct Result: 234.216745
Configuration              
Number of Threads:         92
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt
Time measurements          
Total experiment time:     5.98542
Minimum kernel time:       4.79221e-05
Maximum kernel time:       0.00679994
Arithm. Mean kernel time:  5.98088e-05
Performance results        
Total GFlops/s:            242.147
Minimum GFlops/s:          2.13142
Maximum GFlops/s:          302.438
Arithm. Mean GFlops/s:     242.331
* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 44219)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 44219)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 44219)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 44219)
Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_24
To display your profiling results:
##################################################################################################################################
#    LEVEL    |     REPORT     |                                             COMMAND                                             #
##################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_24      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_24  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_24  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_24  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_24      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_24  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_24  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_24  #
##################################################################################################################################
* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com
* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 44532)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS
Correctness check
Success, correct result: 233.25479 +- 0.000001. Correct Result: 233.254789
Configuration              
Number of Threads:         96
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt
Time measurements          
Total experiment time:     5.70626
Minimum kernel time:       4.50611e-05
Maximum kernel time:       0.00710297
Arithm. Mean kernel time:  5.70324e-05
Performance results        
Total GFlops/s:            253.993
Minimum GFlops/s:          2.04048
Maximum GFlops/s:          321.641
Arithm. Mean GFlops/s:     254.127
* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 44532)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 44532)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 44532)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 44532)
Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_25
To display your profiling results:
##################################################################################################################################
#    LEVEL    |     REPORT     |                                             COMMAND                                             #
##################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_25      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_25  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_25  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_25  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_25      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_25  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_25  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_25  #
##################################################################################################################################
* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com
* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 44850)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS
Correctness check
Success, correct result: 234.58324 +- 0.000001. Correct Result: 234.583237
Configuration              
Number of Threads:         100
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt
Time measurements          
Total experiment time:     5.69868
Minimum kernel time:       4.57764e-05
Maximum kernel time:       0.0186312
Arithm. Mean kernel time:  5.69526e-05
Performance results        
Total GFlops/s:            254.331
Minimum GFlops/s:          0.777914
Maximum GFlops/s:          316.615
Arithm. Mean GFlops/s:     254.483
* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 44850)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 44850)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 44850)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 44850)
Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_26
To display your profiling results:
##################################################################################################################################
#    LEVEL    |     REPORT     |                                             COMMAND                                             #
##################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_26      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_26  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_26  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_26  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_26      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_26  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_26  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_26  #
##################################################################################################################################
* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com
* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 45174)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS
Correctness check
Success, correct result: 235.66481 +- 0.000001. Correct Result: 235.664808
Configuration              
Number of Threads:         104
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt
Time measurements          
Total experiment time:     5.83936
Minimum kernel time:       4.60148e-05
Maximum kernel time:       0.0177529
Arithm. Mean kernel time:  5.83615e-05
Performance results        
Total GFlops/s:            248.204
Minimum GFlops/s:          0.816402
Maximum GFlops/s:          314.975
Arithm. Mean GFlops/s:     248.34
* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 45174)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 45174)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 45174)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 45174)
Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_27
To display your profiling results:
##################################################################################################################################
#    LEVEL    |     REPORT     |                                             COMMAND                                             #
##################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_27      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_27  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_27  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_27  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_27      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_27  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_27  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_27  #
##################################################################################################################################