Executable Output


* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 13536)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 234.13128 +- 0.000001. Correct Result: 234.131285

Configuration              
Number of Threads:         1
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     593.367
Minimum kernel time:       0.00590587
Maximum kernel time:       0.00661588
Arithm. Mean kernel time:  0.00593356

Performance results        
Total GFlops/s:            2.44259
Minimum GFlops/s:          2.19071
Maximum GFlops/s:          2.45408
Arithm. Mean GFlops/s:     2.44263


* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 13536)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 13536)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 13536)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 13536)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_0

To display your profiling results:
#################################################################################################################################
#    LEVEL    |     REPORT     |                                            COMMAND                                             #
#################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_0      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_0  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_0  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_0  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_0      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_0  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_0  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_0  #
#################################################################################################################################


* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 13788)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 235.40982 +- 0.000001. Correct Result: 235.409821

Configuration              
Number of Threads:         2
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     306.107
Minimum kernel time:       0.00283289
Maximum kernel time:       0.0143871
Arithm. Mean kernel time:  0.00306098

Performance results        
Total GFlops/s:            4.73478
Minimum GFlops/s:          1.00739
Maximum GFlops/s:          5.11615
Arithm. Mean GFlops/s:     4.73491


* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 13788)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 13788)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 13788)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 13788)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_1

To display your profiling results:
#################################################################################################################################
#    LEVEL    |     REPORT     |                                            COMMAND                                             #
#################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_1      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_1  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_1  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_1  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_1      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_1  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_1  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_1  #
#################################################################################################################################


* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 14029)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 233.68230 +- 0.000001. Correct Result: 233.682303

Configuration              
Number of Threads:         4
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     126.101
Minimum kernel time:       0.0011301
Maximum kernel time:       0.0163009
Arithm. Mean kernel time:  0.00126096

Performance results        
Total GFlops/s:            11.4936
Minimum GFlops/s:          0.889121
Maximum GFlops/s:          12.8249
Arithm. Mean GFlops/s:     11.4941


* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 14029)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 14029)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 14029)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 14029)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_2

To display your profiling results:
#################################################################################################################################
#    LEVEL    |     REPORT     |                                            COMMAND                                             #
#################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_2      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_2  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_2  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_2  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_2      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_2  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_2  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_2  #
#################################################################################################################################


* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 14273)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 234.24167 +- 0.000001. Correct Result: 234.241672

Configuration              
Number of Threads:         8
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     65.5343
Minimum kernel time:       0.000580788
Maximum kernel time:       0.00996208
Arithm. Mean kernel time:  0.00065529

Performance results        
Total GFlops/s:            22.1159
Minimum GFlops/s:          1.45487
Maximum GFlops/s:          24.9549
Arithm. Mean GFlops/s:     22.1177


* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 14273)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 14273)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 14273)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 14273)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_3

To display your profiling results:
#################################################################################################################################
#    LEVEL    |     REPORT     |                                            COMMAND                                             #
#################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_3      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_3  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_3  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_3  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_3      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_3  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_3  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_3  #
#################################################################################################################################


* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 14509)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 233.69263 +- 0.000001. Correct Result: 233.692625

Configuration              
Number of Threads:         12
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     41.6672
Minimum kernel time:       0.000396967
Maximum kernel time:       0.0100789
Arithm. Mean kernel time:  0.000416619

Performance results        
Total GFlops/s:            34.7839
Minimum GFlops/s:          1.438
Maximum GFlops/s:          36.5106
Arithm. Mean GFlops/s:     34.7884


* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 14509)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 14509)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 14509)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 14509)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_4

To display your profiling results:
#################################################################################################################################
#    LEVEL    |     REPORT     |                                            COMMAND                                             #
#################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_4      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_4  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_4  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_4  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_4      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_4  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_4  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_4  #
#################################################################################################################################


* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 14745)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 233.48735 +- 0.000001. Correct Result: 233.487348

Configuration              
Number of Threads:         16
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     32.4508
Minimum kernel time:       0.000310898
Maximum kernel time:       0.00753713
Arithm. Mean kernel time:  0.000324455

Performance results        
Total GFlops/s:            44.663
Minimum GFlops/s:          1.92295
Maximum GFlops/s:          46.6182
Arithm. Mean GFlops/s:     44.6702


* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 14745)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 14745)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 14745)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 14745)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_5

To display your profiling results:
#################################################################################################################################
#    LEVEL    |     REPORT     |                                            COMMAND                                             #
#################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_5      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_5  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_5  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_5  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_5      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_5  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_5  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_5  #
#################################################################################################################################


* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 14982)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 235.59383 +- 0.000001. Correct Result: 235.593829

Configuration              
Number of Threads:         20
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     34.6215
Minimum kernel time:       0.000272989
Maximum kernel time:       0.00858188
Arithm. Mean kernel time:  0.000346101

Performance results        
Total GFlops/s:            41.8627
Minimum GFlops/s:          1.68885
Maximum GFlops/s:          53.0918
Arithm. Mean GFlops/s:     41.8765


* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 14982)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 14982)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 14982)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 14982)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_6

To display your profiling results:
#################################################################################################################################
#    LEVEL    |     REPORT     |                                            COMMAND                                             #
#################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_6      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_6  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_6  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_6  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_6      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_6  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_6  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_6  #
#################################################################################################################################


* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 15226)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 234.58904 +- 0.000001. Correct Result: 234.589038

Configuration              
Number of Threads:         24
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     23.698
Minimum kernel time:       0.000216007
Maximum kernel time:       0.00768089
Arithm. Mean kernel time:  0.000236927

Performance results        
Total GFlops/s:            61.1593
Minimum GFlops/s:          1.88695
Maximum GFlops/s:          67.0973
Arithm. Mean GFlops/s:     61.1729


* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 15226)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 15226)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 15226)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 15226)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_7

To display your profiling results:
#################################################################################################################################
#    LEVEL    |     REPORT     |                                            COMMAND                                             #
#################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_7      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_7  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_7  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_7  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_7      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_7  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_7  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_7  #
#################################################################################################################################


* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 15473)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 234.14513 +- 0.000001. Correct Result: 234.145133

Configuration              
Number of Threads:         28
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     20.4115
Minimum kernel time:       0.000188828
Maximum kernel time:       0.00948095
Arithm. Mean kernel time:  0.000204061

Performance results        
Total GFlops/s:            71.0064
Minimum GFlops/s:          1.5287
Maximum GFlops/s:          76.7552
Arithm. Mean GFlops/s:     71.0252


* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 15473)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 15473)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 15473)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 15473)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_8

To display your profiling results:
#################################################################################################################################
#    LEVEL    |     REPORT     |                                            COMMAND                                             #
#################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_8      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_8  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_8  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_8  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_8      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_8  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_8  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_8  #
#################################################################################################################################


* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 15726)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 235.22514 +- 0.000001. Correct Result: 235.225142

Configuration              
Number of Threads:         32
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     21.3642
Minimum kernel time:       0.000185013
Maximum kernel time:       0.0105119
Arithm. Mean kernel time:  0.000213587

Performance results        
Total GFlops/s:            67.84
Minimum GFlops/s:          1.37877
Maximum GFlops/s:          78.3378
Arithm. Mean GFlops/s:     67.8575


* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 15726)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 15726)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 15726)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 15726)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_9

To display your profiling results:
#################################################################################################################################
#    LEVEL    |     REPORT     |                                            COMMAND                                             #
#################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_9      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_9  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_9  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_9  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_9      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_9  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_9  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_9  #
#################################################################################################################################


* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 15981)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 233.97469 +- 0.000001. Correct Result: 233.974691

Configuration              
Number of Threads:         36
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     22.0839
Minimum kernel time:       0.000165939
Maximum kernel time:       0.0135689
Arithm. Mean kernel time:  0.000220782

Performance results        
Total GFlops/s:            65.6292
Minimum GFlops/s:          1.06814
Maximum GFlops/s:          87.3421
Arithm. Mean GFlops/s:     65.6463


* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 15981)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 15981)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 15981)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 15981)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_10

To display your profiling results:
##################################################################################################################################
#    LEVEL    |     REPORT     |                                             COMMAND                                             #
##################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_10      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_10  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_10  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_10  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_10      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_10  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_10  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_10  #
##################################################################################################################################


* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 16238)reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 235.77747 +- 0.000001. Correct Result: 235.777468

Configuration              
Number of Threads:         40
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     19.9751
Minimum kernel time:       0.000165939
Maximum kernel time:       0.0085659
Arithm. Mean kernel time:  0.000199694

Performance results        
Total GFlops/s:            72.5577
Minimum GFlops/s:          1.692
Maximum GFlops/s:          87.3421
Arithm. Mean GFlops/s:     72.5784


* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 16238)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 16238)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 16238)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 16238)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_11

To display your profiling results:
##################################################################################################################################
#    LEVEL    |     REPORT     |                                             COMMAND                                             #
##################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_11      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_11  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_11  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_11  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_11      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_11  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_11  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_11  #
##################################################################################################################################


* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 16504)reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 234.17924 +- 0.000001. Correct Result: 234.179236

Configuration              
Number of Threads:         44
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     18.884
Minimum kernel time:       0.000154018
Maximum kernel time:       0.00546503
Arithm. Mean kernel time:  0.000188782

Performance results        
Total GFlops/s:            76.7502
Minimum GFlops/s:          2.65204
Maximum GFlops/s:          94.1024
Arithm. Mean GFlops/s:     76.7737


* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 16504)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 16504)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 16504)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 16504)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_12

To display your profiling results:
##################################################################################################################################
#    LEVEL    |     REPORT     |                                             COMMAND                                             #
##################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_12      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_12  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_12  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_12  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_12      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_12  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_12  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_12  #
##################################################################################################################################


* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 16769)reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 234.32689 +- 0.000001. Correct Result: 234.326892

Configuration              
Number of Threads:         48
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     19.8904
Minimum kernel time:       0.000148058
Maximum kernel time:       0.00611591
Arithm. Mean kernel time:  0.000198848

Performance results        
Total GFlops/s:            72.8669
Minimum GFlops/s:          2.3698
Maximum GFlops/s:          97.8907
Arithm. Mean GFlops/s:     72.8875


* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 16769)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 16769)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 16769)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 16769)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_13

To display your profiling results:
##################################################################################################################################
#    LEVEL    |     REPORT     |                                             COMMAND                                             #
##################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_13      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_13  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_13  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_13  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_13      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_13  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_13  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_13  #
##################################################################################################################################


* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 17040)reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 235.22136 +- 0.000001. Correct Result: 235.221363

Configuration              
Number of Threads:         52
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     15.2174
Minimum kernel time:       0.000120878
Maximum kernel time:       0.00525117
Arithm. Mean kernel time:  0.000152117

Performance results        
Total GFlops/s:            95.2428
Minimum GFlops/s:          2.76005
Maximum GFlops/s:          119.902
Arithm. Mean GFlops/s:     95.2788


* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 17040)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 17040)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 17040)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 17040)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_14

To display your profiling results:
##################################################################################################################################
#    LEVEL    |     REPORT     |                                             COMMAND                                             #
##################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_14      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_14  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_14  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_14  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_14      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_14  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_14  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_14  #
##################################################################################################################################


* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 17315)reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 234.80422 +- 0.000001. Correct Result: 234.804219

Configuration              
Number of Threads:         56
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     13.6366
Minimum kernel time:       0.000108004
Maximum kernel time:       0.00648093
Arithm. Mean kernel time:  0.000136305

Performance results        
Total GFlops/s:            106.284
Minimum GFlops/s:          2.23633
Maximum GFlops/s:          134.195
Arithm. Mean GFlops/s:     106.331


* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 17315)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 17315)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 17315)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 17315)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_15

To display your profiling results:
##################################################################################################################################
#    LEVEL    |     REPORT     |                                             COMMAND                                             #
##################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_15      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_15  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_15  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_15  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_15      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_15  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_15  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_15  #
##################################################################################################################################


* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 17595)reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 234.93527 +- 0.000001. Correct Result: 234.935274

Configuration              
Number of Threads:         60
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     12.1888
Minimum kernel time:       9.20296e-05
Maximum kernel time:       0.00765491
Arithm. Mean kernel time:  0.000121855

Performance results        
Total GFlops/s:            118.908
Minimum GFlops/s:          1.89336
Maximum GFlops/s:          157.487
Arithm. Mean GFlops/s:     118.941


* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 17595)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 17595)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 17595)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 17595)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_16

To display your profiling results:
##################################################################################################################################
#    LEVEL    |     REPORT     |                                             COMMAND                                             #
##################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_16      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_16  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_16  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_16  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_16      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_16  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_16  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_16  #
##################################################################################################################################


* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 17876)reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 234.70960 +- 0.000001. Correct Result: 234.709604

Configuration              
Number of Threads:         64
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     9.95278
Minimum kernel time:       8.10623e-05
Maximum kernel time:       0.00903606
Arithm. Mean kernel time:  9.94954e-05

Performance results        
Total GFlops/s:            145.623
Minimum GFlops/s:          1.60396
Maximum GFlops/s:          178.794
Arithm. Mean GFlops/s:     145.67


* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 17876)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 17876)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 17876)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 17876)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_17

To display your profiling results:
##################################################################################################################################
#    LEVEL    |     REPORT     |                                             COMMAND                                             #
##################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_17      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_17  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_17  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_17  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_17      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_17  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_17  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_17  #
##################################################################################################################################


* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 18162)reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 235.14749 +- 0.000001. Correct Result: 235.147486

Configuration              
Number of Threads:         68
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     9.08714
Minimum kernel time:       7.58171e-05
Maximum kernel time:       0.00590515
Arithm. Mean kernel time:  9.08348e-05

Performance results        
Total GFlops/s:            159.495
Minimum GFlops/s:          2.45438
Maximum GFlops/s:          191.164
Arithm. Mean GFlops/s:     159.559


* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 18162)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 18162)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 18162)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 18162)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_18

To display your profiling results:
##################################################################################################################################
#    LEVEL    |     REPORT     |                                             COMMAND                                             #
##################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_18      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_18  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_18  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_18  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_18      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_18  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_18  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_18  #
##################################################################################################################################


* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 18454)reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 234.56137 +- 0.000001. Correct Result: 234.561373

Configuration              
Number of Threads:         72
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     7.72663
Minimum kernel time:       6.60419e-05
Maximum kernel time:       0.00704598
Arithm. Mean kernel time:  7.72375e-05

Performance results        
Total GFlops/s:            187.579
Minimum GFlops/s:          2.05699
Maximum GFlops/s:          219.459
Arithm. Mean GFlops/s:     187.648


* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 18454)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 18454)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 18454)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 18454)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_19

To display your profiling results:
##################################################################################################################################
#    LEVEL    |     REPORT     |                                             COMMAND                                             #
##################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_19      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_19  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_19  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_19  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_19      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_19  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_19  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_19  #
##################################################################################################################################


* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 18747)reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 235.94005 +- 0.000001. Correct Result: 235.940049

Configuration              
Number of Threads:         76
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     7.27518
Minimum kernel time:       6.10352e-05
Maximum kernel time:       0.00456595
Arithm. Mean kernel time:  7.27231e-05

Performance results        
Total GFlops/s:            199.218
Minimum GFlops/s:          3.17425
Maximum GFlops/s:          237.461
Arithm. Mean GFlops/s:     199.297


* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 18747)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 18747)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 18747)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 18747)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_20

To display your profiling results:
##################################################################################################################################
#    LEVEL    |     REPORT     |                                             COMMAND                                             #
##################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_20      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_20  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_20  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_20  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_20      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_20  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_20  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_20  #
##################################################################################################################################


* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 19044)reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 234.15881 +- 0.000001. Correct Result: 234.158806

Configuration              
Number of Threads:         80
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     7.1559
Minimum kernel time:       5.60284e-05
Maximum kernel time:       0.011795
Arithm. Mean kernel time:  7.15314e-05

Performance results        
Total GFlops/s:            202.539
Minimum GFlops/s:          1.22878
Maximum GFlops/s:          258.681
Arithm. Mean GFlops/s:     202.617


* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 19044)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 19044)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 19044)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 19044)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_21

To display your profiling results:
##################################################################################################################################
#    LEVEL    |     REPORT     |                                             COMMAND                                             #
##################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_21      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_21  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_21  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_21  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_21      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_21  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_21  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_21  #
##################################################################################################################################


* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 19345)reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 234.63902 +- 0.000001. Correct Result: 234.639019

Configuration              
Number of Threads:         84
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     6.29052
Minimum kernel time:       5.29289e-05
Maximum kernel time:       0.0132921
Arithm. Mean kernel time:  6.28767e-05

Performance results        
Total GFlops/s:            230.402
Minimum GFlops/s:          1.09039
Maximum GFlops/s:          273.829
Arithm. Mean GFlops/s:     230.507


* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 19345)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 19345)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 19345)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 19345)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_22

To display your profiling results:
##################################################################################################################################
#    LEVEL    |     REPORT     |                                             COMMAND                                             #
##################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_22      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_22  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_22  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_22  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_22      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_22  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_22  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_22  #
##################################################################################################################################


* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 19655)reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 233.78120 +- 0.000001. Correct Result: 233.781199

Configuration              
Number of Threads:         88
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     7.26697
Minimum kernel time:       5.29289e-05
Maximum kernel time:       0.0185201
Arithm. Mean kernel time:  7.26403e-05

Performance results        
Total GFlops/s:            199.443
Minimum GFlops/s:          0.782581
Maximum GFlops/s:          273.829
Arithm. Mean GFlops/s:     199.524


* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 19655)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 19655)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 19655)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 19655)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_23

To display your profiling results:
##################################################################################################################################
#    LEVEL    |     REPORT     |                                             COMMAND                                             #
##################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_23      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_23  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_23  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_23  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_23      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_23  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_23  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_23  #
##################################################################################################################################


* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 19965)reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 233.35842 +- 0.000001. Correct Result: 233.358416

Configuration              
Number of Threads:         92
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     5.77013
Minimum kernel time:       4.69685e-05
Maximum kernel time:       0.0111721
Arithm. Mean kernel time:  5.76731e-05

Performance results        
Total GFlops/s:            251.182
Minimum GFlops/s:          1.2973
Maximum GFlops/s:          308.579
Arithm. Mean GFlops/s:     251.304


* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 19965)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 19965)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 19965)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 19965)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_24

To display your profiling results:
##################################################################################################################################
#    LEVEL    |     REPORT     |                                             COMMAND                                             #
##################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_24      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_24  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_24  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_24  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_24      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_24  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_24  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_24  #
##################################################################################################################################


* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 20278)reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 234.94978 +- 0.000001. Correct Result: 234.949779

Configuration              
Number of Threads:         96
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     5.52277
Minimum kernel time:       4.69685e-05
Maximum kernel time:       0.00875688
Arithm. Mean kernel time:  5.51975e-05

Performance results        
Total GFlops/s:            262.432
Minimum GFlops/s:          1.6551
Maximum GFlops/s:          308.579
Arithm. Mean GFlops/s:     262.575


* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 20278)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 20278)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 20278)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 20278)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_25

To display your profiling results:
##################################################################################################################################
#    LEVEL    |     REPORT     |                                             COMMAND                                             #
##################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_25      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_25  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_25  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_25  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_25      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_25  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_25  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_25  #
##################################################################################################################################


* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 20595)reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 234.50809 +- 0.000001. Correct Result: 234.508089

Configuration              
Number of Threads:         100
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     5.73318
Minimum kernel time:       4.69685e-05
Maximum kernel time:       0.023932
Arithm. Mean kernel time:  5.72862e-05

Performance results        
Total GFlops/s:            252.8
Minimum GFlops/s:          0.605612
Maximum GFlops/s:          308.579
Arithm. Mean GFlops/s:     253.002


* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 20595)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 20595)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 20595)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 20595)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_26

To display your profiling results:
##################################################################################################################################
#    LEVEL    |     REPORT     |                                             COMMAND                                             #
##################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_26      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_26  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_26  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_26  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_26      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_26  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_26  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_26  #
##################################################################################################################################


* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 20917)reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 235.21801 +- 0.000001. Correct Result: 235.218006

Configuration              
Number of Threads:         104
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     5.51837
Minimum kernel time:       4.69685e-05
Maximum kernel time:       0.012172
Arithm. Mean kernel time:  5.51514e-05

Performance results        
Total GFlops/s:            262.641
Minimum GFlops/s:          1.19073
Maximum GFlops/s:          308.579
Arithm. Mean GFlops/s:     262.795


* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 20917)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 20917)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 20917)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 20917)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_27

To display your profiling results:
##################################################################################################################################
#    LEVEL    |     REPORT     |                                             COMMAND                                             #
##################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_27      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_27  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_27  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_27  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_27      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_27  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_27  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_fine/tools/lprof_npsu_run_27  #
##################################################################################################################################
