
Executable Output

* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 17227)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 233.30511 +- 0.000001. Correct Result: 233.305107

Number of Threads:         1
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     603.703
Minimum kernel time:       0.006006
Maximum kernel time:       0.006706
Arithm. Mean kernel time:  0.00603689

Performance results        
Total GFlops/s:            2.40077
Minimum GFlops/s:          2.16127
Maximum GFlops/s:          2.41317
Arithm. Mean GFlops/s:     2.40082

* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 17227)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 17227)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 17227)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 17227)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_0

To display your profiling results:
#    LEVEL    |     REPORT     |                                             COMMAND                                              #
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_0      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_0  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_0  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_0  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_0      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_0  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_0  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_0  #

* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 17485)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 235.08731 +- 0.000001. Correct Result: 235.087310

Number of Threads:         2
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     303.438
Minimum kernel time:       0.00283504
Maximum kernel time:       0.0165291
Arithm. Mean kernel time:  0.0030343

Performance results        
Total GFlops/s:            4.77642
Minimum GFlops/s:          0.876848
Maximum GFlops/s:          5.11228
Arithm. Mean GFlops/s:     4.77655

* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 17485)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 17485)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 17485)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 17485)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_1

To display your profiling results:
#    LEVEL    |     REPORT     |                                             COMMAND                                              #
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_1      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_1  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_1  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_1  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_1      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_1  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_1  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_1  #

* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 17725)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 234.83030 +- 0.000001. Correct Result: 234.830297

Number of Threads:         4
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     128.766
Minimum kernel time:       0.00112486
Maximum kernel time:       0.0160851
Arithm. Mean kernel time:  0.00128761

Performance results        
Total GFlops/s:            11.2557
Minimum GFlops/s:          0.901048
Maximum GFlops/s:          12.8847
Arithm. Mean GFlops/s:     11.2561

* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 17725)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 17725)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 17725)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 17725)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_2

To display your profiling results:
#    LEVEL    |     REPORT     |                                             COMMAND                                              #
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_2      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_2  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_2  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_2  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_2      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_2  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_2  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_2  #

* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 17958)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 234.51698 +- 0.000001. Correct Result: 234.516981

Number of Threads:         8
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     62.182
Minimum kernel time:       0.00058198
Maximum kernel time:       0.0130529
Arithm. Mean kernel time:  0.000621784

Performance results        
Total GFlops/s:            23.3082
Minimum GFlops/s:          1.11036
Maximum GFlops/s:          24.9038
Arithm. Mean GFlops/s:     23.3095

* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 17958)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 17958)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 17958)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 17958)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_3

To display your profiling results:
#    LEVEL    |     REPORT     |                                             COMMAND                                              #
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_3      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_3  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_3  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_3  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_3      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_3  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_3  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_3  #

* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 18191)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 233.97253 +- 0.000001. Correct Result: 233.972526

Number of Threads:         16
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     34.7781
Minimum kernel time:       0.000311852
Maximum kernel time:       0.00440598
Arithm. Mean kernel time:  0.000347745

Performance results        
Total GFlops/s:            41.6742
Minimum GFlops/s:          3.28951
Maximum GFlops/s:          46.4756
Arithm. Mean GFlops/s:     41.6785

* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 18191)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 18191)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 18191)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 18191)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_4

To display your profiling results:
#    LEVEL    |     REPORT     |                                             COMMAND                                              #
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_4      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_4  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_4  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_4  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_4      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_4  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_4  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_4  #

* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 18432)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 234.22711 +- 0.000001. Correct Result: 234.227105

Number of Threads:         32
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     26.9688
Minimum kernel time:       0.000185013
Maximum kernel time:       0.013273
Arithm. Mean kernel time:  0.000269652

Performance results        
Total GFlops/s:            53.7416
Minimum GFlops/s:          1.09195
Maximum GFlops/s:          78.3378
Arithm. Mean GFlops/s:     53.7488

* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 18432)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 18432)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 18432)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 18432)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_5

To display your profiling results:
#    LEVEL    |     REPORT     |                                             COMMAND                                              #
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_5      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_5  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_5  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_5  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_5      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_5  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_5  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_5  #

* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 18685)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 233.65258 +- 0.000001. Correct Result: 233.652583

Number of Threads:         52
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     17.2496
Minimum kernel time:       0.000123978
Maximum kernel time:       0.00764799
Arithm. Mean kernel time:  0.000172457

Performance results        
Total GFlops/s:            84.0222
Minimum GFlops/s:          1.89507
Maximum GFlops/s:          116.904
Arithm. Mean GFlops/s:     84.0414

* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 18685)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 18685)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 18685)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 18685)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_6

To display your profiling results:
#    LEVEL    |     REPORT     |                                             COMMAND                                              #
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_6      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_6  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_6  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_6  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_6      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_6  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_6  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_6  #

* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 18962)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 233.58766 +- 0.000001. Correct Result: 233.587659

Number of Threads:         104
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     5.68779
Minimum kernel time:       4.50611e-05
Maximum kernel time:       0.00997901
Arithm. Mean kernel time:  5.68462e-05

Performance results        
Total GFlops/s:            254.818
Minimum GFlops/s:          1.4524
Maximum GFlops/s:          321.641
Arithm. Mean GFlops/s:     254.96

* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 18962)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 18962)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 18962)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 18962)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_7

To display your profiling results:
#    LEVEL    |     REPORT     |                                             COMMAND                                              #
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_7      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_7  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_7  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_7  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_7      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_7  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_7  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_7  #

* Info: Selecting the 'perf-low-ppn' engine for node ifcp01.benchmarkcenter.megware.com

* Info: Process launched (host ifcp01.benchmarkcenter.megware.com, process 19289)
reading matrix in matlab format from input-matrix/mat_dim_493039.txt
Loaded Matrix and random RHS

Correctness check
Success, correct result: 234.42686 +- 0.000001. Correct Result: 234.426860

Number of Threads:         208
Number of Repetitions:     100000
Input filename:            input-matrix/mat_dim_493039.txt

Time measurements          
Total experiment time:     5.2333
Minimum kernel time:       4.07696e-05
Maximum kernel time:       0.0176921
Arithm. Mean kernel time:  5.22982e-05

Performance results        
Total GFlops/s:            276.947
Minimum GFlops/s:          0.819208
Maximum GFlops/s:          355.498
Arithm. Mean GFlops/s:     277.132

* Info: Process finished (host ifcp01.benchmarkcenter.megware.com, process 19289)
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host ifcp01.benchmarkcenter.megware.com, process 19289)
* Info: Dumping source info for callchain nodes (host ifcp01.benchmarkcenter.megware.com, process 19289)
* Info: Building/writing metadata (host ifcp01.benchmarkcenter.megware.com)
* Info: Finished collect step (host ifcp01.benchmarkcenter.megware.com, process 19289)

Your experiment path is /home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_8

To display your profiling results:
#    LEVEL    |     REPORT     |                                             COMMAND                                              #
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_8      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_8  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_8  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_8  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_8      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_8  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_8  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/epi-spmxv-main/icx_ov1_scala_spread/tools/lprof_npsu_run_8  #
