
Executable Output

* Info: Detected 1 Lprof instances in o401: processes-per-node/ppn set accordingly.
If this is incorrect, rerun with an explicit value for this setting

* Info: Selecting the 'perf-high-ppn' engine for node o401

* Info: Process launched (host o401, process 36193)
-------------------------------------------------------------
STREAM version $Revision: 5.10 $
This system uses 8 bytes per array element.
Array size = 860160000 (elements), Offset = 0 (elements)
Memory per array = 6562.5 MiB (= 6.4 GiB).
Total memory required = 19687.5 MiB (= 19.2 GiB).
Each kernel will be executed 100 times.
 The *best* time for each kernel (excluding the first iteration)
 will be used to compute the reported bandwidth.
Number of Threads requested = 112
Number of Threads counted = 112
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 29917 microseconds.
   (= 29917 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:          459545.0     0.030052     0.029948     0.030765
Scale:         457606.9     0.030193     0.030075     0.032782
Add:           481659.8     0.042975     0.042860     0.044219
Triad:         482051.3     0.042995     0.042825     0.045287
Solution Validates: avg error less than 1.000000e-13 on all three arrays

* Info: Process finished (host o401, process 36193)
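
Note: the figures in the STREAM table above can be cross-checked from the array size and the minimum times printed in the log. The sketch below is an illustration, not part of the tool output. It assumes the usual STREAM accounting: Copy and Scale touch two arrays per iteration, Add and Triad touch three, and "MB/s" means 10^6 bytes per second. The names `array_mib` and `best_rate_mb_s` are hypothetical helpers introduced here. Because the log prints times rounded to six decimals, the recomputed rates match the reported ones only to within a few MB/s.

```python
# Sketch: recompute STREAM's memory footprint and best rates from the log values.
# Assumption: standard STREAM byte counting (2 arrays for Copy/Scale, 3 for Add/Triad).

BYTES_PER_ELEMENT = 8          # "This system uses 8 bytes per array element."
ARRAY_ELEMENTS = 860_160_000   # "Array size = 860160000 (elements)"

# Number of arrays read or written per kernel iteration.
ARRAYS_TOUCHED = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}

# "Min time" column from the log, in seconds.
MIN_TIME_S = {
    "Copy": 0.029948,
    "Scale": 0.030075,
    "Add": 0.042860,
    "Triad": 0.042825,
}


def array_mib(elements: int = ARRAY_ELEMENTS) -> float:
    """Size of one array in MiB (2**20 bytes)."""
    return elements * BYTES_PER_ELEMENT / 2**20


def best_rate_mb_s(kernel: str, min_time_s: float) -> float:
    """Best rate in MB/s (10**6 bytes per second) for one kernel."""
    bytes_moved = ARRAYS_TOUCHED[kernel] * BYTES_PER_ELEMENT * ARRAY_ELEMENTS
    return 1e-6 * bytes_moved / min_time_s


if __name__ == "__main__":
    print(f"Memory per array: {array_mib():.1f} MiB")
    print(f"Total memory:     {3 * array_mib():.1f} MiB")
    for kernel, t in MIN_TIME_S.items():
        print(f"{kernel + ':':<8}{best_rate_mb_s(kernel, t):>12.1f} MB/s")
```

Under these assumptions the footprint reproduces exactly (6562.5 MiB per array, 19687.5 MiB in total), and the per-kernel rates agree with the "Best Rate MB/s" column up to the rounding of the printed minimum times.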

Your experiment path is /scratch_na/users/xoserete/qaas_runs/171-415-1514/intel/stream/run/oneview_runs/compilers/gcc_6/oneview_results_1714153775/tools/lprof_npsu_run_0

To display your profiling results:
#    LEVEL    |     REPORT     |                                                                                  COMMAND                                                                                   #
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/scratch_na/users/xoserete/qaas_runs/171-415-1514/intel/stream/run/oneview_runs/compilers/gcc_6/oneview_results_1714153775/tools/lprof_npsu_run_0      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/scratch_na/users/xoserete/qaas_runs/171-415-1514/intel/stream/run/oneview_runs/compilers/gcc_6/oneview_results_1714153775/tools/lprof_npsu_run_0  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/scratch_na/users/xoserete/qaas_runs/171-415-1514/intel/stream/run/oneview_runs/compilers/gcc_6/oneview_results_1714153775/tools/lprof_npsu_run_0  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/scratch_na/users/xoserete/qaas_runs/171-415-1514/intel/stream/run/oneview_runs/compilers/gcc_6/oneview_results_1714153775/tools/lprof_npsu_run_0  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/scratch_na/users/xoserete/qaas_runs/171-415-1514/intel/stream/run/oneview_runs/compilers/gcc_6/oneview_results_1714153775/tools/lprof_npsu_run_0      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/scratch_na/users/xoserete/qaas_runs/171-415-1514/intel/stream/run/oneview_runs/compilers/gcc_6/oneview_results_1714153775/tools/lprof_npsu_run_0  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/scratch_na/users/xoserete/qaas_runs/171-415-1514/intel/stream/run/oneview_runs/compilers/gcc_6/oneview_results_1714153775/tools/lprof_npsu_run_0  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/scratch_na/users/xoserete/qaas_runs/171-415-1514/intel/stream/run/oneview_runs/compilers/gcc_6/oneview_results_1714153775/tools/lprof_npsu_run_0  #
