Executable Output


* Info: Selecting the 'perf-high-ppn' engine for node idp09.benchmarkcenter.megware.com

* Info: Process launched (host idp09.benchmarkcenter.megware.com, process 291918)
* Warning: Found no event able to derive walltime: prepending ref-cycles
* Info: Process launched (host idp09.benchmarkcenter.megware.com, process 291923)
   _  __       _         _
  | |/ /      (_)       | |
  | ' /  _ __  _  _ __  | | __ ___
  |  <  | '__|| || '_ \ | |/ // _ \ 
  | . \ | |   | || |_) ||   <|  __/
  |_|\_\|_|   |_|| .__/ |_|\_\\___|
                 | |
                 |_|        Version 1.2.4

LLNL-CODE-775068

Copyright (c) 2014-2019, Lawrence Livermore National Security, LLC

Kripke is released under the BSD 3-Clause License, please see the
LICENSE file for the full license

This work was produced under the auspices of the U.S. Department of
Energy by Lawrence Livermore National Laboratory under Contract
DE-AC52-07NA27344.

Author: Adam J. Kunen 

Compilation Options:
  Architecture:           OpenMP
  Compiler:               /cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicpc
  Compiler Flags:         "-O3 -march=native -O2 -march=sapphirerapids -mprefer-vector-width=512 -funroll-loops -ffast-math -g -fno-omit-frame-pointer -fcf-protection=none -no-pie -cxx=g++     -Wall -Wextra  "
  Linker Flags:           " "
  CHAI Enabled:           No
  CUDA Enabled:           No
  MPI Enabled:            Yes
  OpenMP Enabled:         Yes
  Caliper Enabled:        No

OpenMP Thread->Core mapping for 1 threads on rank 0
    0->  0

Input Parameters
================

  Problem Size:
    Zones:                 16 x 16 x 16  (4096 total)
    Groups:                1024
    Legendre Order:        4
    Quadrature Set:        Dummy S2 with 96 points

  Physical Properties:
    Total X-Sec:           sigt=[0.100000, 0.000100, 0.100000]
    Scattering X-Sec:      sigs=[0.050000, 0.000050, 0.050000]

  Solver Options:
    Number iterations:     10

  MPI Decomposition Options:
    Total MPI tasks:       2
    Spatial decomp:        2 x 1 x 1 MPI tasks
    Block solve method:    Sweep

  Per-Task Options:
    DirSets/Directions:    8 sets, 12 directions/set
    GroupSet/Groups:       2 sets, 512 groups/set
    Zone Sets:             1 x 1 x 1
    Architecture:          OpenMP
    Data Layout:           DGZ

Generating Problem
==================

  Decomposition Space:   Procs:      Subdomains (local/global):
  ---------------------  ----------  --------------------------
  (P) Energy:            1           2 / 2
  (Q) Direction:         1           8 / 8
  (R) Space:             2           1 / 2
  (Rx,Ry,Rz) R in XYZ:   2x1x1       1x1x1 / 2x1x1
  (PQR) TOTAL:           2           16 / 32

  Material Volumes=[8.789062e+03, 1.177734e+05, 2.753438e+06]

  Memory breakdown of Field variables:
  Field Variable            Num Elements    Megabytes
  --------------            ------------    ---------
  data/sigs                     15728640      120.000
  dx                                  16        0.000
  dy                                  16        0.000
  dz                                  16        0.000
  ell                               2400        0.018
  ell_plus                          2400        0.018
  i_plane                       25165824      192.000
  j_plane                       25165824      192.000
  k_plane                       25165824      192.000
  mixelem_to_fraction               4352        0.033
  phi                          104857600      800.000
  phi_out                      104857600      800.000
  psi                          402653184     3072.000
  quadrature/w                        96        0.001
  quadrature/xcos                     96        0.001
  quadrature/ycos                     96        0.001
  quadrature/zcos                     96        0.001
  rhs                          402653184     3072.000
  sigt_zonal                     4194304       32.000
  volume                            4096        0.031
  --------                  ------------    ---------
  TOTAL                       1110455664     8472.104

  Generation Complete!

Steady State Solve
==================

  iter 0: particle count=1.197998e+09, change=1.000000e+00
  iter 1: particle count=1.801368e+09, change=3.349511e-01
  iter 2: particle count=2.102278e+09, change=1.431351e-01
  iter 3: particle count=2.251810e+09, change=6.640521e-02
  iter 4: particle count=2.325888e+09, change=3.184924e-02
  iter 5: particle count=2.362467e+09, change=1.548355e-02
  iter 6: particle count=2.380471e+09, change=7.563193e-03
  iter 7: particle count=2.389305e+09, change=3.697158e-03
  iter 8: particle count=2.393627e+09, change=1.805479e-03
  iter 9: particle count=2.395735e+09, change=8.801810e-04
  Solver terminated

Timers
======

  Timer                    Count       Seconds
  ----------------  ------------  ------------
  Generate                     1       0.01863
  LPlusTimes                  10      48.85946
  LTimes                      10      47.98556
  Population                  10       2.43956
  Scattering                  10     926.58573
  Solve                        1    1041.63222
  Source                      10       0.01012
  SweepSolver                 10      14.69851
  SweepSubdomain             160      14.04403

TIMER_NAMES:Generate,LPlusTimes,LTimes,Population,Scattering,Solve,Source,SweepSolver,SweepSubdomain
TIMER_DATA:0.018629,48.859463,47.985561,2.439560,926.585725,1041.632218,0.010116,14.698511,14.044030

Figures of Merit
================

  Throughput:         3.865598e+06 [unknowns/(second/iteration)]
  Grind time :        2.586922e-07 [(seconds/iteration)/unknowns]
  Sweep efficiency :  95.54730 [100.0 * SweepSubdomain time / SweepSolver time]
  Number of unknowns: 402653184

END

* Info: Process finished (host idp09.benchmarkcenter.megware.com, process 291918)
* Info: Process finished (host idp09.benchmarkcenter.megware.com, process 291923)

Your experiment path is /home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_0

To display your profiling results:
#################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                            COMMAND                                                                             #
#################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_0      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_0  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_0  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_0  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_0      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_0  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_0  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_0  #
#################################################################################################################################################################################################


* Info: Selecting the 'perf-high-ppn' engine for node idp09.benchmarkcenter.megware.com

* Info: Process launched (host idp09.benchmarkcenter.megware.com, process 292055)
* Warning: Found no event able to derive walltime: prepending ref-cycles
* Info: Process launched (host idp09.benchmarkcenter.megware.com, process 292060)
   _  __       _         _
  | |/ /      (_)       | |
  | ' /  _ __  _  _ __  | | __ ___
  |  <  | '__|| || '_ \ | |/ // _ \ 
  | . \ | |   | || |_) ||   <|  __/
  |_|\_\|_|   |_|| .__/ |_|\_\\___|
                 | |
                 |_|        Version 1.2.4

LLNL-CODE-775068

Copyright (c) 2014-2019, Lawrence Livermore National Security, LLC

Kripke is released under the BSD 3-Clause License, please see the
LICENSE file for the full license

This work was produced under the auspices of the U.S. Department of
Energy by Lawrence Livermore National Laboratory under Contract
DE-AC52-07NA27344.

Author: Adam J. Kunen 

Compilation Options:
  Architecture:           OpenMP
  Compiler:               /cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicpc
  Compiler Flags:         "-O3 -march=native -O2 -march=sapphirerapids -mprefer-vector-width=512 -funroll-loops -ffast-math -g -fno-omit-frame-pointer -fcf-protection=none -no-pie -cxx=g++     -Wall -Wextra  "
  Linker Flags:           " "
  CHAI Enabled:           No
  CUDA Enabled:           No
  MPI Enabled:            Yes
  OpenMP Enabled:         Yes
  Caliper Enabled:        No

OpenMP Thread->Core mapping for 2 threads on rank 0
    0->  0    1-> 24

Input Parameters
================

  Problem Size:
    Zones:                 16 x 16 x 16  (4096 total)
    Groups:                1024
    Legendre Order:        4
    Quadrature Set:        Dummy S2 with 96 points

  Physical Properties:
    Total X-Sec:           sigt=[0.100000, 0.000100, 0.100000]
    Scattering X-Sec:      sigs=[0.050000, 0.000050, 0.050000]

  Solver Options:
    Number iterations:     10

  MPI Decomposition Options:
    Total MPI tasks:       2
    Spatial decomp:        2 x 1 x 1 MPI tasks
    Block solve method:    Sweep

  Per-Task Options:
    DirSets/Directions:    8 sets, 12 directions/set
    GroupSet/Groups:       2 sets, 512 groups/set
    Zone Sets:             1 x 1 x 1
    Architecture:          OpenMP
    Data Layout:           DGZ

Generating Problem
==================

  Decomposition Space:   Procs:      Subdomains (local/global):
  ---------------------  ----------  --------------------------
  (P) Energy:            1           2 / 2
  (Q) Direction:         1           8 / 8
  (R) Space:             2           1 / 2
  (Rx,Ry,Rz) R in XYZ:   2x1x1       1x1x1 / 2x1x1
  (PQR) TOTAL:           2           16 / 32

  Material Volumes=[8.789062e+03, 1.177734e+05, 2.753438e+06]

  Memory breakdown of Field variables:
  Field Variable            Num Elements    Megabytes
  --------------            ------------    ---------
  data/sigs                     15728640      120.000
  dx                                  16        0.000
  dy                                  16        0.000
  dz                                  16        0.000
  ell                               2400        0.018
  ell_plus                          2400        0.018
  i_plane                       25165824      192.000
  j_plane                       25165824      192.000
  k_plane                       25165824      192.000
  mixelem_to_fraction               4352        0.033
  phi                          104857600      800.000
  phi_out                      104857600      800.000
  psi                          402653184     3072.000
  quadrature/w                        96        0.001
  quadrature/xcos                     96        0.001
  quadrature/ycos                     96        0.001
  quadrature/zcos                     96        0.001
  rhs                          402653184     3072.000
  sigt_zonal                     4194304       32.000
  volume                            4096        0.031
  --------                  ------------    ---------
  TOTAL                       1110455664     8472.104

  Generation Complete!

Steady State Solve
==================

  iter 0: particle count=1.197998e+09, change=1.000000e+00
  iter 1: particle count=1.801368e+09, change=3.349511e-01
  iter 2: particle count=2.102278e+09, change=1.431351e-01
  iter 3: particle count=2.251810e+09, change=6.640521e-02
  iter 4: particle count=2.325888e+09, change=3.184924e-02
  iter 5: particle count=2.362467e+09, change=1.548355e-02
  iter 6: particle count=2.380471e+09, change=7.563193e-03
  iter 7: particle count=2.389305e+09, change=3.697158e-03
  iter 8: particle count=2.393627e+09, change=1.805479e-03
  iter 9: particle count=2.395735e+09, change=8.801810e-04
  Solver terminated

Timers
======

  Timer                    Count       Seconds
  ----------------  ------------  ------------
  Generate                     1       0.01970
  LPlusTimes                  10      27.48438
  LTimes                      10      29.81147
  Population                  10       1.24365
  Scattering                  10     465.17646
  Solve                        1     540.76831
  Source                      10       0.00591
  SweepSolver                 10      15.98915
  SweepSubdomain             160       8.56812

TIMER_NAMES:Generate,LPlusTimes,LTimes,Population,Scattering,Solve,Source,SweepSolver,SweepSubdomain
TIMER_DATA:0.019701,27.484381,29.811470,1.243646,465.176457,540.768311,0.005915,15.989155,8.568121

Figures of Merit
================

  Throughput:         7.445946e+06 [unknowns/(second/iteration)]
  Grind time :        1.343013e-07 [(seconds/iteration)/unknowns]
  Sweep efficiency :  53.58708 [100.0 * SweepSubdomain time / SweepSolver time]
  Number of unknowns: 402653184

END

* Info: Process finished (host idp09.benchmarkcenter.megware.com, process 292060)
* Info: Process finished (host idp09.benchmarkcenter.megware.com, process 292055)

Info: 1/2 lprof instances finished


Your experiment path is /home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_1

To display your profiling results:
#################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                            COMMAND                                                                             #
#################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_1      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_1  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_1  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_1  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_1      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_1  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_1  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_1  #
#################################################################################################################################################################################################


* Info: Selecting the 'perf-high-ppn' engine for node idp09.benchmarkcenter.megware.com

* Info: Process launched (host idp09.benchmarkcenter.megware.com, process 292164)
* Warning: Found no event able to derive walltime: prepending ref-cycles
* Info: Process launched (host idp09.benchmarkcenter.megware.com, process 292169)
   _  __       _         _
  | |/ /      (_)       | |
  | ' /  _ __  _  _ __  | | __ ___
  |  <  | '__|| || '_ \ | |/ // _ \ 
  | . \ | |   | || |_) ||   <|  __/
  |_|\_\|_|   |_|| .__/ |_|\_\\___|
                 | |
                 |_|        Version 1.2.4

LLNL-CODE-775068

Copyright (c) 2014-2019, Lawrence Livermore National Security, LLC

Kripke is released under the BSD 3-Clause License, please see the
LICENSE file for the full license

This work was produced under the auspices of the U.S. Department of
Energy by Lawrence Livermore National Laboratory under Contract
DE-AC52-07NA27344.

Author: Adam J. Kunen 

Compilation Options:
  Architecture:           OpenMP
  Compiler:               /cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicpc
  Compiler Flags:         "-O3 -march=native -O2 -march=sapphirerapids -mprefer-vector-width=512 -funroll-loops -ffast-math -g -fno-omit-frame-pointer -fcf-protection=none -no-pie -cxx=g++     -Wall -Wextra  "
  Linker Flags:           " "
  CHAI Enabled:           No
  CUDA Enabled:           No
  MPI Enabled:            Yes
  OpenMP Enabled:         Yes
  Caliper Enabled:        No

OpenMP Thread->Core mapping for 4 threads on rank 0
    0->  0    1-> 12    2-> 24    3-> 36

Input Parameters
================

  Problem Size:
    Zones:                 16 x 16 x 16  (4096 total)
    Groups:                1024
    Legendre Order:        4
    Quadrature Set:        Dummy S2 with 96 points

  Physical Properties:
    Total X-Sec:           sigt=[0.100000, 0.000100, 0.100000]
    Scattering X-Sec:      sigs=[0.050000, 0.000050, 0.050000]

  Solver Options:
    Number iterations:     10

  MPI Decomposition Options:
    Total MPI tasks:       2
    Spatial decomp:        2 x 1 x 1 MPI tasks
    Block solve method:    Sweep

  Per-Task Options:
    DirSets/Directions:    8 sets, 12 directions/set
    GroupSet/Groups:       2 sets, 512 groups/set
    Zone Sets:             1 x 1 x 1
    Architecture:          OpenMP
    Data Layout:           DGZ

Generating Problem
==================

  Decomposition Space:   Procs:      Subdomains (local/global):
  ---------------------  ----------  --------------------------
  (P) Energy:            1           2 / 2
  (Q) Direction:         1           8 / 8
  (R) Space:             2           1 / 2
  (Rx,Ry,Rz) R in XYZ:   2x1x1       1x1x1 / 2x1x1
  (PQR) TOTAL:           2           16 / 32

  Material Volumes=[8.789062e+03, 1.177734e+05, 2.753438e+06]

  Memory breakdown of Field variables:
  Field Variable            Num Elements    Megabytes
  --------------            ------------    ---------
  data/sigs                     15728640      120.000
  dx                                  16        0.000
  dy                                  16        0.000
  dz                                  16        0.000
  ell                               2400        0.018
  ell_plus                          2400        0.018
  i_plane                       25165824      192.000
  j_plane                       25165824      192.000
  k_plane                       25165824      192.000
  mixelem_to_fraction               4352        0.033
  phi                          104857600      800.000
  phi_out                      104857600      800.000
  psi                          402653184     3072.000
  quadrature/w                        96        0.001
  quadrature/xcos                     96        0.001
  quadrature/ycos                     96        0.001
  quadrature/zcos                     96        0.001
  rhs                          402653184     3072.000
  sigt_zonal                     4194304       32.000
  volume                            4096        0.031
  --------                  ------------    ---------
  TOTAL                       1110455664     8472.104

  Generation Complete!

Steady State Solve
==================

  iter 0: particle count=1.197998e+09, change=1.000000e+00
  iter 1: particle count=1.801368e+09, change=3.349511e-01
  iter 2: particle count=2.102278e+09, change=1.431351e-01
  iter 3: particle count=2.251810e+09, change=6.640521e-02
  iter 4: particle count=2.325888e+09, change=3.184924e-02
  iter 5: particle count=2.362467e+09, change=1.548355e-02
  iter 6: particle count=2.380471e+09, change=7.563193e-03
  iter 7: particle count=2.389305e+09, change=3.697158e-03
  iter 8: particle count=2.393627e+09, change=1.805479e-03
  iter 9: particle count=2.395735e+09, change=8.801810e-04
  Solver terminated

Timers
======

  Timer                    Count       Seconds
  ----------------  ------------  ------------
  Generate                     1       0.01965
  LPlusTimes                  10      14.85723
  LTimes                      10      16.40089
  Population                  10       0.76111
  Scattering                  10     241.42428
  Solve                        1     286.50605
  Source                      10       0.00386
  SweepSolver                 10      12.00208
  SweepSubdomain             160       4.57279

TIMER_NAMES:Generate,LPlusTimes,LTimes,Population,Scattering,Solve,Source,SweepSolver,SweepSubdomain
TIMER_DATA:0.019651,14.857232,16.400891,0.761110,241.424276,286.506050,0.003865,12.002080,4.572790

Figures of Merit
================

  Throughput:         1.405392e+07 [unknowns/(second/iteration)]
  Grind time :        7.115455e-08 [(seconds/iteration)/unknowns]
  Sweep efficiency :  38.09998 [100.0 * SweepSubdomain time / SweepSolver time]
  Number of unknowns: 402653184

END

* Info: Process finished (host idp09.benchmarkcenter.megware.com, process 292164)
* Info: Process finished (host idp09.benchmarkcenter.megware.com, process 292169)

Your experiment path is /home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_2

To display your profiling results:
#################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                            COMMAND                                                                             #
#################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_2      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_2  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_2  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_2  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_2      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_2  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_2  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_2  #
#################################################################################################################################################################################################


* Info: Selecting the 'perf-high-ppn' engine for node idp09.benchmarkcenter.megware.com

* Info: Process launched (host idp09.benchmarkcenter.megware.com, process 292261)
* Warning: Found no event able to derive walltime: prepending ref-cycles
* Info: Process launched (host idp09.benchmarkcenter.megware.com, process 292266)
   _  __       _         _
  | |/ /      (_)       | |
  | ' /  _ __  _  _ __  | | __ ___
  |  <  | '__|| || '_ \ | |/ // _ \ 
  | . \ | |   | || |_) ||   <|  __/
  |_|\_\|_|   |_|| .__/ |_|\_\\___|
                 | |
                 |_|        Version 1.2.4

LLNL-CODE-775068

Copyright (c) 2014-2019, Lawrence Livermore National Security, LLC

Kripke is released under the BSD 3-Clause License, please see the
LICENSE file for the full license

This work was produced under the auspices of the U.S. Department of
Energy by Lawrence Livermore National Laboratory under Contract
DE-AC52-07NA27344.

Author: Adam J. Kunen 

Compilation Options:
  Architecture:           OpenMP
  Compiler:               /cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicpc
  Compiler Flags:         "-O3 -march=native -O2 -march=sapphirerapids -mprefer-vector-width=512 -funroll-loops -ffast-math -g -fno-omit-frame-pointer -fcf-protection=none -no-pie -cxx=g++     -Wall -Wextra  "
  Linker Flags:           " "
  CHAI Enabled:           No
  CUDA Enabled:           No
  MPI Enabled:            Yes
  OpenMP Enabled:         Yes
  Caliper Enabled:        No

OpenMP Thread->Core mapping for 8 threads on rank 0
    0->  0    1->  6    2-> 12    3-> 18    4-> 24    5-> 30    6-> 36    7-> 42

Input Parameters
================

  Problem Size:
    Zones:                 16 x 16 x 16  (4096 total)
    Groups:                1024
    Legendre Order:        4
    Quadrature Set:        Dummy S2 with 96 points

  Physical Properties:
    Total X-Sec:           sigt=[0.100000, 0.000100, 0.100000]
    Scattering X-Sec:      sigs=[0.050000, 0.000050, 0.050000]

  Solver Options:
    Number iterations:     10

  MPI Decomposition Options:
    Total MPI tasks:       2
    Spatial decomp:        2 x 1 x 1 MPI tasks
    Block solve method:    Sweep

  Per-Task Options:
    DirSets/Directions:    8 sets, 12 directions/set
    GroupSet/Groups:       2 sets, 512 groups/set
    Zone Sets:             1 x 1 x 1
    Architecture:          OpenMP
    Data Layout:           DGZ

Generating Problem
==================

  Decomposition Space:   Procs:      Subdomains (local/global):
  ---------------------  ----------  --------------------------
  (P) Energy:            1           2 / 2
  (Q) Direction:         1           8 / 8
  (R) Space:             2           1 / 2
  (Rx,Ry,Rz) R in XYZ:   2x1x1       1x1x1 / 2x1x1
  (PQR) TOTAL:           2           16 / 32

  Material Volumes=[8.789062e+03, 1.177734e+05, 2.753438e+06]

  Memory breakdown of Field variables:
  Field Variable            Num Elements    Megabytes
  --------------            ------------    ---------
  data/sigs                     15728640      120.000
  dx                                  16        0.000
  dy                                  16        0.000
  dz                                  16        0.000
  ell                               2400        0.018
  ell_plus                          2400        0.018
  i_plane                       25165824      192.000
  j_plane                       25165824      192.000
  k_plane                       25165824      192.000
  mixelem_to_fraction               4352        0.033
  phi                          104857600      800.000
  phi_out                      104857600      800.000
  psi                          402653184     3072.000
  quadrature/w                        96        0.001
  quadrature/xcos                     96        0.001
  quadrature/ycos                     96        0.001
  quadrature/zcos                     96        0.001
  rhs                          402653184     3072.000
  sigt_zonal                     4194304       32.000
  volume                            4096        0.031
  --------                  ------------    ---------
  TOTAL                       1110455664     8472.104

  Generation Complete!

Steady State Solve
==================

  iter 0: particle count=1.197998e+09, change=1.000000e+00
  iter 1: particle count=1.801368e+09, change=3.349511e-01
  iter 2: particle count=2.102278e+09, change=1.431351e-01
  iter 3: particle count=2.251810e+09, change=6.640521e-02
  iter 4: particle count=2.325888e+09, change=3.184924e-02
  iter 5: particle count=2.362467e+09, change=1.548355e-02
  iter 6: particle count=2.380471e+09, change=7.563193e-03
  iter 7: particle count=2.389305e+09, change=3.697158e-03
  iter 8: particle count=2.393627e+09, change=1.805479e-03
  iter 9: particle count=2.395735e+09, change=8.801810e-04
  Solver terminated

Timers
======

  Timer                    Count       Seconds
  ----------------  ------------  ------------
  Generate                     1       0.02043
  LPlusTimes                  10       9.51116
  LTimes                      10       8.58667
  Population                  10       0.85079
  Scattering                  10     140.17268
  Solve                        1     164.22890
  Source                      10       0.00289
  SweepSolver                 10       4.04279
  SweepSubdomain             160       2.33546

TIMER_NAMES:Generate,LPlusTimes,LTimes,Population,Scattering,Solve,Source,SweepSolver,SweepSubdomain
TIMER_DATA:0.020426,9.511162,8.586667,0.850788,140.172681,164.228900,0.002890,4.042791,2.335459

Figures of Merit
================

  Throughput:         2.451780e+07 [unknowns/(second/iteration)]
  Grind time :        4.078669e-08 [(seconds/iteration)/unknowns]
  Sweep efficiency :  57.76847 [100.0 * SweepSubdomain time / SweepSolver time]
  Number of unknowns: 402653184

END

* Info: Process finished (host idp09.benchmarkcenter.megware.com, process 292266)
* Info: Process finished (host idp09.benchmarkcenter.megware.com, process 292261)

Info: 1/2 lprof instances finished


Your experiment path is /home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_3

To display your profiling results:
#################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                            COMMAND                                                                             #
#################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_3      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_3  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_3  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_3  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_3      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_3  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_3  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_3  #
#################################################################################################################################################################################################


* Info: Selecting the 'perf-high-ppn' engine for node idp09.benchmarkcenter.megware.com

* Info: Process launched (host idp09.benchmarkcenter.megware.com, process 292370)
* Warning: Found no event able to derive walltime: prepending ref-cycles
* Info: Process launched (host idp09.benchmarkcenter.megware.com, process 292375)
   _  __       _         _
  | |/ /      (_)       | |
  | ' /  _ __  _  _ __  | | __ ___
  |  <  | '__|| || '_ \ | |/ // _ \ 
  | . \ | |   | || |_) ||   <|  __/
  |_|\_\|_|   |_|| .__/ |_|\_\\___|
                 | |
                 |_|        Version 1.2.4

LLNL-CODE-775068

Copyright (c) 2014-2019, Lawrence Livermore National Security, LLC

Kripke is released under the BSD 3-Clause License, please see the
LICENSE file for the full license

This work was produced under the auspices of the U.S. Department of
Energy by Lawrence Livermore National Laboratory under Contract
DE-AC52-07NA27344.

Author: Adam J. Kunen 

Compilation Options:
  Architecture:           OpenMP
  Compiler:               /cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicpc
  Compiler Flags:         "-O3 -march=native -O2 -march=sapphirerapids -mprefer-vector-width=512 -funroll-loops -ffast-math -g -fno-omit-frame-pointer -fcf-protection=none -no-pie -cxx=g++     -Wall -Wextra  "
  Linker Flags:           " "
  CHAI Enabled:           No
  CUDA Enabled:           No
  MPI Enabled:            Yes
  OpenMP Enabled:         Yes
  Caliper Enabled:        No

OpenMP Thread->Core mapping for 16 threads on rank 0
    0->  0    1->  3    2->  6    3->  9    4-> 12    5-> 15    6-> 18    7-> 21
    8-> 24    9-> 27   10-> 30   11-> 33   12-> 36   13-> 39   14-> 42   15-> 45

Input Parameters
================

  Problem Size:
    Zones:                 16 x 16 x 16  (4096 total)
    Groups:                1024
    Legendre Order:        4
    Quadrature Set:        Dummy S2 with 96 points

  Physical Properties:
    Total X-Sec:           sigt=[0.100000, 0.000100, 0.100000]
    Scattering X-Sec:      sigs=[0.050000, 0.000050, 0.050000]

  Solver Options:
    Number iterations:     10

  MPI Decomposition Options:
    Total MPI tasks:       2
    Spatial decomp:        2 x 1 x 1 MPI tasks
    Block solve method:    Sweep

  Per-Task Options:
    DirSets/Directions:    8 sets, 12 directions/set
    GroupSet/Groups:       2 sets, 512 groups/set
    Zone Sets:             1 x 1 x 1
    Architecture:          OpenMP
    Data Layout:           DGZ

Generating Problem
==================

  Decomposition Space:   Procs:      Subdomains (local/global):
  ---------------------  ----------  --------------------------
  (P) Energy:            1           2 / 2
  (Q) Direction:         1           8 / 8
  (R) Space:             2           1 / 2
  (Rx,Ry,Rz) R in XYZ:   2x1x1       1x1x1 / 2x1x1
  (PQR) TOTAL:           2           16 / 32

  Material Volumes=[8.789062e+03, 1.177734e+05, 2.753438e+06]

  Memory breakdown of Field variables:
  Field Variable            Num Elements    Megabytes
  --------------            ------------    ---------
  data/sigs                     15728640      120.000
  dx                                  16        0.000
  dy                                  16        0.000
  dz                                  16        0.000
  ell                               2400        0.018
  ell_plus                          2400        0.018
  i_plane                       25165824      192.000
  j_plane                       25165824      192.000
  k_plane                       25165824      192.000
  mixelem_to_fraction               4352        0.033
  phi                          104857600      800.000
  phi_out                      104857600      800.000
  psi                          402653184     3072.000
  quadrature/w                        96        0.001
  quadrature/xcos                     96        0.001
  quadrature/ycos                     96        0.001
  quadrature/zcos                     96        0.001
  rhs                          402653184     3072.000
  sigt_zonal                     4194304       32.000
  volume                            4096        0.031
  --------                  ------------    ---------
  TOTAL                       1110455664     8472.104

  Generation Complete!

Steady State Solve
==================

  iter 0: particle count=1.197998e+09, change=1.000000e+00
  iter 1: particle count=1.801368e+09, change=3.349511e-01
  iter 2: particle count=2.102278e+09, change=1.431351e-01
  iter 3: particle count=2.251810e+09, change=6.640521e-02
  iter 4: particle count=2.325888e+09, change=3.184924e-02
  iter 5: particle count=2.362467e+09, change=1.548355e-02
  iter 6: particle count=2.380471e+09, change=7.563193e-03
  iter 7: particle count=2.389305e+09, change=3.697158e-03
  iter 8: particle count=2.393627e+09, change=1.805479e-03
  iter 9: particle count=2.395735e+09, change=8.801810e-04
  Solver terminated

Timers
======

  Timer                    Count       Seconds
  ----------------  ------------  ------------
  Generate                     1       0.01910
  LPlusTimes                  10       5.11317
  LTimes                      10       4.60814
  Population                  10       0.47810
  Scattering                  10      76.94158
  Solve                        1      90.23037
  Source                      10       0.00229
  SweepSolver                 10       2.03002
  SweepSubdomain             160       1.19074

TIMER_NAMES:Generate,LPlusTimes,LTimes,Population,Scattering,Solve,Source,SweepSolver,SweepSubdomain
TIMER_DATA:0.019099,5.113167,4.608137,0.478103,76.941581,90.230367,0.002286,2.030022,1.190741

Figures of Merit
================

  Throughput:         4.462502e+07 [unknowns/(second/iteration)]
  Grind time :        2.240895e-08 [(seconds/iteration)/unknowns]
  Sweep efficiency :  58.65654 [100.0 * SweepSubdomain time / SweepSolver time]
  Number of unknowns: 402653184

END

* Info: Process finished (host idp09.benchmarkcenter.megware.com, process 292370)
* Info: Process finished (host idp09.benchmarkcenter.megware.com, process 292375)

Your experiment path is /home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_4

To display your profiling results:
#################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                            COMMAND                                                                             #
#################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_4      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_4  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_4  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_4  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_4      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_4  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_4  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_4  #
#################################################################################################################################################################################################


* Info: Selecting the 'perf-high-ppn' engine for node idp09.benchmarkcenter.megware.com

* Info: Process launched (host idp09.benchmarkcenter.megware.com, process 292507)
* Warning: Found no event able to derive walltime: prepending ref-cycles
* Info: Process launched (host idp09.benchmarkcenter.megware.com, process 292512)
   _  __       _         _
  | |/ /      (_)       | |
  | ' /  _ __  _  _ __  | | __ ___
  |  <  | '__|| || '_ \ | |/ // _ \ 
  | . \ | |   | || |_) ||   <|  __/
  |_|\_\|_|   |_|| .__/ |_|\_\\___|
                 | |
                 |_|        Version 1.2.4

LLNL-CODE-775068

Copyright (c) 2014-2019, Lawrence Livermore National Security, LLC

Kripke is released under the BSD 3-Clause License, please see the
LICENSE file for the full license

This work was produced under the auspices of the U.S. Department of
Energy by Lawrence Livermore National Laboratory under Contract
DE-AC52-07NA27344.

Author: Adam J. Kunen 

Compilation Options:
  Architecture:           OpenMP
  Compiler:               /cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicpc
  Compiler Flags:         "-O3 -march=native -O2 -march=sapphirerapids -mprefer-vector-width=512 -funroll-loops -ffast-math -g -fno-omit-frame-pointer -fcf-protection=none -no-pie -cxx=g++     -Wall -Wextra  "
  Linker Flags:           " "
  CHAI Enabled:           No
  CUDA Enabled:           No
  MPI Enabled:            Yes
  OpenMP Enabled:         Yes
  Caliper Enabled:        No

OpenMP Thread->Core mapping for 32 threads on rank 0
    0->  0    1-> 97    2->  3    3->100    4->  6    5->103    6->  9    7->106
    8-> 12    9->109   10-> 15   11->112   12-> 18   13->115   14-> 21   15->118
   16-> 24   17->121   18-> 27   19->124   20-> 30   21->127   22-> 33   23->130
   24-> 36   25->133   26-> 39   27->136   28-> 42   29->139   30-> 45   31->142

Input Parameters
================

  Problem Size:
    Zones:                 16 x 16 x 16  (4096 total)
    Groups:                1024
    Legendre Order:        4
    Quadrature Set:        Dummy S2 with 96 points

  Physical Properties:
    Total X-Sec:           sigt=[0.100000, 0.000100, 0.100000]
    Scattering X-Sec:      sigs=[0.050000, 0.000050, 0.050000]

  Solver Options:
    Number iterations:     10

  MPI Decomposition Options:
    Total MPI tasks:       2
    Spatial decomp:        2 x 1 x 1 MPI tasks
    Block solve method:    Sweep

  Per-Task Options:
    DirSets/Directions:    8 sets, 12 directions/set
    GroupSet/Groups:       2 sets, 512 groups/set
    Zone Sets:             1 x 1 x 1
    Architecture:          OpenMP
    Data Layout:           DGZ

Generating Problem
==================

  Decomposition Space:   Procs:      Subdomains (local/global):
  ---------------------  ----------  --------------------------
  (P) Energy:            1           2 / 2
  (Q) Direction:         1           8 / 8
  (R) Space:             2           1 / 2
  (Rx,Ry,Rz) R in XYZ:   2x1x1       1x1x1 / 2x1x1
  (PQR) TOTAL:           2           16 / 32

  Material Volumes=[8.789062e+03, 1.177734e+05, 2.753438e+06]

  Memory breakdown of Field variables:
  Field Variable            Num Elements    Megabytes
  --------------            ------------    ---------
  data/sigs                     15728640      120.000
  dx                                  16        0.000
  dy                                  16        0.000
  dz                                  16        0.000
  ell                               2400        0.018
  ell_plus                          2400        0.018
  i_plane                       25165824      192.000
  j_plane                       25165824      192.000
  k_plane                       25165824      192.000
  mixelem_to_fraction               4352        0.033
  phi                          104857600      800.000
  phi_out                      104857600      800.000
  psi                          402653184     3072.000
  quadrature/w                        96        0.001
  quadrature/xcos                     96        0.001
  quadrature/ycos                     96        0.001
  quadrature/zcos                     96        0.001
  rhs                          402653184     3072.000
  sigt_zonal                     4194304       32.000
  volume                            4096        0.031
  --------                  ------------    ---------
  TOTAL                       1110455664     8472.104

  Generation Complete!

Steady State Solve
==================

  iter 0: particle count=1.197998e+09, change=1.000000e+00
  iter 1: particle count=1.801368e+09, change=3.349511e-01
  iter 2: particle count=2.102278e+09, change=1.431351e-01
  iter 3: particle count=2.251810e+09, change=6.640521e-02
  iter 4: particle count=2.325888e+09, change=3.184924e-02
  iter 5: particle count=2.362467e+09, change=1.548355e-02
  iter 6: particle count=2.380471e+09, change=7.563193e-03
  iter 7: particle count=2.389305e+09, change=3.697158e-03
  iter 8: particle count=2.393627e+09, change=1.805479e-03
  iter 9: particle count=2.395735e+09, change=8.801810e-04
  Solver terminated

Timers
======

  Timer                    Count       Seconds
  ----------------  ------------  ------------
  Generate                     1       0.01895
  LPlusTimes                  10       2.95652
  LTimes                      10       3.16498
  Population                  10       0.41053
  Scattering                  10      53.67290
  Solve                        1      65.76153
  Source                      10       0.00214
  SweepSolver                 10       4.47819
  SweepSubdomain             160       0.72801

TIMER_NAMES:Generate,LPlusTimes,LTimes,Population,Scattering,Solve,Source,SweepSolver,SweepSubdomain
TIMER_DATA:0.018945,2.956520,3.164980,0.410529,53.672897,65.761529,0.002142,4.478190,0.728007

Figures of Merit
================

  Throughput:         6.122929e+07 [unknowns/(second/iteration)]
  Grind time :        1.633205e-08 [(seconds/iteration)/unknowns]
  Sweep efficiency :  16.25673 [100.0 * SweepSubdomain time / SweepSolver time]
  Number of unknowns: 402653184

END

* Info: Process finished (host idp09.benchmarkcenter.megware.com, process 292507)
* Info: Process finished (host idp09.benchmarkcenter.megware.com, process 292512)

Info: 1/2 lprof instances finished


Your experiment path is /home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_5

To display your profiling results:
#################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                            COMMAND                                                                             #
#################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_5      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_5  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_5  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_5  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_5      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_5  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_5  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_5  #
#################################################################################################################################################################################################


* Info: Selecting the 'perf-high-ppn' engine for node idp09.benchmarkcenter.megware.com

* Info: Process launched (host idp09.benchmarkcenter.megware.com, process 292706)
* Warning: Found no event able to derive walltime: prepending ref-cycles
* Info: Process launched (host idp09.benchmarkcenter.megware.com, process 292711)
   _  __       _         _
  | |/ /      (_)       | |
  | ' /  _ __  _  _ __  | | __ ___
  |  <  | '__|| || '_ \ | |/ // _ \ 
  | . \ | |   | || |_) ||   <|  __/
  |_|\_\|_|   |_|| .__/ |_|\_\\___|
                 | |
                 |_|        Version 1.2.4

LLNL-CODE-775068

Copyright (c) 2014-2019, Lawrence Livermore National Security, LLC

Kripke is released under the BSD 3-Clause License, please see the
LICENSE file for the full license

This work was produced under the auspices of the U.S. Department of
Energy by Lawrence Livermore National Laboratory under Contract
DE-AC52-07NA27344.

Author: Adam J. Kunen 

Compilation Options:
  Architecture:           OpenMP
  Compiler:               /cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicpc
  Compiler Flags:         "-O3 -march=native -O2 -march=sapphirerapids -mprefer-vector-width=512 -funroll-loops -ffast-math -g -fno-omit-frame-pointer -fcf-protection=none -no-pie -cxx=g++     -Wall -Wextra  "
  Linker Flags:           " "
  CHAI Enabled:           No
  CUDA Enabled:           No
  MPI Enabled:            Yes
  OpenMP Enabled:         Yes
  Caliper Enabled:        No

OpenMP Thread->Core mapping for 48 threads on rank 0
    0->  0    1->  1    2->  2    3->  3    4->  4    5->  5    6->  6    7->  7
    8->  8    9->  9   10-> 10   11-> 11   12-> 12   13-> 13   14-> 14   15-> 15
   16-> 16   17-> 17   18-> 18   19-> 19   20-> 20   21-> 21   22-> 22   23-> 23
   24-> 24   25-> 25   26-> 26   27-> 27   28-> 28   29-> 29   30-> 30   31-> 31
   32-> 32   33-> 33   34-> 34   35-> 35   36-> 36   37-> 37   38-> 38   39-> 39
   40-> 40   41-> 41   42-> 42   43-> 43   44-> 44   45-> 45   46-> 46   47-> 47

Input Parameters
================

  Problem Size:
    Zones:                 16 x 16 x 16  (4096 total)
    Groups:                1024
    Legendre Order:        4
    Quadrature Set:        Dummy S2 with 96 points

  Physical Properties:
    Total X-Sec:           sigt=[0.100000, 0.000100, 0.100000]
    Scattering X-Sec:      sigs=[0.050000, 0.000050, 0.050000]

  Solver Options:
    Number iterations:     10

  MPI Decomposition Options:
    Total MPI tasks:       2
    Spatial decomp:        2 x 1 x 1 MPI tasks
    Block solve method:    Sweep

  Per-Task Options:
    DirSets/Directions:    8 sets, 12 directions/set
    GroupSet/Groups:       2 sets, 512 groups/set
    Zone Sets:             1 x 1 x 1
    Architecture:          OpenMP
    Data Layout:           DGZ

Generating Problem
==================

  Decomposition Space:   Procs:      Subdomains (local/global):
  ---------------------  ----------  --------------------------
  (P) Energy:            1           2 / 2
  (Q) Direction:         1           8 / 8
  (R) Space:             2           1 / 2
  (Rx,Ry,Rz) R in XYZ:   2x1x1       1x1x1 / 2x1x1
  (PQR) TOTAL:           2           16 / 32

  Material Volumes=[8.789062e+03, 1.177734e+05, 2.753438e+06]

  Memory breakdown of Field variables:
  Field Variable            Num Elements    Megabytes
  --------------            ------------    ---------
  data/sigs                     15728640      120.000
  dx                                  16        0.000
  dy                                  16        0.000
  dz                                  16        0.000
  ell                               2400        0.018
  ell_plus                          2400        0.018
  i_plane                       25165824      192.000
  j_plane                       25165824      192.000
  k_plane                       25165824      192.000
  mixelem_to_fraction               4352        0.033
  phi                          104857600      800.000
  phi_out                      104857600      800.000
  psi                          402653184     3072.000
  quadrature/w                        96        0.001
  quadrature/xcos                     96        0.001
  quadrature/ycos                     96        0.001
  quadrature/zcos                     96        0.001
  rhs                          402653184     3072.000
  sigt_zonal                     4194304       32.000
  volume                            4096        0.031
  --------                  ------------    ---------
  TOTAL                       1110455664     8472.104

  Generation Complete!

Steady State Solve
==================

  iter 0: particle count=1.197998e+09, change=1.000000e+00
  iter 1: particle count=1.801368e+09, change=3.349511e-01
  iter 2: particle count=2.102278e+09, change=1.431351e-01
  iter 3: particle count=2.251810e+09, change=6.640521e-02
  iter 4: particle count=2.325888e+09, change=3.184924e-02
  iter 5: particle count=2.362467e+09, change=1.548355e-02
  iter 6: particle count=2.380471e+09, change=7.563193e-03
  iter 7: particle count=2.389305e+09, change=3.697158e-03
  iter 8: particle count=2.393627e+09, change=1.805479e-03
  iter 9: particle count=2.395735e+09, change=8.801810e-04
  Solver terminated

Timers
======

  Timer                    Count       Seconds
  ----------------  ------------  ------------
  Generate                     1       0.02000
  LPlusTimes                  10       3.22138
  LTimes                      10       3.43990
  Population                  10       0.35467
  Scattering                  10      51.05737
  Solve                        1      65.49357
  Source                      10       0.00215
  SweepSolver                 10       6.31110
  SweepSubdomain             160       1.32933

TIMER_NAMES:Generate,LPlusTimes,LTimes,Population,Scattering,Solve,Source,SweepSolver,SweepSubdomain
TIMER_DATA:0.019997,3.221384,3.439901,0.354669,51.057366,65.493571,0.002153,6.311101,1.329326

Figures of Merit
================

  Throughput:         6.147980e+07 [unknowns/(second/iteration)]
  Grind time :        1.626550e-08 [(seconds/iteration)/unknowns]
  Sweep efficiency :  21.06329 [100.0 * SweepSubdomain time / SweepSolver time]
  Number of unknowns: 402653184

END

* Info: Process finished (host idp09.benchmarkcenter.megware.com, process 292706)
* Info: Process finished (host idp09.benchmarkcenter.megware.com, process 292711)

Info: 1/2 lprof instances finished


Your experiment path is /home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_6

To display your profiling results:
#################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                            COMMAND                                                                             #
#################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_6      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_6  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_6  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_6  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_6      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_6  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_6  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_6  #
#################################################################################################################################################################################################
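
The TIMER_NAMES and TIMER_DATA lines printed by each run above are comma-separated and line up one-to-one, so the figures of merit can be rechecked offline. The following is a minimal Python sketch, not part of Kripke or MAQAO. The helper names `parse_timers` and `figures_of_merit` are hypothetical. The sweep-efficiency formula is the one printed in the log. The throughput formula (unknowns * iterations / Solve seconds) is an assumption inferred from the printed units that happens to agree with the logged values.

```python
def parse_timers(log_text):
    """Return one {timer_name: seconds} dict per TIMER_NAMES/TIMER_DATA pair."""
    runs = []
    names = None
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith("TIMER_NAMES:"):
            names = line.split(":", 1)[1].split(",")
        elif line.startswith("TIMER_DATA:") and names is not None:
            values = [float(v) for v in line.split(":", 1)[1].split(",")]
            runs.append(dict(zip(names, values)))
            names = None  # wait for the next TIMER_NAMES line
    return runs


def figures_of_merit(timers, unknowns, iterations):
    """Recompute the figures of merit from one run's timers.

    Assumption: throughput = unknowns * iterations / Solve seconds
    (inferred from the units in the log, not taken from Kripke's source).
    """
    throughput = unknowns * iterations / timers["Solve"]
    return {
        "throughput": throughput,
        "grind_time": 1.0 / throughput,
        # Formula as printed in the log next to "Sweep efficiency".
        "sweep_efficiency": 100.0 * timers["SweepSubdomain"] / timers["SweepSolver"],
    }


# Timer lines copied from the 16-thread run in this log.
sample = """\
TIMER_NAMES:Generate,LPlusTimes,LTimes,Population,Scattering,Solve,Source,SweepSolver,SweepSubdomain
TIMER_DATA:0.019099,5.113167,4.608137,0.478103,76.941581,90.230367,0.002286,2.030022,1.190741
"""

runs = parse_timers(sample)
fom = figures_of_merit(runs[0], unknowns=402653184, iterations=10)
for key, value in fom.items():
    print(f"{key}: {value:.6e}")
```

Feeding the whole log to `parse_timers` should return one dictionary per run, which makes it straightforward to compare, for example, the Scattering time across thread counts.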
