options

Executable Output


* Info: Selecting the 'perf-high-ppn' engine for node ins01.benchmarkcenter.megware.com

* Info: Process launched (host ins01.benchmarkcenter.megware.com, process 10524)
* Info: "ref-cycles" not supported on ins01.benchmarkcenter.megware.com: fallback to "cpu-clock"
* Warning: Found no event able to derive walltime: prepending cpu-clock
* Info: Process launched (host ins01.benchmarkcenter.megware.com, process 10529)
miniqmc not built from git repository

number of ranks : 2, number of accelerators : 0
Number of orbitals/splines = 3072
Tile size = 3072
Number of tiles = 1
Number of electrons = 6144
Rmax = 1.7
AcceptanceRatio = 0.5
Iterations = 5
MPI processes = 2
OpenMP threads = 1
Number of walkers per rank = 1

SPO coefficients size = 1572864000 bytes (1500 MB)
delayed update rank = 32
Using the reference implementation for Jastrow, 
determinant update, and distance table + einspline of the 
reference implementation 
================================== 

Use --enable-timers= command line option to increase or decrease level of timing information
Stack timer profile
Timer                                       Inclusive_time  Exclusive_time  Calls       Time_per_call
Setup                                          0.1506     0.1506              1       0.150640481
  ParticleSet:::update                         0.0000     0.0000              1       0.000004870
Total                                         34.5353     0.0003              1      34.535317218
  Diffusion                                   23.8612     0.0165              5       4.772236134
    Complete Updates                           0.2103     0.0000              5       0.042050944
      DeterminantRef::update                   0.2102     0.2102             10       0.021022682
    Current Gradient                           0.5728     0.0098          30720       0.000018647
      DeterminantRef::ratio                    0.5580     0.5580          30720       0.000018165
      OneBodyJastrowRef                        0.0028     0.0028          30720       0.000000090
      TwoBodyJastrowRef                        0.0023     0.0023          30720       0.000000073
    Kinetic Energy                             0.1192     0.1189              5       0.023832527
      OneBodyJastrowRef                        0.0002     0.0002              5       0.000036379
      TwoBodyJastrowRef                        0.0001     0.0001              5       0.000022498
    New Gradient                               4.1935     0.0138          30720       0.000136507
      DeterminantRef::ratio                    0.0487     0.0487          30720       0.000001584
      DeterminantRef::spovgl                   3.7604     0.2605          30720       0.000122407
        Single-Particle Orbitals               3.4999     3.4999          30720       0.000113929
      OneBodyJastrowRef                        0.0263     0.0263          30720       0.000000856
      TwoBodyJastrowRef                        0.3444     0.3444          30720       0.000011211
    ParticleSet:::acceptMove                   0.4212     0.0045          15371       0.000027399
      DTAAOMPTarget::update_e_e                0.4081     0.4081          15371       0.000026550
      DTABOMPTarget::update_ion_e              0.0085     0.0085          15371       0.000000554
    ParticleSet:::computeNewPosDT              0.4734     0.0071          30720       0.000015411
      DTAAOMPTarget::move_e_e                  0.4191     0.4191          30720       0.000013641
      DTABOMPTarget::move_ion_e                0.0473     0.0473          30720       0.000001540
    ParticleSet:::donePbyP                     0.0000     0.0000              5       0.000002246
    Update                                    17.8543     0.0078          15371       0.001161556
      DeterminantRef::update                  17.5249    17.5249          15371       0.001140129
      OneBodyJastrowRef                        0.0016     0.0016          15371       0.000000105
      TwoBodyJastrowRef                        0.3199     0.3199          15371       0.000020814
  Initialization                               4.4457     0.1665              1       4.445664048
    DeterminantRef::inverse                    3.4064     3.4064              2       1.703216794
    DeterminantRef::spovgl                     0.7478     0.0537              2       0.373914075
      Single-Particle Orbitals                 0.6941     0.6941           6144       0.000112973
    OneBodyJastrowRef                          0.0055     0.0055              1       0.005498877
    ParticleSet:::update                       0.0599     0.0091              2       0.029962859
      DTAAOMPTarget::evaluate_e_e              0.0463     0.0463              1       0.046334390
      DTABOMPTarget::evaluate_ion_e            0.0045     0.0001              1       0.004516940
        DTABOMPTarget::offload_ion_e           0.0045     0.0045              1       0.004455161
    TwoBodyJastrowRef                          0.0595     0.0595              1       0.059496677
  Pseudopotential                              6.2282     0.0317              5       1.245634114
    DeterminantRef::spoval                     4.6897     0.1534          10215       0.000459101
      Single-Particle Orbitals                 4.5363     4.5363         122580       0.000037007
    OneBodyJastrowRef                          0.0123     0.0123          10215       0.000001202
    ParticleSet:::update                       0.8996     0.0058          10215       0.000088069
      DTABOMPTarget::evaluate_e_virtual        0.8171     0.0017          10215       0.000079986
        DTABOMPTarget::offload_e_virtual       0.8153     0.8153          10215       0.000079815
      DTABOMPTarget::evaluate_ion_virtual      0.0768     0.0018          10215       0.000007515
        DTABOMPTarget::offload_ion_virtual     0.0750     0.0750          10215       0.000007341
    TwoBodyJastrowRef                          0.5948     0.5948          10215       0.000058229
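Note on reading the two time columns: in the rows above, a timer's exclusive time appears to be its inclusive time minus the inclusive times of its direct children, with nesting shown by two-space indentation. The sketch below checks that on the Initialization subtree copied from this run. It assumes this column layout; `parse_profile` and `exclusive_from_inclusive` are illustrative names, not part of miniqmc or MAQAO.

```python
import re

# Initialization subtree copied verbatim from the first run's timer profile.
PROFILE = """\
  Initialization                               4.4457     0.1665              1       4.445664048
    DeterminantRef::inverse                    3.4064     3.4064              2       1.703216794
    DeterminantRef::spovgl                     0.7478     0.0537              2       0.373914075
      Single-Particle Orbitals                 0.6941     0.6941           6144       0.000112973
    OneBodyJastrowRef                          0.0055     0.0055              1       0.005498877
    ParticleSet:::update                       0.0599     0.0091              2       0.029962859
      DTAAOMPTarget::evaluate_e_e              0.0463     0.0463              1       0.046334390
      DTABOMPTarget::evaluate_ion_e            0.0045     0.0001              1       0.004516940
        DTABOMPTarget::offload_ion_e           0.0045     0.0045              1       0.004455161
    TwoBodyJastrowRef                          0.0595     0.0595              1       0.059496677
"""

# indent, timer name, inclusive, exclusive, calls, time per call
ROW = re.compile(r"^( *)(\S.*?) +(\d+\.\d+) +(\d+\.\d+) +(\d+) +(\d+\.\d+) *$")


def parse_profile(text):
    """Return (depth, name, inclusive, exclusive) for every timer row."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            indent, name, incl, excl, _calls, _per_call = m.groups()
            rows.append((len(indent) // 2, name, float(incl), float(excl)))
    return rows


def exclusive_from_inclusive(rows):
    """Return (name, reported exclusive, inclusive minus direct children)."""
    child_sum = [0.0] * len(rows)
    stack = []  # indices of the currently open ancestor rows
    for i, (depth, _name, incl, _excl) in enumerate(rows):
        while stack and rows[stack[-1]][0] >= depth:
            stack.pop()
        if stack:
            child_sum[stack[-1]] += incl
        stack.append(i)
    return [(name, excl, incl - child_sum[i])
            for i, (_depth, name, incl, excl) in enumerate(rows)]


rows = parse_profile(PROFILE)
for name, reported, derived in exclusive_from_inclusive(rows):
    print(f"{name:32s} reported={reported:.4f} derived={derived:.4f}")
```

Reported and derived values can differ in the last printed digit because the log rounds each time to four decimals.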

========== Throughput ============ 

Total throughput ( N_walkers * N_elec^3 / Total time ) = 1.34314e+10
Diffusion throughput ( N_walkers * N_elec^3 / Diffusion time ) = 1.94398e+10
Pseudopotential throughput ( N_walkers * N_elec^2 / Pseudopotential time ) = 1.21219e+07
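Note on the throughput figures: the three values above can be reproduced from the printed formulas and the timer table. The sketch below assumes that N_walkers counts walkers over all MPI ranks (2 ranks x 1 walker per rank), which is the reading that matches the printed numbers; the helper name `throughput` is illustrative.

```python
def throughput(n_walkers, n_elec, seconds, power):
    """N_walkers * N_elec^power / time, as printed in the Throughput section."""
    return n_walkers * n_elec ** power / seconds


ranks = 2               # "MPI processes = 2"
walkers_per_rank = 1    # "Number of walkers per rank = 1"
n_elec = 6144           # "Number of electrons = 6144"

# Assumption: the formula uses the walker count summed over all ranks.
n_walkers = ranks * walkers_per_rank

total = throughput(n_walkers, n_elec, 34.535317218, 3)   # Total time
diffusion = throughput(n_walkers, n_elec, 23.8612, 3)    # Diffusion inclusive time
pseudo = throughput(n_walkers, n_elec, 6.2282, 2)        # Pseudopotential inclusive time

print(f"total           = {total:.5e}")
print(f"diffusion       = {diffusion:.5e}")
print(f"pseudopotential = {pseudo:.5e}")
```

The results agree with the logged values to within the rounding of the four-decimal times taken from the timer table.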


* Info: Process finished (host ins01.benchmarkcenter.megware.com, process 10524)
* Warning: (host ins01.benchmarkcenter.megware.com, process 10524) Observed more threads (2) than expected (1): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=2.

* Info: Process finished (host ins01.benchmarkcenter.megware.com, process 10529)
* Warning: (host ins01.benchmarkcenter.megware.com, process 10529) Observed more threads (2) than expected (1): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=2.


Your experiment path is /beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_0

To display your profiling results:
##########################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                 COMMAND                                                                                 #
##########################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_0      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_0  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_0  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_0  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_0      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_0  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_0  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_0  #
##########################################################################################################################################################################################################


* Info: Selecting the 'perf-high-ppn' engine for node ins01.benchmarkcenter.megware.com

* Info: Process launched (host ins01.benchmarkcenter.megware.com, process 10590)
* Info: "ref-cycles" not supported on ins01.benchmarkcenter.megware.com: fallback to "cpu-clock"
* Warning: Found no event able to derive walltime: prepending cpu-clock
* Info: Process launched (host ins01.benchmarkcenter.megware.com, process 10595)
miniqmc not built from git repository

number of ranks : 2, number of accelerators : 0
Number of orbitals/splines = 3072
Tile size = 3072
Number of tiles = 1
Number of electrons = 6144
Rmax = 1.7
AcceptanceRatio = 0.5
Iterations = 5
MPI processes = 2
OpenMP threads = 2
Number of walkers per rank = 2

SPO coefficients size = 1572864000 bytes (1500 MB)
delayed update rank = 32
Using the reference implementation for Jastrow, 
determinant update, and distance table + einspline of the 
reference implementation 
================================== 

Use --enable-timers= command line option to increase or decrease level of timing information
Stack timer profile
Timer                                       Inclusive_time  Exclusive_time  Calls       Time_per_call
Setup                                          0.0821     0.0821              1       0.082144740
  ParticleSet:::update                         0.0000     0.0000              1       0.000005570
Total                                         35.0532     0.0232              1      35.053229609
  Diffusion                                   23.7142     0.0185              5       4.742836491
    Complete Updates                           0.2149     0.0000              5       0.042978076
      DeterminantRef::update                   0.2149     0.2149             10       0.021486858
    Current Gradient                           0.5859     0.0113          30720       0.000019074
      DeterminantRef::ratio                    0.5698     0.5698          30720       0.000018548
      OneBodyJastrowRef                        0.0027     0.0027          30720       0.000000087
      TwoBodyJastrowRef                        0.0021     0.0021          30720       0.000000070
    Kinetic Energy                             0.1224     0.1222              5       0.024482301
      OneBodyJastrowRef                        0.0001     0.0001              5       0.000022316
      TwoBodyJastrowRef                        0.0001     0.0001              5       0.000021870
    New Gradient                               3.8729     0.0156          30720       0.000126071
      DeterminantRef::ratio                    0.0484     0.0484          30720       0.000001574
      DeterminantRef::spovgl                   3.4402     0.2651          30720       0.000111986
        Single-Particle Orbitals               3.1751     3.1751          30720       0.000103357
      OneBodyJastrowRef                        0.0270     0.0270          30720       0.000000878
      TwoBodyJastrowRef                        0.3417     0.3417          30720       0.000011123
    ParticleSet:::acceptMove                   0.4203     0.0054          15371       0.000027345
      DTAAOMPTarget::update_e_e                0.4062     0.4062          15371       0.000026426
      DTABOMPTarget::update_ion_e              0.0087     0.0087          15371       0.000000567
    ParticleSet:::computeNewPosDT              0.4914     0.0072          30720       0.000015997
      DTAAOMPTarget::move_e_e                  0.4298     0.4298          30720       0.000013992
      DTABOMPTarget::move_ion_e                0.0544     0.0544          30720       0.000001770
    ParticleSet:::donePbyP                     0.0000     0.0000              5       0.000001840
    Update                                    17.9878     0.0080          15371       0.001170245
      DeterminantRef::update                  17.6581    17.6581          15371       0.001148791
      OneBodyJastrowRef                        0.0015     0.0015          15371       0.000000097
      TwoBodyJastrowRef                        0.3203     0.3203          15371       0.000020836
  Initialization                               4.4179     0.1878              1       4.417923617
    DeterminantRef::inverse                    3.4000     3.4000              2       1.700014999
    DeterminantRef::spovgl                     0.7021     0.0541              2       0.351029833
      Single-Particle Orbitals                 0.6480     0.6480           6144       0.000105461
    OneBodyJastrowRef                          0.0062     0.0062              1       0.006239878
    ParticleSet:::update                       0.0615     0.0099              2       0.030757753
      DTAAOMPTarget::evaluate_e_e              0.0464     0.0464              1       0.046381683
      DTABOMPTarget::evaluate_ion_e            0.0052     0.0001              1       0.005217510
        DTABOMPTarget::offload_ion_e           0.0052     0.0052              1       0.005160551
    TwoBodyJastrowRef                          0.0603     0.0603              1       0.060253157
  Pseudopotential                              6.8980     0.0333              5       1.379590129
    DeterminantRef::spoval                     5.3548     0.1542          10215       0.000524212
      Single-Particle Orbitals                 5.2006     5.2006         122580       0.000042426
    OneBodyJastrowRef                          0.0127     0.0127          10215       0.000001245
    ParticleSet:::update                       0.9001     0.0060          10215       0.000088115
      DTABOMPTarget::evaluate_e_virtual        0.8172     0.0019          10215       0.000079996
        DTABOMPTarget::offload_e_virtual       0.8153     0.8153          10215       0.000079810
      DTABOMPTarget::evaluate_ion_virtual      0.0770     0.0018          10215       0.000007537
        DTABOMPTarget::offload_ion_virtual     0.0752     0.0752          10215       0.000007358
    TwoBodyJastrowRef                          0.5970     0.5970          10215       0.000058443

========== Throughput ============ 

Total throughput ( N_walkers * N_elec^3 / Total time ) = 2.64658e+10
Diffusion throughput ( N_walkers * N_elec^3 / Diffusion time ) = 3.91206e+10
Pseudopotential throughput ( N_walkers * N_elec^2 / Pseudopotential time ) = 2.18898e+07


* Info: Process finished (host ins01.benchmarkcenter.megware.com, process 10590)
* Warning: (host ins01.benchmarkcenter.megware.com, process 10590) Observed more threads (3) than expected (2): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=3.

* Info: Process finished (host ins01.benchmarkcenter.megware.com, process 10595)
* Warning: (host ins01.benchmarkcenter.megware.com, process 10595) Observed more threads (3) than expected (2): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=3.


Your experiment path is /beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_1

To display your profiling results:
##########################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                 COMMAND                                                                                 #
##########################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_1      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_1  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_1  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_1  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_1      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_1  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_1  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_1  #
##########################################################################################################################################################################################################


* Info: Selecting the 'perf-high-ppn' engine for node ins01.benchmarkcenter.megware.com

* Info: Process launched (host ins01.benchmarkcenter.megware.com, process 10670)
* Info: "ref-cycles" not supported on ins01.benchmarkcenter.megware.com: fallback to "cpu-clock"
* Warning: Found no event able to derive walltime: prepending cpu-clock
* Info: Process launched (host ins01.benchmarkcenter.megware.com, process 10675)
miniqmc not built from git repository

number of ranks : 2, number of accelerators : 0
Number of orbitals/splines = 3072
Tile size = 3072
Number of tiles = 1
Number of electrons = 6144
Rmax = 1.7
AcceptanceRatio = 0.5
Iterations = 5
MPI processes = 2
OpenMP threads = 4
Number of walkers per rank = 4

SPO coefficients size = 1572864000 bytes (1500 MB)
delayed update rank = 32
Using the reference implementation for Jastrow, 
determinant update, and distance table + einspline of the 
reference implementation 
================================== 

Use --enable-timers= command line option to increase or decrease level of timing information
Stack timer profile
Timer                                       Inclusive_time  Exclusive_time  Calls       Time_per_call
Setup                                          0.0467     0.0467              1       0.046659425
  ParticleSet:::update                         0.0000     0.0000              1       0.000004320
Total                                         37.9215     0.7275              1      37.921520932
  Diffusion                                   24.1738     0.0216              5       4.834754682
    Complete Updates                           0.2205     0.0000              5       0.044093560
      DeterminantRef::update                   0.2204     0.2204             10       0.022043915
    Current Gradient                           0.5903     0.0129          30720       0.000019215
      DeterminantRef::ratio                    0.5723     0.5723          30720       0.000018630
      OneBodyJastrowRef                        0.0029     0.0029          30720       0.000000095
      TwoBodyJastrowRef                        0.0022     0.0022          30720       0.000000070
    Kinetic Energy                             0.1328     0.1326              5       0.026564228
      OneBodyJastrowRef                        0.0001     0.0001              5       0.000026509
      TwoBodyJastrowRef                        0.0001     0.0001              5       0.000022528
    New Gradient                               4.2954     0.0172          30720       0.000139823
      DeterminantRef::ratio                    0.0489     0.0489          30720       0.000001591
      DeterminantRef::spovgl                   3.8560     0.2760          30720       0.000125522
        Single-Particle Orbitals               3.5801     3.5801          30720       0.000116539
      OneBodyJastrowRef                        0.0289     0.0289          30720       0.000000940
      TwoBodyJastrowRef                        0.3444     0.3444          30720       0.000011210
    ParticleSet:::acceptMove                   0.4254     0.0052          15371       0.000027674
      DTAAOMPTarget::update_e_e                0.4114     0.4114          15371       0.000026764
      DTABOMPTarget::update_ion_e              0.0088     0.0088          15371       0.000000573
    ParticleSet:::computeNewPosDT              0.4972     0.0072          30720       0.000016186
      DTAAOMPTarget::move_e_e                  0.4385     0.4385          30720       0.000014273
      DTABOMPTarget::move_ion_e                0.0516     0.0516          30720       0.000001680
    ParticleSet:::donePbyP                     0.0000     0.0000              5       0.000002402
    Update                                    17.9907     0.0086          15371       0.001170429
      DeterminantRef::update                  17.6551    17.6551          15371       0.001148595
      OneBodyJastrowRef                        0.0017     0.0017          15371       0.000000108
      TwoBodyJastrowRef                        0.3253     0.3253          15371       0.000021163
  Initialization                               4.5935     0.3649              1       4.593472626
    DeterminantRef::inverse                    3.3310     3.3310              2       1.665502763
    DeterminantRef::spovgl                     0.7673     0.0532              2       0.383671976
      Single-Particle Orbitals                 0.7141     0.7141           6144       0.000116235
    OneBodyJastrowRef                          0.0058     0.0058              1       0.005758439
    ParticleSet:::update                       0.0654     0.0119              2       0.032705549
      DTAAOMPTarget::evaluate_e_e              0.0465     0.0465              1       0.046548976
      DTABOMPTarget::evaluate_ion_e            0.0069     0.0002              1       0.006921424
        DTABOMPTarget::offload_ion_e           0.0068     0.0068              1       0.006759738
    TwoBodyJastrowRef                          0.0591     0.0591              1       0.059090752
  Pseudopotential                              8.4267     0.0373              5       1.685344981
    DeterminantRef::spoval                     6.8417     0.1562          10215       0.000669769
      Single-Particle Orbitals                 6.6855     6.6855         122580       0.000054540
    OneBodyJastrowRef                          0.0130     0.0130          10215       0.000001276
    ParticleSet:::update                       0.9377     0.0059          10215       0.000091797
      DTABOMPTarget::evaluate_e_virtual        0.8481     0.0019          10215       0.000083021
        DTABOMPTarget::offload_e_virtual       0.8461     0.8461          10215       0.000082832
      DTABOMPTarget::evaluate_ion_virtual      0.0838     0.0018          10215       0.000008204
        DTABOMPTarget::offload_ion_virtual     0.0820     0.0820          10215       0.000008026
    TwoBodyJastrowRef                          0.5970     0.5970          10215       0.000058445

========== Throughput ============ 

Total throughput ( N_walkers * N_elec^3 / Total time ) = 4.8928e+10
Diffusion throughput ( N_walkers * N_elec^3 / Diffusion time ) = 7.67537e+10
Pseudopotential throughput ( N_walkers * N_elec^2 / Pseudopotential time ) = 3.58372e+07
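Note on scaling across the first three runs: walkers per rank grow with the OpenMP thread count (1, 2, 4), so the work per thread stays constant and the total throughput would ideally grow in proportion to the thread count. The sketch below computes that ratio from the "Total throughput" values logged so far. Calling it a weak-scaling efficiency is an interpretation, not something the tools report, and `scaling_efficiency` is an illustrative name.

```python
# (OpenMP threads per rank, total throughput) copied from the three runs above.
runs = [
    (1, 1.34314e10),
    (2, 2.64658e10),
    (4, 4.8928e10),
]


def scaling_efficiency(runs):
    """Throughput gain relative to the first run, divided by the thread-count gain."""
    base_threads, base_throughput = runs[0]
    return [
        (threads, (tp / base_throughput) / (threads / base_threads))
        for threads, tp in runs
    ]


for threads, eff in scaling_efficiency(runs):
    print(f"{threads} thread(s) per rank: efficiency = {eff:.3f}")
```

An efficiency below 1 means the run took longer than the single-thread run for the same work per thread, which is consistent with the Total time rising from about 34.5 s to about 37.9 s over these runs.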


* Info: Process finished (host ins01.benchmarkcenter.megware.com, process 10675)
* Warning: (host ins01.benchmarkcenter.megware.com, process 10675) Observed more threads (5) than expected (4): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=5.

* Info: Process finished (host ins01.benchmarkcenter.megware.com, process 10670)
* Warning: (host ins01.benchmarkcenter.megware.com, process 10670) Observed more threads (5) than expected (4): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=5.


Your experiment path is /beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_2

To display your profiling results:
##########################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                 COMMAND                                                                                 #
##########################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_2      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_2  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_2  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_2  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_2      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_2  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_2  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_2  #
##########################################################################################################################################################################################################


* Info: Selecting the 'perf-high-ppn' engine for node ins01.benchmarkcenter.megware.com

* Info: Process launched (host ins01.benchmarkcenter.megware.com, process 10736)
* Info: "ref-cycles" not supported on ins01.benchmarkcenter.megware.com: fallback to "cpu-clock"
* Warning: Found no event able to derive walltime: prepending cpu-clock
* Info: Process launched (host ins01.benchmarkcenter.megware.com, process 10742)
miniqmc not built from git repository

number of ranks : 2, number of accelerators : 0
Number of orbitals/splines = 3072
Tile size = 3072
Number of tiles = 1
Number of electrons = 6144
Rmax = 1.7
AcceptanceRatio = 0.5
Iterations = 5
MPI processes = 2
OpenMP threads = 8
Number of walkers per rank = 8

SPO coefficients size = 1572864000 bytes (1500 MB)
delayed update rank = 32
Using the reference implementation for Jastrow, 
determinant update, and distance table + einspline of the 
reference implementation 
================================== 

Use --enable-timers= command line option to increase or decrease level of timing information
Stack timer profile
Timer                                       Inclusive_time  Exclusive_time  Calls       Time_per_call
Setup                                          0.0277     0.0277              1       0.027731689
  ParticleSet:::update                         0.0000     0.0000              1       0.000003930
Total                                         39.0489     1.4427              1      39.048934166
  Diffusion                                   23.9821     0.0225              5       4.796417655
    Complete Updates                           0.2178     0.0000              5       0.043566485
      DeterminantRef::update                   0.2178     0.2178             10       0.021781294
    Current Gradient                           0.5933     0.0141          30720       0.000019314
      DeterminantRef::ratio                    0.5737     0.5737          30720       0.000018674
      OneBodyJastrowRef                        0.0031     0.0031          30720       0.000000099
      TwoBodyJastrowRef                        0.0025     0.0025          30720       0.000000082
    Kinetic Energy                             0.1283     0.1281              5       0.025659799
      OneBodyJastrowRef                        0.0001     0.0001              5       0.000022730
      TwoBodyJastrowRef                        0.0001     0.0001              5       0.000022632
    New Gradient                               3.9859     0.0178          30720       0.000129749
      DeterminantRef::ratio                    0.0486     0.0486          30720       0.000001581
      DeterminantRef::spovgl                   3.5465     0.2606          30720       0.000115445
        Single-Particle Orbitals               3.2859     3.2859          30720       0.000106963
      OneBodyJastrowRef                        0.0306     0.0306          30720       0.000000997
      TwoBodyJastrowRef                        0.3424     0.3424          30720       0.000011147
    ParticleSet:::acceptMove                   0.4328     0.0047          15371       0.000028157
      DTAAOMPTarget::update_e_e                0.4193     0.4193          15371       0.000027279
      DTABOMPTarget::update_ion_e              0.0088     0.0088          15371       0.000000572
    ParticleSet:::computeNewPosDT              0.4986     0.0074          30720       0.000016230
      DTAAOMPTarget::move_e_e                  0.4171     0.4171          30720       0.000013577
      DTABOMPTarget::move_ion_e                0.0741     0.0741          30720       0.000002414
    ParticleSet:::donePbyP                     0.0000     0.0000              5       0.000002314
    Update                                    18.1028     0.0090          15371       0.001177726
      DeterminantRef::update                  17.7679    17.7679          15371       0.001155940
      OneBodyJastrowRef                        0.0017     0.0017          15371       0.000000113
      TwoBodyJastrowRef                        0.3241     0.3241          15371       0.000021088
  Initialization                               4.6790     0.4955              1       4.678958897
    DeterminantRef::inverse                    3.3333     3.3333              2       1.666657822
    DeterminantRef::spovgl                     0.7193     0.0543              2       0.359636827
      Single-Particle Orbitals                 0.6650     0.6650           6144       0.000108236
    OneBodyJastrowRef                          0.0059     0.0059              1       0.005860267
    ParticleSet:::update                       0.0652     0.0132              2       0.032587268
      DTAAOMPTarget::evaluate_e_e              0.0464     0.0464              1       0.046408928
      DTABOMPTarget::evaluate_ion_e            0.0056     0.0002              1       0.005551654
        DTABOMPTarget::offload_ion_e           0.0054     0.0054              1       0.005365528
    TwoBodyJastrowRef                          0.0599     0.0599              1       0.059851527
  Pseudopotential                              8.9452     0.0433              5       1.789038936
    DeterminantRef::spoval                     7.3072     0.1580          10215       0.000715337
      Single-Particle Orbitals                 7.1492     7.1492         122580       0.000058323
    OneBodyJastrowRef                          0.0135     0.0135          10215       0.000001323
    ParticleSet:::update                       0.9804     0.0072          10215       0.000095979
      DTABOMPTarget::evaluate_e_virtual        0.8855     0.0024          10215       0.000086683
        DTABOMPTarget::offload_e_virtual       0.8831     0.8831          10215       0.000086451
      DTABOMPTarget::evaluate_ion_virtual      0.0878     0.0023          10215       0.000008596
        DTABOMPTarget::offload_ion_virtual     0.0855     0.0855          10215       0.000008373
    TwoBodyJastrowRef                          0.6007     0.6007          10215       0.000058809

========== Throughput ============ 

Total throughput ( N_walkers * N_elec^3 / Total time ) = 9.50308e+10
Diffusion throughput ( N_walkers * N_elec^3 / Diffusion time ) = 1.54734e+11
Pseudopotential throughput ( N_walkers * N_elec^2 / Pseudopotential time ) = 6.752e+07


* Info: Process finished (host ins01.benchmarkcenter.megware.com, process 10736)
* Warning: (host ins01.benchmarkcenter.megware.com, process 10736) Observed more threads (9) than expected (8): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=9.

* Info: Process finished (host ins01.benchmarkcenter.megware.com, process 10742)
* Warning: (host ins01.benchmarkcenter.megware.com, process 10742) Observed more threads (9) than expected (8): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=9.


Your experiment path is /beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_3

To display your profiling results:
##########################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                 COMMAND                                                                                 #
##########################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_3      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_3  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_3  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_3  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_3      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_3  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_3  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_3  #
##########################################################################################################################################################################################################


* Info: Selecting the 'perf-high-ppn' engine for node ins01.benchmarkcenter.megware.com

* Info: Process launched (host ins01.benchmarkcenter.megware.com, process 10842)
* Info: "ref-cycles" not supported on ins01.benchmarkcenter.megware.com: fallback to "cpu-clock"
* Warning: Found no event able to derive walltime: prepending cpu-clock
* Info: Process launched (host ins01.benchmarkcenter.megware.com, process 10847)
miniqmc not built from git repository

number of ranks : 2, number of accelerators : 0
Number of orbitals/splines = 3072
Tile size = 3072
Number of tiles = 1
Number of electrons = 6144
Rmax = 1.7
AcceptanceRatio = 0.5
Iterations = 5
MPI processes = 2
OpenMP threads = 16
Number of walkers per rank = 16

SPO coefficients size = 1572864000 bytes (1500 MB)
delayed update rank = 32
Using the reference implementation for Jastrow, 
determinant update, and distance table + einspline of the 
reference implementation 
================================== 

Use --enable-timers= command line option to increase or decrease level of timing information
Stack timer profile
Timer                                       Inclusive_time  Exclusive_time  Calls       Time_per_call
Setup                                          0.0390     0.0390              1       0.038969086
  ParticleSet:::update                         0.0000     0.0000              1       0.000004110
Total                                         60.8123     2.5198              1      60.812255605
  Diffusion                                   39.7634     0.0242              5       7.952670049
    Complete Updates                           0.6626     0.0000              5       0.132518393
      DeterminantRef::update                   0.6626     0.6626             10       0.066255673
    Current Gradient                           0.6791     0.0159          30720       0.000022106
      DeterminantRef::ratio                    0.6576     0.6576          30720       0.000021406
      OneBodyJastrowRef                        0.0031     0.0031          30720       0.000000100
      TwoBodyJastrowRef                        0.0026     0.0026          30720       0.000000083
    Kinetic Energy                             0.3265     0.3262              5       0.065301688
      OneBodyJastrowRef                        0.0002     0.0002              5       0.000038933
      TwoBodyJastrowRef                        0.0001     0.0001              5       0.000026736
    New Gradient                               5.5642     0.0185          30720       0.000181127
      DeterminantRef::ratio                    0.0496     0.0496          30720       0.000001616
      DeterminantRef::spovgl                   5.1077     0.2809          30720       0.000166266
        Single-Particle Orbitals               4.8267     4.8267          30720       0.000157120
      OneBodyJastrowRef                        0.0302     0.0302          30720       0.000000982
      TwoBodyJastrowRef                        0.3582     0.3582          30720       0.000011661
    ParticleSet:::acceptMove                   0.7262     0.0055          15371       0.000047247
      DTAAOMPTarget::update_e_e                0.7100     0.7100          15371       0.000046191
      DTABOMPTarget::update_ion_e              0.0107     0.0107          15371       0.000000698
    ParticleSet:::computeNewPosDT              0.4976     0.0076          30720       0.000016198
      DTAAOMPTarget::move_e_e                  0.4280     0.4280          30720       0.000013931
      DTABOMPTarget::move_ion_e                0.0621     0.0621          30720       0.000002021
    ParticleSet:::donePbyP                     0.0000     0.0000              5       0.000002712
    Update                                    31.2828     0.0106          15371       0.002035186
      DeterminantRef::update                  30.8737    30.8737          15371       0.002008571
      OneBodyJastrowRef                        0.0018     0.0018          15371       0.000000115
      TwoBodyJastrowRef                        0.3968     0.3968          15371       0.000025813
  Initialization                               5.1519     0.4056              1       5.151923367
    DeterminantRef::inverse                    3.4984     3.4984              2       1.749183001
    DeterminantRef::spovgl                     1.0602     0.0647              2       0.530083384
      Single-Particle Orbitals                 0.9954     0.9954           6144       0.000162016
    OneBodyJastrowRef                          0.0055     0.0055              1       0.005517164
    ParticleSet:::update                       0.1231     0.0213              2       0.061530898
      DTAAOMPTarget::evaluate_e_e              0.0915     0.0915              1       0.091466049
      DTABOMPTarget::evaluate_ion_e            0.0103     0.0001              1       0.010287650
        DTABOMPTarget::offload_ion_e           0.0102     0.0102              1       0.010218093
    TwoBodyJastrowRef                          0.0592     0.0592              1       0.059207473
  Pseudopotential                             13.3772     0.0646              5       2.675446116
    DeterminantRef::spoval                    10.8740     0.1989          10215       0.001064513
      Single-Particle Orbitals                10.6751    10.6751         122580       0.000087087
    OneBodyJastrowRef                          0.0259     0.0259          10215       0.000002531
    ParticleSet:::update                       1.7148     0.0127          10215       0.000167868
      DTABOMPTarget::evaluate_e_virtual        1.5008     0.0035          10215       0.000146922
        DTABOMPTarget::offload_e_virtual       1.4973     1.4973          10215       0.000146579
      DTABOMPTarget::evaluate_ion_virtual      0.2013     0.0045          10215       0.000019703
        DTABOMPTarget::offload_ion_virtual     0.1968     0.1968          10215       0.000019263
    TwoBodyJastrowRef                          0.6980     0.6980          10215       0.000068332

========== Throughput ============ 

Total throughput ( N_walkers * N_elec^3 / Total time ) = 1.22043e+11
Diffusion throughput ( N_walkers * N_elec^3 / Diffusion time ) = 1.86647e+11
Pseudopotential throughput ( N_walkers * N_elec^2 / Pseudopotential time ) = 9.02997e+07
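
Note: the three throughput figures above can be cross-checked against the timer table of the same run. The sketch below is only an illustration, not part of miniqmc or MAQAO. It assumes that N_walkers counts walkers over all MPI ranks (2 ranks * 16 walkers per rank), an interpretation inferred from the numbers rather than stated in the log. The helper name `throughput` is hypothetical. The times are the rounded Inclusive_time values printed for Total, Diffusion and Pseudopotential.

```python
# Cross-check of the printed throughput lines for the 16-walkers-per-rank run.
# Assumption: N_walkers is the walker count summed over all MPI ranks.

def throughput(n_walkers, n_elec, seconds, power):
    """Return N_walkers * N_elec**power / seconds, as in the log's formulas."""
    return n_walkers * n_elec ** power / seconds

ranks = 2                # "MPI processes = 2"
walkers_per_rank = 16    # "Number of walkers per rank = 16"
n_elec = 6144            # "Number of electrons = 6144"
n_walkers = ranks * walkers_per_rank

# Inclusive times (seconds) copied from the timer table of this run.
total = throughput(n_walkers, n_elec, 60.8123, 3)
diffusion = throughput(n_walkers, n_elec, 39.7634, 3)
pseudopotential = throughput(n_walkers, n_elec, 13.3772, 2)

print(f"Total throughput           = {total:.5e}")
print(f"Diffusion throughput       = {diffusion:.5e}")
print(f"Pseudopotential throughput = {pseudopotential:.5e}")
```

Under that assumption the computed values agree with the printed ones up to the rounding of the timer values, which suggests the figures describe the whole job rather than a single rank.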


* Info: Process finished (host ins01.benchmarkcenter.megware.com, process 10847)
* Warning: (host ins01.benchmarkcenter.megware.com, process 10847) Observed more threads (17) than expected (16): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=17.

* Info: Process finished (host ins01.benchmarkcenter.megware.com, process 10842)
* Warning: (host ins01.benchmarkcenter.megware.com, process 10842) Observed more threads (17) than expected (16): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=17.


Your experiment path is /beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_4

To display your profiling results:
##########################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                 COMMAND                                                                                 #
##########################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_4      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_4  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_4  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_4  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_4      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_4  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_4  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_4  #
##########################################################################################################################################################################################################


* Info: Selecting the 'perf-high-ppn' engine for node ins01.benchmarkcenter.megware.com

* Info: Process launched (host ins01.benchmarkcenter.megware.com, process 10980)
* Info: "ref-cycles" not supported on ins01.benchmarkcenter.megware.com: fallback to "cpu-clock"
* Warning: Found no event able to derive walltime: prepending cpu-clock
* Info: Process launched (host ins01.benchmarkcenter.megware.com, process 10985)
miniqmc not built from git repository

number of ranks : 2, number of accelerators : 0
Number of orbitals/splines = 3072
Tile size = 3072
Number of tiles = 1
Number of electrons = 6144
Rmax = 1.7
AcceptanceRatio = 0.5
Iterations = 5
MPI processes = 2
OpenMP threads = 32
Number of walkers per rank = 32

SPO coefficients size = 1572864000 bytes (1500 MB)
delayed update rank = 32
Using the reference implementation for Jastrow, 
determinant update, and distance table + einspline of the 
reference implementation 
================================== 

Use --enable-timers= command line option to increase or decrease level of timing information
Stack timer profile
Timer                                       Inclusive_time  Exclusive_time  Calls       Time_per_call
Setup                                          0.0471     0.0471              1       0.047076992
  ParticleSet:::update                         0.0000     0.0000              1       0.000004590
Total                                         72.4020     2.4148              1      72.401972073
  Diffusion                                   43.6819     0.0284              5       8.736384735
    Complete Updates                           0.6191     0.0001              5       0.123820062
      DeterminantRef::update                   0.6190     0.6190             10       0.061904442
    Current Gradient                           0.7470     0.0208          30720       0.000024315
      DeterminantRef::ratio                    0.7204     0.7204          30720       0.000023452
      OneBodyJastrowRef                        0.0033     0.0033          30720       0.000000108
      TwoBodyJastrowRef                        0.0024     0.0024          30720       0.000000077
    Kinetic Energy                             0.3696     0.3690              5       0.073920943
      OneBodyJastrowRef                        0.0004     0.0004              5       0.000078303
      TwoBodyJastrowRef                        0.0002     0.0002              5       0.000037685
    New Gradient                               4.9327     0.0229          30720       0.000160569
      DeterminantRef::ratio                    0.0511     0.0511          30720       0.000001662
      DeterminantRef::spovgl                   4.4337     0.2925          30720       0.000144327
        Single-Particle Orbitals               4.1413     4.1413          30720       0.000134807
      OneBodyJastrowRef                        0.0358     0.0358          30720       0.000001165
      TwoBodyJastrowRef                        0.3892     0.3892          30720       0.000012669
    ParticleSet:::acceptMove                   0.9819     0.0066          15371       0.000063880
      DTAAOMPTarget::update_e_e                0.9620     0.9620          15371       0.000062586
      DTABOMPTarget::update_ion_e              0.0133     0.0133          15371       0.000000865
    ParticleSet:::computeNewPosDT              0.5195     0.0079          30720       0.000016909
      DTAAOMPTarget::move_e_e                  0.4447     0.4447          30720       0.000014476
      DTABOMPTarget::move_ion_e                0.0669     0.0669          30720       0.000002177
    ParticleSet:::donePbyP                     0.0000     0.0000              5       0.000006526
    Update                                    35.4838     0.0133          15371       0.002308487
      DeterminantRef::update                  35.0338    35.0338          15371       0.002279213
      OneBodyJastrowRef                        0.0018     0.0018          15371       0.000000118
      TwoBodyJastrowRef                        0.4348     0.4348          15371       0.000028289
  Initialization                               5.9679     0.9192              1       5.967853675
    DeterminantRef::inverse                    3.5327     3.5327              2       1.766365794
    DeterminantRef::spovgl                     1.3056     0.1092              2       0.652819295
      Single-Particle Orbitals                 1.1964     1.1964           6144       0.000194726
    OneBodyJastrowRef                          0.0075     0.0075              1       0.007520984
    ParticleSet:::update                       0.1395     0.0319              2       0.069737106
      DTAAOMPTarget::evaluate_e_e              0.0913     0.0913              1       0.091260234
      DTABOMPTarget::evaluate_ion_e            0.0163     0.0001              1       0.016271872
        DTABOMPTarget::offload_ion_e           0.0162     0.0162              1       0.016174114
    TwoBodyJastrowRef                          0.0633     0.0633              1       0.063312495
  Pseudopotential                             20.3374     0.1086              5       4.067480435
    DeterminantRef::spoval                    16.9228     0.4312          10215       0.001656663
      Single-Particle Orbitals                16.4916    16.4916         122580       0.000134537
    OneBodyJastrowRef                          0.0494     0.0494          10215       0.000004836
    ParticleSet:::update                       2.4007     0.0240          10215       0.000235019
      DTABOMPTarget::evaluate_e_virtual        2.0929     0.0058          10215       0.000204888
        DTABOMPTarget::offload_e_virtual       2.0871     2.0871          10215       0.000204315
      DTABOMPTarget::evaluate_ion_virtual      0.2838     0.0070          10215       0.000027787
        DTABOMPTarget::offload_ion_virtual     0.2769     0.2769          10215       0.000027106
    TwoBodyJastrowRef                          0.8558     0.8558          10215       0.000083781

========== Throughput ============ 

Total throughput ( N_walkers * N_elec^3 / Total time ) = 2.05014e+11
Diffusion throughput ( N_walkers * N_elec^3 / Diffusion time ) = 3.39807e+11
Pseudopotential throughput ( N_walkers * N_elec^2 / Pseudopotential time ) = 1.18792e+08


* Info: Process finished (host ins01.benchmarkcenter.megware.com, process 10985)
* Warning: (host ins01.benchmarkcenter.megware.com, process 10985) Observed more threads (33) than expected (32): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=33.

* Info: Process finished (host ins01.benchmarkcenter.megware.com, process 10980)
* Warning: (host ins01.benchmarkcenter.megware.com, process 10980) Observed more threads (33) than expected (32): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=33.


Your experiment path is /beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_5

To display your profiling results:
##########################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                 COMMAND                                                                                 #
##########################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_5      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_5  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_5  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_5  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_5      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_5  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_5  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_5  #
##########################################################################################################################################################################################################


* Info: Selecting the 'perf-high-ppn' engine for node ins01.benchmarkcenter.megware.com

* Info: Process launched (host ins01.benchmarkcenter.megware.com, process 11192)
* Info: "ref-cycles" not supported on ins01.benchmarkcenter.megware.com: fallback to "cpu-clock"
* Warning: Found no event able to derive walltime: prepending cpu-clock
* Info: Process launched (host ins01.benchmarkcenter.megware.com, process 11197)
miniqmc not built from git repository

number of ranks : 2, number of accelerators : 0
Number of orbitals/splines = 3072
Tile size = 3072
Number of tiles = 1
Number of electrons = 6144
Rmax = 1.7
AcceptanceRatio = 0.5
Iterations = 5
MPI processes = 2
OpenMP threads = 64
Number of walkers per rank = 64

SPO coefficients size = 1572864000 bytes (1500 MB)
delayed update rank = 32
Using the reference implementation for Jastrow, 
determinant update, and distance table + einspline of the 
reference implementation 
================================== 

Use --enable-timers= command line option to increase or decrease level of timing information
Stack timer profile
Timer                                       Inclusive_time  Exclusive_time  Calls       Time_per_call
Setup                                          0.0819     0.0819              1       0.081878342
  ParticleSet:::update                         0.0000     0.0000              1       0.000004240
Total                                        144.4898     9.0358              1     144.489822856
  Diffusion                                   85.1058     0.0385              5      17.021163321
    Complete Updates                           1.4654     0.0001              5       0.293073003
      DeterminantRef::update                   1.4653     1.4653             10       0.146530283
    Current Gradient                           1.1890     0.0294          30720       0.000038705
      DeterminantRef::ratio                    1.1519     1.1519          30720       0.000037496
      OneBodyJastrowRef                        0.0044     0.0044          30720       0.000000144
      TwoBodyJastrowRef                        0.0033     0.0033          30720       0.000000107
    Kinetic Energy                             0.7818     0.7809              5       0.156365937
      OneBodyJastrowRef                        0.0006     0.0006              5       0.000122264
      TwoBodyJastrowRef                        0.0003     0.0003              5       0.000069347
    New Gradient                               8.1921     0.0348          30720       0.000266669
      DeterminantRef::ratio                    0.0669     0.0669          30720       0.000002179
      DeterminantRef::spovgl                   7.4446     0.4268          30720       0.000242337
        Single-Particle Orbitals               7.0178     7.0178          30720       0.000228443
      OneBodyJastrowRef                        0.0724     0.0724          30720       0.000002356
      TwoBodyJastrowRef                        0.5734     0.5734          30720       0.000018664
    ParticleSet:::acceptMove                   1.8804     0.0096          15371       0.000122333
      DTAAOMPTarget::update_e_e                1.8445     1.8445          15371       0.000120001
      DTABOMPTarget::update_ion_e              0.0263     0.0263          15371       0.000001710
    ParticleSet:::computeNewPosDT              0.7151     0.0112          30720       0.000023279
      DTAAOMPTarget::move_e_e                  0.5901     0.5901          30720       0.000019207
      DTABOMPTarget::move_ion_e                0.1139     0.1139          30720       0.000003708
    ParticleSet:::donePbyP                     0.0000     0.0000              5       0.000008480
    Update                                    70.8435     0.0156          15371       0.004608907
      DeterminantRef::update                  70.0874    70.0874          15371       0.004559717
      OneBodyJastrowRef                        0.0028     0.0028          15371       0.000000183
      TwoBodyJastrowRef                        0.7377     0.7377          15371       0.000047993
  Initialization                               8.7707     2.2029              1       8.770723829
    DeterminantRef::inverse                    4.2089     4.2089              2       2.104468837
    DeterminantRef::spovgl                     2.0215     0.1668              2       1.010753705
      Single-Particle Orbitals                 1.8547     1.8547           6144       0.000301866
    OneBodyJastrowRef                          0.0073     0.0073              1       0.007268682
    ParticleSet:::update                       0.2685     0.0847              2       0.134233576
      DTAAOMPTarget::evaluate_e_e              0.1518     0.1518              1       0.151788628
      DTABOMPTarget::evaluate_ion_e            0.0320     0.0015              1       0.032019418
        DTABOMPTarget::offload_ion_e           0.0305     0.0305              1       0.030495178
    TwoBodyJastrowRef                          0.0617     0.0617              1       0.061657174
  Pseudopotential                             41.5775     0.2524              5       8.315491395
    DeterminantRef::spoval                    33.3096     1.1568          10215       0.003260849
      Single-Particle Orbitals                32.1527    32.1527         122580       0.000262300
    OneBodyJastrowRef                          0.1266     0.1266          10215       0.000012392
    ParticleSet:::update                       5.9741     0.0515          10215       0.000584835
      DTABOMPTarget::evaluate_e_virtual        5.3686     0.0158          10215       0.000525558
        DTABOMPTarget::offload_e_virtual       5.3528     5.3528          10215       0.000524011
      DTABOMPTarget::evaluate_ion_virtual      0.5540     0.0177          10215       0.000054235
        DTABOMPTarget::offload_ion_virtual     0.5363     0.5363          10215       0.000052500
    TwoBodyJastrowRef                          1.9148     1.9148          10215       0.000187450

========== Throughput ============ 

Total throughput ( N_walkers * N_elec^3 / Total time ) = 2.0546e+11
Diffusion throughput ( N_walkers * N_elec^3 / Diffusion time ) = 3.48822e+11
Pseudopotential throughput ( N_walkers * N_elec^2 / Pseudopotential time ) = 1.16213e+08


* Info: Process finished (host ins01.benchmarkcenter.megware.com, process 11197)
* Warning: (host ins01.benchmarkcenter.megware.com, process 11197) Observed more threads (65) than expected (64): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=65.

* Info: Process finished (host ins01.benchmarkcenter.megware.com, process 11192)
* Warning: (host ins01.benchmarkcenter.megware.com, process 11192) Observed more threads (65) than expected (64): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=65.


Your experiment path is /beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_6

To display your profiling results:
##########################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                 COMMAND                                                                                 #
##########################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_6      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_6  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_6  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_6  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_6      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_6  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_6  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_6  #
##########################################################################################################################################################################################################


* Info: Selecting the 'perf-high-ppn' engine for node ins01.benchmarkcenter.megware.com

* Info: Process launched (host ins01.benchmarkcenter.megware.com, process 11529)
* Info: "ref-cycles" not supported on ins01.benchmarkcenter.megware.com: fallback to "cpu-clock"
* Warning: Found no event able to derive walltime: prepending cpu-clock
* Info: Process launched (host ins01.benchmarkcenter.megware.com, process 11534)
miniqmc not built from git repository

number of ranks : 2, number of accelerators : 0
Number of orbitals/splines = 3072
Tile size = 3072
Number of tiles = 1
Number of electrons = 6144
Rmax = 1.7
AcceptanceRatio = 0.5
Iterations = 5
MPI processes = 2
OpenMP threads = 96
Number of walkers per rank = 96

SPO coefficients size = 1572864000 bytes (1500 MB)
delayed update rank = 32
Using the reference implementation for Jastrow, 
determinant update, and distance table + einspline of the 
reference implementation 
================================== 

Use --enable-timers= command line option to increase or decrease level of timing information
Stack timer profile
Timer                                       Inclusive_time  Exclusive_time  Calls       Time_per_call
Setup                                          0.0975     0.0975              1       0.097513502
  ParticleSet:::update                         0.0000     0.0000              1       0.000005160
Total                                        204.9459    16.3330              1     204.945875173
  Diffusion                                  118.0819     0.0499              5      23.616389721
    Complete Updates                           2.2506     0.0001              5       0.450122448
      DeterminantRef::update                   2.2505     2.2505             10       0.225048685
    Current Gradient                           1.8085     0.0402          30720       0.000058870
      DeterminantRef::ratio                    1.7543     1.7543          30720       0.000057107
      OneBodyJastrowRef                        0.0085     0.0085          30720       0.000000278
      TwoBodyJastrowRef                        0.0055     0.0055          30720       0.000000178
    Kinetic Energy                             1.2516     1.2500              5       0.250310493
      OneBodyJastrowRef                        0.0009     0.0009              5       0.000188035
      TwoBodyJastrowRef                        0.0006     0.0006              5       0.000119160
    New Gradient                              11.0659     0.0535          30720       0.000360217
      DeterminantRef::ratio                    0.0899     0.0899          30720       0.000002927
      DeterminantRef::spovgl                   9.9723     0.5091          30720       0.000324620
        Single-Particle Orbitals               9.4632     9.4632          30720       0.000308047
      OneBodyJastrowRef                        0.1156     0.1156          30720       0.000003762
      TwoBodyJastrowRef                        0.8346     0.8346          30720       0.000027167
    ParticleSet:::acceptMove                   2.6355     0.0185          15371       0.000171459
      DTAAOMPTarget::update_e_e                2.5744     2.5744          15371       0.000167485
      DTABOMPTarget::update_ion_e              0.0426     0.0426          15371       0.000002769
    ParticleSet:::computeNewPosDT              1.1022     0.0189          30720       0.000035878
      DTAAOMPTarget::move_e_e                  0.9338     0.9338          30720       0.000030398
      DTABOMPTarget::move_ion_e                0.1494     0.1494          30720       0.000004864
    ParticleSet:::donePbyP                     0.0000     0.0000              5       0.000009194
    Update                                    97.9178     0.0255          15371       0.006370294
      DeterminantRef::update                  96.8341    96.8341          15371       0.006299792
      OneBodyJastrowRef                        0.0053     0.0053          15371       0.000000343
      TwoBodyJastrowRef                        1.0529     1.0529          15371       0.000068497
  Initialization                              12.1054     3.1356              1      12.105418958
    DeterminantRef::inverse                    6.3592     6.3592              2       3.179622952
    DeterminantRef::spovgl                     2.0605     0.1380              2       1.030261826
      Single-Particle Orbitals                 1.9226     1.9226           6144       0.000312917
    OneBodyJastrowRef                          0.0056     0.0056              1       0.005555297
    ParticleSet:::update                       0.4855     0.1014              2       0.242737742
      DTAAOMPTarget::evaluate_e_e              0.3109     0.3109              1       0.310943563
      DTABOMPTarget::evaluate_ion_e            0.0731     0.0133              1       0.073110754
        DTABOMPTarget::offload_ion_e           0.0598     0.0598              1       0.059790068
    TwoBodyJastrowRef                          0.0590     0.0590              1       0.058994009
  Pseudopotential                             58.4255     0.3037              5      11.685109360
    DeterminantRef::spoval                    48.0185     1.8013          10215       0.004700788
      Single-Particle Orbitals                46.2173    46.2173         122580       0.000377037
    OneBodyJastrowRef                          0.1710     0.1710          10215       0.000016740
    ParticleSet:::update                       7.4549     0.0658          10215       0.000729798
      DTABOMPTarget::evaluate_e_virtual        6.6616     0.0262          10215       0.000652142
        DTABOMPTarget::offload_e_virtual       6.6354     6.6354          10215       0.000649577
      DTABOMPTarget::evaluate_ion_virtual      0.7274     0.0224          10215       0.000071212
        DTABOMPTarget::offload_ion_virtual     0.7050     0.7050          10215       0.000069020
    TwoBodyJastrowRef                          2.4775     2.4775          10215       0.000242532

========== Throughput ============ 

Total throughput ( N_walkers * N_elec^3 / Total time ) = 2.17278e+11
Diffusion throughput ( N_walkers * N_elec^3 / Diffusion time ) = 3.77113e+11
Pseudopotential throughput ( N_walkers * N_elec^2 / Pseudopotential time ) = 1.24051e+08
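
Note: the three throughput figures above can be reproduced from the run parameters and the inclusive timer values printed in this run's profile. The sketch below is only a cross-check, not part of the miniqmc output. It assumes that N_walkers in the formulas is the total over all MPI ranks (ranks times walkers per rank); that reading is inferred from the numbers, not stated by the log. The variable names are illustrative.

```python
# Cross-check of the reported throughput figures for the 96-thread run.
# Inputs are copied from the log above; times are the inclusive timer values.
ranks = 2
walkers_per_rank = 96
n_elec = 6144

# Assumption: N_walkers in the log's formulas counts walkers across all ranks.
n_walkers = ranks * walkers_per_rank

total_time = 204.9459            # "Total" inclusive time, seconds
diffusion_time = 118.0819        # "Diffusion" inclusive time, seconds
pseudopotential_time = 58.4255   # "Pseudopotential" inclusive time, seconds

# Total and Diffusion throughput scale with N_elec^3,
# Pseudopotential throughput with N_elec^2, as printed in the log.
total_throughput = n_walkers * n_elec**3 / total_time
diffusion_throughput = n_walkers * n_elec**3 / diffusion_time
pseudopotential_throughput = n_walkers * n_elec**2 / pseudopotential_time

print(f"Total throughput           = {total_throughput:.5e}")
print(f"Diffusion throughput       = {diffusion_throughput:.5e}")
print(f"Pseudopotential throughput = {pseudopotential_throughput:.5e}")
```

Because the timer table is rounded to four decimals, the recomputed values can differ from the logged ones in the last printed digit.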


* Info: Process finished (host ins01.benchmarkcenter.megware.com, process 11534)
* Warning: (host ins01.benchmarkcenter.megware.com, process 11534) Observed more threads (97) than expected (96): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=97.

* Info: Process finished (host ins01.benchmarkcenter.megware.com, process 11529)
* Warning: (host ins01.benchmarkcenter.megware.com, process 11529) Observed more threads (97) than expected (96): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=97.


Your experiment path is /beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_7

To display your profiling results:
##########################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                 COMMAND                                                                                 #
##########################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_7      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_7  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_7  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_7  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_7      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_7  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_7  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/aocc_13/oneview_results_scal/tools/lprof_npsu_run_7  #
##########################################################################################################################################################################################################
