Executable Output


* Info: Selecting the 'perf-high-ppn' engine for node ins01.benchmarkcenter.megware.com

* Info: Process launched (host ins01.benchmarkcenter.megware.com, process 11081)
* Info: "ref-cycles" not supported on ins01.benchmarkcenter.megware.com: fallback to "cpu-clock"
* Warning: Found no event able to derive walltime: prepending cpu-clock
* Info: Process launched (host ins01.benchmarkcenter.megware.com, process 11087)
miniqmc not built from git repository

number of ranks : 2, number of accelerators : 0
Number of orbitals/splines = 3072
Tile size = 3072
Number of tiles = 1
Number of electrons = 6144
Rmax = 1.7
AcceptanceRatio = 0.5
Iterations = 5
MPI processes = 2
OpenMP threads = 1
Number of walkers per rank = 1

SPO coefficients size = 1572864000 bytes (1500 MB)
delayed update rank = 32
Using the reference implementation for Jastrow, 
determinant update, and distance table + einspline of the 
reference implementation 
================================== 

Use --enable-timers= command line option to increase or decrease level of timing information
Stack timer profile
Timer                                       Inclusive_time  Exclusive_time  Calls       Time_per_call
Setup                                          0.1516     0.1516              1       0.151611170
  ParticleSet:::update                         0.0000     0.0000              1       0.000004450
Total                                         45.7749     0.0004              1      45.774878881
  Diffusion                                   29.8888     0.0153              5       5.977753402
    Complete Updates                           0.2120     0.0000              5       0.042392849
      DeterminantRef::update                   0.2119     0.2119             10       0.021194600
    Current Gradient                           0.5826     0.0095          30720       0.000018965
      DeterminantRef::ratio                    0.5686     0.5686          30720       0.000018509
      OneBodyJastrowRef                        0.0025     0.0025          30720       0.000000081
      TwoBodyJastrowRef                        0.0020     0.0020          30720       0.000000066
    Kinetic Energy                             0.1274     0.1271              5       0.025485379
      OneBodyJastrowRef                        0.0002     0.0002              5       0.000036339
      TwoBodyJastrowRef                        0.0001     0.0001              5       0.000020597
    New Gradient                              10.0924     0.0141          30720       0.000328529
      DeterminantRef::ratio                    0.0578     0.0578          30720       0.000001882
      DeterminantRef::spovgl                   9.6643     0.1508          30720       0.000314594
        Single-Particle Orbitals               9.5136     9.5136          30720       0.000309686
      OneBodyJastrowRef                        0.0290     0.0290          30720       0.000000945
      TwoBodyJastrowRef                        0.3272     0.3272          30720       0.000010650
    ParticleSet:::acceptMove                   0.4468     0.0047          15371       0.000029067
      DTAAOMPTarget::update_e_e                0.4334     0.4334          15371       0.000028195
      DTABOMPTarget::update_ion_e              0.0087     0.0087          15371       0.000000566
    ParticleSet:::computeNewPosDT              0.4571     0.0068          30720       0.000014879
      DTAAOMPTarget::move_e_e                  0.4040     0.4040          30720       0.000013151
      DTABOMPTarget::move_ion_e                0.0463     0.0463          30720       0.000001508
    ParticleSet:::donePbyP                     0.0000     0.0000              5       0.000001050
    Update                                    17.9551     0.0078          15371       0.001168118
      DeterminantRef::update                  17.5852    17.5852          15371       0.001144053
      OneBodyJastrowRef                        0.0014     0.0014          15371       0.000000091
      TwoBodyJastrowRef                        0.3607     0.3607          15371       0.000023469
  Initialization                               5.6411     0.1638              1       5.641131214
    DeterminantRef::inverse                    3.4098     3.4098              2       1.704919811
    DeterminantRef::spovgl                     1.9292     0.0320              2       0.964575216
      Single-Particle Orbitals                 1.8971     1.8971           6144       0.000308777
    OneBodyJastrowRef                          0.0055     0.0055              1       0.005538173
    ParticleSet:::update                       0.0754     0.0091              2       0.037718681
      DTAAOMPTarget::evaluate_e_e              0.0460     0.0460              1       0.045992610
      DTABOMPTarget::evaluate_ion_e            0.0203     0.0001              1       0.020330647
        DTABOMPTarget::offload_ion_e           0.0203     0.0203              1       0.020274098
    TwoBodyJastrowRef                          0.0574     0.0574              1       0.057359412
  Pseudopotential                             10.2446     0.0302              5       2.048921322
    DeterminantRef::spoval                     4.6111     0.1539          10215       0.000451408
      Single-Particle Orbitals                 4.4573     4.4573         122580       0.000036362
    OneBodyJastrowRef                          0.0127     0.0127          10215       0.000001242
    ParticleSet:::update                       5.0823     0.0057          10215       0.000497535
      DTABOMPTarget::evaluate_e_virtual        4.6825     0.0015          10215       0.000458397
        DTABOMPTarget::offload_e_virtual       4.6811     4.6811          10215       0.000458254
      DTABOMPTarget::evaluate_ion_virtual      0.3941     0.0015          10215       0.000038584
        DTABOMPTarget::offload_ion_virtual     0.3927     0.3927          10215       0.000038440
    TwoBodyJastrowRef                          0.5083     0.5083          10215       0.000049758

========== Throughput ============ 

Total throughput ( N_walkers * N_elec^3 / Total time ) = 1.01334e+10
Diffusion throughput ( N_walkers * N_elec^3 / Diffusion time ) = 1.55194e+10
Pseudopotential throughput ( N_walkers * N_elec^2 / Pseudopotential time ) = 7.36948e+06
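
Note on the throughput figures: the three values above can be reproduced from the timer table with the formulas printed in the log. The sketch below is only a sanity check, not miniqmc code. It assumes that N_walkers counts walkers over all MPI ranks (2 ranks * 1 walker per rank = 2), which is the reading that matches the reported numbers, and it uses the rounded Inclusive_time values, so the last digit may differ slightly.

```python
# Values copied from the log of the first run (1 OpenMP thread per rank).
N_ELEC = 6144
RANKS = 2
WALKERS_PER_RANK = 1

# Inclusive times in seconds, taken from the "Stack timer profile" table.
TOTAL_TIME = 45.7749
DIFFUSION_TIME = 29.8888
PSEUDOPOTENTIAL_TIME = 10.2446


def throughput(n_walkers, n_elec, power, seconds):
    """N_walkers * N_elec^power / time, as written in the log."""
    return n_walkers * n_elec ** power / seconds


# Assumption: N_walkers is the walker count summed over all ranks.
n_walkers = RANKS * WALKERS_PER_RANK

total = throughput(n_walkers, N_ELEC, 3, TOTAL_TIME)
diffusion = throughput(n_walkers, N_ELEC, 3, DIFFUSION_TIME)
pseudopotential = throughput(n_walkers, N_ELEC, 2, PSEUDOPOTENTIAL_TIME)

print(f"Total throughput           = {total:.5e}")
print(f"Diffusion throughput       = {diffusion:.5e}")
print(f"Pseudopotential throughput = {pseudopotential:.5e}")
```

The same function applies to the later runs once WALKERS_PER_RANK and the three times are replaced by the values from their own tables.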


* Info: Process finished (host ins01.benchmarkcenter.megware.com, process 11081)
* Warning: (host ins01.benchmarkcenter.megware.com, process 11081) Observed more threads (2) than expected (1): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=2.

* Info: Process finished (host ins01.benchmarkcenter.megware.com, process 11087)
* Warning: (host ins01.benchmarkcenter.megware.com, process 11087) Observed more threads (2) than expected (1): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=2.


Info: 1/2 lprof instances finished


Your experiment path is /beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_0

To display your profiling results:
########################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                COMMAND                                                                                #
########################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_0      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_0  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_0  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_0  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_0      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_0  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_0  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_0  #
########################################################################################################################################################################################################


* Info: Selecting the 'perf-high-ppn' engine for node ins01.benchmarkcenter.megware.com

* Info: Process launched (host ins01.benchmarkcenter.megware.com, process 11164)
* Info: "ref-cycles" not supported on ins01.benchmarkcenter.megware.com: fallback to "cpu-clock"
* Warning: Found no event able to derive walltime: prepending cpu-clock
* Info: Process launched (host ins01.benchmarkcenter.megware.com, process 11170)
miniqmc not built from git repository

number of ranks : 2, number of accelerators : 0
Number of orbitals/splines = 3072
Tile size = 3072
Number of tiles = 1
Number of electrons = 6144
Rmax = 1.7
AcceptanceRatio = 0.5
Iterations = 5
MPI processes = 2
OpenMP threads = 2
Number of walkers per rank = 2

SPO coefficients size = 1572864000 bytes (1500 MB)
delayed update rank = 32
Using the reference implementation for Jastrow, 
determinant update, and distance table + einspline of the 
reference implementation 
================================== 

Use --enable-timers= command line option to increase or decrease level of timing information
Stack timer profile
Timer                                       Inclusive_time  Exclusive_time  Calls       Time_per_call
Setup                                          0.0804     0.0804              1       0.080398397
  ParticleSet:::update                         0.0000     0.0000              1       0.000014009
Total                                         47.3440     0.0008              1      47.343981961
  Diffusion                                   31.2085     0.0159              5       6.241696669
    Complete Updates                           0.2177     0.0000              5       0.043535747
      DeterminantRef::update                   0.2177     0.2177             10       0.021765949
    Current Gradient                           0.5891     0.0106          30720       0.000019177
      DeterminantRef::ratio                    0.5737     0.5737          30720       0.000018676
      OneBodyJastrowRef                        0.0027     0.0027          30720       0.000000088
      TwoBodyJastrowRef                        0.0021     0.0021          30720       0.000000067
    Kinetic Energy                             0.1283     0.1280              5       0.025662500
      OneBodyJastrowRef                        0.0002     0.0002              5       0.000034327
      TwoBodyJastrowRef                        0.0001     0.0001              5       0.000020546
    New Gradient                              11.0928     0.0157          30720       0.000361092
      DeterminantRef::ratio                    0.0578     0.0578          30720       0.000001883
      DeterminantRef::spovgl                  10.6616     0.1530          30720       0.000347057
        Single-Particle Orbitals              10.5086    10.5086          30720       0.000342076
      OneBodyJastrowRef                        0.0300     0.0300          30720       0.000000978
      TwoBodyJastrowRef                        0.3276     0.3276          30720       0.000010663
    ParticleSet:::acceptMove                   0.4488     0.0052          15371       0.000029199
      DTAAOMPTarget::update_e_e                0.4342     0.4342          15371       0.000028248
      DTABOMPTarget::update_ion_e              0.0094     0.0094          15371       0.000000613
    ParticleSet:::computeNewPosDT              0.4784     0.0068          30720       0.000015571
      DTAAOMPTarget::move_e_e                  0.4237     0.4237          30720       0.000013794
      DTABOMPTarget::move_ion_e                0.0478     0.0478          30720       0.000001555
    ParticleSet:::donePbyP                     0.0000     0.0000              5       0.000000868
    Update                                    18.2375     0.0084          15371       0.001186490
      DeterminantRef::update                  17.8652    17.8652          15371       0.001162265
      OneBodyJastrowRef                        0.0015     0.0015          15371       0.000000098
      TwoBodyJastrowRef                        0.3624     0.3624          15371       0.000023580
  Initialization                               5.8026     0.1668              1       5.802601303
    DeterminantRef::inverse                    3.4015     3.4015              2       1.700745887
    DeterminantRef::spovgl                     2.0961     0.0315              2       1.048046377
      Single-Particle Orbitals                 2.0646     2.0646           6144       0.000336038
    OneBodyJastrowRef                          0.0054     0.0054              1       0.005423816
    ParticleSet:::update                       0.0755     0.0093              2       0.037735786
      DTAAOMPTarget::evaluate_e_e              0.0460     0.0460              1       0.045973792
      DTABOMPTarget::evaluate_ion_e            0.0201     0.0001              1       0.020149256
        DTABOMPTarget::offload_ion_e           0.0201     0.0201              1       0.020089157
    TwoBodyJastrowRef                          0.0573     0.0573              1       0.057310913
  Pseudopotential                             10.3321     0.0314              5       2.066425075
    DeterminantRef::spoval                     4.6742     0.1545          10215       0.000457586
      Single-Particle Orbitals                 4.5197     4.5197         122580       0.000036872
    OneBodyJastrowRef                          0.0128     0.0128          10215       0.000001253
    ParticleSet:::update                       5.1072     0.0055          10215       0.000499970
      DTABOMPTarget::evaluate_e_virtual        4.6984     0.0016          10215       0.000459951
        DTABOMPTarget::offload_e_virtual       4.6968     4.6968          10215       0.000459794
      DTABOMPTarget::evaluate_ion_virtual      0.4033     0.0013          10215       0.000039478
        DTABOMPTarget::offload_ion_virtual     0.4020     0.4020          10215       0.000039352
    TwoBodyJastrowRef                          0.5065     0.5065          10215       0.000049581

========== Throughput ============ 

Total throughput ( N_walkers * N_elec^3 / Total time ) = 1.95952e+10
Diffusion throughput ( N_walkers * N_elec^3 / Diffusion time ) = 2.97263e+10
Pseudopotential throughput ( N_walkers * N_elec^2 / Pseudopotential time ) = 1.46141e+07


* Info: Process finished (host ins01.benchmarkcenter.megware.com, process 11170)
* Warning: (host ins01.benchmarkcenter.megware.com, process 11170) Observed more threads (3) than expected (2): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=3.

* Info: Process finished (host ins01.benchmarkcenter.megware.com, process 11164)
* Warning: (host ins01.benchmarkcenter.megware.com, process 11164) Observed more threads (3) than expected (2): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=3.


Your experiment path is /beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_1

To display your profiling results:
########################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                COMMAND                                                                                #
########################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_1      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_1  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_1  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_1  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_1      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_1  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_1  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_1  #
########################################################################################################################################################################################################


* Info: Selecting the 'perf-high-ppn' engine for node ins01.benchmarkcenter.megware.com

* Info: Process launched (host ins01.benchmarkcenter.megware.com, process 11250)
* Info: "ref-cycles" not supported on ins01.benchmarkcenter.megware.com: fallback to "cpu-clock"
* Warning: Found no event able to derive walltime: prepending cpu-clock
* Info: Process launched (host ins01.benchmarkcenter.megware.com, process 11255)
miniqmc not built from git repository

number of ranks : 2, number of accelerators : 0
Number of orbitals/splines = 3072
Tile size = 3072
Number of tiles = 1
Number of electrons = 6144
Rmax = 1.7
AcceptanceRatio = 0.5
Iterations = 5
MPI processes = 2
OpenMP threads = 4
Number of walkers per rank = 4

SPO coefficients size = 1572864000 bytes (1500 MB)
delayed update rank = 32
Using the reference implementation for Jastrow, 
determinant update, and distance table + einspline of the 
reference implementation 
================================== 

Use --enable-timers= command line option to increase or decrease level of timing information
Stack timer profile
Timer                                       Inclusive_time  Exclusive_time  Calls       Time_per_call
Setup                                          0.0431     0.0431              1       0.043110393
  ParticleSet:::update                         0.0000     0.0000              1       0.000006950
Total                                         48.3651     0.0008              1      48.365076335
  Diffusion                                   31.4730     0.0172              5       6.294608016
    Complete Updates                           0.2200     0.0000              5       0.043997558
      DeterminantRef::update                   0.2200     0.2200             10       0.021996746
    Current Gradient                           0.5895     0.0125          30720       0.000019189
      DeterminantRef::ratio                    0.5720     0.5720          30720       0.000018619
      OneBodyJastrowRef                        0.0028     0.0028          30720       0.000000092
      TwoBodyJastrowRef                        0.0022     0.0022          30720       0.000000072
    Kinetic Energy                             0.1275     0.1272              5       0.025494411
      OneBodyJastrowRef                        0.0001     0.0001              5       0.000023961
      TwoBodyJastrowRef                        0.0001     0.0001              5       0.000021722
    New Gradient                              11.3098     0.0185          30720       0.000368158
      DeterminantRef::ratio                    0.0592     0.0592          30720       0.000001927
      DeterminantRef::spovgl                  10.8725     0.1614          30720       0.000353921
        Single-Particle Orbitals              10.7111    10.7111          30720       0.000348668
      OneBodyJastrowRef                        0.0299     0.0299          30720       0.000000974
      TwoBodyJastrowRef                        0.3297     0.3297          30720       0.000010734
    ParticleSet:::acceptMove                   0.4172     0.0047          15371       0.000027145
      DTAAOMPTarget::update_e_e                0.4040     0.4040          15371       0.000026286
      DTABOMPTarget::update_ion_e              0.0085     0.0085          15371       0.000000555
    ParticleSet:::computeNewPosDT              0.4868     0.0063          30720       0.000015846
      DTAAOMPTarget::move_e_e                  0.4289     0.4289          30720       0.000013962
      DTABOMPTarget::move_ion_e                0.0516     0.0516          30720       0.000001679
    ParticleSet:::donePbyP                     0.0000     0.0000              5       0.000001024
    Update                                    18.3051     0.0089          15371       0.001190883
      DeterminantRef::update                  17.9324    17.9324          15371       0.001166637
      OneBodyJastrowRef                        0.0016     0.0016          15371       0.000000103
      TwoBodyJastrowRef                        0.3623     0.3623          15371       0.000023568
  Initialization                               6.0218     0.1735              1       6.021833767
    DeterminantRef::inverse                    3.4248     3.4248              2       1.712401374
    DeterminantRef::spovgl                     2.2822     0.0905              2       1.141117102
      Single-Particle Orbitals                 2.1918     2.1918           6144       0.000356734
    OneBodyJastrowRef                          0.0054     0.0054              1       0.005425576
    ParticleSet:::update                       0.0776     0.0114              2       0.038816313
      DTAAOMPTarget::evaluate_e_e              0.0459     0.0459              1       0.045872135
      DTABOMPTarget::evaluate_ion_e            0.0203     0.0001              1       0.020310562
        DTABOMPTarget::offload_ion_e           0.0202     0.0202              1       0.020160795
    TwoBodyJastrowRef                          0.0582     0.0582              1       0.058220775
  Pseudopotential                             10.8694     0.0305              5       2.173876791
    DeterminantRef::spoval                     5.1669     0.1549          10215       0.000505813
      Single-Particle Orbitals                 5.0120     5.0120         122580       0.000040887
    OneBodyJastrowRef                          0.0127     0.0127          10215       0.000001244
    ParticleSet:::update                       5.1486     0.0059          10215       0.000504023
      DTABOMPTarget::evaluate_e_virtual        4.7470     0.0015          10215       0.000464708
        DTABOMPTarget::offload_e_virtual       4.7454     4.7454          10215       0.000464557
      DTABOMPTarget::evaluate_ion_virtual      0.3958     0.0015          10215       0.000038742
        DTABOMPTarget::offload_ion_virtual     0.3943     0.3943          10215       0.000038598
    TwoBodyJastrowRef                          0.5107     0.5107          10215       0.000049999

========== Throughput ============ 

Total throughput ( N_walkers * N_elec^3 / Total time ) = 3.83629e+10
Diffusion throughput ( N_walkers * N_elec^3 / Diffusion time ) = 5.89529e+10
Pseudopotential throughput ( N_walkers * N_elec^2 / Pseudopotential time ) = 2.77835e+07
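
Note on scaling: in the runs so far the number of walkers per rank grows with the OpenMP thread count (1, 2, 4) while the Total time stays roughly constant, so the series reads as a weak-scaling test. The sketch below is a hedged illustration that derives a weak-scaling efficiency in two ways from the logged values: as a ratio of Total times, and as the reported Total throughput divided by the ideal (thread count times the 1-thread throughput). The function name and the dictionary layout are illustrative only and are not part of MAQAO or miniqmc.

```python
# OpenMP threads per rank -> (Total inclusive time [s], Total throughput),
# copied from the three runs logged above.
RUNS = {
    1: (45.7749, 1.01334e10),
    2: (47.3440, 1.95952e10),
    4: (48.3651, 3.83629e10),
}


def weak_scaling_efficiency(runs):
    """Return {threads: (time_based, throughput_based)} relative to the smallest run."""
    base_threads = min(runs)
    base_time, base_throughput = runs[base_threads]
    result = {}
    for threads, (time_s, thr) in sorted(runs.items()):
        scale = threads / base_threads
        time_based = base_time / time_s                    # ideal: constant time
        throughput_based = thr / (base_throughput * scale)  # ideal: linear growth
        result[threads] = (time_based, throughput_based)
    return result


efficiency = weak_scaling_efficiency(RUNS)
for threads, (by_time, by_throughput) in efficiency.items():
    print(f"{threads} thread(s): {by_time:.3f} (time), {by_throughput:.3f} (throughput)")
```

Both estimates should agree closely, since the throughput is itself computed from the walker count and the Total time.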


* Info: Process finished (host ins01.benchmarkcenter.megware.com, process 11255)
* Warning: (host ins01.benchmarkcenter.megware.com, process 11255) Observed more threads (5) than expected (4): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=5.

* Info: Process finished (host ins01.benchmarkcenter.megware.com, process 11250)
* Warning: (host ins01.benchmarkcenter.megware.com, process 11250) Observed more threads (5) than expected (4): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=5.


Your experiment path is /beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_2

To display your profiling results:
########################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                COMMAND                                                                                #
########################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_2      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_2  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_2  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_2  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_2      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_2  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_2  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_2  #
########################################################################################################################################################################################################


* Info: Selecting the 'perf-high-ppn' engine for node ins01.benchmarkcenter.megware.com

* Info: Process launched (host ins01.benchmarkcenter.megware.com, process 11343)
* Info: "ref-cycles" not supported on ins01.benchmarkcenter.megware.com: fallback to "cpu-clock"
* Warning: Found no event able to derive walltime: prepending cpu-clock
* Info: Process launched (host ins01.benchmarkcenter.megware.com, process 11348)
miniqmc not built from git repository

number of ranks : 2, number of accelerators : 0
Number of orbitals/splines = 3072
Tile size = 3072
Number of tiles = 1
Number of electrons = 6144
Rmax = 1.7
AcceptanceRatio = 0.5
Iterations = 5
MPI processes = 2
OpenMP threads = 8
Number of walkers per rank = 8

SPO coefficients size = 1572864000 bytes (1500 MB)
delayed update rank = 32
Using the reference implementation for Jastrow, 
determinant update, and distance table + einspline of the 
reference implementation 
================================== 

Use --enable-timers= command line option to increase or decrease level of timing information
Stack timer profile
Timer                                       Inclusive_time  Exclusive_time  Calls       Time_per_call
Setup                                          0.0271     0.0271              1       0.027126309
  ParticleSet:::update                         0.0000     0.0000              1       0.000006650
Total                                         49.2135     3.7971              1      49.213472019
  Diffusion                                   27.8492     0.0201              5       5.569837532
    Complete Updates                           0.2140     0.0000              5       0.042793557
      DeterminantRef::update                   0.2140     0.2140             10       0.021395153
    Current Gradient                           0.5832     0.0159          30720       0.000018983
      DeterminantRef::ratio                    0.5629     0.5629          30720       0.000018325
      OneBodyJastrowRef                        0.0024     0.0024          30720       0.000000077
      TwoBodyJastrowRef                        0.0019     0.0019          30720       0.000000063
    Kinetic Energy                             0.1251     0.1249              5       0.025024922
      OneBodyJastrowRef                        0.0001     0.0001              5       0.000017318
      TwoBodyJastrowRef                        0.0001     0.0001              5       0.000021623
    New Gradient                               7.7493     0.0238          30720       0.000252257
      DeterminantRef::ratio                    0.0580     0.0580          30720       0.000001887
      DeterminantRef::spovgl                   7.3109     0.1264          30720       0.000237984
        Single-Particle Orbitals               7.1845     7.1845          30720       0.000233869
      OneBodyJastrowRef                        0.0272     0.0272          30720       0.000000887
      TwoBodyJastrowRef                        0.3295     0.3295          30720       0.000010725
    ParticleSet:::acceptMove                   0.4210     0.0050          15371       0.000027388
      DTAAOMPTarget::update_e_e                0.4077     0.4077          15371       0.000026522
      DTABOMPTarget::update_ion_e              0.0083     0.0083          15371       0.000000539
    ParticleSet:::computeNewPosDT              0.4924     0.0065          30720       0.000016030
      DTAAOMPTarget::move_e_e                  0.4403     0.4403          30720       0.000014334
      DTABOMPTarget::move_ion_e                0.0456     0.0456          30720       0.000001483
    ParticleSet:::donePbyP                     0.0000     0.0000              5       0.000000934
    Update                                    18.2441     0.0095          15371       0.001186916
      DeterminantRef::update                  17.8737    17.8737          15371       0.001162818
      OneBodyJastrowRef                        0.0015     0.0015          15371       0.000000095
      TwoBodyJastrowRef                        0.3594     0.3594          15371       0.000023384
  Initialization                               6.0059     0.9717              1       6.005892711
    DeterminantRef::inverse                    3.4059     3.4059              2       1.702952448
    DeterminantRef::spovgl                     1.4877     0.0636              2       0.743826837
      Single-Particle Orbitals                 1.4241     1.4241           6144       0.000231782
    OneBodyJastrowRef                          0.0054     0.0054              1       0.005399856
    ParticleSet:::update                       0.0771     0.0099              2       0.038574248
      DTAAOMPTarget::evaluate_e_e              0.0461     0.0461              1       0.046088010
      DTABOMPTarget::evaluate_ion_e            0.0211     0.0001              1       0.021114266
        DTABOMPTarget::offload_ion_e           0.0210     0.0210              1       0.021045217
    TwoBodyJastrowRef                          0.0581     0.0581              1       0.058074208
  Pseudopotential                             11.5613     0.0288              5       2.312258595
    DeterminantRef::spoval                     5.9343     0.1587          10215       0.000580940
      Single-Particle Orbitals                 5.7756     5.7756         122580       0.000047117
    OneBodyJastrowRef                          0.0126     0.0126          10215       0.000001235
    ParticleSet:::update                       5.0762     0.0057          10215       0.000496936
      DTABOMPTarget::evaluate_e_virtual        4.6715     0.0017          10215       0.000457320
        DTABOMPTarget::offload_e_virtual       4.6698     4.6698          10215       0.000457151
      DTABOMPTarget::evaluate_ion_virtual      0.3990     0.0019          10215       0.000039061
        DTABOMPTarget::offload_ion_virtual     0.3971     0.3971          10215       0.000038874
    TwoBodyJastrowRef                          0.5094     0.5094          10215       0.000049863

========== Throughput ============ 

Total throughput ( N_walkers * N_elec^3 / Total time ) = 7.54032e+10
Diffusion throughput ( N_walkers * N_elec^3 / Diffusion time ) = 1.33248e+11
Pseudopotential throughput ( N_walkers * N_elec^2 / Pseudopotential time ) = 5.22415e+07


* Info: Process finished (host ins01.benchmarkcenter.megware.com, process 11348)
* Warning: (host ins01.benchmarkcenter.megware.com, process 11348) Observed more threads (9) than expected (8): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=9.

* Info: Process finished (host ins01.benchmarkcenter.megware.com, process 11343)
* Warning: (host ins01.benchmarkcenter.megware.com, process 11343) Observed more threads (9) than expected (8): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=9.


Your experiment path is /beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_3

To display your profiling results:
########################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                COMMAND                                                                                #
########################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_3      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_3  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_3  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_3  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_3      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_3  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_3  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_3  #
########################################################################################################################################################################################################


* Info: Selecting the 'perf-high-ppn' engine for node ins01.benchmarkcenter.megware.com

* Info: Process launched (host ins01.benchmarkcenter.megware.com, process 11450)
* Info: "ref-cycles" not supported on ins01.benchmarkcenter.megware.com: fallback to "cpu-clock"
* Warning: Found no event able to derive walltime: prepending cpu-clock
* Info: Process launched (host ins01.benchmarkcenter.megware.com, process 11455)
miniqmc not built from git repository

number of ranks : 2, number of accelerators : 0
Number of orbitals/splines = 3072
Tile size = 3072
Number of tiles = 1
Number of electrons = 6144
Rmax = 1.7
AcceptanceRatio = 0.5
Iterations = 5
MPI processes = 2
OpenMP threads = 16
Number of walkers per rank = 16

SPO coefficients size = 1572864000 bytes (1500 MB)
delayed update rank = 32
Using the reference implementation for Jastrow, 
determinant update, and distance table + einspline of the 
reference implementation 
================================== 

Use --enable-timers= command line option to increase or decrease level of timing information
Stack timer profile
Timer                                       Inclusive_time  Exclusive_time  Calls       Time_per_call
Setup                                          0.0360     0.0360              1       0.035961313
  ParticleSet:::update                         0.0000     0.0000              1       0.000003170
Total                                         64.4561     0.0012              1      64.456058504
  Diffusion                                   45.3213     0.0212              5       9.064257253
    Complete Updates                           0.3894     0.0000              5       0.077874635
      DeterminantRef::update                   0.3894     0.3894             10       0.038935324
    Current Gradient                           0.6546     0.0171          30720       0.000021309
      DeterminantRef::ratio                    0.6324     0.6324          30720       0.000020585
      OneBodyJastrowRef                        0.0029     0.0029          30720       0.000000094
      TwoBodyJastrowRef                        0.0022     0.0022          30720       0.000000072
    Kinetic Energy                             0.1962     0.1959              5       0.039242888
      OneBodyJastrowRef                        0.0001     0.0001              5       0.000025721
      TwoBodyJastrowRef                        0.0002     0.0002              5       0.000031894
    New Gradient                              17.1338     0.0273          30720       0.000557741
      DeterminantRef::ratio                    0.0623     0.0623          30720       0.000002029
      DeterminantRef::spovgl                  16.5998     0.2542          30720       0.000540359
        Single-Particle Orbitals              16.3456    16.3456          30720       0.000532083
      OneBodyJastrowRef                        0.0525     0.0525          30720       0.000001708
      TwoBodyJastrowRef                        0.3919     0.3919          30720       0.000012758
    ParticleSet:::acceptMove                   0.6478     0.0067          15371       0.000042147
      DTAAOMPTarget::update_e_e                0.6281     0.6281          15371       0.000040863
      DTABOMPTarget::update_ion_e              0.0130     0.0130          15371       0.000000845
    ParticleSet:::computeNewPosDT              0.5467     0.0079          30720       0.000017798
      DTAAOMPTarget::move_e_e                  0.4802     0.4802          30720       0.000015631
      DTABOMPTarget::move_ion_e                0.0587     0.0587          30720       0.000001911
    ParticleSet:::donePbyP                     0.0000     0.0000              5       0.000001260
    Update                                    25.7315     0.0115          15371       0.001674027
      DeterminantRef::update                  25.2489    25.2489          15371       0.001642631
      OneBodyJastrowRef                        0.0017     0.0017          15371       0.000000108
      TwoBodyJastrowRef                        0.4694     0.4694          15371       0.000030538
  Initialization                               7.2625     0.3245              1       7.262494668
    DeterminantRef::inverse                    3.5178     3.5178              2       1.758911836
    DeterminantRef::spovgl                     3.2662     0.0340              2       1.633097948
      Single-Particle Orbitals                 3.2322     3.2322           6144       0.000526078
    OneBodyJastrowRef                          0.0055     0.0055              1       0.005484015
    ParticleSet:::update                       0.0903     0.0139              2       0.045141225
      DTAAOMPTarget::evaluate_e_e              0.0557     0.0557              1       0.055711248
      DTABOMPTarget::evaluate_ion_e            0.0207     0.0001              1       0.020683845
        DTABOMPTarget::offload_ion_e           0.0206     0.0206              1       0.020625696
    TwoBodyJastrowRef                          0.0582     0.0582              1       0.058227025
  Pseudopotential                             11.8711     0.0335              5       2.374210953
    DeterminantRef::spoval                     6.1868     0.1783          10215       0.000605656
      Single-Particle Orbitals                 6.0085     6.0085         122580       0.000049017
    OneBodyJastrowRef                          0.0165     0.0165          10215       0.000001612
    ParticleSet:::update                       5.1113     0.0082          10215       0.000500372
      DTABOMPTarget::evaluate_e_virtual        4.6761     0.0021          10215       0.000457770
        DTABOMPTarget::offload_e_virtual       4.6740     4.6740          10215       0.000457566
      DTABOMPTarget::evaluate_ion_virtual      0.4269     0.0027          10215       0.000041796
        DTABOMPTarget::offload_ion_virtual     0.4242     0.4242          10215       0.000041528
    TwoBodyJastrowRef                          0.5230     0.5230          10215       0.000051202

========== Throughput ============ 

Total throughput ( N_walkers * N_elec^3 / Total time ) = 1.15144e+11
Diffusion throughput ( N_walkers * N_elec^3 / Diffusion time ) = 1.63758e+11
Pseudopotential throughput ( N_walkers * N_elec^2 / Pseudopotential time ) = 1.01757e+08


* Info: Process finished (host ins01.benchmarkcenter.megware.com, process 11455)
* Warning: (host ins01.benchmarkcenter.megware.com, process 11455) Observed more threads (17) than expected (16): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=17.

* Info: Process finished (host ins01.benchmarkcenter.megware.com, process 11450)
* Warning: (host ins01.benchmarkcenter.megware.com, process 11450) Observed more threads (17) than expected (16): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=17.


Your experiment path is /beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_4

To display your profiling results:
########################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                COMMAND                                                                                #
########################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_4      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_4  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_4  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_4  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_4      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_4  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_4  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_4  #
########################################################################################################################################################################################################


* Info: Selecting the 'perf-high-ppn' engine for node ins01.benchmarkcenter.megware.com

* Info: Process launched (host ins01.benchmarkcenter.megware.com, process 11590)
* Info: "ref-cycles" not supported on ins01.benchmarkcenter.megware.com: fallback to "cpu-clock"
* Warning: Found no event able to derive walltime: prepending cpu-clock
* Info: Process launched (host ins01.benchmarkcenter.megware.com, process 11595)
miniqmc not built from git repository

number of ranks : 2, number of accelerators : 0
Number of orbitals/splines = 3072
Tile size = 3072
Number of tiles = 1
Number of electrons = 6144
Rmax = 1.7
AcceptanceRatio = 0.5
Iterations = 5
MPI processes = 2
OpenMP threads = 32
Number of walkers per rank = 32

SPO coefficients size = 1572864000 bytes (1500 MB)
delayed update rank = 32
Using the reference implementation for Jastrow, 
determinant update, and distance table + einspline of the 
reference implementation 
================================== 

Use --enable-timers= command line option to increase or decrease level of timing information
Stack timer profile
Timer                                       Inclusive_time  Exclusive_time  Calls       Time_per_call
Setup                                          0.0448     0.0448              1       0.044788567
  ParticleSet:::update                         0.0000     0.0000              1       0.000003940
Total                                         75.4251     2.2082              1      75.425137205
  Diffusion                                   45.7184     0.0211              5       9.143687038
    Complete Updates                           0.8356     0.0000              5       0.167125090
      DeterminantRef::update                   0.8356     0.8356             10       0.083559517
    Current Gradient                           0.7484     0.0223          30720       0.000024361
      DeterminantRef::ratio                    0.7211     0.7211          30720       0.000023475
      OneBodyJastrowRef                        0.0027     0.0027          30720       0.000000088
      TwoBodyJastrowRef                        0.0023     0.0023          30720       0.000000074
    Kinetic Energy                             0.4400     0.4393              5       0.087998013
      OneBodyJastrowRef                        0.0003     0.0003              5       0.000053881
      TwoBodyJastrowRef                        0.0004     0.0004              5       0.000082724
    New Gradient                               9.7853     0.0262          30720       0.000318533
      DeterminantRef::ratio                    0.0614     0.0614          30720       0.000001998
      DeterminantRef::spovgl                   9.2763     0.1947          30720       0.000301963
        Single-Particle Orbitals               9.0816     9.0816          30720       0.000295626
      OneBodyJastrowRef                        0.0391     0.0391          30720       0.000001272
      TwoBodyJastrowRef                        0.3824     0.3824          30720       0.000012447
    ParticleSet:::acceptMove                   0.8161     0.0104          15371       0.000053093
      DTAAOMPTarget::update_e_e                0.7918     0.7918          15371       0.000051515
      DTABOMPTarget::update_ion_e              0.0139     0.0139          15371       0.000000903
    ParticleSet:::computeNewPosDT              0.5169     0.0082          30720       0.000016826
      DTAAOMPTarget::move_e_e                  0.4478     0.4478          30720       0.000014577
      DTABOMPTarget::move_ion_e                0.0609     0.0609          30720       0.000001983
    ParticleSet:::donePbyP                     0.0000     0.0000              5       0.000001846
    Update                                    32.5550     0.0139          15371       0.002117949
      DeterminantRef::update                  32.0682    32.0682          15371       0.002086280
      OneBodyJastrowRef                        0.0017     0.0017          15371       0.000000108
      TwoBodyJastrowRef                        0.4712     0.4712          15371       0.000030657
  Initialization                               6.4961     0.8001              1       6.496081536
    DeterminantRef::inverse                    3.6053     3.6053              2       1.802648391
    DeterminantRef::spovgl                     1.8852     0.0504              2       0.942598839
      Single-Particle Orbitals                 1.8348     1.8348           6144       0.000298631
    OneBodyJastrowRef                          0.0052     0.0052              1       0.005188051
    ParticleSet:::update                       0.1429     0.0266              2       0.071438716
      DTAAOMPTarget::evaluate_e_e              0.0939     0.0939              1       0.093864344
      DTABOMPTarget::evaluate_ion_e            0.0224     0.0008              1       0.022363419
        DTABOMPTarget::offload_ion_e           0.0215     0.0215              1       0.021514408
    TwoBodyJastrowRef                          0.0574     0.0574              1       0.057410492
  Pseudopotential                             21.0024     0.0805              5       4.200488367
    DeterminantRef::spoval                    14.6321     0.3429          10215       0.001432411
      Single-Particle Orbitals                14.2892    14.2892         122580       0.000116570
    OneBodyJastrowRef                          0.0386     0.0386          10215       0.000003780
    ParticleSet:::update                       5.5171     0.0171          10215       0.000540095
      DTABOMPTarget::evaluate_e_virtual        4.9747     0.0077          10215       0.000487003
        DTABOMPTarget::offload_e_virtual       4.9671     4.9671          10215       0.000486252
      DTABOMPTarget::evaluate_ion_virtual      0.5253     0.0050          10215       0.000051421
        DTABOMPTarget::offload_ion_virtual     0.5203     0.5203          10215       0.000050934
    TwoBodyJastrowRef                          0.7341     0.7341          10215       0.000071869

========== Throughput ============ 

Total throughput ( N_walkers * N_elec^3 / Total time ) = 1.96797e+11
Diffusion throughput ( N_walkers * N_elec^3 / Diffusion time ) = 3.2467e+11
Pseudopotential throughput ( N_walkers * N_elec^2 / Pseudopotential time ) = 1.1503e+08


* Info: Process finished (host ins01.benchmarkcenter.megware.com, process 11590)
* Warning: (host ins01.benchmarkcenter.megware.com, process 11590) Observed more threads (33) than expected (32): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=33.

* Info: Process finished (host ins01.benchmarkcenter.megware.com, process 11595)
* Warning: (host ins01.benchmarkcenter.megware.com, process 11595) Observed more threads (33) than expected (32): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=33.


Info: 1/2 lprof instances finished


Your experiment path is /beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_5

To display your profiling results:
########################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                COMMAND                                                                                #
########################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_5      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_5  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_5  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_5  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_5      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_5  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_5  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_5  #
########################################################################################################################################################################################################


* Info: Selecting the 'perf-high-ppn' engine for node ins01.benchmarkcenter.megware.com

* Info: Process launched (host ins01.benchmarkcenter.megware.com, process 11799)
* Info: "ref-cycles" not supported on ins01.benchmarkcenter.megware.com: fallback to "cpu-clock"
* Warning: Found no event able to derive walltime: prepending cpu-clock
* Info: Process launched (host ins01.benchmarkcenter.megware.com, process 11804)
miniqmc not built from git repository

number of ranks : 2, number of accelerators : 0
Number of orbitals/splines = 3072
Tile size = 3072
Number of tiles = 1
Number of electrons = 6144
Rmax = 1.7
AcceptanceRatio = 0.5
Iterations = 5
MPI processes = 2
OpenMP threads = 64
Number of walkers per rank = 64

SPO coefficients size = 1572864000 bytes (1500 MB)
delayed update rank = 32
Using the reference implementation for Jastrow, 
determinant update, and distance table + einspline of the 
reference implementation 
================================== 

Use --enable-timers= command line option to increase or decrease level of timing information
Stack timer profile
Timer                                       Inclusive_time  Exclusive_time  Calls       Time_per_call
Setup                                          0.0702     0.0702              1       0.070203591
  ParticleSet:::update                         0.0000     0.0000              1       0.000002950
Total                                        142.2907     8.4823              1     142.290663146
  Diffusion                                   89.5160     0.0342              5      17.903201130
    Complete Updates                           1.7376     0.0001              5       0.347515113
      DeterminantRef::update                   1.7375     1.7375             10       0.173747957
    Current Gradient                           1.2402     0.0331          30720       0.000040370
      DeterminantRef::ratio                    1.1981     1.1981          30720       0.000039001
      OneBodyJastrowRef                        0.0054     0.0054          30720       0.000000177
      TwoBodyJastrowRef                        0.0035     0.0035          30720       0.000000113
    Kinetic Energy                             0.9292     0.9280              5       0.185831850
      OneBodyJastrowRef                        0.0006     0.0006              5       0.000121054
      TwoBodyJastrowRef                        0.0006     0.0006              5       0.000110644
    New Gradient                              13.2633     0.0434          30720       0.000431747
      DeterminantRef::ratio                    0.0877     0.0877          30720       0.000002853
      DeterminantRef::spovgl                  12.4020     0.4248          30720       0.000403711
        Single-Particle Orbitals              11.9772    11.9772          30720       0.000389883
      OneBodyJastrowRef                        0.0842     0.0842          30720       0.000002742
      TwoBodyJastrowRef                        0.6460     0.6460          30720       0.000021027
    ParticleSet:::acceptMove                   1.7830     0.0154          15371       0.000115997
      DTAAOMPTarget::update_e_e                1.7404     1.7404          15371       0.000113227
      DTABOMPTarget::update_ion_e              0.0272     0.0272          15371       0.000001767
    ParticleSet:::computeNewPosDT              0.8172     0.0120          30720       0.000026601
      DTAAOMPTarget::move_e_e                  0.6894     0.6894          30720       0.000022442
      DTABOMPTarget::move_ion_e                0.1158     0.1158          30720       0.000003769
    ParticleSet:::donePbyP                     0.0000     0.0000              5       0.000004698
    Update                                    69.7115     0.0212          15371       0.004535261
      DeterminantRef::update                  68.8802    68.8802          15371       0.004481178
      OneBodyJastrowRef                        0.0028     0.0028          15371       0.000000180
      TwoBodyJastrowRef                        0.8074     0.8074          15371       0.000052526
  Initialization                               8.8999     1.7660              1       8.899911474
    DeterminantRef::inverse                    4.0682     4.0682              2       2.034123504
    DeterminantRef::spovgl                     2.8451     0.1307              2       1.422561583
      Single-Particle Orbitals                 2.7144     2.7144           6144       0.000441803
    OneBodyJastrowRef                          0.0057     0.0057              1       0.005746888
    ParticleSet:::update                       0.1548     0.0536              2       0.077410615
      DTAAOMPTarget::evaluate_e_e              0.0753     0.0753              1       0.075279792
      DTABOMPTarget::evaluate_ion_e            0.0260     0.0034              1       0.025978575
        DTABOMPTarget::offload_ion_e           0.0225     0.0225              1       0.022543594
    TwoBodyJastrowRef                          0.0600     0.0600              1       0.059958557
  Pseudopotential                             35.3924     0.1757              5       7.078479803
    DeterminantRef::spoval                    26.4103     0.9399          10215       0.002585446
      Single-Particle Orbitals                25.4704    25.4704         122580       0.000207786
    OneBodyJastrowRef                          0.1057     0.1057          10215       0.000010352
    ParticleSet:::update                       7.2547     0.0346          10215       0.000710199
      DTABOMPTarget::evaluate_e_virtual        6.4741     0.0121          10215       0.000633780
        DTABOMPTarget::offload_e_virtual       6.4619     6.4619          10215       0.000632592
      DTABOMPTarget::evaluate_ion_virtual      0.7460     0.0118          10215       0.000073034
        DTABOMPTarget::offload_ion_virtual     0.7342     0.7342          10215       0.000071878
    TwoBodyJastrowRef                          1.4460     1.4460          10215       0.000141554

========== Throughput ============ 

Total throughput ( N_walkers * N_elec^3 / Total time ) = 2.08635e+11
Diffusion throughput ( N_walkers * N_elec^3 / Diffusion time ) = 3.31637e+11
Pseudopotential throughput ( N_walkers * N_elec^2 / Pseudopotential time ) = 1.36522e+08


* Info: Process finished (host ins01.benchmarkcenter.megware.com, process 11804)
* Warning: (host ins01.benchmarkcenter.megware.com, process 11804) Observed more threads (65) than expected (64): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=65.

* Info: Process finished (host ins01.benchmarkcenter.megware.com, process 11799)
* Warning: (host ins01.benchmarkcenter.megware.com, process 11799) Observed more threads (65) than expected (64): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=65.


Info: 1/2 lprof instances finished


Your experiment path is /beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_6

To display your profiling results:
########################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                COMMAND                                                                                #
########################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_6      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_6  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_6  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_6  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_6      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_6  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_6  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_6  #
########################################################################################################################################################################################################


* Info: Selecting the 'perf-high-ppn' engine for node ins01.benchmarkcenter.megware.com

* Info: Process launched (host ins01.benchmarkcenter.megware.com, process 12137)
* Info: "ref-cycles" not supported on ins01.benchmarkcenter.megware.com: fallback to "cpu-clock"
* Warning: Found no event able to derive walltime: prepending cpu-clock
* Info: Process launched (host ins01.benchmarkcenter.megware.com, process 12142)
miniqmc not built from git repository

number of ranks : 2, number of accelerators : 0
Number of orbitals/splines = 3072
Tile size = 3072
Number of tiles = 1
Number of electrons = 6144
Rmax = 1.7
AcceptanceRatio = 0.5
Iterations = 5
MPI processes = 2
OpenMP threads = 96
Number of walkers per rank = 96

SPO coefficients size = 1572864000 bytes (1500 MB)
delayed update rank = 32
Using the reference implementation for Jastrow, 
determinant update, and distance table + einspline of the 
reference implementation 
================================== 

Use --enable-timers= command line option to increase or decrease level of timing information
Stack timer profile
Timer                                       Inclusive_time  Exclusive_time  Calls       Time_per_call
Setup                                          0.0978     0.0978              1       0.097807028
  ParticleSet:::update                         0.0000     0.0000              1       0.000002660
Total                                        205.3336    14.2854              1     205.333607008
  Diffusion                                  122.2440     0.0495              5      24.448809589
    Complete Updates                           2.3109     0.0001              5       0.462189917
      DeterminantRef::update                   2.3108     2.3108             10       0.231081896
    Current Gradient                           1.8757     0.0451          30720       0.000061058
      DeterminantRef::ratio                    1.8162     1.8162          30720       0.000059122
      OneBodyJastrowRef                        0.0091     0.0091          30720       0.000000295
      TwoBodyJastrowRef                        0.0053     0.0053          30720       0.000000172
    Kinetic Energy                             1.2691     1.2674              5       0.253814402
      OneBodyJastrowRef                        0.0008     0.0008              5       0.000157539
      TwoBodyJastrowRef                        0.0009     0.0009              5       0.000181997
    New Gradient                              13.8494     0.0640          30720       0.000450828
      DeterminantRef::ratio                    0.1064     0.1064          30720       0.000003463
      DeterminantRef::spovgl                  12.6547     0.6051          30720       0.000411938
        Single-Particle Orbitals              12.0497    12.0497          30720       0.000392242
      OneBodyJastrowRef                        0.1448     0.1448          30720       0.000004712
      TwoBodyJastrowRef                        0.8795     0.8795          30720       0.000028630
    ParticleSet:::acceptMove                   2.8177     0.0280          15371       0.000183310
      DTAAOMPTarget::update_e_e                2.7469     2.7469          15371       0.000178705
      DTABOMPTarget::update_ion_e              0.0428     0.0428          15371       0.000002783
    ParticleSet:::computeNewPosDT              1.2237     0.0243          30720       0.000039834
      DTAAOMPTarget::move_e_e                  1.0175     1.0175          30720       0.000033120
      DTABOMPTarget::move_ion_e                0.1819     0.1819          30720       0.000005922
    ParticleSet:::donePbyP                     0.0000     0.0000              5       0.000008095
    Update                                    98.8480     0.0341          15371       0.006430810
      DeterminantRef::update                  97.6251    97.6251          15371       0.006351251
      OneBodyJastrowRef                        0.0044     0.0044          15371       0.000000290
      TwoBodyJastrowRef                        1.1843     1.1843          15371       0.000077049
  Initialization                              11.7637     2.0175              1      11.763665716
    DeterminantRef::inverse                    6.1008     6.1008              2       3.050411111
    DeterminantRef::spovgl                     3.0406     0.1366              2       1.520295783
      Single-Particle Orbitals                 2.9040     2.9040           6144       0.000472651
    OneBodyJastrowRef                          0.0051     0.0051              1       0.005051830
    ParticleSet:::update                       0.5426     0.1209              2       0.271303232
      DTAAOMPTarget::evaluate_e_e              0.2755     0.2755              1       0.275480958
      DTABOMPTarget::evaluate_ion_e            0.1462     0.0815              1       0.146208536
        DTABOMPTarget::offload_ion_e           0.0647     0.0647              1       0.064733184
    TwoBodyJastrowRef                          0.0571     0.0571              1       0.057094746
  Pseudopotential                             57.0405     0.3019              5      11.408102723
    DeterminantRef::spoval                    44.0140     1.7631          10215       0.004308763
      Single-Particle Orbitals                42.2510    42.2510         122580       0.000344681
    OneBodyJastrowRef                          0.1804     0.1804          10215       0.000017663
    ParticleSet:::update                      10.1532     0.0715          10215       0.000993945
      DTABOMPTarget::evaluate_e_virtual        9.0146     0.0244          10215       0.000882483
        DTABOMPTarget::offload_e_virtual       8.9902     8.9902          10215       0.000880094
      DTABOMPTarget::evaluate_ion_virtual      1.0671     0.0201          10215       0.000104467
        DTABOMPTarget::offload_ion_virtual     1.0470     1.0470          10215       0.000102495
    TwoBodyJastrowRef                          2.3910     2.3910          10215       0.000234072

========== Throughput ============ 

Total throughput ( N_walkers * N_elec^3 / Total time ) = 2.16868e+11
Diffusion throughput ( N_walkers * N_elec^3 / Diffusion time ) = 3.64273e+11
Pseudopotential throughput ( N_walkers * N_elec^2 / Pseudopotential time ) = 1.27063e+08


* Info: Process finished (host ins01.benchmarkcenter.megware.com, process 12142)
* Warning: (host ins01.benchmarkcenter.megware.com, process 12142) Observed more threads (97) than expected (96): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=97.

* Info: Process finished (host ins01.benchmarkcenter.megware.com, process 12137)
* Warning: (host ins01.benchmarkcenter.megware.com, process 12137) Observed more threads (97) than expected (96): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=97.


Info: 1/2 lprof instances finished


Your experiment path is /beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_7

To display your profiling results:
########################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                COMMAND                                                                                #
########################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_7      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_7  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_7  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_7  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_7      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_7  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_7  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/miniqmc/intel/miniqmc/run/oneview_runs/compilers/gcc_2/oneview_results_scal/tools/lprof_npsu_run_7  #
########################################################################################################################################################################################################
