Executable Output


* Info: Selecting the 'perf-low-ppn' engine for node ip-172-31-42-13

* Info: "ref-cycles" not supported on ip-172-31-42-13: fallback to "cpu-clock"
* Info: Process launched (host ip-172-31-42-13, process 5890)
miniqmc git branch: OMP_offload
miniqmc git commit: 34c39aa17b79f2e7e5c41ff1896cb0847b88715a

number of ranks : 1, number of accelerators : 0
Number of orbitals/splines = 3072
Tile size = 3072
Number of tiles = 1
Number of electrons = 6144
Rmax = 1.7
AcceptanceRatio = 0.5
Iterations = 5
OpenMP threads = 64
Number of walkers per rank = 64

SPO coefficients size = 1572864000 bytes (1500 MB)
delayed update rank = 32
Using the reference implementation for Jastrow, 
determinant update, and distance table + einspline of the 
reference implementation 
================================== 

Use --enable-timers= command line option to increase or decrease level of timing information
Stack timer profile
Timer                                       Inclusive_time  Exclusive_time  Calls       Time_per_call
Setup                                          0.0718     0.0718              1       0.071813036
  ParticleSet:::update                         0.0000     0.0000              1       0.000002705
Total                                        151.4851     0.1283              1     151.485116126
  Diffusion                                   96.9721     0.0710              5      19.394417403
    Complete Updates                           1.3017     0.0000              5       0.260332629
      DeterminantRef::update                   1.3016     1.3016             10       0.130161583
    Current Gradient                           5.5164     0.0692          30720       0.000179572
      DeterminantRef::ratio                    5.4074     5.4074          30720       0.000176023
      OneBodyJastrowRef                        0.0220     0.0220          30720       0.000000715
      TwoBodyJastrowRef                        0.0179     0.0179          30720       0.000000582
    Kinetic Energy                             0.9528     0.9519              5       0.190563111
      OneBodyJastrowRef                        0.0005     0.0005              5       0.000106405
      TwoBodyJastrowRef                        0.0004     0.0004              5       0.000082681
    New Gradient                              15.6776     0.0644          30720       0.000510340
      DeterminantRef::ratio                    0.1787     0.1787          30720       0.000005816
      DeterminantRef::spovgl                  14.1940     0.2616          30720       0.000462045
        Single-Particle Orbitals              13.9325    13.9325          30720       0.000453531
      OneBodyJastrowRef                        0.1684     0.1684          30720       0.000005480
      TwoBodyJastrowRef                        1.0722     1.0722          30720       0.000034903
    ParticleSet:::acceptMove                  12.6138     0.0364          15371       0.000820621
      DTAAOMPTarget::update_e_e               12.5071    12.5071          15371       0.000813681
      DTABOMPTarget::update_ion_e              0.0702     0.0702          15371       0.000004569
    ParticleSet:::computeNewPosDT              2.0128     0.0415          30720       0.000065522
      DTAAOMPTarget::move_e_e                  1.7757     1.7757          30720       0.000057804
      DTABOMPTarget::move_ion_e                0.1956     0.1956          30720       0.000006368
    ParticleSet:::donePbyP                     0.0000     0.0000              5       0.000001092
    Update                                    58.8259     0.0261          15371       0.003827070
      DeterminantRef::update                  57.2317    57.2317          15371       0.003723353
      OneBodyJastrowRef                        0.0062     0.0062          15371       0.000000402
      TwoBodyJastrowRef                        1.5620     1.5620          15371       0.000101619
  Initialization                               9.4000     4.4335              1       9.400037529
    DeterminantRef::inverse                    1.9489     1.9489              2       0.974473267
    DeterminantRef::spovgl                     2.6771     0.0418              2       1.338530714
      Single-Particle Orbitals                 2.6353     2.6353           6144       0.000428916
    OneBodyJastrowRef                          0.0051     0.0051              1       0.005122783
    ParticleSet:::update                       0.2504     0.0736              2       0.125180787
      DTAAOMPTarget::evaluate_e_e              0.1436     0.1436              1       0.143628398
      DTABOMPTarget::evaluate_ion_e            0.0331     0.0002              1       0.033089891
        DTABOMPTarget::offload_ion_e           0.0329     0.0329              1       0.032912600
    TwoBodyJastrowRef                          0.0850     0.0850              1       0.085029144
  Pseudopotential                             44.9847     0.2179              5       8.996933562
    DeterminantRef::spoval                    34.9860     0.7828          10215       0.003424959
      Single-Particle Orbitals                34.2032    34.2032         122580       0.000279027
    OneBodyJastrowRef                          0.1117     0.1117          10215       0.000010931
    ParticleSet:::update                       7.4980     0.0380          10215       0.000734023
      DTABOMPTarget::evaluate_e_virtual        6.7207     0.0135          10215       0.000657925
        DTABOMPTarget::offload_e_virtual       6.7072     6.7072          10215       0.000656603
      DTABOMPTarget::evaluate_ion_virtual      0.7393     0.0117          10215       0.000072374
        DTABOMPTarget::offload_ion_virtual     0.7276     0.7276          10215       0.000071227
    TwoBodyJastrowRef                          2.1711     2.1711          10215       0.000212540

========== Throughput ============ 

Total throughput ( N_walkers * N_elec^3 / Total time ) = 9.79859e+10
Diffusion throughput ( N_walkers * N_elec^3 / Diffusion time ) = 1.53069e+11
Pseudopotential throughput ( N_walkers * N_elec^2 / Pseudopotential time ) = 5.37054e+07
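
The throughput figures above can be cross-checked against the run parameters and the inclusive times in the stack timer profile. The following is a minimal sketch, assuming the formulas are exactly those printed in parentheses and that the times are the inclusive values for Total, Diffusion, and Pseudopotential rounded to four decimals as shown in the table. The helper name `throughput` and the variable names are illustrative and not part of miniqmc.

```python
# Values copied from the log above (assumption: the 4-decimal inclusive times
# from the stack timer profile are precise enough for a sanity check).
n_walkers = 64          # "Number of walkers per rank"
n_elec = 6144           # "Number of electrons"
total_time = 151.4851           # "Total" inclusive time, seconds
diffusion_time = 96.9721        # "Diffusion" inclusive time, seconds
pseudopotential_time = 44.9847  # "Pseudopotential" inclusive time, seconds


def throughput(walkers, electrons, power, seconds):
    """Hypothetical helper: walkers * electrons**power / seconds."""
    return walkers * electrons**power / seconds


# Total and Diffusion use N_elec^3; Pseudopotential uses N_elec^2,
# following the formulas printed in the log.
total_tp = throughput(n_walkers, n_elec, 3, total_time)
diffusion_tp = throughput(n_walkers, n_elec, 3, diffusion_time)
pseudopotential_tp = throughput(n_walkers, n_elec, 2, pseudopotential_time)

print(f"Total throughput           = {total_tp:.5e}")
print(f"Diffusion throughput       = {diffusion_tp:.5e}")
print(f"Pseudopotential throughput = {pseudopotential_tp:.5e}")
```

Because the script uses the rounded times from the table rather than the full-precision timer values, small differences in the last printed digit relative to the log are possible.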


* Info: Process finished (host ip-172-31-42-13, process 5890)
* Info: Dumping samples (host ip-172-31-42-13, process 5890)
* Info: Dumping source info for callchain nodes (host ip-172-31-42-13, process 5890)
* Info: Building/writing metadata (host ip-172-31-42-13)
* Info: Finished collect step (host ip-172-31-42-13, process 5890)

Your experiment path is /home/kcamus/runs_miniqmc/miniqmc_gcc_o52/tools/lprof_npsu_run_0

To display your profiling results:
############################################################################################################################
#    LEVEL    |     REPORT     |                                          COMMAND                                          #
############################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/runs_miniqmc/miniqmc_gcc_o52/tools/lprof_npsu_run_0      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/runs_miniqmc/miniqmc_gcc_o52/tools/lprof_npsu_run_0  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/runs_miniqmc/miniqmc_gcc_o52/tools/lprof_npsu_run_0  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/runs_miniqmc/miniqmc_gcc_o52/tools/lprof_npsu_run_0  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/runs_miniqmc/miniqmc_gcc_o52/tools/lprof_npsu_run_0      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/runs_miniqmc/miniqmc_gcc_o52/tools/lprof_npsu_run_0  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/runs_miniqmc/miniqmc_gcc_o52/tools/lprof_npsu_run_0  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/runs_miniqmc/miniqmc_gcc_o52/tools/lprof_npsu_run_0  #
############################################################################################################################
