* Info: Detected 1 Lprof instances in ip-172-31-47-249.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
miniqmc not built from git repository
number of ranks : 1, number of accelerators : 0
Number of orbitals/splines = 3072
Tile size = 3072
Number of tiles = 1
Number of electrons = 6144
Rmax = 1.7
AcceptanceRatio = 0.5
Iterations = 5
MPI processes = 1
OpenMP threads = 96
Number of walkers per rank = 96
SPO coefficients size = 1572864000 bytes (1500 MB)
delayed update rank = 32
Using the reference implementation for Jastrow,
determinant update, and distance table + einspline of the
reference implementation
==================================
Use --enable-timers= command line option to increase or decrease level of timing information
Stack timer profile
Timer                                       Inclusive_time  Exclusive_time  Calls   Time_per_call
Setup                                       0.0556          0.0556          1       0.055602738
  ParticleSet:::update                      0.0000          0.0000          1       0.000002044
Total                                       144.1005        0.8392          1       144.100516107
  Diffusion                                 90.6247         0.0690          5       18.124948296
    Complete Updates                        0.9121          0.0000          5       0.182426386
      DeterminantRef::update                0.9121          0.9121          10      0.091209245
    Current Gradient                        4.2369          0.0851          30720   0.000137919
      DeterminantRef::ratio                 4.1092          4.1092          30720   0.000133765
      OneBodyJastrowRef                     0.0267          0.0267          30720   0.000000870
      TwoBodyJastrowRef                     0.0158          0.0158          30720   0.000000515
    Kinetic Energy                          0.9939          0.9928          5       0.198774821
      OneBodyJastrowRef                     0.0006          0.0006          5       0.000113974
      TwoBodyJastrowRef                     0.0005          0.0005          5       0.000107551
    New Gradient                            13.9243         0.0962          30720   0.000453265
      DeterminantRef::ratio                 0.1580          0.1580          30720   0.000005143
      DeterminantRef::spovgl                12.0155         0.2947          30720   0.000391129
        Single-Particle Orbitals            11.7208         11.7208         30720   0.000381536
      OneBodyJastrowRef                     0.2073          0.2073          30720   0.000006747
      TwoBodyJastrowRef                     1.4474          1.4474          30720   0.000047117
    ParticleSet:::acceptMove                12.8317         0.0359          15371   0.000834802
      DTAAOMPTarget::update_e_e             12.7178         12.7178         15371   0.000827390
      DTABOMPTarget::update_ion_e           0.0780          0.0780          15371   0.000005076
    ParticleSet:::computeNewPosDT           2.3610          0.0453          30720   0.000076856
      DTAAOMPTarget::move_e_e               2.1040          2.1040          30720   0.000068489
      DTABOMPTarget::move_ion_e             0.2117          0.2117          30720   0.000006892
    ParticleSet:::donePbyP                  0.0000          0.0000          5       0.000001235
    Update                                  55.2958         0.0340          15371   0.003597411
      DeterminantRef::update                53.6050         53.6050         15371   0.003487411
      OneBodyJastrowRef                     0.0083          0.0083          15371   0.000000538
      TwoBodyJastrowRef                     1.6486          1.6486          15371   0.000107253
  Initialization                            11.3326         5.6421          1       11.332583796
    DeterminantRef::inverse                 2.7559          2.7559          2       1.377972504
    DeterminantRef::spovgl                  2.2803          0.0583          2       1.140153097
      Single-Particle Orbitals              2.2220          2.2220          6144    0.000361653
    OneBodyJastrowRef                       0.0239          0.0239          1       0.023944039
    ParticleSet:::update                    0.4032          0.2901          2       0.201585031
      DTAAOMPTarget::evaluate_e_e           0.0959          0.0959          1       0.095877806
      DTABOMPTarget::evaluate_ion_e         0.0172          0.0006          1       0.017160426
        DTABOMPTarget::offload_ion_e        0.0166          0.0166          1       0.016588848
    TwoBodyJastrowRef                       0.2271          0.2271          1       0.227102006
  Pseudopotential                           41.3040         0.2256          5       8.260807301
    DeterminantRef::spoval                  30.1827         0.5506          10215   0.002954748
      Single-Particle Orbitals              29.6321         29.6321         122580  0.000241737
    OneBodyJastrowRef                       0.0962          0.0962          10215   0.000009421
    ParticleSet:::update                    7.8542          0.0304          10215   0.000768891
      DTABOMPTarget::evaluate_e_virtual     7.0114          0.0157          10215   0.000686387
        DTABOMPTarget::offload_e_virtual    6.9958          6.9958          10215   0.000684855
      DTABOMPTarget::evaluate_ion_virtual   0.8123          0.0149          10215   0.000079523
        DTABOMPTarget::offload_ion_virtual  0.7974          0.7974          10215   0.000078065
    TwoBodyJastrowRef                       2.9453          2.9453          10215   0.000288328
========== Throughput ============
Total throughput ( N_walkers * N_elec^3 / Total time ) = 1.54511e+11
Diffusion throughput ( N_walkers * N_elec^3 / Diffusion time ) = 2.45685e+11
Pseudopotential throughput ( N_walkers * N_elec^2 / Pseudopotential time ) = 8.77367e+07
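
The three throughput figures follow directly from the formulas printed next to them. As a rough cross-check (a sketch, not part of the miniqmc output), the snippet below recomputes them from the walker count, electron count, and inclusive times reported earlier in this log. The variable and function names are illustrative only, and because the times are the four-decimal values shown in the timer table, the results agree with the logged figures only to about five significant digits.

```python
# Values copied from this log; names are illustrative, not miniqmc identifiers.
n_walkers = 96          # "Number of walkers per rank"
n_elec = 6144           # "Number of electrons"
total_time = 144.1005   # "Total" inclusive time, seconds
diffusion_time = 90.6247
pseudopotential_time = 41.3040


def throughput(walkers, electrons, power, seconds):
    """Return walkers * electrons**power / seconds, as in the log's formulas."""
    return walkers * electrons**power / seconds


# Total and Diffusion scale with N_elec^3, Pseudopotential with N_elec^2.
total_tp = throughput(n_walkers, n_elec, 3, total_time)
diffusion_tp = throughput(n_walkers, n_elec, 3, diffusion_time)
pseudopotential_tp = throughput(n_walkers, n_elec, 2, pseudopotential_time)

print(f"Total throughput           = {total_tp:.5e}")
print(f"Diffusion throughput       = {diffusion_tp:.5e}")
print(f"Pseudopotential throughput = {pseudopotential_tp:.5e}")
```
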
Your experiment path is /home/hbollore/qaas/qaas-runs/174-135-6342/intel/miniqmc/run/oneview_runs/compilers/gcc_4/oneview_results_1741366611/tools/lprof_npsu_run_0
To display your profiling results:
#######################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#######################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/hbollore/qaas/qaas-runs/174-135-6342/intel/miniqmc/run/oneview_runs/compilers/gcc_4/oneview_results_1741366611/tools/lprof_npsu_run_0 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/hbollore/qaas/qaas-runs/174-135-6342/intel/miniqmc/run/oneview_runs/compilers/gcc_4/oneview_results_1741366611/tools/lprof_npsu_run_0 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/hbollore/qaas/qaas-runs/174-135-6342/intel/miniqmc/run/oneview_runs/compilers/gcc_4/oneview_results_1741366611/tools/lprof_npsu_run_0 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/hbollore/qaas/qaas-runs/174-135-6342/intel/miniqmc/run/oneview_runs/compilers/gcc_4/oneview_results_1741366611/tools/lprof_npsu_run_0 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/hbollore/qaas/qaas-runs/174-135-6342/intel/miniqmc/run/oneview_runs/compilers/gcc_4/oneview_results_1741366611/tools/lprof_npsu_run_0 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/hbollore/qaas/qaas-runs/174-135-6342/intel/miniqmc/run/oneview_runs/compilers/gcc_4/oneview_results_1741366611/tools/lprof_npsu_run_0 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/hbollore/qaas/qaas-runs/174-135-6342/intel/miniqmc/run/oneview_runs/compilers/gcc_4/oneview_results_1741366611/tools/lprof_npsu_run_0 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/hbollore/qaas/qaas-runs/174-135-6342/intel/miniqmc/run/oneview_runs/compilers/gcc_4/oneview_results_1741366611/tools/lprof_npsu_run_0 #
#######################################################################################################################################################################################################