* Info: Detected 2 Lprof instances in itp09.benchmarkcenter.megware.com.
If this is incorrect, rerun with number-processes-per-node=X
[0] MPI startup(): Intel(R) MPI Library, Version 2021.14 Build 20240911 (id: b3fc682)
[0] MPI startup(): Copyright (C) 2003-2024 Intel Corporation. All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): Load tuning file: "/cluster/intel/oneapi/2025.0.0/mpi/2021.14/opt/mpi/etc/tuning_icx_shm.dat"
[0] MPI startup(): ===== CPU pinning =====
[0] MPI startup(): Rank Pid Node name Pin cpu
[0] MPI startup(): 0 2350927 itp09.benchmarkcenter.megware.com {0}
[0] MPI startup(): 1 2350843 itp09.benchmarkcenter.megware.com {36}
miniqmc not built from git repository
number of ranks : 2, number of accelerators : 0
Number of orbitals/splines = 3072
Tile size = 3072
Number of tiles = 1
Number of electrons = 6144
Rmax = 1.7
AcceptanceRatio = 0.5
Iterations = 5
MPI processes = 2
OpenMP threads = 36
Number of walkers per rank = 36
SPO coefficients size = 1572864000 bytes (1500 MB)
delayed update rank = 32
Using the reference implementation for Jastrow,
determinant update, and distance table + einspline of the
reference implementation
==================================
Use --enable-timers= command line option to increase or decrease level of timing information
Stack timer profile
Timer                                       Inclusive_time  Exclusive_time  Calls   Time_per_call
Setup                                       0.0570          0.0570          1       0.057038834
  ParticleSet:::update                      0.0000          0.0000          1       0.000003193
Total                                       132.4221        0.1157          1       132.422126655
  Diffusion                                 81.0430         0.0643          5       16.208604317
    Complete Updates                        0.6570          0.0001          5       0.131403929
      DeterminantRef::update                0.6570          0.6570          10      0.065696675
    Current Gradient                        3.5788          0.0480          30720   0.000116496
      DeterminantRef::ratio                 3.5114          3.5114          30720   0.000114303
      OneBodyJastrowRef                     0.0112          0.0112          30720   0.000000365
      TwoBodyJastrowRef                     0.0081          0.0081          30720   0.000000264
    Kinetic Energy                          0.8833          0.8824          5       0.176664390
      OneBodyJastrowRef                     0.0005          0.0005          5       0.000095437
      TwoBodyJastrowRef                     0.0005          0.0005          5       0.000094974
    New Gradient                            21.1164         0.0559          30720   0.000687381
      DeterminantRef::ratio                 0.2313          0.2313          30720   0.000007530
      DeterminantRef::spovgl                18.6406         1.0925          30720   0.000606791
        Single-Particle Orbitals            17.5481         17.5481         30720   0.000571228
      OneBodyJastrowRef                     0.2076          0.2076          30720   0.000006757
      TwoBodyJastrowRef                     1.9810          1.9810          30720   0.000064485
    ParticleSet:::acceptMove                9.2624          0.0434          15371   0.000602586
      DTAAOMPTarget::update_e_e             9.1422          9.1422          15371   0.000594772
      DTABOMPTarget::update_ion_e           0.0767          0.0767          15371   0.000004991
    ParticleSet:::computeNewPosDT           3.1214          0.0311          30720   0.000101610
      DTAAOMPTarget::move_e_e               2.8444          2.8444          30720   0.000092590
      DTABOMPTarget::move_ion_e             0.2460          0.2460          30720   0.000008008
    ParticleSet:::donePbyP                  0.0000          0.0000          5       0.000001715
    Update                                  42.3594         0.0264          15371   0.002755802
      DeterminantRef::update                39.6528         39.6528         15371   0.002579712
      OneBodyJastrowRef                     0.0072          0.0072          15371   0.000000471
      TwoBodyJastrowRef                     2.6730          2.6730          15371   0.000173898
  Initialization                            7.7074          2.4612          1       7.707367083
    DeterminantRef::inverse                 1.6508          1.6508          2       0.825395762
    DeterminantRef::spovgl                  2.9184          0.1851          2       1.459199442
      Single-Particle Orbitals              2.7333          2.7333          6144    0.000444881
    OneBodyJastrowRef                       0.0375          0.0375          1       0.037496496
    ParticleSet:::update                    0.3848          0.2611          2       0.192408440
      DTAAOMPTarget::evaluate_e_e           0.0921          0.0921          1       0.092060956
      DTABOMPTarget::evaluate_ion_e         0.0316          0.0060          1       0.031638286
        DTABOMPTarget::offload_ion_e        0.0257          0.0257          1       0.025653457
    TwoBodyJastrowRef                       0.2547          0.2547          1       0.254655852
  Pseudopotential                           43.5561         0.1274          5       8.711217325
    DeterminantRef::spoval                  32.3701         0.7228          10215   0.003168876
      Single-Particle Orbitals              31.6472         31.6472         122580  0.000258176
    OneBodyJastrowRef                       0.0736          0.0736          10215   0.000007206
    ParticleSet:::update                    8.5572          0.0301          10215   0.000837710
      DTABOMPTarget::evaluate_e_virtual     7.7399          0.0119          10215   0.000757696
        DTABOMPTarget::offload_e_virtual    7.7279          7.7279          10215   0.000756528
      DTABOMPTarget::evaluate_ion_virtual   0.7872          0.0087          10215   0.000077066
        DTABOMPTarget::offload_ion_virtual  0.7786          0.7786          10215   0.000076216
    TwoBodyJastrowRef                       2.4278          2.4278          10215   0.000237668
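
A minimal sketch of how the timer columns appear to relate, using values copied from the profile above. It assumes (this is not stated in the log) that Exclusive_time is Inclusive_time minus the inclusive time of the direct children, that Time_per_call is Inclusive_time divided by Calls, and that Diffusion, Initialization and Pseudopotential are the direct children of Total. The numbers agree with those assumptions to within the four-decimal rounding of the printed values.

```python
# Values copied from the "Stack timer profile" in this log.
total_inclusive = 132.4221
total_exclusive = 0.1157

# Assumption: these three timers are the direct children of "Total".
children_inclusive = {
    "Diffusion": 81.0430,
    "Initialization": 7.7074,
    "Pseudopotential": 43.5561,
}


def exclusive_from_children(inclusive, children):
    """Assumed relation: exclusive = inclusive - sum of children inclusive times."""
    return inclusive - sum(children.values())


derived_exclusive = exclusive_from_children(total_inclusive, children_inclusive)
residual = abs(derived_exclusive - total_exclusive)

# Assumed relation: Time_per_call = Inclusive_time / Calls (Diffusion has 5 calls).
diffusion_per_call = 81.0430 / 5

print(f"derived exclusive time of Total: {derived_exclusive:.4f} s")
print(f"difference from reported value : {residual:.4f} s")
print(f"Diffusion time per call        : {diffusion_per_call:.4f} s")
```

The small residual is expected because the log prints each time rounded to four decimals.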
========== Throughput ============
Total throughput ( N_walkers * N_elec^3 / Total time ) = 1.26103e+11
Diffusion throughput ( N_walkers * N_elec^3 / Diffusion time ) = 2.06049e+11
Pseudopotential throughput ( N_walkers * N_elec^2 / Pseudopotential time ) = 6.24002e+07
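
The three throughput figures above can be recomputed from values printed earlier in this log. The sketch below assumes that N_walkers is the total over all ranks (MPI processes times walkers per rank, i.e. 2 * 36 = 72); the log does not say this explicitly, but that choice reproduces the reported numbers to within the rounding of the printed times. The function and variable names are illustrative only and are not taken from miniQMC.

```python
# Inputs copied from this log.
mpi_processes = 2
walkers_per_rank = 36
n_elec = 6144
seconds = {
    "Total": 132.4221,
    "Diffusion": 81.0430,
    "Pseudopotential": 43.5561,
}

# Assumption: N_walkers counts walkers across all ranks.
n_walkers = mpi_processes * walkers_per_rank


def throughput(n_walkers, n_elec, power, elapsed):
    """N_walkers * N_elec^power / elapsed, as in the formulas printed in the log."""
    return n_walkers * n_elec**power / elapsed


total_tp = throughput(n_walkers, n_elec, 3, seconds["Total"])
diffusion_tp = throughput(n_walkers, n_elec, 3, seconds["Diffusion"])
pseudo_tp = throughput(n_walkers, n_elec, 2, seconds["Pseudopotential"])

print(f"Total throughput           = {total_tp:.5e}")
print(f"Diffusion throughput       = {diffusion_tp:.5e}")
print(f"Pseudopotential throughput = {pseudo_tp:.5e}")
```

Note that the pseudopotential figure scales with N_elec^2 rather than N_elec^3, so it is not directly comparable to the other two.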
Info: 1/2 lprof instances finished
Your experiment path is /beegfs/hackathon/users/eoseret/qaas_runs_CPU_8360Y/174-411-9252/intel/miniqmc/run/oneview_runs/compilers/gcc_4/oneview_results_1744130212/tools/lprof_npsu_run_0
To display your profiling results:
#############################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#############################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/beegfs/hackathon/users/eoseret/qaas_runs_CPU_8360Y/174-411-9252/intel/miniqmc/run/oneview_runs/compilers/gcc_4/oneview_results_1744130212/tools/lprof_npsu_run_0 #
# Functions | Per-node | maqao lprof -df -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs_CPU_8360Y/174-411-9252/intel/miniqmc/run/oneview_runs/compilers/gcc_4/oneview_results_1744130212/tools/lprof_npsu_run_0 #
# Functions | Per-process | maqao lprof -df -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs_CPU_8360Y/174-411-9252/intel/miniqmc/run/oneview_runs/compilers/gcc_4/oneview_results_1744130212/tools/lprof_npsu_run_0 #
# Functions | Per-thread | maqao lprof -df -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs_CPU_8360Y/174-411-9252/intel/miniqmc/run/oneview_runs/compilers/gcc_4/oneview_results_1744130212/tools/lprof_npsu_run_0 #
# Loops | Cluster-wide | maqao lprof -dl xp=/beegfs/hackathon/users/eoseret/qaas_runs_CPU_8360Y/174-411-9252/intel/miniqmc/run/oneview_runs/compilers/gcc_4/oneview_results_1744130212/tools/lprof_npsu_run_0 #
# Loops | Per-node | maqao lprof -dl -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs_CPU_8360Y/174-411-9252/intel/miniqmc/run/oneview_runs/compilers/gcc_4/oneview_results_1744130212/tools/lprof_npsu_run_0 #
# Loops | Per-process | maqao lprof -dl -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs_CPU_8360Y/174-411-9252/intel/miniqmc/run/oneview_runs/compilers/gcc_4/oneview_results_1744130212/tools/lprof_npsu_run_0 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs_CPU_8360Y/174-411-9252/intel/miniqmc/run/oneview_runs/compilers/gcc_4/oneview_results_1744130212/tools/lprof_npsu_run_0 #
#############################################################################################################################################################################################################################