
exec - 2024-01-22 16:18:35 - MAQAO 2.19.0


Global Metrics

Total Time (s): 28.52
Profiled Time (s): 27.32
GFLOPS: 142.996
Time in analyzed loops (%): 56.6
Time in analyzed innermost loops (%): 41.1
Time in user code (%): 67.0
Compilation Options Score (%): 100
Array Access Efficiency (%): 78.7
Potential Speedups:
  Iterations Count: 1.00
  Perfect Flow Complexity: 1.00
  Perfect OpenMP + MPI + Pthread: 1.08
  Perfect OpenMP + MPI + Pthread + Perfect Load Distribution: 1.09
  No Scalar Integer:
    Potential Speedup: 1.26
    Nb Loops to get 80%: 14
  FP Vectorised:
    Potential Speedup: 1.17
    Nb Loops to get 80%: 30
  Fully Vectorised:
    Potential Speedup: 1.82
    Nb Loops to get 80%: 41
  FP Arithmetic Only:
    Potential Speedup: 1.37
    Nb Loops to get 80%: 27
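
The figures above can be turned into rough wall-clock estimates. The following is a minimal Python sketch, not part of the MAQAO output. It assumes that each potential speedup applies to the total run time, which the report itself does not state here, so the projected times should be read as optimistic bounds rather than predictions. The names `projected_time`, `profiled_fraction`, and `projected` are hypothetical helpers introduced for illustration.

```python
# Values copied from the Global Metrics table of this report.
TOTAL_TIME_S = 28.52
PROFILED_TIME_S = 27.32

# Potential speedups reported by the CQA-based global metrics.
POTENTIAL_SPEEDUPS = {
    "No Scalar Integer": 1.26,
    "FP Vectorised": 1.17,
    "Fully Vectorised": 1.82,
    "FP Arithmetic Only": 1.37,
}


def projected_time(total_time_s: float, speedup: float) -> float:
    """Estimate the run time after applying a speedup.

    ASSUMPTION: the speedup is relative to the whole run time.
    """
    return total_time_s / speedup


# Share of the run that the profiler actually covered.
profiled_fraction = PROFILED_TIME_S / TOTAL_TIME_S

# Projected run time for each optimisation scenario.
projected = {
    name: projected_time(TOTAL_TIME_S, speedup)
    for name, speedup in POTENTIAL_SPEEDUPS.items()
}

print(f"Profiled fraction of total time: {profiled_fraction:.1%}")
for name, seconds in sorted(projected.items(), key=lambda item: item[1]):
    print(f"{name}: about {seconds:.2f} s")
```

Under that assumption, the scenarios are printed from the largest estimated gain to the smallest, which gives a quick ordering of where optimisation effort might pay off first.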

CQA Potential Speedups Summary

Loop Based Profile

Innermost Loop Based Profile

Application Categorization

Compilation Options

Source Object    Issue
exec
optwf_sr_more.f90
hpsi.f90
matinv.f90
distances.f90
jastrow4e.f90
optci.f90
gammai.f90
jastrowe.f90
multideterminant.f90
metrop_mov1_slat.f90
hpsie.f90
deriv_nonlpsi.f90
get_norbterm.f90
optwf_handle_wf.f90
basis_fns.f90
xoroshiro256starstar.c
nonloc.f90
detsav.f90
determinante.f90
acuest.f90
random.f90
multiply_slmi_mderiv.f90
splfit.f90
deriv_nonloc.f90
jastrow4.f90
jassav.f90
bxmatrices.f90
optwf_sr.f90
multideterminante.f90
deriv_jastrow4.f90
optorb.f90
readps_gauss.f90
pot_local.f90
determinant.f90
optjas.f90
determinante_psit.f90
nonlpsi.f90
orbitals.f90
slm.f90
scale_dist.f90

Loop Iteration Count Profile

Loop Path Count Profile

Cumulated Speedup If No Scalar Integer

Cumulated Speedup If FP Vectorized

Cumulated Speedup If Fully Vectorized

Cumulated Speedup If FP Arithmetic Only

Experiment Summary

Application: /home/kcamus/qaas/qaas_runs/170-593-0710/uvsq/champ/run/binaries/icc_1/exec
Timestamp: 2024-01-22 16:18:35
Universal Timestamp: 1705936715
Number of processes observed: 52
Number of threads observed: 52
Experiment Type: MPI; OpenMP;
Machine: skylake
Model Name: Intel(R) Xeon(R) Platinum 8170 CPU @ 2.10GHz
Architecture: x86_64
Micro Architecture: SKYLAKE
Cache Size: 36608 KB
Number of Cores: 26
OS Version: Linux 6.5.7-arch1-1 #1 SMP PREEMPT_DYNAMIC Tue, 10 Oct 2023 21:10:21 +0000
Architecture used during static analysis: x86_64
Micro Architecture used during static analysis: SKYLAKE
Frequency Driver: intel_cpufreq
Frequency Governor: performance
Huge Pages: always
Hyperthreading: off
Number of sockets: 2
Number of cores per socket: 26
Compilation Options: exec: Intel(R) Fortran Intel(R) 64 Compiler Classic for applications running on Intel(R) 64, Version 2021.8.0 Build 20221119_000000 -I/home/kcamus/qaas/qaas_runs/170-593-0710/uvsq/champ/build/champ/src/vmc -I/home/kcamus/qaas/qaas_runs/170-593-0710/uvsq/champ/build/icc_1/src/module -I/home/kcamus/qaas/qaas_runs/170-593-0710/uvsq/champ/build/icc_1/src/parser -I/opt/intel/oneapi.old/mpi/2021.8.0//include -I/opt/intel/oneapi.old/mpi/2021.8.0/include -DTARGET_ARCHITECTURE=\"avx512\" -DVECTORIZATION=\"avx512\" -O3 -xSKYLAKE-AVX512 -g -fno-omit-frame-pointer -no-pie -module src/vmc -fPIC -implicitnone -finline -ip -align array64byte -fma -ftz -fomit-frame-pointer -fpp -mcmodel=small -shared-intel -dyncom=grid3d_data,orbital_num_spl,orbital_num_lag,orbital_num_spl2,grid3d_data -D_MPI_ -DCLUSTER -xSKYLAKE-AVX512 -g -fno-omit-frame-pointer -no-pie -c -o src/vmc/CMakeFiles/shared_objects.dir/basis_fns.f90.o
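
The process, thread, and core counts in this summary can be cross-checked against each other. The short Python sketch below is illustrative only and is not produced by MAQAO. It assumes that "Number of threads observed" is a total across all processes, which the report does not spell out. The variable names are hypothetical.

```python
# Values copied from the Experiment Summary of this report.
SOCKETS = 2
CORES_PER_SOCKET = 26
PROCESSES_OBSERVED = 52
THREADS_OBSERVED = 52
HYPERTHREADING = False  # the report lists Hyperthreading as "off"

# Physical cores available on the node.
physical_cores = SOCKETS * CORES_PER_SOCKET

# ASSUMPTION: the thread count is summed over all processes.
threads_per_process = THREADS_OBSERVED / PROCESSES_OBSERVED
threads_per_core = THREADS_OBSERVED / physical_cores

print(f"Physical cores: {physical_cores}")
print(f"Threads per process: {threads_per_process:.2f}")
print(f"Threads per physical core: {threads_per_core:.2f}")

if threads_per_core > 1 and not HYPERTHREADING:
    print("Warning: the node appears oversubscribed.")
```

With these values the run maps one single-threaded MPI process onto each physical core, so under the stated assumption the node is fully used without oversubscription.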

Configuration Summary

Dataset:
Run Command: <executable> -i vmc_optimization_500.inp
MPI Command: mpirun -np 52
Number Processes: 1
Number Nodes: 1
Filter: {type = number ; value = 1 ; }
Profile Start: {unit = none ; value = 0 ; }