exec - 2024-03-26 11:38:13 - MAQAO 2.19.1

Global Metrics

Total Time (s): 106.43
Profiled Time (s): 99.36
GFLOPS: 701.926
Time in analyzed loops (%): 69.0
Time in analyzed innermost loops (%): 68.8
Time in user code (%): 69.3
Compilation Options Score (%): 99.9
Array Access Efficiency (%): 94.0
Potential Speedups
Perfect Flow Complexity: 1.03
Perfect OpenMP + MPI + Pthread: 1.00
Perfect OpenMP + MPI + Pthread + Perfect Load Distribution: 1.19
No Scalar Integer
  Potential Speedup: 1.02
  Nb Loops to get 80%: 1
FP Vectorised
  Potential Speedup: 1.44
  Nb Loops to get 80%: 3
Fully Vectorised
  Potential Speedup: 1.98
  Nb Loops to get 80%: 5
FP Arithmetic Only
  Potential Speedup: 1.08
  Nb Loops to get 80%: 4
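
The percentages and speedup factors above are easier to compare once converted to seconds. The sketch below does that arithmetic with the values from this report. It rests on two assumptions that the report does not state: that the time percentages are relative to Profiled Time, and that each potential speedup is a whole-application factor applied to Profiled Time. The helper names `pct_to_seconds` and `projected_time` are hypothetical and are not part of MAQAO.

```python
# Values copied from the Global Metrics section of this report.
PROFILED_TIME_S = 99.36

TIME_SHARES_PCT = {
    "analyzed loops": 69.0,
    "analyzed innermost loops": 68.8,
    "user code": 69.3,
}

POTENTIAL_SPEEDUPS = {
    "No Scalar Integer": 1.02,
    "FP Vectorised": 1.44,
    "Fully Vectorised": 1.98,
    "FP Arithmetic Only": 1.08,
}


def pct_to_seconds(pct, base_s):
    """Convert a percentage of base_s into seconds.

    Assumption: the report's percentages are relative to Profiled Time.
    """
    return base_s * pct / 100.0


def projected_time(base_s, speedup):
    """Run time if the whole of base_s were sped up by the given factor.

    Assumption: the speedup applies to the full profiled time.
    """
    return base_s / speedup


for label, pct in TIME_SHARES_PCT.items():
    seconds = pct_to_seconds(pct, PROFILED_TIME_S)
    print(f"Time in {label}: {seconds:.2f} s")

for label, factor in POTENTIAL_SPEEDUPS.items():
    seconds = projected_time(PROFILED_TIME_S, factor)
    print(f"{label}: projected {seconds:.2f} s")
```

Under these assumptions, the output is only an order-of-magnitude guide for deciding which optimisation is worth pursuing first; it is not a figure reported by the tool.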

CQA Potential Speedups Summary

Loop Based Profile

Innermost Loop Based Profile

Application Categorization

Compilation Options

Source Object / Issue

[vdso]
  Issue: -g is missing for some functions (possibly ones added by the compiler), it is needed to have more accurate reports. Other recommended flags are: -O2/-O3, -march=(target)
libqmcparticle.so
  ParticleSet.cpp
libqmcwfs.so
  TinyVectorTensorOps.h
  OhmmsVector.h
  TwoBodyJastrowRef.h
  DiracDeterminantRef.cpp
  TinyVector.h
  einspline_spo_ref.hpp
  MultiBsplineRef.hpp
  WaveFunction.cpp
  DiracMatrix.h
  OneBodyJastrowRef.h
  DelayedUpdate.h
  BsplineFunctor.h
  SPOSet.h
exec
  Issue: -g is missing for some functions (possibly ones added by the compiler), it is needed to have more accurate reports. Other recommended flags are: -O2/-O3, -march=(target)
libqmcutil.so
  NewTimer.cpp
libqmcparticle_omptarget.so
  ParticleBConds3DSoa.h
  SoaDistanceTableABOMPTarget.h
  SoaDistanceTableAAOMPTarget.h

Loop Path Count Profile

Cumulated Speedup If No Scalar Integer

Cumulated Speedup If FP Vectorized

Cumulated Speedup If Fully Vectorized

Cumulated Speedup If FP Arithmetic Only

Experiment Summary

Application: /home/eoseret/qaas_runs_CPU_9468/171-143-7755/intel/miniqmc/run/binaries/gcc_9/exec
Comments: Execution on the Megware (https://www.megware.com/en/) benchmarking cluster
Timestamp: 2024-03-26 11:38:13
Universal Timestamp: 1711449493
Number of processes observed: 2
Number of threads observed: 96
Experiment Type: MPI; OpenMP;
Machine: idp09.benchmarkcenter.megware.com
Model Name: Intel (R) Xeon (R) CPU Max 9468
Architecture: x86_64
Micro Architecture: SAPPHIRE_RAPIDS
Cache Size: 107520 KB
Number of Cores: 48
OS Version: Linux 5.14.0-362.13.1.el9_3.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Dec 21 07:12:43 EST 2023
Architecture used during static analysis: x86_64
Micro Architecture used during static analysis: SAPPHIRE_RAPIDS
Frequency Driver: intel_pstate
Frequency Governor: performance
Huge Pages: always
Hyperthreading: on
Number of sockets: 2
Number of cores per socket: 48
Compilation Options:
libqmcwfs.so: GNU GIMPLE 13.2.0 -march=sapphirerapids -g -g -O3 -O3 -O3 -O3 -O3 -O3 -fno-openacc -fcf-protection=none -fPIC -funroll-loops -fno-omit-frame-pointer -fcf-protection=none -fopenmp -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -fltrans
libqmcutil.so: GNU GIMPLE 13.2.0 -march=sapphirerapids -g -g -O3 -O3 -O3 -O3 -O3 -O3 -fno-openacc -fcf-protection=none -fPIC -funroll-loops -fno-omit-frame-pointer -fcf-protection=none -fopenmp -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -fltrans
libqmcparticle_omptarget.so: GNU GIMPLE 13.2.0 -march=sapphirerapids -g -g -O3 -O3 -O3 -O3 -O3 -O3 -fno-openacc -fcf-protection=none -fPIC -funroll-loops -fno-omit-frame-pointer -fcf-protection=none -fopenmp -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -fltrans
exec: N/A + [vdso]: N/A
libqmcparticle.so: GNU C++17 13.2.0 -march=sapphirerapids -g -O3 -O3 -O3 -std=c++17 -flto -funroll-loops -fno-omit-frame-pointer -fcf-protection=none -fopenmp -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -fPIC

Configuration Summary

Dataset:
Run Command: <executable> -g "4 2 2" -b
MPI Command: mpirun -np 2 /usr/bin/numactl --preferred-many 8-15
Number Processes: 1
Number Nodes: 1
Filter: Not Used
Profile Start: Not Used