Compared Reports
r0: 1x1
r1: 1x2
r2: 1x4
r3: 1x8
r4: 1x16
r5: 1x32
r6: 1x64
r7: 1x192
r8: 1x128
Global Metrics
Metric                                                            r0      r1      r2      r3      r4      r5      r6      r7      r8
Total Time (s)                                                834.20  445.89  235.91  129.75   80.21   50.13   38.67   38.66   39.71
Profiled Time (s)                                             833.61  443.33  233.88  128.80   79.06   49.02   37.38   36.69   37.66
Time in analyzed loops (%)                                      91.6    88.3    87.1    85.3    74.5    67.8    59.7    49.7    49.6
Time in analyzed innermost loops (%)                            77.4    73.8    72.5    70.4    62.5    57.4    50.5    42.8    42.8
Time in user code (%)                                           93.1    89.7    88.5    86.6    75.6    68.8    60.5    50.3    50.2
Compilation Options Score (%)                                    100     100     100     100     100     100     100     100     100
Array Access Efficiency (%)                                     48.0    48.5    49.0    50.0    49.9    50.7    53.6    57.1    57.6
Scalability - Gap                                               1.00    1.07    1.13    1.24    1.54    1.92    2.97    5.93    6.09
Potential Speedups
  Perfect Flow Complexity                                       1.01    1.01    1.01    1.01    1.01    1.01    1.02    1.03    1.03
  Perfect OpenMP + MPI + Pthread                                1.00    1.01    1.05    1.05    1.13    1.17    1.22    1.31    1.34
  Perfect OpenMP + MPI + Pthread + Perfect Load Distribution    1.00    1.04    1.06    1.10    1.25    1.37    1.56    1.88    1.87
  No Scalar Integer
    Potential Speedup                                           1.04    1.04    1.04    1.04    1.04    1.04    1.04    1.03    1.03
    Nb Loops to get 80%                                            6       7       7       8       8       7       6       6       6
  FP Vectorised
    Potential Speedup                                           1.05    1.04    1.05    1.05    1.05    1.05    1.04    1.04    1.03
    Nb Loops to get 80%                                           10      10      10       9       9       8       7       5       5
  Fully Vectorised
    Potential Speedup                                           1.21    1.21    1.21    1.23    1.23    1.25    1.28    1.29    1.29
    Nb Loops to get 80%                                           22      24      25      26      24      21      18      16      16
  Only FP Arithmetic
    Potential Speedup                                           1.20    1.20    1.21    1.21    1.20    1.21    1.22    1.20    1.20
    Nb Loops to get 80%                                           17      19      20      21      21      20      17      12      12
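The Total Time row can be turned into speedup and parallel-efficiency figures with a few lines of Python. The sketch below is illustrative only. It assumes that each "1xN" label under Compared Reports means N MPI ranks, and that "Scalability - Gap" is the ratio of measured time to ideally scaled time, N * T_N / T_1. That formula is an assumption, not MAQAO's documented definition. With the listed times it reproduces the reported gap for r0 to r6 and r8, but for r7 (labelled 1x192) it yields about 8.90 rather than the reported 5.93, so the report may use a different rank count or definition for that run. The function and variable names are hypothetical.

```python
# Total Time (s) per report, copied from the Global Metrics table.
total_time = {
    "r0": 834.20, "r1": 445.89, "r2": 235.91, "r3": 129.75, "r4": 80.21,
    "r5": 50.13, "r6": 38.67, "r7": 38.66, "r8": 39.71,
}

# Rank counts read from the "1xN" labels (assumption: N = number of MPI ranks).
ranks = {
    "r0": 1, "r1": 2, "r2": 4, "r3": 8, "r4": 16,
    "r5": 32, "r6": 64, "r7": 192, "r8": 128,
}


def scaling_summary(times, procs, baseline="r0"):
    """Return {report: (speedup, efficiency, gap)} relative to the baseline run.

    speedup    = T_baseline / T_run
    efficiency = speedup / (procs_run / procs_baseline)
    gap        = 1 / efficiency   (assumed definition, see the text above)
    """
    t_ref = times[baseline]
    n_ref = procs[baseline]
    summary = {}
    for name, t in times.items():
        scale = procs[name] / n_ref
        speedup = t_ref / t
        efficiency = speedup / scale
        summary[name] = (speedup, efficiency, 1.0 / efficiency)
    return summary


summary = scaling_summary(total_time, ranks)
for name, (speedup, efficiency, gap) in summary.items():
    print(f"{name}: speedup={speedup:6.2f}  efficiency={efficiency:5.2f}  gap={gap:5.2f}")
```

Read this way, the table shows the run time flattening out near 38 to 40 s from 64 ranks upward, so each additional rank beyond that point mostly increases the gap rather than the speedup.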
[Charts omitted: Scalability Speedup; Cumulated Speedup If No Scalar Integer; Cumulated Speedup If FP Vectorized; Cumulated Speedup If Fully Vectorized; Cumulated Speedup If Only FP Arithmetic]
Loop Based Profiles
[Charts omitted: Innermost / Single Loops; Inbetween Loops; Outermost Loops; Cumulated Coverage With All Loops]
Innermost Loop Based Profiles
[Charts omitted: Coverage; Count]
Application Categorization
[Charts omitted: Time; Coverage]
Compilation Options
Source Object / Issue (the same listing is reported for each of r0 to r8)
libgromacs_mpi.so.9.0.0
  fft5d.cpp
  threaded_force_buffer.cpp
  pme_gather.cpp
  calcvir.cpp
  simd_prune_kernel.cpp
  partition.cpp
  manage_threading.cpp
  pbc_simd.cpp
  settle.cpp
  pairlist.cpp
  update.cpp
  md_support.cpp
  pme.cpp
  kernel_common.cpp
  mdatoms.cpp
  lincs.cpp
  pbc.cpp
  constr.cpp
  atomdata.cpp
  localtopology.cpp
  vector.tcc
  pme_solve.cpp
  pme_spread.cpp
  calc_verletbuf.cpp
  simd_kernel.h
  bonded.cpp
  sim_util.cpp
  grid.cpp
  kerneldispatch.cpp
  domdec_constraints.cpp
  pairs.cpp
  pme_grid.cpp
  listed_forces.cpp
[vdso]
  (unnamed source object)
-g is missing for some functions (possibly ones added by the compiler); it is needed for more accurate reports. Other recommended flags are -O2/-O3 and -march=(target).
Path Count Profiles
[Charts omitted: Coverage; Count]
Low Iteration Count Profiles
[Charts omitted: Coverage; Count]
Experiment Summaries
Field               r0                                  r1            r2 to r8
Experiment Name
Application         ../../install_MPI/bin/gmx_mpi       same as r0    same as r0
Timestamp           2024-08-02 17:16:32                 same as r0    same as r0
Experiment Type     MPI;                                MPI; OpenMP;  same as r1
Machine             ins01.benchmarkcenter.megware.com   same as r0    same as r0
Architecture        x86_64                              same as r0    same as r0
Micro Architecture  ZEN_V4                              same as r0    same as r0
Model Name          AMD EPYC 9654 96-Core Processor     same as r0    same as r0
Cache Size          1024 KB                             same as r0    same as r0
Number of Cores     96                                  same as r0    same as r0
Maximal Frequency   3.707812 GHz                        same as r0    same as r0
OS Version          Linux 5.14.0-427.18.1.el9_4.x86_64 #1 SMP PREEMPT_DYNAMIC Tue May 28 06:27:02 EDT 2024
GROMACS 2024.2 compiled with AOCC 4.1, running on two 96-core AMD Zen 4 processors with 1 to 192 MPI ranks (no OMP) [strong scaling]. Pinning is controlled by GROMACS.