| Metric | r0 | r1 | r2 | r3 | r4 | r5 | r6 |
|---|---|---|---|---|---|---|---|
| Total Time (s) | 1.87 E3 | 968.05 | 500.34 | 265.25 | 143.52 | 85.11 | 48.71 |
| Profiled Time (s) | 1.87 E3 | 960.82 | 495.16 | 261.06 | 139.91 | 83.52 | 47.35 |
| Time in analyzed loops (%) | 94.1 | 93.1 | 91.8 | 90.4 | 85.4 | 74.0 | 69.0 |
| Time in analyzed innermost loops (%) | 86.4 | 84.9 | 83.5 | 81.6 | 76.9 | 66.2 | 61.0 |
| Time in user code (%) | 0 | 0.86 | 1.95 | 2.94 | 7.06 | 19.5 | 23.5 |
| Compilation Options Score (%) | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| Perfect Flow Complexity | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| Array Access Efficiency (%) | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available |
| Perfect OpenMP + MPI + Pthread | 1.00 | 1.00 | 1.00 | 1.00 | 1.01 | 1.00 | 1.01 |
| Perfect OpenMP + MPI + Pthread + Perfect Load Distribution | 1.00 | 1.00 | 1.00 | 1.00 | 1.01 | 1.01 | 1.03 |
| No Scalar Integer: Potential Speedup | 1.05 | 1.05 | 1.05 | 1.05 | 1.05 | 1.04 | 1.04 |
| No Scalar Integer: Nb Loops to get 80% | 9 | 10 | 11 | 12 | 12 | 12 | 12 |
| FP Vectorised: Potential Speedup | 1.02 | 1.02 | 1.02 | 1.02 | 1.01 | 1.01 | 1.01 |
| FP Vectorised: Nb Loops to get 80% | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
| Fully Vectorised: Potential Speedup | 1.05 | 1.06 | 1.06 | 1.06 | 1.06 | 1.05 | 1.05 |
| Fully Vectorised: Nb Loops to get 80% | 15 | 19 | 20 | 21 | 20 | 21 | 21 |
| Only FP Arithmetic: Potential Speedup | 1.07 | 1.08 | 1.08 | 1.09 | 1.08 | 1.07 | 1.07 |
| Only FP Arithmetic: Nb Loops to get 80% | 14 | 19 | 19 | 20 | 19 | 20 | 19 |
| Scalability - Gap | 1.00 | 1.04 | 1.07 | 1.14 | 1.23 | 1.46 | 1.67 |

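
The Scalability - Gap values appear consistent with the ratio of measured time to ideal strong-scaling time, that is, (time at n processes × n) / (time at 1 process). This is an inference from the numbers in the table, not MAQAO's documented definition. The sketch below recomputes speedup, parallel efficiency, and that ratio from the Total Time row and the observed process counts of the seven runs; the function name `scaling_metrics` is hypothetical. Because r0 is displayed rounded as 1.87 E3, the recomputed gap agrees with the reported row only to within about 0.01.

```python
# Total Time (s) for runs r0..r6, copied from the metrics table.
# r0 is displayed as "1.87 E3", so 1870 s is a rounded value.
total_time = [1.87e3, 968.05, 500.34, 265.25, 143.52, 85.11, 48.71]

# Number of processes observed for runs r0..r6.
processes = [1, 2, 4, 8, 16, 32, 64]


def scaling_metrics(times, procs):
    """Return (procs, speedup, efficiency, gap) per run, relative to the first run.

    Assumption: gap = measured time / ideal time, i.e. 1 / efficiency.
    This matches the reported row numerically but is not taken from MAQAO docs.
    """
    base_time, base_procs = times[0], procs[0]
    rows = []
    for t, p in zip(times, procs):
        speedup = base_time / t          # how much faster than the baseline run
        ideal = p / base_procs           # speedup under perfect strong scaling
        efficiency = speedup / ideal     # fraction of ideal speedup achieved
        gap = 1.0 / efficiency           # slowdown relative to ideal scaling
        rows.append((p, speedup, efficiency, gap))
    return rows


for p, s, e, g in scaling_metrics(total_time, processes):
    print(f"{p:>3} procs  speedup {s:6.2f}  efficiency {e:5.1%}  gap {g:4.2f}")
```

Read this way, the 64-process run finishes roughly 38 times faster than the single-process run, at about 60% parallel efficiency.
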

| Binary | Source Object | Issue |
|---|---|---|
| libgromacs_mpi.so.7.0.0 | lincs.cpp | |
| libgromacs_mpi.so.7.0.0 | pbc.cpp | |
| libgromacs_mpi.so.7.0.0 | domdec.cpp | |
| libgromacs_mpi.so.7.0.0 | pme_redistribute.cpp | |
| libgromacs_mpi.so.7.0.0 | fft5d.cpp | |
| libgromacs_mpi.so.7.0.0 | impl_arm_sve_util_float.h | |
| libgromacs_mpi.so.7.0.0 | calc_verletbuf.cpp | |
| libgromacs_mpi.so.7.0.0 | threaded_force_buffer.cpp | |
| libgromacs_mpi.so.7.0.0 | update.cpp | |
| libgromacs_mpi.so.7.0.0 | pme_pp.cpp | |
| libgromacs_mpi.so.7.0.0 | localtopology.cpp | |
| libgromacs_mpi.so.7.0.0 | settle.cpp | |
| libgromacs_mpi.so.7.0.0 | pme_solve.cpp | |
| libgromacs_mpi.so.7.0.0 | pme_spread.cpp | |
| libgromacs_mpi.so.7.0.0 | atomdata.cpp | |
| libgromacs_mpi.so.7.0.0 | manage_threading.cpp | |
| libgromacs_mpi.so.7.0.0 | kernel_prune.cpp | |
| libgromacs_mpi.so.7.0.0 | pme_grid.cpp | |
| libgromacs_mpi.so.7.0.0 | partition.cpp | |
| libgromacs_mpi.so.7.0.0 | kernel_outer.h | |
| libgromacs_mpi.so.7.0.0 | pairs.cpp | |
| libgromacs_mpi.so.7.0.0 | pairlist.cpp | |
| libgromacs_mpi.so.7.0.0 | sim_util.cpp | |
| libgromacs_mpi.so.7.0.0 | grid.cpp | |
| libgromacs_mpi.so.7.0.0 | md_support.cpp | |
| libgromacs_mpi.so.7.0.0 | bonded.cpp | |
| libgromacs_mpi.so.7.0.0 | domdec_constraints.cpp | |
| libgromacs_mpi.so.7.0.0 | vec.h | |
| libgromacs_mpi.so.7.0.0 | mdatoms.cpp | |
| libgromacs_mpi.so.7.0.0 | pme_gather.cpp | |
| gmx_mpi | | -g is missing; it is needed for more accurate reports. Other recommended flags are: -O2/-O3, -march=(target) |


| Property | r0 | r1 | r2 | r3 | r4 | r5 | r6 |
|---|---|---|---|---|---|---|---|
| Application | /home/eoseret/GROMACS/build/gcc_2/bin/gmx_mpi | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
| Timestamp | 2023-02-21 17:49:53 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
| Experiment Type | MPI; | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
| Machine | ip-172-31-8-114 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
| Architecture | arm64 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
| Micro Architecture | ARM_NEOVERSE_V1 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
| Model Name | | | | | | | |
| Cache Size | | | | | | | |
| Number of Cores | | | | | | | |
| Maximal Frequency | 0 GHz | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
| OS Version | Linux 5.15.0-1030-aws #34~20.04.1-Ubuntu SMP Tue Jan 24 15:16:39 UTC 2023 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
| Architecture used during static analysis | arm64 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
| Micro Architecture used during static analysis | ARM_NEOVERSE_V1 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
| Compilation Options | libgromacs_mpi.so.7.0.0: GNU C++17 11.1.0 -march=armv8.2-a+sve -msve-vector-bits=256 -mlittle-endian -mabi=lp64 -g -O3 -O3 -std=c++17 -fno-omit-frame-pointer -fcf-protection=none -fPIC -fexcess-precision=fast -funroll-all-loops -fopenmp -fasynchronous-unwind-tables -fstack-protector-strong -fstack-clash-protection | same as r0 | same as r0 | same as r0 | same as r0 | libgromacs_mpi.so.7.0.0: GNU C++17 11.1.0 -march=armv8.2-a+sve -msve-vector-bits=256 -mlittle-endian -mabi=lp64 -g -O3 -O3 -std=c++17 -fno-omit-frame-pointer -fcf-protection=none -fPIC -fexcess-precision=fast -funroll-all-loops -fopenmp -fasynchronous-unwind-tables -fstack-protector-strong -fstack-clash-protection gmx_mpi: N/A | same as r0 |
| Number of processes observed | 1 | 2 | 4 | 8 | 16 | 32 | 64 |
| Number of threads observed | 1 | 2 | 4 | 8 | 16 | 32 | 64 |
| MAQAO version | 2.16.3 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
| MAQAO build | Build information not available | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
| Comments | GNU 11.1 (SIMD=SVE), AWS G3 (Neoverse V1), scalability | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |