| Metric | r0 | r1 | r2 |
|---|---|---|---|
| Total Time (s) | 284.20 | 277.60 | 374.98 |
| Profiled Time (s) | 282.15 | 276.57 | 374.74 |
| Time in analyzed loops (%) | 85.0 | 90.5 | 93.4 |
| Time in analyzed innermost loops (%) | 68.1 | 77.5 | 85.8 |
| Time in user code (%) | 87.7 | 92.6 | 95.2 |
| Compilation Options Score (%) | 75.0 | 75.0 | 100 |
| Perfect Flow Complexity | 1.02 | 1.02 | 1.03 |
| Array Access Efficiency (%) | 55.1 | 57.2 | Not Available |
| GFLOPS | 32.712 | 51.828 | 0.0 |
| Perfect OpenMP + MPI + Pthread | 1.00 | 1.00 | 1.00 |
| Perfect OpenMP + MPI + Pthread + Perfect Load Distribution | 1.00 | 1.00 | 1.00 |
| No Scalar Integer: Potential Speedup | 1.08 | 1.05 | 1.17 |
| No Scalar Integer: Nb Loops to get 80% | 12 | 9 | 14 |
| FP Vectorised: Potential Speedup | 1.08 | 1.06 | 1.61 |
| FP Vectorised: Nb Loops to get 80% | 8 | 10 | 13 |
| Fully Vectorised: Potential Speedup | 1.24 | 1.15 | 1.67 |
| Fully Vectorised: Nb Loops to get 80% | 26 | 21 | 15 |
| Only FP Arithmetic: Potential Speedup | 1.27 | 1.14 | 1.75 |
| Only FP Arithmetic: Nb Loops to get 80% | 26 | 18 | 15 |

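
One way to read the Potential Speedup rows is to convert them into a projected run time. The sketch below does this for the Fully Vectorised row, under the assumption that the speedup is a whole-application factor that can be applied to the Profiled Time; the report itself does not state this, and the names `projected_time`, `profiled_time_s` and `fully_vectorised_speedup` are illustrative only.

```python
# Values copied from the metrics table (Profiled Time and
# Fully Vectorised: Potential Speedup).
profiled_time_s = {"r0": 282.15, "r1": 276.57, "r2": 374.74}
fully_vectorised_speedup = {"r0": 1.24, "r1": 1.15, "r2": 1.67}


def projected_time(time_s: float, speedup: float) -> float:
    """Run time if the whole potential speedup were realised.

    Assumption: the speedup applies to the full measured time.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return time_s / speedup


projected = {
    run: projected_time(profiled_time_s[run], fully_vectorised_speedup[run])
    for run in profiled_time_s
}

for run, new_time in projected.items():
    saved = profiled_time_s[run] - new_time
    print(f"{run}: {profiled_time_s[run]:.2f} s -> {new_time:.2f} s "
          f"({saved:.2f} s saved)")
```

Under this reading, r2 has the most to gain from full vectorisation, since it combines the longest profiled time with the largest potential speedup.
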
| Source Object | Issue |
|---|---|
| ▼libgromacs_mpi.so.7 | |
| ▼pairlist_simd_2xmm.h | |
| ○ | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| ▼threaded_force_buffer.cpp | |
| ○ | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| ▼pme_gather.cpp | |
| ○ | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| ▼listed_forces.cpp | |
| ○ | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| ▼partition.cpp | |
| ○ | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| ▼manage_threading.cpp | |
| ○ | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| ▼kernel_prune.cpp | |
| ○ | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| ▼pairs.cpp | |
| ○ | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| ▼pairlist.cpp | |
| ○ | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| ▼update.cpp | |
| ○ | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| ▼md_support.cpp | |
| ○ | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| ▼redistribute.cpp | |
| ○ | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| ▼mdatoms.cpp | |
| ○ | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| ▼lincs.cpp | |
| ○ | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| ▼pbc.cpp | |
| ○ | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| ▼atomdata.cpp | |
| ○ | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| ▼localtopology.cpp | |
| ○ | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| ▼vector.tcc | |
| ○ | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| ▼pme_solve.cpp | |
| ○ | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| ▼pme_spread.cpp | |
| ○ | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| ▼calc_verletbuf.cpp | |
| ○ | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| ▼vec.h | |
| ○ | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| ▼computemultibodycutoffs.cpp | |
| ○ | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| ▼bonded.cpp | |
| ○ | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| ▼settle.cpp | |
| ○ | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| ▼sim_util.cpp | |
| ○ | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| ▼grid.cpp | |
| ○ | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| ▼mshift.cpp | |
| ○ | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| ▼arrayref.h | |
| ○ | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| ▼domdec_constraints.cpp | |
| ○ | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| ▼kernel_ElecEw_VdwLJCombLB_VF.cpp | |
| ○ | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| ▼kernel_outer.h | |
| ○ | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| ▼pme_grid.cpp | |
| ○ | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
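
The issue reported for every source file above can be checked against a compilation options string. The following is a minimal sketch, not MAQAO's own implementation: the helper names and the `GENERIC_MARCH` set are assumptions, and the option strings are abridged from the Compilation Options recorded for r0 and r2 in the run table.

```python
import shlex

# Assumption: -march values treated as "generic" for this check.
GENERIC_MARCH = {"x86-64"}


def march_values(options: str) -> list[str]:
    """Return the value of every -march=... flag in an options string."""
    return [
        token.split("=", 1)[1]
        for token in shlex.split(options)
        if token.startswith("-march=")
    ]


def uses_generic_march(options: str) -> bool:
    """True if any -march flag names a generic baseline architecture."""
    return any(value in GENERIC_MARCH for value in march_values(options))


# Abridged from the Compilation Options row (r0 and r2).
r0_options = ("-mavx512f -mfma -mavx512vl -mavx512dq -mavx512bw "
              "-mtune=generic -march=x86-64 -g -O2 -std=c++17")
r2_options = ("-march=armv8.2-a+sve -msve-vector-bits=256 "
              "-mlittle-endian -mabi=lp64 -g -O3 -O3 -std=c++17")

for name, options in (("r0", r0_options), ("r2", r2_options)):
    print(name, march_values(options), uses_generic_march(options))
```

The r0 and r1 builds both pass `-march=x86-64`, while r2 passes `-march=armv8.2-a+sve`; this is in line with the Compilation Options Score of 75.0 for r0 and r1 and 100 for r2.
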

| | r0 | r1 | r2 |
|---|---|---|---|
| Application | /home/eoseret/GROMACS/install/gplusplus/bin/gmx_mpi | /ccc/work/cont001/ocre/oserete/gromacs-2022.4-install-gcc-ompi/bin/gmx_mpi | /home/eoseret/GROMACS/build/gcc_2/bin/gmx_mpi |
| Timestamp | 2023-07-28 12:01:12 | 2023-08-08 09:43:00 | 2023-08-08 09:21:48 |
| Experiment Type | MPI; | same as r0 | same as r0 |
| Machine | skylake | inti6224 | ip-172-31-47-199 |
| Architecture | x86_64 | same as r0 | aarch64 |
| Micro Architecture | SKYLAKE | ZEN_V3 | ARM_NEOVERSE_V1 |
| Model Name | Intel(R) Xeon(R) Platinum 8170 CPU @ 2.10GHz | AMD EPYC 7763 64-Core Processor | |
| Cache Size | 36608 KB | 512 KB | |
| Number of Cores | 26 | 64 | |
| Maximal Frequency | 2.1 GHz | 2.45 GHz | 0 GHz |
| OS Version | Linux 6.4.1-arch2-1 #1 SMP PREEMPT_DYNAMIC Tue, 04 Jul 2023 08:39:40 +0000 | Linux 4.18.0-305.88.1.el8_4.x86_64 #1 SMP Thu Apr 6 10:22:46 EDT 2023 | Linux 5.15.0-1039-aws #44~20.04.1-Ubuntu SMP Thu Jun 22 12:21:08 UTC 2023 |
| Architecture used during static analysis | x86_64 | same as r0 | aarch64 |
| Micro Architecture used during static analysis | SKYLAKE | ZEN_V3 | ARM_NEOVERSE_V1 |
| Compilation Options | libgromacs_mpi.so.7: GNU C++17 13.1.1 20230429 -mavx512f -mfma -mavx512vl -mavx512dq -mavx512bw -mtune=generic -march=x86-64 -g -O2 -std=c++17 -fno-omit-frame-pointer -fcf-protection=none -fPIC -fexcess-precision=fast -funroll-all-loops -fopenmp | libgromacs_mpi.so.7: GNU C++17 12.2.0 -mavx2 -mfma -mtune=generic -march=x86-64 -g -g -O2 -std=c++17 -fno-omit-frame-pointer -fcf-protection=none -fPIC -fexcess-precision=fast -funroll-all-loops -fopenmp -fexceptions | libgromacs_mpi.so.7: GNU C++17 11.1.0 -march=armv8.2-a+sve -msve-vector-bits=256 -mlittle-endian -mabi=lp64 -g -O3 -O3 -std=c++17 -fno-omit-frame-pointer -fcf-protection=none -fPIC -fexcess-precision=fast -funroll-all-loops -fopenmp -fasynchronous-unwind-tables -fstack-protector-strong -fstack-clash-protection |
| Number of processes observed | 1 | same as r0 | same as r0 |
| Number of threads observed | 1 | same as r0 | same as r0 |
| Frequency Driver | intel_cpufreq | acpi-cpufreq | NA |
| Frequency Governor | schedutil | performance | NA |
| Huge Pages | always | same as r0 | madvise |
| Hyperthreading | off | on | same as r0 |
| Number of sockets | 2 | same as r0 | 1 |
| Number of cores per socket | 26 | 64 | same as r1 |
| MAQAO version | 2.17.7 | same as r0 | 2.17.8 |
| MAQAO build | bf11934ec971510c7f500e010d8ca2474fd787ed::20230726-123240 | Build information not available | same as r1 |
| Comments | GROMACS 2022.4 compiled with g++ 13.1.1 running on Skylake with 1 OMP thread, 2000 steps | GROMACS compiled with gcc 12.2.0 + OpenMPI, Zen 3, OV1, 2000 steps, single core | GNU g++ 12.2.0 (SIMD=SVE), AWS G3 (Neoverse V1), 2000 steps, single core |