- r0: gcc_o3_m80_size512-512-768_m4-4-5/
- r1: gcc_ofast_m80_size512-512-768_m4-4-5/
- r2: acfl_o3_m80_size512-512-768_m4-4-5/
- r3: acfl_ofast_m80_size512-512-768_m4-4-5/

| Metric | r0 | r1 | r2 | r3 |
|---|---|---|---|---|
| Total Time (s) | 53.34 | 53.41 | 55.21 | 55.23 |
| Max (Thread Active Time) (s) | 51.59 | 51.62 | 53.21 | 53.30 |
| Average Active Time (s) | 51.36 | 51.41 | 53.00 | 53.06 |
| Activity Ratio (%) | 96.2 | 96.3 | 95.9 | 96.0 |
| Average number of active threads | 77.034 | 77.001 | 76.787 | 76.855 |
| Affinity Stability (%) | 16.1 | 15.2 | 18.7 | 18.2 |
| Time in analyzed loops (%) | 93.5 | 93.9 | 94.3 | 93.9 |
| Time in analyzed innermost loops (%) | 39.5 | 37.3 | 78.2 | 83.8 |
| Time in user code (%) | 93.5 | 93.9 | 94.3 | 93.9 |
| Compilation Options Score (%) | 75.0 | 75.0 | 100 | 100 |
| Array Access Efficiency (%) | 37.1 | 29.7 | 57.7 | 58.0 |

| Potential Speedups | r0 | r1 | r2 | r3 |
|---|---|---|---|---|
| Perfect Flow Complexity | 1.00 | 1.00 | 1.00 | 1.00 |
| Perfect OpenMP + MPI + Pthread | 1.06 | 1.05 | 1.05 | 1.06 |
| Perfect OpenMP + MPI + Pthread + Perfect Load Distribution | 1.07 | 1.07 | 1.06 | 1.07 |
| No Scalar Integer: Potential Speedup | 2.25 | 2.68 | 1.72 | 1.44 |
| No Scalar Integer: Nb Loops to get 80% | 3 | 2 | 3 | 3 |
| FP Vectorised: Potential Speedup | 1.26 | 1.23 | 1.36 | 1.31 |
| FP Vectorised: Nb Loops to get 80% | 2 | 1 | 2 | 1 |
| Fully Vectorised: Potential Speedup | 1.53 | 1.52 | 2.03 | 1.98 |
| Fully Vectorised: Nb Loops to get 80% | 4 | 3 | 3 | 3 |
| Only FP Arithmetic: Potential Speedup | 1.63 | 1.67 | 1.55 | 1.45 |
| Only FP Arithmetic: Nb Loops to get 80% | 2 | 3 | 3 | 3 |
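
The figures above can be compared with a few lines of arithmetic. The following is a minimal sketch, not part of the MAQAO output: it computes each run's Total Time relative to r0 and a projected time obtained by dividing Total Time by the Fully Vectorised potential speedup. It assumes that the potential speedup applies to the whole run, which the report does not state explicitly; the function names `relative_to_baseline` and `projected_time` are hypothetical.

```python
# Values copied from the tables above (seconds and dimensionless speedups).
total_time = {"r0": 53.34, "r1": 53.41, "r2": 55.21, "r3": 55.23}
fully_vectorised = {"r0": 1.53, "r1": 1.52, "r2": 2.03, "r3": 1.98}


def relative_to_baseline(times, baseline="r0"):
    """Return each run's time as a percentage change versus the baseline run."""
    base = times[baseline]
    return {run: round(100.0 * (t - base) / base, 2) for run, t in times.items()}


def projected_time(times, speedups):
    """Divide each time by its potential speedup (assumption: whole-run speedup)."""
    return {run: round(times[run] / speedups[run], 2) for run in times}


slowdown = relative_to_baseline(total_time)
projected = projected_time(total_time, fully_vectorised)
for run in total_time:
    print(f"{run}: {slowdown[run]:+.2f}% vs r0, projected {projected[run]:.2f} s")
```

Under that assumption, the acfl runs (r2, r3) are roughly 3.5% slower than r0 in Total Time but would have the lower projected time if fully vectorised, at about 27 to 28 s against about 35 s for the gcc runs.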

| Source Object (r0) | Issue |
|---|---|
| lbc | |
| lb_init.F90 | -funroll-loops is missing. |
| mpl_set.F90 | -funroll-loops is missing. |
| tools.F90 | -funroll-loops is missing. |
| lbc.F90 | -funroll-loops is missing. |
| lbm_functions.F90 | -funroll-loops is missing. |

| Source Object (r1) | Issue |
|---|---|
| lbc | |
| lb_init.F90 | -funroll-loops is missing. |
| mpl_set.F90 | -funroll-loops is missing. |
| tools.F90 | -funroll-loops is missing. |
| lbc.F90 | -funroll-loops is missing. |
| lbm_functions.F90 | -funroll-loops is missing. |

| Source Object (r2) | Issue |
|---|---|
| lbc | |
| lb_init.F90 | |
| mpl_set.F90 | |
| tools.F90 | |
| lbc.F90 | |
| lbm_functions.F90 | |

| Source Object (r3) | Issue |
|---|---|
| lbc | |
| lbm_functions.F90 | |
| mpl_set.F90 | |
| tools.F90 | |
| lbc.F90 | |
| lb_init.F90 | |

| | r0 | r1 | r2 | r3 |
|---|---|---|---|---|
| Experiment Name | | | | |
| Application | ./../lbc/lbc | same as r0 | same as r0 | same as r0 |
| Timestamp | 2024-11-28 15:00:52 | 2024-11-28 14:35:04 | 2024-11-28 15:21:52 | 2024-11-28 15:36:32 |
| Experiment Type | Sequential | same as r0 | same as r0 | same as r0 |
| Machine | turpancomp0 | turpancomp1 | same as r0 | same as r0 |
| Architecture | aarch64 | same as r0 | same as r0 | same as r0 |
| Micro Architecture | ARM_NEOVERSE_N1 | same as r0 | same as r0 | same as r0 |
| Model Name | | | | |
| Cache Size | | | | |
| Number of Cores | | | | |
| Maximal Frequency | 3 GHz | same as r0 | same as r0 | same as r0 |
| OS Version | Linux 4.18.0-477.27.1.el8_8.aarch64 #1 SMP Thu Aug 31 11:00:23 EDT 2023 | same as r0 | same as r0 | same as r0 |
| Architecture used during static analysis | aarch64 | same as r0 | same as r0 | same as r0 |
| Micro Architecture used during static analysis | ARM_NEOVERSE_N1 | same as r0 | same as r0 | same as r0 |
| Compilation Options | lbc: GNU Fortran2008 11.2.0 -mlittle-endian -mabi=lp64 -march=armv8.2-a+crypto+fp16+rcpc+dotprod+ssbs -g -O3 -O3 -fintrinsic-modules-path /usr/local/arm/gcc-11.2.0_Generic-AArch64_RHEL-8_aarch64-linux/bin/../lib/gcc/aarch64-linux-gnu/11.2.0/finclude -fpre-include=/usr/include/finclude/math-vector-fortran.h | lbc: GNU Fortran2008 11.2.0 -mlittle-endian -mabi=lp64 -march=armv8.2-a+crypto+fp16+rcpc+dotprod+ssbs -g -Ofast -Ofast -fintrinsic-modules-path /usr/local/arm/gcc-11.2.0_Generic-AArch64_RHEL-8_aarch64-linux/bin/../lib/gcc/aarch64-linux-gnu/11.2.0/finclude -fpre-include=/usr/include/finclude/math-vector-fortran.h | lbc: Arm F90 F90 Flang - 1.5 2017-05-01 flang -O3 -g -mcpu=native -D D3Q19 -D COMBINE_STREAM_COLLIDE -D LAYOUT_LIJK -D USE_MPI -D VERBOSE -D SENDRECV -D COMM_REDUCED -O3 -c -I /usr/local/openmpi/arm/4.1.4-cpu/include -I /usr/local/openmpi/arm/4.1.4-cpu/lib | lbc: Arm F90 F90 Flang - 1.5 2017-05-01 flang -Ofast -g -mcpu=native -D D3Q19 -D COMBINE_STREAM_COLLIDE -D LAYOUT_LIJK -D USE_MPI -D VERBOSE -D SENDRECV -D COMM_REDUCED -Ofast -c -I /usr/local/openmpi/arm/4.1.4-cpu/include -I /usr/local/openmpi/arm/4.1.4-cpu/lib |
| Number of processes observed | 80 | same as r0 | same as r0 | same as r0 |
| Number of threads observed | 80 | same as r0 | same as r0 | same as r0 |
| Frequency Driver | cppc_cpufreq | same as r0 | same as r0 | same as r0 |
| Frequency Governor | performance | same as r0 | same as r0 | same as r0 |
| Huge Pages | never | same as r0 | same as r0 | same as r0 |
| Hyperthreading | off | same as r0 | same as r0 | same as r0 |
| Number of sockets | 1 | same as r0 | same as r0 | same as r0 |
| Number of cores per socket | 80 | same as r0 | same as r0 | same as r0 |
| MAQAO version | 2.20.12 | same as r0 | same as r0 | same as r0 |
| MAQAO build | 62b64aff226fe590d95d07d98d47e2315a70c860::20241127-211231 | same as r0 | same as r0 | same as r0 |
| Comments | | same as r0 | same as r0 | same as r0 |
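
The experiment summary abbreviates repeated values as "same as r0". As an illustration only, the sketch below expands such cells into explicit values for one row, using the Machine row as sample data. The helper name `expand_same_as` is hypothetical, and the code assumes a cell only ever refers to a run listed earlier in the row.

```python
def expand_same_as(row, runs=("r0", "r1", "r2", "r3")):
    """Replace 'same as rN' cells with the value of the run they point to."""
    prefix = "same as "
    resolved = {}
    for run in runs:
        value = row[run]
        if value.startswith(prefix):
            # Look up the referenced run, which must already be resolved.
            value = resolved[value[len(prefix):]]
        resolved[run] = value
    return resolved


# Machine row copied from the experiment summary table.
machine = {
    "r0": "turpancomp0",
    "r1": "turpancomp1",
    "r2": "same as r0",
    "r3": "same as r0",
}
expanded = expand_same_as(machine)
print(expanded)
```

Expanding the row this way makes it explicit that r1 ran on a different node (turpancomp1) from the other three runs.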