- r0: orig
- r1: compilers/clang_21

| Metric | r0 | r1 |
|---|---|---|
| Total Time (s) | 62.74 | 62.41 |
| Profiled Time (s) | 56.86 | 56.56 |
| GFLOPS | 104.307 | 76.234 |
| Time in analyzed loops (%) | 96.6 | 96.8 |
| Time in analyzed innermost loops (%) | 96.5 | 96.7 |
| Time in user code (%) | 96.6 | 96.8 |
| Compilation Options Score (%) | 100 | 100 |
| Array Access Efficiency (%) | 87.1 | 85.2 |
| Potential Speedups | | |
| Perfect Flow Complexity | 1.10 | 1.10 |
| Perfect OpenMP + MPI + Pthread | 1.02 | 1.02 |
| Perfect OpenMP + MPI + Pthread + Perfect Load Distribution | 1.04 | 1.04 |
| No Scalar Integer: Potential Speedup | 1.02 | 1.04 |
| No Scalar Integer: Nb Loops to get 80% | 4 | 7 |
| FP Vectorised: Potential Speedup | 1.13 | 1.09 |
| FP Vectorised: Nb Loops to get 80% | 4 | 3 |
| Fully Vectorised: Potential Speedup | 1.18 | 1.14 |
| Fully Vectorised: Nb Loops to get 80% | 5 | 4 |
| Only FP Arithmetic: Potential Speedup | 1.18 | 1.21 |
| Only FP Arithmetic: Nb Loops to get 80% | 8 | 11 |

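The headline metrics can be compared by computing the relative change from r0 to r1. The following is a minimal sketch that uses only values copied from the metrics table; the `metrics` dictionary and the `relative_change` helper are illustrative names and are not part of MAQAO.

```python
# Values copied from the metrics table (r0 = orig, r1 = compilers/clang_21).
metrics = {
    "Total Time (s)": (62.74, 62.41),
    "Profiled Time (s)": (56.86, 56.56),
    "GFLOPS": (104.307, 76.234),
}


def relative_change(r0: float, r1: float) -> float:
    """Return the change from r0 to r1 as a percentage of r0."""
    return (r1 - r0) / r0 * 100.0


# Relative change per metric, keyed by metric name.
changes = {name: relative_change(r0, r1) for name, (r0, r1) in metrics.items()}

for name, pct in changes.items():
    print(f"{name}: {pct:+.2f}%")
```

With these inputs the run times differ by well under 1%, while the reported GFLOPS figure drops by roughly a quarter, so the two metrics should be read separately rather than as one performance score.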

| Source Object | Issue |
|---|---|
| exec | |
| accelerate_kernel.f90-pp.f90 | |
| ideal_gas_kernel.f90-pp.f90 | |
| initialise_chunk_kernel.f90-pp.f90 | |
| viscosity_kernel.f90-pp.f90 | |
| advec_mom_kernel.f90-pp.f90 | |
| calc_dt_kernel.f90-pp.f90 | |
| build_field.f90-pp.f90 | |
| field_summary_kernel.f90-pp.f90 | |
| generate_chunk_kernel.f90-pp.f90 | |
| flux_calc_kernel.f90-pp.f90 | |
| PdV_kernel.f90-pp.f90 | |
| update_halo_kernel.f90-pp.f90 | |
| revert_kernel.f90-pp.f90 | |
| advec_cell_kernel.f90-pp.f90 | |
| reset_field_kernel.f90-pp.f90 | |

| Source Object | Issue |
|---|---|
| exec | |
| accelerate_kernel.f90-pp.f90 | |
| ideal_gas_kernel.f90-pp.f90 | |
| initialise_chunk_kernel.f90-pp.f90 | |
| viscosity_kernel.f90-pp.f90 | |
| advec_mom_kernel.f90-pp.f90 | |
| update_halo_kernel.f90-pp.f90 | |
| build_field.f90-pp.f90 | |
| field_summary_kernel.f90-pp.f90 | |
| generate_chunk_kernel.f90-pp.f90 | |
| flux_calc_kernel.f90-pp.f90 | |
| advec_cell_kernel.f90-pp.f90 | |
| calc_dt_kernel.f90-pp.f90 | |
| revert_kernel.f90-pp.f90 | |
| PdV_kernel.f90-pp.f90 | |
| reset_field_kernel.f90-pp.f90 | |

| | r0 | r1 |
|---|---|---|
| Experiment Name | | |
| Application | /home/kcamus/qaas_runs/170-308-5670/intel/CloverLeafFC/run/oneview_runs/defaults/orig/exec | /home/kcamus/qaas_runs/170-308-5670/intel/CloverLeafFC/run/binaries/clang_21/exec |
| Timestamp | 2023-12-20 15:34:02 | 2023-12-20 16:44:55 |
| Experiment Type | MPI; OpenMP; | same as r0 |
| Machine | ip-172-31-68-94 | same as r0 |
| Architecture | x86_64 | same as r0 |
| Micro Architecture | ZEN_V4 | same as r0 |
| Model Name | AMD EPYC 9R14 96-Core Processor | same as r0 |
| Cache Size | 1024 KB | same as r0 |
| Number of Cores | 96 | same as r0 |
| Maximal Frequency | 3.701953 GHz | same as r0 |
| OS Version | Linux 6.2.0-1017-aws #17~22.04.1-Ubuntu SMP Fri Nov 17 21:07:13 UTC 2023 | same as r0 |
| Architecture used during static analysis | x86_64 | same as r0 |
| Micro Architecture used during static analysis | ZEN_V4 | same as r0 |
| Compilation Options | exec: F90 Flang - 1.5 2017-05-01 '+flang -I/home/kcamus/qaas_runs/170-308-5670/intel/CloverLeafFC/build/CloverLeafFC/CloverLeaf_ref/kernels -O3 -march=native -g -fno-omit-frame-pointer -fcf-protection=none -no-pie -fopenmp -c -o -I/home/kcamus/openmpi/openmpi-5.0.0/_install/include -I/home/kcamus/openmpi/openmpi-5.0.0/_install/lib' | exec: F90 Flang - 1.5 2017-05-01 '+flang -I/home/kcamus/qaas_runs/170-308-5670/intel/CloverLeafFC/build/CloverLeafFC/CloverLeaf_ref/kernels -Ofast -march=znver4 -mprefer-vector-width=512 -flto -g -fno-omit-frame-pointer -fcf-protection=none -no-pie -fopenmp -c -o -I/home/kcamus/openmpi/openmpi-5.0.0/_install/include -I/home/kcamus/openmpi/openmpi-5.0.0/_install/lib' |
| Number of processes observed | 1 | same as r0 |
| Number of threads observed | 192 | same as r0 |
| Frequency Driver | acpi-cpufreq | same as r0 |
| Frequency Governor | performance | same as r0 |
| Huge Pages | madvise | same as r0 |
| Hyperthreading | off | same as r0 |
| Number of sockets | 2 | same as r0 |
| Number of cores per socket | 96 | same as r0 |
| MAQAO version | 2.18.1 | same as r0 |
| MAQAO build | 577f2e430dc41154e3ac510c4b67111c38b3cbf1::20231218-170050 | same as r0 |
| Comments | | same as r0 |
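
The Compilation Options row is the only place where the two builds differ beyond the binary path and timestamp, and the long flag strings make that difference hard to spot. The sketch below isolates it with a set difference. The flag strings are copied from that row with the `-I` include paths and the trailing `-c -o` left out for brevity; `flag_diff` is a hypothetical helper, not a MAQAO function.

```python
import shlex

# Flags copied from the Compilation Options row (include paths and "-c -o" omitted).
r0_flags = "-O3 -march=native -g -fno-omit-frame-pointer -fcf-protection=none -no-pie -fopenmp"
r1_flags = (
    "-Ofast -march=znver4 -mprefer-vector-width=512 -flto -g "
    "-fno-omit-frame-pointer -fcf-protection=none -no-pie -fopenmp"
)


def flag_diff(a: str, b: str) -> tuple[list[str], list[str]]:
    """Return (flags only in a, flags only in b), each sorted."""
    a_set, b_set = set(shlex.split(a)), set(shlex.split(b))
    return sorted(a_set - b_set), sorted(b_set - a_set)


only_r0, only_r1 = flag_diff(r0_flags, r1_flags)
print("only in r0:", only_r0)
print("only in r1:", only_r1)
```

A set comparison ignores flag order and repeated flags, which is acceptable here because neither string relies on a later flag overriding an earlier one.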