| Total Time (s) | 72.61 | |
| Profiled Time (s) | 71.18 | |
| Time in analyzed loops (%) | 12.3 | |
| Time in analyzed innermost loops (%) | 11.9 | |
| Time in user code (%) | 12.8 | |
| Compilation Options Score (%) | 100 | |
| Array Access Efficiency (%) | 91.5 | |
| Potential Speedups | Potential Speedup | Nb Loops to get 80% |
| Perfect Flow Complexity | 1.00 | |
| Perfect OpenMP + MPI + Pthread | 1.00 | |
| Perfect OpenMP + MPI + Pthread + Perfect Load Distribution | 19.3 | |
| No Scalar Integer | 1.01 | 6 |
| FP Vectorised | 1.03 | 9 |
| Fully Vectorised | 1.11 | 11 |
| FP Arithmetic Only | 1.03 | 6 |
| Source Object | Issue |
| vmc.mov1 | |
| ○jastrow4e.f | |
| ○optci.f | |
| ○multideterminante.f | |
| ○determinant.f | |
| ○optorb.f | |
| ○optjas.f | |
| ○determinant_psit.f | |
| ○determinante.f | |
| ○scale_dist.f | |
| ○orbitals.f | |
| ○multiply_slmi_mderiv.f | |
| ○determinante_psit.f | |
| ○nonlpsi.f | |
| ○deriv_nonlpsi.f | |
| ○metrop_mov1_slat.f | |
| ○basis_fns.f | |
| ○deriv_jastrow4.f90 | |
| ○optwf_sr.f90 | |
| ○set_input_data.f90 | |
| ○splfit.f | |
| ○deriv_nonloc.f | |
| ○get_norbterm.f90 | |
| ○detsav.f | |
| ○distances.f | |
| ○nonloc.f | |
| ○slm.f90 | |
| ○multideterminant.f | |
| Application | /home/kcamus/trex/champ/champ/bin/vmc.mov1 |
| Timestamp | 2023-11-14 15:06:12 |
| Universal Timestamp | 1699974372 |
| Number of processes observed | 1 |
| Number of threads observed | 129 |
| Experiment Type | OpenMP |
| Machine | ip-172-31-68-94 |
| Model Name | AMD EPYC 9R14 96-Core Processor |
| Architecture | x86_64 |
| Micro Architecture | ZEN_V4 |
| Cache Size | 1024 KB |
| Number of Cores | 96 |
| OS Version | Linux 6.2.0-1015-aws #15~22.04.1-Ubuntu SMP Fri Oct 6 21:37:24 UTC 2023 |
| Architecture used during static analysis | x86_64 |
| Micro Architecture used during static analysis | ZEN_V4 |
| Frequency Driver | acpi-cpufreq |
| Frequency Governor | performance |
| Huge Pages | madvise |
| Hyperthreading | off |
| Number of sockets | 2 |
| Number of cores per socket | 96 |
| Compilation Options | vmc.mov1: F90 Flang - 1.5 2017-05-01 '+flang -DTARGET_ARCHITECTURE=\"avx512\" -DVECTORIZATION=\"avx512\" -I/home/kcamus/trex/champ/champ/buildflang/src/module -I/home/kcamus/trex/champ/champ/buildflang/src/parser -march=native -O2 -cpp -mcmodel=large -ffree-line-length-none -g -fno-omit-frame-pointer -fPIC -D_MPI_ -DCLUSTER -ffixed-form -ffixed-line-length-132 -c -o -I/home/kcamus/openmpi/openmpi-5.0.0/_install/include -I/home/kcamus/openmpi/openmpi-5.0.0/_install/lib' |
| Dataset | |
| Run Command | <executable> -i vmc_optimization_15000.inp |
| Number Processes | 1 |
| Number Nodes | 1 |
| Filter | {type = number ; value = 10 ; } |
| Profile Start | {unit = none ; value = 0 ; } |