Metric | | r0 | r1 | r2 | r3 | r4
---|---|---|---|---|---|---
Total Time (s) | | 4.47 | 4.46 | 4.47 | 4.03 | 4.71
Profiled Time (s) | | 2.88 | 2.85 | 2.87 | 2.77 | 3.08
Time in analyzed loops (%) | | 49.2 | 49.2 | 48.9 | 48.2 | 44.8
Time in analyzed innermost loops (%) | | 44.1 | 44.0 | 43.9 | 41.5 | 39.4
Time in user code (%) | | 48.7 | 48.8 | 48.4 | 47.7 | 44.4
Compilation Options Score (%) | | 93.5 | 93.3 | 93.1 | 92.9 | 93.1
Array Access Efficiency (%) | | 97.7 | 95.9 | 96.2 | 96.7 | 97.6
Potential Speedups | | | | | |
Perfect Flow Complexity | | 1.01 | 1.01 | 1.01 | 1.01 | 1.00
Perfect OpenMP + MPI + Pthread | | 1.03 | 1.03 | 1.03 | 1.04 | 1.12
Perfect OpenMP + MPI + Pthread + Perfect Load Distribution | | 2.75 | 2.69 | 2.73 | 2.71 | 2.90
No Scalar Integer | Potential Speedup | 1.03 | 1.03 | 1.03 | 1.03 | 1.02
No Scalar Integer | Nb Loops to get 80% | 6 | 6 | 6 | 5 | 5
FP Vectorised | Potential Speedup | 1.14 | 1.14 | 1.15 | 1.03 | 1.01
FP Vectorised | Nb Loops to get 80% | 2 | 2 | 3 | 2 | 3
Fully Vectorised | Potential Speedup | 1.37 | 1.37 | 1.37 | 1.05 | 1.17
Fully Vectorised | Nb Loops to get 80% | 4 | 4 | 4 | 7 | 3
Only FP Arithmetic | Potential Speedup | 1.10 | 1.10 | 1.10 | 1.11 | 1.25
Only FP Arithmetic | Nb Loops to get 80% | 8 | 7 | 8 | 6 | 4

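
The Total Time row is easier to compare as a ratio against r0. The following is a minimal sketch, assuming the values are copied by hand from the table above; `speedup_vs_baseline` is a hypothetical helper and not part of MAQAO.

```python
# Total Time (s) per run, copied by hand from the metrics table above.
total_time_s = {"r0": 4.47, "r1": 4.46, "r2": 4.47, "r3": 4.03, "r4": 4.71}


def speedup_vs_baseline(times, baseline="r0"):
    """Return baseline_time / run_time per run (values above 1 mean faster than the baseline)."""
    base = times[baseline]
    return {run: base / t for run, t in times.items()}


speedups = speedup_vs_baseline(total_time_s)
for run, s in speedups.items():
    print(f"{run}: {s:.3f}x")

# Run with the smallest total time.
fastest = min(total_time_s, key=total_time_s.get)
print("fastest run:", fastest)
```

With these numbers r3 is the fastest run (about 1.11x relative to r0), while r4 is slower than r0 (about 0.95x).
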
Source Object | Issue |
---|---|
▼libqmckl.so.0.0.0 | |
▼qmckl_mo.c | |
○ | |
▼qmckl_ao.c | |
○ | |
▼bench_mos | |
▼ | |
○ | -g is missing for some functions (possibly ones added by the compiler); it is needed for more accurate reports. Other recommended flags are -O2/-O3 and -march=(target) |

Property | r0 | r1 | r2 | r3 | r4
---|---|---|---|---|---|
Experiment Name | m1o52 | m1o52 | m1o52 | m1o52 | m1o52 |
Application | ./../qmckl_bench/build/bench_mos | same as r0 | same as r0 | same as r0 | same as r0 |
Timestamp | 2024-02-06 15:47:04 | 2024-02-06 16:16:09 | 2024-02-06 16:21:42 | 2024-02-06 16:31:15 | 2024-02-06 16:36:31 |
Experiment Type | OpenMP; | same as r0 | same as r0 | same as r0 | same as r0 |
Machine | skylake | same as r0 | same as r0 | same as r0 | same as r0 |
Architecture | x86_64 | same as r0 | same as r0 | same as r0 | same as r0 |
Micro Architecture | SKYLAKE | same as r0 | same as r0 | same as r0 | same as r0 |
Model Name | Intel(R) Xeon(R) Platinum 8170 CPU @ 2.10GHz | same as r0 | same as r0 | same as r0 | same as r0 |
Cache Size | 36608 KB | same as r0 | same as r0 | same as r0 | same as r0 |
Number of Cores | 26 | same as r0 | same as r0 | same as r0 | same as r0 |
Maximal Frequency | 2.1 GHz | same as r0 | same as r0 | same as r0 | same as r0 |
OS Version | Linux 6.5.7-arch1-1 #1 SMP PREEMPT_DYNAMIC Tue, 10 Oct 2023 21:10:21 +0000 | same as r0 | same as r0 | same as r0 | same as r0 |
Architecture used during static analysis | x86_64 | same as r0 | same as r0 | same as r0 | same as r0 |
Micro Architecture used during static analysis | SKYLAKE | same as r0 | same as r0 | same as r0 | same as r0 |
Compilation Options | bench_mos: libqmckl.so.0.0.0: clang based Intel(R) oneAPI DPC++/C++ Compiler 2023.0.0 (2023.0.0.20221201) --intel -I . -I /home/kcamus/comparative/qmckl/qmckl -I ./include -I ./src -I ./include -I /home/kcamus/comparative/qmckl/qmckl/src -I /home/kcamus/comparative/qmckl/qmckl/include -I /home/kcamus/comparative/qmckl/qmckl/share/qmckl/test_data/ -I /home/kcamus/comparative/qmckl/trexio/_install/include -D HAVE_CONFIG_H -D QMCKL_TEST_DIR=\"/home/kcamus/comparative/qmckl/qmckl/share/qmckl/test_data/\" -march=native -Ofast -ftz -finline -fiopenmp -qmkl=sequential -g -fno-omit-frame-pointer -fopenmp -MT src/qmckl_mo.lo -MD -MP -MF src/.deps/qmckl_mo.Tpo -c src/qmckl_mo.c -fPIC -D PIC -o src/.libs/qmckl_mo.o -fveclib=SVML -fheinous-gnu-extensions | bench_mos: libqmckl.so.0.0.0: clang based Intel(R) oneAPI DPC++/C++ Compiler 2023.0.0 (2023.0.0.20221201) --intel -I . -I /home/kcamus/comparative/qmckl/qmckl -I ./include -I ./src -I ./include -I /home/kcamus/comparative/qmckl/qmckl/src -I /home/kcamus/comparative/qmckl/qmckl/include -I /home/kcamus/comparative/qmckl/qmckl/share/qmckl/test_data/ -I /home/kcamus/comparative/qmckl/trexio/_install/include -D HAVE_CONFIG_H -D QMCKL_TEST_DIR=\"/home/kcamus/comparative/qmckl/qmckl/share/qmckl/test_data/\" -march=native -O2 -ftz -finline -fiopenmp -qmkl=sequential -g -fno-omit-frame-pointer -fopenmp -MT src/qmckl_mo.lo -MD -MP -MF src/.deps/qmckl_mo.Tpo -c src/qmckl_mo.c -fPIC -D PIC -o src/.libs/qmckl_mo.o -fveclib=SVML -fheinous-gnu-extensions | bench_mos: libqmckl.so.0.0.0: clang based Intel(R) oneAPI DPC++/C++ Compiler 2023.0.0 (2023.0.0.20221201) --intel -I . -I /home/kcamus/comparative/qmckl/qmckl -I ./include -I ./src -I ./include -I /home/kcamus/comparative/qmckl/qmckl/src -I /home/kcamus/comparative/qmckl/qmckl/include -I /home/kcamus/comparative/qmckl/qmckl/share/qmckl/test_data/ -I /home/kcamus/comparative/qmckl/trexio/_install/include -D HAVE_CONFIG_H -D QMCKL_TEST_DIR=\"/home/kcamus/comparative/qmckl/qmckl/share/qmckl/test_data/\" -x CORE-AVX512 -Ofast -ftz -finline -fiopenmp -qmkl=sequential -g -fno-omit-frame-pointer -fopenmp -MT src/qmckl_mo.lo -MD -MP -MF src/.deps/qmckl_mo.Tpo -c src/qmckl_mo.c -fPIC -D PIC -o src/.libs/qmckl_mo.o -fveclib=SVML -fheinous-gnu-extensions | bench_mos: libqmckl.so.0.0.0: clang based Intel(R) oneAPI DPC++/C++ Compiler 2023.0.0 (2023.0.0.20221201) --intel -I . -I /home/kcamus/comparative/qmckl/qmckl -I ./include -I ./src -I ./include -I /home/kcamus/comparative/qmckl/qmckl/src -I /home/kcamus/comparative/qmckl/qmckl/include -I /home/kcamus/comparative/qmckl/qmckl/share/qmckl/test_data/ -I /home/kcamus/comparative/qmckl/trexio/_install/include -D HAVE_CONFIG_H -D QMCKL_TEST_DIR=\"/home/kcamus/comparative/qmckl/qmckl/share/qmckl/test_data/\" -x CORE-AVX512 -qopt-zmm-usage=high -Ofast -ftz -finline -fiopenmp -qmkl=sequential -g -fno-omit-frame-pointer -fopenmp -MT src/qmckl_mo.lo -MD -MP -MF src/.deps/qmckl_mo.Tpo -c src/qmckl_mo.c -fPIC -D PIC -o src/.libs/qmckl_mo.o -fveclib=SVML -fheinous-gnu-extensions | bench_mos: libqmckl.so.0.0.0: clang based Intel(R) oneAPI DPC++/C++ Compiler 2023.0.0 (2023.0.0.20221201) --intel -I . -I /home/kcamus/comparative/qmckl/qmckl -I ./include -I ./src -I ./include -I /home/kcamus/comparative/qmckl/qmckl/src -I /home/kcamus/comparative/qmckl/qmckl/include -I /home/kcamus/comparative/qmckl/qmckl/share/qmckl/test_data/ -I /home/kcamus/comparative/qmckl/trexio/_install/include -D HAVE_CONFIG_H -D QMCKL_TEST_DIR=\"/home/kcamus/comparative/qmckl/qmckl/share/qmckl/test_data/\" -flto=full -x CORE-AVX512 -qopt-zmm-usage=high -Ofast -ftz -finline -fiopenmp -qmkl=sequential -g -fno-omit-frame-pointer -fopenmp -MT src/qmckl_mo.lo -MD -MP -MF src/.deps/qmckl_mo.Tpo -c src/qmckl_mo.c -fPIC -D PIC -o src/.libs/qmckl_mo.o -fveclib=SVML -fheinous-gnu-extensions
Number of processes observed | 1 | same as r0 | same as r0 | same as r0 | same as r0 |
Number of threads observed | 52 | same as r0 | same as r0 | same as r0 | same as r0 |
Frequency Driver | intel_cpufreq | same as r0 | same as r0 | same as r0 | same as r0 |
Frequency Governor | performance | same as r0 | same as r0 | same as r0 | same as r0 |
Huge Pages | always | same as r0 | same as r0 | same as r0 | same as r0 |
Hyperthreading | off | same as r0 | same as r0 | same as r0 | same as r0 |
Number of sockets | 2 | same as r0 | same as r0 | same as r0 | same as r0 |
Number of cores per socket | 26 | same as r0 | same as r0 | same as r0 | same as r0 |
MAQAO version | 2.19.0 | same as r0 | same as r0 | same as r0 | same as r0 |
MAQAO build | b37ee48e971324d4eaf9054a5a16e1bfd5003152::20240201-180403 | same as r0 | same as r0 | same as r0 | same as r0 |
Comments | | same as r0 | same as r0 | same as r0 | same as r0
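
The Compilation Options cells for libqmckl.so.0.0.0 differ between runs in only a few flags, so the differences are easier to read as set differences against r0. The following is a minimal sketch, assuming only the flags that vary are transcribed by hand from the table (include paths, defines, dependency options and the flags shared by all runs are omitted); `flag_diff` is a hypothetical helper and not part of MAQAO.

```python
# Flags that vary between runs, transcribed by hand from the Compilation Options row.
# Options that are identical in every run (-I, -D, -ftz, -g, -fopenmp, ...) are left out.
varying_flags = {
    "r0": {"-march=native", "-Ofast"},
    "r1": {"-march=native", "-O2"},
    "r2": {"-x CORE-AVX512", "-Ofast"},
    "r3": {"-x CORE-AVX512", "-qopt-zmm-usage=high", "-Ofast"},
    "r4": {"-flto=full", "-x CORE-AVX512", "-qopt-zmm-usage=high", "-Ofast"},
}


def flag_diff(flags, baseline="r0"):
    """For each run, list the flags added to and removed from the baseline set."""
    base = flags[baseline]
    return {
        run: {"added": sorted(f - base), "removed": sorted(base - f)}
        for run, f in flags.items()
    }


diff = flag_diff(varying_flags)
for run, d in diff.items():
    print(run, "added:", d["added"], "removed:", d["removed"])
```

Read this way, r1 only swaps -Ofast for -O2, r2 replaces -march=native with -x CORE-AVX512, r3 adds -qopt-zmm-usage=high on top of r2, and r4 adds -flto=full on top of r3.
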