| Metric | r0 | r1 | r2 | r3 | r4 |
|---|---|---|---|---|---|
| Total Time (s) | 50.06 | 47.21 | 48.12 | 48.60 | 48.17 |
| Profiled Time (s) | 48.79 | 45.90 | 46.82 | 47.32 | 46.85 |
| Time in analyzed loops (%) | 8.11 | 9.32 | 9.15 | 8.71 | 9.21 |
| Time in analyzed innermost loops (%) | 1.61 | 1.82 | 1.82 | 1.34 | 1.15 |
| Time in user code (%) | 8.17 | 9.39 | 9.21 | 8.76 | 9.26 |
| Compilation Options Score (%) | 69.6 | 72.5 | 72.6 | 71.0 | 72.5 |
| Array Access Efficiency (%) | 70.3 | 71.3 | 72.5 | 83.6 | 75.9 |
| Potential Speedups | | | | | |
| Perfect Flow Complexity | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| Perfect OpenMP + MPI + Pthread | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| Perfect OpenMP + MPI + Pthread + Perfect Load Distribution | 13.2 | 11.9 | 11.8 | 12.6 | 12.0 |
| No Scalar Integer: Potential Speedup | 1.03 | 1.04 | 1.04 | 1.03 | 1.03 |
| No Scalar Integer: Nb Loops to get 80% | 3 | 3 | 3 | 3 | 3 |
| FP Vectorised: Potential Speedup | 1.03 | 1.03 | 1.04 | 1.04 | 1.02 |
| FP Vectorised: Nb Loops to get 80% | 3 | 3 | 3 | 3 | 4 |
| Fully Vectorised: Potential Speedup | 1.07 | 1.08 | 1.08 | 1.04 | 1.04 |
| Fully Vectorised: Nb Loops to get 80% | 6 | 6 | 5 | 6 | 5 |
| Only FP Arithmetic: Potential Speedup | 1.06 | 1.07 | 1.07 | 1.07 | 1.07 |
| Only FP Arithmetic: Nb Loops to get 80% | 4 | 4 | 4 | 3 | 3 |
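
The Total Time (s) row can be turned into a speedup relative to r0 with a short calculation. The sketch below is illustrative only: the function name `speedup_vs_baseline` and the choice of r0 as the baseline are assumptions, not part of the MAQAO report, and the times are copied from the table above.

```python
# Total Time (s) per run, copied from the metrics table above.
total_time_s = {"r0": 50.06, "r1": 47.21, "r2": 48.12, "r3": 48.60, "r4": 48.17}


def speedup_vs_baseline(times, baseline="r0"):
    """Return baseline_time / run_time for every run (values above 1 mean faster)."""
    base = times[baseline]
    return {run: base / t for run, t in times.items()}


# Hypothetical helper usage: r0 is assumed to be the reference run.
speedups = speedup_vs_baseline(total_time_s)
for run, s in speedups.items():
    print(f"{run}: {s:.3f}x")
```
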
| Source Object | Issue |
|---|---|
| bench_aos | |
| bench_aos: (unnamed source file) | -g is missing for some functions (possibly ones added by the compiler); it is needed for more accurate reports. Other recommended flags are: -O2/-O3, -march=(target) |
| libqmckl.so.0.0.0 | |
| libqmckl.so.0.0.0: qmckl_ao.c | |
| | r0 | r1 | r2 | r3 | r4 |
|---|---|---|---|---|---|
| Experiment Name | m1o52 | m1o52 | m1o52 | m1o52 | m1o52 |
| Application | ./../qmckl_bench/build/bench_aos | same as r0 | same as r0 | same as r0 | same as r0 |
| Timestamp | 2024-02-06 15:47:46 | 2024-02-06 16:16:49 | 2024-02-06 16:22:20 | 2024-02-06 16:31:53 | 2024-02-06 16:37:13 |
| Experiment Type | OpenMP; | same as r0 | same as r0 | same as r0 | same as r0 |
| Machine | skylake | same as r0 | same as r0 | same as r0 | same as r0 |
| Architecture | x86_64 | same as r0 | same as r0 | same as r0 | same as r0 |
| Micro Architecture | SKYLAKE | same as r0 | same as r0 | same as r0 | same as r0 |
| Model Name | Intel(R) Xeon(R) Platinum 8170 CPU @ 2.10GHz | same as r0 | same as r0 | same as r0 | same as r0 |
| Cache Size | 36608 KB | same as r0 | same as r0 | same as r0 | same as r0 |
| Number of Cores | 26 | same as r0 | same as r0 | same as r0 | same as r0 |
| Maximal Frequency | 2.1 GHz | same as r0 | same as r0 | same as r0 | same as r0 |
| OS Version | Linux 6.5.7-arch1-1 #1 SMP PREEMPT_DYNAMIC Tue, 10 Oct 2023 21:10:21 +0000 | same as r0 | same as r0 | same as r0 | same as r0 |
| Architecture used during static analysis | x86_64 | same as r0 | same as r0 | same as r0 | same as r0 |
| Micro Architecture used during static analysis | SKYLAKE | same as r0 | same as r0 | same as r0 | same as r0 |
| Compilation Options | libqmckl.so.0.0.0: clang based Intel(R) oneAPI DPC++/C++ Compiler 2023.0.0 (2023.0.0.20221201) --intel -I . -I /home/kcamus/comparative/qmckl/qmckl -I ./include -I ./src -I ./include -I /home/kcamus/comparative/qmckl/qmckl/src -I /home/kcamus/comparative/qmckl/qmckl/include -I /home/kcamus/comparative/qmckl/qmckl/share/qmckl/test_data/ -I /home/kcamus/comparative/qmckl/trexio/_install/include -D HAVE_CONFIG_H -D QMCKL_TEST_DIR=\"/home/kcamus/comparative/qmckl/qmckl/share/qmckl/test_data/\" -march=native -Ofast -ftz -finline -fiopenmp -qmkl=sequential -g -fno-omit-frame-pointer -fopenmp -MT src/qmckl_ao.lo -MD -MP -MF src/.deps/qmckl_ao.Tpo -c src/qmckl_ao.c -fPIC -D PIC -o src/.libs/qmckl_ao.o -fveclib=SVML -fheinous-gnu-extensions bench_aos: | libqmckl.so.0.0.0: clang based Intel(R) oneAPI DPC++/C++ Compiler 2023.0.0 (2023.0.0.20221201) --intel -I . -I /home/kcamus/comparative/qmckl/qmckl -I ./include -I ./src -I ./include -I /home/kcamus/comparative/qmckl/qmckl/src -I /home/kcamus/comparative/qmckl/qmckl/include -I /home/kcamus/comparative/qmckl/qmckl/share/qmckl/test_data/ -I /home/kcamus/comparative/qmckl/trexio/_install/include -D HAVE_CONFIG_H -D QMCKL_TEST_DIR=\"/home/kcamus/comparative/qmckl/qmckl/share/qmckl/test_data/\" -march=native -O2 -ftz -finline -fiopenmp -qmkl=sequential -g -fno-omit-frame-pointer -fopenmp -MT src/qmckl_ao.lo -MD -MP -MF src/.deps/qmckl_ao.Tpo -c src/qmckl_ao.c -fPIC -D PIC -o src/.libs/qmckl_ao.o -fveclib=SVML -fheinous-gnu-extensions bench_aos: | libqmckl.so.0.0.0: clang based Intel(R) oneAPI DPC++/C++ Compiler 2023.0.0 (2023.0.0.20221201) --intel -I . -I /home/kcamus/comparative/qmckl/qmckl -I ./include -I ./src -I ./include -I /home/kcamus/comparative/qmckl/qmckl/src -I /home/kcamus/comparative/qmckl/qmckl/include -I /home/kcamus/comparative/qmckl/qmckl/share/qmckl/test_data/ -I /home/kcamus/comparative/qmckl/trexio/_install/include -D HAVE_CONFIG_H -D QMCKL_TEST_DIR=\"/home/kcamus/comparative/qmckl/qmckl/share/qmckl/test_data/\" -x CORE-AVX512 -Ofast -ftz -finline -fiopenmp -qmkl=sequential -g -fno-omit-frame-pointer -fopenmp -MT src/qmckl_ao.lo -MD -MP -MF src/.deps/qmckl_ao.Tpo -c src/qmckl_ao.c -fPIC -D PIC -o src/.libs/qmckl_ao.o -fveclib=SVML -fheinous-gnu-extensions bench_aos: | libqmckl.so.0.0.0: clang based Intel(R) oneAPI DPC++/C++ Compiler 2023.0.0 (2023.0.0.20221201) --intel -I . -I /home/kcamus/comparative/qmckl/qmckl -I ./include -I ./src -I ./include -I /home/kcamus/comparative/qmckl/qmckl/src -I /home/kcamus/comparative/qmckl/qmckl/include -I /home/kcamus/comparative/qmckl/qmckl/share/qmckl/test_data/ -I /home/kcamus/comparative/qmckl/trexio/_install/include -D HAVE_CONFIG_H -D QMCKL_TEST_DIR=\"/home/kcamus/comparative/qmckl/qmckl/share/qmckl/test_data/\" -x CORE-AVX512 -qopt-zmm-usage=high -Ofast -ftz -finline -fiopenmp -qmkl=sequential -g -fno-omit-frame-pointer -fopenmp -MT src/qmckl_ao.lo -MD -MP -MF src/.deps/qmckl_ao.Tpo -c src/qmckl_ao.c -fPIC -D PIC -o src/.libs/qmckl_ao.o -fveclib=SVML -fheinous-gnu-extensions bench_aos: | libqmckl.so.0.0.0: clang based Intel(R) oneAPI DPC++/C++ Compiler 2023.0.0 (2023.0.0.20221201) --intel -I . -I /home/kcamus/comparative/qmckl/qmckl -I ./include -I ./src -I ./include -I /home/kcamus/comparative/qmckl/qmckl/src -I /home/kcamus/comparative/qmckl/qmckl/include -I /home/kcamus/comparative/qmckl/qmckl/share/qmckl/test_data/ -I /home/kcamus/comparative/qmckl/trexio/_install/include -D HAVE_CONFIG_H -D QMCKL_TEST_DIR=\"/home/kcamus/comparative/qmckl/qmckl/share/qmckl/test_data/\" -flto=full -x CORE-AVX512 -qopt-zmm-usage=high -Ofast -ftz -finline -fiopenmp -qmkl=sequential -g -fno-omit-frame-pointer -fopenmp -MT src/qmckl_ao.lo -MD -MP -MF src/.deps/qmckl_ao.Tpo -c src/qmckl_ao.c -fPIC -D PIC -o src/.libs/qmckl_ao.o -fveclib=SVML -fheinous-gnu-extensions bench_aos: |
| Number of processes observed | 1 | same as r0 | same as r0 | same as r0 | same as r0 |
| Number of threads observed | 52 | same as r0 | same as r0 | same as r0 | same as r0 |
| Frequency Driver | intel_cpufreq | same as r0 | same as r0 | same as r0 | same as r0 |
| Frequency Governor | performance | same as r0 | same as r0 | same as r0 | same as r0 |
| Huge Pages | always | same as r0 | same as r0 | same as r0 | same as r0 |
| Hyperthreading | off | same as r0 | same as r0 | same as r0 | same as r0 |
| Number of sockets | 2 | same as r0 | same as r0 | same as r0 | same as r0 |
| Number of cores per socket | 26 | same as r0 | same as r0 | same as r0 | same as r0 |
| MAQAO version | 2.19.0 | same as r0 | same as r0 | same as r0 | same as r0 |
| MAQAO build | b37ee48e971324d4eaf9054a5a16e1bfd5003152::20240201-180403 | same as r0 | same as r0 | same as r0 | same as r0 |
| Comments | | same as r0 | same as r0 | same as r0 | same as r0 |
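
The five runs share the same include paths and defines and differ only in a handful of optimization flags in the Compilation Options row. The sketch below is one way to list those differences against r0. The helper name `flag_diff` is hypothetical, and only the differing portion of each command line is copied from the table; tokens are split on whitespace, so `-x CORE-AVX512` appears as two tokens.

```python
# Flags that differ between runs, copied from the Compilation Options row.
# The shared include paths, defines, and trailing flags are omitted here.
run_flags = {
    "r0": "-march=native -Ofast",
    "r1": "-march=native -O2",
    "r2": "-x CORE-AVX512 -Ofast",
    "r3": "-x CORE-AVX512 -qopt-zmm-usage=high -Ofast",
    "r4": "-flto=full -x CORE-AVX512 -qopt-zmm-usage=high -Ofast",
}


def flag_diff(flags, baseline="r0"):
    """For each run, list tokens added to and removed from the baseline flags."""
    base = flags[baseline].split()
    diff = {}
    for run, text in flags.items():
        tokens = text.split()
        added = [t for t in tokens if t not in base]
        removed = [t for t in base if t not in tokens]
        diff[run] = (added, removed)
    return diff


# Hypothetical helper usage: r0 is assumed to be the reference run.
for run, (added, removed) in flag_diff(run_flags).items():
    print(run, "added:", added, "removed:", removed)
```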