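start addr | function name | source location | level | ancestor thread num | invoker | parallel or teams | requested parallelism (m1o1) | requested parallelism (m1o2) | requested parallelism (m1o4) | requested parallelism (m1o8) | requested parallelism (m1o16) | requested parallelism (m1o26) | requested parallelism (m1o52) | walltime sum (s) (m1o1) | walltime sum (s) (m1o2) | walltime sum (s) (m1o4) | walltime sum (s) (m1o8) | walltime sum (s) (m1o16) | walltime sum (s) (m1o26) | walltime sum (s) (m1o52) | nb instances (m1o1) | nb instances (m1o2) | nb instances (m1o4) | nb instances (m1o8) | nb instances (m1o16) | nb instances (m1o26) | nb instances (m1o52) | any sync average per thread time (s) (m1o1) | any sync average per thread time (s) (m1o2) | any sync average per thread time (s) (m1o4) | any sync average per thread time (s) (m1o8) | any sync average per thread time (s) (m1o16) | any sync average per thread time (s) (m1o26) | any sync average per thread time (s) (m1o52) | any wait average per thread time (s) (m1o1) | any wait average per thread time (s) (m1o2) | any wait average per thread time (s) (m1o4) | any wait average per thread time (s) (m1o8) | any wait average per thread time (s) (m1o16) | any wait average per thread time (s) (m1o26) | any wait average per thread time (s) (m1o52) | parallelism overhead (%) (m1o1) | parallelism overhead (%) (m1o2) | parallelism overhead (%) (m1o4) | parallelism overhead (%) (m1o8) | parallelism overhead (%) (m1o16) | parallelism overhead (%) (m1o26) | parallelism overhead (%) (m1o52) | local speedup if perfectly balanced (m1o1) | local speedup if perfectly balanced (m1o2) | local speedup if perfectly balanced (m1o4) | local speedup if perfectly balanced (m1o8) | local speedup if perfectly balanced (m1o16) | local speedup if perfectly balanced (m1o26) | local speedup if perfectly balanced (m1o52) | global speedup if perfectly balanced (m1o1) | global speedup if perfectly balanced (m1o2) | global speedup if perfectly balanced (m1o4) | global speedup if perfectly balanced (m1o8) | global speedup if perfectly balanced (m1o16) | global speedup if perfectly balanced (m1o26) | global speedup if perfectly balanced (m1o52) |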
libqmckl.so.0.0.0:0x42657 | qmckl_compute_mo_basis_mo_vgl_hpc | qmckl_mo.c:1246 | 0 | 0 | runtime | parallel | 1 | 2 | 4 | 8 | 16 | 26 | 52 | 32.657 | 14.726 | 8.601 | 3.966 | 1.978 | 1.187 | 0.696 | 4.00 | 4.00 | 4.00 | 4.00 | 4.00 | 4.00 | 4.00 | 19.3 E-6 | 13.2 E-3 | 0.943 | 0.191 | 0.103 | 36.6 E-3 | 67.5 E-3 | 8.45 E-6 | 13.2 E-3 | 0.943 | 0.191 | 0.103 | 36.6 E-3 | 67.5 E-3 | 0.00 | 0.09 | 11.0 | 4.82 | 5.21 | 3.09 | 9.69 | 1.000 | 1.001 | 1.123 | 1.051 | 1.055 | 1.032 | 1.107 | 1.000 | 1.001 | 1.063 | 1.023 | 1.019 | 1.009 | 1.020 |
libqmckl.so.0.0.0:0x3d2f3 | qmckl_compute_mo_basis_mo_value_hpc | qmckl_mo.c:931 | 0 | 0 | runtime | parallel | 1 | 2 | 4 | 8 | 16 | 26 | 52 | 14.965 | 5.638 | 3.762 | 1.470 | 0.772 | 0.496 | 0.286 | 3.00 | 3.00 | 3.00 | 3.00 | 3.00 | 3.00 | 3.00 | 5.43 E-6 | 8.74 E-3 | 0.729 | 0.152 | 77.1 E-3 | 37.9 E-3 | 9.01 E-3 | 822 E-9 | 8.74 E-3 | 0.729 | 0.152 | 77.1 E-3 | 37.9 E-3 | 9.00 E-3 | 0.00 | 0.16 | 19.4 | 10.3 | 9.99 | 7.65 | 3.15 | 1.000 | 1.002 | 1.240 | 1.115 | 1.111 | 1.083 | 1.032 | 1.000 | 1.000 | 1.048 | 1.019 | 1.014 | 1.009 | 1.003 |
libqmckl.so.0.0.0:0x15193 | qmckl_compute_ao_vgl_hpc_gaussian | qmckl_ao.c:3283 | 0 | 0 | runtime | parallel | 1 | 2 | 4 | 8 | 16 | 26 | 52 | 2.949 | 1.493 | 0.875 | 0.474 | 0.297 | 0.206 | 0.176 | 4.00 | 4.00 | 4.00 | 4.00 | 4.00 | 4.00 | 4.00 | 13.1 E-6 | 1.46 E-3 | 97.3 E-3 | 69.1 E-3 | 82.3 E-3 | 68.5 E-3 | 36.7 E-3 | 1.49 E-6 | 1.43 E-3 | 97.2 E-3 | 69.1 E-3 | 82.2 E-3 | 68.4 E-3 | 36.7 E-3 | 0.00 | 0.10 | 11.1 | 14.6 | 27.7 | 33.2 | 20.9 | 1.000 | 1.001 | 1.125 | 1.171 | 1.383 | 1.497 | 1.264 | 1.000 | 1.000 | 1.006 | 1.008 | 1.015 | 1.016 | 1.011 |
libqmckl.so.0.0.0:0x1ca30 | qmckl_compute_ao_value_hpc_gaussian | qmckl_ao.c:2781 | 0 | 0 | runtime | parallel | 1 | 2 | 4 | 8 | 16 | 26 | 52 | 1.032 | 0.519 | 0.268 | 0.135 | 71.7 E-3 | 44.5 E-3 | 35.8 E-3 | 3.00 | 3.00 | 3.00 | 3.00 | 3.00 | 3.00 | 3.00 | 3.40 E-6 | 19.0 E-6 | 32.0 E-6 | 37.0 E-6 | 39.9 E-6 | 56.4 E-6 | 75.2 E-6 | 558 E-9 | 16.4 E-6 | 28.6 E-6 | 33.9 E-6 | 37.4 E-6 | 53.9 E-6 | 70.9 E-6 | 0.00 | 0.00 | 0.01 | 0.03 | 0.06 | 0.13 | 0.21 | 1.000 | 1.000 | 1.000 | 1.000 | 1.001 | 1.001 | 1.002 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
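
The table reports wall time per run configuration but not the scaling actually achieved. As a rough sketch, observed speedup and parallel efficiency can be derived from the "walltime sum (s)" and "requested parallelism" columns. The helper name `scaling` and the choice of the first configuration (m1o1) as baseline are assumptions made for illustration; they are not part of the report. The comparison also assumes that the wall-time sums are comparable across runs, which the constant "nb instances" values suggest.

```python
# Requested parallelism for each run configuration (m1o1 ... m1o52),
# copied from the "requested parallelism" columns of the table.
THREADS = [1, 2, 4, 8, 16, 26, 52]

# "walltime sum (s)" per parallel region, copied from the table rows.
WALLTIME = {
    "qmckl_compute_mo_basis_mo_vgl_hpc":
        [32.657, 14.726, 8.601, 3.966, 1.978, 1.187, 0.696],
    "qmckl_compute_mo_basis_mo_value_hpc":
        [14.965, 5.638, 3.762, 1.470, 0.772, 0.496, 0.286],
    "qmckl_compute_ao_vgl_hpc_gaussian":
        [2.949, 1.493, 0.875, 0.474, 0.297, 0.206, 0.176],
    "qmckl_compute_ao_value_hpc_gaussian":
        [1.032, 0.519, 0.268, 0.135, 71.7e-3, 44.5e-3, 35.8e-3],
}


def scaling(times, threads):
    """Return (speedup, efficiency) lists relative to the first entry.

    Hypothetical helper: speedup is T(first) / T(n), and efficiency is
    that speedup divided by the requested parallelism n.
    """
    base = times[0]
    speedup = [base / t for t in times]
    efficiency = [s / n for s, n in zip(speedup, threads)]
    return speedup, efficiency


if __name__ == "__main__":
    for name, times in WALLTIME.items():
        speedup, efficiency = scaling(times, THREADS)
        print(name)
        for n, s, e in zip(THREADS, speedup, efficiency):
            print(f"  parallelism {n:>2}: speedup {s:6.2f}, efficiency {e:5.2f}")
```

Under these assumptions, `qmckl_compute_mo_basis_mo_vgl_hpc` reaches a speedup of roughly 47 at a requested parallelism of 52, while `qmckl_compute_ao_vgl_hpc_gaussian` reaches only about 17, which is consistent with the higher parallelism overhead the table reports for that region.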