| | | | | | | requested parallelism | | | | | | | | walltime sum (s) | | | | | | | | nb instances | | | | | | | | any sync average per thread time (s) | | | | | | | | any wait average per thread time (s) | | | | | | | | parallelism overhead (%) | | | | | | | | local speedup if perfectly balanced | | | | | | | | global speedup if perfectly balanced | | | | | | | |
start addr | function name | source location | level | ancestor thread num | invoker | parallel or teams | 2x1 | 2x2 | 2x4 | 2x8 | 2x16 | 2x32 | 2x64 | 2x96 | 2x1 | 2x2 | 2x4 | 2x8 | 2x16 | 2x32 | 2x64 | 2x96 | 2x1 | 2x2 | 2x4 | 2x8 | 2x16 | 2x32 | 2x64 | 2x96 | 2x1 | 2x2 | 2x4 | 2x8 | 2x16 | 2x32 | 2x64 | 2x96 | 2x1 | 2x2 | 2x4 | 2x8 | 2x16 | 2x32 | 2x64 | 2x96 | 2x1 | 2x2 | 2x4 | 2x8 | 2x16 | 2x32 | 2x64 | 2x96 | 2x1 | 2x2 | 2x4 | 2x8 | 2x16 | 2x32 | 2x64 | 2x96 | 2x1 | 2x2 | 2x4 | 2x8 | 2x16 | 2x32 | 2x64 | 2x96 |
exec:0x20884e | main | miniqmc.cpp:411 | 0 | 0 | runtime | parallel | 1 | 2 | 4 | 8 | 16 | 32 | 64 | 96 | 29.431 | 30.418 | 32.637 | 34.588 | 48.568 | 68.967 | 135.268 | 192.357 | 5.00 | 5.00 | 5.00 | 5.00 | 5.00 | 5.00 | 5.00 | 5.00 | 1.60 E-6 | 0.201 | 1.097 | 1.122 | 6.975 | 7.277 | 17.950 | 17.789 | 0.0 | 0.201 | 1.097 | 1.122 | 6.975 | 7.277 | 17.950 | 17.789 | 0.00 | 0.67 | 3.35 | 3.24 | 14.4 | 10.6 | 13.3 | 9.25 | 1.000 | 1.007 | 1.035 | 1.033 | 1.168 | 1.118 | 1.153 | 1.102 | 1.000 | 1.006 | 1.030 | 1.029 | 1.147 | 1.105 | 1.142 | 1.095 |
exec:0x20876b | main | miniqmc.cpp:378 | 0 | 0 | runtime | parallel | 1 | 2 | 4 | 8 | 16 | 32 | 64 | 96 | 4.265 | 4.308 | 4.413 | 4.456 | 4.886 | 5.884 | 8.692 | 11.881 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.0 | 18.2 E-3 | 56.0 E-3 | 74.3 E-3 | 0.179 | 0.344 | 0.844 | 1.245 | 0.0 | 18.2 E-3 | 56.0 E-3 | 74.3 E-3 | 0.179 | 0.344 | 0.844 | 1.245 | 0 | 0.42 | 1.27 | 1.66 | 3.66 | 5.85 | 9.71 | 10.5 | 1.000 | 1.004 | 1.013 | 1.017 | 1.038 | 1.062 | 1.108 | 1.117 | 1.000 | 1.001 | 1.001 | 1.002 | 1.003 | 1.004 | 1.006 | 1.006 |
libqmcwfs.so:0x19ba4 | qmcplusplus::BsplineAllocator<double, 64ul, qmcplusplus::Mal... | BsplineAllocator.hpp:171 | 0 | 0 | runtime | parallel | 1 | 2 | 4 | 8 | 16 | 32 | 64 | 96 | 0.145 | 76.3 E-3 | 41.8 E-3 | 21.6 E-3 | 16.3 E-3 | 13.1 E-3 | 16.0 E-3 | 18.4 E-3 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.0 | 565 E-6 | 1.35 E-3 | 585 E-6 | 2.61 E-3 | 971 E-6 | 1.63 E-3 | 694 E-6 | 0.0 | 565 E-6 | 1.35 E-3 | 585 E-6 | 2.61 E-3 | 971 E-6 | 1.63 E-3 | 694 E-6 | 0 | 0.75 | 3.23 | 2.70 | 16.1 | 7.39 | 10.2 | 3.77 | 1.000 | 1.008 | 1.033 | 1.028 | 1.191 | 1.080 | 1.114 | 1.039 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
exec:0x208887 | main | stl_vector.h:1499 | 0 | 0 | runtime | parallel | 1 | 2 | 4 | 8 | 16 | 32 | 64 | 96 | 3.61 E-3 | 7.78 E-3 | 15.3 E-3 | 30.8 E-3 | 64.4 E-3 | 0.123 | 0.245 | 0.378 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.0 | 337 E-6 | 1.63 E-3 | 2.99 E-3 | 7.64 E-3 | 9.14 E-3 | 22.8 E-3 | 29.8 E-3 | 0.0 | 337 E-6 | 1.63 E-3 | 2.99 E-3 | 7.64 E-3 | 9.14 E-3 | 22.8 E-3 | 29.8 E-3 | 0 | 4.47 | 10.6 | 9.69 | 11.9 | 7.40 | 9.28 | 7.89 | 1.000 | 1.047 | 1.119 | 1.107 | 1.135 | 1.080 | 1.102 | 1.086 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
libqmcwfs.so:0x49efe | qmcplusplus::DelayedUpdate<double, double>::updateInvMat(qmc... | OpenMP.h:43 | 1 | 0 | runtime | parallel | 1 | 2 | 4 | 8 | 16 | 32 | 64 | 96 | 1.05 E-3 | 1.11 E-3 | 1.26 E-3 | 1.31 E-3 | 1.46 E-3 | 1.72 E-3 | 2.35 E-3 | 3.27 E-3 | 485 | 482 | 484 | 484 | 484 | 484 | 485 | 485 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
libqmcwfs.so:0x4ade1 | qmcplusplus::DiracMatrix<double, double>::invert_transpose(q... | OpenMP.h:43 | 1 | 0 | runtime | parallel | 1 | 2 | 4 | 8 | 16 | 32 | 64 | 96 | 13.1 E-6 | 13.5 E-6 | 20.5 E-6 | 20.3 E-6 | 27.1 E-6 | 26.0 E-6 | 54.7 E-6 | 67.3 E-6 | 2.00 | 2.00 | 2.00 | 2.00 | 2.00 | 2.00 | 2.00 | 2.00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
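
The derived columns appear to follow from the measured ones. The sketch below is an inference from the reported numbers, not the profiler's documented definition: it assumes that the parallelism overhead is roughly the average per-thread wait time divided by the walltime sum, and that the local speedup if perfectly balanced is 1 / (1 - overhead). The function names are hypothetical. Some rows (for example stl_vector.h:1499 at 2x2) match the first relation only approximately, so treat it as a reading aid rather than an exact formula.

```python
def parallelism_overhead_pct(walltime_sum_s: float, wait_avg_s: float) -> float:
    """Assumed relation: share of the region's wall time spent waiting, in percent."""
    if walltime_sum_s <= 0.0:
        return 0.0
    return 100.0 * wait_avg_s / walltime_sum_s


def local_speedup_if_balanced(overhead_pct: float) -> float:
    """Assumed relation: speedup of the region alone if the wait time disappeared."""
    return 1.0 / (1.0 - overhead_pct / 100.0)


# Row miniqmc.cpp:411 at 2x16: walltime sum 48.568 s, wait average 6.975 s.
overhead = parallelism_overhead_pct(48.568, 6.975)
speedup = local_speedup_if_balanced(overhead)

# The table reports 14.4 % and 1.168 for this cell.
print(f"overhead = {overhead:.1f} %, local speedup = {speedup:.3f}")
```

The global speedup column cannot be reproduced the same way from this table alone, since it presumably also depends on the total application time, which is not listed here.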