start addr | function name | source location | level | ancestor thread num | invoker | parallel or teams | run | requested parallelism | walltime sum (s) | nb instances | any sync average per thread time (s) | any wait average per thread time (s) | parallelism overhead (%) | local speedup if perfectly balanced | global speedup if perfectly balanced |
exec:0x4014a0 | main | main.c:139 | 0 | 0 | runtime | parallel | 1x1 | 1 | 1.98 E3 | 4.98 E3 | 0.0 | 0.0 | 0 | 1.000 | 1.000 |
exec:0x4014a0 | main | main.c:139 | 0 | 0 | runtime | parallel | 1x2 | 2 | 993.911 | 4.98 E3 | 0.984 | 0.983 | 0.10 | 1.001 | 1.001 |
exec:0x4014a0 | main | main.c:139 | 0 | 0 | runtime | parallel | 1x4 | 4 | 496.980 | 4.98 E3 | 0.598 | 0.598 | 0.12 | 1.001 | 1.001 |
exec:0x4014a0 | main | main.c:139 | 0 | 0 | runtime | parallel | 1x8 | 8 | 248.714 | 4.98 E3 | 0.417 | 0.416 | 0.17 | 1.002 | 1.002 |
exec:0x4014a0 | main | main.c:139 | 0 | 0 | runtime | parallel | 1x16 | 16 | 124.827 | 4.98 E3 | 0.489 | 0.489 | 0.39 | 1.004 | 1.004 |
exec:0x4014a0 | main | main.c:139 | 0 | 0 | runtime | parallel | 1x32 | 32 | 62.554 | 4.98 E3 | 0.372 | 0.371 | 0.59 | 1.006 | 1.006 |
exec:0x4014a0 | main | main.c:139 | 0 | 0 | runtime | parallel | 1x64 | 64 | 31.452 | 4.98 E3 | 0.315 | 0.314 | 1.00 | 1.010 | 1.010 |
exec:0x4014a0 | main | main.c:139 | 0 | 0 | runtime | parallel | 1x112 | 112 | 18.203 | 4.98 E3 | 0.351 | 0.351 | 1.93 | 1.020 | 1.019 |
exec:0x4013dd | main | main.c:97 | 0 | 0 | runtime | parallel | 1x1 | 1 | 10.5 E-6 | 1.00 | 0.0 | 0.0 | 0 | 1.000 | 1.000 |
exec:0x4013dd | main | main.c:97 | 0 | 0 | runtime | parallel | 1x2 | 2 | 219 E-6 | 1.00 | 40.6 E-6 | 40.4 E-6 | 18.5 | 1.227 | 1.000 |
exec:0x4013dd | main | main.c:97 | 0 | 0 | runtime | parallel | 1x4 | 4 | 519 E-6 | 1.00 | 78.6 E-6 | 78.4 E-6 | 15.1 | 1.178 | 1.000 |
exec:0x4013dd | main | main.c:97 | 0 | 0 | runtime | parallel | 1x8 | 8 | 731 E-6 | 1.00 | 70.4 E-6 | 70.2 E-6 | 9.63 | 1.107 | 1.000 |
exec:0x4013dd | main | main.c:97 | 0 | 0 | runtime | parallel | 1x16 | 16 | 1.64 E-3 | 1.00 | 85.0 E-6 | 84.9 E-6 | 5.19 | 1.055 | 1.000 |
exec:0x4013dd | main | main.c:97 | 0 | 0 | runtime | parallel | 1x32 | 32 | 2.50 E-3 | 1.00 | 228 E-6 | 228 E-6 | 9.12 | 1.100 | 1.000 |
exec:0x4013dd | main | main.c:97 | 0 | 0 | runtime | parallel | 1x64 | 64 | 4.19 E-3 | 1.00 | 286 E-6 | 286 E-6 | 6.84 | 1.073 | 1.000 |
exec:0x4013dd | main | main.c:97 | 0 | 0 | runtime | parallel | 1x112 | 112 | 7.03 E-3 | 1.00 | 296 E-6 | 296 E-6 | 4.21 | 1.044 | 1.000 |
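
The table reports walltime per run but not the observed speedup between runs. As a rough sketch, the snippet below derives observed speedup and parallel efficiency for the region at main.c:139 from its walltime sum (s) values. It assumes that the requested parallelism equals the number of threads actually used and that the runs are directly comparable (the number of instances is the same, 4.98 E3, in every run). The helper names `observed_speedup` and `parallel_efficiency` are illustrative and not part of any tool. The 1x1 walltime is printed as 1.98 E3, so it is rounded to three significant digits and the derived figures are approximate.

```python
# Walltime sums (s) for the region at main.c:139, copied from the table.
# The 1x1 entry appears as "1.98 E3" (three significant digits), so every
# value derived from it is approximate.
threads = [1, 2, 4, 8, 16, 32, 64, 112]
walltime_s = [1.98e3, 993.911, 496.980, 248.714, 124.827, 62.554, 31.452, 18.203]


def observed_speedup(walltimes):
    """Speedup of each run relative to the first (single-thread) run."""
    base = walltimes[0]
    return [base / t for t in walltimes]


def parallel_efficiency(speedups, thread_counts):
    """Speedup divided by thread count; 1.0 would be ideal linear scaling."""
    return [s / n for s, n in zip(speedups, thread_counts)]


speedups = observed_speedup(walltime_s)
efficiencies = parallel_efficiency(speedups, threads)

for n, s, e in zip(threads, speedups, efficiencies):
    print(f"{n:>3} threads: speedup {s:7.2f}, efficiency {e:6.1%}")
```

Applying the same functions to the main.c:97 region gives speedups well below 1, because its walltime sum grows from 10.5 E-6 s at 1x1 to 7.03 E-3 s at 1x112.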