| Loop Id: 77 | Module: exec | Source: advec_cell_kernel.f90:202-246 [...] | Coverage: 3.75% |
|---|
0x41e120 ORR X5, XZR, X8 |
0x41e124 ORR X9, XZR, X19 |
0x41e128 LDR D21, [X14, X1,LSL #3] [3] |
0x41e12c ORR X7, XZR, X12 |
0x41e130 ORR X6, XZR, X8 |
0x41e134 FCMPE D21, #0 |
0x41e138 B.GT 41e154 |
0x41e13c CMP W27, W20 |
0x41e140 CSEL W7, W27, W20, #13 |
0x41e144 SBFM X6, X7, #0, #31 |
0x41e148 ORR X9, XZR, X6 |
0x41e14c ORR X7, XZR, X8 |
0x41e150 ORR X5, XZR, X12 |
0x41e154 MADD X3, X23, X5, X28 |
0x41e158 FABS D1, D21 |
0x41e15c ADD X2, X11, X1 |
0x41e160 MADD X4, X22, X5, X11 |
0x41e164 ADD X6, X24, X6 |
0x41e168 LDR D2, [X18, X8,LSL #3] [8] |
0x41e16c MADD X17, X22, X7, X2 |
0x41e170 ADD X3, X3, X1 |
0x41e174 LDR D31, [X25, X6,LSL #3] [5] |
0x41e178 MADD X2, X22, X9, X2 |
0x41e17c ADD X4, X4, X1 |
0x41e180 LDR D29, [X13, X3,LSL #3] [2] |
0x41e184 LDR D30, [X0, X4,LSL #3] [6] |
0x41e188 LDR D22, [X0, X17,LSL #3] [6] |
0x41e18c FDIV D4, D2, D31 |
0x41e190 LDR D3, [X0, X2,LSL #3] [6] |
0x41e194 FDIV D5, D1, D29 |
0x41e198 FSUB D6, D22, S30 |
0x41e19c FSUB D7, D30, S3 |
0x41e1a0 FCMPE D6, #0 |
0x41e1a4 FABS D16, D6 |
0x41e1a8 FMUL D27, D7, D6 |
0x41e1ac FABS D17, D7 |
0x41e1b0 FMINNM D18, D16, D17 |
0x41e1b4 FCSEL D26, D25, D0, #9 |
0x41e1b8 FCMPE D27, #0 |
0x41e1bc FADD D28, D5, D0 |
0x41e1c0 FSUB D19, D0, S5 |
0x41e1c4 FSUB D1, D23, S5 |
0x41e1c8 FMUL D2, D4, D28 |
0x41e1cc B.LS 41e1e8 |
0x41e1d0 FMUL D20, D1, D16 |
0x41e1d4 FMADD D31, D2, D17, D20 |
0x41e1d8 FMUL D29, D31, D24 |
0x41e1dc FMINNM D22, D29, D18 |
0x41e1e0 FMUL D4, D22, D19 |
0x41e1e4 FMADD D30, D26, D4, D30 |
0x41e1e8 FMUL D3, D30, D21 |
0x41e1ec ADD X6, X30, X1 |
0x41e1f0 MADD X5, X21, X5, X6 |
0x41e1f4 MADD X7, X21, X7, X6 |
0x41e1f8 MADD X9, X21, X9, X6 |
0x41e1fc STR D3, [X15, X1,LSL #3] [4] |
0x41e200 FABS D21, D3 |
0x41e204 LDR D5, [X10, X5,LSL #3] [1] |
0x41e208 LDR D6, [X10, X7,LSL #3] [1] |
0x41e20c LDR D7, [X10, X9,LSL #3] [1] |
0x41e210 FMUL D16, D5, D3 |
0x41e214 FSUB D17, D6, S5 |
0x41e218 FSUB D27, D5, S7 |
0x41e21c FABS D26, D17 |
0x41e220 FCMPE D17, #0 |
0x41e224 FMUL D28, D27, D17 |
0x41e228 FABS D19, D27 |
0x41e22c FMINNM D18, D26, D19 |
0x41e230 FMUL D1, D1, D26 |
0x41e234 FCSEL D20, D25, D0, #9 |
0x41e238 FCMPE D28, #0 |
0x41e23c FMADD D2, D2, D19, D1 |
0x41e240 B.LS 41ec10 |
0x41e244 LDR D31, [X0, X4,LSL #3] [6] |
0x41e248 FMUL D4, D2, D24 |
0x41e24c FMINNM D30, D4, D18 |
0x41e250 LDR D29, [X13, X3,LSL #3] [2] |
0x41e254 FMUL D22, D31, D29 |
0x41e258 FDIV D21, D21, D22 |
0x41e25c FSUB D6, D0, S21 |
0x41e260 FMUL D7, D6, D30 |
0x41e264 FMADD D5, D20, D7, D5 |
0x41e268 FMUL D3, D5, D3 |
0x41e26c STR D3, [X16, X1,LSL #3] [7] |
0x41e270 ADD X1, X1, #1 |
0x41e274 CMP W26, W1 |
0x41e278 B.GE 41e120 |
0x41ec10 STR D16, [X16, X1,LSL #3] [7] |
0x41ec14 ADD X1, X1, #1 |
0x41ec18 CMP W26, W1 |
0x41ec1c B.GE 41e120 |
/home/eoseret/qaas/qaas_runs/178-231-1255/intel/CloverLeaf1.3-FC/build/CloverLeaf1.3-FC/CloverLeaf_ref/kernels/advec_cell_kernel.f90: 202 - 246 |
-------------------------------------------------------------------------------- |
202: DO j=x_min,x_max |
203: |
204: IF(vol_flux_y(j,k).GT.0.0)THEN |
[...] |
210: upwind =MIN(k+1,y_max+2) |
[...] |
216: sigmat=ABS(vol_flux_y(j,k))/pre_vol(j,donor) |
217: sigma3=(1.0_8+sigmat)*(vertexdy(k)/vertexdy(dif)) |
218: sigma4=2.0_8-sigmat |
219: |
220: sigma=sigmat |
221: sigmav=sigmat |
222: |
223: diffuw=density1(j,donor)-density1(j,upwind) |
224: diffdw=density1(j,downwind)-density1(j,donor) |
225: wind=1.0_8 |
226: IF(diffdw.LE.0.0) wind=-1.0_8 |
227: IF(diffuw*diffdw.GT.0.0)THEN |
228: limiter=(1.0_8-sigmav)*wind*MIN(ABS(diffuw),ABS(diffdw)& |
229: ,one_by_six*(sigma3*ABS(diffuw)+sigma4*ABS(diffdw))) |
230: ELSE |
231: limiter=0.0 |
232: ENDIF |
233: mass_flux_y(j,k)=vol_flux_y(j,k)*(density1(j,donor)+limiter) |
234: |
235: sigmam=ABS(mass_flux_y(j,k))/(density1(j,donor)*pre_vol(j,donor)) |
236: diffuw=energy1(j,donor)-energy1(j,upwind) |
237: diffdw=energy1(j,downwind)-energy1(j,donor) |
238: wind=1.0_8 |
239: IF(diffdw.LE.0.0) wind=-1.0_8 |
240: IF(diffuw*diffdw.GT.0.0)THEN |
241: limiter=(1.0_8-sigmam)*wind*MIN(ABS(diffuw),ABS(diffdw)& |
242: ,one_by_six*(sigma3*ABS(diffuw)+sigma4*ABS(diffdw))) |
243: ELSE |
244: limiter=0.0 |
245: ENDIF |
246: ener_flux(j,k)=mass_flux_y(j,k)*(energy1(j,donor)+limiter) |
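The limiter logic on source lines 223-233 (repeated for energy on lines 236-246) can be restated compactly. The sketch below is a Python paraphrase for illustration only, not the kernel's code. The function names `limiter` and `limited_flux` are hypothetical, the donor, upwind and downwind values are passed in directly because the lines that select them are elided from the listing, and `ONE_BY_SIX = 1/6` is an assumption based on the name `one_by_six`.

```python
# Assumed value: the listing uses `one_by_six` but does not show its definition.
ONE_BY_SIX = 1.0 / 6.0


def limiter(diffuw, diffdw, sigma, sigma3, sigma4):
    """Limiter term as written on source lines 225-232 (and 238-245)."""
    if diffuw * diffdw <= 0.0:
        # ELSE branch: the two differences have opposite signs, or one is zero.
        return 0.0
    # wind is -1 when the downwind difference is not positive, +1 otherwise.
    wind = -1.0 if diffdw <= 0.0 else 1.0
    return (1.0 - sigma) * wind * min(
        abs(diffuw),
        abs(diffdw),
        ONE_BY_SIX * (sigma3 * abs(diffuw) + sigma4 * abs(diffdw)),
    )


def limited_flux(vol_flux, donor, upwind, downwind, sigma, sigma3, sigma4):
    """Flux as on source line 233, given already-selected cell values."""
    diffuw = donor - upwind      # line 223
    diffdw = downwind - donor    # line 224
    return vol_flux * (donor + limiter(diffuw, diffdw, sigma, sigma3, sigma4))


if __name__ == "__main__":
    # Same-sign differences: the limiter contributes to the flux.
    print(limited_flux(2.0, 1.0, 0.5, 2.0, 0.5, 1.5, 1.5))
    # Opposite-sign differences: the limiter is zero and the flux is vol_flux * donor.
    print(limited_flux(2.0, 1.0, 2.0, 2.0, 0.5, 1.5, 1.5))
```

In the assembly listing, the `MIN` and `wind` selections appear to map onto `FMINNM` and `FCSEL`, while the `IF(diffuw*diffdw.GT.0.0)` test remains a conditional branch (`B.LS`).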
| Coverage (%) | Name | Source Location | Module |
|---|---|---|---|
| 98.42 | omp_fulfill_event | | libgomp.so.1.0.0 |
| | start_thread | | libc.so.6 |
| | thread_start | | libc.so.6 |
| Metric | Value |
|---|---|
| CQA speedup if no scalar integer | 2.32 |
| CQA speedup if FP arith vectorized | 1.78 |
| CQA speedup if fully vectorized | 4.00 - 2.29 |
| CQA speedup if no inter-iteration dependency | NA |
| CQA speedup if next bottleneck killed | 1.05 |
| Bottlenecks | P6, P7, P8, P9 |
| Function | __advec_cell_kernel_module_MOD_advec_cell_kernel._omp_fn.0 |
| Source | advec_cell_kernel.f90:202-204,advec_cell_kernel.f90:210-210,advec_cell_kernel.f90:216-246 |
| Source loop unroll info | not unrolled or unrolled with no peel/tail loop |
| Source loop unroll confidence level | max |
| Unroll/vectorization loop type | NA |
| Unroll factor | NA |
| CQA cycles | 12.00 |
| CQA cycles if no scalar integer | 5.17 |
| CQA cycles if FP arith vectorized | 6.75 |
| CQA cycles if fully vectorized | 3.00 - 5.24 |
| Front-end cycles | 11.38 |
| P0 cycles | 2.50 |
| P1 cycles | 2.50 |
| P2 cycles | 6.50 |
| P3 cycles | 6.25 |
| P4 cycles | 7.00 |
| P5 cycles | 6.25 |
| P6 cycles | 12.00 |
| P7 cycles | 12.00 |
| P8 cycles | 12.00 |
| P9 cycles | 12.00 |
| P10 cycles | 5.17 |
| P11 cycles | 4.83 |
| P12 cycles | 5.00 |
| P13 cycles | 0.00 |
| P14 cycles | 0.00 |
| DIV/SQRT cycles | 5.25 - 10.49 |
| Inter-iter dependencies cycles | NA |
| FE+BE cycles (UFS) | NA |
| Stall cycles (UFS) | NA |
| Nb insns | 91.00 |
| Nb uops | 91.00 |
| Nb loads | NA |
| Nb stores | 3.00 |
| Nb stack references | 0.00 |
| FLOP/cycle | 2.67 |
| Nb FLOP add-sub | 8.00 |
| Nb FLOP mul | 13.00 |
| Nb FLOP fma | 4.00 |
| Nb FLOP div | 3.00 |
| Nb FLOP rcp | 0.00 |
| Nb FLOP sqrt | 0.00 |
| Nb FLOP rsqrt | 0.00 |
| Bytes/cycle | 0.00 |
| Bytes prefetched | 0.00 |
| Bytes loaded | 0.00 |
| Bytes stored | 0.00 |
| Stride 0 | NA |
| Stride 1 | NA |
| Stride n | NA |
| Stride unknown | NA |
| Stride indirect | NA |
| Vectorization ratio all | 0.00 |
| Vectorization ratio load | 0.00 |
| Vectorization ratio store | 0.00 |
| Vectorization ratio mul | 0.00 |
| Vectorization ratio add_sub | 0.00 |
| Vectorization ratio fma | 0.00 |
| Vectorization ratio div_sqrt | 0.00 |
| Vectorization ratio other | 0.00 |
| Vector-efficiency ratio all | 24.60 |
| Vector-efficiency ratio load | 25.00 |
| Vector-efficiency ratio store | 25.00 |
| Vector-efficiency ratio mul | 25.00 |
| Vector-efficiency ratio add_sub | 25.00 |
| Vector-efficiency ratio fma | 25.00 |
| Vector-efficiency ratio div_sqrt | 25.00 |
| Vector-efficiency ratio other | 23.68 |
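The speedup rows in this table are consistent with simple ratios of the cycle rows. The following Python sketch reproduces them from the figures reported above. It is a reading aid, not a description of how CQA computes its estimates. Two points are assumptions: that each FMA counts as two FLOPs, and that the "next bottleneck" after P6-P9 is the front end at 11.38 cycles. Variable names are illustrative.

```python
# Figures copied from the metrics table.
cqa_cycles = 12.00
cycles_no_scalar_int = 5.17
cycles_fp_vectorized = 6.75
cycles_fully_vectorized = (3.00, 5.24)   # reported as a range
front_end_cycles = 11.38

flop_add_sub, flop_mul, flop_fma, flop_div = 8, 13, 4, 3


def speedup(new_cycles):
    """Estimated speedup if the loop took new_cycles instead of cqa_cycles."""
    return round(cqa_cycles / new_cycles, 2)


speedups = {
    "no scalar integer": speedup(cycles_no_scalar_int),            # table: 2.32
    "FP arith vectorized": speedup(cycles_fp_vectorized),          # table: 1.78
    "fully vectorized": tuple(speedup(c) for c in cycles_fully_vectorized),  # table: 4.00 - 2.29
    # Assumption: removing the P6-P9 limit leaves the front end as the limit.
    "next bottleneck killed": speedup(front_end_cycles),           # table: 1.05
}

# Assumption: one FMA is counted as two floating-point operations.
total_flop = flop_add_sub + flop_mul + 2 * flop_fma + flop_div
flop_per_cycle = round(total_flop / cqa_cycles, 2)                 # table: 2.67

if __name__ == "__main__":
    for name, value in speedups.items():
        print(f"{name}: {value}")
    print(f"FLOP/cycle: {flop_per_cycle}")
```

Under these assumptions the 12.00-cycle estimate is the denominator-free baseline for every speedup figure, which is why the "fully vectorized" range appears in descending order (4.00 for 3.00 cycles, 2.29 for 5.24 cycles).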
| Function | __advec_cell_kernel_module_MOD_advec_cell_kernel._omp_fn.0 |
| Source file and lines | advec_cell_kernel.f90:202-246 |
| Module | exec |
| nb instructions | 91 |
| nb uops | 91 |
| loop length | 364 |
| used w registers | 5 |
| used x registers | 28 |
| used b registers | 0 |
| used h registers | 0 |
| used s registers | 5 |
| used d registers | 24 |
| used q registers | 0 |
| used v registers | 0 |
| used z registers | 0 |
| nb stack references | 0 |
| ADD-SUB / MUL ratio | 0.62 |
| micro-operation queue | 11.38 cycles |
| front end | 11.38 cycles |
| | P0 | P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 | P11 | P12 | P13 | P14 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| uops | 2.50 | 2.50 | 6.50 | 6.25 | 7.00 | 6.25 | 12.00 | 12.00 | 12.00 | 12.00 | 5.17 | 4.83 | 5.00 | 0.00 | 0.00 |
| cycles | 2.50 | 2.50 | 6.50 | 6.25 | 7.00 | 6.25 | 12.00 | 12.00 | 12.00 | 12.00 | 5.17 | 4.83 | 5.00 | 0.00 | 0.00 |
| Cycles executing div or sqrt instructions | 5.25-10.49 |
| Front-end | 11.38 |
| Dispatch | 12.00 |
| DIV/SQRT | 5.25-10.49 |
| Overall L1 | 12.00 |
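The cycle summary can be cross-checked against the per-port figures. The Python sketch below treats the dispatch estimate as the busiest port and the overall L1 estimate as the largest of the front-end, dispatch and DIV/SQRT figures. That relationship is an interpretation that happens to match the reported numbers, not a documented CQA formula. It also checks the reported loop length against the instruction count, using the fact that A64 instructions are 4 bytes each and assuming the length is in bytes.

```python
# Per-port cycles copied from the port table.
port_cycles = {
    "P0": 2.50, "P1": 2.50, "P2": 6.50, "P3": 6.25, "P4": 7.00,
    "P5": 6.25, "P6": 12.00, "P7": 12.00, "P8": 12.00, "P9": 12.00,
    "P10": 5.17, "P11": 4.83, "P12": 5.00, "P13": 0.00, "P14": 0.00,
}
front_end = 11.38
div_sqrt = (5.25, 10.49)      # reported as a range
nb_instructions = 91

# Dispatch estimate taken as the most heavily loaded port.
dispatch = max(port_cycles.values())

# Ports that reach that maximum are the reported bottlenecks.
bottlenecks = [port for port, cycles in port_cycles.items() if cycles == dispatch]

# Assumed relationship: the overall estimate is the largest component,
# using the pessimistic end of the DIV/SQRT range.
overall_l1 = max(front_end, dispatch, div_sqrt[1])

# A64 instructions have a fixed 4-byte encoding.
loop_length_bytes = nb_instructions * 4

if __name__ == "__main__":
    print("dispatch:", dispatch)
    print("bottlenecks:", bottlenecks)
    print("overall L1:", overall_l1)
    print("loop length (bytes):", loop_length_bytes)
```

The front end (11.38 cycles) sits only slightly below the P6-P9 limit, which fits the small 1.05 speedup reported for removing the next bottleneck.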
| Vectorization ratio (INT) | Value |
|---|---|
| all | 0% |
| load | 0% |
| store | 0% |
| mul | NA (no mul vectorizable/vectorized instructions) |
| add-sub | NA (no add-sub vectorizable/vectorized instructions) |
| fma | NA (no fma vectorizable/vectorized instructions) |
| other | 0% |

| Vectorization ratio (FP) | Value |
|---|---|
| all | 0% |
| load | NA (no load vectorizable/vectorized instructions) |
| store | NA (no store vectorizable/vectorized instructions) |
| mul | 0% |
| add-sub | 0% |
| fma | 0% |
| div/sqrt | 0% |
| other | 0% |

| Vectorization ratio (INT+FP) | Value |
|---|---|
| all | 0% |
| load | 0% |
| store | 0% |
| mul | 0% |
| add-sub | 0% |
| fma | 0% |
| div/sqrt | 0% |
| other | 0% |

| Vector-efficiency ratio (INT) | Value |
|---|---|
| all | 23% |
| load | 25% |
| store | 25% |
| mul | NA (no mul vectorizable/vectorized instructions) |
| add-sub | NA (no add-sub vectorizable/vectorized instructions) |
| fma | NA (no fma vectorizable/vectorized instructions) |
| other | 12% |

| Vector-efficiency ratio (FP) | Value |
|---|---|
| all | 25% |
| load | NA (no load vectorizable/vectorized instructions) |
| store | NA (no store vectorizable/vectorized instructions) |
| mul | 25% |
| add-sub | 25% |
| fma | 25% |
| div/sqrt | 25% |
| other | 25% |

| Vector-efficiency ratio (INT+FP) | Value |
|---|---|
| all | 24% |
| load | 25% |
| store | 25% |
| mul | 25% |
| add-sub | 25% |
| fma | 25% |
| div/sqrt | 25% |
| other | 23% |
| Instruction | Nb FU | P0 | P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 | P11 | P12 | P13 | P14 | Latency | Recip. throughput | Vectorization |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ORR X5, XZR, X8 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 | N/A |
| ORR X9, XZR, X19 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 | N/A |
| LDR D21, [X14, X1,LSL #3] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 6 | 0.33 | scal (25.0%) |
| ORR X7, XZR, X12 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 | N/A |
| ORR X6, XZR, X8 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 | N/A |
| FCMPE D21, #0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | scal (25.0%) |
| B.GT 41e154 <__advec_cell_kernel_module_MOD_advec_cell_kernel._omp_fn.0+0x394> | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 | N/A |
| CMP W27, W20 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.33 | scal (12.5%) |
| CSEL W7, W27, W20, #13 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 | N/A |
| SBFM X6, X7, #0, #31 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 | N/A |
| ORR X9, XZR, X6 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 | N/A |
| ORR X7, XZR, X8 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 | N/A |
| ORR X5, XZR, X12 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 | N/A |
| MADD X3, X23, X5, X28 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | N/A |
| FABS D1, D21 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 2 | 0.25 | scal (25.0%) |
| ADD X2, X11, X1 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 | N/A |
| MADD X4, X22, X5, X11 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | N/A |
| ADD X6, X24, X6 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 | N/A |
| LDR D2, [X18, X8,LSL #3] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 6 | 0.33 | scal (25.0%) |
| MADD X17, X22, X7, X2 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | N/A |
| ADD X3, X3, X1 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 | N/A |
| LDR D31, [X25, X6,LSL #3] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 6 | 0.33 | scal (25.0%) |
| MADD X2, X22, X9, X2 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | N/A |
| ADD X4, X4, X1 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 | N/A |
| LDR D29, [X13, X3,LSL #3] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 6 | 0.33 | scal (25.0%) |
| LDR D30, [X0, X4,LSL #3] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 6 | 0.33 | scal (25.0%) |
| LDR D22, [X0, X17,LSL #3] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 6 | 0.33 | scal (25.0%) |
| FDIV D4, D2, D31 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 7-15 | 1.75-3.50 | scal (25.0%) |
| LDR D3, [X0, X2,LSL #3] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 6 | 0.33 | scal (25.0%) |
| FDIV D5, D1, D29 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 7-15 | 1.75-3.50 | scal (25.0%) |
| FSUB D6, D22, S30 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 2 | 0.25 | scal (25.0%) |
| FSUB D7, D30, S3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 2 | 0.25 | scal (25.0%) |
| FCMPE D6, #0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | scal (25.0%) |
| FABS D16, D6 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 2 | 0.25 | scal (25.0%) |
| FMUL D27, D7, D6 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 3 | 0.25 | scal (25.0%) |
| FABS D17, D7 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 2 | 0.25 | scal (25.0%) |
| FMINNM D18, D16, D17 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 2 | 0.25 | scal (25.0%) |
| FCSEL D26, D25, D0, #9 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0.50 | scal (25.0%) |
| FCMPE D27, #0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | scal (25.0%) |
| FADD D28, D5, D0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 2 | 0.25 | scal (25.0%) |
| FSUB D19, D0, S5 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 2 | 0.25 | scal (25.0%) |
| FSUB D1, D23, S5 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 2 | 0.25 | scal (25.0%) |
| FMUL D2, D4, D28 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 3 | 0.25 | scal (25.0%) |
| B.LS 41e1e8 <__advec_cell_kernel_module_MOD_advec_cell_kernel._omp_fn.0+0x428> | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 | N/A |
| FMUL D20, D1, D16 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 3 | 0.25 | scal (25.0%) |
| FMADD D31, D2, D17, D20 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 4 | 0.25 | scal (25.0%) |
| FMUL D29, D31, D24 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 3 | 0.25 | scal (25.0%) |
| FMINNM D22, D29, D18 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 2 | 0.25 | scal (25.0%) |
| FMUL D4, D22, D19 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 3 | 0.25 | scal (25.0%) |
| FMADD D30, D26, D4, D30 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 4 | 0.25 | scal (25.0%) |
| FMUL D3, D30, D21 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 3 | 0.25 | scal (25.0%) |
| ADD X6, X30, X1 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 | N/A |
| MADD X5, X21, X5, X6 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | N/A |
| MADD X7, X21, X7, X6 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | N/A |
| MADD X9, X21, X9, X6 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | N/A |
| STR D3, [X15, X1,LSL #3] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 2 | 0.50 | scal (25.0%) |
| FABS D21, D3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 2 | 0.25 | scal (25.0%) |
| LDR D5, [X10, X5,LSL #3] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 6 | 0.33 | scal (25.0%) |
| LDR D6, [X10, X7,LSL #3] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 6 | 0.33 | scal (25.0%) |
| LDR D7, [X10, X9,LSL #3] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 6 | 0.33 | scal (25.0%) |
| FMUL D16, D5, D3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 3 | 0.25 | scal (25.0%) |
| FSUB D17, D6, S5 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 2 | 0.25 | scal (25.0%) |
| FSUB D27, D5, S7 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 2 | 0.25 | scal (25.0%) |
| FABS D26, D17 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 2 | 0.25 | scal (25.0%) |
| FCMPE D17, #0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | scal (25.0%) |
| FMUL D28, D27, D17 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 3 | 0.25 | scal (25.0%) |
| FABS D19, D27 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 2 | 0.25 | scal (25.0%) |
| FMINNM D18, D26, D19 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 2 | 0.25 | scal (25.0%) |
| FMUL D1, D1, D26 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 3 | 0.25 | scal (25.0%) |
| FCSEL D20, D25, D0, #9 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0.50 | scal (25.0%) |
| FCMPE D28, #0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | scal (25.0%) |
| FMADD D2, D2, D19, D1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 4 | 0.25 | scal (25.0%) |
| B.LS 41ec10 <__advec_cell_kernel_module_MOD_advec_cell_kernel._omp_fn.0+0xe50> | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 | N/A |
| LDR D31, [X0, X4,LSL #3] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 6 | 0.33 | scal (25.0%) |
| FMUL D4, D2, D24 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 3 | 0.25 | scal (25.0%) |
| FMINNM D30, D4, D18 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 2 | 0.25 | scal (25.0%) |
| LDR D29, [X13, X3,LSL #3] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 6 | 0.33 | scal (25.0%) |
| FMUL D22, D31, D29 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 3 | 0.25 | scal (25.0%) |
| FDIV D21, D21, D22 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 7-15 | 1.75-3.50 | scal (25.0%) |
| FSUB D6, D0, S21 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 2 | 0.25 | scal (25.0%) |
| FMUL D7, D6, D30 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 3 | 0.25 | scal (25.0%) |
| FMADD D5, D20, D7, D5 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 4 | 0.25 | scal (25.0%) |
| FMUL D3, D5, D3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 3 | 0.25 | scal (25.0%) |
| STR D3, [X16, X1,LSL #3] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 2 | 0.50 | scal (25.0%) |
| ADD X1, X1, #1 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 | N/A |
| CMP W26, W1 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.33 | scal (12.5%) |
| B.GE 41e120 <__advec_cell_kernel_module_MOD_advec_cell_kernel._omp_fn.0+0x360> | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 | N/A |
| STR D16, [X16, X1,LSL #3] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 2 | 0.50 | scal (25.0%) |
| ADD X1, X1, #1 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 | N/A |
| CMP W26, W1 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.33 | N/A |
| B.GE 41e120 <__advec_cell_kernel_module_MOD_advec_cell_kernel._omp_fn.0+0x360> | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 | N/A |
| Instruction | Nb FU | P0 | P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 | P11 | P12 | P13 | P14 | Latency | Recip. throughput | Vectorization |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ORR X5, XZR, X8 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 | N/A |
| ORR X9, XZR, X19 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 | N/A |
| LDR D21, [X14, X1,LSL #3] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 6 | 0.33 | scal (25.0%) |
| ORR X7, XZR, X12 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 | N/A |
| ORR X6, XZR, X8 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 | N/A |
| FCMPE D21, #0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | scal (25.0%) |
| B.GT 41e154 <__advec_cell_kernel_module_MOD_advec_cell_kernel._omp_fn.0+0x394> | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 | N/A |
| CMP W27, W20 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.33 | scal (12.5%) |
| CSEL W7, W27, W20, #13 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 | N/A |
| SBFM X6, X7, #0, #31 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 | N/A |
| ORR X9, XZR, X6 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 | N/A |
| ORR X7, XZR, X8 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 | N/A |
| ORR X5, XZR, X12 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 | N/A |
| MADD X3, X23, X5, X28 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | N/A |
| FABS D1, D21 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 2 | 0.25 | scal (25.0%) |
| ADD X2, X11, X1 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 | N/A |
| MADD X4, X22, X5, X11 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | N/A |
| ADD X6, X24, X6 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 | N/A |
| LDR D2, [X18, X8,LSL #3] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 6 | 0.33 | scal (25.0%) |
| MADD X17, X22, X7, X2 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | N/A |
| ADD X3, X3, X1 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 | N/A |
| LDR D31, [X25, X6,LSL #3] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 6 | 0.33 | scal (25.0%) |
| MADD X2, X22, X9, X2 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | N/A |
| ADD X4, X4, X1 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 | N/A |
| LDR D29, [X13, X3,LSL #3] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 6 | 0.33 | scal (25.0%) |
| LDR D30, [X0, X4,LSL #3] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 6 | 0.33 | scal (25.0%) |
| LDR D22, [X0, X17,LSL #3] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 6 | 0.33 | scal (25.0%) |
| FDIV D4, D2, D31 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 7-15 | 1.75-3.50 | scal (25.0%) |
| LDR D3, [X0, X2,LSL #3] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 6 | 0.33 | scal (25.0%) |
| FDIV D5, D1, D29 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 7-15 | 1.75-3.50 | scal (25.0%) |
| FSUB D6, D22, S30 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 2 | 0.25 | scal (25.0%) |
| FSUB D7, D30, S3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 2 | 0.25 | scal (25.0%) |
| FCMPE D6, #0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | scal (25.0%) |
| FABS D16, D6 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 2 | 0.25 | scal (25.0%) |
| FMUL D27, D7, D6 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 3 | 0.25 | scal (25.0%) |
| FABS D17, D7 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 2 | 0.25 | scal (25.0%) |
| FMINNM D18, D16, D17 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 2 | 0.25 | scal (25.0%) |
| FCSEL D26, D25, D0, #9 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0.50 | scal (25.0%) |
| FCMPE D27, #0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | scal (25.0%) |
| FADD D28, D5, D0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 2 | 0.25 | scal (25.0%) |
| FSUB D19, D0, S5 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 2 | 0.25 | scal (25.0%) |
| FSUB D1, D23, S5 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 2 | 0.25 | scal (25.0%) |
| FMUL D2, D4, D28 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 3 | 0.25 | scal (25.0%) |
| B.LS 41e1e8 <__advec_cell_kernel_module_MOD_advec_cell_kernel._omp_fn.0+0x428> | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 | N/A |
| FMUL D20, D1, D16 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 3 | 0.25 | scal (25.0%) |
| FMADD D31, D2, D17, D20 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 4 | 0.25 | scal (25.0%) |
| FMUL D29, D31, D24 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 3 | 0.25 | scal (25.0%) |
| FMINNM D22, D29, D18 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 2 | 0.25 | scal (25.0%) |
| FMUL D4, D22, D19 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 3 | 0.25 | scal (25.0%) |
| FMADD D30, D26, D4, D30 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 4 | 0.25 | scal (25.0%) |
| FMUL D3, D30, D21 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 3 | 0.25 | scal (25.0%) |
| ADD X6, X30, X1 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 | N/A |
| MADD X5, X21, X5, X6 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | N/A |
| MADD X7, X21, X7, X6 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | N/A |
| MADD X9, X21, X9, X6 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | N/A |
| STR D3, [X15, X1,LSL #3] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 2 | 0.50 | scal (25.0%) |
| FABS D21, D3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 2 | 0.25 | scal (25.0%) |
| LDR D5, [X10, X5,LSL #3] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 6 | 0.33 | scal (25.0%) |
| LDR D6, [X10, X7,LSL #3] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 6 | 0.33 | scal (25.0%) |
| LDR D7, [X10, X9,LSL #3] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 6 | 0.33 | scal (25.0%) |
| FMUL D16, D5, D3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 3 | 0.25 | scal (25.0%) |
| FSUB D17, D6, S5 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 2 | 0.25 | scal (25.0%) |
| FSUB D27, D5, S7 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 2 | 0.25 | scal (25.0%) |
| FABS D26, D17 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 2 | 0.25 | scal (25.0%) |
| FCMPE D17, #0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | scal (25.0%) |
| FMUL D28, D27, D17 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 3 | 0.25 | scal (25.0%) |
| FABS D19, D27 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 2 | 0.25 | scal (25.0%) |
| FMINNM D18, D26, D19 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 2 | 0.25 | scal (25.0%) |
| FMUL D1, D1, D26 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 3 | 0.25 | scal (25.0%) |
| FCSEL D20, D25, D0, #9 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0.50 | scal (25.0%) |
| FCMPE D28, #0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | scal (25.0%) |
| FMADD D2, D2, D19, D1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 4 | 0.25 | scal (25.0%) |
| B.LS 41ec10 <__advec_cell_kernel_module_MOD_advec_cell_kernel._omp_fn.0+0xe50> | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 | N/A |
| LDR D31, [X0, X4,LSL #3] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 6 | 0.33 | scal (25.0%) |
| FMUL D4, D2, D24 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 3 | 0.25 | scal (25.0%) |
| FMINNM D30, D4, D18 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 2 | 0.25 | scal (25.0%) |
| LDR D29, [X13, X3,LSL #3] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 6 | 0.33 | scal (25.0%) |
| FMUL D22, D31, D29 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 3 | 0.25 | scal (25.0%) |
| FDIV D21, D21, D22 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 7-15 | 1.75-3.50 | scal (25.0%) |
| FSUB D6, D0, S21 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 2 | 0.25 | scal (25.0%) |
| FMUL D7, D6, D30 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 3 | 0.25 | scal (25.0%) |
| FMADD D5, D20, D7, D5 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 4 | 0.25 | scal (25.0%) |
| FMUL D3, D5, D3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 3 | 0.25 | scal (25.0%) |
| STR D3, [X16, X1,LSL #3] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 2 | 0.50 | scal (25.0%) |
| ADD X1, X1, #1 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 | N/A |
| CMP W26, W1 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.33 | scal (12.5%) |
| B.GE 41e120 <__advec_cell_kernel_module_MOD_advec_cell_kernel._omp_fn.0+0x360> | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 | N/A |
| STR D16, [X16, X1,LSL #3] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 2 | 0.50 | scal (25.0%) |
| ADD X1, X1, #1 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 | N/A |
| CMP W26, W1 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.33 | N/A |
| B.GE 41e120 <__advec_cell_kernel_module_MOD_advec_cell_kernel._omp_fn.0+0x360> | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 | N/A |
