Function: sortAtomsById | Module: exec | Source: haloExchange.c:655-661 | Coverage: 0.04% |
---|
Function: sortAtomsById | Module: exec | Source: haloExchange.c:655-661 | Coverage: 0.04% |
---|
/scratch_na/users/xoserete/qaas_runs/171-419-7821/intel/CoMD/build/CoMD/CoMD/src-openmp/haloExchange.c: 655 - 661 |
-------------------------------------------------------------------------------- |
655: int bId = ((AtomMsg*) b)->gid; |
656: assert(aId != bId); |
657: |
658: if (aId < bId) |
659: return -1; |
660: return 1; |
661: } |
0x408150 MOV (%RSI),%EAX |
0x408152 CMP %EAX,(%RDI) |
0x408154 JE 408161 |
0x408156 SETGE %AL |
0x408159 MOVZX %AL,%EAX |
0x40815c LEA -0x1(%RAX,%RAX,1),%EAX |
0x408160 RET |
0x408161 PUSH %RBP |
0x408162 MOV %RSP,%RBP |
0x408165 MOV $0x4225d4,%EDI |
0x40816a MOV $0x422407,%ESI |
0x40816f MOV $0x4225df,%ECX |
0x408174 MOV $0x290,%EDX |
0x408179 CALL 402bd0 <__assert_fail@plt> |
0x40817e XCHG %AX,%AX |
Coverage (%) | Name | Source Location | Module |
---|
Path / |
Source file and lines | haloExchange.c:655-661 |
Module | exec |
nb instructions | 8.50 |
nb uops | 9 |
loop length | 26 |
used x86 registers | 5 |
used mmx registers | 0 |
used xmm registers | 0 |
used ymm registers | 0 |
used zmm registers | 0 |
nb stack references | 0 |
micro-operation queue | 1.50 cycles |
front end | 1.50 cycles |
P0 | P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 | P11 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
uops | 1.40 | 1.10 | 0.83 | 0.83 | 0.50 | 1.10 | 1.30 | 0.50 | 0.50 | 0.50 | 1.10 | 0.83 |
cycles | 1.40 | 1.30 | 0.83 | 0.83 | 0.50 | 1.10 | 1.30 | 0.50 | 0.50 | 0.50 | 1.10 | 0.83 |
Cycles executing div or sqrt instructions | NA |
FE+BE cycles | 1.70 |
Stall cycles | 0.00 |
Front-end | 1.50 |
Dispatch | 1.40 |
Overall L1 | 1.67 |
all | 0% |
load | NA (no load vectorizable/vectorized instructions) |
store | NA (no store vectorizable/vectorized instructions) |
mul | NA (no mul vectorizable/vectorized instructions) |
add-sub | NA (no add-sub vectorizable/vectorized instructions) |
fma | NA (no fma vectorizable/vectorized instructions) |
div/sqrt | NA (no div/sqrt vectorizable/vectorized instructions) |
other | 0% |
all | 8% |
load | NA (no load vectorizable/vectorized instructions) |
store | NA (no store vectorizable/vectorized instructions) |
mul | NA (no mul vectorizable/vectorized instructions) |
add-sub | NA (no add-sub vectorizable/vectorized instructions) |
fma | NA (no fma vectorizable/vectorized instructions) |
div/sqrt | NA (no div/sqrt vectorizable/vectorized instructions) |
other | 8% |
Source file and lines | haloExchange.c:655-661 |
Module | exec |
nb instructions | 7 |
nb uops | 7 |
loop length | 17 |
used x86 registers | 3 |
used mmx registers | 0 |
used xmm registers | 0 |
used ymm registers | 0 |
used zmm registers | 0 |
nb stack references | 0 |
micro-operation queue | 1.17 cycles |
front end | 1.17 cycles |
P0 | P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 | P11 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
uops | 1.50 | 1.00 | 1.00 | 1.00 | 0.00 | 1.00 | 1.50 | 0.00 | 0.00 | 0.00 | 1.00 | 1.00 |
cycles | 1.50 | 1.40 | 1.00 | 1.00 | 0.00 | 1.00 | 1.50 | 0.00 | 0.00 | 0.00 | 1.00 | 1.00 |
Cycles executing div or sqrt instructions | NA |
FE+BE cycles | 1.62 |
Stall cycles | 0.00 |
Front-end | 1.17 |
Dispatch | 1.50 |
Overall L1 | 1.50 |
Instruction | Nb FU | P0 | P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 | P11 | Latency | Recip. throughput |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
MOV (%RSI),%EAX | 1 | 0 | 0 | 0.33 | 0.33 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 1 | 0.33 |
CMP %EAX,(%RDI) | 1 | 0.20 | 0.20 | 0.33 | 0.33 | 0 | 0.20 | 0.20 | 0 | 0 | 0 | 0.20 | 0.33 | 1 | 0.33 |
JE 408161 <sortAtomsById+0x11> | 1 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 |
SETGE %AL | 1 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 |
MOVZX %AL,%EAX | 1 | 0.20 | 0.20 | 0 | 0 | 0 | 0.20 | 0.20 | 0 | 0 | 0 | 0.20 | 0 | 1 | 0.20 |
LEA -0x1(%RAX,%RAX,1),%EAX | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 1 |
RET | 1 | 0.50 | 0 | 0.33 | 0.33 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0.33 | 0 | 2.13 |
Source file and lines | haloExchange.c:655-661 |
Module | exec |
nb instructions | 10 |
nb uops | 11 |
loop length | 35 |
used x86 registers | 7 |
used mmx registers | 0 |
used xmm registers | 0 |
used ymm registers | 0 |
used zmm registers | 0 |
nb stack references | 0 |
micro-operation queue | 1.83 cycles |
front end | 1.83 cycles |
P0 | P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 | P11 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
uops | 1.30 | 1.20 | 0.67 | 0.67 | 1.00 | 1.20 | 1.10 | 1.00 | 1.00 | 1.00 | 1.20 | 0.67 |
cycles | 1.30 | 1.20 | 0.67 | 0.67 | 1.00 | 1.20 | 1.10 | 1.00 | 1.00 | 1.00 | 1.20 | 0.67 |
Cycles executing div or sqrt instructions | NA |
FE+BE cycles | 1.77 |
Stall cycles | 0.00 |
Front-end | 1.83 |
Dispatch | 1.30 |
Overall L1 | 1.83 |
all | 0% |
load | NA (no load vectorizable/vectorized instructions) |
store | NA (no store vectorizable/vectorized instructions) |
mul | NA (no mul vectorizable/vectorized instructions) |
add-sub | NA (no add-sub vectorizable/vectorized instructions) |
fma | NA (no fma vectorizable/vectorized instructions) |
div/sqrt | NA (no div/sqrt vectorizable/vectorized instructions) |
other | 0% |
all | 8% |
load | NA (no load vectorizable/vectorized instructions) |
store | NA (no store vectorizable/vectorized instructions) |
mul | NA (no mul vectorizable/vectorized instructions) |
add-sub | NA (no add-sub vectorizable/vectorized instructions) |
fma | NA (no fma vectorizable/vectorized instructions) |
div/sqrt | NA (no div/sqrt vectorizable/vectorized instructions) |
other | 8% |
Instruction | Nb FU | P0 | P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 | P11 | Latency | Recip. throughput |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
MOV (%RSI),%EAX | 1 | 0 | 0 | 0.33 | 0.33 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 1 | 0.33 |
CMP %EAX,(%RDI) | 1 | 0.20 | 0.20 | 0.33 | 0.33 | 0 | 0.20 | 0.20 | 0 | 0 | 0 | 0.20 | 0.33 | 1 | 0.33 |
JE 408161 <sortAtomsById+0x11> | 1 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 |
PUSH %RBP | 1 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0.50 | 0.50 | 0.50 | 0 | 0 | 5-12 | 0.50 |
MOV %RSP,%RBP | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.17 |
MOV $0x4225d4,%EDI | 1 | 0.20 | 0.20 | 0 | 0 | 0 | 0.20 | 0.20 | 0 | 0 | 0 | 0.20 | 0 | 1 | 0.20 |
MOV $0x422407,%ESI | 1 | 0.20 | 0.20 | 0 | 0 | 0 | 0.20 | 0.20 | 0 | 0 | 0 | 0.20 | 0 | 1 | 0.20 |
MOV $0x4225df,%ECX | 1 | 0.20 | 0.20 | 0 | 0 | 0 | 0.20 | 0.20 | 0 | 0 | 0 | 0.20 | 0 | 1 | 0.20 |
MOV $0x290,%EDX | 1 | 0.20 | 0.20 | 0 | 0 | 0 | 0.20 | 0.20 | 0 | 0 | 0 | 0.20 | 0 | 1 | 0.20 |
CALL 402bd0 <__assert_fail@plt> | 2 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0.50 | 0.50 | 0.50 | 0 | 0 | 0 | 1 |
Name | Coverage (%) | Time (s) |
---|---|---|
○sortAtomsById | 0.04 | 0.01 |