Loops
ljForce.c: 191 - 385.34 % (Cov (%) summed across the six runs)
Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%)
---|---|---|---|---|---|---
orig_default | 101 | 20.64 | 18.05 | 63.83 | 30.69 | 16.34
icx_default | 95 | 19.26 | 16.61 | 60.68 | 35.93 | 16.99
gcc_default | 94 | 17.15 | 18.00 | 69.99 | 6.06 | 13.26
aocc_1 | 102 | 20.65 | 17.76 | 62.55 | 35.68 | 16.96
icx_3 | 157 | 19.44 | 17.12 | 58.80 | 44.07 | 18.01
gcc_10 | 88 | 17.38 | 17.79 | 69.48 | 6.25 | 13.28

Run | Optimizer analysis
---|---
orig_default | Sum on 1 analyzed binary loop (exec - 101)
icx_default | Sum on 1 analyzed binary loop (exec - 95)
gcc_default | Sum on 1 analyzed binary loop (exec - 94)
aocc_1 | Sum on 1 analyzed binary loop (exec - 102)
icx_3 | Sum on 1 analyzed binary loop (exec - 157)
gcc_10 | Sum on 1 analyzed binary loop (exec - 88)

Category | Analysis | orig_default | icx_default | gcc_default | aocc_1 | icx_3 | gcc_10
---|---|---|---|---|---|---|---
Loop Computation Issues | Presence of expensive FP instructions | 1 | 1 | 1 | 1 | 1 | 1
Loop Computation Issues | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 0 | 0 | 0 | 0 | 1 | 0
Control Flow Issues | Presence of 2 to 4 paths | 1 | 1 | 0 | 1 | 1 | 0
Control Flow Issues | Presence of more than 4 paths | 0 | 0 | 1 | 0 | 0 | 1
Data Access Issues | Presence of constant non-unit stride data access | 1 | 1 |  | 1 | 1 |
Data Access Issues | More than 10% of the vector loads instructions are unaligned | 0 | 0 |  | 0 | 1 |
Data Access Issues | Presence of special instructions executing on a single port | 1 | 1 |  | 1 | 1 |
Vectorization Roadblocks | Presence of 2 to 4 paths | 1 | 1 | 0 | 1 | 1 | 0
Vectorization Roadblocks | Presence of more than 4 paths | 0 | 0 | 1 | 0 | 0 | 1
Vectorization Roadblocks | Presence of constant non-unit stride data access | 1 | 1 | 0 | 1 | 1 | 0
Inefficient Vectorization | Presence of special instructions executing on a single port | 1 | 1 |  | 1 | 1 |
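
The percentage in each loop heading is larger than any single run's coverage. It appears to be the per-run Cov (%) values added together; the report does not state this, so treat it as an inference from the numbers. A minimal Python check of that reading, using the ljForce.c: 191 values from the table above:

```python
# Cov (%) of the ljForce.c:191 loop in each run, copied from the metrics table.
cov_by_run = {
    "orig_default": 63.83,
    "icx_default": 60.68,
    "gcc_default": 69.99,
    "aocc_1": 62.55,
    "icx_3": 58.80,
    "gcc_10": 69.48,
}

# Percentage shown in the section heading "ljForce.c: 191 - 385.34 %".
heading_total = 385.34

# Assumption under test: the heading is the sum of the per-run coverages.
total = sum(cov_by_run.values())

print(f"sum of Cov (%): {total:.2f}")
print(f"difference from heading: {abs(total - heading_total):.2f}")
```

The sum agrees with the heading to within 0.01, which is consistent with the report adding unrounded values before rounding. The other sections agree in the same way; for timestep.c: 88, for example, the six Cov values add up to 8.22.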
timestep.c: 88 - 8.22 %
Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%)
---|---|---|---|---|---|---
orig_default | 111 | 0.45 | 0.38 | 1.36 | 0 | 12.5
icx_default | 104 | 0.29 | 0.19 | 0.71 | 100 | 96.95
gcc_default | 104 | 0.45 | 0.45 | 1.75 | 0 | 12.5
aocc_1 | 112 | 0.45 | 0.39 | 1.36 | 0 | 12.5
icx_3 | 166 | 0.46 | 0.39 | 1.35 | 7.69 | 13.46
gcc_10 | 98 | 0.44 | 0.43 | 1.69 | 0 | 12.5

Run | Optimizer analysis
---|---
orig_default | Sum on 1 analyzed binary loop (exec - 111)
icx_default | Sum on 1 analyzed binary loop (exec - 104)
gcc_default | Sum on 1 analyzed binary loop (exec - 104)
aocc_1 | Sum on 1 analyzed binary loop (exec - 112)
icx_3 | Sum on 1 analyzed binary loop (exec - 166)
gcc_10 | Sum on 1 analyzed binary loop (exec - 98)

Category | Analysis | orig_default | icx_default | gcc_default | aocc_1 | icx_3 | gcc_10
---|---|---|---|---|---|---|---
Loop Computation Issues | Presence of expensive FP instructions | 1 | 1 | 1 | 1 | 1 | 1
Loop Computation Issues | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 0 | 0 | 0 | 0 | 1 | 0
Data Access Issues | Presence of constant non-unit stride data access |  | 0 | 1 |  |  | 1
Data Access Issues | Presence of indirect access |  | 1 | 0 |  |  | 0
Data Access Issues | More than 10% of the vector loads instructions are unaligned |  | 1 | 0 |  |  | 0
Data Access Issues | Presence of expensive instructions: scatter/gather |  | 1 | 0 |  |  | 0
Data Access Issues | Presence of special instructions executing on a single port |  | 1 | 0 |  |  | 0
Vectorization Roadblocks | Presence of constant non-unit stride data access |  | 0 | 1 |  |  | 1
Vectorization Roadblocks | Presence of indirect access |  | 1 | 0 |  |  | 0
Inefficient Vectorization | Presence of expensive instructions: scatter/gather |  | 1 |  |  |  |
Inefficient Vectorization | Presence of special instructions executing on a single port |  | 1 |  |  |  |
timestep.c: 74 - 7.81 %
Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%)
---|---|---|---|---|---|---
orig_default | 107 | 0.09 | 0.04 | 0.13 | 0 | 12.5
orig_default | 108 | 0.52 | 0.44 | 1.54 | 97.3 | 97.64
orig_default | 109 | 0.13 | 0.06 | 0.21 | 95.83 | 81.77
icx_default | 101 | 0.55 | 0.41 | 1.48 | 100 | 100
gcc_default | 107 | 0.03 | 0.00 | 0.01 | 100 | 100
aocc_1 | 108 | 0.08 | 0.03 | 0.12 | 0 | 12.5
icx_3 | 164 | 0.81 | 0.73 | 2.51 | 0 | 12.5
gcc_10 | 100 | 0.54 | 0.47 | 1.82 | 0 | 12.5

Run | Optimizer analysis
---|---
orig_default | Sum on 3 analyzed binary loops (exec - 107, exec - 108, exec - 109)
icx_default | Sum on 1 analyzed binary loop (exec - 101)
gcc_default | No Optimizer analysis found for any assembly loop
aocc_1 | Sum on 1 analyzed binary loop (exec - 108)
icx_3 | Sum on 1 analyzed binary loop (exec - 164)
gcc_10 | Sum on 1 analyzed binary loop (exec - 100)

More loops can be analyzed using option --optimizer-loop-count.

Category | Analysis | orig_default | icx_default | aocc_1 | icx_3 | gcc_10
---|---|---|---|---|---
Loop Computation Issues | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA |  |  | 0 | 1 |
Loop Computation Issues | Low iteration count |  |  | 1 | 0 |
Control Flow Issues | Low iteration count |  |  | 1 |  |
Data Access Issues | More than 10% of the vector loads instructions are unaligned | 1 | 1 |  |  |
Data Access Issues | Presence of special instructions executing on a single port | 1 | 1 |  |  |
Inefficient Vectorization | Presence of special instructions executing on a single port | 1 | 1 |  |  |
haloExchange.c: 621 - 5.70 %
Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%)
---|---|---|---|---|---|---
orig_default | 59 | 0.36 | 0.28 | 1.01 | 0 | 10.94
icx_default | 64 | 0.10 | 0.01 | 0.04 | 50 | 22.92
icx_default | 65 | 0.40 | 0.30 | 1.10 | 96.3 | 70.83
gcc_default | 61 | 0.16 | 0.09 | 0.35 | 21.15 | 13.74
aocc_1 | 59 | 0.38 | 0.28 | 1.00 | 0 | 10.94
icx_3 | 87 | 0.38 | 0.28 | 0.95 | 33.33 | 14.58
gcc_10 | 59 | 0.36 | 0.32 | 1.26 | 0 | 10.94

Run | Optimizer analysis
---|---
orig_default | Sum on 1 analyzed binary loop (exec - 59)
icx_default | Sum on 1 analyzed binary loop (exec - 65)
gcc_default | Sum on 1 analyzed binary loop (exec - 61)
aocc_1 | Sum on 1 analyzed binary loop (exec - 59)
icx_3 | Sum on 1 analyzed binary loop (exec - 87)
gcc_10 | Sum on 1 analyzed binary loop (exec - 59)

Category | Analysis | orig_default | icx_default | gcc_default | aocc_1 | icx_3 | gcc_10
---|---|---|---|---|---|---|---
Loop Computation Issues | Presence of a large number of scalar integer instructions | 1 |  | 1 | 1 | 1 | 1
Data Access Issues | Presence of constant non-unit stride data access |  | 1 | 1 |  | 0 | 1
Data Access Issues | More than 10% of the vector loads instructions are unaligned |  | 1 | 0 |  | 1 | 0
Data Access Issues | Presence of expensive instructions: scatter/gather |  | 1 | 0 |  | 0 | 0
Data Access Issues | Presence of special instructions executing on a single port |  | 1 | 1 |  | 0 | 0
Vectorization Roadblocks | Presence of constant non-unit stride data access |  | 1 | 1 |  |  | 1
Inefficient Vectorization | Presence of expensive instructions: scatter/gather |  | 1 | 0 |  |  |
Inefficient Vectorization | Presence of special instructions executing on a single port |  | 1 | 1 |  |  |
Inefficient Vectorization | Use of masked instructions |  | 1 | 0 |  |  |
timestep.c: 110 - 0.46 %
Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%)
---|---|---|---|---|---|---
orig_default | 114 | 0.04 | 0.02 | 0.08 | 0 | 12.5
icx_default | 109 | 0.04 | 0.01 | 0.04 | 100 | 94.57
gcc_default | 110 | 0.05 | 0.03 | 0.11 | 0 | 12.5
aocc_1 | 115 | 0.03 | 0.01 | 0.03 | 66.67 | 46.88
icx_3 | 171 | 0.04 | 0.02 | 0.05 | 80.77 | 22.12
gcc_10 | 103 | 0.04 | 0.03 | 0.11 | 0 | 12.5
unattributed | 116 | 0.03 | 0.01 | 0.04 | 33.33 | 16.67

Run | Optimizer analysis
---|---
orig_default | No Optimizer analysis found for any assembly loop
icx_default | No Optimizer analysis found for any assembly loop
gcc_default | Sum on 1 analyzed binary loop (exec - 110)
aocc_1 | No Optimizer analysis found for any assembly loop
icx_3 | Sum on 1 analyzed binary loop (exec - 171)
gcc_10 | Sum on 1 analyzed binary loop (exec - 103)

More loops can be analyzed using option --optimizer-loop-count.

Category | Analysis | gcc_default | icx_3 | gcc_10
---|---|---|---|---
Loop Computation Issues | Presence of expensive FP instructions | 1 | 1 | 1
Loop Computation Issues | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 0 | 1 | 0
Data Access Issues | Presence of constant non-unit stride data access | 1 | 1 | 1
Data Access Issues | More than 10% of the vector loads instructions are unaligned | 0 | 1 | 0
Data Access Issues | Presence of special instructions executing on a single port | 0 | 1 | 0
Vectorization Roadblocks | Presence of constant non-unit stride data access | 1 | 1 | 1
Inefficient Vectorization | Presence of special instructions executing on a single port |  | 1 |
haloExchange.c: 633 - 0.23 %
Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%)
---|---|---|---|---|---|---
orig_default | 61 | 0.02 | 0.00 | 0.01 | 65.96 | 61.7
icx_default | 63 | 0.04 | 0.01 | 0.04 | 98.81 | 81.1
gcc_default | 59 | 0.04 | 0.02 | 0.06 | 0 | 10.94
aocc_1 | 61 | 0.02 | 0.00 | 0.01 | 65.96 | 61.7
icx_3 | 88 | 0.03 | 0.01 | 0.03 | 33.33 | 14.58
gcc_10 | 58 | 0.05 | 0.02 | 0.07 | 0 | 10.94
unattributed | 62 | 0.03 | 0.00 | 0.01 | 33.33 | 14.58

No Optimizer analysis found for any assembly loop in any of the six runs. More loops can be analyzed using option --optimizer-loop-count.
haloExchange.c: 380 - 0.22 %
Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%)
---|---|---|---|---|---|---
orig_default | 45 | 0.35 | 0.01 | 0.03 | 0 | 11.25
icx_default | 43 | 0.33 | 0.01 | 0.03 | 36.36 | 15.34
gcc_default | 36 | 0.39 | 0.01 | 0.05 | 0 | 11.25
aocc_1 | 45 | 0.33 | 0.01 | 0.03 | 0 | 11.25
icx_3 | 71 | 0.31 | 0.01 | 0.03 | 38.46 | 15.87
gcc_10 | 37 | 0.36 | 0.01 | 0.05 | 0 | 11.25

Run | Optimizer analysis
---|---
orig_default | No Optimizer analysis found for any assembly loop
icx_default | No Optimizer analysis found for any assembly loop
gcc_default | No Optimizer analysis found for any assembly loop
aocc_1 | No Optimizer analysis found for any assembly loop
icx_3 | Sum on 1 analyzed binary loop (exec - 71)
gcc_10 | No Optimizer analysis found for any assembly loop

More loops can be analyzed using option --optimizer-loop-count.

Category | Analysis | icx_3
---|---|---
Loop Computation Issues | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 1
Loop Computation Issues | Presence of a large number of scalar integer instructions | 1
Data Access Issues | More than 10% of the vector loads instructions are unaligned | 1