Loops
fft_scatter_2d.f90: 129 - 3.45 %

| Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | Optimizer analysis |
|---|---|---|---|---|---|---|---|
| orig_default | 19104 | 5.84 | 5.36 | 0.96 | 100 | 25 | Sum on 1 analyzed binary loop (exec - 19104) |
| aocc_default | | | | | | | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| gcc_default | 17813 | 5.36 | 4.80 | 0.86 | 100 | 25 | Sum on 1 analyzed binary loop (exec - 17813) |
| icx_3 | 30403 | 5.32 | 4.88 | 0.88 | 100 | 50 | Sum on 1 analyzed binary loop (exec - 30403) |
| gcc_1 | 19669 | 4.73 | 4.18 | 0.75 | 100 | 100 | Sum on 1 analyzed binary loop (exec - 19669) |

| Category | Analysis | Count (orig_default) | Count (gcc_default) | Count (icx_3) | Count (gcc_1) |
|---|---|---|---|---|---|
| Data Access Issues | Presence of constant non-unit stride data access | | 1 | | 1 |
| Vectorization Roadblocks | Presence of constant non-unit stride data access | | 1 | | 1 |

fft_scatter_2d.f90: 243 - 1.26 %

| Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | Optimizer analysis |
|---|---|---|---|---|---|---|---|
| orig_default | 19086 | 2.47 | 2.16 | 0.39 | 100 | 25 | Sum on 1 analyzed binary loop (exec - 19086) |
| aocc_default | 16011 | 3.03 | 2.61 | 0.47 | 100 | 25 | Sum on 1 analyzed binary loop (exec - 16011) |
| gcc_default | 17789 | 2.89 | 2.29 | 0.41 | 100 | 25 | Sum on 1 analyzed binary loop (exec - 17789) |
| icx_3 | | | | | | | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| gcc_1 | | | | | | | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |

| Category | Analysis | Count (orig_default) | Count (aocc_default) | Count (gcc_default) |
|---|---|---|---|---|
| Data Access Issues | More than 10% of the vector loads instructions are unaligned | 1 | | 1 |

init_us_2_acc.f90: 153 - 0.90 %

| Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | Optimizer analysis |
|---|---|---|---|---|---|---|---|
| orig_default | 15112 | 1.32 | 1.26 | 0.23 | 76.92 | 22.12 | Sum on 1 analyzed binary loop (exec - 15112) |
| aocc_default | 13022 | 1.32 | 1.26 | 0.22 | 84.62 | 23.08 | Sum on 1 analyzed binary loop (exec - 13022) |
| gcc_default | 14265 | 1.33 | 1.26 | 0.23 | 80 | 22.5 | Sum on 1 analyzed binary loop (exec - 14265) |
| icx_3 | | | | | | | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| gcc_1 | 15459 | 1.30 | 1.23 | 0.22 | 100 | 100 | Sum on 1 analyzed binary loop (exec - 15459) |

| Category | Analysis | Count (orig_default) | Count (aocc_default) | Count (gcc_default) | Count (gcc_1) |
|---|---|---|---|---|---|
| Loop Computation Issues | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 1 | 1 | 1 | |
| Data Access Issues | Presence of constant non-unit stride data access | 0 | 0 | 0 | 1 |
| Data Access Issues | More than 10% of the vector loads instructions are unaligned | 1 | 1 | 1 | 1 |
| Data Access Issues | Presence of special instructions executing on a single port | 1 | 1 | 1 | 1 |
| Vectorization Roadblocks | Presence of constant non-unit stride data access | | | | 1 |
| Inefficient Vectorization | Presence of special instructions executing on a single port | 1 | 1 | 1 | 1 |

<unknown>: 0 - 0.69 %
| Run orig_default | Run aocc_default | Run gcc_default | Run icx_3 | Run gcc_1 |
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 23665 | 1.65 | 1.34 | 0.24 | 100 | 50 | 2786 | 0.59 | 0.07 | 0.01 | 0 | 0 | 7089 | 0.58 | 0.02 | 0.00 | 0 | 0 | 12750 | 0.60 | 0.04 | 0.01 | 0 | 0 | 20074 | 0.69 | 0.22 | 0.04 | 0 | 0 |
| 8882 | 0.66 | 0.26 | 0.05 | 0 | 0 | 18206 | 0.67 | 0.16 | 0.03 | 0 | 0 | 23709 | 0.68 | 0.38 | 0.07 | 0 | 0 | 7775 | 0.60 | 0.06 | 0.01 | 0 | 0 | ||||||
| 1338 | 0.57 | 0.01 | 0.00 | 0 | 0 | 23711 | 0.67 | 0.37 | 0.07 | 0 | 0 | ||||||||||||||||||
| 8886 | 0.58 | 0.01 | 0.00 | 0 | 0 | 31143 | 0.76 | 0.44 | 0.08 | 0 | 0 | ||||||||||||||||||
| 7839 | 0.59 | 0.02 | 0.00 | 0 | 0 | ||||||||||||||||||||||||
| 19559 | 0.75 | 0.41 | 0.07 | 0 | 0 | ||||||||||||||||||||||||

| Run | Optimizer analysis |
|---|---|
| orig_default | Sum on 1 analyzed binary loop (exec - 23665) |
| aocc_default | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| gcc_default | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| icx_3 | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| gcc_1 | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |

| Category | Analysis | Count (orig_default) |
|---|---|---|
| Loop Computation Issues | Presence of a large number of scalar integer instructions | 1 |
| Data Access Issues | Presence of constant non-unit stride data access | 1 |
| Vectorization Roadblocks | Presence of constant non-unit stride data access | 1 |
| Vectorization Roadblocks | Out of user code | 1 |

init_us_2_acc.f90: 90 - 0.61 %

| Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | Optimizer analysis |
|---|---|---|---|---|---|---|---|
| orig_default | 15122 | 1.18 | 1.12 | 0.20 | 100 | 25 | Sum on 1 analyzed binary loop (exec - 15122) |
| aocc_default | | | | | | | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| gcc_default | 14270 | 1.18 | 1.13 | 0.20 | 100 | 25 | Sum on 1 analyzed binary loop (exec - 14270) |
| icx_3 | | | | | | | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| gcc_1 | 15466 | 1.21 | 1.14 | 0.20 | 100 | 100 | Sum on 1 analyzed binary loop (exec - 15466) |

| Category | Analysis | Count (orig_default) | Count (gcc_default) | Count (gcc_1) |
|---|---|---|---|---|
| Data Access Issues | Presence of constant non-unit stride data access | | 1 | 1 |
| Vectorization Roadblocks | Presence of constant non-unit stride data access | | 1 | 1 |

sort.f90: 94 - 0.58 %

| Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | Optimizer analysis |
|---|---|---|---|---|---|---|---|
| orig_default | 11793 | 0.70 | 0.66 | 0.12 | 19.05 | 13.1 | Sum on 1 analyzed binary loop (exec - 11793) |
| aocc_default | 12251 | 0.73 | 0.69 | 0.12 | 19.05 | 13.1 | Sum on 1 analyzed binary loop (exec - 12251) |
| gcc_default | 11437 | 0.66 | 0.61 | 0.11 | 26.32 | 14.47 | Sum on 1 analyzed binary loop (exec - 11437) |
| icx_3 | 18488 | 0.70 | 0.67 | 0.12 | 10.53 | 11.84 | Sum on 1 analyzed binary loop (exec - 18488) |
| gcc_1 | 12364 | 0.64 | 0.61 | 0.11 | 17.65 | 13.24 | Sum on 1 analyzed binary loop (exec - 12364) |

| Category | Analysis | Count (orig_default) | Count (aocc_default) | Count (gcc_default) | Count (icx_3) | Count (gcc_1) |
|---|---|---|---|---|---|---|
| Loop Computation Issues | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 1 | 1 | 1 | 1 | 1 |
| Loop Computation Issues | Presence of a large number of scalar integer instructions | 1 | 1 | 1 | 1 | 1 |
| Control Flow Issues | Presence of more than 4 paths | 1 | 1 | 1 | 1 | 1 |
| Vectorization Roadblocks | Presence of more than 4 paths | 1 | 1 | 1 | 1 | 1 |

vloc_psi.f90: 475 - 0.50 %

| Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | Optimizer analysis |
|---|---|---|---|---|---|---|---|
| orig_default | | | | | | | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| aocc_default | | | | | | | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| gcc_default | 8284 | 2.03 | 1.81 | 0.32 | 75 | 21.88 | Sum on 1 analyzed binary loop (exec - 8284) |
| icx_3 | | | | | | | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| gcc_1 | 9069 | 1.13 | 0.98 | 0.18 | 100 | 100 | Sum on 1 analyzed binary loop (exec - 9069) |

| Category | Analysis | Count (gcc_default) | Count (gcc_1) |
|---|---|---|---|
| Loop Computation Issues | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 1 | 1 |
| Data Access Issues | Presence of constant non-unit stride data access | 0 | 1 |
| Data Access Issues | More than 10% of the vector loads instructions are unaligned | 1 | 1 |
| Data Access Issues | Presence of special instructions executing on a single port | 1 | 1 |
| Vectorization Roadblocks | Presence of constant non-unit stride data access | | 1 |
| Inefficient Vectorization | Presence of special instructions executing on a single port | 1 | 1 |

fft_scatter_2d.f90: 242 - 0.49 %

| Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | Optimizer analysis |
|---|---|---|---|---|---|---|---|
| orig_default | | | | | | | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| aocc_default | | | | | | | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| gcc_default | | | | | | | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| icx_3 | 30385 | 1.45 | 1.16 | 0.21 | 92.31 | 35.58 | Sum on 2 analyzed binary loops (exec - 30385, exec - 30383) |
| icx_3 | 30383 | 1.96 | 1.55 | 0.28 | 92.31 | 35.58 | Sum on 2 analyzed binary loops (exec - 30385, exec - 30383) |
| gcc_1 | | | | | | | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |

| Category | Analysis | Count (icx_3) |
|---|---|---|
| Data Access Issues | Presence of indirect access | 1 |
| Data Access Issues | Presence of expensive instructions: scatter/gather | 1 |
| Data Access Issues | Presence of special instructions executing on a single port | 1 |
| Vectorization Roadblocks | Presence of indirect access | 1 |
| Inefficient Vectorization | Presence of expensive instructions: scatter/gather | 1 |
| Inefficient Vectorization | Presence of special instructions executing on a single port | 1 |

vloc_psi.f90: 474 - 0.39 %

| Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | Optimizer analysis |
|---|---|---|---|---|---|---|---|
| orig_default | 9152 | 1.31 | 1.08 | 0.19 | 60 | 20 | Sum on 1 analyzed binary loop (exec - 9152) |
| aocc_default | | | | | | | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| gcc_default | | | | | | | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| icx_3 | 14709 | 1.31 | 1.10 | 0.20 | 100 | 42.86 | Sum on 1 analyzed binary loop (exec - 14709) |
| gcc_1 | | | | | | | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |

| Category | Analysis | Count (orig_default) | Count (icx_3) |
|---|---|---|---|
| Loop Computation Issues | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 1 | 1 |
| Data Access Issues | More than 10% of the vector loads instructions are unaligned | 1 | 0 |
| Data Access Issues | Presence of special instructions executing on a single port | 1 | 1 |
| Inefficient Vectorization | Presence of special instructions executing on a single port | 1 | 1 |

thread_util.f90: 29 - 0.35 %

| Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | Optimizer analysis |
|---|---|---|---|---|---|---|---|
| orig_default | | | | | | | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| aocc_default | | | | | | | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| gcc_default | 19086 | 1.17 | 0.98 | 0.18 | 100 | 25 | Sum on 1 analyzed binary loop (exec - 19086) |
| icx_3 | | | | | | | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| gcc_1 | 21022 | 1.18 | 0.97 | 0.17 | 100 | 100 | Sum on 1 analyzed binary loop (exec - 21022) |

| Category | Analysis | Count (gcc_default) | Count (gcc_1) |
|---|---|---|---|
| Data Access Issues | More than 10% of the vector loads instructions are unaligned | 1 | 1 |

usnldiag.f90-pp.f90: 103 - 0.34 %

| Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | Optimizer analysis |
|---|---|---|---|---|---|---|---|
| orig_default | | | | | | | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| aocc_default | 7028 | 1.96 | 1.88 | 0.34 | 42.31 | 17.79 | Sum on 1 analyzed binary loop (exec - 7028) |
| gcc_default | | | | | | | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| icx_3 | | | | | | | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| gcc_1 | | | | | | | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |

| Category | Analysis | Count (aocc_default) |
|---|---|---|
| Loop Computation Issues | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 1 |
| Data Access Issues | More than 10% of the vector loads instructions are unaligned | 1 |
| Data Access Issues | Presence of special instructions executing on a single port | 1 |
| Inefficient Vectorization | Presence of special instructions executing on a single port | 1 |

vloc_psi.f90-pp.f90: 475 - 0.32 %

| Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | Optimizer analysis |
|---|---|---|---|---|---|---|---|
| orig_default | | | | | | | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| aocc_default | 7557 | 2.10 | 1.81 | 0.32 | 72.73 | 21.59 | Sum on 1 analyzed binary loop (exec - 7557) |
| gcc_default | | | | | | | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| icx_3 | | | | | | | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| gcc_1 | | | | | | | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |

| Category | Analysis | Count (aocc_default) |
|---|---|---|
| Loop Computation Issues | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 1 |
| Loop Computation Issues | Presence of a large number of scalar integer instructions | 1 |
| Data Access Issues | More than 10% of the vector loads instructions are unaligned | 1 |
| Data Access Issues | Presence of special instructions executing on a single port | 1 |
| Inefficient Vectorization | Presence of special instructions executing on a single port | 1 |

▶init_us_2_acc.f90: 150 - 0.23 %
| Run orig_default | Run aocc_default | Run gcc_default | Run icx_3 | Run gcc_1 | |||||||||||||||||||||||||
| Loop Source Regions | Loop Source Regions | Loop Source Regions | Loop Source Regions |
| Loop Source Regions | ||||||||||||||||||||||||
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | | | | | | | | | | | | | | | | 23691 | 1.32 | 1.25 | 0.23 | 95 | 40.63 | | | | | | |
| No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | Sum on 1 analyzed binary loop (exec - 23691) | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | |||||||||||||||||||||||||
| Analysis | Count | Analysis | Count | Analysis | Count | Analysis | Count | Analysis | Count | ||||||||||||||||||||
| | | | | | | Data Access Issues | | | |
| | | | | | | Presence of indirect access | 1 | | |
| | | | | | | Presence of expensive instructions: scatter/gather | 1 | | |
| | | | | | | Presence of special instructions executing on a single port | 1 | | |
| | | | | | | Vectorization Roadblocks | | | |
| | | | | | | Presence of indirect access | 1 | | |
| | | | | | | Inefficient Vectorization | | | |
| | | | | | | Presence of expensive instructions: scatter/gather | 1 | | |
| | | | | | | Presence of special instructions executing on a single port | 1 | | |
▶thread_util.f90-pp.f90: 29 - 0.20 %
| Run orig_default | Run aocc_default | Run gcc_default | Run icx_3 | Run gcc_1 | |||||||||||||||||||||||||
| Loop Source Regions | Loop Source Regions | Loop Source Regions | Loop Source Regions | Loop Source Regions |
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | | | | 16705 | 1.28 | 1.11 | 0.20 | 100 | 25 | | | | | | | | | | | | | | | | | | |
| No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | Sum on 1 analyzed binary loop (exec - 16705) | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | |||||||||||||||||||||||||
| Analysis | Count | Analysis | Count | Analysis | Count | Analysis | Count | Analysis | Count | ||||||||||||||||||||
| | | Data Access Issues | | | | | | | |
| | | More than 10% of the vector loads instructions are unaligned | 1 | | | | | | |
▶usnldiag.f90: 102 - 0.20 %
| Run orig_default | Run aocc_default | Run gcc_default | Run icx_3 | Run gcc_1 | |||||||||||||||||||||||||
| Loop Source Regions | Loop Source Regions | Loop Source Regions | Loop Source Regions | Loop Source Regions |
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | | | | | | | | | | 8064 | 1.16 | 1.09 | 0.20 | 87.1 | 23.39 | | | | | | | | | | | | |
| No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | Sum on 1 analyzed binary loop (exec - 8064) | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | |||||||||||||||||||||||||
| Analysis | Count | Analysis | Count | Analysis | Count | Analysis | Count | Analysis | Count | ||||||||||||||||||||
| | | | | Loop Computation Issues | | | | | |
| | | | | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 1 | | | | |
| | | | | Data Access Issues | | | | | |
| | | | | More than 10% of the vector loads instructions are unaligned | 1 | | | | |
| | | | | Presence of special instructions executing on a single port | 1 | | | | |
| | | | | Inefficient Vectorization | | | | | |
| | | | | Presence of special instructions executing on a single port | 1 | | | | |
▶make_pointlists.f90: 52 - 0.13 %
| Run orig_default | Run aocc_default | Run gcc_default | Run icx_3 | Run gcc_1 | |||||||||||||||||||||||||
| Loop Source Regions | Loop Source Regions | Loop Source Regions | Loop Source Regions | Loop Source Regions |
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | | | | | | | | | | 1040 | 0.79 | 0.75 | 0.13 | 15.69 | 14.09 | | | | | | | | | | | | |
| No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | Sum on 1 analyzed binary loop (exec - 1040) | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | |||||||||||||||||||||||||
| Analysis | Count | Analysis | Count | Analysis | Count | Analysis | Count | Analysis | Count | ||||||||||||||||||||
| | | | | Loop Computation Issues | | | | | |
| | | | | Presence of expensive FP instructions | 1 | | | | |
| | | | | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 1 | | | | |
| | | | | Control Flow Issues | | | | | |
| | | | | Presence of more than 4 paths | 1 | | | | |
| | | | | Data Access Issues | | | | | |
| | | | | More than 20% of the loads are accessing the stack | 1 | | | | |
| | | | | Vectorization Roadblocks | | | | | |
| | | | | Presence of more than 4 paths | 1 | | | | |

