* Info: Selecting the 'perf-high-ppn' engine for node idp10.benchmarkcenter.megware.com
* Info: Process launched (host idp10.benchmarkcenter.megware.com, process 9670)
* Warning: Found no event able to derive walltime: prepending ref-cycles
* Info: Process launched (host idp10.benchmarkcenter.megware.com, process 9675)
Mon Mar 25 17:51:58 2024: Starting Initialization
Mini-Application Name : CoMD-openmp-mpi
Mini-Application Version : 1.1
Platform:
hostname: idp09.benchmarkcenter.megware.com
kernel name: 'Linux'
kernel release: '5.14.0-362.13.1.el9_3.x86_64'
processor: 'x86_64'
Build:
CC: '/cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicc'
compiler version: 'unknown'
CFLAGS: '-O3 -march=native -fno-alias -qopt-report=3 -DDO_MPI -O3 -xSAPPHIRERAPIDS -flto -g -fno-omit-frame-pointer -fcf-protection=none -no-pie -cc=icx -fiopenmp'
LDFLAGS: ' '
using MPI: true
Threading: OpenMP (1 threads)
Double Precision: true
Run Date/Time: 2024-03-25, 17:51:58
Command Line Parameters:
doeam: 0
potDir: pots
potName: Cu_u6.eam
potType: funcfl
nx: 100
ny: 100
nz: 100
xproc: 2
yproc: 1
zproc: 1
Lattice constant: -1 Angstroms
nSteps: 100
printRate: 10
Time step: 1 fs
Initial Temperature: 600 K
Initial Delta: 0 Angstroms
Simulation data:
Total atoms : 4000000
Min global bounds : [ 0.0000000000, 0.0000000000, 0.0000000000 ]
Max global bounds : [ 361.5000000000, 361.5000000000, 361.5000000000 ]
Decomposition data:
Processors : 2, 1, 1
Local boxes : 31, 62, 62 = 119164
Box size : [ 5.8306451613, 5.8306451613, 5.8306451613 ]
Box factor : [ 1.0074548875, 1.0074548875, 1.0074548875 ]
Max Link Cell Occupancy: 32 of 64
Potential data:
Potential type : Lennard-Jones
Species name : Cu
Atomic number : 29
Mass : 63.55 amu
Lattice Type : FCC
Lattice spacing : 3.615 Angstroms
Cutoff : 5.7875 Angstroms
Epsilon : 0.167 eV
Sigma : 2.315 Angstroms
Initial energy : -1.166063303588, atom count : 4000000
Mon Mar 25 17:52:01 2024: Initialization Finished
Mon Mar 25 17:52:01 2024: Starting simulation
# Performance
# Loop Time(fs) Total Energy Potential Energy Kinetic Energy Temperature (us/atom) # Atoms
0 0.00 -1.166063303588 -1.243619295188 0.077555991600 600.0000 0.0000 4000000
10 10.00 -1.166059650499 -1.233157709948 0.067098059449 519.0938 1.0705 4000000
20 20.00 -1.166048438417 -1.208183014318 0.042134575902 325.9677 1.2712 4000000
30 30.00 -1.166037590737 -1.186586197151 0.020548606414 158.9711 1.3127 4000000
40 40.00 -1.166042093135 -1.183625399859 0.017583306724 136.0305 1.3209 4000000
50 50.00 -1.166051684893 -1.193713710258 0.027662025365 214.0030 1.3232 4000000
60 60.00 -1.166054646931 -1.202662201513 0.036607554582 283.2087 1.3230 4000000
70 70.00 -1.166052143011 -1.204911990844 0.038859847833 300.6332 1.3212 4000000
80 80.00 -1.166048803912 -1.203635015020 0.037586211108 290.7799 1.3194 4000000
90 90.00 -1.166048006780 -1.203820491598 0.037772484818 292.2210 1.3168 4000000
100 100.00 -1.166049793505 -1.206862845061 0.040813051556 315.7439 1.3148 4000000
Mon Mar 25 17:56:19 2024: Ending simulation
Simulation Validation:
Initial energy : -1.166063303588
Final energy : -1.166049793505
eFinal/eInitial : 0.999988
Final atom count : 4000000, no atoms lost
Timings for Rank 0
Timer # Calls Avg/Call (s) Total (s) % Loop
___________________________________________________________________
total 1 260.2897 260.2897 100.94
loop 1 257.8754 257.8754 100.00
timestep 10 25.7875 257.8750 100.00
position 100 0.0188 1.8792 0.73
velocity 200 0.0179 3.5711 1.38
redistribute 101 0.0877 8.8623 3.44
atomHalo 101 0.0141 1.4222 0.55
force 101 2.4299 245.4210 95.17
commHalo 303 0.0013 0.4037 0.16
commReduce 39 0.0001 0.0050 0.00
Timing Statistics Across 2 Ranks:
Timer Rank: Min(s) Rank: Max(s) Avg(s) Stdev(s)
_____________________________________________________________________________
total 0: 260.2897 1: 260.2898 260.2897 0.0001
loop 0: 257.8754 1: 257.8754 257.8754 0.0000
timestep 0: 257.8750 1: 257.8750 257.8750 0.0000
position 1: 1.8599 0: 1.8792 1.8696 0.0096
velocity 1: 3.5608 0: 3.5711 3.5659 0.0051
redistribute 0: 8.8623 1: 8.8859 8.8741 0.0118
atomHalo 0: 1.4222 1: 1.4732 1.4477 0.0255
force 0: 245.4210 1: 245.4237 245.4223 0.0014
commHalo 0: 0.4037 1: 0.4935 0.4486 0.0449
commReduce 0: 0.0050 1: 0.0084 0.0067 0.0017
---------------------------------------------------
Average atom update rate: 1.29 us/atom/task
---------------------------------------------------
---------------------------------------------------
Average all atom update rate: 0.64 us/atom
---------------------------------------------------
---------------------------------------------------
Average atom rate: 1.55 atoms/us
---------------------------------------------------
Mon Mar 25 17:56:19 2024: CoMD Ending
* Info: Process finished (host idp10.benchmarkcenter.megware.com, process 9670)
* Info: Process finished (host idp10.benchmarkcenter.megware.com, process 9675)
Your experiment path is /home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_0
To display your profiling results:
#############################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#############################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_0 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_0 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_0 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_0 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_0 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_0 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_0 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_0 #
#############################################################################################################################################################################################
* Info: Selecting the 'perf-high-ppn' engine for node idp10.benchmarkcenter.megware.com
* Info: Process launched (host idp10.benchmarkcenter.megware.com, process 9751)
* Warning: Found no event able to derive walltime: prepending ref-cycles
* Info: Process launched (host idp10.benchmarkcenter.megware.com, process 9756)
Mon Mar 25 17:56:28 2024: Starting Initialization
Mini-Application Name : CoMD-openmp-mpi
Mini-Application Version : 1.1
Platform:
hostname: idp09.benchmarkcenter.megware.com
kernel name: 'Linux'
kernel release: '5.14.0-362.13.1.el9_3.x86_64'
processor: 'x86_64'
Build:
CC: '/cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicc'
compiler version: 'unknown'
CFLAGS: '-O3 -march=native -fno-alias -qopt-report=3 -DDO_MPI -O3 -xSAPPHIRERAPIDS -flto -g -fno-omit-frame-pointer -fcf-protection=none -no-pie -cc=icx -fiopenmp'
LDFLAGS: ' '
using MPI: true
Threading: OpenMP (2 threads)
Double Precision: true
Run Date/Time: 2024-03-25, 17:56:28
Command Line Parameters:
doeam: 0
potDir: pots
potName: Cu_u6.eam
potType: funcfl
nx: 100
ny: 100
nz: 100
xproc: 2
yproc: 1
zproc: 1
Lattice constant: -1 Angstroms
nSteps: 100
printRate: 10
Time step: 1 fs
Initial Temperature: 600 K
Initial Delta: 0 Angstroms
Simulation data:
Total atoms : 4000000
Min global bounds : [ 0.0000000000, 0.0000000000, 0.0000000000 ]
Max global bounds : [ 361.5000000000, 361.5000000000, 361.5000000000 ]
Decomposition data:
Processors : 2, 1, 1
Local boxes : 31, 62, 62 = 119164
Box size : [ 5.8306451613, 5.8306451613, 5.8306451613 ]
Box factor : [ 1.0074548875, 1.0074548875, 1.0074548875 ]
Max Link Cell Occupancy: 32 of 64
Potential data:
Potential type : Lennard-Jones
Species name : Cu
Atomic number : 29
Mass : 63.55 amu
Lattice Type : FCC
Lattice spacing : 3.615 Angstroms
Cutoff : 5.7875 Angstroms
Epsilon : 0.167 eV
Sigma : 2.315 Angstroms
Initial energy : -1.166063304092, atom count : 4000000
Mon Mar 25 17:56:29 2024: Initialization Finished
Mon Mar 25 17:56:29 2024: Starting simulation
# Performance
# Loop Time(fs) Total Energy Potential Energy Kinetic Energy Temperature (us/atom) # Atoms
0 0.00 -1.166063304092 -1.243619295692 0.077555991600 600.0000 0.0000 4000000
10 10.00 -1.166059650500 -1.233157709949 0.067098059449 519.0938 0.5486 4000000
20 20.00 -1.166048438416 -1.208183014318 0.042134575902 325.9677 0.6484 4000000
30 30.00 -1.166037590737 -1.186586197151 0.020548606414 158.9711 0.6686 4000000
40 40.00 -1.166042093134 -1.183625399859 0.017583306724 136.0305 0.6730 4000000
50 50.00 -1.166051684893 -1.193713710258 0.027662025365 214.0030 0.6743 4000000
60 60.00 -1.166054646931 -1.202662201513 0.036607554582 283.2087 0.6741 4000000
70 70.00 -1.166052143011 -1.204911990844 0.038859847833 300.6332 0.6732 4000000
80 80.00 -1.166048803912 -1.203635015020 0.037586211108 290.7799 0.6721 4000000
90 90.00 -1.166048006780 -1.203820491599 0.037772484818 292.2210 0.6710 4000000
100 100.00 -1.166049793504 -1.206862845060 0.040813051556 315.7439 0.6700 4000000
Mon Mar 25 17:58:41 2024: Ending simulation
Simulation Validation:
Initial energy : -1.166063304092
Final energy : -1.166049793504
eFinal/eInitial : 0.999988
Final atom count : 4000000, no atoms lost
Timings for Rank 0
Timer # Calls Avg/Call (s) Total (s) % Loop
___________________________________________________________________
total 1 132.7728 132.7728 100.99
loop 1 131.4651 131.4651 100.00
timestep 10 13.1465 131.4647 100.00
position 100 0.0102 1.0183 0.77
velocity 200 0.0098 1.9525 1.49
redistribute 101 0.0621 6.2720 4.77
atomHalo 101 0.0141 1.4238 1.08
force 101 1.2196 123.1798 93.70
commHalo 303 0.0015 0.4466 0.34
commReduce 39 0.0000 0.0006 0.00
Timing Statistics Across 2 Ranks:
Timer Rank: Min(s) Rank: Max(s) Avg(s) Stdev(s)
_____________________________________________________________________________
total 1: 132.7719 0: 132.7728 132.7724 0.0004
loop 0: 131.4651 1: 131.4651 131.4651 0.0000
timestep 0: 131.4647 1: 131.4648 131.4647 0.0000
position 1: 1.0113 0: 1.0183 1.0148 0.0035
velocity 1: 1.9462 0: 1.9525 1.9494 0.0031
redistribute 0: 6.2720 1: 6.3711 6.3215 0.0495
atomHalo 0: 1.4238 1: 1.5108 1.4673 0.0435
force 1: 123.0813 0: 123.1798 123.1306 0.0492
commHalo 0: 0.4466 1: 0.5783 0.5125 0.0659
commReduce 0: 0.0006 1: 0.0143 0.0074 0.0068
---------------------------------------------------
Average atom update rate: 0.66 us/atom/task
---------------------------------------------------
---------------------------------------------------
Average all atom update rate: 0.33 us/atom
---------------------------------------------------
---------------------------------------------------
Average atom rate: 3.04 atoms/us
---------------------------------------------------
Mon Mar 25 17:58:41 2024: CoMD Ending
* Info: Process finished (host idp10.benchmarkcenter.megware.com, process 9756)
* Info: Process finished (host idp10.benchmarkcenter.megware.com, process 9751)
Your experiment path is /home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_1
To display your profiling results:
#############################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#############################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_1 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_1 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_1 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_1 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_1 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_1 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_1 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_1 #
#############################################################################################################################################################################################
* Info: Selecting the 'perf-high-ppn' engine for node idp10.benchmarkcenter.megware.com
* Info: Process launched (host idp10.benchmarkcenter.megware.com, process 9830)
* Warning: Found no event able to derive walltime: prepending ref-cycles
* Info: Process launched (host idp10.benchmarkcenter.megware.com, process 9835)
Mon Mar 25 17:58:51 2024: Starting Initialization
Mini-Application Name : CoMD-openmp-mpi
Mini-Application Version : 1.1
Platform:
hostname: idp09.benchmarkcenter.megware.com
kernel name: 'Linux'
kernel release: '5.14.0-362.13.1.el9_3.x86_64'
processor: 'x86_64'
Build:
CC: '/cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicc'
compiler version: 'unknown'
CFLAGS: '-O3 -march=native -fno-alias -qopt-report=3 -DDO_MPI -O3 -xSAPPHIRERAPIDS -flto -g -fno-omit-frame-pointer -fcf-protection=none -no-pie -cc=icx -fiopenmp'
LDFLAGS: ' '
using MPI: true
Threading: OpenMP (4 threads)
Double Precision: true
Run Date/Time: 2024-03-25, 17:58:51
Command Line Parameters:
doeam: 0
potDir: pots
potName: Cu_u6.eam
potType: funcfl
nx: 100
ny: 100
nz: 100
xproc: 2
yproc: 1
zproc: 1
Lattice constant: -1 Angstroms
nSteps: 100
printRate: 10
Time step: 1 fs
Initial Temperature: 600 K
Initial Delta: 0 Angstroms
Simulation data:
Total atoms : 4000000
Min global bounds : [ 0.0000000000, 0.0000000000, 0.0000000000 ]
Max global bounds : [ 361.5000000000, 361.5000000000, 361.5000000000 ]
Decomposition data:
Processors : 2, 1, 1
Local boxes : 31, 62, 62 = 119164
Box size : [ 5.8306451613, 5.8306451613, 5.8306451613 ]
Box factor : [ 1.0074548875, 1.0074548875, 1.0074548875 ]
Max Link Cell Occupancy: 32 of 64
Potential data:
Potential type : Lennard-Jones
Species name : Cu
Atomic number : 29
Mass : 63.55 amu
Lattice Type : FCC
Lattice spacing : 3.615 Angstroms
Cutoff : 5.7875 Angstroms
Epsilon : 0.167 eV
Sigma : 2.315 Angstroms
Initial energy : -1.166063303067, atom count : 4000000
Mon Mar 25 17:58:52 2024: Initialization Finished
Mon Mar 25 17:58:52 2024: Starting simulation
# Performance
# Loop Time(fs) Total Energy Potential Energy Kinetic Energy Temperature (us/atom) # Atoms
0 0.00 -1.166063303067 -1.243619294667 0.077555991600 600.0000 0.0000 4000000
10 10.00 -1.166059650499 -1.233157709949 0.067098059449 519.0938 0.2885 4000000
20 20.00 -1.166048438416 -1.208183014318 0.042134575902 325.9677 0.3346 4000000
30 30.00 -1.166037590737 -1.186586197151 0.020548606414 158.9711 0.3449 4000000
40 40.00 -1.166042093134 -1.183625399859 0.017583306724 136.0305 0.3470 4000000
50 50.00 -1.166051684893 -1.193713710257 0.027662025365 214.0030 0.3474 4000000
60 60.00 -1.166054646931 -1.202662201513 0.036607554582 283.2087 0.3473 4000000
70 70.00 -1.166052143011 -1.204911990845 0.038859847833 300.6332 0.3469 4000000
80 80.00 -1.166048803912 -1.203635015020 0.037586211108 290.7799 0.3464 4000000
90 90.00 -1.166048006780 -1.203820491599 0.037772484818 292.2210 0.3458 4000000
100 100.00 -1.166049793504 -1.206862845060 0.040813051556 315.7439 0.3452 4000000
Mon Mar 25 18:00:00 2024: Ending simulation
Simulation Validation:
Initial energy : -1.166063303067
Final energy : -1.166049793504
eFinal/eInitial : 0.999988
Final atom count : 4000000, no atoms lost
Timings for Rank 0
Timer # Calls Avg/Call (s) Total (s) % Loop
___________________________________________________________________
total 1 68.6308 68.6308 101.11
loop 1 67.8797 67.8797 100.00
timestep 10 6.7879 67.8793 100.00
position 100 0.0055 0.5452 0.80
velocity 200 0.0052 1.0478 1.54
redistribute 101 0.0490 4.9516 7.29
atomHalo 101 0.0140 1.4163 2.09
force 101 0.6123 61.8426 91.11
commHalo 303 0.0014 0.4099 0.60
commReduce 39 0.0001 0.0023 0.00
Timing Statistics Across 2 Ranks:
Timer Rank: Min(s) Rank: Max(s) Avg(s) Stdev(s)
_____________________________________________________________________________
total 1: 68.6305 0: 68.6308 68.6306 0.0001
loop 0: 67.8797 1: 67.8797 67.8797 0.0000
timestep 0: 67.8793 1: 67.8793 67.8793 0.0000
position 1: 0.5431 0: 0.5452 0.5442 0.0010
velocity 1: 1.0423 0: 1.0478 1.0451 0.0027
redistribute 1: 4.9496 0: 4.9516 4.9506 0.0010
atomHalo 0: 1.4163 1: 1.4458 1.4310 0.0147
force 0: 61.8426 1: 61.8516 61.8471 0.0045
commHalo 0: 0.4099 1: 0.4444 0.4271 0.0173
commReduce 0: 0.0023 1: 0.0032 0.0027 0.0004
---------------------------------------------------
Average atom update rate: 0.34 us/atom/task
---------------------------------------------------
---------------------------------------------------
Average all atom update rate: 0.17 us/atom
---------------------------------------------------
---------------------------------------------------
Average atom rate: 5.89 atoms/us
---------------------------------------------------
Mon Mar 25 18:00:00 2024: CoMD Ending
* Info: Process finished (host idp10.benchmarkcenter.megware.com, process 9835)
* Info: Process finished (host idp10.benchmarkcenter.megware.com, process 9830)
Your experiment path is /home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_2
To display your profiling results:
#############################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#############################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_2 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_2 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_2 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_2 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_2 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_2 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_2 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_2 #
#############################################################################################################################################################################################
* Info: Selecting the 'perf-high-ppn' engine for node idp10.benchmarkcenter.megware.com
* Info: Process launched (host idp10.benchmarkcenter.megware.com, process 9914)
* Warning: Found no event able to derive walltime: prepending ref-cycles
* Info: Process launched (host idp10.benchmarkcenter.megware.com, process 9919)
Mon Mar 25 18:00:10 2024: Starting Initialization
Mini-Application Name : CoMD-openmp-mpi
Mini-Application Version : 1.1
Platform:
hostname: idp09.benchmarkcenter.megware.com
kernel name: 'Linux'
kernel release: '5.14.0-362.13.1.el9_3.x86_64'
processor: 'x86_64'
Build:
CC: '/cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicc'
compiler version: 'unknown'
CFLAGS: '-O3 -march=native -fno-alias -qopt-report=3 -DDO_MPI -O3 -xSAPPHIRERAPIDS -flto -g -fno-omit-frame-pointer -fcf-protection=none -no-pie -cc=icx -fiopenmp'
LDFLAGS: ' '
using MPI: true
Threading: OpenMP (8 threads)
Double Precision: true
Run Date/Time: 2024-03-25, 18:00:10
Command Line Parameters:
doeam: 0
potDir: pots
potName: Cu_u6.eam
potType: funcfl
nx: 100
ny: 100
nz: 100
xproc: 2
yproc: 1
zproc: 1
Lattice constant: -1 Angstroms
nSteps: 100
printRate: 10
Time step: 1 fs
Initial Temperature: 600 K
Initial Delta: 0 Angstroms
Simulation data:
Total atoms : 4000000
Min global bounds : [ 0.0000000000, 0.0000000000, 0.0000000000 ]
Max global bounds : [ 361.5000000000, 361.5000000000, 361.5000000000 ]
Decomposition data:
Processors : 2, 1, 1
Local boxes : 31, 62, 62 = 119164
Box size : [ 5.8306451613, 5.8306451613, 5.8306451613 ]
Box factor : [ 1.0074548875, 1.0074548875, 1.0074548875 ]
Max Link Cell Occupancy: 32 of 64
Potential data:
Potential type : Lennard-Jones
Species name : Cu
Atomic number : 29
Mass : 63.55 amu
Lattice Type : FCC
Lattice spacing : 3.615 Angstroms
Cutoff : 5.7875 Angstroms
Epsilon : 0.167 eV
Sigma : 2.315 Angstroms
Initial energy : -1.166063303663, atom count : 4000000
Mon Mar 25 18:00:10 2024: Initialization Finished
Mon Mar 25 18:00:10 2024: Starting simulation
# Performance
# Loop Time(fs) Total Energy Potential Energy Kinetic Energy Temperature (us/atom) # Atoms
0 0.00 -1.166063303663 -1.243619295263 0.077555991600 600.0000 0.0000 4000000
10 10.00 -1.166059650500 -1.233157709949 0.067098059449 519.0938 0.1631 4000000
20 20.00 -1.166048438416 -1.208183014318 0.042134575902 325.9677 0.1768 4000000
30 30.00 -1.166037590737 -1.186586197151 0.020548606414 158.9711 0.1822 4000000
40 40.00 -1.166042093134 -1.183625399859 0.017583306724 136.0305 0.1834 4000000
50 50.00 -1.166051684893 -1.193713710258 0.027662025365 214.0030 0.1839 4000000
60 60.00 -1.166054646931 -1.202662201513 0.036607554582 283.2087 0.1839 4000000
70 70.00 -1.166052143011 -1.204911990845 0.038859847833 300.6332 0.1837 4000000
80 80.00 -1.166048803912 -1.203635015020 0.037586211108 290.7799 0.1835 4000000
90 90.00 -1.166048006780 -1.203820491599 0.037772484818 292.2210 0.1830 4000000
100 100.00 -1.166049793504 -1.206862845060 0.040813051556 315.7439 0.1829 4000000
Mon Mar 25 18:00:46 2024: Ending simulation
Simulation Validation:
Initial energy : -1.166063303663
Final energy : -1.166049793504
eFinal/eInitial : 0.999988
Final atom count : 4000000, no atoms lost
Timings for Rank 0
Timer # Calls Avg/Call (s) Total (s) % Loop
___________________________________________________________________
total 1 36.6125 36.6125 101.34
loop 1 36.1288 36.1288 100.00
timestep 10 3.6128 36.1284 100.00
position 100 0.0029 0.2892 0.80
velocity 200 0.0028 0.5543 1.53
redistribute 101 0.0428 4.3195 11.96
atomHalo 101 0.0137 1.3792 3.82
force 101 0.3094 31.2479 86.49
commHalo 303 0.0010 0.3174 0.88
commReduce 39 0.0001 0.0034 0.01
Timing Statistics Across 2 Ranks:
Timer Rank: Min(s) Rank: Max(s) Avg(s) Stdev(s)
_____________________________________________________________________________
total 1: 36.6109 0: 36.6125 36.6117 0.0008
loop 0: 36.1288 1: 36.1288 36.1288 0.0000
timestep 0: 36.1284 1: 36.1285 36.1284 0.0000
position 1: 0.2882 0: 0.2892 0.2887 0.0005
velocity 0: 0.5543 1: 0.5563 0.5553 0.0010
redistribute 0: 4.3195 1: 4.3426 4.3311 0.0116
atomHalo 0: 1.3792 1: 1.5361 1.4577 0.0785
force 1: 31.2196 0: 31.2479 31.2337 0.0142
commHalo 0: 0.3174 1: 0.5315 0.4245 0.1071
commReduce 0: 0.0034 1: 0.0064 0.0049 0.0015
---------------------------------------------------
Average atom update rate: 0.18 us/atom/task
---------------------------------------------------
---------------------------------------------------
Average all atom update rate: 0.09 us/atom
---------------------------------------------------
---------------------------------------------------
Average atom rate: 11.07 atoms/us
---------------------------------------------------
Mon Mar 25 18:00:46 2024: CoMD Ending
* Info: Process finished (host idp10.benchmarkcenter.megware.com, process 9919)
* Info: Process finished (host idp10.benchmarkcenter.megware.com, process 9914)
Your experiment path is /home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_3
To display your profiling results:
#############################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#############################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_3 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_3 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_3 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_3 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_3 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_3 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_3 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_3 #
#############################################################################################################################################################################################
* Info: Selecting the 'perf-high-ppn' engine for node idp10.benchmarkcenter.megware.com
* Info: Process launched (host idp10.benchmarkcenter.megware.com, process 9992)
* Warning: Found no event able to derive walltime: prepending ref-cycles
* Info: Process launched (host idp10.benchmarkcenter.megware.com, process 9997)
Mon Mar 25 18:00:55 2024: Starting Initialization
Mini-Application Name : CoMD-openmp-mpi
Mini-Application Version : 1.1
Platform:
hostname: idp09.benchmarkcenter.megware.com
kernel name: 'Linux'
kernel release: '5.14.0-362.13.1.el9_3.x86_64'
processor: 'x86_64'
Build:
CC: '/cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicc'
compiler version: 'unknown'
CFLAGS: '-O3 -march=native -fno-alias -qopt-report=3 -DDO_MPI -O3 -xSAPPHIRERAPIDS -flto -g -fno-omit-frame-pointer -fcf-protection=none -no-pie -cc=icx -fiopenmp'
LDFLAGS: ' '
using MPI: true
Threading: OpenMP (16 threads)
Double Precision: true
Run Date/Time: 2024-03-25, 18:00:55
Command Line Parameters:
doeam: 0
potDir: pots
potName: Cu_u6.eam
potType: funcfl
nx: 100
ny: 100
nz: 100
xproc: 2
yproc: 1
zproc: 1
Lattice constant: -1 Angstroms
nSteps: 100
printRate: 10
Time step: 1 fs
Initial Temperature: 600 K
Initial Delta: 0 Angstroms
Simulation data:
Total atoms : 4000000
Min global bounds : [ 0.0000000000, 0.0000000000, 0.0000000000 ]
Max global bounds : [ 361.5000000000, 361.5000000000, 361.5000000000 ]
Decomposition data:
Processors : 2, 1, 1
Local boxes : 31, 62, 62 = 119164
Box size : [ 5.8306451613, 5.8306451613, 5.8306451613 ]
Box factor : [ 1.0074548875, 1.0074548875, 1.0074548875 ]
Max Link Cell Occupancy: 32 of 64
Potential data:
Potential type : Lennard-Jones
Species name : Cu
Atomic number : 29
Mass : 63.55 amu
Lattice Type : FCC
Lattice spacing : 3.615 Angstroms
Cutoff : 5.7875 Angstroms
Epsilon : 0.167 eV
Sigma : 2.315 Angstroms
Initial energy : -1.166063303487, atom count : 4000000
Mon Mar 25 18:00:56 2024: Initialization Finished
Mon Mar 25 18:00:56 2024: Starting simulation
# Performance
# Loop Time(fs) Total Energy Potential Energy Kinetic Energy Temperature (us/atom) # Atoms
0 0.00 -1.166063303487 -1.243619295087 0.077555991600 600.0000 0.0000 4000000
10 10.00 -1.166059650500 -1.233157709949 0.067098059449 519.0938 0.0938 4000000
20 20.00 -1.166048438416 -1.208183014318 0.042134575902 325.9677 0.0998 4000000
30 30.00 -1.166037590737 -1.186586197151 0.020548606414 158.9711 0.1014 4000000
40 40.00 -1.166042093134 -1.183625399859 0.017583306724 136.0305 0.1020 4000000
50 50.00 -1.166051684893 -1.193713710257 0.027662025365 214.0030 0.1022 4000000
60 60.00 -1.166054646931 -1.202662201513 0.036607554582 283.2087 0.1022 4000000
70 70.00 -1.166052143011 -1.204911990844 0.038859847833 300.6332 0.1022 4000000
80 80.00 -1.166048803912 -1.203635015020 0.037586211108 290.7799 0.1021 4000000
90 90.00 -1.166048006780 -1.203820491599 0.037772484818 292.2210 0.1018 4000000
100 100.00 -1.166049793504 -1.206862845060 0.040813051556 315.7439 0.1017 4000000
Mon Mar 25 18:01:16 2024: Ending simulation
Simulation Validation:
Initial energy : -1.166063303487
Final energy : -1.166049793504
eFinal/eInitial : 0.999988
Final atom count : 4000000, no atoms lost
Timings for Rank 0
Timer # Calls Avg/Call (s) Total (s) % Loop
___________________________________________________________________
total 1 20.5321 20.5321 101.70
loop 1 20.1895 20.1895 100.00
timestep 10 2.0189 20.1891 100.00
position 100 0.0018 0.1761 0.87
velocity 200 0.0017 0.3333 1.65
redistribute 101 0.0398 4.0225 19.92
atomHalo 101 0.0145 1.4646 7.25
force 101 0.1566 15.8185 78.35
commHalo 303 0.0013 0.3898 1.93
commReduce 39 0.0001 0.0053 0.03
Timing Statistics Across 2 Ranks:
Timer Rank: Min(s) Rank: Max(s) Avg(s) Stdev(s)
_____________________________________________________________________________
total 1: 20.5310 0: 20.5321 20.5315 0.0005
loop 0: 20.1895 1: 20.1895 20.1895 0.0000
timestep 0: 20.1891 1: 20.1892 20.1891 0.0000
position 0: 0.1761 1: 0.1764 0.1763 0.0001
velocity 1: 0.3315 0: 0.3333 0.3324 0.0009
redistribute 1: 3.9892 0: 4.0225 4.0059 0.0166
atomHalo 0: 1.4646 1: 1.5256 1.4951 0.0305
force 0: 15.8185 1: 15.8520 15.8352 0.0167
commHalo 0: 0.3898 1: 0.5709 0.4804 0.0905
commReduce 0: 0.0053 1: 0.0056 0.0055 0.0002
---------------------------------------------------
Average atom update rate: 0.10 us/atom/task
---------------------------------------------------
---------------------------------------------------
Average all atom update rate: 0.05 us/atom
---------------------------------------------------
---------------------------------------------------
Average atom rate: 19.81 atoms/us
---------------------------------------------------
Mon Mar 25 18:01:16 2024: CoMD Ending
* Info: Process finished (host idp10.benchmarkcenter.megware.com, process 9992)
* Info: Process finished (host idp10.benchmarkcenter.megware.com, process 9997)
Your experiment path is /home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_4
To display your profiling results:
#############################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#############################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_4 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_4 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_4 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_4 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_4 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_4 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_4 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_4 #
#############################################################################################################################################################################################
* Info: Selecting the 'perf-high-ppn' engine for node idp10.benchmarkcenter.megware.com
* Info: Process launched (host idp10.benchmarkcenter.megware.com, process 10134)
* Warning: Found no event able to derive walltime: prepending ref-cycles
* Info: Process launched (host idp10.benchmarkcenter.megware.com, process 10139)
Mon Mar 25 18:01:26 2024: Starting Initialization
Mini-Application Name : CoMD-openmp-mpi
Mini-Application Version : 1.1
Platform:
hostname: idp09.benchmarkcenter.megware.com
kernel name: 'Linux'
kernel release: '5.14.0-362.13.1.el9_3.x86_64'
processor: 'x86_64'
Build:
CC: '/cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicc'
compiler version: 'unknown'
CFLAGS: '-O3 -march=native -fno-alias -qopt-report=3 -DDO_MPI -O3 -xSAPPHIRERAPIDS -flto -g -fno-omit-frame-pointer -fcf-protection=none -no-pie -cc=icx -fiopenmp'
LDFLAGS: ' '
using MPI: true
Threading: OpenMP (32 threads)
Double Precision: true
Run Date/Time: 2024-03-25, 18:01:26
Command Line Parameters:
doeam: 0
potDir: pots
potName: Cu_u6.eam
potType: funcfl
nx: 100
ny: 100
nz: 100
xproc: 2
yproc: 1
zproc: 1
Lattice constant: -1 Angstroms
nSteps: 100
printRate: 10
Time step: 1 fs
Initial Temperature: 600 K
Initial Delta: 0 Angstroms
Simulation data:
Total atoms : 4000000
Min global bounds : [ 0.0000000000, 0.0000000000, 0.0000000000 ]
Max global bounds : [ 361.5000000000, 361.5000000000, 361.5000000000 ]
Decomposition data:
Processors : 2, 1, 1
Local boxes : 31, 62, 62 = 119164
Box size : [ 5.8306451613, 5.8306451613, 5.8306451613 ]
Box factor : [ 1.0074548875, 1.0074548875, 1.0074548875 ]
Max Link Cell Occupancy: 32 of 64
Potential data:
Potential type : Lennard-Jones
Species name : Cu
Atomic number : 29
Mass : 63.55 amu
Lattice Type : FCC
Lattice spacing : 3.615 Angstroms
Cutoff : 5.7875 Angstroms
Epsilon : 0.167 eV
Sigma : 2.315 Angstroms
Initial energy : -1.166063303522, atom count : 4000000
Mon Mar 25 18:01:26 2024: Initialization Finished
Mon Mar 25 18:01:26 2024: Starting simulation
# Performance
# Loop Time(fs) Total Energy Potential Energy Kinetic Energy Temperature (us/atom) # Atoms
0 0.00 -1.166063303522 -1.243619295122 0.077555991600 600.0000 0.0000 4000000
10 10.00 -1.166059650500 -1.233157709949 0.067098059449 519.0938 0.0834 4000000
20 20.00 -1.166048438416 -1.208183014318 0.042134575902 325.9677 0.0863 4000000
30 30.00 -1.166037590737 -1.186586197151 0.020548606414 158.9711 0.0814 4000000
40 40.00 -1.166042093134 -1.183625399859 0.017583306724 136.0305 0.0818 4000000
50 50.00 -1.166051684893 -1.193713710258 0.027662025365 214.0030 0.0819 4000000
60 60.00 -1.166054646931 -1.202662201513 0.036607554582 283.2087 0.0821 4000000
70 70.00 -1.166052143011 -1.204911990844 0.038859847833 300.6332 0.0820 4000000
80 80.00 -1.166048803912 -1.203635015020 0.037586211108 290.7799 0.0820 4000000
90 90.00 -1.166048006780 -1.203820491599 0.037772484818 292.2210 0.0818 4000000
100 100.00 -1.166049793504 -1.206862845060 0.040813051556 315.7439 0.0818 4000000
Mon Mar 25 18:01:43 2024: Ending simulation
Simulation Validation:
Initial energy : -1.166063303522
Final energy : -1.166049793504
eFinal/eInitial : 0.999988
Final atom count : 4000000, no atoms lost
Timings for Rank 0
Timer # Calls Avg/Call (s) Total (s) % Loop
___________________________________________________________________
total 1 16.8288 16.8288 102.04
loop 1 16.4930 16.4930 100.00
timestep 10 1.6493 16.4926 100.00
position 100 0.0036 0.3607 2.19
velocity 200 0.0031 0.6140 3.72
redistribute 101 0.0465 4.6993 28.49
atomHalo 101 0.0160 1.6150 9.79
force 101 0.1083 10.9345 66.30
commHalo 303 0.0018 0.5378 3.26
commReduce 39 0.0002 0.0097 0.06
Timing Statistics Across 2 Ranks:
Timer Rank: Min(s) Rank: Max(s) Avg(s) Stdev(s)
_____________________________________________________________________________
total 1: 16.8284 0: 16.8288 16.8286 0.0002
loop 0: 16.4930 1: 16.4930 16.4930 0.0000
timestep 0: 16.4926 1: 16.4927 16.4926 0.0000
position 0: 0.3607 1: 0.3653 0.3630 0.0023
velocity 0: 0.6140 1: 0.6227 0.6183 0.0044
redistribute 1: 4.6696 0: 4.6993 4.6845 0.0148
atomHalo 0: 1.6150 1: 1.7944 1.7047 0.0897
force 0: 10.9345 1: 10.9465 10.9405 0.0060
commHalo 0: 0.5378 1: 0.8117 0.6747 0.1369
commReduce 0: 0.0097 1: 0.0100 0.0099 0.0001
---------------------------------------------------
Average atom update rate: 0.08 us/atom/task
---------------------------------------------------
---------------------------------------------------
Average all atom update rate: 0.04 us/atom
---------------------------------------------------
---------------------------------------------------
Average atom rate: 24.25 atoms/us
---------------------------------------------------
Mon Mar 25 18:01:43 2024: CoMD Ending
* Info: Process finished (host idp10.benchmarkcenter.megware.com, process 10139)
* Info: Process finished (host idp10.benchmarkcenter.megware.com, process 10134)
Your experiment path is /home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_5
To display your profiling results:
#############################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#############################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_5 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_5 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_5 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_5 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_5 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_5 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_5 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_5 #
#############################################################################################################################################################################################
* Info: Selecting the 'perf-high-ppn' engine for node idp10.benchmarkcenter.megware.com
* Info: Process launched (host idp10.benchmarkcenter.megware.com, process 10305)
* Warning: Found no event able to derive walltime: prepending ref-cycles
* Info: Process launched (host idp10.benchmarkcenter.megware.com, process 10310)
Mon Mar 25 18:01:53 2024: Starting Initialization
Mini-Application Name : CoMD-openmp-mpi
Mini-Application Version : 1.1
Platform:
hostname: idp09.benchmarkcenter.megware.com
kernel name: 'Linux'
kernel release: '5.14.0-362.13.1.el9_3.x86_64'
processor: 'x86_64'
Build:
CC: '/cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicc'
compiler version: 'unknown'
CFLAGS: '-O3 -march=native -fno-alias -qopt-report=3 -DDO_MPI -O3 -xSAPPHIRERAPIDS -flto -g -fno-omit-frame-pointer -fcf-protection=none -no-pie -cc=icx -fiopenmp'
LDFLAGS: ' '
using MPI: true
Threading: OpenMP (48 threads)
Double Precision: true
Run Date/Time: 2024-03-25, 18:01:53
Command Line Parameters:
doeam: 0
potDir: pots
potName: Cu_u6.eam
potType: funcfl
nx: 100
ny: 100
nz: 100
xproc: 2
yproc: 1
zproc: 1
Lattice constant: -1 Angstroms
nSteps: 100
printRate: 10
Time step: 1 fs
Initial Temperature: 600 K
Initial Delta: 0 Angstroms
Simulation data:
Total atoms : 4000000
Min global bounds : [ 0.0000000000, 0.0000000000, 0.0000000000 ]
Max global bounds : [ 361.5000000000, 361.5000000000, 361.5000000000 ]
Decomposition data:
Processors : 2, 1, 1
Local boxes : 31, 62, 62 = 119164
Box size : [ 5.8306451613, 5.8306451613, 5.8306451613 ]
Box factor : [ 1.0074548875, 1.0074548875, 1.0074548875 ]
Max Link Cell Occupancy: 32 of 64
Potential data:
Potential type : Lennard-Jones
Species name : Cu
Atomic number : 29
Mass : 63.55 amu
Lattice Type : FCC
Lattice spacing : 3.615 Angstroms
Cutoff : 5.7875 Angstroms
Epsilon : 0.167 eV
Sigma : 2.315 Angstroms
Initial energy : -1.166063303481, atom count : 4000000
Mon Mar 25 18:01:53 2024: Initialization Finished
Mon Mar 25 18:01:53 2024: Starting simulation
# Performance
# Loop Time(fs) Total Energy Potential Energy Kinetic Energy Temperature (us/atom) # Atoms
0 0.00 -1.166063303481 -1.243619295081 0.077555991600 600.0000 0.0000 4000000
10 10.00 -1.166059650500 -1.233157709949 0.067098059449 519.0938 0.0713 4000000
20 20.00 -1.166048438416 -1.208183014318 0.042134575902 325.9677 0.0762 4000000
30 30.00 -1.166037590737 -1.186586197151 0.020548606414 158.9711 0.0772 4000000
40 40.00 -1.166042093134 -1.183625399859 0.017583306724 136.0305 0.0772 4000000
50 50.00 -1.166051684893 -1.193713710258 0.027662025365 214.0030 0.0772 4000000
60 60.00 -1.166054646931 -1.202662201513 0.036607554582 283.2087 0.0773 4000000
70 70.00 -1.166052143011 -1.204911990844 0.038859847833 300.6332 0.0772 4000000
80 80.00 -1.166048803912 -1.203635015020 0.037586211108 290.7799 0.0772 4000000
90 90.00 -1.166048006780 -1.203820491599 0.037772484818 292.2210 0.0775 4000000
100 100.00 -1.166049793504 -1.206862845060 0.040813051556 315.7439 0.0773 4000000
Mon Mar 25 18:02:08 2024: Ending simulation
Simulation Validation:
Initial energy : -1.166063303481
Final energy : -1.166049793504
eFinal/eInitial : 0.999988
Final atom count : 4000000, no atoms lost
Timings for Rank 0
Timer # Calls Avg/Call (s) Total (s) % Loop
___________________________________________________________________
total 1 15.6336 15.6336 102.12
loop 1 15.3094 15.3094 100.00
timestep 10 1.5309 15.3090 100.00
position 100 0.0053 0.5273 3.44
velocity 200 0.0047 0.9472 6.19
redistribute 101 0.0493 4.9795 32.53
atomHalo 101 0.0179 1.8070 11.80
force 101 0.0884 8.9275 58.31
commHalo 303 0.0024 0.7260 4.74
commReduce 39 0.0003 0.0117 0.08
Timing Statistics Across 2 Ranks:
Timer Rank: Min(s) Rank: Max(s) Avg(s) Stdev(s)
_____________________________________________________________________________
total 1: 15.6331 0: 15.6336 15.6333 0.0003
loop 0: 15.3094 1: 15.3094 15.3094 0.0000
timestep 0: 15.3090 1: 15.3091 15.3090 0.0001
position 0: 0.5273 1: 0.5447 0.5360 0.0087
velocity 0: 0.9472 1: 0.9652 0.9562 0.0090
redistribute 1: 4.9030 0: 4.9795 4.9413 0.0383
atomHalo 1: 1.7016 0: 1.8070 1.7543 0.0527
force 0: 8.9275 1: 8.9794 8.9534 0.0259
commHalo 1: 0.4911 0: 0.7260 0.6086 0.1175
commReduce 1: 0.0047 0: 0.0117 0.0082 0.0035
---------------------------------------------------
Average atom update rate: 0.08 us/atom/task
---------------------------------------------------
---------------------------------------------------
Average all atom update rate: 0.04 us/atom
---------------------------------------------------
---------------------------------------------------
Average atom rate: 26.13 atoms/us
---------------------------------------------------
Mon Mar 25 18:02:08 2024: CoMD Ending
* Info: Process finished (host idp10.benchmarkcenter.megware.com, process 10310)
* Info: Process finished (host idp10.benchmarkcenter.megware.com, process 10305)
Your experiment path is /home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_6
To display your profiling results:
#############################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#############################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_6 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_6 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_6 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_6 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_6 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_6 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_6 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/qaas_runs_CPU_9468/CoMD_HBM/intel/CoMD/run/oneview_runs/compilers/icx_9/oneview_results_scal/tools/lprof_npsu_run_6 #
#############################################################################################################################################################################################