* Info: Detected 8 Lprof instances in gmz10.benchmarkcenter.megware.com.
If this is incorrect, rerun with number-processes-per-node=X
[0] MPI startup(): Intel(R) MPI Library, Version 2021.14 Build 20240911 (id: b3fc682)
[0] MPI startup(): Copyright (C) 2003-2024 Intel Corporation. All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): Load tuning file: "/cluster/intel/oneapi/2025.0.0/mpi/2021.14/opt/mpi/etc/tuning_generic_shm.dat"
[0] MPI startup(): ===== CPU pinning =====
[0] MPI startup(): Rank Pid Node name Pin cpu
[0] MPI startup(): 0 84944 gmz10.benchmarkcenter.megware.com {0}
[0] MPI startup(): 1 84963 gmz10.benchmarkcenter.megware.com {32}
[0] MPI startup(): 2 84960 gmz10.benchmarkcenter.megware.com {64}
[0] MPI startup(): 3 84958 gmz10.benchmarkcenter.megware.com {96}
[0] MPI startup(): 4 84952 gmz10.benchmarkcenter.megware.com {128}
[0] MPI startup(): 5 84970 gmz10.benchmarkcenter.megware.com {160}
[0] MPI startup(): 6 84950 gmz10.benchmarkcenter.megware.com {192}
[0] MPI startup(): 7 84951 gmz10.benchmarkcenter.megware.com {224}
Mon Feb 24 23:51:24 2025: Starting Initialization
Mini-Application Name : CoMD-openmp-mpi
Mini-Application Version : 1.1
Platform:
hostname: gmz10.benchmarkcenter.megware.com
kernel name: 'Linux'
kernel release: '5.14.0-503.19.1.el9_5.x86_64'
processor: 'x86_64'
Build:
CC: '/cluster/intel/oneapi/2025.0.0/mpi/2021.14/bin/mpiicc'
compiler version: 'unknown'
CFLAGS: '-O3 -march=native -DDO_MPI -O2 -march=znver5 -fno-tree-vectorize -fno-openmp-simd -funroll-loops -ffast-math -g -fno-omit-frame-pointer -fcf-protection=none -no-pie -grecord-gcc-switches -cc=gcc -fopenmp -funroll-loops'
LDFLAGS: ' '
using MPI: true
Threading: OpenMP (32 threads)
Double Precision: true
Run Date/Time: 2025-02-24, 23:51:24
Command Line Parameters:
doeam: 0
potDir: pots
potName: Cu_u6.eam
potType: funcfl
nx: 200
ny: 200
nz: 200
xproc: 2
yproc: 2
zproc: 2
Lattice constant: -1 Angstroms
nSteps: 100
printRate: 10
Time step: 1 fs
Initial Temperature: 600 K
Initial Delta: 0 Angstroms
Simulation data:
Total atoms : 32000000
Min global bounds : [ 0.0000000000, 0.0000000000, 0.0000000000 ]
Max global bounds : [ 723.0000000000, 723.0000000000, 723.0000000000 ]
Decomposition data:
Processors : 2, 2, 2
Local boxes : 62, 62, 62 = 238328
Box size : [ 5.8306451613, 5.8306451613, 5.8306451613 ]
Box factor : [ 1.0074548875, 1.0074548875, 1.0074548875 ]
Max Link Cell Occupancy: 32 of 64
Potential data:
Potential type : Lennard-Jones
Species name : Cu
Atomic number : 29
Mass : 63.55 amu
Lattice Type : FCC
Lattice spacing : 3.615 Angstroms
Cutoff : 5.7875 Angstroms
Epsilon : 0.167 eV
Sigma : 2.315 Angstroms
Initial energy : -1.166063303487, atom count : 32000000
Mon Feb 24 23:51:24 2025: Initialization Finished
Mon Feb 24 23:51:24 2025: Starting simulation
#                                                                                        Performance
#  Loop   Time(fs)       Total Energy   Potential Energy     Kinetic Energy  Temperature   (us/atom)      # Atoms
      0       0.00    -1.166063303487    -1.243619295087     0.077555991600     600.0000      0.0000     32000000
     10      10.00    -1.166059648980    -1.233154817498     0.067095168517     519.0715      0.0538     32000000
     20      20.00    -1.166048431576    -1.208173842947     0.042125411370     325.8968      0.0698     32000000
     30      30.00    -1.166037581951    -1.186576153828     0.020538571877     158.8935      0.0725     32000000
     40      40.00    -1.166042092491    -1.183622817462     0.017580724971     136.0106      0.0732     32000000
     50      50.00    -1.166051684603    -1.193715522562     0.027663837959     214.0170      0.0736     32000000
     60      60.00    -1.166054640401    -1.202662241274     0.036607600874     283.2091      0.0738     32000000
     70      70.00    -1.166052133313    -1.204912537669     0.038860404356     300.6375      0.0736     32000000
     80      80.00    -1.166048797816    -1.203644675872     0.037595878056     290.8547      0.0733     32000000
     90      90.00    -1.166048009496    -1.203841392163     0.037793382667     292.3827      0.0731     32000000
    100     100.00    -1.166049798760    -1.206885628636     0.040835829876     315.9201      0.0729     32000000
Mon Feb 24 23:51:53 2025: Ending simulation
Simulation Validation:
Initial energy : -1.166063303487
Final energy : -1.166049798760
eFinal/eInitial : 0.999988
Final atom count : 32000000, no atoms lost
Timings for Rank 0
Timer              # Calls  Avg/Call (s)     Total (s)      % Loop
___________________________________________________________________
total                    1       28.8420       28.8420      101.59
loop                     1       28.3913       28.3913      100.00
timestep                10        2.8390       28.3898       99.99
position               100        0.0045        0.4474        1.58
velocity               200        0.0034        0.6735        2.37
redistribute           101        0.0661        6.6734       23.51
atomHalo               101        0.0234        2.3651        8.33
force                  101        0.2046       20.6683       72.80
commHalo               303        0.0034        1.0283        3.62
commReduce              39        0.0022        0.0850        0.30
Timing Statistics Across 8 Ranks:
Timer            Rank:    Min(s)   Rank:    Max(s)    Avg(s)  Stdev(s)
_____________________________________________________________________________
total               0:   28.8420      1:   28.8423   28.8422    0.0001
loop                0:   28.3913      1:   28.3914   28.3914    0.0000
timestep            0:   28.3898      6:   28.3900   28.3900    0.0001
position            1:    0.4452      6:    0.4510    0.4478    0.0018
velocity            5:    0.6518      0:    0.6735    0.6660    0.0074
redistribute        5:    6.0478      0:    6.6734    6.4645    0.2329
atomHalo            5:    1.7440      0:    2.3651    2.1626    0.2262
force               0:   20.6683      5:   21.3898   20.9084    0.2633
commHalo            5:    0.3835      0:    1.0283    0.8282    0.2225
commReduce          5:    0.0081      7:    0.0892    0.0625    0.0250
---------------------------------------------------
Average atom update rate: 0.07 us/atom/task
---------------------------------------------------
---------------------------------------------------
Average all atom update rate: 0.01 us/atom
---------------------------------------------------
---------------------------------------------------
Average atom rate: 112.72 atoms/us
---------------------------------------------------
Mon Feb 24 23:51:53 2025: CoMD Ending
Your experiment path is /home/eoseret/qaas_runs_ZEN5/174-043-3878/intel/CoMD/run/oneview_runs/compilers/gcc_10/oneview_results_1740437480/tools/lprof_npsu_run_0
To display your profiling results:
####################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
####################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/qaas_runs_ZEN5/174-043-3878/intel/CoMD/run/oneview_runs/compilers/gcc_10/oneview_results_1740437480/tools/lprof_npsu_run_0 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/qaas_runs_ZEN5/174-043-3878/intel/CoMD/run/oneview_runs/compilers/gcc_10/oneview_results_1740437480/tools/lprof_npsu_run_0 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/qaas_runs_ZEN5/174-043-3878/intel/CoMD/run/oneview_runs/compilers/gcc_10/oneview_results_1740437480/tools/lprof_npsu_run_0 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/qaas_runs_ZEN5/174-043-3878/intel/CoMD/run/oneview_runs/compilers/gcc_10/oneview_results_1740437480/tools/lprof_npsu_run_0 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/qaas_runs_ZEN5/174-043-3878/intel/CoMD/run/oneview_runs/compilers/gcc_10/oneview_results_1740437480/tools/lprof_npsu_run_0 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/qaas_runs_ZEN5/174-043-3878/intel/CoMD/run/oneview_runs/compilers/gcc_10/oneview_results_1740437480/tools/lprof_npsu_run_0 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/qaas_runs_ZEN5/174-043-3878/intel/CoMD/run/oneview_runs/compilers/gcc_10/oneview_results_1740437480/tools/lprof_npsu_run_0 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/qaas_runs_ZEN5/174-043-3878/intel/CoMD/run/oneview_runs/compilers/gcc_10/oneview_results_1740437480/tools/lprof_npsu_run_0 #
####################################################################################################################################################################################################