Help is available by hovering the cursor over any symbol or by checking the MAQAO website.

Global Metrics

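| Metric | r0 | r1 | r2 | r3 |
|---|---|---|---|---|
| Total Time (s) | 1.49 E3 | 821.58 | 806.64 | 476.71 |
| Profiled Time (s) | 1.49 E3 | 818.03 | 803.54 | 473.06 |
| Time in analyzed loops (%) | 6.98 | 6.50 | 6.65 | 5.97 |
| Time in analyzed innermost loops (%) | 5.55 | 5.21 | 5.34 | 4.84 |
| Time in user code (%) | 93.8 | 86.3 | 87.2 | 74.9 |
| Compilation Options Score (%) | 100 | 100 | 100 | 100 |
| Array Access Efficiency (%) | 58.4 | 58.3 | 58.0 | 58.1 |
| Scalability - Gap | 1.00 | 1.10 | 1.08 | 1.28 |

Potential Speedups

| Metric | r0 | r1 | r2 | r3 |
|---|---|---|---|---|
| Perfect Flow Complexity | 1.02 | 1.02 | 1.02 | 1.02 |
| Perfect OpenMP + MPI + Pthread | 1.03 | 1.07 | 1.06 | 1.11 |
| Perfect OpenMP + MPI + Pthread + Perfect Load Distribution | 1.07 | 1.16 | 1.15 | 1.34 |
| No Scalar Integer: Potential Speedup | 1.01 | 1.01 | 1.01 | 1.01 |
| No Scalar Integer: Nb Loops to get 80% | 8 | 8 | 8 | 8 |
| FP Vectorised: Potential Speedup | 1.00 | 1.01 | 1.01 | 1.01 |
| FP Vectorised: Nb Loops to get 80% | 3 | 3 | 3 | 3 |
| Fully Vectorised: Potential Speedup | 1.07 | 1.06 | 1.07 | 1.06 |
| Fully Vectorised: Nb Loops to get 80% | 22 | 22 | 21 | 22 |
| Only FP Arithmetic: Potential Speedup | 1.04 | 1.03 | 1.03 | 1.03 |
| Only FP Arithmetic: Nb Loops to get 80% | 18 | 19 | 18 | 19 |
| OpenMP perfectly balanced: Potential Speedup | 1.03 | 1.09 | 1.08 | 1.18 |
| OpenMP perfectly balanced: Nb Loops to get 80% | 3 | 3 | 2 | 3 |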
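
The Total Time and Scalability - Gap rows can be related with a short calculation. The sketch below is an illustration only: it assumes that the ideal speedup of a run is its thread count divided by that of r0 (thread counts taken from the "Number of threads observed" row of the Experiment Summaries), and that the gap is the ideal speedup divided by the measured one. That definition is an assumption, not one stated in this report, and the function name `scalability` is hypothetical.

```python
# Values copied from the Global Metrics and Experiment Summaries tables.
# "1.49 E3" is a rounded figure, so every derived number is approximate.
total_time_s = {"r0": 1.49e3, "r1": 821.58, "r2": 806.64, "r3": 476.71}
threads = {"r0": 13, "r1": 26, "r2": 26, "r3": 52}


def scalability(total_time_s, threads, base="r0"):
    """Return {run: (speedup, efficiency, gap)} relative to the base run.

    Assumption (not taken from the report): ideal speedup is the ratio of
    thread counts, and gap = ideal speedup / measured speedup.
    """
    results = {}
    for run, t in total_time_s.items():
        speedup = total_time_s[base] / t          # measured speedup vs. base
        ideal = threads[run] / threads[base]      # assumed ideal speedup
        efficiency = speedup / ideal              # fraction of ideal achieved
        results[run] = (speedup, efficiency, ideal / speedup)
    return results


if __name__ == "__main__":
    for run, (s, e, g) in scalability(total_time_s, threads).items():
        print(f"{run}: speedup={s:.2f} efficiency={e:.2f} gap={g:.2f}")
```

Under these assumptions the computed gaps come out at roughly 1.10, 1.08 and 1.28 for r1, r2 and r3, which agrees with the Scalability - Gap row to two decimals.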

[Figure: Scalability Speedup]
[Figure: Cumulated Speedup If No Scalar Integer]
[Figure: Cumulated Speedup If FP Vectorized]
[Figure: Cumulated Speedup If Fully Vectorized]
[Figure: Cumulated Speedup If Only FP Arithmetic]
[Figure: Cumulated Speedup If OpenMP Perfectly Balanced]

[Figure: Loop Based Profiles (Innermost / Single Loops, Inbetween Loops, Outermost Loops, Cumulated Coverage With All Loops)]
[Figure: Innermost Loop Based Profiles (Coverage, Count)]
[Figure: Application Categorization (Time, Coverage)]

Compilation Options

Source Object | Issue
picongpu
TaskKernel.hpp
EventStream.cpp
TaskCopyDeviceToHost.hpp
mutex
TaskParticlesReceive.hpp
eventSystem.cpp
EventNotify.cpp
Mask.hpp
CommunicatorMPI.cpp
TaskReceiveParticlesExchange.hpp
ITask.hpp
TaskFieldReceiveAndInsertExchange.hpp
Apply.hpp
TaskCopyDeviceToDevice.hpp
Simulation.hpp
ParticlesBase.tpp
EventPool.hpp
StrideMapping.hpp
CopyGuardToExchange.hpp
kernel.hpp
EventTask.cpp
CurrentDeposition.hpp
EventGenericThreads.hpp
Deposit.hpp
shared_ptr_base.h
Manager.cpp
TaskParticlesSend.hpp
TaskSend.hpp
TaskReceive.hpp
TaskGetCurrentSizeFromDevice.hpp
Transaction.cpp
Stream.hpp
TaskSendMPI.hpp
FieldJ.kernel
stl_tree.h
Particles.kernel
Event.hpp
event.cpp
Device.hpp
StreamTask.cpp
TaskLogicalAnd.hpp
TaskReceiveMPI.hpp
parsing.hpp
std_function.h
AddExchangeToBorder.hpp
Buffer.hpp
ParticlePush.hpp
TaskKernelCpuOmp2Blocks.hpp
FDTDBase.hpp
ParticlesBase.hpp
vector.tcc
CudaEvent.cpp
stl_map.h
TaskFieldSendExchange.hpp
CudaEventHandle.cpp
Factory.tpp
TaskCopyHostToDevice.hpp
Traits.hpp
DeviceBuffer.hpp
TransactionManager.cpp
GridBuffer.hpp
TaskSendParticlesExchange.hpp
TaskFieldReceiveAndInsert.hpp
invoke.h
DataConnector.cpp
memory.cpp
HostBuffer.hpp
EventDataReceive.hpp
future

[Figure: Path Count Profiles (Coverage, Count)]
[Figure: Low Iteration Count Profiles (Coverage, Count)]

Experiment Summaries

| Field | r0 | r1 | r2 | r3 |
|---|---|---|---|---|
| Experiment Name | picongpu, khi_test, 128x128x32, scalability OMP 1x13,2x13,1x26,2x26 | picongpu, khi_test, 128x128x32, scalability OMP 1x13,2x13,1x26,2x26 | picongpu, khi_test, 128x128x32, scalability OMP 1x13,2x13,1x26,2x26 | picongpu, khi_test, 128x128x32, scalability OMP 1x13,2x13,1x26,2x26 |
| Application | ./bin/picongpu | same as r0 | same as r0 | same as r0 |
| Timestamp | | | | |
| Experiment Type | OpenMP; | same as r0 | same as r0 | same as r0 |
| Machine | skylake | same as r0 | same as r0 | same as r0 |
| Architecture | x86_64 | same as r0 | same as r0 | same as r0 |
| Micro Architecture | SKYLAKE | same as r0 | same as r0 | same as r0 |
| Model Name | Intel(R) Xeon(R) Platinum 8170 CPU @ 2.10GHz | same as r0 | same as r0 | same as r0 |
| Cache Size | 36608 KB | same as r0 | same as r0 | same as r0 |
| Number of Cores | 26 | same as r0 | same as r0 | same as r0 |
| Maximal Frequency | 2.1 GHz | same as r0 | same as r0 | same as r0 |
| OS Version | Linux 6.5.7-arch1-1 #1 SMP PREEMPT_DYNAMIC Tue, 10 Oct 2023 21:10:21 +0000 | same as r0 | same as r0 | same as r0 |
| Architecture used during static analysis | x86_64 | same as r0 | same as r0 | same as r0 |
| Micro Architecture used during static analysis | SKYLAKE | same as r0 | same as r0 | same as r0 |
| Compilation Options | picongpu: clang based Intel(R) oneAPI DPC++/C++ Compiler 2023.0.0 (2023.0.0.20221201) --driver-mode=g++ --intel -I /home/eoseret/apps/buildPIC/khi_test_omp2b_icpx/include -I /home/eoseret/apps/picongpu/include/picongpu/.. -I /home/eoseret/apps/picongpu/thirdParty/cupla/include -I /home/eoseret/apps/picongpu/include/pmacc/.. -D ALPAKA_ACC_CPU_B_OMP2_T_SEQ_ENABLED -D ALPAKA_BLOCK_SHARED_DYN_MEMBER_ALLOC_KIB=47 -D ALPAKA_DEBUG=0 -D ALPAKA_OFFLOAD_MAX_BLOCK_SIZE= -D BOOST_ATOMIC_DYN_LINK -D BOOST_ATOMIC_NO_LIB -D BOOST_NO_AUTO_PTR -D BOOST_PROGRAM_OPTIONS_DYN_LINK -D BOOST_PROGRAM_OPTIONS_NO_LIB -D CMAKE_BUILD_TYPE=Release -D CMAKE_CXX_COMPILER_ID=IntelLLVM -D CMAKE_CXX_COMPILER_VERSION=2023.0.0 -D CMAKE_SYSTEM=Linux-6.5.7-arch1-1 -D CMAKE_SYSTEM_PROCESSOR=x86_64 -D CMAKE_VERSION=3.27.7 -D CUPLA_STREAM_ASYNC_ENABLED=0 -D PIC_VERBOSE_LVL=1 -D PMACC_VERBOSE_LVL=0 -isystem /opt/intel/oneapi/mpi/2021.8.0/include -isystem /home/eoseret/apps/picongpu/thirdParty/cupla/alpaka/include -g -fno-omit-frame-pointer -x Host -fiopenmp -O3 -D NDEBUG -std=c++17 -fiopenmp -MD -MT CMakeFiles/picongpu.dir/main.cpp.o -MF CMakeFiles/picongpu.dir/main.cpp.o.d -o CMakeFiles/picongpu.dir/main.cpp.o -c /home/eoseret/apps/picongpu/include/picongpu/main.cpp -fveclib=SVML -fheinous-gnu-extensions | same as r0 | same as r0 | same as r0 |
| Number of processes observed | 1 | same as r0 | same as r0 | same as r0 |
| Number of threads observed | 13 | 26 | same as r1 | 52 |
| Frequency Driver | intel_cpufreq | same as r0 | same as r0 | same as r0 |
| Frequency Governor | performance | same as r0 | same as r0 | same as r0 |
| Huge Pages | always | same as r0 | same as r0 | same as r0 |
| Hyperthreading | off | same as r0 | same as r0 | same as r0 |
| Number of sockets | 2 | same as r0 | same as r0 | same as r0 |
| Number of cores per socket | 26 | same as r0 | same as r0 | same as r0 |
| MAQAO version | 2.18.0 | same as r0 | same as r0 | same as r0 |
| MAQAO build | b42f6f97013f6f62087f6ca443544250f50b2841::20231130-145227 | same as r0 | same as r0 | same as r0 |
| Comments | | same as r0 | same as r0 | same as r0 |