Intel® VTune™ Profiler 2022.0.0
- EU Array Stalled/Idle:
4.5% of Elapsed time with GPU busy
- GPU L3 Bandwidth Bound:
36.8% of peak value
- Hottest GPU Computing Tasks Bound by GPU L3 Bandwidth:
- Sampler Busy:
0.0% of peak value
- Hottest GPU Computing Tasks with High Sampler Usage:
- FPU Utilization:
46.7% of Elapsed time with GPU busy
- Hottest GPU Computing Tasks with High FPU Utilization:
| Computing Task | Total Time |
|---|---|
| sgemm_kernel_16_32 | 3.374s |
| sgemm_kernel_16_32 | 0.845s |
| sgemm_kernel_16_32 | 0.845s |
| sgemm_kernel_16_32 | 0.212s |
- Collection and Platform Info:
- Application Command Line:
/home/u49991/DPCPP_Performance_Portability/lab/mm_dpcpp_mkl "-n" "10240" "-m" "16"
- Operating System:
5.4.0-80-generic DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.04
DISTRIB_CODENAME=focal
DISTRIB_DESCRIPTION="Ubuntu 20.04.3 LTS"
- Collection start time:
02:29:23 13/01/2022 UTC
- Collection stop time:
02:29:46 13/01/2022 UTC
- Collector Type:
Event-based sampling driver, User-mode sampling and tracing
- CPU:
- Name:
Intel(R) microarchitecture code named Coffeelake
- GPU:
- Vendor:
Intel Corporation
- Max Core Frequency:
1.200 GHz
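
The figures above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the VTune output: it assumes the table rows and the two timestamps are copied in by hand (with the trailing `UTC` dropped), and the names `TASK_TIMES_S`, `total_task_time`, and `collection_window_seconds` are hypothetical helpers introduced only for illustration.

```python
from datetime import datetime

# Rows copied by hand from the "High FPU Utilization" table in the report.
TASK_TIMES_S = [
    ("sgemm_kernel_16_32", 3.374),
    ("sgemm_kernel_16_32", 0.845),
    ("sgemm_kernel_16_32", 0.845),
    ("sgemm_kernel_16_32", 0.212),
]


def total_task_time(rows):
    """Sum the 'Total Time' column (seconds) over all listed tasks."""
    return sum(seconds for _, seconds in rows)


def collection_window_seconds(start, stop, fmt="%H:%M:%S %d/%m/%Y"):
    """Length of the collection window, assuming day/month/year dates."""
    delta = datetime.strptime(stop, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds()


total = total_task_time(TASK_TIMES_S)
window = collection_window_seconds("02:29:23 13/01/2022", "02:29:46 13/01/2022")

print(f"total listed task time: {total:.3f} s")
print(f"collection window:      {window:.0f} s")
```

The listed tasks add up to a fraction of the collection window, which spans everything between the start and stop timestamps rather than GPU kernel time alone.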