Recommendations:
GPU Time, % of Elapsed time: 23.2%
GPU utilization is low. Switch to the [...] for in-depth analysis of host activity. Poor GPU utilization can prevent the application from offloading effectively.

EU Array Stalled/Idle: 22.2% of Elapsed time with GPU busy
GPU metrics detect some kernel issues. Use GPU Compute/Media Hotspots (preview) to understand how well your application runs on the specified hardware.

GPU utilization is low. Consider offloading more work to the GPU to increase overall application performance.

| Function | Module | CPU Time |
|---|---|---|
| [Outside any known module] | [Unknown] | 3.742s |
| Intel::OpenCL::Utils::AtomicCounter::operator long | libcpu_device_emu.so.2021.13.11.0 | 1.060s |
| [Skipped stack frame(s)] | [Unknown] | 0.665s |
| Intel::OpenCL::CPUDevice::AffinitizeThreads::ExecuteIteration | libcpu_device_emu.so.2021.13.11.0 | 0.651s |
| func@0x18b8ac | libc-2.31.so | 0.520s |
| [Others] | N/A | 5.102s |

| Host Task | Task Time | % of Elapsed Time | Task Count |
|---|---|---|---|
| tbb_parallel_for | 4.023s | 23.7% | 43 |
| clWaitForEvents | 3.864s | 22.8% | 9 |
| clCreateContext | 2.139s | 12.6% | 3 |
| clEnqueueMemcpyINTEL | 1.514s | 8.9% | 5 |
| tbb_custom | 0.238s | 1.4% | 5 |
| [Others] | 0.084s | 0.5% | 21 |
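
The elapsed time itself is not listed in this section, but each row of the host task table implies it: task time divided by its share of elapsed time. The sketch below is only a consistency check on the reported numbers. The helper name `implied_elapsed` is hypothetical, and the inputs are the rounded values from the table, so the result is approximate.

```python
# Rows copied from the host task table: (task time in seconds, % of elapsed time).
host_tasks = {
    "tbb_parallel_for": (4.023, 23.7),
    "clWaitForEvents": (3.864, 22.8),
    "clCreateContext": (2.139, 12.6),
    "clEnqueueMemcpyINTEL": (1.514, 8.9),
    "tbb_custom": (0.238, 1.4),
}


def implied_elapsed(task_time_s, percent_of_elapsed):
    """Elapsed time implied by one row (hypothetical helper, not a VTune API)."""
    return task_time_s / (percent_of_elapsed / 100.0)


# One estimate per row; they should agree up to rounding in the report.
estimates = {name: implied_elapsed(t, p) for name, (t, p) in host_tasks.items()}

for name, seconds in estimates.items():
    print(f"{name:22s} implies about {seconds:.1f} s elapsed")
```

All five rows point to an elapsed time of roughly 17 s, which suggests the percentages in the table are mutually consistent.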

| Computing Task | Total Time | Execution | % of Total Time | SIMD Width | Peak Occupancy (%) | EU Threads Occupancy (%) | SIMD Utilization (%) |
|---|---|---|---|---|---|---|---|
| dppyPy_dppy_py_devfn__5F__5F_main_5F__5F__2E_black_5F_scholes_24_1_2E_int64_2E_USM_3A_ndarray_28_float64_2C__20_1d_2C__20_C_29__2E_USM_3A_ndarray_28_float64_2C__20_1d_2C__20_C_29__2E_USM_3A_ndarray_28_float64_2C__20_1d_2C__20_C_29__2E_float64_2E_float64_2E_USM_3A_ndarray_28_float64_2C__20_1d_2C__20_C_29__2E_USM_3A_ndarray_28_float64_2C__20_1d_2C__20_C_29_ | 5.366s | 2.790s | 52.0% | 8 | 100.0% | 89.2% | 100.0% |
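
The computing task name is hard to read because punctuation appears to be encoded as `_XX_` escapes, where `XX` is an uppercase hexadecimal character code (for example `_5F_` for `_` and `_2E_` for `.`). That scheme is an assumption inferred from the name itself, not a documented format, and the `demangle` helper below is a hypothetical sketch for making such names legible.

```python
import re


def demangle(name: str) -> str:
    """Replace each _XX_ escape (two uppercase hex digits) with its character.

    Assumes the escape scheme inferred from the kernel name in the table;
    this is not an official demangler.
    """
    return re.sub(r"_([0-9A-F]{2})_", lambda m: chr(int(m.group(1), 16)), name)


# Leading part of the kernel name, copied from the table.
mangled_prefix = (
    "dppyPy_dppy_py_devfn__5F__5F_main_5F__5F__2E_black_5F_scholes_24_1"
    "_2E_int64_2E_USM_3A_ndarray_28_float64_2C__20_1d_2C__20_C_29_"
)

print(demangle(mangled_prefix))
```

Under this assumption, the kernel corresponds to a `black_scholes` function in `__main__` that takes an `int64`, two `float64` scalars, and several one-dimensional `float64` USM arrays.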