Recommendations:
EU Array Stalled/Idle: 34.4% of Elapsed time with GPU busy. GPU metrics detect some kernel issues. Use GPU Compute/Media Hotspots (preview) to understand how well your application runs on the specified hardware.

| Function | Module | CPU Time |
|---|---|---|
| [Outside any known module] | [Unknown] | 5.652s |
| [Skipped stack frame(s)] | [Unknown] | 0.878s |
| std::__malloc_alloc::allocate | libstlport-dynamic.so | 0.558s |
| memcmp | libc-dynamic.so | 0.411s |
| std::string::_M_append | libtpsstool.so | 0.264s |
| [Others] | N/A | 2.674s |

| Host Task | Task Time | % of Elapsed Time(%) | Task Count |
|---|---|---|---|
| clWaitForEvents | 58.820s | 82.5% | 14 |
| tbb_parallel_for | 4.454s | 6.2% | 43 |
| clCreateContext | 3.171s | 4.4% | 3 |
| clEnqueueNDRangeKernel | 0.298s | 0.4% | 6 |
| tbb_custom | 0.297s | 0.4% | 4 |
| [Others] | 0.081s | 0.1% | 27 |

| Computing Task | Total Time | Execution | % of Total Time(%) | SIMD Width | Peak Occupancy(%) | EU Threads Occupancy(%) | SIMD Utilization(%) |
|---|---|---|---|---|---|---|---|
| dppyPy_dppy_py_devfn__5F__5F_main_5F__5F__2E_pairwise_5F_python_24_1_2E_USM_3A_ndarray_28_float64_2C__20_2d_2C__20_C_29__2E_USM_3A_ndarray_28_float64_2C__20_2d_2C__20_C_29__2E_USM_3A_ndarray_28_float64_2C__20_2d_2C__20_C_29_ | 58.732s | 58.722s | 100.0% | 8 | 100.0% | 93.3% | 100.0% |
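
The computing task name in the last table is hard to read because punctuation appears to be escaped as `_XX_`, where `XX` is an uppercase hexadecimal ASCII code (for example `_2E_` for `.`). This scheme is inferred from the name itself, not taken from any documentation, so the following is only a minimal sketch under that assumption. The helper `decode_kernel_name` is a hypothetical name, and the input is shortened to the first argument type.

```python
import re

# Assumption: each escaped character is written as "_XX_", where XX is a
# two-digit uppercase hexadecimal ASCII code. Inferred from the report only.
_ESCAPE = re.compile(r"_([0-9A-F]{2})_")


def decode_kernel_name(mangled: str) -> str:
    """Replace every "_XX_" escape with the character it encodes."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), mangled)


# Kernel name from the report, truncated after the first argument type.
mangled = (
    "dppyPy_dppy_py_devfn__5F__5F_main_5F__5F__2E_pairwise_5F_python_24_1"
    "_2E_USM_3A_ndarray_28_float64_2C__20_2d_2C__20_C_29_"
)

print(decode_kernel_name(mangled))
```

Under this assumption the name reads as `dppyPy_dppy_py_devfn___main__.pairwise_python$1.USM:ndarray(float64, 2d, C)`, which suggests the kernel is a `pairwise_python` function defined in `__main__` and taking 2-D, C-contiguous `float64` USM arrays.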