Intel® VTune Profiler 2022.2.0

Recommendations:

GPU Time, % of Elapsed time: 10.3%
GPU utilization is low. Switch to the for in-depth analysis of host activity. Poor GPU utilization can prevent the application from offloading effectively.
EU Array Stalled/Idle: 67.7% of Elapsed time with GPU busy
GPU metrics detect some kernel issues. Use GPU Compute/Media Hotspots (preview) to understand how well your application runs on the specified hardware.
Execution % of Total Time: 45.3%
Execution time on the device is less than memory transfer time. Make sure your offload schema is optimal. Use Intel Advisor tool to get an insight into possible causes for inefficient offload.
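
As a rough illustration of the last recommendation, the sketch below assumes that "Total Time" is the sum of device execution time and memory transfer time only. That is an assumption, since the report does not define the denominator. Under it, an execution share of 45.3% would leave 54.7% for transfers, which is consistent with the message that execution takes less time than memory transfer. The function names are hypothetical and are not part of VTune.

```python
def transfer_share(execution_pct: float) -> float:
    """Percentage left for memory transfer.

    ASSUMPTION: total time = device execution time + memory transfer time.
    The profiler report itself does not state this definition.
    """
    if not 0.0 <= execution_pct <= 100.0:
        raise ValueError("percentage must be within [0, 100]")
    return 100.0 - execution_pct


def is_transfer_bound(execution_pct: float) -> bool:
    """True when device execution takes a smaller share than memory transfer."""
    return execution_pct < transfer_share(execution_pct)


# Value taken from the "Execution % of Total Time" line of the report.
execution_pct = 45.3
print(f"transfer share: {transfer_share(execution_pct):.1f}%")
print("transfer-bound:", is_transfer_bound(execution_pct))
```

If the assumption holds, any execution share below 50% would trigger the same recommendation, so reducing the volume or frequency of host-device transfers is the first thing to check.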