The XVEs were stalled or idle for a large share of the run (63.9% of the time on GPU 1, the only adapter that did any work), which hurts compute-bound applications.

| GPU Stack | GPU Adapter | XVE Array Active(%) | XVE Array Stalled(%) | XVE Array Idle(%) |
|---|---|---|---|---|
| 0 | GPU 1 | 36.1% | 39.5% | 24.4% |
| 0 | GPU 3 | 0.0% | 0.0% | 100.0% |
| 0 | GPU 0 | 0.0% | 0.0% | 100.0% |
| 0 | GPU 2 | 0.0% | 0.0% | 100.0% |

Several factors, including shared local memory usage, memory barriers, and inefficient work scheduling, can lower the occupancy metric.

| Computing Task | Total Time | Occupancy(%) | SIMD Utilization(%) |
|---|---|---|---|
| `iso3dfd(sycl::_V1::queue&, float*, float*, float*, float*, unsigned long, unsigned long, unsigned long, unsigned long)::{lambda(sycl::_V1::handler&)#1}::operator()(sycl::_V1::handler&) const::{lambda(sycl::_V1::id<(int)3>)#1}` | 14.744s | 23.7% of peak value | 0.0% |
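
To illustrate how one of the factors named above, shared local memory (SLM), can cap occupancy, here is a minimal back-of-the-envelope sketch. It is a simplified model and not how the profiler computes its occupancy metric. The names `KernelConfig`, `DeviceLimits`, and `estimate_occupancy`, and every hardware limit passed in, are hypothetical and chosen only for illustration; real limits should come from the device documentation or a device query.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical description of a kernel launch (illustrative names only).
struct KernelConfig {
    std::uint32_t work_group_size;      // work-items per work-group
    std::uint32_t simd_width;           // sub-group (SIMD) width of the kernel
    std::uint32_t slm_bytes_per_group;  // SLM requested by each work-group
};

// Hypothetical per-Xe-core limits; the caller supplies assumed values.
struct DeviceLimits {
    std::uint32_t max_threads;  // hardware threads available per Xe-core
    std::uint32_t slm_bytes;    // SLM capacity per Xe-core
};

// Estimates the fraction of hardware threads that can be resident at once.
// Simplified: only thread count and SLM capacity are modeled; barriers,
// register pressure, and scheduling effects are ignored.
double estimate_occupancy(const KernelConfig& k, const DeviceLimits& d) {
    assert(k.simd_width > 0 && k.work_group_size > 0 && d.max_threads > 0);

    // Each hardware thread executes simd_width work-items.
    const std::uint32_t threads_per_group =
        (k.work_group_size + k.simd_width - 1) / k.simd_width;

    // How many work-groups fit, limited by thread slots and by SLM.
    const std::uint32_t groups_by_threads = d.max_threads / threads_per_group;
    const std::uint32_t groups_by_slm =
        k.slm_bytes_per_group == 0 ? groups_by_threads
                                   : d.slm_bytes / k.slm_bytes_per_group;

    const std::uint32_t resident_groups =
        std::min(groups_by_threads, groups_by_slm);
    return static_cast<double>(resident_groups) * threads_per_group /
           d.max_threads;
}
```

Under the assumed limits of 64 hardware threads and 128 KiB of SLM per Xe-core, a 256-work-item group at SIMD16 needs 16 threads. If each group also requests 48 KiB of SLM, only two groups fit, so 32 of 64 threads are resident and the estimate is 0.5. With no SLM request, four groups fit and the estimate is 1.0.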