intel/compiler: Fix INTEL_DEBUG no8, no16 and do32 on compute shaders
Previously, no8 wasn't supported for compute shaders.
If no16 was specified, the shader couldn't run in SIMD8 due to the local_size, and the SIMD32 program spilled registers, then the compilation would fail. This was because run_cs was fed a min_dispatch_width of 16, which causes the SIMD32 compilation to fail if there is register spilling.
If do32 was used but the program spilled registers, a lower SIMD size could be chosen instead of SIMD32. This was because run_cs was fed min_dispatch_width, and if this was 8 or 16, a spilled SIMD32 program would be aborted.
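
For illustration only, the old behavior described above can be modeled with a small standalone sketch. This is not the Mesa code: old_pick_width() and its parameters are hypothetical names, and the fallback order (narrowest width first) is an assumption made to keep the example short. The only rule taken from this message is that a spilling SIMD32 program is aborted when the min_dispatch_width fed in is 8 or 16.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the old selection, assuming SIMD32 is the width
 * we want (do32, or the only width left). Not the actual Mesa API.
 *
 * simd8_ok / simd16_ok: whether that width is still allowed after the
 * debug flags and the local_size restriction are applied.
 * Returns the chosen width, or 0 if compilation fails.
 */
static unsigned
old_pick_width(bool simd8_ok, bool simd16_ok, bool simd32_spilled,
               unsigned min_dispatch_width)
{
   assert(min_dispatch_width == 8 || min_dispatch_width == 16 ||
          min_dispatch_width == 32);

   /* Rule from the message: a spilled SIMD32 is aborted whenever the
    * minimum dispatch width passed in is below 32. */
   bool simd32_ok = !simd32_spilled || min_dispatch_width >= 32;

   if (simd32_ok)
      return 32;

   /* Assumed fallback order for this sketch: narrowest width first. */
   if (simd8_ok)
      return 8;
   if (simd16_ok)
      return 16;

   return 0; /* nothing left: the compile fails */
}
```

In this model, old_pick_width(false, false, true, 16) returns 0, which is the no16 failure, and old_pick_width(true, true, true, 8) returns 8, which is the do32 case silently dropping to a lower width.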
Cc: @fjdegroo