CL: More misc fixes (!304) · Merge requests · Erik Faye-Lund / mesa

This series has 3 critical bugfixes:

Dominance indexing using 16-bit signed indices breaks 16-component sin/cos/tan libclc implementations.
The system values lowering causes us to ignore work group offsets when no global offsets are specified. This breaks several 3-component math bruteforce tests, which end up using small work group sizes with large global dimensions, requiring us to loop over them.
Fix conformance of fdiv to match what CL requires, even if the D3D driver lowers it to a separate reciprocal and multiply.
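
For context on the fdiv item, here is a minimal sketch in plain C (not code from this series) of why a division lowered to reciprocal plus multiply can miss the result of a direct division: the lowered form rounds twice. The helper names `div_direct`, `div_via_rcp` and `ulp_distance` are made up for illustration, and the sketch assumes IEEE-754 single precision, round-to-nearest, and no fast-math flags.

```c
#include <stdint.h>
#include <string.h>

/* Direct single-precision division: one rounding step. */
static float div_direct(float a, float b)
{
    volatile float q = a / b; /* volatile forces rounding to float */
    return q;
}

/* Division lowered to reciprocal + multiply: two rounding steps. */
static float div_via_rcp(float a, float b)
{
    volatile float r = 1.0f / b; /* first rounding */
    volatile float p = a * r;    /* second rounding */
    return p;
}

/* Distance in ULPs between two finite floats of the same sign,
 * computed from their bit patterns. */
static uint32_t ulp_distance(float x, float y)
{
    uint32_t ux, uy;
    memcpy(&ux, &x, sizeof ux);
    memcpy(&uy, &y, sizeof uy);
    return ux > uy ? ux - uy : uy - ux;
}
```

With these helpers, `div_direct(5.0f, 3.0f)` and `div_via_rcp(5.0f, 3.0f)` come out one ULP apart, while `3.0f / 3.0f` happens to agree in both forms. Whether a given error is acceptable depends on the accuracy bounds in the OpenCL spec, which this sketch does not check.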

It also has several non-critical changes:

The vec3/vec4 pass allows copy_prop to work without using scratch memory when passing vec3 by value. Otherwise clang generates some bizarre code that normal copy_prop can't see through. This probably should be generalized to leverage OOB variable reads/writes in upstream.
Add a bunch of optimizations early in the compilation, rather than just relying on the optimization loop in nir_to_dxil. This lets the code actually be readable before lower_explicit_io.
Fix some more hardcoded 4s that should be vector-sized. This ignores the ones that are already part of mesa/mesa!6655 (merged) -- I'll add these to that MR.
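
To illustrate the "hardcoded 4s" item, a small hypothetical sketch (not taken from this series) of the general pattern: deriving a per-component write mask from the vector size instead of assuming 4 components, since CL vectors can have up to 16. The function name `component_mask` is an assumption for illustration only.

```c
#include <stdint.h>

/* Hypothetical helper: mask with one bit set per vector component.
 * Valid for 1..16 components; a hardcoded 0xf would only cover vec4. */
static uint32_t component_mask(unsigned num_components)
{
    return (1u << num_components) - 1u;
}
```

For example, `component_mask(3)` yields `0x7` and `component_mask(16)` yields `0xffff`, whereas a fixed `0xf` would silently drop the upper 12 components of a 16-wide vector.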
