CL: More misc fixes

Jesse Natalie requested to merge jenatali/mesa:msclc/more-cl-fixes into msclc-d3d12

This series has 3 critical bugfixes:

  • Dominance indexing using 16-bit signed indices breaks 16-component sin/cos/tan libclc implementations.
  • The system values lowering ignores work group offsets when no global offsets are specified. This breaks several 3-component math bruteforce tests, which end up using small work group sizes with large global dimensions, so we have to loop over multiple dispatches.
  • Fix fdiv so that its results match what CL requires, even if the D3D driver lowers it to a separate reciprocal and multiply.
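
To illustrate the work group offset fix, here is a minimal 1D sketch of how a global ID could be rebuilt when a large NDRange is split across several dispatches. The struct, field, and function names are hypothetical and not taken from this series. The sketch assumes the looping is driven by D3D12's limit of 65535 thread groups per dimension per dispatch. The point is that the group offset has to be applied whether or not a global offset was supplied.

```c
#include <assert.h>
#include <stdint.h>

/* D3D12 limit on thread groups per dimension in a single dispatch. */
#define MAX_GROUPS_PER_DISPATCH 65535u

/* Hypothetical per-dispatch constants handed to the kernel (1D for brevity). */
struct dispatch_params {
    uint32_t global_offset;   /* global work offset from the enqueue call */
    uint32_t group_id_offset; /* index of the first work group in this dispatch */
    uint32_t local_size;      /* work group size */
};

/* Number of dispatches needed to cover total_groups work groups. */
static uint32_t num_dispatches(uint32_t total_groups)
{
    return total_groups / MAX_GROUPS_PER_DISPATCH +
           (total_groups % MAX_GROUPS_PER_DISPATCH != 0u ? 1u : 0u);
}

/* Rebuild the CL global ID from the hardware group ID and local ID.
 * The group offset is added unconditionally, so it is honored even
 * when global_offset is zero. */
static uint32_t global_id(const struct dispatch_params *p,
                          uint32_t hw_group_id, uint32_t local_id)
{
    assert(local_id < p->local_size);
    uint32_t group_id = hw_group_id + p->group_id_offset;
    return group_id * p->local_size + local_id + p->global_offset;
}
```

With a work group size of 2 and 65536 groups in total, the second dispatch would start at group_id_offset = 65535, and its first group's invocation with local ID 1 maps to global ID 131071.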

It also has several non-critical changes:

  • The vec3/vec4 pass allows copy_prop to work without using scratch memory when passing vec3 by value. Otherwise clang generates some bizarre code that normal copy_prop can't see through. This probably should be generalized to leverage OOB variable reads/writes in upstream.
  • Add a bunch of optimizations early in the compilation, rather than just relying on the optimization loop in nir_to_dxil. This lets the code actually be readable before lower_explicit_io.
  • Fix some more hardcoded 4s that should be vector-sized. This ignores the ones that are already part of mesa/mesa!6655 (merged) -- I'll add these to that MR.
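
As a rough illustration of the "hardcoded 4s" item, one common form of this pattern is a write mask fixed at 0xf, which silently drops components 4 through 15 of wider CL vectors. The sketch below derives the mask from the component count instead. The helper name and the constant are hypothetical, and the description above does not say which sites were actually changed. The bound of 16 is the widest vector OpenCL C defines.

```c
#include <assert.h>
#include <stdint.h>

/* OpenCL C vectors have at most 16 components. */
#define MAX_VEC_COMPONENTS 16u

/* Hypothetical helper: a mask with one bit set per vector component,
 * replacing a hardcoded 0xf that is only correct for vec4. */
static uint32_t full_writemask(unsigned num_components)
{
    assert(num_components >= 1u && num_components <= MAX_VEC_COMPONENTS);
    return (1u << num_components) - 1u;
}
```

For a vec4 this still yields 0xf, while a 16-component vector gets 0xffff.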
