microsoft/compiler: Fix lower_mem_access_bit_size callback result

Jesse Natalie requested to merge jenatali/mesa:dxil-mem-access-size-fix into main

When given (e.g.) 3x 16-bit components to store on a device that isn't using native 16-bit loads and stores, we should lower that into one 32-bit store and one masked store. Instead, the logic here ends up reporting that the best we can do is one 8-byte store, which is clearly wrong for 6 bytes of data. The divide-and-round-up should only round up to 1, not beyond.
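To make the arithmetic concrete, here is a minimal standalone sketch of the two rounding behaviors. It is not the Mesa callback: `comps_round_up`, `comps_clamped`, `split_t` and `split_store` are hypothetical names made up for illustration, and the loop only approximates how a lowering pass would revisit the remainder of an access.

```c
/* Hypothetical illustration only; none of these names exist in Mesa. */

/* Old behavior (assumed): plain divide-and-round-up to 32-bit components.
 * For 6 bytes this yields 2 components, i.e. an 8-byte access. */
static unsigned comps_round_up(unsigned bytes)
{
    return (bytes + 3u) / 4u;
}

/* Intended behavior (assumed): round down, but never below 1 component.
 * For 6 bytes this yields 1 component; the 2 leftover bytes are handled
 * separately. */
static unsigned comps_clamped(unsigned bytes)
{
    unsigned n = bytes / 4u;
    return n > 0 ? n : 1u;
}

typedef struct {
    unsigned full_dwords;  /* number of whole 32-bit stores */
    unsigned masked_bytes; /* bytes covered by a trailing masked store, or 0 */
} split_t;

/* Repeatedly take the chunk chosen by comps_clamped until nothing is left. */
static split_t split_store(unsigned bytes)
{
    split_t s = { 0, 0 };
    while (bytes > 0) {
        unsigned comps = comps_clamped(bytes);
        if (bytes >= 4u) {
            s.full_dwords += comps;
            bytes -= comps * 4u;
        } else {
            /* Fewer than 4 bytes remain: one 32-bit masked store. */
            s.masked_bytes = bytes;
            bytes = 0;
        }
    }
    return s;
}
```

With 6 bytes (3x 16-bit), `comps_round_up` picks 2 components, while `split_store` ends up with one full 32-bit store plus a masked store covering the remaining 2 bytes.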

Fixes the vector_swizzle CL test on NVIDIA for me. Apparently the pair of and/or, with a constant 0xffffffff on the and, did something weird in their backend. It was technically right AFAICT and it worked on WARP, but it was definitely suboptimal.

Also fix the assert that should have caught this.
