nir_lower_mem_access_bit_sizes: Fix write-mask-constrained 3-byte stores as atomics

Jesse Natalie requested to merge jenatali/mesa:cl-vectorized-io-regression into main

The code here handled stores of actual 3-byte values (8-bit, 3-component), but didn't correctly handle stores of larger 8-bit vectors that were constrained by write mask to just 3 bytes. In that case, the pad-to-vec4 step was unnecessary and problematic.
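As a rough illustration of the two cases, here is a minimal standalone sketch, not the pass's actual code. It assumes a masked byte store into one aligned 32-bit word is emulated by an AND that clears the written bytes followed by an OR that sets them, with component `i` living in bits `8*i`. The helper name `masked_store_word` and its parameters are hypothetical, and plain integer operations stand in for the atomics.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: emulate storing the write-masked 8-bit components
 * of a vector into a single aligned 32-bit word. The AND/OR pair below
 * stands in for the two atomics; no real NIR API is used here. */
static uint32_t
masked_store_word(uint32_t word, const uint8_t *comps,
                  unsigned num_comps, unsigned writemask)
{
   assert(num_comps <= 4);

   uint32_t clear = 0; /* bits covered by the bytes being written */
   uint32_t set = 0;   /* new values for those bytes */

   for (unsigned i = 0; i < num_comps; i++) {
      if (!(writemask & (1u << i)))
         continue; /* component not written: leave its byte untouched */
      clear |= (uint32_t)0xff << (8 * i);
      set |= (uint32_t)comps[i] << (8 * i);
   }

   word &= ~clear; /* stands in for an atomic AND */
   word |= set;    /* stands in for an atomic OR */
   return word;
}
```

In this sketch, a true 3-component store (`num_comps == 3`, write mask `0x7`) and a 4-component store constrained to write mask `0x7` touch the same three bytes. The second already supplies four components, so it needs no padding before the mask and value are built.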

Seen in CL CTS test_basic vector_swizzle test group for char3 with CLOn12.

Fixes: c70d94a8 ("nir_lower_mem_access_bit_sizes: Support unaligned stores via a pair of atomics")
