
nir/lower_vec_to_movs: don't vectorize unsupported ops

Erico Nunes requested to merge enunes/mesa:nir-vec-to-movs-scalar into master

If the instruction being coalesced would be vectorized but the target doesn't support vectorizing that op, skip coalescing. Reuse the callbacks from alu_to_scalar to describe which ops should not be vectorized.
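The check described above can be sketched as a small standalone model. This is only an illustration, not the code from the patch: the `toy_` names, the 4-bit write mask, and the callback signature are all assumptions made up for this example and do not correspond to the real NIR API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy opcode set standing in for real NIR opcodes (hypothetical). */
enum toy_op { TOY_OP_FADD, TOY_OP_FLOG2 };

/* Hypothetical filter callback, in the spirit of the alu_to_scalar
 * callbacks: returns true when the target needs this op kept scalar. */
typedef bool (*toy_scalar_filter_cb)(enum toy_op op, const void *data);

/* Example target filter: only flog2 must stay scalar. */
static bool toy_target_scalar_only(enum toy_op op, const void *data)
{
    (void)data;
    return op == TOY_OP_FLOG2;
}

/* Count the components enabled in a 4-bit write mask (x=1, y=2, z=4, w=8). */
static unsigned toy_mask_components(unsigned write_mask)
{
    unsigned n = 0;
    for (unsigned i = 0; i < 4; i++)
        n += (write_mask >> i) & 1u;
    return n;
}

/* Decide whether an ALU op may be coalesced into the vec destination.
 * Coalescing onto more than one component would vectorize the op, so it
 * is skipped when the target's filter says the op must remain scalar. */
static bool toy_can_coalesce(enum toy_op op, unsigned write_mask,
                             toy_scalar_filter_cb cb, const void *data)
{
    assert(write_mask != 0 && write_mask <= 0xfu);

    if (toy_mask_components(write_mask) > 1 && cb != NULL && cb(op, data))
        return false; /* caller falls back to a plain mov */

    return true;
}
```

With a filter that marks flog2 as scalar-only, a write to a single component (mask 0x1) is still coalesced, while a write to three components (mask 0xe) is rejected and would be handled with a mov instead. Passing no callback keeps the previous behavior of always coalescing.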

In lima, this fixes a bug where a nir_op_flog2 ends up vectorized by the late lower_vec_to_movs pass. lima does handle nir_op_flog2 in nir_lower_alu_to_scalar, but in the case of dEQP-GLES2.functional.shaders.random.exponential.fragment.11 the shader contains the following sequence:

vec1 32 ssa_4 = flog2 ssa_2.x
vec4 32 ssa_5 = vec4 ssa_3, ssa_4, ssa_4, ssa_4

This passes through nir_lower_alu_to_scalar, but near the end of the NIR optimization passes, because of the vec4 op, it becomes:

r0.x = flog2 ssa_2.y
r0.yzw = flog2 ssa_2.xxx

which is not possible to implement in the mali400 pp and causes the bug.

With this patch, it becomes:

r0.x = flog2 ssa_13.y
vec1 32 ssa_4 = flog2 ssa_14.x
r0.yzw = mov ssa_4.xxx

which is a reasonable way to implement this for lima.

Fixes:

dEQP-GLES2.functional.shaders.random.trigonometric.fragment.65
dEQP-GLES2.functional.shaders.random.exponential.fragment.11
dEQP-GLES2.functional.shaders.random.exponential.fragment.12
dEQP-GLES2.functional.shaders.random.exponential.fragment.37
dEQP-GLES2.functional.shaders.random.exponential.fragment.74
dEQP-GLES2.functional.shaders.random.all_features.fragment.37
