Draft: nir/lower_tex: Add lower_array_layer_clamp option (!18526) · Merge requests · Mesa / mesa

Alyssa Rosenzweig requested to merge asahi/mesa:nir/tex-layer-clamp into main Sep 09, 2022

Bounds checking as required by the spec.

The main issue I see with this approach is the extra conversion incurred. On main, the backend lowering for array textures is:

umin(max_layer_ui, (uint) layer_f + 0.5f)

Notice that the layer is input to NIR as a floating point but input to the hardware as an unsigned integer. That means lowering the clamp in NIR instead gets the slightly different sequence:

(uint) fmin((float) max_layer_ui, froundeven(layer_f))

This costs an extra instruction to convert max_layer_ui from uint->float, and because that happens in the backend, the backend would need a fairly intensive algebraic optimizer to see through the pattern and get the original code. Worse, there's a subtle bug there: what happens for negative layers? Strictly, both lowering incur undefined behaviour in that case, but the undefined behaviour in the latter is worse... In the former, it's simply undefined which of the layers will be sampled from. In the latter, it's undefined whether the layer will exist at all, defeating the purpose of this optimization! We could stick in an extra fmax in there, but see also note about needing a seriously gnarly backend optimizer to chew through the mess.

What I really want is an integer nir_tex_src_layer to operate on. Cc @jekstrand. Would any other hardware benefit from that? (Mali would not necessarily, because Mali does (uint) layer_f + 0.5f in a single f2u32_rte instruction that we don't model in NIR.)

Cc @DadSchoorse

Edited Sep 09, 2022 by Alyssa Rosenzweig

Draft: nir/lower_tex: Add lower_array_layer_clamp option

Merge request reports