RFC: spirv: relaxed precision, approach to extend it
Hi, until very recently the RelaxedPrecision decoration was not handled by the spirv_to_nir pass. Lately we have been trying to use it for v3dv. VideoCore doesn't support half-floats, so we use the decoration to decide the precision of texture operations. If you are curious, MR !7545 uses it (already reviewed, to be merged as soon as CI gives us the green light).
That MR includes a small spirv patch to support RelaxedPrecision. That is the most basic support for the decoration. From the SPIR-V spec (with the bullet points replaced by numbers to make the explanation below easier to follow):
The RelaxedPrecision Decoration can be applied to:
1. The <id> of a variable, where the variable’s type is a scalar, vector, or matrix, or an array of scalar, vector, or matrix. In all cases, the components in the type must be a 32-bit numerical type.
2. The Result <id> of an instruction that operates on numerical types, meaning the instruction is to operate at relaxed precision.
3. The Result <id> of an OpFunction meaning the function’s returned result is at relaxed precision. It cannot be applied to OpTypeFunction or to an OpFunction whose return type is OpTypeVoid.
4. A structure-type member (through OpMemberDecorate).
When applied to a variable or structure member, all loads and stores from the decorated object may be treated as though they were decorated with RelaxedPrecision. Loads may also be decorated with RelaxedPrecision, in which case they are treated as operating at relaxed precision.
All loads and stores involving relaxed precision still read and write 32 bits of data, respectively. Floating-point data read or written in such a manner is written in full 32-bit floating-point format. However, a load or store might reduce the precision (as allowed by RelaxedPrecision) of the destination value.
The patch mentioned above covers case 1: if the sampler/texture variable used by a texture operation is decorated with RelaxedPrecision, we lower the precision of that operation.
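To make case 1 concrete, here is a minimal toy model in Python. It is not Mesa code: `tex_return_bits` and the `decorations` table are hypothetical names made up for illustration, and the real decision happens inside the driver and spirv_to_nir.

```python
RELAXED_PRECISION = "RelaxedPrecision"

def tex_return_bits(sampler_var_id, decorations):
    """Pick the return size of a texture operation (toy model).

    decorations maps a SPIR-V id to the set of decorations applied to it.
    If the sampler/texture variable is RelaxedPrecision, the operation may
    return 16-bit data; otherwise it stays at 32 bits.
    """
    if RELAXED_PRECISION in decorations.get(sampler_var_id, set()):
        return 16
    return 32

# Assumed ids: variable 10 is a mediump sampler, variable 11 is undecorated.
decorations = {10: {RELAXED_PRECISION}}
print(tex_return_bits(10, decorations))  # 16
print(tex_return_bits(11, decorations))  # 32
```

Note that only the variable's decoration is consulted here; nothing is inferred about the ALU operations that consume the texture result.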
As with GLSL mediump/lowp, one of the challenges is deciding whether an operation can be executed at lower precision. The decoration only applies to variables and results, so the main job of the lowering is to infer from them the precision of the operations involved.
It is also tricky to decide where to do it (in spirv_to_nir? as a NIR lowering pass?). Take case 2 of the spec as an example. Since the decoration can be applied to the Result id of an instruction, and that directly affects the instruction, it seems reasonable to think that the most straightforward place to make that decision is the spirv_to_nir pass, which processes both. The patch included with this MR is a rough/WIP attempt to handle it. I didn't try to get it working because I wanted to discuss all this first.
After case 2, everything else is less direct. Whether an operation can be done at lower precision depends on where the final value is loaded, on the intermediate variables, and on all the decorations involved. That sounds very similar to the lowering implemented on the GLSL IR. Chatting with @nroberts, he mentioned that ideally glslang or some other tool would already do that work for us (or most of it).
After testing glslangValidator a little, it seems to do basically a direct mapping from GLSL mediump/lowp to SPIR-V RelaxedPrecision, without trying anything fancy. That means it mostly only covers case 1 of the spec text quoted above (perhaps 4 too, still to be tested). I was not able to get the decoration applied to the result id of an operation (case 2) with glslangValidator alone.
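As a rough illustration of why case 2 maps naturally onto the translation pass, the toy Python sketch below walks a simplified instruction stream. It relies on the fact that SPIR-V places annotation instructions before function bodies, so the decoration is already known when the decorated instruction is reached. The tuple encoding and the `translate` function are assumptions for illustration only, not the spirv_to_nir data structures.

```python
RELAXED_PRECISION = "RelaxedPrecision"

def translate(instructions):
    """Translate a simplified instruction stream (toy model).

    Each instruction is a tuple: ("OpDecorate", target, decoration) or
    (opcode, result_id, *sources). Decorations are recorded as they are
    seen, and each later instruction is flagged as relaxed when its own
    result id carries RelaxedPrecision.
    """
    decorated = set()
    out = []
    for opcode, *operands in instructions:
        if opcode == "OpDecorate":
            target, decoration = operands
            if decoration == RELAXED_PRECISION:
                decorated.add(target)
        else:
            result_id = operands[0]
            out.append({"op": opcode,
                        "result": result_id,
                        "relaxed": result_id in decorated})
    return out

module = [
    ("OpDecorate", "%sum", "RelaxedPrecision"),
    ("OpFAdd", "%sum", "%a", "%b"),
    ("OpFMul", "%prod", "%sum", "%c"),
]
ops = translate(module)
```

Here `%sum` ends up flagged as relaxed and `%prod` does not, because no inference is attempted: only the instruction's own result id is checked.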
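To give an idea of what such an inference could look like, here is a toy Python sketch of one possible rule: an undecorated operation is treated as relaxed once every one of its sources is relaxed, iterated to a fixed point. This rule is an assumption chosen for illustration. It is not what the GLSL IR lowering does, and `infer_relaxed` is a hypothetical name.

```python
def infer_relaxed(ops, seed):
    """Propagate relaxed precision forward through a dependency graph.

    ops maps a result id to the list of ids it reads; seed is the set of
    ids that already carry RelaxedPrecision. An id is added when all of
    its sources are relaxed, and the loop repeats until nothing changes.
    """
    relaxed = set(seed)
    changed = True
    while changed:
        changed = False
        for result, sources in ops.items():
            if (result not in relaxed and sources
                    and all(s in relaxed for s in sources)):
                relaxed.add(result)
                changed = True
    return relaxed

# Assumed example: %a and %b are loads from mediump variables,
# %hp is a load from a highp variable.
ops = {
    "%sum": ["%a", "%b"],
    "%scaled": ["%sum", "%a"],
    "%mix": ["%scaled", "%hp"],
}
relaxed = infer_relaxed(ops, {"%a", "%b"})
```

With this rule `%sum` and `%scaled` become relaxed while `%mix` stays at full precision because it reads `%hp`. A real pass would also have to consider how the results are consumed, which this sketch ignores.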
spirv-tools recently added some RelaxedPrecision-related options to spirv-opt [1]:
- --relax-float-ops: apply the RelaxedPrecision decoration to all float32 executable instructions (so with this tool I got the result id decorated as in case 2 of the spec)
- --convert-relaxed-to-half: quoting: "This pass translates all arithmetic float32 instructions decorated with RelaxedPrecision to 16-bit floating point (float16 aka half) instructions, adding additional conversion instructions for operands and results where needed"
The last one is perhaps the most interesting, and perhaps also explains why full RelaxedPrecision support was not requested before. The paper mentions that this also happens with other drivers, or when using a layered driver (MoltenVK, where Metal doesn't support RelaxedPrecision but can support 16-bit floating point). Unfortunately that would not help in our case.
So, after that long introduction, here are some questions to start the discussion:
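For reference, the effect that the --convert-relaxed-to-half description implies (convert the operands to half, operate, convert the result back) can be approximated in a few lines of Python. This is only a numerical sketch, not what spirv-opt emits: the helper names are made up, and the add is computed in Python's double precision and then rounded to float16, which only approximates a native half-float add.

```python
import struct

def to_f16(x):
    """Round a Python float to the nearest float16 value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def to_f32(x):
    """Round a Python float to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def fadd_f32(a, b):
    """Full-precision path: operands and result are float32."""
    return to_f32(to_f32(a) + to_f32(b))

def fadd_relaxed_as_half(a, b):
    """Relaxed path: narrow the operands, add, then widen the result."""
    half_sum = to_f16(to_f16(a) + to_f16(b))
    return to_f32(half_sum)  # the stored value is still a 32-bit float

full = fadd_f32(1.0, 0.0001)
half = fadd_relaxed_as_half(1.0, 0.0001)
```

In this example the small addend is lost on the half path (`half` is exactly 1.0) while `full` remains slightly above 1.0, which is the kind of precision loss RelaxedPrecision allows.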
- How interesting would it be to expand (and then maintain) RelaxedPrecision support? Although it could be interesting to us, other drivers may assume that applications use something equivalent to --convert-relaxed-to-half, so the SPIR-V is already "baked", which makes any work in the driver less interesting.
- Where should this be done? Right now, what seems to make sense is a mixed approach: adding some extra support to spirv_to_nir (like case 2 mentioned above) and doing any kind of higher-level inference as a NIR lowering pass.
- How reasonable would it be to add some precision hint to texture operations? As mentioned, in the v3dv case we don't support 16-bit floating-point operations, but we can select 16-bit precision on the texture operation.