pan/bi: Implement fquantize2f16
Implement as f2f32(f2f16(x)) with the conversions in flush-to-zero mode. Accessing flush-to-zero mode on Bifrost is nontrivial: it is specified per-clause, rather than per-instruction. I've opted to pipe support for ftz clauses through the scheduler. This solution has two nice properties:
- It uses the native hardware for flushing subnormals, avoiding extra lowering.
- It's "smart" about scheduling around FTZ requirements, meaning we get good code generated even for a shader that e.g. quantizes a vector.
With an unrelated scheduler fix, the *V2F32_TO_V2F16/+F16_TO_F32 operation fits in a single tuple, minimizing the overhead of the special FTZ clause.
We'll have to do something a bit different for Valhall (FLUSH.f32), but we'll worry about when we actually have PanVK brought up on Valhall.
Fixes dEQP-VK.spirv_assembly.instruction.compute.opquantize.*
Cc @bbrezillon