Commit c995d1ca authored by Ian Romanick

nir/flrp: Lower flrp(a, b, #c) differently

This doesn't help on Intel GPUs now because we always take the
"always_precise" path first.  It may help on other GPUs, and it does
prevent a bunch of regressions in "intel/compiler: Don't always require
precise lowering of flrp".
Reviewed-by: Matt Turner <mattst88@gmail.com>
parent ae02622d
@@ -555,6 +555,23 @@ convert_flrp_instruction(nir_builder *bld,
      }
   }
   /*
    * - If t is constant:
    *
    *        x(1 - t) + yt
    *
    *   The cost is three instructions without FMA or two instructions with
    *   FMA.  This is the same cost as the imprecise lowering, but it gives
    *   the instruction scheduler a little more freedom.
    *
    *   There is no need to handle t = 0.5 specially.  nir_opt_algebraic
    *   already has optimizations to convert 0.5x + 0.5y to 0.5(x + y).
    */
   if (alu->src[2].src.ssa->parent_instr->type == nir_instr_type_load_const) {
      replace_with_strict(bld, dead_flrp, alu);
      return;
   }
   /*
    * - Otherwise
    *
......
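
The instruction counts in the comment above can be illustrated with plain
scalar C. This is only a sketch of the arithmetic, not NIR code: the names
flrp_strict and flrp_imprecise are hypothetical, and the imprecise form
x + t(y - x) is assumed to be what the comment calls "the imprecise
lowering". When t is a compile-time constant, (1 - t) folds to a constant,
so the strict form needs two multiplies and an add (or a multiply and an
FMA). The two multiplies do not depend on each other, which is presumably
the scheduling freedom the comment refers to; the imprecise form is a
serial sub, mul, add chain.

```c
/* Hypothetical helpers for illustration; these are not Mesa functions. */

/* Strict form: x(1 - t) + yt.
 * With constant t, (1 - t) is itself a constant, leaving
 * mul + mul + add, or mul + fma when FMA is available.
 * The products x*(1 - t) and y*t are independent of each other. */
static float flrp_strict(float x, float y, float t)
{
    return x * (1.0f - t) + y * t;
}

/* Assumed imprecise form: x + t(y - x).
 * sub + mul + add, or sub + fma, but each step depends on the previous
 * one, and the result at t = 1 is not guaranteed to equal y exactly. */
static float flrp_imprecise(float x, float y, float t)
{
    return x + t * (y - x);
}
```

For example, flrp_strict(2.0f, 10.0f, 0.25f) and
flrp_imprecise(2.0f, 10.0f, 0.25f) both evaluate to 4.0f, while
flrp_strict(x, y, 1.0f) returns y exactly for finite x.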