nir, ir3: Improve fmulz handling for d3d9 with strict float emulation
Add the lower_fmulz_with_abs_min NIR option, which lowers:

  fmulz -> min(abs(a), abs(b)) == 0.0 ? 0.0 : a * b
  ffmaz -> min(abs(a), abs(b)) == 0.0 ? c : ffma(a, b, c)

This is useful for ISAs which have abs for free on min, such as ir3.
Benchmark on Adreno A750: 10 runs of 5 trimmed single-frame DX9
captures, each looped 2048 times (-b --headless --singlethread), using
u_trace to measure the time from start_render_pass to end_render_pass.
Results relative to main@1e623ad3:
sysmem:
-1.91156%, -2.21791%, -2.02533%, -2.21666%, -2.33272%,
-2.67349%, -1.75278%, -2.05923%, -2.26892%, -2.10506%
Avg: ~ -2.16%
Std. dev. (sample): ~ 0.25%
gmem:
-3.61496%, -3.66682%, -3.80901%, -3.51198%, -3.72950%,
-3.71413%, -3.64467%, -3.67092%, -3.90640%, -3.83888%
Avg: ~ -3.71%
Std. dev. (sample): ~ 0.12%
Edited by Karmjit Mahil