intel/rt: Don't directly generate umul_32x16
While working on something else, I noticed a umul_32x16(0, ...)
. That led to this commit.
The optimization pass will (eventually) turn the imul
into a umul_32x16
. In many cases, the multiply will be converted to something
else.
I also tried cloning a bunch of existing imul
algebraic patterns for [iu]mul_32x16
. This produced the same result, but it was a lot more
churn.
All of the shaders affected were ray tracing shaders in Q2RTX. This is the only ray tracing workload in my fossil-db.
DG2
Totals:
Instrs: 191995626 -> 191995079 (-0.00%); split: -0.00%, +0.00%
Cycles: 14003803561 -> 14003798040 (-0.00%); split: -0.00%, +0.00%
Spill count: 108320 -> 108288 (-0.03%)
Fill count: 200695 -> 200663 (-0.02%)
Scratch Memory Size: 8755200 -> 8754176 (-0.01%)
Totals from 7 (0.00% of 652118) affected shaders:
Instrs: 14998 -> 14451 (-3.65%); split: -3.94%, +0.29%
Cycles: 137222 -> 131701 (-4.02%); split: -4.10%, +0.07%
Spill count: 32 -> 0 (-inf%)
Fill count: 32 -> 0 (-inf%)
Scratch Memory Size: 19456 -> 18432 (-5.26%)