
amd/common,aco: optimize is_subgroup_invocation_lt_amd

s_bfe_u64 uses 7 bits for the size, unlike s_bfm_b64 which uses only 6, so it can encode a size of 64 directly and doesn't need the < 64 special case.
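
For illustration, here is a small C model of the two instructions, assuming the documented scalar semantics (s_bfm_b64 truncates its size operand to bits [5:0]; s_bfe_u64 takes the offset from bits [5:0] and the width from bits [22:16] of its second operand). The function names are hypothetical and this is not the code from the patch, only a sketch of why the special case goes away.

```c
#include <assert.h>
#include <stdint.h>

/* Model of s_bfm_b64: the size operand is truncated to 6 bits,
 * so a size of 64 wraps to 0 and produces an empty mask. */
static uint64_t model_s_bfm_b64(uint32_t size, uint32_t offset)
{
    uint32_t s = size & 63;
    uint32_t o = offset & 63;
    return ((UINT64_C(1) << s) - 1) << o;
}

/* Model of s_bfe_u64: offset in ctrl[5:0], width in ctrl[22:16] (7 bits).
 * Widths of 64 or more are assumed to keep every remaining bit. */
static uint64_t model_s_bfe_u64(uint64_t src, uint32_t ctrl)
{
    uint32_t offset = ctrl & 63;
    uint32_t width = (ctrl >> 16) & 127;
    uint64_t shifted = src >> offset;
    if (width >= 64)
        return shifted; /* avoid an undefined 64-bit shift in C */
    return shifted & ((UINT64_C(1) << width) - 1);
}

/* Mask of lanes with id < count, built with s_bfm_b64.
 * count == 64 has to be handled separately. */
static uint64_t lanes_lt_bfm(uint32_t count)
{
    return count < 64 ? model_s_bfm_b64(count, 0) : UINT64_MAX;
}

/* Same mask built with s_bfe_u64 on an all-ones source; no special case. */
static uint64_t lanes_lt_bfe(uint32_t count)
{
    return model_s_bfe_u64(UINT64_MAX, count << 16);
}

/* Both variants agree for every count a wave64 subgroup can produce. */
static void check_equivalence(void)
{
    for (uint32_t count = 0; count <= 64; count++)
        assert(lanes_lt_bfm(count) == lanes_lt_bfe(count));
}
```

Calling check_equivalence() passes for counts 0 through 64, while model_s_bfm_b64(64, 0) on its own returns 0, which is the case the old select had to cover.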

Foz-DB Navi31:
Totals from 40402 (51.01% of 79206) affected shaders:
Instrs: 15265193 -> 15101680 (-1.07%); split: -1.07%, +0.00%
CodeSize: 76454300 -> 75798448 (-0.86%); split: -0.86%, +0.00%
Latency: 74089015 -> 73944620 (-0.19%); split: -0.20%, +0.00%
InvThroughput: 9317624 -> 9314434 (-0.03%); split: -0.04%, +0.01%
Copies: 1119387 -> 1117233 (-0.19%)
SALU: 1741281 -> 1590389 (-8.67%)
