intel/compiler: Use CMPN for min and max on Gen4/5 (!9027)

Ian Romanick requested to merge idr/mesa:review/issue-4254 into master Feb 13, 2021

On Intel platforms before Gen6, there is no min or max instruction. Instead, a comparison instruction (more on this below) and a SEL instruction are used. Per the IEEE 754 rules, the regular comparison instruction, CMP, will always return false if either source is NaN. A sequence like

cmp.l.f0.0(16)  null<1>F        g30<8,8,1>F     g22<8,8,1>F
(+f0.0) sel(16) g8<1>F          g30<8,8,1>F     g22<8,8,1>F

will generate the wrong result for min if g22 is NaN. The CMP will return false, and the SEL will pick g22, which is the NaN, instead of g30.

To account for this, the hardware has a special comparison instruction, CMPN. This instruction behaves just like CMP, except that it returns true if the second source is NaN. The intention is to use it for min and max. This sequence will always generate the correct result:

cmpn.l.f0.0(16) null<1>F        g30<8,8,1>F     g22<8,8,1>F
(+f0.0) sel(16) g8<1>F          g30<8,8,1>F     g22<8,8,1>F
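
The difference between the two sequences can be modeled on the CPU. The following is a minimal sketch in C, assuming only the CMP, CMPN, and SEL behavior described above; the function names (`cmp_l`, `cmpn_l`, `sel`, `min_with_cmp`, `min_with_cmpn`) are made up for illustration and are not part of Mesa or the hardware documentation.

```c
#include <math.h>
#include <stdbool.h>

/* Model of CMP with the "less than" conditional modifier: an ordered
 * comparison, so any NaN source makes the result false. */
static bool cmp_l(float src0, float src1)
{
    return src0 < src1;
}

/* Model of CMPN.l as described above: same as CMP, except that a NaN
 * second source forces the result to true. */
static bool cmpn_l(float src0, float src1)
{
    return isnan(src1) || src0 < src1;
}

/* Model of a predicated SEL: src0 when the flag is set, otherwise src1. */
static float sel(bool flag, float src0, float src1)
{
    return flag ? src0 : src1;
}

/* min built from CMP + SEL: returns NaN when b is NaN. */
static float min_with_cmp(float a, float b)
{
    return sel(cmp_l(a, b), a, b);
}

/* min built from CMPN + SEL: returns the non-NaN operand when only
 * one of a and b is NaN. */
static float min_with_cmpn(float a, float b)
{
    return sel(cmpn_l(a, b), a, b);
}
```

In this model, `min_with_cmp(1.0f, NAN)` yields NaN, while `min_with_cmpn(1.0f, NAN)` and `min_with_cmpn(NAN, 1.0f)` both yield 1.0. When the first source is NaN, the comparison is false either way and the SEL picks the second source, so only the NaN-in-src1 case needs the special handling.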

The problem is... for whatever reason, we don't emit CMPN. There was even a comment in lower_minmax that called out this very issue! The bug is actually older than the "Fixes" tag below implies. That commit is just when the comment was added. As far as we know, no failure was ever observed until #4254 (closed).

If src1 is known to be a number, either because it's not float or it's an immediate number, use CMP. This allows cmod propagation to still do its thing. Without this slight optimization, about 8,300 shaders from shader-db are hurt on Iron Lake.
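
The selection rule in the previous paragraph could be sketched as follows. This is an illustrative model in C, not the code from this merge request: `struct src_info`, `enum cmp_opcode`, and `pick_minmax_compare` are hypothetical names, and the sketch assumes "an immediate number" means a float immediate whose value is not NaN.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical opcode tags for the two comparison instructions. */
enum cmp_opcode { OP_CMP, OP_CMPN };

/* Hypothetical description of the second source operand. */
struct src_info {
    bool is_float;      /* false for integer types, which cannot be NaN */
    bool is_immediate;  /* true when the operand is a compile-time constant */
    float imm_value;    /* only meaningful when is_float && is_immediate */
};

/* Choose the comparison used to lower min/max. Plain CMP is enough when
 * src1 cannot be NaN; otherwise fall back to CMPN. */
static enum cmp_opcode pick_minmax_compare(const struct src_info *src1)
{
    if (!src1->is_float)
        return OP_CMP;

    if (src1->is_immediate && !isnan(src1->imm_value))
        return OP_CMP;

    return OP_CMPN;
}
```

Keeping CMP wherever it is provably safe is what leaves later optimizations such as cmod propagation applicable in those cases.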

Fixes the following piglit tests (from piglit!475 (merged)):

tests/spec/glsl-1.20/execution/fs-nan-builtin-max.shader_test
tests/spec/glsl-1.20/execution/fs-nan-builtin-min.shader_test
tests/spec/glsl-1.20/execution/vs-nan-builtin-max.shader_test
tests/spec/glsl-1.20/execution/vs-nan-builtin-min.shader_test

Closes: #4254 (closed)
Fixes: 2f2c00c7 ("i965: Lower min/max after optimization on Gen4/5.")

Edited Feb 14, 2021 by Ian Romanick
