WIP: nir: Add workaround to create zeroes instead of undefs
This adds two new functions:
nir_ssa_undef_or_zero_instr_create
nir_ssa_undef_or_zero
Both return zeroes instead of undefs if the workaround is enabled via an environment variable or dri config.
Separate functions are created since we don't want ALL undefs
to become zero because:
a) Not all undefs are created due to OOB access or uninitialized
variable dereference
b) We want to reduce the effect on optimizations even when the
workaround is enabled, so we switch to the new functions
only in the places we know will fix known bugs in applications.
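As a rough illustration of the split described above, here is a minimal standalone sketch, not the actual NIR code. Every name in it (`sketch_def`, `sketch_options`, `sketch_parse_bool`, `sketch_undef`, `sketch_undef_or_zero`) is hypothetical, and the boolean parsing rule (set, non-empty and not "0") is an assumption rather than what the patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for an SSA value: either undefined or all zeroes. */
enum sketch_def_kind { SKETCH_DEF_UNDEF, SKETCH_DEF_ZERO };

struct sketch_def {
   enum sketch_def_kind kind;
   unsigned num_components;
   unsigned bit_size;
};

/* Hypothetical per-driver options; the flag would be filled in from the
 * environment variable or the dri config option. */
struct sketch_options {
   bool undef_as_zero;
};

/* Assumed parsing rule: enabled when the value is set, non-empty and not "0". */
bool sketch_parse_bool(const char *value)
{
   return value != NULL && value[0] != '\0' && strcmp(value, "0") != 0;
}

/* Plain undef: never affected by the workaround. */
struct sketch_def sketch_undef(unsigned num_components, unsigned bit_size)
{
   assert(num_components >= 1 && num_components <= 16);
   struct sketch_def def = { SKETCH_DEF_UNDEF, num_components, bit_size };
   return def;
}

/* Undef-or-zero: used only at call sites known to need the workaround. */
struct sketch_def sketch_undef_or_zero(const struct sketch_options *opts,
                                       unsigned num_components,
                                       unsigned bit_size)
{
   struct sketch_def def = sketch_undef(num_components, bit_size);
   if (opts != NULL && opts->undef_as_zero)
      def.kind = SKETCH_DEF_ZERO;
   return def;
}
```

Call sites that keep using the plain undef helper are untouched, so optimizations that rely on undefs only change where a caller opts in.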
This is combined with Pierre-Eric Pelloux-Prayer's changes to
make nir_phi_builder_value_get_block_def apply the workaround selectively.
The environment variable is currently named NIR_WORKAROUND_UNDEF_AS_ZERO.
Suggestions are welcome; maybe we should make it NIR_WORKAROUND=undef_as_zero.
The dri config workaround is named nir_undef_as_zero.
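If the NIR_WORKAROUND=undef_as_zero form were chosen, one possible reading is a comma-separated list of workaround names. The sketch below assumes that format; `sketch_workaround_listed` is a hypothetical helper, not an existing Mesa function, and it matches whole names only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if `name` appears as a whole entry in the comma-separated
 * `list` (for example the value of a NIR_WORKAROUND-style variable).
 * A NULL or empty list enables nothing. */
bool sketch_workaround_listed(const char *list, const char *name)
{
   assert(name != NULL);
   if (list == NULL)
      return false;

   size_t name_len = strlen(name);
   const char *p = list;
   while (*p != '\0') {
      /* Length of the current entry, up to the next comma or the end. */
      size_t tok_len = strcspn(p, ",");
      if (tok_len == name_len && strncmp(p, name, name_len) == 0)
         return true;
      p += tok_len;
      if (*p == ',')
         p++;
   }
   return false;
}
```

A caller would pass the result of `getenv("NIR_WORKAROUND")` as `list`, which leaves room for further workaround names without adding a new variable for each.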
I've plumbed the environment variable and dri workarounds only for Intel, since all other drivers have nir_shader_compiler_options as static consts with driOptionCache nowhere in sight. So I'd like others to help me here.
I've measured the shader changes on Intel with the workaround enabled, using pipelinedb dumps of DMC and Witcher 3 (that's what I had at hand). The overhead is 0.2% and 0.07% by the "cycles" metric of Intel GPU assembly, and a 0.037% native instruction increase (that was measured with the workaround applied to all instances of nir_phi_builder_value_get_block_def).
CC: @pepp