ac,aco,radeonsi: large refactoring, better ACO support, get shader_info from shader variant NIR, prepare for a linker in radeonsi
This is the first MR in a series of MRs to rewrite shader variant NIR compilation in radeonsi to generate better optimized shaders.
The first goal is to gather shader_info from fully optimized shader variant NIR instead of input NIR, so that we program registers and make state change decisions based on optimized shader variants instead of out-of-date info from input NIR. The second goal is to add something like a pipeline state object linker into radeonsi that links and optimizes multiple shaders asynchronously. The parallel goal is to make ACO work better with radeonsi.
What this MR does:
- ac_nir_lower_ps is split into ac_nir_lower_ps_early and ac_nir_lower_ps_late. It also moves a lot of lowering and adds new optimizations into ac_nir_lower_ps_early, such as sample_pos/at_offset/at_sample lowering and frag_coord/pixel_coord/sample_mask_in optimizations that reduce PS input VGPRs. That pass is key for determining the final PS system values and thus hw registers. RADV will also indirectly benefit.
- radeonsi+ACO finally determine enabled PS input VGPRs from optimized shader variant NIR (like LLVM).
- PS inputs are also determined from optimized NIR instead of input NIR
- get_nir_shader is completely restructured. It now returns the final NIR of all shaders of merged shaders as si_linked_shaders. It's also divided into 4 stages:
  - pre-link lowering and optimizations (inline uniforms, ac_nir_lower_ps_early, lower indirect to scratch, etc.)
  - linking passes (not done yet)
  - shader_info gathering (sysval usage, IO, etc. - partially done only for PS)
  - final lowering and optimizations (vectorize loads/stores, lower ABI, ac_nir_lower_ngg, ac_nir_lower_ps_late, etc.)
- a lot of other stuff (also some GLCTS fixes)
A lot of shader_info is still determined from input NIR. The linker structure is only roughed in. This is only the beginning.
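To illustrate why the stage order matters, here is a minimal toy sketch of the four stages described above. It is not radeonsi code: every name in it (toy_shader, toy_gather_info, toy_compile_variant, etc.) is hypothetical, the "optimization" is just removal of unused instructions, and the linking and final lowering stages are empty placeholders. It only shows that gathering info after optimization can report fewer system values than gathering it from the input.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model. None of these names are radeonsi, ac_nir or NIR APIs. */
#define TOY_MAX_INSTRS 8

enum toy_sysval {
   TOY_FRAG_COORD     = 1u << 0,
   TOY_SAMPLE_POS     = 1u << 1,
   TOY_SAMPLE_MASK_IN = 1u << 2,
};

struct toy_instr {
   uint32_t reads; /* system values read by this instruction */
   bool used;      /* false if the result is never consumed */
};

struct toy_shader {
   struct toy_instr instrs[TOY_MAX_INSTRS];
   unsigned num_instrs;
};

struct toy_info {
   uint32_t sysvals_read; /* union of system values read by remaining instructions */
};

/* Stage 1 stand-in: pre-link optimization, here only dead instruction removal. */
static void toy_prelink_optimize(struct toy_shader *s)
{
   assert(s->num_instrs <= TOY_MAX_INSTRS);
   unsigned kept = 0;
   for (unsigned i = 0; i < s->num_instrs; i++) {
      if (s->instrs[i].used)
         s->instrs[kept++] = s->instrs[i];
   }
   s->num_instrs = kept;
}

/* Stage 2 stand-in: linking passes (placeholder, does nothing). */
static void toy_link(struct toy_shader *s)
{
   (void)s;
}

/* Stage 3: gather info from whatever instructions the shader currently has. */
static struct toy_info toy_gather_info(const struct toy_shader *s)
{
   assert(s->num_instrs <= TOY_MAX_INSTRS);
   struct toy_info info = {0};
   for (unsigned i = 0; i < s->num_instrs; i++)
      info.sysvals_read |= s->instrs[i].reads;
   return info;
}

/* Stage 4 stand-in: final lowering (placeholder, does nothing). */
static void toy_final_lower(struct toy_shader *s, const struct toy_info *info)
{
   (void)s;
   (void)info;
}

/* Run the four stages in order and return the info gathered in stage 3. */
static struct toy_info toy_compile_variant(struct toy_shader *s)
{
   toy_prelink_optimize(s);
   toy_link(s);
   struct toy_info info = toy_gather_info(s);
   toy_final_lower(s, &info);
   return info;
}

/* Example input: one live read of frag_coord and two dead sysval reads. */
static struct toy_shader toy_example_shader(void)
{
   struct toy_shader s = {0};
   s.instrs[0] = (struct toy_instr){ .reads = TOY_FRAG_COORD, .used = true };
   s.instrs[1] = (struct toy_instr){ .reads = TOY_SAMPLE_POS, .used = false };
   s.instrs[2] = (struct toy_instr){ .reads = TOY_SAMPLE_MASK_IN, .used = false };
   s.num_instrs = 3;
   return s;
}
```

Calling toy_gather_info on the result of toy_example_shader before any optimization reports all three system values, while toy_compile_variant on the same shader reports only TOY_FRAG_COORD, which is the analogue of enabling fewer PS input VGPRs.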