r300: compiler cleanup possibilities
This is sort of a TODO list, mostly for discussion with @gawin and to avoid overlapping work.
- Get rid of the TGSI-only codepaths and start assuming all TGSI we see comes from ntt. This will allow a lot of code deletion. OpenGL (ES) is already there, nine seems to be fine as well, and I've recently been checking what's up with the mpeg2 shader-based acceleration in VDPAU, whether it can be saved or whether we just give up on it. Once this is sorted out we can just disable the RADEON_DEBUG=use_tgsi switch.
- Move the rest of the ALU lowering to ntt. What remains now is TRUNC and LRP for vertex shaders plus SGE, SNE, etc. for fragment shaders. We don't want to always lower the SGE, SNE, etc. opcodes when the only reader is an IF, so this would probably need our own NIR pass. (~1k lines of lowering code in radeon_program_alu.c can go afterwards.)
- Lower the shadow samplers in ntt. This will need some rework of how we handle shader variants: right now we do it at the level of our own IR, while afterwards we would handle it in NIR. (~150 loc)
- After all the lowering is gone and we can expect all the code we get to be properly optimized by ntt, we can drop part of the deadcode pass: the part that actually does the dead code elimination goes, and only the marking of unused channels remains (~350 loc).
- Then we can stop asking ntt to allocate registers for us, so we get a sort of SSA-like form. At this point there is nothing in the backend that would break it, so we can ditch the register rename pass that attempts to do just that (~100 loc).
- Ultimately, when we always have the SSA-like form, we can either get rid of or really simplify the get_variable stuff. A variable, as defined by the r300 compiler, is pretty much just a NIR SSA value or reg, so at this point the search for writers and readers becomes trivial (possibly up to ~1500 loc can go, especially the writer and reader search I hate so much).
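To make the SGE/SNE point and the "trivial reader search" point a bit more concrete, here is a rough sketch of the check such a pass would need. It uses a made-up toy IR: `toy_inst`, `toy_op` and `toy_set_needs_lowering` are all hypothetical names and are not the r300 compiler or NIR data structures. It assumes the SSA-like form from the last two items, where every value is written exactly once, so finding the readers is a single forward scan.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy IR for illustration only, not r300 or NIR structures. */
enum toy_op { TOY_MOV, TOY_ADD, TOY_SGE, TOY_SNE, TOY_IF };

struct toy_inst {
    enum toy_op op;
    int dst;     /* SSA-like value written, or -1 if none (e.g. IF) */
    int src[2];  /* values read, -1 marks an unused slot */
};

static bool toy_is_set_op(enum toy_op op)
{
    return op == TOY_SGE || op == TOY_SNE;
}

static bool toy_reads(const struct toy_inst *inst, int value)
{
    return inst->src[0] == value || inst->src[1] == value;
}

/*
 * Returns true if the set-on-compare instruction at prog[idx] has at least
 * one reader that is not an IF, i.e. the case where it would have to be
 * lowered. Because each value is assumed to be written exactly once, all
 * readers come after the definition and one forward scan finds them.
 * A result with no readers at all also returns false.
 */
bool toy_set_needs_lowering(const struct toy_inst *prog, size_t count,
                            size_t idx)
{
    assert(idx < count);
    assert(toy_is_set_op(prog[idx].op));

    int value = prog[idx].dst;
    assert(value >= 0);

    for (size_t i = idx + 1; i < count; i++) {
        if (toy_reads(&prog[i], value) && prog[i].op != TOY_IF)
            return true;
    }
    return false;
}
```

With a check like this, an SGE whose result only feeds an IF would be left alone, while one that is also read by e.g. an ADD would be lowered. The real thing would of course walk NIR uses instead of scanning an array.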
Edited by Pavel Ondračka