Draft: aco: Make codegen consistent across compilers
Very much WIP
Followup to (and based on) !7945 (merged).
Nesting Builder operations, as in bld.vop2(v_sub_f32, bld.vadd32(...), bld.vsub32(...)), currently emits the vadd32 and vsub32 operations in a different order depending on whether you compile with clang or with gcc, because C++ leaves the evaluation order of function arguments unspecified. With this change, that line no longer compiles: Builder functions now accept at most one Result from another Builder call, which is enforced through function overloading.
Only same-level nesting is affected; bld.vop2(..., bld.vadd32(..., bld.vsub32(...))) is safe and still compiles.
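To illustrate the idea, here is a minimal, self-contained sketch of how overloading can reject same-level nesting while still allowing chained nesting. It is not the code from this MR: ToyBuilder, Op, Result, Result::op() and demo() are hypothetical stand-ins, and the real Builder has many more operand types and argument slots.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; not the real ACO Builder types.
struct Op { int id; };                // an operand that already exists
struct Result {                       // value produced by a ToyBuilder call
    int id;
    Op op() const { return Op{id}; }  // conversion to Op must be spelled out
};

struct ToyBuilder {
    std::vector<std::string> emitted; // opcodes in emission order
    int next_id = 100;

    Result vop2(const char* opcode, Op, Op) { return emit(opcode); }
    // At most one operand may come directly from another builder call.
    Result vop2(const char* opcode, Result, Op) { return emit(opcode); }
    Result vop2(const char* opcode, Op, Result) { return emit(opcode); }
    // Two nested calls on the same level: C++ does not specify which one
    // is evaluated first, so this combination is rejected at compile time.
    Result vop2(const char*, Result, Result) = delete;

private:
    Result emit(const char* opcode) {
        emitted.push_back(opcode);
        return Result{next_id++};
    }
};

std::vector<std::string> demo() {
    ToyBuilder bld;
    Op x{0}, y{1};

    // Chained nesting: each inner call is an operand of the call around it,
    // so the innermost instruction is always emitted first.
    bld.vop2("v_sub_f32", x, bld.vop2("v_add_f32", x, bld.vop2("v_mul_f32", x, y)));

    // Same-level nesting has to be sequenced explicitly by the caller.
    Result lhs = bld.vop2("v_add_f32", x, y);
    bld.vop2("v_sub_f32", lhs.op(), bld.vop2("v_mul_f32", x, y));

    // This would pick the deleted overload and fail to compile:
    // bld.vop2("v_sub_f32", bld.vop2("v_add_f32", x, y), bld.vop2("v_mul_f32", x, y));
    return bld.emitted;
}
```

In this sketch the deleted overload turns the compiler-dependent ordering into a compile error, and the explicit temporary makes the intended emission order visible at the call site.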
TODO:
- Measure build-time impact
- Run testing
- Clean up proof-of-concept code
Design considerations:
- Don't generate all permutations for OpOrResult but instead only "popular" ones (e.g. only the first n argument slots)
- Hide the main Builder functions under a different name