Figure out a strategy for MOV operations
In an Intel back-end compiler, there are roughly six MOV-type operations:
- Move (just move data; may be a whole vector)
- Vec (combine multiple vector components into one vector)
- SIMD Zip (combine different SIMD group executions into one)
- Pack (combine two bytes to make a word, etc.)
- Build Header (take a copy of g0 and overwrite some channels)
- Build Message (Similar to vec but may have different types and possibly header regions)
Each of these is, at its heart, a series of MOV instructions, but as logical operations they all have different semantics. We need a unified strategy that lets us efficiently CSE, copy-propagate, and coalesce them so that we generate the minimum possible number of MOV instructions. Currently we use an ad hoc combination of intrinsic ops and simply emitting a series of MOVs. None of it is coalesced properly at the moment, which leads to through-the-roof MOV instruction counts.
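
To make the idea concrete, here is a rough Python sketch of one possible direction, not a description of what the compiler does today. It assumes that two of the logical operations (Vec and Build Header) are lowered to a common list of per-component MOVs, and that a single coalescing pass then drops any MOV whose source is a temporary read only by that MOV, on the assumption that the producer of the temporary can be retargeted to write the destination slot directly. All names here (`Slot`, `Mov`, `lower_vec`, `lower_build_header`, `coalesce`) are hypothetical, and modelling the g0 copy channel by channel is a simplification made for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    """One component of a (hypothetical) virtual or fixed register."""
    reg: str
    comp: int = 0


@dataclass(frozen=True)
class Mov:
    """A single-component copy: dst <- src."""
    dst: Slot
    src: Slot


def lower_vec(dst_reg, srcs):
    """Vec: one MOV per source into consecutive components of dst_reg."""
    return [Mov(Slot(dst_reg, i), s) for i, s in enumerate(srcs)]


def lower_build_header(dst_reg, overrides, width=8):
    """Build Header: copy g0, with selected channels taken from overrides.

    Assumption: the copy is modelled per channel so that it shares a
    representation with Vec; `width` is an illustrative channel count.
    """
    return [
        Mov(Slot(dst_reg, c), overrides.get(c, Slot("g0", c)))
        for c in range(width)
    ]


def coalesce(movs, coalescable):
    """Drop MOVs whose source can be written in place by its producer.

    A MOV is removed when its source register is in `coalescable`
    (temporaries whose producer we assume can be retargeted) and that
    source slot is read exactly once within `movs`. Returns the MOVs
    that remain and a map from each removed source slot to the
    destination slot its producer should write instead.
    """
    reads = Counter(m.src for m in movs)
    kept, renames = [], {}
    for m in movs:
        if m.src.reg in coalescable and reads[m.src] == 1:
            renames[m.src] = m.dst
        else:
            kept.append(m)
    return kept, renames


# Vec of two temporaries and one component of a fixed payload register.
vec_movs = lower_vec("v", [Slot("a"), Slot("b"), Slot("g2", 3)])
vec_kept, vec_renames = coalesce(vec_movs, {"a", "b"})

# Header that overwrites channel 2 with a temporary.
hdr_movs = lower_build_header("h", {2: Slot("t")})
hdr_kept, hdr_renames = coalesce(hdr_movs, {"t"})
```

In this toy model the Vec keeps only the MOV from the fixed register, and the header keeps only the g0 channel copies; the temporaries are expected to be written straight into their final slots. A real pass would also need to handle types, SIMD widths, register regions, and liveness, none of which are modelled here.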