Support complex types for function_temp/alloca
Currently, function_temp access (for local variables allocated with alloca) only supports flat 32-bit arrays. Each deref (of which there can only be one) is handled directly, and the type is encoded directly into DXIL.
To support compound types, we can either reflect the entire type into DXIL and handle the whole deref chain via GEPs, or go the opposite way and tell DXIL less about what we're doing. This MR implements the latter approach: like global memory, we treat function_temp allocations as a flat 32-bit array regardless of their internal types. We walk the deref chain to calculate the appropriate offset for each memory access, then pack/split values as required.
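As a rough illustration of the offset walk and pack/split step, here is a minimal standalone sketch. It is not the code in this MR: `deref_step`, `deref_chain_byte_offset`, `load64` and `store64` are hypothetical names, the chain is a simplified stand-in for real derefs, and low-word-first packing of 64-bit values is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model of one link in a deref chain. */
enum step_kind { STEP_ARRAY, STEP_STRUCT };

struct deref_step {
   enum step_kind kind;
   uint32_t index;         /* STEP_ARRAY: element index */
   uint32_t stride;        /* STEP_ARRAY: element size in bytes */
   uint32_t member_offset; /* STEP_STRUCT: byte offset of the member */
};

/* Walk the chain from the variable down to the accessed value,
 * accumulating a byte offset into the allocation. */
static uint32_t
deref_chain_byte_offset(const struct deref_step *steps, size_t count)
{
   uint32_t offset = 0;
   for (size_t i = 0; i < count; i++) {
      if (steps[i].kind == STEP_ARRAY)
         offset += steps[i].index * steps[i].stride;
      else
         offset += steps[i].member_offset;
   }
   return offset;
}

/* Split a 64-bit value into two consecutive 32-bit words
 * (low word first, an assumption of this sketch). */
static void
store64(uint32_t *temp, uint32_t byte_offset, uint64_t value)
{
   assert(byte_offset % 4 == 0);
   uint32_t word = byte_offset / 4;
   temp[word] = (uint32_t)value;
   temp[word + 1] = (uint32_t)(value >> 32);
}

/* Pack two consecutive 32-bit words back into a 64-bit value. */
static uint64_t
load64(const uint32_t *temp, uint32_t byte_offset)
{
   assert(byte_offset % 4 == 0);
   uint32_t word = byte_offset / 4;
   return (uint64_t)temp[word] | ((uint64_t)temp[word + 1] << 32);
}
```

With this model, accessing element 2 of an array with a 16-byte stride and then a member at byte offset 8 lands at byte 40, i.e. words 10 and 11 of the flat array.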
This also adds a custom intrinsic for function_temp loads and stores, much like global memory. Unlike global memory, function_temp accesses cannot race, so read-modify-write access does not need a separate masked-write intrinsic and we do not need to encode atomic operations.
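To show why a plain read-modify-write is enough here, the following sketch stores a 16-bit value into a flat 32-bit array with an ordinary load, mask and store. `store16` is a hypothetical helper, not the intrinsic added by this MR, and it assumes 2-byte-aligned offsets and little-endian packing within each word.

```c
#include <assert.h>
#include <stdint.h>

/* Non-atomic read-modify-write of one 16-bit lane in a 32-bit word.
 * Safe only because nothing else can access the allocation concurrently. */
static void
store16(uint32_t *temp, uint32_t byte_offset, uint16_t value)
{
   assert(byte_offset % 2 == 0);            /* assumed alignment */
   uint32_t word = byte_offset / 4;
   uint32_t shift = (byte_offset % 4) * 8;  /* 0 or 16 */
   uint32_t mask = 0xffffu << shift;

   uint32_t old = temp[word];               /* read */
   temp[word] = (old & ~mask) | ((uint32_t)value << shift); /* modify, write */
}
```

For memory that other invocations could touch, the same update would have to be a masked or atomic write, which is the case the global memory path has to handle.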