aco: implement storage_8bit / storage_16bit capabilities
This MR does the major part of handling sub-dword variables in:
- aco_ir: extend the IR to represent sub-dword variables
- instruction selection: conversion operations, load/store sub-dword variables
- register allocation: allocate partial registers
- lower_to_hw: emit SDWA instructions to shuffle sub-dword registers
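To illustrate what "allocate partial registers" can mean, here is a minimal C++17 sketch of byte-granular register allocation. It is not the code from this MR: `ByteReg`, `RegFile` and `alloc` are hypothetical names, and the 4-dword register file and the first-fit, size-aligned policy are simplifying assumptions.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical byte-addressed register (an assumption, not the ACO type).
struct ByteReg {
    uint16_t byte_index; // position in the register file, counted in bytes
    constexpr unsigned dword() const { return byte_index >> 2; } // containing 32-bit register
    constexpr unsigned byte() const { return byte_index & 3; }   // byte offset inside it
};

// Toy register file of 4 dwords (16 bytes); true marks a byte in use.
struct RegFile {
    std::array<bool, 16> used{};

    // First-fit search for `size` free bytes (1 or 2), aligned to `size`.
    // Claims the bytes and returns their location, or nullopt if none fit.
    std::optional<ByteReg> alloc(unsigned size) {
        assert(size == 1 || size == 2);
        for (unsigned b = 0; b + size <= used.size(); b += size) {
            bool free_span = true;
            for (unsigned i = 0; i < size; ++i)
                free_span = free_span && !used[b + i];
            if (!free_span)
                continue;
            for (unsigned i = 0; i < size; ++i)
                used[b + i] = true;
            return ByteReg{static_cast<uint16_t>(b)};
        }
        return std::nullopt;
    }
};
```

In this sketch an 8-bit value and a 16-bit value can share one dword, which is the situation where sub-dword selects such as SDWA are needed to read or write only part of a register.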
This MR enables storage_8/16bit despite the following limitations:
- storage16_input_output is not yet implemented
- lower_to_hw cannot yet perform swaps that involve sub-dword registers
- SDWA is not supported on SI/CI, so we need to either work around that or leave them out

The capabilities are enabled anyway because Doom Eternal needs these extensions.
CTS:
Test run totals:
Passed: 1719/3405 (50.5%)
Failed: 0/3405 (0.0%)
Not supported: 1686/3405 (49.5%)
Edited by Daniel Schürmann