turnip: enable shaderInt16
16-bit imul is translated into a 24-bit mul. I didn't create a new NIR opcode for it, since it's just a 16-bit imul.
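To illustrate why this translation is safe, here is a minimal sketch in C. It assumes a software model of a signed 24-bit multiply (sign-extend the low 24 bits of each operand, multiply, keep the low 32 bits); the helper names `sext24`, `mul24_model`, `imul16_via_mul24` and `count_mismatches` are hypothetical and are not taken from the driver or the hardware documentation. The point is only that the low 16 bits of the product depend solely on the low 16 bits of the operands.

```c
#include <stdint.h>

/* Hypothetical model: sign-extend the low 24 bits of v. */
static int32_t sext24(uint32_t v)
{
    int32_t x = (int32_t)(v & 0xffffffu);
    return (x ^ 0x800000) - 0x800000;
}

/* Hypothetical model of a signed 24-bit multiply returning the low 32 bits. */
static uint32_t mul24_model(uint32_t a, uint32_t b)
{
    int64_t p = (int64_t)sext24(a) * (int64_t)sext24(b);
    return (uint32_t)p;
}

/* 16-bit imul expressed through the 24-bit model; returns the raw 16-bit pattern. */
static uint16_t imul16_via_mul24(int16_t a, int16_t b)
{
    uint32_t p = mul24_model((uint32_t)(int32_t)a, (uint32_t)(int32_t)b);
    return (uint16_t)p;
}

/* Reference 16-bit wrapping multiply, done in 32-bit unsigned to avoid overflow UB. */
static uint16_t imul16_ref(int16_t a, int16_t b)
{
    return (uint16_t)((uint32_t)(uint16_t)a * (uint32_t)(uint16_t)b);
}

/* Compare both versions over a strided sample of operand pairs. */
static int count_mismatches(int step)
{
    int mismatches = 0;
    for (int32_t a = INT16_MIN; a <= INT16_MAX; a += step)
        for (int32_t b = INT16_MIN; b <= INT16_MAX; b += step)
            if (imul16_via_mul24((int16_t)a, (int16_t)b) !=
                imul16_ref((int16_t)a, (int16_t)b))
                mismatches++;
    return mismatches;
}
```

Under this model the two versions agree on every sampled pair, which is why no dedicated opcode is needed.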
Regarding the tests, it passes:
on a650. Unfortunately, these tests currently require 16-bit storage, so they can't run on a630. We will likely remove that requirement later on.
nir_lower_idiv: there is one discrepancy in idiv/imod between CPU and GPU, for the operand pair (-32768, -1):
i16(-32768) / i16(-1)=
i16(32767), on cpu it is
i16(-32768) % i16(-1)=
-1, on cpu it is
However, this should be fine per the spec:
Division and multiplication operations resulting in overflow or underflow will not cause any exception but will result in an undefined value.
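For reference, a minimal sketch of how a test could recognise the overflowing case and skip the comparison for it. The helper names `sdiv16_overflows` and `count_overflows_for_divisor` are hypothetical and not part of any test suite; division by zero is treated as a separate case and is not flagged here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when a / b does not fit in a signed 16-bit result.
 * The quotient is computed in 32 bits, where it cannot overflow. */
static bool sdiv16_overflows(int16_t a, int16_t b)
{
    if (b == 0)
        return false; /* division by zero is a separate case */
    int32_t q = (int32_t)a / (int32_t)b;
    return q < INT16_MIN || q > INT16_MAX;
}

/* Counts how many dividends overflow for a given divisor. */
static int count_overflows_for_divisor(int16_t b)
{
    int count = 0;
    for (int32_t a = INT16_MIN; a <= INT16_MAX; a++)
        if (sdiv16_overflows((int16_t)a, b))
            count++;
    return count;
}
```

With a divisor of -1, only the dividend -32768 is flagged, which matches the single discrepancy described above.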