Add config option to lower away 16-bit ints in addition to 8-bit
Some hardware (e.g. the AMD GPU in my dev machine) doesn't support 16-bit ints. This change adds a config flag that allows them to be lowered away (converted to mask/shift/clamp ops on wider ints), similar to what we already do for 8-bit, by generalizing the 8-bit pass to handle both widths.
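For anyone unfamiliar with the idea, here is a rough sketch of what "mask/shift/clamp" lowering means at the value level. It is not the code in this PR: the helper names are hypothetical, and it assumes a 16-bit value is carried in the low bits of a 32-bit integer.

```c
#include <stdint.h>

/* Illustrative only: hypothetical helpers, not the actual lowering pass.
 * Assumption: each 16-bit value lives in the low 16 bits of a 32-bit int. */

/* Mask: wrap an unsigned result back into the 16-bit range. */
static inline uint32_t mask_u16(uint32_t x)
{
    return x & 0xFFFFu;
}

/* Sign-extend the low 16 bits to 32 bits. In a shader this is typically
 * a left shift by 16 followed by an arithmetic right shift by 16; it is
 * written with a branch here so the C stays fully portable. */
static inline int32_t sext_i16(uint32_t x)
{
    uint32_t low = x & 0xFFFFu;
    return (low & 0x8000u) ? (int32_t)low - 0x10000 : (int32_t)low;
}

/* Clamp: saturate a 32-bit value to the int16 range, as a saturating
 * conversion to int16 would. */
static inline int32_t clamp_i16(int32_t x)
{
    if (x < -32768) return -32768;
    if (x > 32767) return 32767;
    return x;
}

/* 16-bit unsigned add emulated with a 32-bit add plus a mask. */
static inline uint32_t add_u16(uint32_t a, uint32_t b)
{
    return mask_u16(a + b);
}

/* 16-bit signed add emulated with a 32-bit add plus sign extension. */
static inline int32_t add_i16(uint32_t a, uint32_t b)
{
    return sext_i16(a + b);
}
```

With these, `add_u16(0xFFFF, 1)` wraps to 0 and `add_i16(0x7FFF, 1)` wraps to -32768, matching native 16-bit arithmetic, while `clamp_i16` gives the saturating behavior.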
This significantly improves conformance of my AMD GPU with the CL CTS. Fixes #44. Corresponding runtime changes at https://github.com/microsoft/OpenCLOn12/pull/2.
There's also a fix for a regression introduced by the conversions change, which attempted to use a conversion op with a rounding mode when converting to int16/uint16.
/cc @bbrezillon @daniels