CL: Math fixes
Should fix errors reported by the bruteforce test.
- Followed @daniels' suggestion to use a __builtin and let vtn tweak the value of
__builtin_hw_has_fma32 based on the ->lower_fma flag. This way we should always
end up with an optimized version of the sincos helpers.
- Use clc implementation as a fallback when the NIR implementation is not precise enough (decision based on the
- Fix nextafter() for platforms that don't support denorms
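
For reviewers, a rough sketch of the behaviour the nextafter() item is aiming for, assuming 32-bit IEEE floats and a target that flushes denormals to zero. This is not the code from this change: `flush_denorm` and `nextafter_ftz` are hypothetical names used only for illustration. The idea is that the step away from zero lands on FLT_MIN rather than on the smallest denormal, and the step from FLT_MIN towards zero lands on zero.

```c
#include <float.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: emulate flush-to-zero by replacing any denormal
 * value with a zero of the same sign. NaN and normal values pass through. */
static float flush_denorm(float v)
{
    return fabsf(v) < FLT_MIN ? copysignf(0.0f, v) : v;
}

/* Hypothetical sketch of nextafter() for a target without denorm support.
 * Assumes float is a 32-bit IEEE 754 binary32 value. */
static float nextafter_ftz(float x, float y)
{
    uint32_t bits;
    float result;

    /* Inputs are seen as already flushed on such a target. */
    x = flush_denorm(x);
    y = flush_denorm(y);

    if (isnan(x) || isnan(y))
        return x + y;               /* propagate NaN */
    if (x == y)
        return y;                   /* also covers +0 vs -0 */

    /* Stepping away from zero skips the whole denormal range. */
    if (x == 0.0f)
        return copysignf(FLT_MIN, y);

    /* For non-zero x, adjacent floats differ by one in their bit pattern:
     * increment to move away from zero, decrement to move towards it. */
    memcpy(&bits, &x, sizeof(bits));
    if ((x < y) == (x > 0.0f))
        bits++;
    else
        bits--;
    memcpy(&result, &bits, sizeof(result));

    /* Stepping down from +/-FLT_MIN yields a denormal: flush it to zero. */
    return flush_denorm(result);
}
```

With this sketch, `nextafter_ftz(0.0f, 1.0f)` returns FLT_MIN and `nextafter_ftz(FLT_MIN, 0.0f)` returns zero, while values well inside the normal range behave like the standard nextafterf().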