aco: use v_fma_mix to combine conversions into arithmetic
This MR uses v_fma_mix_f32 and v_fma_mixlo_f16 to combine v_cvt_f32_f16 and v_cvt_f16_f32 instructions into additions and multiplications.
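
As a rough illustration of why folding an input conversion is safe, here is a small numeric model in Python. It is not ACO code: the helper functions are hypothetical stand-ins named after the opcodes, and the `-0.0` addend used to express a plain multiply as an FMA is an assumption about how such a rewrite could be encoded, not something this description confirms. Only the multiply case is modeled, because a product of two f32 values is exact in a Python double, so the single rounding to f32 is reproduced faithfully.

```python
import struct

def round_f16(x: float) -> float:
    """Round a Python float to the nearest IEEE binary16 value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def round_f32(x: float) -> float:
    """Round a Python float to the nearest IEEE binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

# Hypothetical models of the instructions (illustration only).
def v_cvt_f32_f16(h: float) -> float:
    # Widening f16 -> f32 is exact: every f16 value is representable in f32.
    return h

def v_mul_f32(a: float, b: float) -> float:
    return round_f32(a * b)

def v_fma_mix_f32(a: float, b: float, c: float) -> float:
    # Operands may be f16 or f32; the result is rounded once to f32.
    # This model is only exact when a * b + c is exact in a double,
    # which holds for the c == -0.0 case used below.
    return round_f32(a * b + c)

a_h = round_f16(0.1)   # an f16 input
b = round_f32(3.7)     # an f32 input

separate = v_mul_f32(v_cvt_f32_f16(a_h), b)   # cvt followed by mul
combined = v_fma_mix_f32(a_h, b, -0.0)        # one mixed-precision op

print(separate == combined)
```

Adding `-0.0` rather than `+0.0` matters for signed zeros: `x + (-0.0)` returns `x` for every `x`, including `-0.0`, whereas `-0.0 + 0.0` would yield `+0.0`. Folding an output conversion (`v_cvt_f16_f32` of an f32 result into `v_fma_mixlo_f16`) is a different case, since it replaces two roundings with one and so need not be bit-identical.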