gallivm: drop lower_ffma32 and let the backend handle it.

This makes a bunch of CL tests execute a lot faster.